SQLite Archiver for packaging assets

The SQLite Archiver is an experimental tool released by the SQLite folks. It has a command-line utility that takes a file or a list of files and creates a SQLite database with the files stored as BLOBs. By default the utility compresses files with zlib, but it can store them uncompressed as well. According to the README, this method only increases the size of the resulting archive by around 2%.

The resulting archive is a SQLite database that can be used with existing SQLite tools and libraries. The code also includes a utility that allows you to mount the archive as a FUSE filesystem to browse the files in the archive.

The resulting database has the following schema:


    CREATE TABLE sqlar(
        name TEXT PRIMARY KEY,  -- name of the file
        mode INT,               -- access permissions
        mtime INT,              -- last modification time
        sz INT,                 -- original file size
        data BLOB               -- compressed content
    );
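
Because the archive is an ordinary database, a file can be pulled out with a few lines of Python and the standard `sqlite3` and `zlib` modules. This is a minimal sketch, not the archiver's own code: the helper `read_member` and the sample file names are made up for illustration, and it assumes (as I understand the format) that a blob whose length equals `sz` was stored uncompressed while anything else is zlib-compressed.

```python
import sqlite3
import zlib


def read_member(conn, name):
    """Return the original bytes of one archived file, or None if it has no content."""
    row = conn.execute(
        "SELECT sz, data FROM sqlar WHERE name = ?", (name,)
    ).fetchone()
    if row is None or row[1] is None:
        return None  # unknown name, or an entry without content
    sz, data = row
    # Assumption: equal lengths mean the content was stored without compression.
    return data if len(data) == sz else zlib.decompress(data)


# Build a tiny in-memory archive by hand so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sqlar(name TEXT PRIMARY KEY, mode INT, mtime INT, sz INT, data BLOB)"
)
payload = b"hello archive\n" * 50
conn.executemany(
    "INSERT INTO sqlar VALUES (?, ?, ?, ?, ?)",
    [
        ("notes.txt", 0o644, 0, len(payload), zlib.compress(payload)),  # compressed
        ("tiny.txt", 0o644, 0, 2, b"hi"),  # stored as-is
    ],
)

print(read_member(conn, "notes.txt") == payload)
print(read_member(conn, "tiny.txt"))
```

Pointing `sqlite3.connect` at a real archive file instead of `:memory:` should work the same way, provided the archive follows the schema above.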
    
    

There isn’t anything keeping you from adding your own tables to the resulting database though.

The tool is just a proof of concept, so I wouldn’t want to use it in a critical digital preservation workflow, but it is interesting to think about the applications that packaging assets this way would allow.

Something quite like the BagIt specification could be duplicated using this method. Instead of a checksum manifest file, another table could be added to the database with checksums:


    CREATE TABLE manifest(
        name TEXT,
        md5 TEXT,
        sha256 TEXT,
        sha512 TEXT,
        FOREIGN KEY (name) REFERENCES sqlar(name)
    );
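
Filling and checking such a manifest could look roughly like the sketch below, which uses Python's `hashlib`. The function names `build_manifest` and `failed_checks` are hypothetical, the `PRIMARY KEY` on `manifest.name` is my own addition so that reruns replace rows instead of duplicating them, and the decompression rule (blob length equal to `sz` means uncompressed) is an assumption about the archive format.

```python
import hashlib
import sqlite3
import zlib


def member_bytes(sz, data):
    # Assumption: equal lengths mean the blob was stored uncompressed.
    return data if len(data) == sz else zlib.decompress(data)


def build_manifest(conn):
    """Create the manifest table (if needed) and fill it from sqlar."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS manifest(
               name TEXT PRIMARY KEY,  -- PRIMARY KEY is an addition for this sketch
               md5 TEXT, sha256 TEXT, sha512 TEXT,
               FOREIGN KEY (name) REFERENCES sqlar(name))"""
    )
    rows = conn.execute(
        "SELECT name, sz, data FROM sqlar WHERE data IS NOT NULL"
    ).fetchall()
    for name, sz, data in rows:
        content = member_bytes(sz, data)
        conn.execute(
            "INSERT OR REPLACE INTO manifest VALUES (?, ?, ?, ?)",
            (
                name,
                hashlib.md5(content).hexdigest(),
                hashlib.sha256(content).hexdigest(),
                hashlib.sha512(content).hexdigest(),
            ),
        )
    conn.commit()


def failed_checks(conn):
    """Return names whose current content no longer matches the stored SHA-256."""
    query = (
        "SELECT s.name, s.sz, s.data, m.sha256 "
        "FROM sqlar s JOIN manifest m ON m.name = s.name "
        "WHERE s.data IS NOT NULL ORDER BY s.name"
    )
    rows = conn.execute(query).fetchall()
    return [
        name
        for name, sz, data, expected in rows
        if hashlib.sha256(member_bytes(sz, data)).hexdigest() != expected
    ]


# Hand-built in-memory archive so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sqlar(name TEXT PRIMARY KEY, mode INT, mtime INT, sz INT, data BLOB)"
)
big = b"line of text\n" * 100
conn.executemany(
    "INSERT INTO sqlar VALUES (?, ?, ?, ?, ?)",
    [
        ("a.txt", 0o644, 0, 5, b"alpha"),
        ("b.txt", 0o644, 0, len(big), zlib.compress(big)),
    ],
)
build_manifest(conn)
print(failed_checks(conn))
```

Checking only SHA-256 keeps the example short; a fuller validator would presumably compare every stored digest, much as BagIt tools check each manifest they find.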
  

The bag-info file could also be duplicated:


    CREATE TABLE info(
        name TEXT,
        source_organization TEXT,
        FOREIGN KEY (name) REFERENCES sqlar(name)
    );
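
One caveat with the foreign keys in these tables: SQLite does not enforce them by default, so each connection has to turn them on with `PRAGMA foreign_keys = ON`. The sketch below shows the difference that makes. It assumes the column is spelled `source_organization`, and the helper `organization_for`, the file names, and the organization value are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign keys are only enforced on connections that ask for it.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE sqlar(name TEXT PRIMARY KEY, mode INT, mtime INT, sz INT, data BLOB);
    CREATE TABLE info(
        name TEXT,
        source_organization TEXT,
        FOREIGN KEY (name) REFERENCES sqlar(name)
    );
    """
)
conn.execute(
    "INSERT INTO sqlar VALUES (?, ?, ?, ?, ?)", ("a.txt", 0o644, 0, 5, b"alpha")
)
conn.execute("INSERT INTO info VALUES (?, ?)", ("a.txt", "Example Library"))


def organization_for(conn, name):
    """Look up the source organization recorded for one archived file."""
    row = conn.execute(
        "SELECT i.source_organization FROM sqlar s "
        "JOIN info i ON i.name = s.name WHERE s.name = ?",
        (name,),
    ).fetchone()
    return row[0] if row else None


# Metadata for a file that is not in the archive should be refused.
try:
    conn.execute("INSERT INTO info VALUES (?, ?)", ("missing.txt", "Nobody"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(organization_for(conn, "a.txt"), rejected)
```

Tying metadata to individual files is how the table above is written; a closer analogue of bag-info, which describes the bag as a whole, might instead be a plain key/value table with no reference to `sqlar`.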
  

Then of course you would be able to add any other (structured) data you want to the archive. There are many benefits and drawbacks to storing files this way, which you can read about here. Nevertheless, I do enjoy thinking about the possibilities of this approach.