Improving FITS Performance

Harvard’s fits utility (the File Information Tool Set) bundles many file identification tools and lets you run them on a file with a single command. Samvera applications use it for file characterization. If you are importing a large amount of content into your repository, fits will be started and stopped once for every file. The standard way of doing this (running fits.sh over and over again) isn’t very efficient.

fits includes a script that lets you run the utility under nailgun. nailgun is a minimal server that starts a JVM once and keeps it running, so a Java application can be invoked repeatedly without restarting it. Normally, each run of fits.sh creates a JVM instance and destroys it again when the run finishes. If you are doing that many thousands of times, the startup overhead adds up.

nailgun consists of a client written in C and a server jar. To get the nailgun-ified version of fits running on OS X, you can install the nailgun client with brew:

brew install nailgun

However, you’ll still need a copy of the nailgun server jar, which you can build by downloading the nailgun source from GitHub and running mvn install.

To start the server, pass the path to the server jar to the fits-ngserver.sh script:

./fits-ngserver.sh ~/nailgun/nailgun-server/target/nailgun-server-0.9.2-SNAPSHOT.jar

Then interact with fits through the nailgun client instead of using fits.sh:

~/Sync/repos/nailgun/ng edu.harvard.hul.ois.fits.Fits -i ~/Pictures/Something.jpg

This reduces the time it takes to run fits. In the timings below, wall-clock time drops from about 5.5 seconds to about 2.8 seconds:

With nailgun:

real	0m2.797s
user	0m0.001s
sys	0m0.002s

Running directly:

real	0m5.543s
user	0m11.829s
sys	0m0.862s
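
For a bulk import, that per-file saving is repeated for every file. Here is a minimal sketch of characterizing a whole directory through the nailgun client, assuming the nailgun server is already running. The function name characterize_dir and the NG_BIN variable are hypothetical names used only for illustration. The -i and -o options are the usual fits input and output flags. Absolute paths are used on the assumption that the server JVM may not share the client’s working directory.

```shell
# Hypothetical helper: run fits through the nailgun client for every
# regular file in a directory, writing one XML report per file.
# NG_BIN is an assumed variable naming the nailgun client (default: ng).
characterize_dir() {
    # Resolve both directories to absolute paths before calling the client.
    cd_src_dir=$(cd "$1" && pwd) || return 1
    mkdir -p "$2" || return 1
    cd_out_dir=$(cd "$2" && pwd) || return 1

    for cd_file in "$cd_src_dir"/*; do
        # Skip subdirectories and anything else that is not a regular file.
        [ -f "$cd_file" ] || continue
        cd_name=$(basename "$cd_file")
        "${NG_BIN:-ng}" edu.harvard.hul.ois.fits.Fits \
            -i "$cd_file" \
            -o "$cd_out_dir/$cd_name.fits.xml"
    done
}
```

Calling characterize_dir ~/Pictures ~/fits-output would then reuse the same JVM for every file in ~/Pictures instead of starting a new one each time.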

Hyrax calls fits from a shell command, so you only need to replace the fits.sh path setting in your Hyrax installation with the ng command.
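
If the path setting only accepts a single executable, one possible approach is a small wrapper that takes the same arguments as fits.sh and forwards them to the nailgun client. This is a sketch under that assumption, not something Hyrax requires. The function name fits_ng and the NG_BIN variable are hypothetical.

```shell
# Hypothetical drop-in replacement for fits.sh: forward all arguments
# to the fits main class through the nailgun client.
# NG_BIN is an assumed variable naming the nailgun client (default: ng).
fits_ng() {
    "${NG_BIN:-ng}" edu.harvard.hul.ois.fits.Fits "$@"
}
```

Saved as an executable script that ends by calling fits_ng "$@", this could be run as, for example, fits-ng.sh -i ~/Pictures/Something.jpg, and the Hyrax path setting could point at that script in place of fits.sh.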