How to Generate IIIF Manifests from METS with oSullivan
Jun 10, 2015
There has been some interest in the community around how to use the osullivan gem to generate manifests, so I have decided to write a blog post that will document my experience and hopefully be of some use to others who want to start using osullivan. In this post, I will step through the process of cooking up a utility app that will turn structured data in the METS format into IIIF Presentation API manifests.
Since we are “cleaning up” the extensive METS data to make it “presentable” on the web, I’m calling the app “spiiiffy”. Our infrastructure already has lots of support for Rails apps, so I’m going to go with the flow and use Ruby 2.0.0p451 paired with Rails 4.2.0. Here we go:
Spiiiffy will need to take a URL that provides a METS document, use the nokogiri gem to parse the XML data, and use the osullivan gem to create the Presentation API manifest structure. Then it will need to store both datastreams and deliver either depending on the request.
So we modify our Gemfile by adding these lines:
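A minimal sketch of what those Gemfile lines might look like. No versions are pinned here because the post does not name any, so treat the exact form as an assumption:

```ruby
# Gemfile additions (versions deliberately left unpinned in this sketch)
gem 'nokogiri'   # XML parsing for the METS document
gem 'osullivan'  # IIIF Presentation API classes, required as 'iiif/presentation'
```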
Let’s generate our Metadata model and run the migration to create the table:
Add the accessors to the model:
Before saving, extract the title and objid from the record for human and computer usability:
We’re going to make sure that we can get to the object using the OBJID instead of the Rails internal id.
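One common way to do this in Rails is to override to_param on the model. The sketch below uses a plain Ruby class (hypothetical name) so it runs without Rails; the controller lookup is shown only as a comment.

```ruby
# Plain-Ruby stand-in for the model; in the app this would be the ActiveRecord class.
class MetadataParamSketch
  attr_reader :id, :objid

  def initialize(id, objid)
    @id = id
    @objid = objid
  end

  # Rails calls to_param when it builds URLs for a record, so returning the
  # OBJID puts that value in the path instead of the internal id.
  def to_param
    objid
  end
end

# In the controller, the record would then be looked up by OBJID, e.g.:
#   Metadata.find_by(objid: params[:id])

puts MetadataParamSketch.new(1, 'obj-0001').to_param
```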
Start building a manifest with osullivan
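Note: for some reason, I needed to add require 'iiif/presentation' to the top of the model file even though the gem is in the Gemfile. The gem may not be auto-loading correctly, possibly because its entry point (iiif/presentation) does not match the gem name.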
I should probably write a test for this, but just to quickly check that it’s working, I spun up the Rails server and saw the JSON data in the manifest textarea when attempting to edit a record. Huzzzaaah, a baby manifest!
Now we’re starting to really get our hands dirty, and it’s probably time to refactor some things. For example, in both set_title and set_objid we are doing the same thing:
We should pull these duplicate steps out and, on initialization, create a single Nokogiri doc that’s accessible to all methods within this class. I don’t know how to do this yet.
We will put off the refactoring for the moment in the interest of working code. The next step is to iterate over the mets:structMap and add canvases to our baby manifest. So, yet again, we create a Nokogiri doc from the METS, get our ordered list, and also grab all the image files from the fileSec:
Our mets:file references have two different identifiers in addition to a URN associated with them. One identifier (@ADMID) is used by mets:techMD and the other (@ID) is used by mets:structMap. In order to easily access both, I’m going to create a hash with both of these identifiers and grab our :
Ok, so it’s easy to get the label and order number from our structMap, and we can use that file_hash to get at the image dimensions to set on each canvas:
Uh oh! Nothing! Looking at the mets:techMD section I think there’s a namespace issue in getting at our image dimensions:
Of course, we can bypass the whole namespace issue, and we don’t really need the namespaces anyway. So, in the interest of getting working code I’m going to clone the mets_doc and wipe out the namespaces for this one step:
Hooray! Now we can remove our puts statement and actually add all this data to the Canvas. The full block would look like this:
So, that is a long block. In fact, we’re going to need to make it longer because in the same pass we are going to want to add the Images to the Canvas before adding it to m.sequences:
Ok, we have to do one more thing before we can have a workable IIIF Manifest: add the service information. This should be rather simple as we just need to create a new resource that we stuff into the ImageResource, the same way we inserted the ImageResource into the Images Resource that lists ImageResources (I know, it gets crazy!). Probably simpler to show:
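To make the nesting easier to see, here is the target shape of one canvas image entry written as plain Ruby hashes, following the Presentation API 2.0 structure. The hashes stand in for the osullivan objects purely for illustration; the helper name, the example URLs, the JPEG format, and the level1 profile are assumptions.

```ruby
require 'json'

# Builds one entry for a canvas's "images" list as plain hashes:
# annotation -> image resource -> image service.
def image_annotation(canvas_id, service_id, width, height)
  {
    '@type'      => 'oa:Annotation',
    'motivation' => 'sc:painting',
    'on'         => canvas_id,
    'resource'   => {
      # Full-size image URL built from the Image API 2 URI pattern.
      '@id'     => "#{service_id}/full/full/0/default.jpg",
      '@type'   => 'dctypes:Image',
      'format'  => 'image/jpeg',
      'width'   => width,
      'height'  => height,
      'service' => {
        '@context' => 'http://iiif.io/api/image/2/context.json',
        '@id'      => service_id,
        # Assumption: the image server supports compliance level 1.
        'profile'  => 'http://iiif.io/api/image/2/level1.json'
      }
    }
  }
end

annotation = image_annotation(
  'http://example.org/manifests/obj-0001/canvas/1',
  'http://example.org/iiif/image-001',
  2000,
  3000
)
puts JSON.pretty_generate(annotation)
```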