Fedora 5→6 Migration w/ Islandora

2022-04-06

Back in January we discovered that, although our original files stored in Fedora were being served through our Islandora site, metadata updates were failing. (See the thread on the Fedora Slack #tech channel.) After a bit of debugging we determined that the best solution would be to upgrade our Fedora 5 instance to Fedora 6.

Migrating from Fedora 5 to Fedora 6 is simple in theory, but because it involves a complete reworking of the persistent storage, it can be resource-intensive for larger repositories. We had more than 12 terabytes of files to migrate. The migration consists of three stages:

  1. export the Fedora 5 repository,
  2. transform the export into the Oxford Common File Layout (OCFL),
  3. install the Fedora 6 code and configure it to use the transformed repository content.

Exporting Fedora 5

Naturally, we tested this process on our development instance first to make sure we got it right. However, the development server is significantly under-powered compared to production and uses a relatively slow network drive for its larger storage. This slowed testing considerably: the test export on the development server was aborted after running for two weeks without finishing, whereas the production export completed in seventeen hours. Fortunately, this was the only complication with the export. The fcrepo-import-export tool was straightforward to use.

Migrating the Export to OCFL

The export migration to OCFL was a bit trickier to get right. Fortunately I had learned from the earlier mistake and ran our test migration on a machine with much more gusto. Even then, these tests took time. Yes, tests is plural. The first run with the migration tool, fcrepo-upgrade-utils, completed without any reported errors. I then ran the one-click jar deployment of Fedora 6 against the test set to inspect it before testing the MySQL and Tomcat deployment. I quickly noticed that the resource records were being split between localhost-based URIs and URIs with the development server host name:

$ curl http://dams.library.unlv.edu:8080/fcrepo/rest/00/0f/b3/c8/000fb3c8-32a4-47cd-88bf-2db4f3b23785
@prefix schema: <http://schema.org/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .

<http://localhost:8080/fcrepo/rest/00/0f/b3/c8/000fb3c8-32a4-47cd-88bf-2db4f3b23785>
        schema:sameAs        <http://n2t.net/ark:/62930/d1pk09q7f> ;
        dcterms:identifier   <http://n2t.net/ark:/62930/d1pk09q7f> ;
        dcterms:title        "jhp000420-034-004"@en ;
        schema:dateCreated   "2021-10-26T04:44:31+00:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
        dcterms:publisher    <http://dams.library.unlv.edu/taxonomy/term/1990> ;
        dcterms:identifier   "jhp000420-034-004" ;
        rdf:type             <http://pcdm.org/models#Object> ;
        schema:dateModified  "2022-03-11T16:36:51+00:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
        <http://pcdm.org/models#memberOf>  <http://dams.library.unlv.edu/node/446297> .

<http://dams.library.unlv.edu:8080/fcrepo/rest/00/0f/b3/c8/000fb3c8-32a4-47cd-88bf-2db4f3b23785>
        <http://fedora.info/definitions/v4/repository#created>  "2021-10-26T05:31:30.283Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
        <http://fedora.info/definitions/v4/repository#lastModified>  "2022-03-11T17:10:20.216Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
        rdf:type  ldp:BasicContainer ;
        rdf:type  ldp:Resource ;
        rdf:type  <http://fedora.info/definitions/v4/repository#Resource> ;
        rdf:type  ldp:RDFSource ;
        rdf:type  ldp:Container ;
        rdf:type  <http://fedora.info/definitions/v4/repository#Container> .

All of these triples should have been associated with a single consistent subject URI.
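
As an illustration, a check for this kind of split could be scripted. The following is a minimal sketch, assuming the response is Turtle laid out like the output above (subject URIs at the start of a line, predicates indented). It is a simple line scan rather than a real Turtle parser, and the function name and sample data are made up for the example.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Matches a subject URI that starts a line, as in the Turtle output above.
# Indented predicate/object lines are deliberately not matched.
SUBJECT_RE = re.compile(r"^<(https?://[^>\s]+)>", re.MULTILINE)


def split_subjects(turtle: str) -> dict[str, set[str]]:
    """Return the resource paths whose triples hang off more than one host."""
    hosts_by_path: dict[str, set[str]] = defaultdict(set)
    for uri in SUBJECT_RE.findall(turtle):
        parts = urlsplit(uri)
        hosts_by_path[parts.path].add(parts.netloc)
    # A healthy resource has exactly one host for its path.
    return {path: hosts for path, hosts in hosts_by_path.items() if len(hosts) > 1}


# Shortened, made-up sample in the same shape as the real response.
sample = """\
<http://localhost:8080/fcrepo/rest/00/0f/example>
        <http://purl.org/dc/terms/title>  "example"@en .

<http://dams.library.unlv.edu:8080/fcrepo/rest/00/0f/example>
        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>  <http://www.w3.org/ns/ldp#RDFSource> .
"""

print(split_subjects(sample))
```

An empty result means every resource in the response uses a single subject URI.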

Once again I reached out to the Fedora Slack for help, and Peter Winckles came to the rescue. With the aid of his rocfl tool for inspecting the OCFL data, we determined that I had used the migration tool’s base-uri flag incorrectly, which caused this resource bifurcation. I had supplied the flag with the host’s URI when the original URIs had used localhost. I re-ran the migration tool using --base-uri http://localhost:8080/fcrepo/rest instead, which resulted in info:fedora/ based URIs that Fedora can swap out for the instance’s REST API root on request. (That is, with the same OCFL data it will return either http://localhost:8080/fcrepo/rest or http://mydomain.test:8080/fcrepo/rest, depending on how the site is configured.)
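
To make the idea concrete, here is a small Python sketch of the mapping as I understand it. It is only an illustration and not the actual code in fcrepo-upgrade-utils or Fedora; the function names are hypothetical, and the "leave non-matching URIs untouched" behaviour is my assumption about why the wrong base URI produced two subjects.

```python
INTERNAL_PREFIX = "info:fedora/"


def to_internal(uri: str, base_uri: str) -> str:
    """Swap the REST root for the internal prefix, if the URI is under that root.

    Assumption: a URI that does not start with base_uri is not recognised as
    belonging to the repository, so it is returned unchanged.
    """
    base = base_uri.rstrip("/") + "/"
    if uri.startswith(base):
        return INTERNAL_PREFIX + uri[len(base):]
    return uri


def to_external(internal_id: str, base_uri: str) -> str:
    """Expand an internal identifier against whatever REST root is configured."""
    if not internal_id.startswith(INTERNAL_PREFIX):
        raise ValueError(f"not an internal identifier: {internal_id}")
    return base_uri.rstrip("/") + "/" + internal_id[len(INTERNAL_PREFIX):]


original = "http://localhost:8080/fcrepo/rest/00/0f/example"

# Matching base URI: the subject becomes host-independent.
print(to_internal(original, "http://localhost:8080/fcrepo/rest"))

# Mismatched base URI: the localhost subject is left as it was.
print(to_internal(original, "http://mydomain.test:8080/fcrepo/rest"))

# The same stored identifier can be served under either root.
print(to_external("info:fedora/00/0f/example", "http://mydomain.test:8080/fcrepo/rest"))
```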

This process similarly took about seventeen hours to perform in production.

Install and Configure Fedora 6

The previous steps, while tricky in their own ways, boil down to running a single command. Setting up Fedora 6 for deployment is quite a bit trickier.

One thing I noticed during my early testing with the one-click jar deployment was that Fedora insists on building its index before it will begin responding to requests. We have not only significant file volume but also a large resource count, above 400k resources, and that takes a long time to index. I didn’t want to shut down our Fedora 5, start up Fedora 6, and then wait for the index to complete before our instance would begin responding, especially on the production site. It was clear I needed to plan this transition carefully to minimize public downtime.
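
Since a long index build just looks like a server that isn’t answering, one way to know when it has finished is to poll the endpoint. This is a rough Python sketch under the assumption that any HTTP response at all means Fedora has finished starting up; the URL and port in the usage comment are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the server sends back any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with 401, 404 or 500
    except OSError:
        return False  # connection refused, timed out, DNS failure, etc.


def wait_until_ready(
    probe: Callable[[], bool],
    interval: float = 30.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call probe until it succeeds; return the number of attempts it took."""
    for attempt in range(1, max_attempts + 1):
        if probe():
            return attempt
        sleep(interval)
    raise TimeoutError(f"no response after {max_attempts} attempts")


# Hypothetical usage against a one-click jar started on an alternate port:
# wait_until_ready(lambda: http_probe("http://localhost:8081/rest"))
```

The probe is passed in as a function so the waiting logic can be tried out without a running server.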

Part of the strategy would be to index the Fedora instance into a production MySQL/MariaDB database before starting it up in place of Fedora 5. We could do this by configuring our fcrepo.properties file with the production database connection information but running the one-click jar deployment. The one-click jar uses an embedded Jetty server, but it uses the same default port as Tomcat: 8080. I looked in the Fedora documentation for a way to change the port Jetty uses and didn’t find one, but was again pointed in the right direction by Peter Winckles: the one-click jar accepts a --port flag with the desired port number. This allowed me to run Fedora 5 and 6 in parallel and build the Fedora 6 index before using it with Tomcat.

We’ve been running Fedora 5 with Tomcat 8, but Fedora 6 requires Tomcat 9. This actually worked to my advantage: I could set up Tomcat 9 in a separate directory, and my downtime would be only long enough to shut down Tomcat, repoint the tomcat symlink from our Tomcat 8 directory to the Tomcat 9 one, and restart. (Assuming no errors pop up! 🤞) But first we needed to get our Tomcat 9 and Fedora 6 configured.
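
The link swap itself is a one-liner in a shell, but for the record here is a minimal Python sketch of the same idea on a POSIX system. The directory and link names are hypothetical, and the demo runs entirely in a throwaway temp directory rather than against a real Tomcat install.

```python
import os
import tempfile
from pathlib import Path


def repoint_symlink(link: Path, new_target: Path) -> None:
    """Point `link` at `new_target` by renaming a fresh symlink over it."""
    staging = link.with_name(link.name + ".new")
    if staging.is_symlink():
        staging.unlink()  # clear a leftover from an interrupted earlier run
    os.symlink(new_target, staging)
    os.replace(staging, link)  # the rename swaps the link itself in one step


def demo() -> str:
    """Exercise the swap in a scratch directory; all names are hypothetical."""
    with tempfile.TemporaryDirectory() as scratch:
        root = Path(scratch)
        (root / "apache-tomcat-8").mkdir()
        (root / "apache-tomcat-9").mkdir()
        link = root / "tomcat"
        os.symlink(root / "apache-tomcat-8", link)
        repoint_symlink(link, root / "apache-tomcat-9")
        return Path(os.readlink(link)).name


print(demo())
```

Renaming a new link over the old one means the tomcat path never points at nothing, although with Tomcat stopped during the swap that matters less.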

A quick diff of the two Tomcat config directories showed quite a few boilerplate changes that could be safely ignored. The key was to find the bits we had added to the Tomcat configuration specifically for our webapps. The main thing to carry over was the Syn valve configuration, which lets Islandora issue authenticated requests so that Milliner can perform updates.

As mentioned above, adjusting the fcrepo.properties file is the most important part, and fortunately it is simple. The other bit that needs converting from Fedora 5 to 6 is the namespaces file, which is now in YAML syntax; this is a simple enough conversion.
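
For a sense of what that conversion involves, here is a hedged Python sketch. It assumes the Fedora 5 namespaces file is a CND-style file with one declaration per line in the form <prefix = 'uri'>, and that the Fedora 6 file is a flat YAML mapping of prefix to URI. Both formats are assumptions on my part for this example, so check them against your own files before relying on it.

```python
import re

# Assumed Fedora 5 (CND-style) declaration: <prefix = 'http://example.org/ns#'>
CND_NAMESPACE_RE = re.compile(r"<\s*([A-Za-z_][\w.-]*)\s*=\s*'([^']+)'\s*>")


def cnd_to_yaml(cnd_text: str) -> str:
    """Turn CND namespace declarations into 'prefix: uri' YAML lines."""
    return "".join(
        f"{prefix}: {uri}\n" for prefix, uri in CND_NAMESPACE_RE.findall(cnd_text)
    )


# Two well-known namespaces used as sample input.
sample_cnd = """\
<acl = 'http://www.w3.org/ns/auth/acl#'>
<pcdm = 'http://pcdm.org/models#'>
"""

print(cnd_to_yaml(sample_cnd), end="")
```

The URIs are left unquoted, which is valid YAML here because a # only starts a comment when it follows whitespace.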

With all of this in place, Fedora 6 came up in my test and I could request our resources from it. But Islandora still wasn’t able to update resources. 😕 After checking the Tomcat logs and the Alpaca logs I was led to the Milliner logs. Milliner was throwing errors, but it took some code reading to find the core reason. Milliner has some custom logic that differs between Fedora 5 and 6. But how was Milliner supposed to know which it was working with? There turned out to be an undocumented configuration setting, fcrepo6, that needs to be ‘true’.

With that, I was able to shut down Tomcat, repoint the tomcat symlink to Tomcat 9, and restart Tomcat and Apache (to reload the updated Milliner config). We now have Fedora 6 running in production for our Islandora site. 🎉🎊

The Adventure Continues

I would like to say that all was roses and daisies from there, but two issues have appeared since then.

First, requesting the fcrepo root of the application returns a 500 error. This isn’t critical, as our Fedora is hidden behind the scenes and we don’t need to visit it. All our requests go to the REST endpoint, which works fine. I’ve made an enquiry on the Fedora Slack, but it has borne no fruit yet. I suspect that this has to do with my running the one-click jar (where /rest is the root instead of /fcrepo/rest) and then using that index with Tomcat, but I have no evidence to support it; only my suspicions.

Second, and far more critical, is that Milliner is rejecting my requests to re-index older items. This is because the migration gave all our resources a last-modified date corresponding to the migration. Now when Milliner sees that our record in Drupal is “older” than the record in Fedora, it refuses to update it. We have a lot of resource records where we’ve updated the metadata after the indexing began to fail and before the migration, so we need to get this addressed. There are a few ways to get around this, but I’m tempted to temporarily comment out this check in Milliner while I do some reindexing and then restore it when I’m done. I’m still considering options, so perhaps I’ll have an update on this later.
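
The shape of the problem is easy to show in a few lines. This Python sketch only models the comparison described above; it is not Milliner’s actual code, the ignore_dates switch is a hypothetical stand-in for commenting the check out, and the Fedora timestamp is a made-up post-migration value.

```python
from datetime import datetime


def should_update(drupal_modified: str, fedora_modified: str,
                  ignore_dates: bool = False) -> bool:
    """Accept an update only if the Drupal record is newer than Fedora's copy.

    ignore_dates is a hypothetical bypass, equivalent to disabling the check.
    """
    if ignore_dates:
        return True
    return datetime.fromisoformat(drupal_modified) > datetime.fromisoformat(fedora_modified)


drupal_edit = "2022-03-11T16:36:51+00:00"      # edited before the migration
fedora_stamp = "2022-04-01T00:00:00+00:00"     # made-up date set by the migration

# The Drupal edit predates Fedora's new timestamp, so it gets rejected.
print(should_update(drupal_edit, fedora_stamp))

# With the check bypassed, the same edit goes through.
print(should_update(drupal_edit, fedora_stamp, ignore_dates=True))
```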

But then again, a developer’s work is never done…