Several teams within LSA Technology Services support the Research Museums Center (RMC) by maintaining the digital collections management systems that share records and images of specimens with the public. The RMC holds collections from the Museum of Anthropological Archaeology, the University of Michigan Herbarium, the Museum of Paleontology, the Museum of Zoology, and the Michigan Pathogen Biorepository. Many of these collections use Specify, collections management software from the Specify Collections Consortium, to share their specimens digitally with the public. Specify provides a user interface and schema on top of the database, making it easier for users to manage the collections’ information. Because the specimen data is available digitally, researchers around the world can access much of the information they need without traveling to browse the collections. Thousands of research articles have used the data in these digitally available collections.
Updating Specify
To keep up with the large collections, LSA Technology Services updated the Specify collection management system so that the software and the servers that house the data remain secure and functional. This was a complex project involving seven servers that housed more than 10 million records across nine collections, with roughly 100 users. Because the entire server configuration had to be rebuilt during this process, LSA Technology Services took the opportunity to assess whether the organization of the data and processes on these servers could be improved.
After several months of reviewing all of the functionality and stored data, the team reorganized it in a way that reduced the number of servers from seven to three. LSA Technology Services is now able to update the web-based version of Specify on a more regular schedule and incorporate bug fixes faster. “It makes it easier to update on our end [and] easier to stay current with the Specify version. Previously, we were multiple versions behind [but now] we're able to get the new versions rolled out a lot more quickly,” said Linda Hudson, Systems Administrator, LSA Technology Services Infrastructure. “Sometimes, because our system was so out of date, we could do maybe one to two updates a year; and they would occupy an enormous amount of staff time,” said Garth Holman, Application Systems Analyst, LSA Technology Services Web and Application Development. Specify users might also notice faster software as a result of the server consolidation, because fewer requests have to travel back and forth between servers to get data into the software.
Digitizing Collections
The collections are in different phases of digitizing their specimens. The Museum of Zoology’s vertebrate collection is almost 100% digitized, with a record for every specimen, and the museum is now slowly adding images to each record. Images present a different challenge for LSA Technology Services. The Museum of Zoology is “using, essentially, a CT scanner, a descendant of a medical technology, to scan their specimens. They now have maybe one to two percent of their vertebrate collections scanned, adding up to a size of about 100 terabytes. We're talking about petabytes of data that we're going to need to store and make available,” said Holman.
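To put Holman's estimate in perspective, the short sketch below extrapolates from the figures he quotes. It assumes, purely for illustration, that storage grows in proportion to the share of the collection scanned; the function name and the decimal conversion of 1,000 terabytes per petabyte are choices made for this example, not anything described by LSA Technology Services.

```python
# Back-of-the-envelope projection from the figures quoted above.
# Assumption (not from the article): storage scales linearly with
# the fraction of the vertebrate collection that has been scanned.

SCANNED_TB = 100   # approximate size of the scans so far, in terabytes
TB_PER_PB = 1000   # decimal units: 1 petabyte = 1,000 terabytes


def projected_total_pb(scanned_tb: float, fraction_scanned: float) -> float:
    """Extrapolate full-collection storage, in petabytes."""
    if not 0 < fraction_scanned <= 1:
        raise ValueError("fraction_scanned must be greater than 0 and at most 1")
    return scanned_tb / fraction_scanned / TB_PER_PB


# "One to two percent" scanned gives the two ends of the range.
low = projected_total_pb(SCANNED_TB, 0.02)
high = projected_total_pb(SCANNED_TB, 0.01)
print(f"Projected total: roughly {low:.0f} to {high:.0f} PB")
```

Under that assumption, a fully scanned collection would need somewhere around 5 to 10 petabytes, which is consistent with Holman's description.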
LSA Technology Services partners with the University of Michigan Digital Library to help make all of the information available to the public. One of the servers reviewed as part of the Specify upgrade was the LSA Integrated Publishing Toolkit (IPT) server, which allows the museums to share their records with the world. The records are then distributed to national and international data-consolidation sites, like iDigBio, Symbiota, and the Global Biodiversity Information Facility (GBIF). Each specimen’s data record is exported from Specify, and the library provides permanent links and archives for the images. Those pieces of information are combined in the IPT so that when someone accesses the specimen on GBIF, for example, it can be viewed as one record with all of the information displayed.
The newest collection added to the database is the Michigan Pathogen Biorepository (M-PABI). Multiple divisions across the university are contributing to this research biorepository to provide a lending library of tissues. LSA Technology Services is supporting both the data and the sharing of that data. A perk of using Specify for this type of data is that users can mark which portions of the data are restricted from public view, especially with regard to legally protected species, controlled substances, or pathogens. A researcher might collect a bat specimen that also has mites, but after studying their DNA, they realize there’s also a pathogen present. So now the specimen isn’t simply a bat; it’s also a mite and a pathogen, all collected at the same time and place. Additionally, all of the images that come from that one original specimen need to be documented and linked together so researchers can easily access them. Being able to assist researchers in mapping out specimens that carry zoonotic pathogens, like the virus that causes COVID-19, is essential to understanding the next zoonotic disease, in which a pathogen passes from an animal and spreads to humans.
If you’d like to learn more about the vast amount of data available to the public as a result of digitizing the RMC’s collections, you can visit any of the links below to the GBIF sites.