Who lives in a pineapple under the sea? SpongeBob SquarePants! And his evil nemesis, Sheldon J. Plankton. Cartoons aside, Yale researchers recently constructed the most extensive publicly available archive of planktonic foraminifera, tiny single-celled organisms, using a technique they developed from scratch. The six faculty members from Yale’s Geology and Geophysics Department published their study in the journal Scientific Data on Aug. 28.

Planktonic foraminifera, or forams for short, are shelled protists and the most common type of microfossil in the sea. Past studies have used a technique called computed tomography imaging to scan fossils and compile data reports, but the researchers cut time and effort by creating their own scanning program called AutoMorph. Despite the abundance of forams in the fossil record, many questions about them remain unanswered.

“Functional morphology is not well understood,” said geology and geophysics professor Leanne Elder, first author of the study. “The hope is that by having such a huge data set, we can understand what the importance of shape is in them since they have such a good fossil record.”

Elder and her colleagues set out to create this data set. To do so, they collected microfossils and scanned and processed them using the technique they developed alongside the Yale Center for Research Computing.

The group collected planktonic foraminifera in the North and South Atlantic oceans. They then placed the marine objects on slides — about 1,000 per slide — and scanned them with a microscope.

Kaylea Nelson, a computational research support analyst at the Yale Center for Research Computing, coordinated the technical efforts for the team. The center is home to Grace, one of Yale’s five high-performance computing clusters, or supercomputers. Nelson said that the additional nodes and memory available on Grace allow it to solve problems that are too large to fit on a laptop.

Nelson helped the team develop the AutoMorph program, whose name stands for automated morphometric post-processing. AutoMorph is the preprocessing step that prepares data for the analysis, she said.

AutoMorph extracts 2D and 3D information based on four categories: object selection, shape extraction, size measurements and object classification.

“It is new to have 3D data and imaging because they’re such large files,” Elder said, adding that she is a biologist by training and wanted the study to begin to bridge the fields of ancient geology and modern biology.

Elder said her colleague and co-author Sara Kahanamoku has used a similar technique to collect data on limpets, which are slightly larger than foraminifera. She added, however, that Kahanamoku has compiled only 2D, not 3D, data. With the publication of the recent report, the group hopes to apply its methods to limpets and other species, specifically molluscs.

In total, the report documents 124,000 objects in an open-access data set that is currently available on Zenodo, an online open data repository. The specimens photographed for the report are currently housed at the Yale Peabody Museum of Natural History.

“We’re hoping other people use this data set for things we haven’t thought of in understanding morphology,” Elder said. “The way to progress data is to be open with these things and share it so others can compare and use your data for things you might not even have thought of.”

The supercomputer Grace at the Yale Center for Research Computing is named after Grace Hopper GRD ’34.

Samuel Turner | samuel.turner@yale.edu
