This content will become publicly available on September 1, 2026

Title: Foram3D: A pipeline for 3D synthetic data generation and rendering of foraminifera for image analysis and reconstruction
Foraminifera play an important role in oceanographic and paleoceanographic research. The test morphology and chemistry within species, as well as the presence or absence of certain species, are affected by environmental conditions. Classification of different species of foraminifera is a crucial yet tedious task for researchers. Deep-learning approaches can help with morphological studies and aid in species classification; however, they require large-scale datasets that are challenging to obtain and annotate because of the extremely small size and delicate handling of these microorganisms. In this work, we expand on an existing mathematical model for foraminifera shell growth to generate 3D synthetic models to aid in these studies. We define parameter spaces for the model which are intended to approximate seven randomly chosen foraminifera taxa. Along with providing an open-source code base to support other researchers in generating models and studying growth patterns, we further extend the synthetic data generation to include a rendering component that mimics two existing robotic imaging systems. We provide two use cases for our synthetic dataset. First, we show how orientation can affect the automated classification of different species and how incorporating aleatoric uncertainty indicators can help select the next views of the samples to significantly improve classification accuracy from 82% to 89%. Next, we show how a sparse set of synthetic 2D images can be used to extract 3D morphology of foraminifera using Neural Radiance Fields (NeRFs).
Award ID(s):
2411214
PAR ID:
10620705
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Elsevier
Date Published:
Journal Name:
Marine Micropaleontology
Volume:
200
Issue:
C
ISSN:
0377-8398
Page Range / eLocation ID:
102486
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Fossil single-celled marine organisms known as foraminifera are widely used in oceanographic research. The identification of species is one of the most common tasks when analyzing ocean samples. One of the primary criteria for species identification is their morphology. Automatic segmentation of images of foraminifera would aid in the identification task as well as in other morphological studies. We pose this problem as an edge detection task for which capturing the correct topological structure is essential. Due to the presence of soft edges and even unclosed segments, state-of-the-art techniques have problems capturing the correct edge structure. Standard pixel-based loss functions are also sensitive to small deformations and shifts of the edges, penalizing location more heavily than actual structure. Hence, we propose a homology-based detector of local structural difference between two edge maps with a tolerable deformation. This detector is employed as a new criterion for the training and design of data-driven approaches that focus on enhancing these structural differences. Our approaches demonstrate significant improvement on morphological segmentation of foraminifera when considering region-based and topology-based metrics. Human ranking of the quality of the results by marine researchers also supports these findings.
  2. Topological data analysis (TDA) is a branch of computational mathematics, bridging algebraic topology and data science, that provides compact, noise-robust representations of complex structures. Deep neural networks (DNNs) learn millions of parameters associated with a series of transformations defined by the model architecture resulting in high-dimensional, difficult to interpret internal representations of input data. As DNNs become more ubiquitous across multiple sectors of our society, there is increasing recognition that mathematical methods are needed to aid analysts, researchers, and practitioners in understanding and interpreting how these models' internal representations relate to the final classification. In this paper we apply cutting edge techniques from TDA with the goal of gaining insight towards interpretability of convolutional neural networks used for image classification. We use two common TDA approaches to explore several methods for modeling hidden layer activations as high-dimensional point clouds, and provide experimental evidence that these point clouds capture valuable structural information about the model's process. First, we demonstrate that a distance metric based on persistent homology can be used to quantify meaningful differences between layers and discuss these distances in the broader context of existing representational similarity metrics for neural network interpretability. Second, we show that a mapper graph can provide semantic insight as to how these models organize hierarchical class knowledge at each layer. These observations demonstrate that TDA is a useful tool to help deep learning practitioners unlock the hidden structures of their models. 
  3. Computed tomography (CT) scanning and other high‐throughput three‐dimensional (3D) visualization tools are transforming the ways we study morphology, ecology and evolutionary biology research beyond generating vast digital repositories of anatomical data. Contrast‐enhanced chemical staining methods, which render soft tissues radio‐opaque when coupled with CT scanning, encompass several approaches that are growing in popularity and versatility. Of these, the various diceCT techniques that use an iodine‐based solution like Lugol's have provided access to an array of morphological data sets spanning extant vertebrate lineages. This contribution outlines straightforward means for applying diceCT techniques to preserved museum specimens of cartilaginous and bony fishes, collectively representing half of vertebrate species diversity. This study contrasts the benefits of using either aqueous or ethylic Lugol's solutions and reports few differences between these methods with respect to the time required to achieve optimal tissue contrast. It also explores differences in minimum stain duration required for different body sizes and shapes and provides recommendations for staining specimens individually or in small batches. As reported by earlier studies, the authors note a decrease in pH during staining with either aqueous or ethylic Lugol's. Nonetheless, they could not replicate the drastic declines in pH reported elsewhere. They provide recommendations for researchers and collections staff on how to incorporate diceCT into existing curatorial practices, while offsetting risk to specimens. Finally, they outline how diceCT with Lugol's can aid ichthyologists of all kinds in visualizing anatomical structures of interest: from brains and gizzards to gas bladders and pharyngeal jaw muscles.
  4. Background: Fluorescence image analysis in biochemical science often involves the complex tasks of identifying samples for analysis and calculating the desired information from the intensity traces. Analyzing giant unilamellar vesicles (GUVs) is one of these tasks. Researchers need to identify many vesicles to statistically analyze the degree of molecular interaction or state of molecular organization on the membranes. This analysis is complicated, requiring a careful manual examination by researchers, so automating the analysis can significantly aid in improving its efficiency and reliability. Results: We developed a convolutional neural network (CNN) assisted intelligent analysis routine based on the whole 3D z-stack images. The programs identify the vesicles with desired morphology and analyze the data automatically. The programs can perform protein binding analysis on the membranes or state decision analysis of domain phase separation. We also show that the method can easily be applied to similar problems, such as intensity analysis of phase-separated protein droplets. The CNN-based classification approach enables the identification of vesicles even from relatively complex samples. We demonstrate that the proposed artificial intelligence-assisted classification can further enhance the accuracy of the analysis close to the performance of manual examination in vesicle selection and vesicle state determination analysis. Conclusions: We developed MATLAB-based software capable of efficiently analyzing confocal fluorescence image data of giant unilamellar vesicles. The program can automatically identify GUVs with desired morphology and perform intensity-based calculation and state decision for each vesicle. We expect our method of CNN implementation can be expanded and applied to many similar problems in image data analysis.
  5. D'Andrea, Rafael (Ed.)
    Data on the three-dimensional shape of organismal morphology is becoming increasingly available, and forms part of a new revolution in high-throughput phenomics that promises to help understand ecological and evolutionary processes that influence phenotypes at unprecedented scales. However, in order to meet the potential of this revolution we need new data analysis tools to deal with the complexity and heterogeneity of large-scale phenotypic data such as 3D shapes. In this study we explore the potential of generative Artificial Intelligence to help organize and extract meaning from complex 3D data. Specifically, we train a deep representational learning method known as DeepSDF on a dataset of 3D scans of the bills of 2,020 bird species. The model is designed to learn a continuous vector representation of 3D shapes, along with a 'decoder' function that allows the transformation from this vector space to the original 3D morphological space. We find that this approach successfully learns coherent representations: particular directions in latent space are associated with discernible morphological meaning (such as elongation, flattening, etc.). More importantly, learned latent vectors have ecological meaning, as shown by their ability to predict the trophic niche of the bird each bill belongs to with a high degree of accuracy. Unlike existing 3D morphometric techniques, this method has very few requirements for human-supervised tasks such as landmark placement, increasing its accessibility to labs with fewer labour resources. It makes fewer strong assumptions than alternative dimension reduction techniques such as PCA. Once trained, the model can make 3D morphology predictions from latent vectors at very low computational cost. The trained model has been made publicly available and can be used by the community, including for finetuning on new data, representing an early step toward developing shared, reusable AI models for analyzing organismal morphology.