Abstract The morphology and morphodynamics of cells as important biomarkers of the cellular state are widely appreciated in both fundamental research and clinical applications. Quantification of cell morphology often requires a large number of geometric measures that form a high-dimensional feature vector. This mathematical representation creates barriers to communicating, interpreting, and visualizing data. Here, we develop a deep learning-based algorithm to project 13-dimensional (13D) morphological feature vectors into 2-dimensional (2D) morphological latent space (MLS). We show that the projection has less than 5% information loss and separates the different migration phenotypes of metastatic breast cancer cells. Using the projection, we demonstrate the phenotype-dependent motility of breast cancer cells in the 3D extracellular matrix, and the continuous cell state change upon drug treatment. We also find that dynamics in the 2D MLS quantitatively agrees with the morphodynamics of cells in the 13D feature space, preserving the diffusive power and the Lyapunov exponent of cell shape fluctuations even though the dimensional reduction projection is highly nonlinear. Our results suggest that MLS is a powerful tool to represent and understand the cell morphology and morphodynamics.
This content will become publicly available on March 17, 2026
Generative AI extracts ecological meaning from the complex three dimensional shapes of bird bills
Data on the three dimensional shape of organismal morphology is becoming increasingly available, and forms part of a new revolution in high-throughput phenomics that promises to help understand ecological and evolutionary processes that influence phenotypes at unprecedented scales. However, in order to meet the potential of this revolution we need new data analysis tools to deal with the complexity and heterogeneity of large-scale phenotypic data such as 3D shapes. In this study we explore the potential of generative Artificial Intelligence to help organize and extract meaning from complex 3D data. Specifically, we train a deep representational learning method known as DeepSDF on a dataset of 3D scans of the bills of 2,020 bird species. The model is designed to learn a continuous vector representation of 3D shapes, along with a 'decoder' function that allows the transformation from this vector space to the original 3D morphological space. We find that this approach successfully learns coherent representations: particular directions in latent space are associated with discernible morphological meaning (such as elongation, flattening, etc.). More importantly, learned latent vectors have ecological meaning, as shown by their ability to predict the trophic niche of the bird each bill belongs to with a high degree of accuracy. Unlike existing 3D morphometric techniques, this method has very few requirements for human-supervised tasks such as landmark placement, increasing its accessibility to labs with fewer labour resources. It has fewer strong assumptions than alternative dimension reduction techniques such as PCA. Once trained, 3D morphology predictions can be made from latent vectors at very low computational cost. The trained model has been made publicly available and can be used by the community, including for finetuning on new data, representing an early step toward developing shared, reusable AI models for analyzing organismal morphology.
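The idea of a decoder that maps a latent vector back to a 3D shape can be illustrated with a toy signed distance function (SDF). The sketch below is not the DeepSDF network described in the abstract: it assumes a hypothetical one-number latent code (a sphere radius) and an analytic decoder, purely to show how a latent code plus a query point yields a signed distance whose zero level set is the surface. The names `toy_decoder` and `is_inside` are illustrative.

```python
import math

def toy_decoder(z, point):
    """Toy stand-in for a learned decoder (assumption, not DeepSDF itself).

    z      -- latent code; here a 1-tuple holding a sphere radius
    point  -- (x, y, z) query location in 3D space
    Returns the signed distance: negative inside the shape, positive outside.
    """
    radius = z[0]
    distance_from_origin = math.sqrt(sum(c * c for c in point))
    return distance_from_origin - radius

def is_inside(z, point):
    """A point belongs to the shape when its signed distance is negative."""
    return toy_decoder(z, point) < 0.0

# Two latent codes describe two different shapes; the same query point
# is outside the small shape and inside the large one.
z_small = (0.5,)
z_large = (1.0,)
query = (0.0, 0.75, 0.0)

d_small = toy_decoder(z_small, query)
d_large = toy_decoder(z_large, query)
```

In a learned model the analytic formula would be replaced by a trained network and the latent code would have many dimensions, but the interface (latent code and query point in, signed distance out) is the same.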
- Award ID(s): 2329701
- PAR ID: 10583024
- Editor(s): D'Andrea, Rafael
- Publisher / Repository: Public Library of Science
- Date Published:
- Journal Name: PLOS Computational Biology
- Volume: 21
- Issue: 3
- ISSN: 1553-7358
- Page Range / eLocation ID: e1012887
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Data-driven generative design (DDGD) methods utilize deep neural networks to create novel designs based on existing data. The structure-aware DDGD method can handle complex geometries and automate the assembly of separate components into systems, showing promise in facilitating creative designs. However, determining the appropriate vectorized design representation (VDR) to evaluate 3D shapes generated from the structure-aware DDGD model remains largely unexplored. To that end, we conducted a comparative analysis of surrogate models’ performance in predicting the engineering performance of 3D shapes using VDRs from two sources: the trained latent space of structure-aware DDGD models encoding structural and geometric information and an embedding method encoding only geometric information. We conducted two case studies: one involving 3D car models focusing on drag coefficients and the other involving 3D aircraft models considering both drag and lift coefficients. Our results demonstrate that using latent vectors as VDRs can significantly degrade surrogate models’ predictions. Moreover, increasing the dimensionality of the VDRs in the embedding method may not necessarily improve the prediction, especially when the VDRs contain more information irrelevant to the engineering performance. Therefore, when selecting VDRs for surrogate modeling, the latent vectors obtained from training structure-aware DDGD models must be used with caution, although they are more accessible once training is complete. Attention should be paid to the underlying physics associated with the engineering performance. This paper provides empirical evidence for the effectiveness of different types of VDRs of structure-aware DDGD for surrogate modeling, thus facilitating the construction of better surrogate models for AI-generated designs.
-
Word embeddings, which represent words as dense feature vectors, are widely used in natural language processing. In their seminal paper on word2vec, Mikolov and colleagues showed that a feature space created by training a word prediction network on a large text corpus will encode semantic information that supports analogy by vector arithmetic, e.g., "king" minus "man" plus "woman" equals "queen". To help novices appreciate this idea, people have sought effective graphical representations of word embeddings. We describe a new interactive tool for visually exploring word embeddings. Our tool allows users to define semantic dimensions by specifying opposed word pairs, e.g., gender is defined by pairs such as boy/girl and father/mother, and age by pairs such as father/son and mother/daughter. Words are plotted as points in a zoomable and rotatable 3D space, where the third “residual” dimension encodes distance from the hyperplane defined by all the opposed word vectors with age and gender subtracted out. Our tool allows users to visualize vector analogies, drawing the vector from “king” to “man” and a parallel vector from “woman” to “king-man+woman”, which is closest to “queen”. Visually browsing the embedding space and experimenting with this tool can make word embeddings more intuitive. We include a series of experiments teachers can use to help K-12 students appreciate the strengths and limitations of this representation.
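The analogy-by-vector-arithmetic idea can be sketched in a few lines. The example below assumes hand-made 2D toy vectors (one "royalty" axis, one "gender" axis) rather than real word2vec embeddings, and the helper names `cosine` and `analogy` are illustrative, not part of the tool described above. It computes "king" minus "man" plus "woman" and returns the nearest remaining word by cosine similarity.

```python
import math

# Hand-made 2-D toy vectors (royalty, gender). These are an assumption for
# illustration only; real word2vec embeddings have hundreds of dimensions.
toy_vectors = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "prince": (0.8, 0.9),
}

def cosine(a, b):
    """Cosine similarity between two non-zero 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(a, b, c, vectors):
    """Return the word closest to vectors[a] - vectors[b] + vectors[c].

    The three input words are excluded from the candidates, as is
    conventional when evaluating word analogies.
    """
    target = tuple(
        x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

result = analogy("king", "man", "woman", toy_vectors)
print(result)
```

With these toy vectors the target point is (1.0, -1.0), which coincides with "queen"; with real embeddings the match is only approximate, which is why a nearest-neighbor search is used.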
-
Since its introduction to North America in 1999, the West Nile virus (WNV) has resulted in over 50,000 human cases and 2400 deaths. WNV transmission is maintained via mosquito vectors and avian reservoir hosts, yet mosquito and avian infections are not uniform across ecological landscapes. As a result, it remains unclear whether the ecological communities of the vectors or reservoir hosts are more predictive of zoonotic risk at the microhabitat level. We examined this question in central Iowa, representative of the midwestern United States, across a land use gradient consisting of suburban interfaces with natural and agricultural habitats. At eight sites, we captured mosquito abundance data using New Jersey light traps and monitored bird communities using visual and auditory point count surveys. We found that the mosquito minimum infection rate (MIR) was better predicted by metrics of the mosquito community than metrics of the bird community, where sites with higher proportions of Culex pipiens group mosquitoes during late summer (after late July) showed higher MIRs. Bird community metrics did not significantly influence mosquito MIRs across sites. Together, these data suggest that the microhabitat suitability of Culex vector species is of greater importance than avian community composition in driving WNV infection dynamics at the urban and agricultural interface.
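The abstract does not state how the MIR was computed. Under the conventional surveillance definition (number of positive pools divided by the number of mosquitoes tested, expressed per 1,000), it can be sketched as follows. The function name and the pool counts are hypothetical and are not data from the study.

```python
def minimum_infection_rate(positive_pools, mosquitoes_tested, per=1000):
    """Conventional MIR: positive pools per `per` mosquitoes tested.

    Assumes each positive pool contains at least one infected mosquito,
    which is why the result is a *minimum* estimate of infection.
    """
    if mosquitoes_tested <= 0:
        raise ValueError("mosquitoes_tested must be positive")
    if positive_pools < 0:
        raise ValueError("positive_pools must be non-negative")
    return per * positive_pools / mosquitoes_tested

# Hypothetical example: 3 positive pools among 1,500 mosquitoes tested.
mir = minimum_infection_rate(3, 1500)
print(mir)
```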
