skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Rapid and Precise Topological Comparison with Merge Tree Neural Networks
Merge trees are a valuable tool in the scientific visualization of scalar fields; however, current methods for merge tree comparisons are computationally expensive, primarily due to the exhaustive matching between tree nodes. To address this challenge, we introduce the Merge Tree Neural Network (MTNN), a learned neural network model designed for merge tree comparison. The MTNN enables rapid and high-quality similarity computation. We first demonstrate how to train graph neural networks, which emerged as effective encoders for graphs, in order to to produce embeddings of merge trees in vector spaces for efficient similarity comparison. Next, we formulate the novel MTNN model that further improves the similarity comparisons by integrating the tree and node embeddings with a new topological attention mechanism. We demonstrate the effectiveness of our model on real-world data in different domains and examine our model's generalizability across various datasets. Our experimental analysis demonstrates our approach's superiority in accuracy and efficiency. In particular, we speed up the prior state-of-the-art by more than 100x on the benchmark datasets while maintaining an error rate below 0.1%.  more » « less
Award ID(s):
2046730 2107434
PAR ID:
10562513
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM
Date Published:
Journal Name:
IEEE Transactions on Visualization and Computer Graphics
Volume:
31
Issue:
1
ISSN:
1077-2626
Page Range / eLocation ID:
1322 to 1332
Subject(s) / Keyword(s):
Computational topology, merge trees, graph neural networks
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We present a novel approach to perform instance segmentation and counting for densely packed self-similar trees using a top-view RGB image sequence. We propose a solution that leverages pixel content, shape, and self-occlusion. First, we perform an initial over-segmentation of the image sequence and aggregate structural characteristics into a contour graph with temporal information incorporated. Second, using a graph convolutional network and its inherent local messaging passing abilities, we merge adjacent tree crown patches into a final set of tree crowns. Per various studies and comparisons, our method is superior to all prior methods and results in high-accuracy instance segmentation and counting despite the trees being tightly packed. Finally, we provide various forest image sequence datasets suitable for subsequent benchmarking and evaluation captured at different altitudes and leaf conditions. 
    more » « less
  2. We present a novel approach to perform instance segmentation and counting for densely packed self-similar trees using a top-view RGB image sequence. We propose a solution that leverages pixel content, shape, and self-occlusion. First, we perform an initial over-segmentation of the image sequence and aggregate structural characteristics into a contour graph with temporal information incorporated. Second, using a graph convolutional network and its inherent local messaging passing abilities, we merge adjacent tree crown patches into a final set of tree crowns. Per various studies and comparisons, our method is superior to all prior methods and results in high-accuracy instance segmentation and counting despite the trees being tightly packed. Finally, we provide various forest image sequence datasets suitable for subsequent benchmarking and evaluation captured at different altitudes and leaf conditions. 
    more » « less
  3. Abstract Despite the ubiquitous use of statistical models for phylogenomic and population genomic inferences, this model-based rigor is rarely applied to post hoc comparison of trees. In a recent study, Garba et al. derived new methods for measuring the distance between two gene trees computed as the difference in their site pattern probability distributions. Unlike traditional metrics that compare trees solely in terms of geometry, these measures consider gene trees and associated parameters as probabilistic models that can be compared using standard information theoretic approaches. Consequently, probabilistic measures of phylogenetic tree distance can be far more informative than simply comparisons of topology and/or branch lengths alone. However, in their current form, these distance measures are not suitable for the comparison of species tree models in the presence of gene tree heterogeneity. Here, we demonstrate an approach for how the theory of Garba et al. (2018), which is based on gene tree distances, can be extended naturally to the comparison of species tree models. Multispecies coalescent (MSC) models parameterize the discrete probability distribution of gene trees conditioned upon a species tree with a particular topology and set of divergence times (in coalescent units), and thus provide a framework for measuring distances between species tree models in terms of their corresponding gene tree topology probabilities. We describe the computation of probabilistic species tree distances in the context of standard MSC models, which assume complete genetic isolation postspeciation, as well as recent theoretical extensions to the MSC in the form of network-based MSC models that relax this assumption and permit hybridization among taxa. We demonstrate these metrics using simulations and empirical species tree estimates and discuss both the benefits and limitations of these approaches. We make our species tree distance approach available as an R package called pSTDistanceR, for open use by the community. 
    more » « less
  4. Many algorithms for virtual tree generation exist, but the visual realism of the 3D models is unknown. This problem is usually addressed by performing limited user studies or by a side-by-side visual comparison. We introduce an automated system for realism assessment of the tree model based on their perception. We conducted a user study in which 4,000 participants compared over one million pairs of images to collect subjective perceptual scores of a large dataset of virtual trees. The scores were used to train two neural-network-based predictors. A view independent ICTreeF uses the tree model's geometric features that are easy to extract from any model. The second is ICTreeI that estimates the perceived visual realism of a tree from its image. Moreover, to provide an insight into the problem, we deduce intrinsic attributes and evaluate which features make trees look like real trees. In particular, we show that branching angles, length of branches, and widths are critical for perceived realism. We also provide three datasets: carefully curated 3D tree geometries and tree skeletons with their perceptual scores, multiple views of the tree geometries with their scores, and a large dataset of images with scores suitable for training deep neural networks. 
    more » « less
  5. The recent rapid development in Natural Language Processing (NLP) has greatly en- hanced the effectiveness of Intelligent Tutoring Systems (ITS) as tools for healthcare education. These systems hold the potential to improve health-related quality of life (HRQoL) outcomes, especially for populations with limited English reading and writing skills. However, despite the progress in pre-trained multilingual NLP models, there exists a noticeable research gap when it comes to code-switching within the medical context. Code-switching is a prevalent phenomenon in multilingual communities where individuals seamlessly transition between languages during conversations. This presents a distinctive challenge for healthcare ITS aimed at serving multilin- gual communities, as it demands a thorough understanding of and accurate adaptation to code- switching, which has thus far received limited attention in research. The hypothesis of our work asserts that the development of an ITS for healthcare education, culturally appropriate to the Hispanic population with frequent code-switching practices, is both achievable and pragmatic. Given that text classification is a core problem to many tasks in ITS, like sentiment analysis, topic classification, and smart replies, we target text classification as the application domain to validate our hypothesis. Our model relies on pre-trained word embeddings to offer rich representations for understand- ing code-switching medical contexts. However, training such word embeddings, especially within the medical domain, poses a significant challenge due to limited training corpora. In our approach to address this challenge, we identify distinct English and Spanish embeddings, each trained on medical corpora, and subsequently merge them into a unified vector space via space transforma- tion. In our study, we demonstrate that singular value decomposition (SVD) can be used to learn a linear transformation (a matrix), which aligns monolingual vectors from two languages in a single meta-embedding. As an example, we assessed the similarity between the words “cat” and “gato” both before and after alignment, utilizing the cosine similarity metric. Prior to alignment, these words exhibited a similarity score of 0.52, whereas after alignment, the similarity score increased to 0.64. This example illustrates that aligning the word vectors in a meta-embedding enhances the similarity between these words, which share the same meaning in their respective languages. To assess the quality of the representations in our meta-embedding in the context of code-switching, we employed a neural network to conduct text classification tasks on code-switching datasets. Our results demonstrate that, compared to pre-trained multilingual models, our model can achieve high performance in text classification tasks while utilizing significantly fewer parameters. 
    more » « less