Search for: All records

Creators/Authors contains: "Solís‐Lemus, Claudia"

  1. Abstract Phylogenetic networks encode a broader picture of evolution by the inclusion of reticulate processes such as hybridization, introgression, or horizontal gene transfer. Each hybridization event is represented by a ‘hybridization cycle’. Here, we investigate the statistical identifiability of the position of the hybrid node in a 4-node hybridization cycle in a semi-directed level-1 phylogenetic network. That is, we investigate if our model is able to detect the correct placement of the hybrid node in the hybridization cycle using quartet concordance factors as data. In the current study, we prove that the correct placement of the hybrid node in 4-node hybridization cycles, included in level-1 phylogenetic networks, is generically identifiable if the assumptions are non-restrictive such as t∈(0,∞) for all branch (or edge) lengths and γ∈(0,1) for the inheritance probability of the hybrid edges. However, simulations show that accurate detection of these cycles can be complicated by inadequate sampling, small sample size, or gene tree estimation error. We identify practical advice for evolutionary biologists on best sampling strategies to improve the detection of this type of hybridization cycle. 
  2. Abstract Microbial networks offer critical insights into community structure, ecological interactions and host–microbe dynamics. However, constructing reliable microbiome networks remains challenging due to variability among existing inference methods, limited overlap between inferred networks and the absence of a gold standard (a universally accepted reference for benchmarking) for validation. We developed CMiNet, an R package and interactive Shiny App (https://cminet.wid.wisc.edu) that enables consensus microbiome network construction by integrating up to 10 widely used inference algorithms. CMiNet supports both correlation‐based and conditional dependence‐based methods and provides users with flexible options to construct individual or consensus networks across different approaches. CMiNet integrates results from multiple inference methods through a voting strategy that retains edges supported by a user‐defined number of methods. To assess robustness, we complement this with a bootstrap analysis that quantifies edge stability under resampling. By jointly reporting method support and bootstrap confidence, CMiNet provides a reproducible framework that explicitly communicates both agreement across methods and stability under perturbation. We applied CMiNet to gut and soil microbiome datasets, constructing consensus networks that retained edges supported by multiple methods and confirmed by bootstrap reproducibility values. To identify disease‐associated taxa, we developed an integrative strategy that compared results across machine learning, differential abundance and network‐based approaches, ensuring that selected taxa were consistently recovered across methods. In the soil dataset, this analysis highlighted key taxa such as Ktedonobacteria, Acidobacteriae, Vicinamibacteria, MB‐A2‐108, Ignavibacteria and Anaerolineae, all of which were confirmed by multiple independent strategies.
    Free, publicly-accessible full text available January 1, 2027
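The voting strategy described in entry 2 can be illustrated with a minimal Python sketch. This is not CMiNet's code (CMiNet is an R package); the function name `consensus_edges`, the method labels, and the taxon names are hypothetical, and the sketch assumes that "supported by a user-defined number of methods" means "reported by at least that many methods".

```python
from collections import Counter


def consensus_edges(edge_lists, min_votes):
    """Return undirected edges reported by at least `min_votes` methods.

    edge_lists: iterable of edge collections, one collection per method.
    Each edge is a pair of node names; (a, b) and (b, a) count as the same edge.
    """
    votes = Counter()
    for edges in edge_lists:
        # Deduplicate within a method so each method votes at most once per edge.
        votes.update({frozenset(edge) for edge in edges})
    return {tuple(sorted(edge)): n for edge, n in votes.items() if n >= min_votes}


# Hypothetical output of three inference methods on four taxa.
methods = {
    "method_a": [("taxon1", "taxon2"), ("taxon2", "taxon3")],
    "method_b": [("taxon2", "taxon1"), ("taxon3", "taxon4")],
    "method_c": [("taxon1", "taxon2"), ("taxon2", "taxon3")],
}

consensus = consensus_edges(methods.values(), min_votes=2)
for edge, n in sorted(consensus.items()):
    print(edge, "supported by", n, "methods")
```

Raising `min_votes` trades sensitivity for agreement: with `min_votes=3` only the edge found by every method survives.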
  3. Reticulate evolution has long been recognized as a key mechanism that contributes to genetic and trait diversity. With the widespread availability of genomic data, investigating historical reticulate evolution across taxa has gained significant attention, driven by the rapid development of statistical methods for detecting nontreelike patterns. Phylogenetic networks provide a biologically intuitive approach to depicting evolutionary processes such as hybrid speciation and introgressive hybridization, which result in signatures of historical gene flow. Interpreting phylogenetic networks is especially critical for groups of conservation concern that lack reference genome resources and explicit hypotheses from prior investigation, such as those based on molecular data, morphology, or species distributions. Here, we highlight recent advances in computational methods for inferring networks from genome-scale data and offer guidelines for deriving biological insights from phylogenetic networks. Particular emphasis is placed on modeling hybridization and whole-genome duplication in the context of allopolyploidization. Practical recommendations for empirical studies and the limitations of commonly used methods are discussed throughout. We anticipate that phylogenetic networks will influence conservation biology and biodiversity research, emphasizing the need for careful consideration of reticulate evolution inferred from these networks in the near future. Networks will accelerate other pressing avenues of biodiversity research, especially investigations of orphan crops and climate change resilience in natural systems. The promise of phylogenetic networks connects with broader themes in the special feature Monitoring and restoring gene flow in the increasingly fragmented ecosystems of the Anthropocene by providing an emerging probabilistic framework for inferring historical connectivity between species and populations. 
  4. Symbiotic relationships shape the evolution of organisms. Fungi in the genus Escovopsis share an evolutionary history with the fungus-growing “attine” ant system and are only found in association with these social insects. Despite this close relationship, there are key aspects of Escovopsis evolution that remain poorly understood. To gain further insight into the evolutionary history of these unique fungi, we delve deeper into Escovopsis’ origin and distribution, considering the largest sampling, so far, across the Americas. Furthermore, we investigate Escovopsis’ trait evolution, and relationship with attine ants. We demonstrate that, while the genus originated approximately 56.9 Mya, it only became associated with 'higher attine' ants in the last 38 My. Our results, however, indicate that it is likely that the ancestor of Escovopsis lived in symbiosis with early-diverging fungus-growing ants. Since then, the fungi have evolved morphological and physiological adaptations that have increased their reproductive efficiency, possibly to overcome barriers mounted by the ants and their other associated microbes. Taken together, these results provide new clues as to how Escovopsis has evolved within the context of this complex symbiosis and shed light on the evolutionary history of the fungus-growing ant system. 
    Free, publicly-accessible full text available December 1, 2026
  5. This special collection includes topics related to the development of novel methods for reconstructing phylogenetic networks from different mathematical, statistical, and computational approaches that highlight the challenges of network reconstruction and the needs of contemporary genomic data. In addition, the collection broadcasts diverse applications of phylogenetic networks on a wide variety of organisms across the Tree of Life. 
    Free, publicly-accessible full text available April 16, 2026
  6. Birtwistle, Marc R (Ed.)
    High-dimensional mixed-effects models are an increasingly important form of regression in which the number of covariates rivals or exceeds the number of samples, which are collected in groups or clusters. The penalized likelihood approach to fitting these models relies on a coordinate descent algorithm that lacks guarantees of convergence to a global optimum. Here, we empirically study the behavior of this algorithm on simulated and real examples of three types of data that are common in modern biology: transcriptome, genome-wide association, and microbiome data. Our simulations provide new insights into the algorithm’s behavior in these settings, and, comparing the performance of two popular penalties, we demonstrate that the smoothly clipped absolute deviation (SCAD) penalty consistently outperforms the least absolute shrinkage and selection operator (LASSO) penalty in terms of both variable selection and estimation accuracy across omics data. To empower researchers in biology and other fields to fit models with the SCAD penalty, we implement the algorithm in a Julia package, HighDimMixedModels.jl.
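To make the comparison of the two penalties in entry 6 concrete, here is a small Python sketch of the standard textbook definitions of the LASSO and SCAD penalties for a single coefficient. It is not taken from HighDimMixedModels.jl (a Julia package); the function names are hypothetical, and the default `a = 3.7` is the commonly used value rather than something stated in the abstract.

```python
def lasso_penalty(beta, lam):
    """LASSO penalty: grows linearly with |beta| without bound."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty: matches LASSO near zero, then flattens to a constant."""
    if a <= 2:
        raise ValueError("a must be greater than 2")
    b = abs(beta)
    if b <= lam:
        # Small coefficients are penalized exactly like LASSO.
        return lam * b
    if b <= a * lam:
        # Quadratic transition region between the linear and constant parts.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Large coefficients all receive the same constant penalty.
    return lam * lam * (a + 1) / 2


for beta in (0.5, 2.0, 10.0):
    print(beta, lasso_penalty(beta, 1.0), scad_penalty(beta, 1.0))
```

Because the SCAD penalty stops growing once |beta| exceeds a·lam, large coefficients are not shrunk further, which is the usual explanation for its lower estimation bias relative to LASSO.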
  7. Abstract Motivation: The abundance of gene flow in the Tree of Life challenges the notion that evolution can be represented with a fully bifurcating process which cannot capture important biological realities like hybridization, introgression, or horizontal gene transfer. Coalescent-based network methods are increasingly popular, yet not scalable for big data, because they need to perform a heuristic search in the space of networks as well as numerical optimization that can be NP-hard. Here, we introduce a novel method to reconstruct phylogenetic networks based on algebraic invariants. While there is a long tradition of using algebraic invariants in phylogenetics, our work is the first to define phylogenetic invariants on concordance factors (frequencies of four-taxon splits in the input gene trees) to identify level-1 phylogenetic networks under the multispecies coalescent model. Results: Our novel hybrid detection methodology is optimization-free as it only requires the evaluation of polynomial equations, and as such, it bypasses the traversal of network space, yielding a computational speed at least 10 times faster than the fastest-to-date network methods. We illustrate our method’s performance on simulated and real data from the genus Canis. Availability and implementation: We present an open-source publicly available Julia package PhyloDiamond.jl available at https://github.com/solislemuslab/PhyloDiamond.jl with broad applicability within the evolutionary community.
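Entry 7 defines concordance factors as the frequencies of four-taxon splits in the input gene trees. A minimal Python sketch of that bookkeeping follows. It is not PhyloDiamond.jl's code; the function name, the split labels, and the counts are hypothetical, and it assumes the split displayed by each gene tree for the taxa a, b, c, d has already been extracted.

```python
from collections import Counter

# The three possible unrooted splits for four taxa a, b, c, d.
SPLITS = ("ab|cd", "ac|bd", "ad|bc")


def concordance_factors(observed_splits):
    """Return the frequency of each four-taxon split among the gene trees.

    observed_splits: iterable with one split label per gene tree.
    """
    counts = Counter(observed_splits)
    unknown = set(counts) - set(SPLITS)
    if unknown:
        raise ValueError(f"unrecognized split labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one gene tree is required")
    return {split: counts[split] / total for split in SPLITS}


# Hypothetical sample of 10 gene trees.
gene_tree_splits = ["ab|cd"] * 6 + ["ac|bd"] * 3 + ["ad|bc"] * 1
cf = concordance_factors(gene_tree_splits)
print(cf)
```

The three frequencies always sum to one, so each four-taxon set contributes a point on a simplex; the abstract's invariants are polynomial equations evaluated on such values.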
  8. Ouangraoua, Aida (Ed.)
    Abstract Scientists world-wide are putting together massive efforts to understand how the biodiversity that we see on Earth evolved from single-cell organisms at the origin of life, and this diversification process is represented through the Tree of Life. Low sampling rates and high heterogeneity in the rate of evolution across sites and lineages produce a phenomenon denoted “long branch attraction” (LBA) in which long non-sister lineages are estimated to be sisters regardless of their true evolutionary relationship. LBA has been a pervasive problem in phylogenetic inference affecting different types of methodologies from distance-based to likelihood-based. Here, we present a novel neural network model that outperforms standard phylogenetic methods and other neural network implementations under LBA settings. Furthermore, unlike existing neural network models in phylogenetics, our model naturally accounts for the tree isomorphisms via permutation invariant functions, which ultimately result in lower memory usage and allow seamless extension to larger trees.
  9. Abstract Phylogenetic regression is a type of generalised least squares (GLS) method that incorporates a modelled covariance matrix based on the evolutionary relationships between species (i.e. phylogenetic relationships). While this method has found widespread use in hypothesis testing via phylogenetic comparative methods, such as phylogenetic ANOVA, its ability to account for non‐linear relationships has received little attention. To address this, here we implement a phylogenetic Kernel Ridge Regression (phyloKRR) method that utilises GLS in a high‐dimensional feature space, employing linear combinations of phylogenetically weighted data to account for non‐linearity. We analysed two biological datasets using the Radial Basis Function and linear kernel function. The first dataset contained morphometric data, while the second dataset comprised discrete trait data and diversification rates as response variable. Hyperparameter tuning of the model was achieved through cross‐validation rounds in the training set. In the tested biological datasets, phyloKRR reduced the error rate (as measured by RMSE) by around 20% compared to linear‐based regression when data did not exhibit linear relationships. In simulated datasets, the error rate decreased almost exponentially with the level of non‐linearity. These results show that introducing kernels into phylogenetic regression analysis presents a novel and promising tool for complementing phylogenetic comparative methods. We have integrated this method into a Python package named phyloKRR, which is freely available at: https://github.com/ulises-rosas/phylokrr.
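For readers unfamiliar with the two ingredients named in entry 9, the standard textbook forms are written out below. The symbols are notation chosen here, not taken from the abstract: X is the predictor matrix, y the response, C the phylogenetic covariance matrix, K the kernel matrix with entries k(x_i, x_j), λ the ridge parameter, and γ the kernel width. How phyloKRR combines the phylogenetic weighting with the kernel is specified by the package itself and is not reproduced here.

```latex
% Generalised least squares with covariance matrix C
\hat{\beta}_{\mathrm{GLS}} = \left( X^{\top} C^{-1} X \right)^{-1} X^{\top} C^{-1} y

% Kernel ridge regression with ridge parameter \lambda
\hat{\alpha} = \left( K + \lambda I \right)^{-1} y,
\qquad
\hat{f}(x) = \sum_{i=1}^{n} \hat{\alpha}_i \, k(x, x_i)

% Radial Basis Function kernel and linear kernel
k_{\mathrm{RBF}}(x, x') = \exp\!\left( -\gamma \, \lVert x - x' \rVert^{2} \right),
\qquad
k_{\mathrm{lin}}(x, x') = x^{\top} x'
```

With the linear kernel, kernel ridge regression reduces to ordinary ridge regression, which is why the RBF kernel is the one that can capture non-linear relationships.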