The NSF Public Access Repository (PAR) system and access will be unavailable from 10:00 PM ET on Friday, February 6 until 10:00 AM ET on Saturday, February 7 due to maintenance. We apologize for the inconvenience.
Explore Research Products in the PAR It may take a few hours for recently added research products to appear in PAR search results.
Hector Banos, Nathaniel Bushek
(, The journal of software for algebra and geometry)
null
(Ed.)
We introduce the package PhylogeneticTrees for Macaulay2, which allows users to compute phylogenetic invariants for group-based tree models. We provide some background information on phylogenetic algebraic geometry and show how the package PhylogeneticTrees can be used to calculate a generating set for a phylogenetic ideal as well as a lower bound for its dimension. Finally, we show how methods within the package can be used to compute a generating set for the join of any two ideals.
Abstract Motivation As genome-wide reconstruction of phylogenetic trees becomes more widespread, limitations of available data are being appreciated more than ever before. One issue is that phylogenomic datasets are riddled with missing data, and gene trees, in particular, almost always lack representatives from some species otherwise available in the dataset. Since many downstream applications of gene trees require or can benefit from access to complete gene trees, it will be beneficial to algorithmically complete gene trees. Also, gene trees are often unrooted, and rooting them is useful for downstream applications. While completing and rooting a gene tree with respect to a given species tree has been studied, those problems are not studied in depth when we lack such a reference species tree. Results We study completion of gene trees without a need for a reference species tree. We formulate an optimization problem to complete the gene trees while minimizing their quartet distance to the given set of gene trees. We extend a seminal algorithm by Brodal et al. to solve this problem in quasi-linear time. In simulated studies and on a large empirical data, we show that completion of gene trees using other gene trees is relatively accurate and, unlike the case where a species tree is available, is unbiased. Availability and implementation Our method, tripVote, is available at https://github.com/uym2/tripVote. Supplementary information Supplementary data are available at Bioinformatics online.
Aouad, Ali; Elmachtoub, Adam N.; Ferreira, Kris J.; McNellis, Ryan
(, Manufacturing & Service Operations Management)
Problem definition: We seek to provide an interpretable framework for segmenting users in a population for personalized decision making. Methodology/results: We propose a general methodology, market segmentation trees (MSTs), for learning market segmentations explicitly driven by identifying differences in user response patterns. To demonstrate the versatility of our methodology, we design two new specialized MST algorithms: (i) choice model trees (CMTs), which can be used to predict a user’s choice amongst multiple options, and (ii) isotonic regression trees (IRTs), which can be used to solve the bid landscape forecasting problem. We provide a theoretical analysis of the asymptotic running times of our algorithmic methods, which validates their computational tractability on large data sets. We also provide a customizable, open-source code base for training MSTs in Python that uses several strategies for scalability, including parallel processing and warm starts. Finally, we assess the practical performance of MSTs on several synthetic and real-world data sets, showing that our method reliably finds market segmentations that accurately model response behavior. Managerial implications: The standard approach to conduct market segmentation for personalized decision making is to first perform market segmentation by clustering users according to similarities in their contextual features and then fit a “response model” to each segment to model how users respond to decisions. However, this approach may not be ideal if the contextual features prominent in distinguishing clusters are not key drivers of response behavior. Our approach addresses this issue by integrating market segmentation and response modeling, which consistently leads to improvements in response prediction accuracy, thereby aiding personalization. We find that such an integrated approach can be computationally tractable and effective even on large-scale data sets. Moreover, MSTs are interpretable because the market segments can easily be described by a decision tree and often require only a fraction of the number of market segments generated by traditional approaches. Disclaimer: This work was done prior to Ryan McNellis joining Amazon. Funding: This work was supported by the National Science Foundation [Grants CMMI-1763000 and CMMI-1944428]. Supplemental Material: The online appendices are available at https://doi.org/10.1287/msom.2023.1195 .
Bonk, Mario; Meyer, Daniel
(, Transactions of the American Mathematical Society)
A quasiconformal tree T is a (compact) metric tree that is doubling and of bounded turning. We call T trivalent if every branch point of T has exactly three branches. If the set of branch points is uniformly relatively separated and uniformly relatively dense, we say that T is uniformly branching. We prove that a metric space T is quasisymmetrically equivalent to the continuum self-similar tree if and only if it is a trivalent quasiconformal tree that is uniformly branching. In particular, any two trees of this type are quasisymmetrically equivalent.
Blanc, G.; Lange, J.; Tan, L.
(, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022))
We give the first reconstruction algorithm for decision trees: given queries to a function f that is opt-close to a size-s decision tree, our algorithm provides query access to a decision tree T where: - T has size S := s^O((log s)²/ε³); - dist(f,T) ≤ O(opt)+ε; - Every query to T is answered with poly((log s)/ε)⋅ log n queries to f and in poly((log s)/ε)⋅ n log n time. This yields a tolerant tester that distinguishes functions that are close to size-s decision trees from those that are far from size-S decision trees. The polylogarithmic dependence on s in the efficiency of our tester is exponentially smaller than that of existing testers. Since decision tree complexity is well known to be related to numerous other boolean function properties, our results also provide a new algorithm for reconstructing and testing these properties.
Pardo-De la Hoz, C. J., Magain, N., Piatkowski, B., Cornet, L., Dal Forno, M., Carbone, I., Miadlikowska, J., and Lutzoni, F. Ancient Rapid Radiation Explains Most Conflicts Among Gene Trees and Well-supported Phylogenomic Trees of Nostocalean Cyanobacteria. Retrieved from https://par.nsf.gov/biblio/10468536. Systematic biology . Web. doi:10.1093/sysbio/syad008.
Pardo-De la Hoz, C. J., Magain, N., Piatkowski, B., Cornet, L., Dal Forno, M., Carbone, I., Miadlikowska, J., & Lutzoni, F. Ancient Rapid Radiation Explains Most Conflicts Among Gene Trees and Well-supported Phylogenomic Trees of Nostocalean Cyanobacteria. Systematic biology, (). Retrieved from https://par.nsf.gov/biblio/10468536. https://doi.org/10.1093/sysbio/syad008
Pardo-De la Hoz, C. J., Magain, N., Piatkowski, B., Cornet, L., Dal Forno, M., Carbone, I., Miadlikowska, J., and Lutzoni, F.
"Ancient Rapid Radiation Explains Most Conflicts Among Gene Trees and Well-supported Phylogenomic Trees of Nostocalean Cyanobacteria". Systematic biology (). Country unknown/Code not available: Oxford University Press. https://doi.org/10.1093/sysbio/syad008.https://par.nsf.gov/biblio/10468536.
@article{osti_10468536,
place = {Country unknown/Code not available},
title = {Ancient Rapid Radiation Explains Most Conflicts Among Gene Trees and Well-supported Phylogenomic Trees of Nostocalean Cyanobacteria},
url = {https://par.nsf.gov/biblio/10468536},
DOI = {10.1093/sysbio/syad008},
abstractNote = {},
journal = {Systematic biology},
publisher = {Oxford University Press},
author = {Pardo-De la Hoz, C. J. and Magain, N. and Piatkowski, B. and Cornet, L. and Dal Forno, M. and Carbone, I. and Miadlikowska, J. and Lutzoni, F.},
}
Warning: Leaving National Science Foundation Website
You are now leaving the National Science Foundation website to go to a non-government website.
Website:
NSF takes no responsibility for and exercises no control over the views expressed or the accuracy of
the information contained on this site. Also be aware that NSF's privacy policy does not apply to this site.