This content will become publicly available on September 1, 2026
tMHG-Finder: Tree-Guided Maximal Homologous Group Finder for Bacterial Genomes
- Award ID(s): 2126387
- PAR ID: 10635764
- Publisher / Repository: Springer Nature Switzerland
- Date Published:
- ISBN: 978-3-031-94928-9
- Page Range / eLocation ID: 87 to 104
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Evans, Christopher J.; Bryant, Julia J.; Motohara, Kentaro (Ed.) The Keck Planet Finder (KPF) is a fiber-fed, high-resolution, high-stability spectrometer in development at the UC Berkeley Space Sciences Laboratory for the W.M. Keck Observatory. KPF is designed to characterize exoplanets via Doppler spectroscopy, with a goal of a single-measurement precision of 0.3 m s⁻¹ or better; however, its resolution and stability will enable a wide variety of astrophysical pursuits. Here we provide design updates following the preliminary design review for several subsystems, including the main spectrometer; the fabrication of the Zerodur optical bench; the data reduction pipeline; the fiber agitator; the fiber cable design; the fiber scrambler; VPH testing results; and the exposure meter.
- Subgraph matching is a core primitive across a number of disciplines, ranging from data mining, databases, information retrieval, and computer vision to natural language processing. Despite decades of effort, it is still highly challenging to balance matching accuracy against computational efficiency, especially when the query graph and/or the data graph are large. In this paper, we propose an index-based algorithm (G-FINDER) to find the top-k approximate matching subgraphs. At the heart of the proposed algorithm are two techniques: (1) a novel auxiliary data structure (LOOKUP-TABLE), used in conjunction with a neighborhood expansion method to effectively and efficiently index candidate vertices, and (2) a dynamic filtering and refinement strategy to prune false candidates at an early stage. The proposed G-FINDER has several distinctive features: (1) generality, being able to handle different types of inexact matching (e.g., missing nodes, missing edges, intermediate vertices) on node-attributed and/or edge-attributed graphs or multigraphs; (2) effectiveness, achieving up to 30% F1-Score improvement over the best known competitor; and (3) efficiency, scaling near-linearly w.r.t. the size of the data graph as well as the query graph.
- Model-finders such as SAT-solvers are attractive for producing concrete models, either as sample instances or as counterexamples when properties fail. However, the generated model is arbitrary. To address this, several research efforts have proposed principled forms of output from model-finders. These include minimal and maximal models, unsat cores, and proof-based provenance of facts. While these methods enjoy elegant mathematical foundations, they have not been subjected to rigorous evaluation on users to assess their utility. This paper presents user studies of these three forms of output performed on advanced students. We find that most of the output forms fail to be effective, and in some cases even actively mislead users. To make such studies feasible to run frequently and at scale, we also show how we can pose such studies on the crowdsourcing site Mechanical Turk.
