skip to main content


Search for: All records

Award ID contains: 2004751

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Abstract Motivation

    The human microbiome, which is linked to various diseases by growing evidence, has a profound impact on human health. Since changes in the composition of the microbiome across time are associated with disease and clinical outcomes, microbiome analysis should be performed in a longitudinal study. However, due to limited sample sizes and differing numbers of timepoints for different subjects, a significant amount of data cannot be utilized, directly affecting the quality of analysis results. Deep generative models have been proposed to address this lack of data issue. Specifically, a generative adversarial network (GAN) has been successfully utilized for data augmentation to improve prediction tasks. Recent studies have also shown improved performance of GAN-based models for missing value imputation in a multivariate time series dataset compared with traditional imputation methods.

    Results

    This work proposes DeepMicroGen, a bidirectional recurrent neural network-based GAN model, trained on the temporal relationship between the observations, to impute the missing microbiome samples in longitudinal studies. DeepMicroGen outperforms standard baseline imputation methods, showing the lowest mean absolute error for both simulated and real datasets. Finally, the proposed model improved the predicted clinical outcome for allergies, by providing imputation for an incomplete longitudinal dataset used to train the classifier.

    Availability and implementation

    DeepMicroGen is publicly available at https://github.com/joungmin-choi/DeepMicroGen.

     
    more » « less
  2. Abstract Purpose of Review

    Mounting evidence indicates that habitats such as wastewater and environmental waters are pathways for the spread of antibiotic-resistant bacteria (ARB) and mobile antibiotic resistance genes (ARGs). We identified antibiotic-resistant members of the generaAcinetobacter,Aeromonas, andPseudomonasas key opportunistic pathogens that grow or persist in built (e.g., wastewater) or natural aquatic environments. Effective methods for monitoring these ARB in the environment are needed to understand their influence on dissemination of ARB and ARGs, but standard methods have not been developed. This systematic review considers peer-reviewed papers where the ARB above were cultured from wastewater or surface water, focusing on the accuracy of current methodologies.

    Recent Findings

    Recent studies suggest that many clinically important ARGs were originally acquired from environmental microorganisms.Acinetobacter,Aeromonas,andPseudomonasspecies are of interest because their ability to persist and grow in the environment provides opportunities to engage in horizontal gene transfer with other environmental bacteria. Pathogenic strains of these organisms resistant to multiple, clinically relevant drug classes have been identified as an urgent threat. However, culture methods for these bacteria were generally developed for clinical samples and are not well-vetted for environmental samples.

    Summary

    The search criteria yielded 60 peer-reviewed articles over the past 20 years, which reported a wide variety of methods for isolation, confirmation, and antibiotic resistance assays. Based on a systematic comparison of the reported methods, we suggest a path forward for standardizing methodologies for monitoring antibiotic resistant strains of these bacteria in water environments.

     
    more » « less
  3. Abstract Background

    There is concern that the microbially rich activated sludge environment of wastewater treatment plants (WWTPs) may contribute to the dissemination of antibiotic resistance genes (ARGs). We applied long-read (nanopore) sequencing to profile ARGs and their neighboring genes to illuminate their fate in the activated sludge treatment by comparing their abundance, genetic locations, mobility potential, and bacterial hosts within activated sludge relative to those in influent sewage across five WWTPs from three continents.

    Results

    The abundances (gene copies per Gb of reads, aka gc/Gb) of all ARGs and those carried by putative pathogens decreased 75–90% from influent sewage (192-605 gc/Gb) to activated sludge (31-62 gc/Gb) at all five WWTPs. Long reads enabled quantification of the percent abundance of ARGs with mobility potential (i.e., located on plasmids or co-located with other mobile genetic elements (MGEs)). The abundance of plasmid-associated ARGs decreased at four of five WWTPs (from 40–73 to 31–68%), and ARGs co-located with transposable, integrative, and conjugative element hallmark genes showed similar trends. Most ARG-associated elements decreased 0.35–13.52% while integrative and transposable elements displayed slight increases at two WWTPs (1.4–2.4%). While resistome and taxonomic compositions both shifted significantly, host phyla for chromosomal ARG classes remained relatively consistent, indicating vertical gene transfer via active biomass growth in activated sludge as the key pathway of chromosomal ARG dissemination.

    Conclusions

    Overall, our results suggest that the activated sludge process acted as a barrier against the proliferation of most ARGs, while those that persisted or increased warrant further attention.

     
    more » « less
  4. The demand for memory is ever increasing. Many prior works have explored hardware memory compression to increase effective memory capacity. However, prior works compress and pack/migrate data at a small - memory blocklevel - granularity; this introduces an additional block-level translation after the page-level virtual address translation. In general, the smaller the granularity of address translation, the higher the translation overhead. As such, this additional block-level translation exacerbates the well-known address translation problem for large and/or irregular workloads. A promising solution is to only save memory from cold (i.e., less recently accessed) pages without saving memory from hot (i.e., more recently accessed) pages (e.g., keep the hot pages uncompressed); this avoids block-level translation overhead for hot pages. However, it still faces two challenges. First, after a compressed cold page becomes hot again, migrating the page to a full 4KB DRAM location still adds another level (albeit page-level, instead of block-level) of translation on top of existing virtual address translation. Second, only compressing cold data require compressing them very aggressively to achieve high overall memory savings; decompressing very aggressively compressed data is very slow (e.g., > 800ns assuming the latest Deflate ASIC in industry). This paper presents Translation-optimized Memory Compression for Capacity (TMCC) to tackle the two challenges above. To address the first challenge, we propose compressing page table blocks in hardware to opportunistically embed compression translations into them in a software-transparent manner to effectively prefetch compression translations during a page walk, instead of serially fetching them after the walk. To address the second challenge, we perform a large design space exploration across many hardware configurations and diverse workloads to derive and implement in HDL an ASIC Deflate that is specialized for memory; for memory pages, it is 4X as fast as the state-of-the art ASIC Deflate, with little to no sacrifice in compression ratio. Our evaluations show that for large and/or irregular workloads, TMCC can either improve performance by 14% without sacrificing effective capacity or provide 2.2x the effective capacity without sacrificing performance compared to a stateof-the-art hardware memory compression for capacity. 
    more » « less
  5. Nojiri, Hideaki (Ed.)
    ABSTRACT Bacterial mobile genetic elements (MGEs) encode functional modules that perform both core and accessory functions for the element, the latter of which are often only transiently associated with the element. The presence of these accessory genes, which are often close homologs to primarily immobile genes, incur high rates of false positives and, therefore, limits the usability of these databases for MGE annotation. To overcome this limitation, we analyzed 10,776,849 protein sequences derived from eight MGE databases to compile a comprehensive set of 6,140 manually curated protein families that are linked to the “life cycle” (integration/excision, replication/recombination/repair, transfer, stability/transfer/defense, and phage-specific processes) of plasmids, phages, integrative, transposable, and conjugative elements. We overlay experimental information where available to create a tiered annotation scheme of high-quality annotations and annotations inferred exclusively through bioinformatic evidence. We additionally provide an MGE-class label for each entry (e.g., plasmid or integrative element), and assign to each entry a major and minor category. The resulting database, mobileOG-db (for mobile orthologous groups), comprises over 700,000 deduplicated sequences encompassing five major mobileOG categories and more than 50 minor categories, providing a structured language and interpretable basis for an array of MGE-centered analyses. mobileOG-db can be accessed at mobileogdb.flsi.cloud.vt.edu/, where users can select, refine, and analyze custom subsets of the dynamic mobilome. IMPORTANCE The analysis of bacterial mobile genetic elements (MGEs) in genomic data is a critical step toward profiling the root causes of antibiotic resistance, phenotypic or metabolic diversity, and the evolution of bacterial genera. Existing methods for MGE annotation pose high barriers of biological and computational expertise to properly harness. To bridge this gap, we systematically analyzed 10,776,849 proteins derived from eight databases of MGEs to identify 6,140 MGE protein families that can serve as candidate hallmarks, i.e., proteins that can be used as “signatures” of MGEs to aid annotation. The resulting resource, mobileOG-db, provides a multilevel classification scheme that encompasses plasmid, phage, integrative, and transposable element protein families categorized into five major mobileOG categories and more than 50 minor categories. mobileOG-db thus provides a rich resource for simple and intuitive element annotation that can be integrated seamlessly into existing MGE detection pipelines and colocalization analyses. 
    more » « less