Creators/Authors contains: "Huang, Jun"


  1. Abstract

    DNA double-strand breaks require repair or risk corrupting the language of life. To ensure genome integrity and viability, multiple DNA double-strand break repair pathways function in eukaryotes. Two such repair pathways, canonical non-homologous end joining and homologous recombination, have been extensively studied, while other pathways such as microhomology-mediated end joining and single-strand annealing, once thought to serve as back-ups, now appear to play a fundamental role in DNA repair. Here, we review the molecular details and hierarchy of these four DNA repair pathways and, where possible, compare what is known between animal and fungal models. We address the factors contributing to break repair pathway choice, and aim to explore our understanding and knowledge gaps regarding mechanisms and regulation in filamentous pathogens. We additionally discuss how DNA double-strand break repair pathways influence genome engineering results, including unexpected mutation outcomes. Finally, we review the concept of biased genome evolution in filamentous pathogens, and provide a model, termed Biased Variation, that links DNA double-strand break repair pathways with properties of genome evolution. Despite our extensive knowledge of this universal process, there remain many unanswered questions, the answers to which may improve genome engineering and our understanding of genome evolution.

  2. Recent years have witnessed the increasing penetration of wireless charging base stations in the workplace and public areas, such as airports and cafeterias. Such an emerging wireless charging infrastructure has presented opportunities for new indoor localization and identification services for mobile users. In this paper, we present QID, the first system that can identify a Qi-compliant mobile device during wireless charging in real time. QID extracts features from the clock oscillator and control scheme of the power receiver and employs lightweight algorithms to classify the device. QID adopts a 2-dimensional motion unit to emulate a variety of multi-coil designs of Qi, which allows for fine-grained device fingerprinting. Our results show that QID achieves high recognition accuracy. With the prevalence of public wireless charging stations, our results also have important implications for mobile user privacy.
    Free, publicly-accessible full text available May 31, 2023
  3. Recent trends towards large machine learning models require both training and inference tasks to be distributed. Considering the huge cost of training these models, it is imperative to unlock optimizations in computation and communication to obtain the best performance. However, the current logical separation between computation and communication kernels in machine learning frameworks misses optimization opportunities across this barrier. Breaking this abstraction can provide many optimizations to improve the performance of distributed workloads. However, manually applying these optimizations requires modifying the underlying computation and communication libraries for each scenario, which is both time-consuming and error-prone. Therefore, we present CoCoNet, which contains (i) a domain-specific language to express a distributed machine learning program in the form of computation and communication operations, (ii) a set of semantics-preserving transformations to optimize the program, and (iii) a compiler to generate jointly optimized communication and computation GPU kernels. Providing both computation and communication as first-class constructs allows users to work on a high-level abstraction and apply powerful optimizations, such as fusion or overlapping of communication and computation. CoCoNet enabled us to optimize data-, model- and pipeline-parallel workloads in large language models with only a few lines of code. Our experiments show that CoCoNet significantly outperforms state-of-the-art distributed machine learning implementations.
  4. SARS-CoV-2 and HIV-1 are RNA viruses that have killed millions of people worldwide. Understanding the similarities and differences between these two infections is critical for understanding disease progression and for developing effective vaccines and therapies, particularly for 38 million HIV-1+ individuals who are vulnerable to SARS-CoV-2 co-infection. Here, we utilized single-cell transcriptomics to perform a systematic comparison of 94,442 PBMCs from 7 COVID-19 and 9 HIV-1+ patients in an integrated immune atlas, in which 27 different cell types were identified using an accurate consensus single-cell annotation method. While immune cells in both cohorts show shared inflammation and disrupted mitochondrial function, COVID-19 patients exhibit stronger humoral immunity, broader IFN-I signaling, elevated Rho GTPase and mTOR pathway activities, and downregulated mitophagy. Our results elucidate transcriptional signatures associated with COVID-19 and HIV-1 that may reveal insights into fundamental disease biology and potential therapeutic targets to treat these viral infections.
  5. Thomson, Robert (Ed.)
    Abstract Genome sequencing projects routinely generate haploid consensus sequences from diploid genomes, which are effectively chimeric sequences with the phase at heterozygous sites resolved at random. The impact of phasing errors on phylogenomic analyses under the multispecies coalescent (MSC) model is largely unknown. Here, we conduct a computer simulation to evaluate the performance of four phase-resolution strategies (the true phase resolution, the diploid analytical integration algorithm which averages over all phase resolutions, computational phase resolution using the program PHASE, and random resolution) on estimation of the species tree and evolutionary parameters in analysis of multilocus genomic data under the MSC model. We found that species tree estimation is robust to phasing errors when species divergences are much older than average coalescent times but may be affected by phasing errors when the species tree is shallow. Estimation of parameters under the MSC model with and without introgression is affected by phasing errors. In particular, random phase resolution causes serious overestimation of population sizes for modern species and biased estimation of cross-species introgression probability. In general, the impact of phasing errors is greater when the mutation rate is higher, the data include more samples per species, and the species tree is shallower with recent divergences. Use of phased sequences inferred by the PHASE program produced small biases in parameter estimates. We analyze two real data sets, one of East Asian brown frogs and another of Rocky Mountains chipmunks, to demonstrate that heterozygote phase-resolution strategies have similar impacts on practical data analyses. We suggest that genome sequencing projects should produce unphased diploid genotype sequences if fully phased data are too challenging to generate, and avoid haploid consensus sequences, which have heterozygous sites phased at random. In case the analytical integration algorithm is computationally unfeasible, computational phasing prior to population genomic analyses is an acceptable alternative. [BPP; introgression; multispecies coalescent; phase; species tree.]
  6. Mittelsten Scheid, Ortrun (Ed.)
    Transcriptional dynamics in response to environmental and developmental cues are fundamental to biology, yet many mechanistic aspects are poorly understood. One such example is fungal plant pathogens, which use secreted proteins and small molecules, termed effectors, to suppress host immunity and promote colonization. Effectors are highly expressed in planta but remain transcriptionally repressed ex planta, yet our mechanistic understanding of these transcriptional dynamics remains limited. We tested the hypothesis that repressive histone modification at H3-Lys27 underlies transcriptional silencing ex planta, and that exchange for an active chemical modification contributes to transcription of in planta induced genes. Using genetics, chromatin immunoprecipitation and sequencing and RNA-sequencing, we determined that H3K27me3 provides significant local transcriptional repression. We detail how regions that lose H3K27me3 gain H3K27ac, and these changes are associated with increased transcription. Importantly, we observed that many in planta induced genes were marked by H3K27me3 during axenic growth, and detail how altered H3K27 modification influences transcription. ChIP-qPCR during in planta growth suggests that H3K27 modifications are generally stable, but can undergo dynamics at specific genomic locations. Our results support the hypothesis that dynamic histone modifications at H3K27 contribute to fungal genome regulation and specifically contribute to regulation of genes important during host infection.