Search for: All records

Creators/Authors contains: "Yang, He"

  1. Despite the IEEE Power Electronics Society (PELS) establishing Technical Committee 10 on Design Methodologies with a focus on the cyber-physical security of power electronics systems, a holistic design methodology for addressing security vulnerabilities remains underdeveloped. This gap largely stems from the limited integration of computer science and power/control engineering studies in this interdisciplinary field. Addressing the inadequacy of unilateral cyber or control perspectives, this paper presents a novel four-layer cyber-physical security model specifically designed for electric machine drives. Central to this model is the innovative Control Information Flow (CIF) model, residing within the control layer, which serves as a pivotal link between the cyber layer’s vulnerable resources and the physical layer’s state-space models. By mapping vulnerable resources to control variable space and tracing attack propagation, the CIF model facilitates accurate impact predictions based on tainted control laws. The effectiveness and validity of this proposed model are demonstrated through hardware experiments involving two typical cyber-attack scenarios, underscoring its potential as a comprehensive framework for multidisciplinary security strategies. 
    Free, publicly-accessible full text available January 1, 2025
  2. Background

    Measuring parathyroid hormone-related peptide (PTHrP) helps diagnose the humoral hypercalcemia of malignancy, but is often ordered for patients with low pretest probability, resulting in poor test utilization. Manual review of results to identify inappropriate PTHrP orders is a cumbersome process.

    Methods

    Using a dataset of 1330 patients from a single institute, we developed a machine learning (ML) model to predict abnormal PTHrP results. We then evaluated the performance of the model on two external datasets. Different strategies (model transporting, retraining, rebuilding, and fine-tuning) were investigated to improve model generalizability. Maximum mean discrepancy (MMD) was adopted to quantify the shift of data distributions across different datasets.

    Results

    The model achieved an area under the receiver operating characteristic curve (AUROC) of 0.936, and a specificity of 0.842 at 0.900 sensitivity in the development cohort. Directly transporting this model to two external datasets resulted in a deterioration of AUROC to 0.838 and 0.737, with the latter having a larger MMD corresponding to a greater data shift compared to the original dataset. Model rebuilding using site-specific data improved AUROC to 0.891 and 0.837 on the two sites, respectively. When external data are insufficient for retraining, a fine-tuning strategy also improved model utility.

    Conclusions

    ML offers promise to improve PTHrP test utilization while relieving the burden of manual review. Transporting a ready-made model to external datasets may lead to performance deterioration due to data distribution shift. Model retraining or rebuilding could improve generalizability when there are enough data, and model fine-tuning may be favorable when site-specific data are limited.

    Free, publicly-accessible full text available September 21, 2024
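
Entry 2 uses maximum mean discrepancy (MMD) to quantify distribution shift between datasets. As a rough illustration only, the sketch below computes a biased estimate of squared MMD with a Gaussian (RBF) kernel in plain Python. The kernel choice, the `gamma` value, the function names, and the toy feature vectors are all assumptions made for this example; the abstract does not state which kernel or estimator was used.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mmd_squared(xs, ys, gamma=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys."""

    def mean_kernel(a, b):
        total = sum(rbf_kernel(p, q, gamma) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical two-feature samples, invented purely for illustration.
dev = [(0.0, 1.0), (0.2, 0.9), (0.1, 1.1), (-0.1, 1.0)]      # development cohort
site_a = [(0.1, 1.0), (0.3, 0.8), (0.0, 1.2), (-0.2, 0.9)]   # mild shift
site_b = [(2.0, 3.0), (2.2, 2.9), (1.9, 3.1), (2.1, 3.0)]    # strong shift

near = mmd_squared(dev, site_a)
far = mmd_squared(dev, site_b)
print(f"MMD^2 dev vs site_a: {near:.4f}")
print(f"MMD^2 dev vs site_b: {far:.4f}")
```

In this toy setup the strongly shifted sample yields the larger value, which mirrors the abstract's observation that the external dataset with the larger MMD also showed the larger drop in AUROC. With real laboratory data the features would normally be standardized first, and `gamma` would need to be chosen with care.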
  3. Context.— Machine learning (ML) allows for the analysis of massive quantities of high-dimensional clinical laboratory data, thereby revealing complex patterns and trends. Thus, ML can potentially improve the efficiency of clinical data interpretation and the practice of laboratory medicine. However, the risks of generating biased or unrepresentative models, which can lead to misleading clinical conclusions or overestimation of the model performance, should be recognized. Objectives.— To discuss the major components for creating ML models, including data collection, data preprocessing, model development, and model evaluation. We also highlight many of the challenges and pitfalls in developing ML models, which could result in misleading clinical impressions or inaccurate model performance, and provide suggestions and guidance on how to circumvent these challenges. Data Sources.— The references for this review were identified through searches of the PubMed database, US Food and Drug Administration white papers and guidelines, conference abstracts, and online preprints. Conclusions.— With the growing interest in developing and implementing ML models in clinical practice, laboratorians and clinicians need to be educated in order to collect sufficiently large and high-quality data, properly report the data set characteristics, and combine data from multiple institutions with proper normalization. They will also need to assess the reasons for missing values, determine the inclusion or exclusion of outliers, and evaluate the completeness of a data set. In addition, they require the necessary knowledge to select a suitable ML model for a specific clinical question and accurately evaluate the performance of the ML model, based on objective criteria. Domain-specific knowledge is critical in the entire workflow of developing ML models. 
  4. The Fusarium oxysporum species complex (FOSC) includes both plant and human pathogens that cause devastating plant vascular wilt diseases and threaten public health. Each F. oxysporum genome comprises core chromosomes (CCs) for housekeeping functions and accessory chromosomes (ACs) that contribute to host-specific adaptation. This study inspects global transcription factor profiles (TFomes) and their potential roles in coordinating CC and AC functions to accomplish host-specific interactions. Remarkably, we found a clear positive correlation between the sizes of TFomes and the proteomes of an organism. With the acquisition of ACs, the FOSC TFomes were larger than the other fungal genomes included in this study. Among a total of 48 classified TF families, 14 families involved in transcription/translation regulations and cell cycle controls were highly conserved. Among the 30 FOSC expanded families, Zn2-C6 and Znf_C2H2 were most significantly expanded to 671 and 167 genes per family including well-characterized homologs of Ftf1 (Zn2-C6) and PacC (Znf_C2H2) that are involved in host-specific interactions. Manual curation of characterized TFs increased the TFome repertoires by 3% including a disordered protein Ren1. RNA-Seq revealed a steady pattern of expression for conserved TF families and specific activation for AC TFs. Functional characterization of these TFs could enhance our understanding of transcriptional regulation involved in FOSC cross-kingdom interactions, disentangle species-specific adaptation, and identify targets to combat diverse diseases caused by this group of fungal pathogens. 
  5. Cell walls are at the front line of interactions between walled organisms and their environment. They support cell expansion, ensure cell integrity and, for multicellular organisms such as plants, they provide cell adherence, support cell shape morphogenesis and mediate cell–cell communication. Wall-sensing, detecting perturbations in the wall and signaling the cell to respond accordingly, is crucial for growth and survival. In recent years, plant signaling research has suggested that a large family of receptor-like kinases (RLKs) could function as wall sensors partly because their extracellular domains show homology with malectin, a diglucose binding protein from the endoplasmic reticulum of animal cells. Studies of several malectin/malectin-like (M/ML) domain-containing RLKs (M/MLD-RLKs) from the model plant Arabidopsis thaliana have revealed an impressive array of biological roles, controlling growth, reproduction and stress responses, processes that in various ways rely on or affect the cell wall. Malectin homologous sequences are widespread across biological kingdoms, but plants have uniquely evolved a highly expanded family of proteins with ML domains embedded within various protein contexts. Here, we present an overview of proteins with malectin homologous sequences in different kingdoms, discuss the chromosomal organization of Arabidopsis M/MLD-RLKs and the phylogenetic relationship between these proteins from several model and crop species. We also discuss briefly the molecular networks that enable the diverse biological roles served by M/MLD-RLKs studied thus far.
  6. Plants are continuously exposed to beneficial and pathogenic microbes, but how plants recognize and respond to friends versus foes remains poorly understood. Here, we compared the molecular response of Arabidopsis thaliana independently challenged with a Fusarium oxysporum endophyte Fo47 versus a pathogen Fo5176. These two F. oxysporum strains share a core genome of about 46 Mb, in addition to 1,229 and 5,415 unique accessory genes. Metatranscriptomic data reveal a shared pattern of expression for most plant genes (about 80%) in responding to both fungal inoculums at all timepoints from 12 to 96 h postinoculation (HPI). However, the distinct responding genes depict transcriptional plasticity, as the pathogenic interaction activates plant stress responses and suppresses functions related to plant growth and development, while the endophytic interaction attenuates host immunity but activates plant nitrogen assimilation. The differences in reprogramming of the plant transcriptome are most obvious at 12 HPI, the earliest timepoint sampled, and are linked to accessory genes in both fungal genomes. Collectively, our results indicate that the A. thaliana and F. oxysporum interaction displays both transcriptome conservation and plasticity in the early stages of infection, providing insights into the fine-tuning of gene regulation underlying plant differential responses to fungal endophytes and pathogens. Copyright © 2021 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.
  7. Most genomes within the species complex of Fusarium oxysporum are organized into two compartments: the core chromosomes (CCs) and accessory chromosomes (ACs). As opposed to CCs, which are conserved and vertically transmitted to carry out essential housekeeping functions, lineage- or strain-specific ACs are believed to be initially horizontally acquired through unclear mechanisms. These two genomic compartments are different in terms of gene density, the distribution of transposable elements, and epigenetic markers. Although common in eukaryotes, the functional importance of ACs is uniquely emphasized among fungal species, specifically in relation to fungal pathogenicity and their adaptation to diverse hosts. With a focus on the cross-kingdom fungal pathogen F. oxysporum, this review provides a summary of the differences between CCs and ACs based on current knowledge of gene functions, genome structures, and epigenetic signatures, and explores the transcriptional crosstalk between the core and accessory genomes.
  8. Background. New York City (NYC) experienced an initial surge and gradual decline in the number of SARS-CoV-2-confirmed cases in 2020. A change in the pattern of laboratory test results in COVID-19 patients over this time has not been reported or correlated with patient outcome. Methods. We performed a retrospective study of routine laboratory and SARS-CoV-2 RT-PCR test results from 5,785 patients evaluated in a NYC hospital emergency department from March to June, employing machine learning analysis. Results. A COVID-19 high-risk laboratory test result profile (COVID19-HRP), consisting of 21 routine blood tests, was identified to characterize the SARS-CoV-2 patients. Approximately half of the SARS-CoV-2 positive patients had the distinct COVID19-HRP that separated them from SARS-CoV-2 negative patients. SARS-CoV-2 patients with the COVID19-HRP had higher SARS-CoV-2 viral loads, determined by cycle threshold values from the RT-PCR, and poorer clinical outcomes compared to other positive patients without the COVID19-HRP. Furthermore, the percentage of SARS-CoV-2 patients with the COVID19-HRP decreased significantly from March/April to May/June. Notably, viral load in the SARS-CoV-2 patients declined, and their laboratory profile became less distinguishable from that of SARS-CoV-2 negative patients in the later phase. Conclusions. Our longitudinal analysis illustrates the temporal change of the laboratory test result profile in SARS-CoV-2 patients and the evolution of COVID-19 in a US epicenter. This analysis could become an important tool in COVID-19 population disease severity tracking and prediction. In addition, this analysis may play an important role in prioritizing high-risk patients, assisting in patient triaging and optimizing the usage of resources.
  9. Background. Accurate diagnostic strategies to identify SARS-CoV-2 positive individuals rapidly for management of patient care and protection of health care personnel are urgently needed. The predominant diagnostic test is viral RNA detection by RT-PCR from nasopharyngeal swab specimens; however, the results are not promptly obtainable in all patient care locations. Routine laboratory testing, in contrast, is readily available with a turn-around time (TAT) usually within 1-2 hours. Method. We developed a machine learning model incorporating patient demographic features (age, sex, race) with 27 routine laboratory tests to predict an individual’s SARS-CoV-2 infection status. Laboratory testing results obtained within 2 days before the release of the SARS-CoV-2 RT-PCR result were used to train a gradient boosting decision tree (GBDT) model from 3,356 SARS-CoV-2 RT-PCR tested patients (1,402 positive and 1,954 negative) evaluated at a metropolitan hospital. Results. The model achieved an area under the receiver operating characteristic curve (AUC) of 0.854 (95% CI: 0.829-0.878). Application of this model to an independent patient dataset from a separate hospital resulted in a comparable AUC (0.838), validating its generalizability. Moreover, our model predicted initial SARS-CoV-2 RT-PCR positivity in 66% of individuals whose RT-PCR result changed from negative to positive within 2 days. Conclusion. This model employing routine laboratory test results offers opportunities for early and rapid identification of high-risk SARS-CoV-2 infected patients before their RT-PCR results are available. It may play an important role in assisting the identification of SARS-CoV-2 infected patients in areas where RT-PCR testing is not accessible due to financial or supply constraints.
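
Entry 9 reports performance as the area under the ROC curve (AUC). A minimal sketch of how that number can be computed from labels and model scores, using the pairwise-ranking interpretation of AUC, is given here. The function name and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auroc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    labels: iterable of 0/1 (1 = positive, e.g. RT-PCR positive)
    scores: iterable of model scores, higher meaning "more likely positive"
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: three positives, three negatives.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = auroc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The double loop is quadratic in the number of patients, which is fine for a small illustration; for cohorts of several thousand patients a rank-based formulation or an established library routine would be the more practical choice.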
  10. Fusarium oxysporum is a cross-kingdom fungal pathogen that infects plants and humans. Horizontally transferred lineage-specific (LS) chromosomes were reported to determine host-specific pathogenicity among phytopathogenic F. oxysporum. However, the existence and functional importance of LS chromosomes among human pathogenic isolates are unknown. Here we report four unique LS chromosomes in a human pathogenic strain NRRL 32931, isolated from a leukemia patient. These LS chromosomes were devoid of housekeeping genes, but were significantly enriched in genes encoding metal ion transporters and cation transporters. Homologs of NRRL 32931 LS genes, including a homolog of ceruloplasmin and the genes that contribute to the expansion of the alkaline pH-responsive transcription factor PacC/Rim1p, were also present in the genome of NRRL 47514, a strain associated with a Fusarium keratitis outbreak. This study provides the first evidence, to our knowledge, for genomic compartmentalization in two human pathogenic fungal genomes and suggests an important role of LS chromosomes in niche adaptation.