Search for: All records

Award ID contains: 2316003


Physics-informed machine learning for automatic model reduction in chemical reaction networks

https://doi.org/10.1038/s41598-025-92680-8

Pateras, Joseph; Zhang, Colin; Majumdar, Shriya; Pal, Ayush; Ghosh, Preetam (December 2025, Scientific Reports)

Abstract Physics-informed machine learning bridges the gap between the high fidelity of mechanistic models and the adaptive insights of artificial intelligence. In chemical reaction network modeling, this synergy proves valuable, addressing the high computational costs of detailed mechanistic models while leveraging the predictive power of machine learning. This study applies this fusion to the biomedical challenge of Aβ fibril aggregation, a key factor in Alzheimer’s disease. Central to the research is the introduction of an automatic reaction order model reduction framework, designed to optimize reduced-order kinetic models. This framework represents a shift in model construction, automatically determining the appropriate level of detail for reaction network modeling. The proposed approach significantly improves simulation efficiency and accuracy, particularly in systems like Aβ aggregation, where precise modeling of nucleation and growth kinetics can reveal potential therapeutic targets. Additionally, the automatic model reduction technique has the potential to generalize to other network models. The methodology offers a scalable and adaptable tool for applications beyond biomedical research. Its ability to dynamically adjust model complexity based on system-specific needs ensures that models remain both computationally feasible and scientifically relevant, accommodating new data and evolving understandings of complex phenomena.
Free, publicly-accessible full text available December 1, 2026
A survey on deep learning for drug-target binding prediction: models, benchmarks, evaluation, and case studies

https://doi.org/10.1093/bib/bbaf491

Debnath, Kusal; Rana, Pratip; Ghosh, Preetam (August 2025, Briefings in Bioinformatics)

Abstract Conventional drug discovery is expensive, time-consuming, and prone to failure. Artificial intelligence has become a potent substitute over the last decade, providing strong answers to challenging biological issues in this field. Among these difficulties, drug-target binding (DTB) is a key component of drug discovery techniques. In this context, drug-target affinity and drug–target interaction are complementary and essential frameworks that work together to improve our comprehension of DTB dynamics. In this work, we thoroughly analyze the most recent deep learning models, popular benchmark datasets, and assessment metrics for DTB prediction. We look at the paradigm shift in the development of drug discovery research since researchers started using deep learning as a potent tool for DTB prediction. In particular, we examine how methodologies have evolved, starting with early heterogeneous network-based approaches, progressing to graph-based approaches that were widely accepted, followed by modern attention-based architectures, and finally, the most recent multimodal approaches. We also provide case studies utilizing an extensive compound library against specific protein targets implicated in critical cancer pathways to demonstrate the usefulness of these approaches. In addition to summarizing the latest developments in DTB prediction models, this review also identifies their drawbacks. It also highlights the outlook for the DTB prediction domain and future research directions. Combined, these studies present a more comprehensive view of how deep learning offers a quantitative framework for researching drug-target relationships, speeding up the identification of new drug candidates and making it easier to identify possible DTBs.
Free, publicly-accessible full text available August 31, 2026
Simultaneous fault prediction in evolving industrial environments with ensembles of Hoeffding adaptive trees

https://doi.org/10.1007/s10489-025-06786-7

Esteban, A; Cano, A; Ventura, S; Zafra, A (August 2025, Applied Intelligence)

Abstract Predictive Maintenance (PdM) emerges as a critical task of Industry 4.0, driving operational efficiency, minimizing downtime, and reducing maintenance costs. However, real-world industrial environments present unsolved challenges, especially in predicting simultaneous and correlated faults under evolving conditions. Traditional batch-based and deep learning approaches for simultaneous fault prediction often fall short due to their assumptions of static data distributions and high computational demands, making them unsuitable for dynamic, resource-constrained systems. In response, we propose OEMLHAT (Online Ensemble of Multi-Label Hoeffding Adaptive Trees), a novel model tailored for real-time, multi-label fault prediction in non-stationary industrial settings. OEMLHAT introduces a scalable online ensemble architecture that integrates online bagging, dynamic feature subspacing, and adaptive output weighting. This design allows it to efficiently handle concept drift, high-dimensional input spaces, and label sparsity, key bottlenecks in existing PdM solutions. Experimental results on three public multi-label PdM case studies demonstrate substantial improvements in predictive performance of OEMLHAT over previous batch-based and online proposals for multi-label classification, particularly with an average improvement in micro-averaged F1-score of 18.49% over the second most-accurate batch-based proposal and of 8.56% in the case of the second best online model. By addressing a critical gap in online multi-label learning for PdM, this work provides a robust and interpretable solution for next-generation industrial monitoring systems for fault detection, particularly for rare and concurrent failures.
Free, publicly-accessible full text available August 1, 2026
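The OEMLHAT abstract above reports its gains in micro-averaged F1-score for multi-label prediction. As a minimal sketch of how that metric is generally computed (the function name `micro_f1` and the toy fault labels are illustrative assumptions, not taken from the paper), true positives, false positives, and false negatives are pooled over all labels before a single F1 is formed:

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label data.

    y_true, y_pred: lists of rows, each row a list of 0/1 flags, one per label.
    Counts are pooled over every (sample, label) pair before computing F1.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    # With no positives anywhere, F1 is undefined; return 0.0 by convention.
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical example: two machines, three possible simultaneous faults.
    truth = [[1, 0, 1], [0, 1, 0]]
    preds = [[1, 1, 0], [0, 1, 0]]
    print(round(micro_f1(truth, preds), 4))
```

Because the counts are pooled, frequent labels dominate the score, which is worth keeping in mind when faults are rare and sparse as the abstract describes.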
Do LLMs consider security? an empirical study on responses to programming questions

https://doi.org/10.1007/s10664-025-10658-6

Sajadi, Amirali; Le, Binh; Nguyen, Anh; Damevski, Kostadin; Chatterjee, Preetha (July 2025, Empirical Software Engineering)

Abstract The widespread adoption of conversational LLMs for software development has raised new security concerns regarding the safety of LLM-generated content. Our motivational study outlines ChatGPT’s potential in volunteering context-specific information to the developers, promoting safe coding practices. Motivated by this finding, we conduct a study to evaluate the degree of security awareness exhibited by three prominent LLMs: Claude 3, GPT-4, and Llama 3. We prompt these LLMs with Stack Overflow questions that contain vulnerable code to evaluate whether they merely provide answers to the questions or if they also warn users about the insecure code, thereby demonstrating a degree of security awareness. Further, we assess whether LLM responses provide information about the causes, exploits, and the potential fixes of the vulnerability, to help raise users’ awareness. Our findings show that all three models struggle to accurately detect and warn users about vulnerabilities, achieving a detection rate of only 12.6% to 40% across our datasets. We also observe that the LLMs tend to identify certain types of vulnerabilities related to sensitive information exposure and improper input neutralization much more frequently than other types, such as those involving external control of file names or paths. Furthermore, when LLMs do issue security warnings, they often provide more information on the causes, exploits, and fixes of vulnerabilities compared to Stack Overflow responses. Finally, we provide an in-depth discussion on the implications of our findings, and demonstrate a CLI-based prompting tool that can be used to produce more secure LLM responses.
Free, publicly-accessible full text available July 1, 2026
24R,25(OH)₂D₃ regulates tumorigenesis in estrogen sensitive laryngeal cancer cells via membrane‐associated receptor complexes in ER+ and ER− cells

https://doi.org/10.1002/ijc.70141

Dennis, Cydney D; Cohen, D Joshua; Debnath, Kusal; Schwartz, Nofrat; Lodato, Brock P; Dillon, Jonathan T; Batool, Tillat; Halquist, Matthew S; Ghosh, Preetam; Schwartz, Zvi; et al (September 2025, International Journal of Cancer)

Abstract This study examined the effects of 24R,25‐dihydroxyvitamin D₃ (24R,25(OH)₂D₃) in estrogen‐responsive laryngeal cancer tumorigenesis in vivo, the mechanisms involved, and whether the ability of the tumor cells to produce 24R,25(OH)₂D₃ locally is estrogen‐dependent. Estrogen receptor alpha‐66 positive (ER+) UM‐SCC‐12 cells and ER− UM‐SCC‐11A cells responded differently to 24R,25(OH)₂D₃ in vivo; 24R,25(OH)₂D₃ enhanced tumorigenesis in ER+ tumors but inhibited tumorigenesis in ER− tumors. Treatment with 17β‐estradiol (E₂) for 24 h reduced levels of CYP24A1 protein but increased 24R,25(OH)₂D₃ production in ER+ cells; treatment with E₂ for 9 min reduced CYP24A1 at 24 h and reduced 24R,25(OH)₂D₃ production in ER− cells. These findings suggest the involvement of E₂ receptor(s) in addition to ERα66. To investigate if 24R,25(OH)₂D₃ can act locally, ER+ and ER− cells were treated with 24R,25(OH)₂D₃ after inhibiting putative 24R,25(OH)₂D₃ receptors, and the cells were assessed for effects on DNA synthesis (proliferation) and p53 production (apoptosis). Specific inhibitors were used to assess downstream secondary messenger signaling pathways and requirements for palmitoylation and caveolae in both cell lines. The results show that 24R,25(OH)₂D₃ binds to a complex of receptors, including TLCD3B2, VDR, and protein disulfide‐isomerase A3 (PDIA3) in ER+ UM‐SCC‐12 cells. The mechanism requires palmitoylation, and PLD, PI3K, and LPAR are involved. The anti‐tumorigenic effects of 24R,25(OH)₂D₃ in ER− UM‐SCC‐11A cells involve a membrane‐receptor complex consisting of VDR, PDIA3, and ROR2 within caveolae to activate a yet‐to‐be‐elucidated downstream signaling cascade. This work demonstrates a driving mechanism for the therapeutic agent 24R,25(OH)₂D₃ that may be used for laryngeal cancer patients.
Free, publicly-accessible full text available September 8, 2026
Neuron enriched extracellular vesicles’ MicroRNA expression profiles as a marker of early life alcohol consumption

https://doi.org/10.1038/s41398-024-02874-3

Yakovlev, Vasily; Lapato, Dana M; Rana, Pratip; Ghosh, Preetam; Frye, Rebekah; Roberson-Nay, Roxann (December 2024, Translational Psychiatry)

Abstract Alcohol consumption may impact and shape brain development through perturbed biological pathways and impaired molecular functions. We investigated the relationship between alcohol consumption rates and neuron-enriched extracellular vesicles’ (EVs’) microRNA (miRNA) expression to better understand the impact of alcohol use on early life brain biology. Neuron-enriched EVs’ miRNA expression was measured from plasma samples collected from young people using a commercially available microarray platform while alcohol consumption was measured using the Alcohol Use Disorders Identification Test. Linear regression and network analyses were used to identify significantly differentially expressed miRNAs and to characterize the implicated biological pathways, respectively. Compared to alcohol naïve controls, young people reporting high alcohol consumption exhibited significantly higher expression of three neuron-enriched EVs’ miRNAs including miR-30a-5p, miR-194-5p, and miR-339-3p, although only miR-30a-5p and miR-194-5p survived multiple test correction. The miRNA-miRNA interaction network inferred by a network inference algorithm did not detect any differentially expressed miRNAs with a high cutoff on edge scores. However, when the cutoff of the algorithm was reduced, five miRNAs were identified as interacting with miR-194-5p and miR-30a-5p. These seven miRNAs were associated with 25 biological functions; miR-194-5p was the most highly connected node and was highly correlated with the other miRNAs in this cluster. Our observed association between neuron-enriched EVs’ miRNAs and alcohol consumption concurs with results from experimental animal models of alcohol use and suggests that high rates of alcohol consumption during the adolescent/young adult years may impact brain functioning and development by modulating miRNA expression.
Full Text Available
COFFEE: consensus single cell-type specific inference for gene regulatory networks

https://doi.org/10.1093/bib/bbae457

Lodi, Musaddiq K; Chernikov, Anna; Ghosh, Preetam (September 2024, Briefings in Bioinformatics)

Abstract The inference of gene regulatory networks (GRNs) is crucial to understanding the regulatory mechanisms that govern biological processes. GRNs may be represented as edges in a graph, and hence, they have been inferred computationally for scRNA-seq data. A wisdom of crowds approach to integrate edges from several GRNs to create one composite GRN has demonstrated improved performance when compared with individual algorithm implementations on bulk RNA-seq and microarray data. In an effort to extend this approach to scRNA-seq data, we present COFFEE (COnsensus single cell-type speciFic inFerence for gEnE regulatory networks), a Borda voting-based consensus algorithm that integrates information from 10 established GRN inference methods. We conclude that COFFEE has improved performance across synthetic, curated, and experimental datasets when compared with baseline methods. Additionally, we show that a modified version of COFFEE can be leveraged to improve performance on newer cell-type specific GRN inference methods. Overall, our results demonstrate that consensus-based methods with pertinent modifications continue to be valuable for GRN inference at the single cell level. While COFFEE is benchmarked on 10 algorithms, it is a flexible strategy that can incorporate any set of GRN inference algorithms according to user preference. A Python implementation of COFFEE may be found on GitHub: https://github.com/lodimk2/coffee
Full Text Available
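The COFFEE abstract above describes a Borda voting-based consensus over edges proposed by several GRN inference methods. The following is a minimal sketch of a generic Borda count over ranked edge lists, not the COFFEE implementation itself; the function name `borda_consensus`, the gene names, and the choice to give unranked edges zero points are all assumptions made for illustration.

```python
from collections import defaultdict


def borda_consensus(rankings):
    """Merge ranked edge lists with a plain Borda count.

    rankings: list of lists; each inner list holds (regulator, target) edges
    ordered from most to least confident by one inference method.
    Returns all edges sorted by total Borda score, highest first.
    """
    all_edges = {edge for ranking in rankings for edge in ranking}
    n = len(all_edges)
    scores = defaultdict(int)
    for ranking in rankings:
        for position, edge in enumerate(ranking):
            # Top-ranked edge earns n points, the next n - 1, and so on.
            # Edges a method did not report earn nothing (an assumption).
            scores[edge] += n - position
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(all_edges, key=lambda e: (-scores[e], e))


if __name__ == "__main__":
    # Hypothetical output of three inference methods on a three-gene network.
    method_a = [("G1", "G2"), ("G2", "G3"), ("G1", "G3")]
    method_b = [("G2", "G3"), ("G1", "G2")]
    method_c = [("G1", "G2"), ("G1", "G3"), ("G2", "G3")]
    for edge in borda_consensus([method_a, method_b, method_c]):
        print(edge)
```

A rank-based vote of this kind sidesteps the problem that different methods report edge scores on incomparable scales, which may be one reason a voting scheme suits combining many algorithms.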
CHAI: consensus clustering through similarity matrix integration for cell-type identification

https://doi.org/10.1093/bib/bbae411

Lodi, Musaddiq K; Lodi, Muzammil; Osei, Kezie; Ranganathan, Vaishnavi; Hwang, Priscilla; Ghosh, Preetam (July 2024, Briefings in Bioinformatics)

Abstract Several methods have been developed to computationally predict cell-types for single cell RNA sequencing (scRNAseq) data. As methods are developed, a common problem for investigators has been identifying the best method they should apply to their specific use-case. To address this challenge, we present CHAI (consensus Clustering tHrough similArIty matrix integratIon for single cell-type identification), a wisdom of crowds approach for scRNAseq clustering. CHAI presents two competing methods which aggregate the clustering results from seven state-of-the-art clustering methods: CHAI-AvgSim and CHAI-SNF. CHAI-AvgSim and CHAI-SNF demonstrate superior performance across several benchmarking datasets. Furthermore, both CHAI methods outperform the most recent consensus clustering method, SAME-clustering. We demonstrate CHAI’s practical use case by identifying a leader tumor cell cluster enriched with CDH3. CHAI provides a platform for multiomic integration, and we demonstrate CHAI-SNF to have improved performance when including spatial transcriptomics data. CHAI overcomes previous limitations by incorporating the most recent and top performing scRNAseq clustering algorithms into the aggregation framework. It is also an intuitive and easily customizable R package where users may add their own clustering methods to the pipeline, or down-select just the ones they want to use for the clustering aggregation. This ensures that as more advanced clustering algorithms are developed, CHAI will remain useful to the community as a generalized framework. CHAI is available as an open source R package on GitHub: https://github.com/lodimk2/chai.
Full Text Available
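The CHAI abstract above names consensus clustering through similarity matrix integration, with CHAI-AvgSim as one variant. The abstract does not say how the similarity matrices are built, so the sketch below assumes a common construction: each clustering yields a binary co-association matrix (1 when two cells share a cluster), and the matrices are averaged. CHAI itself is an R package; the Python function names here are hypothetical.

```python
def coassociation(labels):
    """Binary matrix: 1.0 where two cells share a cluster label, else 0.0."""
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
            for i in range(n)]


def average_similarity(clusterings):
    """Average the co-association matrices of several clusterings.

    clusterings: list of label lists, one per method, all over the same cells.
    Entry [i][j] of the result is the fraction of methods that placed
    cells i and j in the same cluster.
    """
    if not clusterings:
        raise ValueError("need at least one clustering")
    n = len(clusterings[0])
    if any(len(labels) != n for labels in clusterings):
        raise ValueError("all clusterings must label the same cells")
    total = [[0.0] * n for _ in range(n)]
    for labels in clusterings:
        matrix = coassociation(labels)
        for i in range(n):
            for j in range(n):
                total[i][j] += matrix[i][j]
    k = len(clusterings)
    return [[value / k for value in row] for row in total]


if __name__ == "__main__":
    # Hypothetical labels from two clustering methods over three cells.
    for row in average_similarity([[0, 0, 1], [0, 1, 1]]):
        print(row)
```

The averaged matrix can then be passed to any clustering routine that accepts a precomputed similarity, which is how a consensus partition would typically be obtained under this assumption.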
Understanding the Existence of a Na₂ Dimer in a High-Spin State

https://doi.org/10.1021/acs.jpca.5c03939

Kilic, Mehmet Emin; Jena, Puru (October 2025, The Journal of Physical Chemistry A)

Free, publicly-accessible full text available October 9, 2026
Advancing infection profiling under data uncertainty through contagion potential

https://doi.org/10.1371/journal.pone.0329828

Roy, Satyaki; Biswas, Preetom; Ghosh, Preetam (August 2025, PLOS One)
Arunachalam, Viswanathan (Ed.)
During the COVID-19 pandemic, the prevalence of asymptomatic cases challenged the reliability of epidemiological statistics in policymaking. To address this, we introduced contagion potential (CP) as a continuous metric derived from sociodemographic and epidemiological data to quantify the infection risk posed by the asymptomatic within a region. However, CP estimation is hindered by incomplete or biased incidence data, where underreporting and testing constraints make direct estimation infeasible. To overcome this limitation, we employ a hypothesis-testing approach to infer CP from sampled data, allowing for robust estimation despite missing information. Even within the sample collected from spatial contact data, individuals possess partial knowledge of their neighborhoods, as their awareness is restricted to interactions captured by available tracking data. We introduce an adjustment factor that calibrates the sample CPs so that the sample is a reasonable estimate of the population CP. Further complicating estimation, biases in epidemiological and mobility data arise from heterogeneous reporting rates and sampling inconsistencies, which we address through inverse probability weighting to enhance reliability. Using a spatial model for infection spread through social mixing and an optimization framework based on the SIRS epidemic model, we analyze real infection datasets from Italy, Germany, and Austria. Our findings demonstrate that statistical methods can achieve high-confidence CP estimates while accounting for variations in sample size, confidence level, mobility models, and viral strains. By assessing the effects of bias, social mixing, and sampling frequency, we propose statistical corrections to improve CP prediction accuracy. Finally, we discuss how reliable CP estimates can inform outbreak mitigation strategies despite the inherent uncertainties in epidemiological data.
Free, publicly-accessible full text available August 12, 2026
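The contagion potential abstract above mentions inverse probability weighting as the correction for heterogeneous reporting rates. The abstract does not give the paper's weighting scheme, so the sketch below shows only the generic normalized (Hájek-style) estimator: each observation is weighted by the reciprocal of its probability of being observed. The function name `ipw_mean` and the example numbers are assumptions for illustration.

```python
def ipw_mean(values, inclusion_probs):
    """Inverse-probability-weighted mean of sampled values.

    values: observed quantities (for example, per-individual CP scores).
    inclusion_probs: probability that each observation was sampled or
    reported; must lie in (0, 1].
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if not values:
        raise ValueError("need at least one observation")
    if any(p <= 0.0 or p > 1.0 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Rarely observed units stand in for more of the population.
    weights = [1.0 / p for p in inclusion_probs]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Hypothetical sample: the high-CP individual was twice as likely
    # to be reported as the two low-CP individuals.
    print(ipw_mean([1.0, 0.0, 0.0], [0.5, 0.25, 0.25]))
```

In this toy case the unweighted mean would overstate the average because the high-value unit was over-sampled; the weights pull the estimate back toward the under-reported units.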
