NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

A large-scale benchmark for network inference from single-cell perturbation data

https://doi.org/10.1038/s42003-025-07764-y

Chevalley, Mathieu; Roohani, Yusuf H; Mehrjou, Arash; Leskovec, Jure; Schwab, Patrick (December 2025, Communications Biology)

Free, publicly-accessible full text available December 1, 2026
RelGNN: Composite Message Passing for Relational Deep Learning

Chen, Tianlang; Kanatsoulis, Charilaos; Leskovec, Jure (July 2025, International Conference on Machine Learning)

Predictive tasks on relational databases are critical in real-world applications spanning e-commerce, healthcare, and social media. To address these tasks effectively, Relational Deep Learning (RDL) encodes relational data as graphs, enabling Graph Neural Networks (GNNs) to exploit relational structures for improved predictions. However, existing RDL methods often overlook the intrinsic structural properties of the graphs built from relational databases, leading to modeling inefficiencies, particularly in handling many-to-many relationships. Here we introduce RELGNN, a novel GNN framework specifically designed to leverage the unique structural characteristics of the graphs built from relational databases. At the core of our approach is the introduction of atomic routes, which are simple paths that enable direct single-hop interactions between the source and destination nodes. Building upon these atomic routes, RELGNN designs new composite message passing and graph attention mechanisms that reduce redundancy, highlight key signals, and enhance predictive accuracy. RELGNN is evaluated on 30 diverse real-world tasks from RELBENCH (Fey et al., 2024), and achieves state-of-the-art performance on the vast majority of tasks, with improvements of up to 25%.
Free, publicly-accessible full text available July 13, 2026
Limitations of cell embedding metrics assessed using drifting islands

https://doi.org/10.1038/s41587-025-02702-z

Wang, Hanchen; Leskovec, Jure; Regev, Aviv (June 2025, Nature Biotechnology)

Free, publicly-accessible full text available June 11, 2026
CollabLLM: From Passive Responders to Active Collaborators

Wu, Shirley; Galley, Michel; Peng, Baolin; Cheng, Hao; Li, Gavin; Dou, Yao; Cai, Weixin; Zou, James; Leskovec, Jure; Gao, Jianfeng (July 2025, International Conference on Machine Learning)

Large Language Models are typically trained with next-turn rewards, limiting their ability to optimize for long-term interaction. As a result, they often respond passively to ambiguous or open-ended user requests, failing to help users reach their ultimate intents and leading to inefficient conversations. To address these limitations, we introduce COLLABLLM, a novel and general training framework that enhances multiturn human-LLM collaboration. Its key innovation is a collaborative simulation that estimates the long-term contribution of responses using Multiturn-aware Rewards. By reinforcement fine-tuning these rewards, COLLABLLM goes beyond responding to user requests, and actively uncovers user intent and offers insightful suggestions, a key step towards more human-centered AI. We also devise a multiturn interaction benchmark with three challenging tasks such as document creation. COLLABLLM significantly outperforms our baselines with averages of 18.5% higher task performance and 46.3% improved interactivity by LLM judges. Finally, we conduct a large user study with 201 judges, where COLLABLLM increases user satisfaction by 17.6% and reduces user spent time by 10.4%.
Free, publicly-accessible full text available July 13, 2026
Aligning target-aware molecule diffusion models with exact energy optimization

Gu, Siyi; Xu, Minkai; Powers, Alexander; Nie, Weili; Geffner, Tomas; Kreis, Karsten; Leskovec, Jure; Vahdat, Arash; Ermon, Stefano (December 2024, Advances in Neural Information Processing Systems)

Free, publicly-accessible full text available December 10, 2025
STARK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases

Wu, Shirley; Zhao, Shiyu; Yasunaga, Michihiro; Huang, Kexin; Cao, Kaidi; Huang, Qian; Ioannidis, Vassilis N; Subbian, Karthik; Zou, James; Leskovec, Jure (December 2024, Advances in Neural Information Processing Systems)

Free, publicly-accessible full text available December 10, 2025
AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning

Wu, Shirley; Zhao, Shiyu; Huang, Qian; Huang, Kexin; Yasunaga, Michihiro; Cao, Kaidi; Ioannidis, Vassilis N; Subbian, Karthik; Leskovec, Jure; Zou, James (December 2024, Advances in Neural Information Processing Systems)

Free, publicly-accessible full text available December 10, 2025
A foundation model for clinician-centered drug repurposing

https://doi.org/10.1038/s41591-024-03233-x

Huang, Kexin; Chandak, Payal; Wang, Qianwen; Havaldar, Shreyas; Vaid, Akhil; Leskovec, Jure; Nadkarni, Girish N; Glicksberg, Benjamin S; Gehlenborg, Nils; Zitnik, Marinka (December 2024, Nature Medicine)

Free, publicly-accessible full text available December 1, 2025
RELBENCH: A Benchmark for Deep Learning on Relational Databases

Robinson, Joshua; Ranjan, Rishabh; Hu, Weihua; Huang, Kexin; Han, Jiaqi; Dobles, Alejandro; Fey, Matthias; Lenssen, Jan E; Yuan, Yiwen; Zhang, Zecheng; et al. (December 2024, Advances in Neural Information Processing Systems)

Free, publicly-accessible full text available December 10, 2025
Inferring Dynamic Networks from Marginals with Iterative Proportional Fitting

Chang, Serina; Koehler, Frederic; Qu, Zhaonan; Leskovec, Jure; Ugander, Johan (July 2024, International Conference on Machine Learning (ICML))

A common network inference problem, arising from real-world data constraints, is how to infer a dynamic network from its time-aggregated adjacency matrix and time-varying marginals (i.e., row and column sums). Prior approaches to this problem have repurposed the classic iterative proportional fitting (IPF) procedure, also known as Sinkhorn's algorithm, with promising empirical results. However, the statistical foundation for using IPF has not been well understood: under what settings does IPF provide principled estimation of a dynamic network from its marginals, and how well does it estimate the network? In this work, we establish such a setting, by identifying a generative network model whose maximum likelihood estimates are recovered by IPF. Our model both reveals implicit assumptions on the use of IPF in such settings and enables new analyses, such as structure-dependent error bounds on IPF's parameter estimates. When IPF fails to converge on sparse network data, we introduce a principled algorithm that guarantees IPF converges under minimal changes to the network structure. Finally, we conduct experiments with synthetic and real-world data, which demonstrate the practical value of our theoretical and algorithmic contributions.
Full Text Available
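
The abstract of the last record above refers to the classic iterative proportional fitting (IPF) procedure, also known as Sinkhorn's algorithm. As a rough illustration only, the sketch below shows the textbook form of IPF: it alternately rescales the rows and columns of a nonnegative seed matrix until the row and column sums match given target marginals. This is not the authors' implementation; the function name `ipf`, its parameters, and the toy numbers are hypothetical and chosen for this example. Treating the time-aggregated adjacency matrix as the seed and one time step's marginals as the targets is an assumption about how the abstract's setting maps onto the procedure.

```python
def ipf(seed, row_targets, col_targets, max_iters=1000, tol=1e-9):
    """Textbook IPF: rescale a nonnegative seed matrix to match target marginals.

    All names here are hypothetical. `seed` is a list of rows; zeros in the
    seed stay zero because every update is a multiplicative rescaling.
    """
    if abs(sum(row_targets) - sum(col_targets)) > 1e-9:
        raise ValueError("row and column targets must have the same total")

    x = [[float(v) for v in row] for row in seed]
    n_rows, n_cols = len(x), len(x[0])

    for _ in range(max_iters):
        # Row step: scale each row so that it sums to its target.
        for i in range(n_rows):
            s = sum(x[i])
            if s > 0:
                factor = row_targets[i] / s
                x[i] = [v * factor for v in x[i]]

        # Column step: scale each column so that it sums to its target.
        for j in range(n_cols):
            s = sum(x[i][j] for i in range(n_rows))
            if s > 0:
                factor = col_targets[j] / s
                for i in range(n_rows):
                    x[i][j] *= factor

        # Columns were just rescaled, so only the rows need to be checked.
        err = max(abs(sum(x[i]) - row_targets[i]) for i in range(n_rows))
        if err < tol:
            return x

    # On sparse seeds the targets may be unreachable, so IPF may not converge.
    raise RuntimeError("IPF did not converge within max_iters")


if __name__ == "__main__":
    # Toy example (made-up numbers): a 2x2 seed with target marginals.
    seed = [[1, 2], [3, 4]]
    estimate = ipf(seed, row_targets=[4, 6], col_targets=[5, 5])
    for row in estimate:
        print([round(v, 4) for v in row])
```

The sketch raises an error rather than returning a partial result when the row sums never reach their targets, which can happen when the seed has too many zeros for the requested marginals to be attainable. It does not include the convergence-repair algorithm or the error bounds that the abstract describes.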
