NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

CoDiCast: Conditional Diffusion Model for Global Weather Forecasting with Uncertainty Quantification

https://doi.org/10.24963/ijcai.2025/1095

Shi, Jimeng; Jin, Bowen; Han, Jiawei; Gopalakrishnan, Sundararaman; Narasimhan, Giri (September 2025, International Joint Conferences on Artificial Intelligence Organization)

Accurate weather forecasting is critical for science and society. However, existing methods have not achieved the combination of high accuracy, low uncertainty, and high computational efficiency simultaneously. On one hand, traditional numerical weather prediction (NWP) models are computationally intensive because of their complexity. On the other hand, most machine learning-based weather prediction (MLWP) approaches offer efficiency and accuracy but remain deterministic, lacking the ability to capture forecast uncertainty. To tackle these challenges, we propose a conditional diffusion model, CoDiCast, to generate global weather prediction, integrating accuracy and uncertainty quantification at a modest computational cost. The key idea behind the prediction task is to generate realistic weather scenarios at a future time point, conditioned on observations from the recent past. Due to the probabilistic nature of diffusion models, they can be properly applied to capture the uncertainty of weather predictions. Therefore, we accomplish uncertainty quantifications by repeatedly sampling from stochastic Gaussian noise for each initial weather state and running the denoising process multiple times. Experimental results demonstrate that CoDiCast outperforms several existing MLWP methods in accuracy, and is faster than NWP models in inference speed. Our model can generate 6-day global weather forecasts, at 6-hour steps and 5.625-degree latitude-longitude resolutions, for over 5 variables, in about 12 minutes on a commodity A100 GPU machine with 80GB memory. The source code is available at https://github.com/JimengShi/CoDiCast.
more » « less
Free, publicly-accessible full text available September 1, 2026
LLM Alignment as Retriever Optimization: An Information Retrieval Perspective

Jin, Bowen; Yoon, Jinsung; Qin, Zhen; Wang, Ziqi; Xiong, Wei; Meng, Yu; Han, Jiawei; Arik, Sercan (July 2025, International Machine Learning Society (IMLS))

Free, publicly-accessible full text available July 23, 2026
Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision?

Li, Zihao; Zheng, Lecheng Zheng; Jin, Bowen; Fu, Dongqi; Jing, Baoyu; Ban, Yikun; He, Jingrui; Han, Jiawei (July 2025, ACL 2025)

Free, publicly-accessible full text available July 27, 2026
Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval

Jiang, Pengcheng; Xiao, Cao; Jiang, Minhao; Bhatia, Parminder; Kass-Hout, Taha; Sun, Jimeng; Han, Jiawei (May 2025, ICLR/OpenReview)

Free, publicly-accessible full text available May 18, 2026
Retrieval and Structuring Augmented Generation with LLMs for Web Applications

https://doi.org/10.1145/3701716.3715870

Jiao, Yizhu; Ouyang, Siru; Zhong, Ming; Zhang, Yunyi; Ding, Linyi; Zhou, Sizhe; Han, Jiawei (May 2025, ACM)

Free, publicly-accessible full text available May 8, 2026
Chain-of-Factors Paper-Reviewer Matching

https://doi.org/10.1145/3696410.3714708

Zhang, Yu; Shen, Yanzhen; Kang, SeongKu; Chen, Xiusi; Jin, Bowen; Han, Jiawei (April 2025, ACM)

Free, publicly-accessible full text available April 22, 2026
Multimodal Search in Chemical Documents and Reactions

https://doi.org/10.1145/3726302.3730152

Shah, Ayush Kumar; Dey, Abhisek; Luo, Leo; Amador, Bryan; Philippy, Patrick; Zhong, Ming; Ouyang, Siru; Friday, David Mark; Bianchi, David; Jackson, Nick; et al (July 2025, ACM)

Free, publicly-accessible full text available July 13, 2026
TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision

https://doi.org/10.1145/3696410.3714940

Zhang, Yunyi; Yang, Ruozhen; Xu, Xueqiang; Li, Rui; Xiao, Jinfeng; Shen, Jiaming; Han, Jiawei (April 2025, ACM)

Free, publicly-accessible full text available April 22, 2026
Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations

https://doi.org/10.1609/aaai.v39i1.32013

Jiang, Pengcheng; Xiao, Cao; Fu, Tianfan; Bhatia, Parminder; Kass-Hout, Taha; Sun, Jimeng; Han, Jiawei (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

Molecular representation learning is vital for various downstream applications, including the analysis and prediction of molecular properties and side effects. While Graph Neural Networks (GNNs) have been a popular framework for modeling molecular data, they often struggle to capture the full complexity of molecular representations. In this paper, we introduce a novel method called Gode, which accounts for the dual-level structure inherent in molecules. Molecules possess an intrinsic graph structure and simultaneously function as nodes within a broader molecular knowledge graph. Gode integrates individual molecular graph representations with multi-domain biochemical data from knowledge graphs. By pre-training two GNNs on different graph structures and employing contrastive learning, Gode effectively fuses molecular structures with their corresponding knowledge graph substructures. This fusion yields a more robust and informative representation, enhancing molecular property predictions by leveraging both chemical and biological information. When fine-tuned across 11 chemical property tasks, our model significantly outperforms existing benchmarks, achieving an average ROC-AUC improvement of 12.7% for classification tasks and an average RMSE/MAE improvement of 34.4% for regression tasks. Notably, Gode surpasses the current leading model in property prediction, with advancements of 2.2% in classification and 7.2% in regression tasks.
more » « less
Free, publicly-accessible full text available April 11, 2026
Beyond True or False: Retrieval-Augmented Hierarchical Analysis of Nuanced Claims

https://doi.org/10.18653/v1/2025.acl-long.1434

Kargupta, Priyanka; Tian, Runchu; Han, Jiawei (January 2025, Association for Computational Linguistics)

Free, publicly-accessible full text available January 1, 2026

« Prev Next »

Search for: All records