NSF PAR Search | NSF Public Access Repository

CoDiCast: Conditional Diffusion Model for Global Weather Forecasting with Uncertainty Quantification

https://doi.org/10.24963/ijcai.2025/1095

Shi, Jimeng; Jin, Bowen; Han, Jiawei; Gopalakrishnan, Sundararaman; Narasimhan, Giri (September 2025, International Joint Conferences on Artificial Intelligence Organization)

Accurate weather forecasting is critical for science and society. However, existing methods have not achieved the combination of high accuracy, low uncertainty, and high computational efficiency simultaneously. On one hand, traditional numerical weather prediction (NWP) models are computationally intensive because of their complexity. On the other hand, most machine learning-based weather prediction (MLWP) approaches offer efficiency and accuracy but remain deterministic, lacking the ability to capture forecast uncertainty. To tackle these challenges, we propose a conditional diffusion model, CoDiCast, to generate global weather prediction, integrating accuracy and uncertainty quantification at a modest computational cost. The key idea behind the prediction task is to generate realistic weather scenarios at a future time point, conditioned on observations from the recent past. Due to the probabilistic nature of diffusion models, they can be properly applied to capture the uncertainty of weather predictions. Therefore, we accomplish uncertainty quantifications by repeatedly sampling from stochastic Gaussian noise for each initial weather state and running the denoising process multiple times. Experimental results demonstrate that CoDiCast outperforms several existing MLWP methods in accuracy, and is faster than NWP models in inference speed. Our model can generate 6-day global weather forecasts, at 6-hour steps and 5.625-degree latitude-longitude resolutions, for over 5 variables, in about 12 minutes on a commodity A100 GPU machine with 80GB memory. The source code is available at https://github.com/JimengShi/CoDiCast.
Free, publicly-accessible full text available September 1, 2026
LLM Alignment as Retriever Optimization: An Information Retrieval Perspective

Jin, Bowen; Yoon, Jinsung; Qin, Zhen; Wang, Ziqi; Xiong, Wei; Meng, Yu; Han, Jiawei; Arik, Sercan (July 2025, International Machine Learning Society (IMLS))

Free, publicly-accessible full text available July 23, 2026
Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision?

Li, Zihao; Zheng, Lecheng; Jin, Bowen; Fu, Dongqi; Jing, Baoyu; Ban, Yikun; He, Jingrui; Han, Jiawei (July 2025, ACL 2025)

Free, publicly-accessible full text available July 27, 2026
Chain-of-Factors Paper-Reviewer Matching

https://doi.org/10.1145/3696410.3714708

Zhang, Yu; Shen, Yanzhen; Kang, SeongKu; Chen, Xiusi; Jin, Bowen; Han, Jiawei (April 2025, ACM)

Free, publicly-accessible full text available April 22, 2026
Improving Scientific Document Retrieval with Concept Coverage-based Query Set Generation

https://doi.org/10.1145/3701551.3703544

Kang, SeongKu; Jin, Bowen; Kweon, Wonbin; Zhang, Yu; Lee, Dongha; Han, Jiawei; Yu, Hwanjo (March 2025, ACM)

Free, publicly-accessible full text available March 10, 2026
Large Language Models on Graphs: A Comprehensive Survey

https://doi.org/10.1109/TKDE.2024.3469578

Jin, Bowen; Liu, Gang; Han, Chi; Jiang, Meng; Ji, Heng; Han, Jiawei (December 2024, IEEE Transactions on Knowledge and Data Engineering)

Free, publicly-accessible full text available December 1, 2025
Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision?

https://doi.org/10.18653/v1/2025.acl-long.545

Li, Zihao; Zheng, Lecheng; Jin, Bowen; Fu, Dongqi; Jing, Baoyu; Ban, Yikun; He, Jingrui; Han, Jiawei (January 2025, Association for Computational Linguistics)

Free, publicly-accessible full text available January 1, 2026
Improving Retrieval in Theme-specific Applications using a Corpus Topical Taxonomy

https://doi.org/10.1145/3589334.3645512

Kang, SeongKu; Agarwal, Shivam; Jin, Bowen; Lee, Dongha; Yu, Hwanjo; Han, Jiawei (May 2024, ACM)
Chua, Tat-Seng; Ngo, Chong-Wah; Kumar, Ravi; Lauw, Hady W; Lee, Roy Ka-Wei (Ed.)
Document retrieval has greatly benefited from the advancements of large-scale pre-trained language models (PLMs). However, their effectiveness is often limited in theme-specific applications for specialized areas or industries, due to unique terminologies, incomplete contexts of user queries, and specialized search intents. To capture the theme-specific information and improve retrieval, we propose to use a corpus topical taxonomy, which outlines the latent topic structure of the corpus while reflecting user-interested aspects. We introduce ToTER (Topical Taxonomy Enhanced Retrieval) framework, which identifies the central topics of queries and documents with the guidance of the taxonomy, and exploits their topical relatedness to supplement missing contexts. As a plug-and-play framework, ToTER can be flexibly employed to enhance various PLM-based retrievers. Through extensive quantitative, ablative, and exploratory experiments on two real-world datasets, we ascertain the benefits of using topical taxonomy for retrieval in theme-specific applications and demonstrate the effectiveness of ToTER.
Full Text Available
Bridging Text Data and Graph Data: Towards Semantics and Structure-aware Knowledge Discovery

https://doi.org/10.1145/3616855.3636450

Jin, Bowen; Zhang, Yu; Li, Sha; Han, Jiawei (March 2024, ACM)

Graphs and texts are two key modalities in data mining. In many cases, the data presents a mixture of the two modalities and the information is often complementary: in e-commerce data, the product-user graph and product descriptions capture different aspects of product features; in scientific literature, the citation graph, author metadata, and the paper content all contribute to modeling the paper impact.
Full Text Available
Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction

https://doi.org/10.18653/v1/2024.emnlp-main.747

Zhou, Sizhe; Meng, Yu; Jin, Bowen; Han, Jiawei (January 2024, Association for Computational Linguistics)

Full Text Available
