NSF PAR Search | NSF Public Access Repository

LEVIS: Large Exact Verifiable Input Spaces for Neural Networks

Chehade, M; Li, W; Bell, BW; Bent, R; Kazi, SR; Zhu, H (July 2025, Proc. of the International Conference on Machine Learning)

The robustness of neural networks is crucial in safety-critical applications, where identifying a reliable input space is essential for effective model selection, robustness evaluation, and the development of reliable control strategies. Most existing robustness verification methods assess the worst-case output under the assumption that the input space is known. However, precisely identifying a verifiable input space, where no adversarial examples exist, is challenging due to the possible high dimensionality, discontinuity, and non-convex nature of the input space. To address this challenge, we propose a novel framework, LEVIS, comprising LEVIS-α and LEVIS-β. LEVIS-α identifies a single, large verifiable ball that intersects at least two boundaries of a bounded region, while LEVIS-β systematically captures the entirety of the verifiable space by integrating multiple verifiable balls. Our contributions are fourfold: we introduce a verification framework, LEVIS, incorporating two optimization techniques for computing nearest and directional adversarial points based on mixed-integer programming (MIP); to enhance scalability, we integrate complementary constrained (CC) optimization with a reduced MIP formulation, achieving up to a 17-fold reduction in runtime by approximating the verifiable region in a principled way; we provide a theoretical analysis characterizing the properties of the verifiable balls obtained through LEVIS-α; and we validate our approach across diverse applications, including electrical power flow regression and image classification, demonstrating performance improvements and visualizing the geometric properties of the verifiable region.
Free, publicly-accessible full text available July 14, 2026
Advancing Spectrokinetics in Heterogeneous Catalysis: From Bulk to Surface Species and Beyond

Bravo-Suarez, J J; Torres-Velasco, A; Alzahrani, H A; Patil, B S; Qi, Y; Podkolzin, S G; Zhu, H (June 2025, https://aiche.confex.com/aiche/nams25/meetingapp.cgi/Paper/705817)

This work proves the feasibility of utilizing steady-state and transient in situ/operando spectroscopy to extract mechanistic information that reduces and leads to robust kinetic models. It also opens new avenues to explore kinetics and mechanisms with charge transfer data in heterogeneous catalysis.
Free, publicly-accessible full text available June 13, 2026
3M-Diffusion: Latent Multi-Modal Diffusion for Language-Guided Molecular Structure Generation

Zhu, H; Xiao, T; Honavar, V (October 2024, openreview.net)

Generating molecular structures with desired properties is a critical task with broad applications in drug discovery and materials design. We propose 3M-Diffusion, a novel multi-modal molecular graph generation method, to generate diverse, ideally novel molecular structures with desired properties. 3M-Diffusion encodes molecular graphs into a graph latent space, which it then aligns with the text space learned by encoder-based LLMs from textual descriptions. It then reconstructs the molecular structure and atomic attributes based on the given text descriptions using the molecule decoder. It then learns a probabilistic mapping from the text space to the latent molecular graph space using a diffusion model. The results of our extensive experiments on several datasets demonstrate that 3M-Diffusion can generate high-quality, novel, and diverse molecular graphs that semantically match the textual description provided. The code is available on GitHub.
Full Text Available
APOLLO: SGD-like Memory, AdamW-level Performance

Zhu, H; Zhang, Z; Cong, W; Liu, X; Park, S; Chandra, V; Long, B; Pan, D Z; Wang, Z; Lee, J (February 2025, https://doi.org/10.48550/arXiv.2412.05270)

Large language models (LLMs) are notoriously memory-intensive during training, particularly with the popular AdamW optimizer. This memory burden necessitates using more or higher-end GPUs or reducing batch sizes, limiting training scalability and throughput. To address this, various memory-efficient optimizers have been proposed to reduce optimizer memory usage. However, they face critical challenges: (i) reliance on costly SVD operations; (ii) significant performance trade-offs compared to AdamW; and (iii) still substantial optimizer memory overhead to maintain competitive performance. In this work, we identify that AdamW's learning rate adaptation rule can be effectively coarsened as a structured learning rate update. Based on this insight, we propose Approximated Gradient Scaling for Memory-Efficient LLM Optimization (APOLLO), which approximates learning rate scaling using an auxiliary low-rank optimizer state based on pure random projection. This structured learning rate update rule makes APOLLO highly tolerant to further memory reductions while delivering comparable pre-training performance. Even its rank-1 variant, APOLLO-Mini, achieves superior pre-training performance compared to AdamW with SGD-level memory costs. Extensive experiments demonstrate that the APOLLO series performs on par with or better than AdamW, while achieving greater memory savings by nearly eliminating the optimization states of AdamW. These savings provide significant system-level benefits: (1) Enhanced Throughput: 3x throughput on an 8xA100-80GB setup compared to AdamW by supporting 4x larger batch sizes. (2) Improved Model Scalability: Pre-training LLaMA-13B with naive DDP on A100-80GB GPUs without system-level optimizations. (3) Low-End GPU Friendly Pre-training: Pre-training LLaMA-7B on a single GPU using less than 12 GB of memory with weight quantization.
Free, publicly-accessible full text available February 17, 2026
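The core idea in the APOLLO abstract above (estimating a per-channel learning rate scale from a small, randomly projected copy of the gradient) can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function name `apollo_like_step`, the `state` dictionary, the choice of a Gaussian projection scaled by 1/sqrt(rank), and the column-wise norm ratio are all assumptions made for illustration, and any further scaling heuristics the method may use are left out.

```python
import numpy as np

def apollo_like_step(W, G, state, lr=1e-2, rank=4,
                     beta1=0.9, beta2=0.999, eps=1e-8):
    """One illustrative update: Adam-style moments are kept only for a
    rank x n projection of the m x n gradient (hypothetical sketch)."""
    m, n = G.shape
    if "P" not in state:
        # Assumption: a fixed Gaussian random projection, no SVD involved.
        rng = np.random.default_rng(state.get("seed", 0))
        state["P"] = rng.standard_normal((rank, m)) / np.sqrt(rank)
        state["M"] = np.zeros((rank, n))  # first moment, low-rank space
        state["V"] = np.zeros((rank, n))  # second moment, low-rank space
        state["t"] = 0
    state["t"] += 1
    t = state["t"]

    # Project the gradient; optimizer state costs 2*rank*n instead of 2*m*n.
    R = state["P"] @ G
    state["M"] = beta1 * state["M"] + (1 - beta1) * R
    state["V"] = beta2 * state["V"] + (1 - beta2) * R**2

    # Bias-corrected Adam-style update, computed in the projected space only.
    m_hat = state["M"] / (1 - beta1**t)
    v_hat = state["V"] / (1 - beta2**t)
    R_tilde = m_hat / (np.sqrt(v_hat) + eps)

    # Per-column scale: how much the adaptive rule stretches each channel.
    scale = np.linalg.norm(R_tilde, axis=0) / (np.linalg.norm(R, axis=0) + eps)

    # Apply the coarse scale to the full, unprojected gradient.
    return W - lr * G * scale
```

Calling `apollo_like_step(W, G, state)` repeatedly with the same `state` dictionary reuses the projection and moments across steps; with `rank=1` the auxiliary state shrinks to two vectors of length n, which is the regime the abstract associates with SGD-level memory.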
RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models

Xia, P; Zhu, K; Li, H; Zhu, H; Li, Y; Li, G; Zhang, L; Yao, H (November 2024, EMNLP)

The recent emergence of Medical Large Vision Language Models (Med-LVLMs) has enhanced medical diagnosis. However, current Med-LVLMs frequently encounter factual issues, often generating responses that do not align with established medical facts. Retrieval-Augmented Generation (RAG), which utilizes external knowledge, can improve the factual accuracy of these models but introduces two major challenges. First, limited retrieved contexts might not cover all necessary information, while excessive retrieval can introduce irrelevant and inaccurate references, interfering with the model’s generation. Second, in cases where the model originally responds correctly, applying RAG can lead to an over-reliance on retrieved contexts, resulting in incorrect answers. To address these issues, we propose RULE, which consists of two components. First, we introduce a provably effective strategy for controlling factuality risk through the calibrated selection of the number of retrieved contexts. Second, based on samples where over-reliance on retrieved contexts led to errors, we curate a preference dataset to fine-tune the model, balancing its dependence on inherent knowledge and retrieved contexts for generation. We demonstrate the effectiveness of RULE on three medical VQA datasets, achieving an average improvement of 20.8% in factual accuracy.
Full Text Available
NASM: Neural Anisotropic Surface Meshing

Li, H; Zhu, H; Zhong, S; Wang, N; Lin, C; Guo, X; Xin, S; Wang, W; Hua, J; Zhong, Z (December 2024, SIGGRAPH Asia conference)

Full Text Available
Coupling physiochemical adsorption with biodegradation for enhanced removal of microcystin-LR in water

Tang, S; Zhang, L; Zhu, H; Jiang, SC (May 2024, Science of the Total Environment)

Full Text Available
Waveform Effects on Shear Wave Splitting Near Fault Zones

https://doi.org/10.1029/2025JB031656

Hua, J; Schulte‐Pelkum, V; Becker, T W; He, B; Zhu, H (August 2025, Journal of Geophysical Research: Solid Earth)

Shear wave splitting of teleseismic core phases such as SKS is commonly used to constrain mantle seismic anisotropy, a proxy for convective deformation. In plate boundaries, sharp lateral variations of splitting measurements near transform faults are often linked to deformation within a lithospheric shear zone below, but potential seismic waveform effects from heterogeneous structure on small scales may influence the interpretation. Here, we explore possible finite frequency effects on shear wave splitting near fault zones in a fully three‐dimensional anisotropic setting. We find that shear zones wider than 80 km, a scale set by the Fresnel zone, can be clearly detected, but narrower zones are less distinguishable. Near the edge of the shear zone, the combined effect of anisotropy and scattering generates false splitting measurements with large delay times and fast axis orientation approaching the back‐azimuth, a bias which can only be identified when records from different back‐azimuths are analyzed together. This substantiates that back‐azimuthal variations of splitting can arise not just from vertical layering but also lateral changes of anisotropic media. We also test the effects of shear zone edge geometry, epicentral distance, filtering frequency, crustal thickness, and sediment cover. Our study delineates the ability of shear wave splitting to resolve and investigate fault zones, and emphasizes the importance of good azimuthal coverage to correctly interpret observed anisotropy. Based on revisiting previous shear wave splitting and lithospheric deformation studies, we infer that many crustal fault zones are underlain by lithospheric shear zones at least 20 km wide.
Free, publicly-accessible full text available August 1, 2026
Application of Spectrokinetic Techniques for the Understanding of Active Sites and Intermediate Species in Heterogeneous Catalysis

Torres-Velasco, A; Patil, B; Srinivasan, P; Zhu, H; Qi, Y; Podkolzin, S G; Bravo-Suarez, J J (July 2024, https://www.icc-lyon2024.fr/en/scientific-program/detailed-program/47)

The development of simple in situ spectrokinetic techniques to assess the nature and adsorption location of intermediate species can benefit catalytic studies by providing insights into the reasons for differences in catalyst performance and by facilitating mechanistic proposals.
Full Text Available
ANALYZING EQUITY IN TRUCK-DRONE COOPERATIVE DELIVERY FOR RURAL AREAS

Zhu, H; He, X; Wang, Z (January 2024, the 103rd Transportation Research Board Annual Meeting)

Given the surge in rural logistics services and the disparities between urban and rural delivery services, a compelling necessity emerges to explore innovative drone-based delivery solutions. The challenges inherent in truck-drone delivery due to technological and physical barriers affect service quality for some rural customers, thus magnifying concerns about delivery fairness. To investigate delivery equity, we present a truck-drone cooperative delivery model to analyze rural customers’ accessibility to such innovative delivery technology. This model accommodates rural residents’ delivery preferences while optimizing truck routes. Drones are dispatched from designated trucks to serve customers within their flight distance. Our proposed heuristic algorithm, founded on graph-based truck-drone delivery preferences, solves this intricate problem efficiently. Numerical experiments underscore the efficacy of our approach, highlighting substantial reductions in delivery costs and an impressive 20% increase in drone deliveries on a large-scale network. Through sensitivity analyses exploring drone operational costs and flight distances (both affected by government policies and technological advancements), we devise an equity metric that gauges the efficiency and accessibility of rapid rural delivery services under the truck-drone delivery framework. Our research contributes to equity analysis, addressing challenges faced by logistics companies and rural residents. Moreover, it bridges the gap between urban and rural logistics, fostering an inclusive and equitable delivery ecosystem benefiting all customers, regardless of their location.
Full Text Available
