NSF PAR Search | NSF Public Access Repository

Switching constrained OCO with predictions and feedback delays

https://doi.org/10.1016/j.peva.2025.102524

Pan, Weici; Liu, Zhenhua (November 2025, Performance Evaluation)

Full Text Available
CAPLAI: AI-assisted Lifecycle Provisioning for GPU data centers

https://doi.org/10.1109/MASCOTS67699.2025.11283387

Nie, Chengyi; Xing, Anna; Latif, Imran; Liu, Zhenhua (October 2025, IEEE)

Full Text Available
Multi-Entanglement Routing Design Over Quantum Networks Using Greenberger–Horne–Zeilinger Measurements

https://doi.org/10.1109/TON.2025.3608121

Zeng, Yiming; Zhang, Jiarui; Liu, Ji; Liu, Zhenhua; Yang, Yuanyuan (September 2025, IEEE Transactions on Networking)

Full Text Available
MALLM: Multi-Agent Decision-Making with LLMs for Multi-User Edge-Sensor Environments

https://doi.org/10.1145/3764944.3764947

Fu, Heming; Pan, Weici; Zhou, Liangkai; Zhang, Zeyu; Liu, Zhenhua; Lin, Shan (August 2025, ACM SIGMETRICS Performance Evaluation Review)

Multi-user environments present significant challenges in coordinating diverse preferences and resolving conflicts around shared resources. Current systems use a single-agent approach that struggles to balance individual needs with collective objectives. We introduce MALLM, a novel framework that deploys personalized LLM-based agents for each user on edge devices. MALLM integrates multi-sensor data fusion with a structured multi-agent decision-making mechanism, processing all data locally for enhanced privacy. Our edge-computing architecture enables real-time deliberation through evidence-based argumentation and consensus formation algorithms. The system continuously refines user profiles through sensor data while managing computational resources efficiently. We evaluate MALLM through two case studies, health monitoring and personalized comfort management, demonstrating improved conflict resolution and resource efficiency compared to conventional approaches. Our results show that MALLM effectively balances competing user priorities while preserving privacy in complex shared environments.
Full Text Available
Energy-efficient GPU SM allocation

https://doi.org/10.1145/3764944.3764952

Han, Bing-Shiun; Parekh, Kunaal; Lin, Wan-Chu; Paul, Tathagata; Gandhi, Anshul; Liu, Zhenhua (August 2025, ACM SIGMETRICS Performance Evaluation Review)

GPU sharing between workloads is an effective approach to increase GPU utilization and reduce idle power waste. To minimize resource contention under GPU sharing, current architectures allow users to allocate core GPU compute resources exclusively to workloads. However, identifying the most efficient GPU compute resource allocation for colocated workloads is challenging, as it requires balancing potential performance degradation and power savings. This paper presents a framework for finding the most energy-efficient compute allocation for colocated workload pairs under NVIDIA MPS using lightweight prediction models. Experimental results, using a range of training, inference, and general CUDA workloads, demonstrate that our solution outperforms the equal sharing strategy by 35%, on average, and is within 1.5% of the offline optimal strategy.
Full Text Available
Cannikin: Optimal Adaptive Distributed DNN Training over Heterogeneous Clusters

https://doi.org/10.1145/3652892.3700767

Nie, Chengyi; Maghakian, Jessica; Liu, Zhenhua (December 2024, ACM)

Full Text Available
KACE: Kernel-Aware Colocation for Efficient GPU Spatial Sharing

https://doi.org/10.1145/3698038.3698555

Han, Bing-Shiun; Paul, Tathagata; Liu, Zhenhua; Gandhi, Anshul (November 2024, ACM)

Full Text Available
Multi-User Entanglement Routing Design over Quantum Internets

https://doi.org/10.1109/ICDCS60910.2024.00033

Zeng, Yiming; Zhang, Jiarui; Shang, Xiaojun; Liu, Ji; Liu, Zhenhua; Yang, Yuanyuan (July 2024, IEEE)

Full Text Available
Joint Task Offloading and Resource Allocation in Heterogeneous Edge Environments

https://doi.org/10.1109/TMC.2023.3335198

Liu, Yu; Mao, Yingling; Liu, Zhenhua; Ye, Fan; Yang, Yuanyuan (June 2024, IEEE Transactions on Mobile Computing)

Full Text Available
Entanglement Routing Design Over Quantum Networks

https://doi.org/10.1109/TNET.2023.3282560

Zeng, Yiming; Zhang, Jiarui; Liu, Ji; Liu, Zhenhua; Yang, Yuanyuan (February 2024, IEEE/ACM Transactions on Networking)

Full Text Available
