

Search for: All records

Creators/Authors contains: "Wang, Hao"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

  1. Free, publicly-accessible full text available November 19, 2026
  2. Free, publicly-accessible full text available October 1, 2026
  3. Free, publicly-accessible full text available October 22, 2026
  4. Deep reinforcement learning (DRL) has demonstrated significant potential in various applications, including gaming AI, robotics, and system scheduling. DRL algorithms produce, sample, and learn from training data online through a trial-and-error process, demanding considerable time and computational resources. To address this, distributed DRL algorithms and paradigms have been developed to expedite training using extensive resources. Through carefully designed experiments, we are the first to observe that strategically increasing the actor-environment interactions by spawning more concurrent actors at certain training rounds within ephemeral time frames can significantly enhance training efficiency. Yet, current distributed DRL solutions, which are predominantly server-based (or serverful), fail to capitalize on these opportunities due to their long startup times, limited adaptability, and cumbersome scalability. This paper proposes Nitro, a generic training engine for distributed DRL algorithms that enforces timely and effective boosting with concurrent actors instantaneously spawned by serverless computing. With serverless functions, Nitro adjusts data sampling strategies dynamically according to the DRL training demands. Nitro seizes the opportunity of real-time boosting by accurately and swiftly detecting an empirical metric. To achieve cost efficiency, we design a heuristic actor scaling algorithm to guide Nitro for cost-aware boosting budget allocation. We integrate Nitro with state-of-the-art DRL algorithms and frameworks and evaluate them on AWS EC2 and Lambda. Experiments with MuJoCo and Atari benchmarks show that Nitro improves the final rewards (i.e., training quality) by up to 6× and reduces training costs by up to 42%.
    Free, publicly-accessible full text available September 1, 2026
  5. In complex traffic environments, understanding how a focal vehicle interacts (e.g., maneuvers) with various traffic elements (e.g., other vehicles, pedestrians, and road infrastructure), i.e., vehicle-to-X interactions (VXIs), is essential for developing advanced driving support and intelligent vehicles. To derive VXI scene understanding, reasoning, and decision support (e.g., suggesting a cautious move in response to a pedestrian crossing the street), this work takes into account recent advances in multi-modality large language models (MLLMs). We develop VXI-SUR, a novel VXI Scene Understanding and Reasoning system based on vision-language modeling. VXI-SUR takes in the visual VXI scene and generates structured textual responses that interpret the scene and suggest an appropriate decision (e.g., braking, slowing down). We have designed within VXI-SUR a VXI memory mechanism with both scene and knowledge augmentation mechanisms, and enabled scene-knowledge co-learning to capture complex correspondences across scenes and decisions. We have performed extensive and comprehensive evaluations of VXI-SUR based on an open-source dataset with ∼17k VXI scenes. These experimental studies corroborate its VXI awareness, description preciseness, semantic matching, and quality in understanding and reasoning about complex VXI scenes.
    Free, publicly-accessible full text available October 6, 2026
  6. Battery-powered mobile devices (e.g., smartphones, AR/VR glasses, and various IoT devices) are increasingly being used for AI training due to their growing computational power and easy access to valuable, diverse, and real-time data. On-device training is highly energy-intensive, making accurate energy consumption estimation crucial for effective job scheduling and sustainable AI. However, the heterogeneity of devices and the complexity of models challenge the accuracy and generalizability of existing methods. This paper proposes AMPERE, a generic approach for energy consumption estimation in deep neural network (DNN) training. First, we examine the layer-wise energy additivity property of DNNs and strategically partition the entire model into layers for fine-grained energy consumption profiling. Then, we fit Gaussian Process (GP) models to learn from layer-wise energy consumption measurements and estimate a DNN's overall energy consumption based on its layer-wise energy additivity property. We conduct extensive experiments with various types of models across different real-world platforms. The results demonstrate that AMPERE has effectively reduced the Mean Absolute Percentage Error (MAPE) by up to 30%. Moreover, AMPERE is applied in guiding energy-aware pruning, successfully reducing energy consumption by 50%, thereby further demonstrating its generality and potential. 
    Free, publicly-accessible full text available August 26, 2026
  7. Free, publicly-accessible full text available June 1, 2026
  8. Free, publicly-accessible full text available July 13, 2026
  9. Free, publicly-accessible full text available June 10, 2026
  10. Free, publicly-accessible full text available April 24, 2026
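
As an illustration of two ideas named in the AMPERE abstract (item 6), the sketch below shows what layer-wise energy additivity and the Mean Absolute Percentage Error (MAPE) amount to in code. It is not the authors' implementation: the function names, the layer labels, and the energy values are hypothetical, and a plain lookup table stands in for the per-layer predictions that the abstract says come from fitted Gaussian Process models.

```python
from typing import Dict, List


def estimate_model_energy(layers: List[str], per_layer_energy: Dict[str, float]) -> float:
    """Estimate total training energy as the sum of per-layer estimates.

    Assumption: `per_layer_energy` is a hypothetical lookup table standing in
    for a learned per-layer predictor; additivity means the model-level
    estimate is just the sum over its layers.
    """
    return sum(per_layer_energy[layer] for layer in layers)


def mape(actual: List[float], predicted: List[float]) -> float:
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Made-up per-layer energies (arbitrary units), for illustration only.
    table = {"conv": 2.0, "fc": 1.0}
    print(estimate_model_energy(["conv", "conv", "fc"], table))
    print(mape([10.0, 20.0], [9.0, 22.0]))
```

A lower MAPE means the summed per-layer estimates track the measured whole-model energy more closely, which is the quantity the abstract reports as reduced by up to 30%.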