Nitro: Boosting Distributed Reinforcement Learning with Serverless Computing

Yu, Hanfei; Carter, Jacob; Wang, Hao; Tiwari, Devesh; Li, Jian; Park, Seung-Jong

doi:10.14778/3696435.3696441

This content will become publicly available on September 1, 2026

Deep reinforcement learning (DRL) has demonstrated significant potential in various applications, including gaming AI, robotics, and system scheduling. DRL algorithms produce, sample, and learn from training data online through a trial-and-error process, demanding considerable time and computational resources. To address this, distributed DRL algorithms and paradigms have been developed to expedite training using extensive resources. Through carefully designed experiments, we are the first to observe that strategically increasing the actor-environment interactions by spawning more concurrent actors at certain training rounds within ephemeral time frames can significantly enhance training efficiency. Yet, current distributed DRL solutions, which are predominantly server-based (or serverful), fail to capitalize on these opportunities due to their long startup times, limited adaptability, and cumbersome scalability. This paper proposes Nitro, a generic training engine for distributed DRL algorithms that enforces timely and effective boosting with concurrent actors instantaneously spawned by serverless computing. With serverless functions, Nitro adjusts data sampling strategies dynamically according to the DRL training demands. Nitro seizes the opportunity of real-time boosting by accurately and swiftly detecting an empirical metric. To achieve cost efficiency, we design a heuristic actor scaling algorithm to guide Nitro for cost-aware boosting budget allocation. We integrate Nitro with state-of-the-art DRL algorithms and frameworks and evaluate them on AWS EC2 and Lambda. Experiments with MuJoCo and Atari benchmarks show that Nitro improves the final rewards (i.e., training quality) by up to 6× and reduces training costs by up to 42%.

Award ID(s): 2337914; 2534286; 2534241; 2403247

PAR ID: 10613616

Author(s) / Creator(s): Yu, Hanfei; Carter, Jacob; Wang, Hao; Tiwari, Devesh; Li, Jian; Park, Seung-Jong

Publisher / Repository: Proceedings of the VLDB Endowment

Date Published: 2025-09-01

Journal Name: Proceedings of the VLDB Endowment

Volume: 18

Issue: 1

ISSN: 2150-8097

Page Range / eLocation ID: 66 to 79

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article:
https://doi.org/10.14778/3696435.3696441