-
Large-scale deep learning workloads increasingly suffer from I/O bottlenecks as datasets grow beyond local storage capacities and GPU compute outpaces network and disk latencies. While recent systems optimize data-loading time, they overlook the energy cost of I/O, a critical factor at large scale. We introduce EMLIO, an Efficient Machine Learning I/O service that jointly minimizes end-to-end data-loading latency (𝑇) and I/O energy consumption (𝐸) across variable-latency networked storage. EMLIO deploys a lightweight data-serving daemon on storage nodes that serializes and batches raw samples, streams them over TCP with out-of-order prefetching, and integrates seamlessly with GPU-accelerated (NVIDIA DALI) pre-processing on the client side. In exhaustive evaluations over local disk, LAN (0.05 ms and 10 ms round-trip time (RTT)), and WAN (30 ms RTT) environments, EMLIO delivers on average up to 8.6X faster I/O and 10.9X lower energy use compared to state-of-the-art loaders, while maintaining constant performance and energy profiles irrespective of network distance. EMLIO’s service-based architecture offers a scalable blueprint for energy-aware I/O in next-generation AI clouds.
Free, publicly-accessible full text available November 15, 2026
-
Inter-datacenter communication is a significant part of cloud operations and produces a substantial share of the carbon emissions of cloud data centers, whose environmental impact is already a pressing issue. In this paper, we present a novel carbon-aware temporal data transfer scheduling framework, called LinTS, which promises to significantly reduce the carbon emissions of data transfers between cloud data centers. LinTS produces a competitive transfer schedule and makes scaling decisions, outperforming common heuristic algorithms. LinTS can lower carbon emissions during inter-datacenter transfers by up to 66% compared to the worst case and up to 15% compared to other solutions while preserving all deadline constraints.
Free, publicly-accessible full text available July 12, 2026
-
The increasing complexity of AI workloads, especially distributed Large Language Model (LLM) training, places significant strain on the networking infrastructure of parallel data centers and supercomputing systems. While Equal-Cost Multi-Path (ECMP) routing distributes traffic over parallel paths, hash collisions often lead to imbalanced network resource utilization and performance bottlenecks. This paper presents FlowTracer, a tool designed to analyze network path utilization and evaluate different routing strategies. Unlike tools that introduce additional traffic, FlowTracer aids in debugging network inefficiencies by passively monitoring and correlating user workload flows. As a result, FlowTracer does not interfere with ongoing data transfers, enabling analysis with minimal overhead, which is an important factor when debugging and fine-tuning routing schemes in production systems. FlowTracer can provide detailed insights into traffic distribution and can help identify the root causes of performance degradation, such as hash collisions. With FlowTracer’s flow-level insights, system operators can optimize routing, reduce congestion, and improve the performance of distributed AI workloads. We use a RoCEv2-enabled cluster with a leaf-spine network and sixteen 400-Gbps nodes to demonstrate how FlowTracer can be used to compare the flow imbalances of ECMP routing against a statically configured network. The example showcases a 30% reduction in imbalance, as measured by a new metric we introduce.
Free, publicly-accessible full text available June 8, 2026
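To make the hash-collision problem concrete, the following is a minimal sketch (not FlowTracer itself) of how hashing flow 5-tuples onto a small number of equal-cost paths can produce uneven loads, compared against a static round-robin assignment. The CRC32 hash, the example addresses, the function names, and the `(max - mean) / mean` imbalance measure are all assumptions made for illustration; the abstract does not define the metric the authors introduce.

```python
import zlib
from collections import Counter

def ecmp_path(flow, num_paths):
    """Pick a path by hashing the flow 5-tuple.

    CRC32 is only a stand-in for whatever hash a real switch applies.
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_paths

def path_loads(assignments, num_paths):
    """Count how many flows landed on each path index."""
    counts = Counter(assignments)
    return [counts.get(p, 0) for p in range(num_paths)]

def imbalance(loads):
    """Hypothetical metric: (max - mean) / mean; 0.0 means perfectly even."""
    mean = sum(loads) / len(loads)
    return (max(loads) - mean) / mean if mean else 0.0

NUM_PATHS = 4
# Sixteen made-up flows: (src IP, dst IP, src port, dst port, protocol).
# 4791 is the UDP destination port used by RoCEv2.
flows = [
    ("10.0.0.%d" % i, "10.0.1.%d" % (i % 4), 49152 + i, 4791, "UDP")
    for i in range(16)
]

ecmp_loads = path_loads([ecmp_path(f, NUM_PATHS) for f in flows], NUM_PATHS)
static_loads = path_loads([i % NUM_PATHS for i in range(len(flows))], NUM_PATHS)

print("ECMP   loads:", ecmp_loads, "imbalance:", round(imbalance(ecmp_loads), 3))
print("Static loads:", static_loads, "imbalance:", round(imbalance(static_loads), 3))
```

The static assignment is even by construction, whereas the hashed assignment depends entirely on how the 5-tuples happen to collide, which is the kind of difference a passive flow-level view is meant to expose.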
-
The growing adoption of cloud, edge, and distributed computing, together with the rising use of AI/ML workloads, has created a significant need to measure, monitor, and reduce the carbon emissions associated with these resource-intensive tasks. One significant but often overlooked source of emissions is data transfers over wide-area networks (WANs), primarily due to the challenges in accurately measuring the carbon footprint of end-to-end network paths. We introduce a novel mechanism to measure network carbon footprints and propose strategies for optimizing the scheduling of network-intensive tasks. We show that users can achieve significant carbon savings by shifting data transfer tasks across time and geographic regions based on local carbon intensity.
Free, publicly-accessible full text available March 1, 2026
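The idea of shifting a transfer in time can be illustrated with a small sketch. It assumes an hourly carbon-intensity forecast and a transfer that occupies a fixed number of consecutive hours, then picks the start hour with the lowest summed intensity that still meets a deadline. The forecast values, the function name `best_start`, and the brute-force window search are hypothetical; this is not the authors' measurement mechanism or scheduling strategy.

```python
def best_start(intensity, duration, deadline):
    """Return (start_hour, total_intensity) for the cleanest window.

    intensity: forecast carbon intensity per hour (e.g. gCO2/kWh)
    duration:  number of consecutive hours the transfer needs
    deadline:  the transfer must finish by this hour index (exclusive)
    """
    if duration <= 0 or duration > deadline or deadline > len(intensity):
        raise ValueError("transfer does not fit within the forecast and deadline")
    best = None
    for start in range(deadline - duration + 1):
        total = sum(intensity[start:start + duration])
        # Strict '<' keeps the earliest window when totals tie.
        if best is None or total < best[1]:
            best = (start, total)
    return best

# Made-up hourly forecast for one region.
forecast = [420, 390, 310, 180, 150, 210, 380, 450]

start, total = best_start(forecast, duration=2, deadline=8)
run_now_total = sum(forecast[0:2])
print("start hour:", start, "window total:", total, "run-now total:", run_now_total)
```

With this made-up forecast, delaying the two-hour transfer to hour 3 lowers the summed intensity from 810 to 330. Extending the same comparison across several regions' forecasts would correspond to the geographic shifting the abstract mentions.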
-
Adaptive bitrate (ABR) algorithms play a critical role in video streaming by making optimal bitrate decisions in dynamically changing network conditions to provide a high quality of experience (QoE) for users. However, most existing ABRs suffer from limitations such as predefined rules and incorrect assumptions about streaming parameters. They often prioritize higher bitrates and ignore the corresponding energy footprint, resulting in increased energy consumption, especially for mobile device users. Additionally, most ABR algorithms do not consider perceived quality, leading to suboptimal user experience. This article proposes a novel ABR scheme called GreenABR+, which utilizes deep reinforcement learning to optimize energy consumption during video streaming while maintaining high user QoE. Unlike existing rule-based ABR algorithms, GreenABR+ makes no assumptions about video settings or the streaming environment. The GreenABR+ model works on different video representation sets and can adapt to dynamically changing conditions in a wide range of network scenarios. Our experiments demonstrate that GreenABR+ outperforms state-of-the-art ABR algorithms by saving up to 57% in streaming energy consumption and 57% in data consumption while providing up to 25% more perceptual QoE due to up to 87% less rebuffering time and near-zero capacity violations. The generalization and dynamic adaptability make GreenABR+ a flexible solution for energy-efficient ABR optimization.
-
Advancements in mobile hardware and streaming technologies enable high-quality video streaming for mobile users, but this comes at a cost: a boost in power consumption. Despite detailed studies on power consumption during acquisition, existing studies fall short of considering recent technologies and, hence, of accurately capturing video playback power consumption. This paper presents a novel method to model mobile video playback power consumption. First, we identify the major components contributing to power consumption during video playback on mobile devices. Then, we develop models for each component to estimate their power consumption. Our experimental results show that our combined model estimates power consumption with 91% mean accuracy. Furthermore, our model maintains its high accuracy on an unseen device, achieving 88% mean accuracy despite the hardware and screen heterogeneity.
