- The cellular network has undergone rapid progress since its inception in the 1980s. While rapid iteration of newer generations of cellular technology plays a key role in this evolution, the incremental and eventually wide deployment of every new technology generation also plays a vital role in delivering the promised performance improvement. In this work, we conduct the first metamorphosis study of a cellular network generation, 5G, by measuring user-experienced 5G performance from the 5G network's birth (initial deployment) to maturity (steady state). By analyzing a 4-year 5G performance trace of 2.65M+ Ookla® Speedtest Intelligence® measurements collected in 9 cities in the United States and Europe from January 2020 to December 2023, we unveil the detailed evolution of 5G coverage, throughput, and latency at quarterly granularity, compare the performance diversity across the 9 representative cities, and gain insights into compounding factors that affect user-experienced 5G performance, such as the adoption of 5G devices and the load on the 5G network. Our study uncovers the typical life cycle of a new cellular technology generation as it undergoes its "growing pains" on the way to delivering its promised QoE improvement over the previous technology generation.
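As a rough illustration of the longitudinal analysis described above, the sketch below buckets per-sample throughput measurements into quarterly medians per city. The CSV file and column names (city, timestamp, download_mbps) are hypothetical placeholders, not the actual Speedtest Intelligence schema.

```python
import pandas as pd

# Hypothetical schema: one Speedtest-style measurement per row with a
# timestamp, the city where it was taken, and the measured downlink throughput.
df = pd.read_csv("speedtest_5g_samples.csv", parse_dates=["timestamp"])

# Bucket measurements into calendar quarters to track the evolution of
# user-experienced throughput from initial deployment to steady state.
df["quarter"] = df["timestamp"].dt.to_period("Q")

quarterly = (
    df.groupby(["city", "quarter"])["download_mbps"]
      .median()
      .unstack("city")   # one column per city for side-by-side comparison
)
print(quarterly)
```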
- In 2022, three years after the initial 5G rollout, the authors of [28] conducted an in-depth measurement study of user-perceived experience (network coverage, performance, and QoE of a set of major 5G "killer" apps) over all three major US carriers through a cross-country US driving trip (from Los Angeles to Boston). The study revealed disappointingly low 5G coverage and suboptimal network performance, falling short of the expectations needed to support the new generation of 5G "killer" apps. Now, five years into the 5G era, widely considered its midlife, 5G networks are expected to deliver stable and mature performance. In this work, we replicate the 2022 study along the same coast-to-coast route, evaluating the current state of cellular coverage and network and application performance across all three major US operators. While we observe a substantial increase in 5G coverage and a corresponding boost in network performance, two out of three operators still exhibit less than 50% 5G coverage along the driving route even five years after the initial 5G rollout. We expand the scope of the previous work by analyzing key lower-layer KPIs that directly influence network performance. Finally, we introduce a head-to-head comparison with Starlink's LEO satellite network to assess whether emerging non-terrestrial networks (NTNs) can complement terrestrial cellular infrastructure in the next generation of wireless connectivity.
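A minimal sketch of how route coverage like the "less than 50% 5G coverage" figure above could be estimated from a drive-test log, assuming a hypothetical per-sample log with operator and radio-technology columns (not the study's actual dataset or tooling).

```python
import pandas as pd

# Hypothetical drive-test log: one row per sample along the route with the
# serving operator and the radio technology ("5G", "LTE", ...) reported by the modem.
log = pd.read_csv("drive_test_log.csv")

# Approximate route coverage as the fraction of drive samples served by 5G.
# A real study would weight by distance travelled rather than by sample count.
coverage = (
    log.assign(is_5g=log["radio_tech"].eq("5G"))
       .groupby("operator")["is_5g"]
       .mean()
       .mul(100)
       .round(1)
)
print(coverage.rename("5G coverage (% of samples)"))
```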
- With the rapid innovation of GPUs, heterogeneous GPU clusters in both public clouds and on-premise data centers have become increasingly commonplace. In this paper, we demonstrate how pipeline parallelism, a technique well-studied for throughput-oriented deep learning model training, can be used effectively for serving latency-bound model inference, e.g., in video analytics systems, on heterogeneous GPU clusters. Our work exploits the synergy between diversity in model layers and diversity in GPU architectures, which results in comparable inference latency for many layers when running on low-class and high-class GPUs. We explore how this overlooked capability of low-class GPUs can be exploited using pipeline parallelism and present a novel inference serving system, PPipe, that employs pool-based pipeline parallelism via an MILP-based control plane and a data plane that performs resource-reservation-based adaptive batching. Evaluation results on diverse workloads (18 CNN models) show that PPipe achieves 41.1%–65.5% higher utilization of low-class GPUs while maintaining high utilization of high-class GPUs, leading to 32.2%–75.1% higher serving throughput compared to various baselines.
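The pipeline-partitioning idea can be illustrated with a toy two-stage splitter that assigns a prefix of layers to a low-class GPU and the suffix to a high-class GPU, minimizing the bottleneck stage. This greedy sketch is only illustrative and is not PPipe's MILP formulation; the per-layer latencies are made up.

```python
# Toy two-stage partitioner: layers [0, k) run on a low-class GPU, the rest on
# a high-class GPU; pick the split that minimizes the slower (bottleneck) stage.
# PPipe solves the general pool-based partitioning problem with an MILP.

# Made-up per-layer latencies in milliseconds on each GPU class.
low_gpu_ms  = [4.0, 4.5, 6.0, 9.0, 12.0, 15.0]
high_gpu_ms = [3.5, 3.8, 4.0, 4.5, 5.0, 6.0]

def best_split(low_ms, high_ms):
    best = None
    for k in range(len(low_ms) + 1):          # layers [0, k) on the low-class GPU
        stage_low = sum(low_ms[:k])
        stage_high = sum(high_ms[k:])
        bottleneck = max(stage_low, stage_high)
        if best is None or bottleneck < best[0]:
            best = (bottleneck, k, stage_low, stage_high)
    return best

bottleneck, k, s_low, s_high = best_split(low_gpu_ms, high_gpu_ms)
print(f"split after layer {k}: low-GPU stage {s_low:.1f} ms, "
      f"high-GPU stage {s_high:.1f} ms, pipeline bottleneck {bottleneck:.1f} ms")
```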
- Networking research has witnessed a renaissance from exploring the seemingly unlimited predictive power of machine learning (ML) models. One such promising direction is throughput prediction: accurately predicting the network bandwidth or achievable throughput of a client in real time using ML models can enable a wide variety of network applications to proactively adapt their behavior to changing network dynamics and potentially achieve significantly improved QoE. Motivated by the key role of newer generations of cellular networks in supporting the new generation of latency-critical applications such as AR/MR, in this work we focus on accurate throughput prediction in cellular networks at fine time scales, e.g., on the order of 100 ms. Through a 4-day, 1000+ km driving trip, we collect a dataset of fine-grained throughput measurements while driving across all three major US operators. Using the collected dataset, we conduct the first feasibility study of predicting fine-grained application throughput in real-world cellular networks with mixed LTE/5G technologies. Our analysis shows that popular ML models previously claimed to predict well in various wireless network scenarios (e.g., WiFi or single-technology networks such as LTE only) do not predict well under app-centric metrics such as ARE95 and PARE10. Further, we uncover the root cause of the poor prediction accuracy of ML models as the inherent conflicting sample sequences in the fine-grained cellular network throughput data.
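The sketch below computes one plausible reading of the app-centric metrics named above, assuming ARE95 denotes the 95th percentile of the absolute relative error and PARE10 the fraction of predictions whose absolute relative error exceeds 10%; the paper's exact definitions may differ, and the sample values are invented.

```python
import numpy as np

def absolute_relative_error(actual, predicted):
    """Per-sample |predicted - actual| / actual for strictly positive throughputs."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(predicted - actual) / actual

# Made-up throughput samples (Mbps) for illustration only.
actual    = np.array([120.0, 45.0, 80.0, 10.0, 200.0])
predicted = np.array([110.0, 60.0, 78.0, 14.0, 150.0])

are = absolute_relative_error(actual, predicted)
are95  = np.percentile(are, 95)     # assumed: 95th percentile of ARE
pare10 = np.mean(are > 0.10)        # assumed: fraction of samples with ARE > 10%

print(f"ARE95 ~ {are95:.2%}, PARE10 ~ {pare10:.0%}")
```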
- With faster wireless networks and server GPUs, offloading high-accuracy but compute-intensive AR tasks implemented as Deep Neural Networks (DNNs) to edge servers offers a promising way to support high-QoE Augmented/Mixed Reality (AR/MR) applications. A cost-effective way for AR app vendors to deploy such edge-assisted AR apps to support a large user base is to use commercial Machine-Learning-as-a-Service (MLaaS) deployed at the edge cloud. To maximize cost-effectiveness, such an MLaaS provider faces a key design challenge: how to maximize the number of clients concurrently served by each GPU server in its cluster while meeting per-client AR task accuracy SLAs. This AR offloading inference serving problem differs from generic inference serving or video analytics serving in one fundamental way: due to the use of local tracking, which reuses the last server-returned inference result to derive results for the current frame, the offloading frequency and end-to-end latency of each AR client directly affect its AR task accuracy (for all frames). In this paper, we present ARISE, a framework that optimizes edge server capacity in serving edge-assisted AR clients. Our design exploits the intricate interplay between per-client offloading schedules and batched inference on the server by proactively coordinating the offloading request streams from different AR clients. Our evaluation using a large set of emulated AR clients and a 10-phone testbed shows that ARISE supports 1.7x–6.9x more clients compared to various baselines while keeping per-client accuracy within the client-specified accuracy SLAs.
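A minimal sketch of the kind of cross-client coordination described above: clients that offload at the same period are grouped, and groups are staggered so that each group's requests arrive together and fill a GPU batch. This is a simplification of ARISE's actual scheduler; the period, batch size, and client count are assumed values.

```python
# Toy illustration of coordinating offload request streams across AR clients.
# Clients in the same group share a start offset so their requests can be
# batched together; groups are staggered so batches arrive back to back and
# the GPU stays busy. ARISE's real scheduler jointly optimizes per-client
# schedules and batching; this is only a sketch with made-up parameters.

period_ms = 100     # each client offloads one frame every 100 ms (assumed)
num_clients = 8
batch_size = 4      # server batches 4 requests per inference (assumed)

num_groups = num_clients // batch_size
group_gap_ms = period_ms / num_groups   # space batch groups evenly within a period

schedule = {}
for client in range(num_clients):
    group = client % num_groups
    schedule[client] = group * group_gap_ms

for group in range(num_groups):
    members = [c for c, off in schedule.items() if off == group * group_gap_ms]
    print(f"group {group}: clients {members} offload at t = "
          f"{group * group_gap_ms:.0f} ms (mod {period_ms} ms)")
```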
- After a rapid deployment worldwide over the past few years, 5G is expected to have reached a mature deployment stage that provides measurable improvement in network performance and user experience over its predecessors. In this study, we aim to assess 5G deployment maturity via three conditions: (1) Does 5G performance remain stable over a long time span? (2) Does 5G provide better performance than its predecessor LTE? (3) Does the technology offer similar performance across diverse geographic areas and cellular operators? We answer these questions by conducting a cross-sectional, year-long measurement study of 5G uplink performance. Leveraging a custom Android app, we collected 5G uplink performance measurements (of critical importance to latency-critical apps) spanning 8 major cities in 7 countries on two continents. Our measurements show that 5G deployment in major cities appears to have matured, with no major performance improvements observed over the one-year period, but 5G does not provide consistently superior measurable performance over LTE, especially in terms of latency, and 5G performance remains clearly uneven across the 8 cities. Our study suggests that, while 5G deployment appears to have stagnated, it falls short of delivering its promised performance and user experience gains over its predecessor.
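As a rough sketch of the 5G-versus-LTE comparison described above, the snippet below computes per-city median uplink throughput and latency for each radio technology and the resulting 5G gain. The log file and column names are hypothetical placeholders, not the study's actual dataset.

```python
import pandas as pd

# Hypothetical measurement log: one uplink test per row, tagged with city,
# radio technology ("5G" or "LTE"), uplink throughput, and round-trip latency.
tests = pd.read_csv("uplink_tests.csv")

medians = (
    tests.groupby(["city", "radio_tech"])[["uplink_mbps", "latency_ms"]]
         .median()
)
nr5g = medians.xs("5G", level="radio_tech")
lte  = medians.xs("LTE", level="radio_tech")

comparison = pd.DataFrame({
    "uplink_gain_mbps": nr5g["uplink_mbps"] - lte["uplink_mbps"],
    "latency_gain_ms":  lte["latency_ms"] - nr5g["latency_ms"],  # positive = 5G faster
})
print(comparison.sort_values("uplink_gain_mbps", ascending=False))
```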
- Immersive applications such as Augmented Reality (AR) and Mixed Reality (MR) often need to perform multiple latency-critical tasks on every frame captured by the camera, all of which require results to be available within the current frame interval. While such tasks are increasingly supported by Deep Neural Networks (DNNs) offloaded to edge servers due to their high accuracy but heavy computation, prior work has largely focused on offloading one task at a time. Compared to offloading a single task, where more frequent offloading directly translates into higher task accuracy, offloading multiple tasks competes for shared edge server resources and hence faces the additional challenge of balancing the offloading frequencies of different tasks to maximize the overall accuracy and hence app QoE. In this paper, we formulate this accuracy-centric multitask offloading problem and present a framework that dynamically schedules the offloading of multiple DNN tasks from a mobile device to an edge server while optimizing the overall accuracy across tasks. Our design employs two novel ideas: (1) task-specific lightweight models that predict the offloading accuracy drop as a function of offloading frequency and frame content, and (2) a general two-level control feedback loop that concurrently balances offloading among tasks and adapts between offloading and using local algorithms for each task. Evaluation results show that our framework significantly improves the overall accuracy in jointly offloading two core AR tasks, depth estimation and odometry, by 7.6%–14.3% on average over the best baselines under different accuracy weight ratios.
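A toy sketch of the frequency-balancing problem described above: pick one offloading rate per task so that the total server load stays within a budget while the weighted sum of predicted accuracies is maximized. The accuracy tables, weights, and budget are made up; the paper's framework instead learns per-task accuracy-drop predictors and drives the allocation with a two-level control loop.

```python
from itertools import product

candidate_rates_hz = [1, 2, 5, 10]   # offloading frequencies to choose from (assumed)
server_budget_hz = 12                # total offloads/sec the server can absorb (assumed)

# Assumed predicted accuracy of each task at each offloading rate.
predicted_accuracy = {
    "depth":    {1: 0.70, 2: 0.78, 5: 0.86, 10: 0.90},
    "odometry": {1: 0.60, 2: 0.72, 5: 0.84, 10: 0.92},
}
weights = {"depth": 0.5, "odometry": 0.5}

best_choice, best_score = None, float("-inf")
for rates in product(candidate_rates_hz, repeat=len(predicted_accuracy)):
    assignment = dict(zip(predicted_accuracy, rates))
    if sum(assignment.values()) > server_budget_hz:
        continue                     # exceeds shared edge-server capacity
    score = sum(weights[t] * predicted_accuracy[t][r] for t, r in assignment.items())
    if score > best_score:
        best_choice, best_score = assignment, score

print(f"chosen rates: {best_choice}, weighted accuracy ~ {best_score:.3f}")
```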
- Augmented Reality (AR) devices are set apart from other mobile devices by the immersive experience they offer. While the powerful suite of sensors on modern AR devices is necessary for enabling such an immersive experience, they can create unease in bystanders (i.e., those surrounding the device during its use) due to potential bystander data leaks, which is called the bystander privacy problem. In this paper, we propose BystandAR, the first practical system that can effectively protect bystander visual (camera and depth) data in real time with only on-device processing. BystandAR builds on a key insight that the device user's eye gaze and voice are highly effective indicators for subject/bystander detection in interpersonal interaction, and leverages novel AR capabilities such as eye gaze tracking, wearer-focused microphones, and spatial awareness to achieve a usable frame rate without offloading sensitive information. Through a 16-participant user study, we show that BystandAR correctly identifies and protects 98.14% of bystanders while allowing access to 96.27% of subjects. We accomplish this with an average frame rate of 52.6 frames per second without the need to offload unprotected bystander data to another device.
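A highly simplified sketch of the gaze-based heuristic described above: detected faces near the wearer's current gaze point are treated as subjects and kept, while all other faces are flagged for redaction before the frame leaves the device. The data structures and threshold here are placeholders, not the APIs or parameters BystandAR actually uses, and the voice and spatial-awareness signals are omitted.

```python
from dataclasses import dataclass

# Simplified, device-agnostic sketch of gaze-based subject/bystander gating.
# BystandAR additionally uses voice activity and spatial awareness and runs
# entirely on-device; this sketch only captures the gaze heuristic.

@dataclass
class Face:
    x: float          # face-center coordinates in normalized image space [0, 1]
    y: float

GAZE_RADIUS = 0.15    # assumed: how close a face must be to the gaze point

def classify_faces(faces, gaze_x, gaze_y, radius=GAZE_RADIUS):
    subjects, bystanders = [], []
    for face in faces:
        dist = ((face.x - gaze_x) ** 2 + (face.y - gaze_y) ** 2) ** 0.5
        (subjects if dist <= radius else bystanders).append(face)
    return subjects, bystanders

# Example frame: three detected faces, gaze resting near the first one.
faces = [Face(0.48, 0.52), Face(0.10, 0.40), Face(0.85, 0.60)]
subjects, bystanders = classify_faces(faces, gaze_x=0.50, gaze_y=0.50)
print(f"{len(subjects)} subject(s) kept, {len(bystanders)} bystander face(s) to redact")
```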
- Augmented Reality (AR) devices are set apart from other mobile devices by the immersive experience they offer. While the powerful suite of sensors on modern AR devices is necessary for enabling such an immersive experience, they can create unease in bystanders (i.e., those surrounding the device during its use) due to potential bystander data leaks, which is called the bystander privacy problem. In this poster, we propose BystandAR, the first practical system that can effectively protect bystander visual (camera and depth) data in real time with only on-device processing. BystandAR builds on a key insight that the device user's eye gaze and voice are highly effective indicators for subject/bystander detection in interpersonal interaction, and leverages novel AR capabilities such as eye gaze tracking, wearer-focused microphones, and spatial awareness to achieve a usable frame rate without offloading sensitive information. Through a 16-participant user study, we show that BystandAR correctly identifies and protects 98.14% of bystanders while allowing access to 96.27% of subjects. We accomplish this with an average frame rate of 52.6 frames per second without the need to offload unprotected bystander data to another device.
- Edge-assisted AR supports high-quality AR on resource-constrained mobile devices by offloading high-rate camera-captured frames to powerful GPU edge servers to perform heavy vision tasks. Since the result of an offloaded frame may not come back within the same frame interval, edge-assisted AR designs resort to local tracking on the last server-returned result to generate a more accurate result for the current frame. In such an offloading-plus-local-tracking paradigm, reducing the staleness of the last server-returned result is critical to improving AR task accuracy. In this paper, we present MPCP, an online offloading scheduling framework that minimizes the staleness of server-returned results in edge-assisted AR by optimally pipelining the network transfer of frames to the edge server and the Deep Neural Network inference on the edge server. MPCP is based on model predictive control (MPC). Our evaluation results show that MPCP reduces the depth estimation error by up to 10.0% compared to several baseline schemes.
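To make the pipelining idea concrete, the toy sketch below picks when to send the next frame so that its predicted upload finishes just as the server's current inference is predicted to complete, minimizing how stale the returned result is. All durations are invented, and this single-step heuristic is only a stand-in for MPCP's model-predictive controller, which optimizes over a horizon of future frames.

```python
# Toy illustration of pipelining frame upload with server-side inference.
# Goal: start uploading the next frame so it arrives right when the GPU frees
# up, keeping both the network and the GPU busy while returning the freshest
# possible result. All numbers below are made up.

predicted_upload_ms = 18.0     # predicted network transfer time of the next frame
predicted_inference_ms = 25.0  # predicted DNN inference time on the edge server
gpu_busy_until_ms = 30.0       # when the server finishes the current frame
now_ms = 0.0

# Latest send time such that the upload completes exactly when the GPU frees up.
ideal_send_ms = max(now_ms, gpu_busy_until_ms - predicted_upload_ms)

arrival_ms = ideal_send_ms + predicted_upload_ms
start_infer_ms = max(arrival_ms, gpu_busy_until_ms)
result_ready_ms = start_infer_ms + predicted_inference_ms

# Staleness of the result when it comes back: age of the frame at that moment.
staleness_ms = result_ready_ms - ideal_send_ms
print(f"send frame at t = {ideal_send_ms:.1f} ms; "
      f"result ready at t = {result_ready_ms:.1f} ms; "
      f"staleness ~ {staleness_ms:.1f} ms")
```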