NSF PAR Search | NSF Public Access Repository

Power-of-2-Arms for Adversarial Bandit Learning With Switching Costs

https://doi.org/10.1109/TON.2024.3522073

Shi, Ming; Lin, Xiaojun; Jiao, Lei (June 2025, IEEE Transactions on Networking)

Free, publicly-accessible full text available June 1, 2026
ARISE: High-Capacity AR Offloading Inference Serving via Proactive Scheduling

https://doi.org/10.1145/3643832.3661894

Kong, Z Jonny; Xu, Qiang; Hu, Y Charlie (June 2024, ACM)

With faster wireless networks and server GPUs, offloading high-accuracy but compute-intensive AR tasks implemented in Deep Neural Networks (DNNs) to edge servers offers a promising way to support high-QoE Augmented/Mixed Reality (AR/MR) applications. A cost-effective way for AR app vendors to deploy such edge-assisted AR apps to support a large user base is to use commercial Machine-Learning-as-a-Service (MLaaS) deployed at the edge cloud. To maximize cost-effectiveness, such an MLaaS provider faces a key design challenge, i.e., how to maximize the number of clients concurrently served by each GPU server in its cluster while meeting per-client AR task accuracy SLAs. The above AR offloading inference serving problem differs from generic inference serving or video analytics serving in one fundamental way: due to the use of local tracking, which reuses the last server-returned inference result to derive results for the current frame, the offloading frequency and end-to-end latency of each AR client directly affect its AR task accuracy (for all the frames). In this paper, we present ARISE, a framework that optimizes the edge server capacity in serving edge-assisted AR clients. Our design exploits the intricate interplay between per-client offloading schedule and batched inference on the server by proactively coordinating offloading request streams from different AR clients. Our evaluation using a large set of emulated AR clients and a 10-phone testbed shows that ARISE supports 1.7x to 6.9x more clients compared to various baselines while keeping the per-client accuracy within the client-specified accuracy SLAs.
Full Text Available
An Easier-to-Verify Sufficient Condition for Whittle Indexability and Application to AoI Minimization

Zhou, Sixiang; Lin, Xiaojun (May 2024, IEEE)

We study a scheduling problem for a base-station transmitting status information to multiple user-equipments (UE) with the goal of minimizing the total expected Age-of-Information (AoI). Such a problem can be formulated as a Restless Multi-Armed Bandit (RMAB) problem and solved asymptotically optimally by a low-complexity Whittle index policy, if each UE’s sub-problem is Whittle indexable. However, proving Whittle indexability can be highly non-trivial, especially when the value function cannot be derived in closed-form. In particular, this is the case for the AoI minimization problem with stochastic arrivals and unreliable channels, whose Whittle indexability remains an open problem. To overcome this difficulty, we develop a sufficient condition for Whittle indexability based on the notion of active time (AT). Even though the AT condition shares considerable similarity to the Partial Conservation Law (PCL) condition, it is much easier to understand and verify. We then apply our AT condition to the stochastic-arrival unreliable-channel AoI minimization problem and, for the first time in the literature, prove its Whittle indexability. Our proof uses a novel coupling approach to verify the AT condition, which may also be of independent interest to other large-scale RMAB problems.
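For readers unfamiliar with AoI, the per-UE state evolution under stochastic arrivals and an unreliable channel can be sketched as below. This is only an illustration under assumed modeling choices (slotted time, a base-station buffer that keeps only the freshest packet, a delivered packet aging by one slot on receipt); the function name `aoi_step` and its parameters are hypothetical and not taken from the paper, and the sketch does not implement the Whittle index policy or the AT condition.

```python
from typing import Optional, Tuple


def aoi_step(ue_age: int, buffered_age: Optional[int], arrival: bool,
             scheduled: bool, success: bool) -> Tuple[int, Optional[int]]:
    """Advance one time slot for a single UE under an assumed model.

    ue_age: age of the freshest update the UE currently holds.
    buffered_age: age of the packet waiting at the base station (None if empty).
    arrival: whether a fresh packet arrives at the base station this slot.
    scheduled: whether the base station transmits to this UE this slot.
    success: whether the (unreliable) channel delivers the transmission.
    """
    # Assumption: a new arrival replaces any older buffered packet.
    if arrival:
        buffered_age = 0

    delivered = scheduled and success and buffered_age is not None
    if delivered:
        # The UE now holds the delivered packet, one slot older on receipt.
        ue_age = buffered_age + 1
        buffered_age = None
    else:
        # Nothing delivered: both the UE's copy and any buffered packet age.
        ue_age += 1
        if buffered_age is not None:
            buffered_age += 1
    return ue_age, buffered_age
```

In this sketch, scheduling a UE whose buffer is empty, or whose transmission fails, has no effect on its age, which is one reason the scheduling decision is coupled to both the arrival process and the channel reliability.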
Full Text Available
A Case for Task Sampling based Learning for Cluster Job Scheduling

https://doi.org/10.1109/TCC.2022.3222649

Jajoo, Akshay; Hu, Y. Charlie; Lin, Xiaojun; Deng, Nan (November 2022, IEEE Transactions on Cloud Computing)

The ability to accurately estimate job runtime properties allows a scheduler to effectively schedule jobs. State-of-the-art online cluster job schedulers use history-based learning, which uses past job execution information to estimate the runtime properties of newly arrived jobs. However, with fast-paced development in cluster technology (in both hardware and software) and changing user inputs, job runtime properties can change over time, which leads to inaccurate predictions. In this paper, we explore the potential and limitations of real-time learning of job runtime properties, by proactively sampling and scheduling a small fraction of the tasks of each job. Such a task-sampling-based approach exploits the similarity among runtime properties of the tasks of the same job and is inherently immune to changing job behavior. Our analytical and experimental analysis of 3 production traces with different skew and job distribution shows that learning in space can be substantially more accurate. Our simulation and testbed evaluation on Azure of the two learning approaches anchored in a generic job scheduler using 3 production cluster job traces shows that despite its online overhead, learning in space reduces the average Job Completion Time (JCT) by 1.28×, 1.56×, and 1.32× compared to the prior-art history-based predictor. We further analyze the experimental results to give intuitive explanations for why learning in space outperforms learning in time in these experiments. Finally, we show how sampling-based learning can be extended to schedule DAG jobs and achieve similar speedups over the prior-art history-based predictor.
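The contrast between the two estimators can be illustrated with a toy sketch. It assumes that "learning in space" means estimating a job's runtime from a small sample of its own tasks and "learning in time" means estimating it from past jobs. The function names, the 5% sampling fraction, and the synthetic runtimes are all hypothetical and are not the paper's scheduler, traces, or results.

```python
import random
from statistics import mean


def estimate_by_sampling(task_runtimes, fraction, rng):
    """Estimate a job's mean task runtime from a small random sample of its own tasks."""
    k = max(1, int(len(task_runtimes) * fraction))
    return mean(rng.sample(task_runtimes, k))


def estimate_by_history(past_job_means):
    """Estimate a new job's mean task runtime from previously completed jobs."""
    return mean(past_job_means)


# Synthetic, made-up data: past jobs averaged about 10 s per task,
# but the new job's behavior has shifted to about 30 s per task.
rng = random.Random(42)
past_job_means = [10.0, 11.0, 9.5, 10.5]
new_job_tasks = [rng.gauss(30.0, 2.0) for _ in range(200)]

true_mean = mean(new_job_tasks)
sampling_error = abs(estimate_by_sampling(new_job_tasks, 0.05, rng) - true_mean)
history_error = abs(estimate_by_history(past_job_means) - true_mean)
```

When job behavior shifts as in this made-up scenario, the history-based estimate stays anchored to the old runtimes, while the sampled estimate tracks the new job at the cost of running the sampled tasks first.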
Full Text Available
Power-of-2-arms for bandit learning with switching costs

https://doi.org/10.1145/3492866.3549720

Shi, Ming; Lin, Xiaojun; Jiao, Lei (October 2022, MobiHoc '22: Proceedings of the Twenty-Third International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing)

Full Text Available
A Case for Task Sampling based Learning for Cluster Job Scheduling

Jajoo, Akshay; Hu, Y. Charlie; Lin, Xiaojun; Deng, Nan (April 2022, Proceedings of 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI))

The ability to accurately estimate job runtime properties allows a scheduler to effectively schedule jobs. State-of-the-art online cluster job schedulers use history-based learning, which uses past job execution information to estimate the runtime properties of newly arrived jobs. However, with fast-paced development in cluster technology (in both hardware and software) and changing user inputs, job runtime properties can change over time, which leads to inaccurate predictions. In this paper, we explore the potential and limitations of real-time learning of job runtime properties, by proactively sampling and scheduling a small fraction of the tasks of each job. Such a task-sampling-based approach exploits the similarity among runtime properties of the tasks of the same job and is inherently immune to changing job behavior. Our analytical and experimental analysis of 3 production traces with different skew and job distribution shows that learning in space can be substantially more accurate. Our simulation and testbed evaluation on Azure of the two learning approaches anchored in a generic job scheduler using 3 production cluster job traces shows that despite its online overhead, learning in space reduces the average Job Completion Time (JCT) by 1.28x, 1.56x, and 1.32x compared to the prior-art history-based predictor. Finally, we show how sampling-based learning can be extended to schedule DAG jobs and achieve similar speedups over the prior-art history-based predictor.
Full Text Available