NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Fundamentals of Caching Layered Data Objects

Bari, Agrim; de_Veciana, Gustavo; Kesidis, George (July 2025, 45th IEEE International Conference on Distributed Computing Systems)

The effective management of the vast amounts of data processed or required by modern cloud and edge computing systems remains a fundamental challenge. This paper focuses on cache management for applications where data objects can be stored in layered representations. In such representations, each additional data layer enhances the “quality” of the object’s version, albeit at the cost of increased memory usage. This layered approach is advantageous in various scenarios, including the delivery of zoomable maps, video coding, future virtual reality gaming, and layered neural network models, where additional data layers improve quality/inference accuracy. In systems where users or devices request different versions of a data object, layered representations provide the flexibility needed for caching policies to achieve improved hit rates, i.e., delivering the specific representations required by users. This paper investigates the performance of the Least Recently Used (LRU) caching policy in the context of layered representation for data, referred to as Layered LRU (LLRU). To this end, we develop an asymptotically accurate analytical model for LLRU. We analyze how LLRU’s performance is influenced by factors such as the number of layers, as well as the popularity and size of an object’s layers. For example, our results demonstrate that, in the case of LLRU, adding more layers does not always enhance performance. Instead, the effectiveness of LLRU depends intricately on the popularity distribution and size characteristics of the layers.
more » « less
Free, publicly-accessible full text available July 20, 2026
Optimization of Offloading Policies for Accuracy-Delay Tradeoffs in Hierarchical Inference

Beytur, Hasan_Burhan; Aydin, Ahmet_Gunhan; de_Veciana, Gustavo; Vikalo, Haris (May 2024, IEEE INFOCOM 2024)

We consider a hierarchical inference system with multiple clients connected to a server via a shared communication resource. When necessary, clients with low-accuracy machine learning models can offload classification tasks to a server for processing on a high-accuracy model. We propose a distributed online offloading algorithm which maximizes the accuracy subject to a shared resource utilization constraint thus indirectly realizing accuracy-delay tradeoffs possible given an underlying network scheduler. The proposed algorithm, named Lyapunov-EXP4, introduces a loss structure based on Lyapunov-drift minimization techniques to the bandits with expert advice framework. We prove that the algorithm converges to a near-optimal threshold policy on the confidence of the clients’ local inference without prior knowledge of the system’s statistics and efficiently solves a constrained bandit problem with sublinear regret. We further consider settings where clients may employ multiple thresholds, allowing more aggressive optimization of overall accuracy at a possible loss in fairness. Extensive simulation results on real and synthetic data demonstrate convergence of Lyapunov-EXP4, and show the
more » « less
Full Text Available
Scheduling "Last Minute" Updates for Timely Decision-Making

https://doi.org/10.1145/3616388.3617518

Abou Rahal, Jean; de Veciana, Gustavo (October 2023, Proc. ACM MSWiM)

Full Text Available

Search for: All records