Today's clouds are inefficient: their utilization of resources like CPUs, GPUs, memory, and storage is low. This inefficiency occurs because applications consume resources at variable rates and ratios, while clouds offer resources at fixed rates and ratios. This mismatch of offering and consumption styles prevents fully realizing the utility computing vision. We advocate for fungible applications, that is, applications that can distribute, scale, and migrate their consumption of different resources independently while fitting their availability across different servers (e.g., memory at one server, CPU at another). Our goal is to make use of resources even if they are transiently available on a server for only a few milliseconds. We are developing a framework called Quicksand for building such applications and unleashing the utility computing vision. Initial results using Quicksand to implement a DNN training pipeline are promising: Quicksand saturates resources that are imbalanced across machines or rapidly shift in quantity.
more »
« less
This content will become publicly available on April 28, 2026
Quicksand: Harnessing Stranded Datacenter Resources with Granular Computing
Datacenters today waste CPU and memory, as resources demanded by applications often fail to match the resources available on machines. This leads to stranded resources because one resource that runs out prevents placing additional applications that could consume the other resources. Unusable stranded resources result in reduced utilization of servers, and wasted money and energy. Quicksand is a new framework and runtime system that unstrands resources by providing developers with familiar, high-level abstractions (e.g., data structures, batch computing). Internally Quicksand decomposes them into resource proclets, granular units that each primarily consume resources of one type. Inspired by recent granular programming models, Quicksand decouples consumption of resources as much as possible. It splits, merges, and migrates resource proclets in milliseconds, so it can use resources on any machine, even if available only briefly. Evaluation of our prototype with four applications shows that Quicksand uses stranded resources effectively; that Quicksand reacts to changing resource availability and demand within milliseconds, increasing utilization; and that porting applications to Quicksand requires moderate effort.
more »
« less
- Award ID(s):
- 2104398
- PAR ID:
- 10635642
- Publisher / Repository:
- NSDI'25 (22nd USENIX Symposium on Networked Systems Design and Implementation). USENIX Association.
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Denial of service (DoS) attacks increasingly exploit algorithmic, semantic, or implementation characteristics dormant in victim applications, often with minimal attacker resources. Practical and efficient detection of these asymmetric DoS attacks requires us to (i) catch offending requests in-flight, before they consume a critical amount of resources, (ii) remain agnostic to the application internals, such as the programming language or runtime system, and (iii) introduce low overhead in terms of both performance and programmer effort. This paper introduces FINELAME, a language-independent framework for detecting asymmetric DoS attacks. FINELAME leverages operating system visibility across the entire software stack to instrument key resource allocation and negotiation points. It leverages recent advances in the Linux extended Berkeley Packet Filter virtual machine to attach application-level interposition probes to key request processing functions, and lightweight resource monitors--user/kernel-level probes--to key resource allocation functions. The data collected is used to train a model of resource utilization that occurs throughout the lifetime of individual requests. The model parameters are then shared with the resource monitors, which use them to catch offending requests in-flight, inline with resource allocation. We demonstrate that FINELAME can be integrated with legacy applications with minimal effort, and that it is able to detect resource abuse attacks much earlier than their intended completion time while posing low performance overheads.more » « less
-
In recent years, a range of online applications have facilitated resource sharing among users, resulting in a significant increase in resource utilization. In all such applications, sharing one’s resources or skills with other agents increases social welfare. In general, each agent will look for other agents whose available resources complement hers, thereby forming natural sharing groups. In this paper, we study settings where a large population self-organizes into sharing groups. In many cases, centralized optimization approaches for creating an optimal partition of the user population are infeasible because either the central authority does not have the necessary information to compute an optimal partition, or it does not have the power to enforce a partition. Instead, the central authority puts in place an incentive structure in the form of a utility sharing method, before letting the participants form the sharing groups by themselves. We first analyze a simple equal-sharing method, which is the one most typically encountered in practice and show that it can lead to highly inefficient equilibria. We then propose a Shapley-sharing method and show that it significantly improves overall social welfare.more » « less
-
Grid Engine is a Distributed Resource Manager (DRM), that manages the resources of distributed systems (such as Grid, HPC, or Cloud systems) and executes designated jobs which have requested to occupy or consume those resources. Grid Engine applies scheduling policies to allocate resources for jobs while simultaneously attempting to maintain optimal utilization of all machines in the distributed system. However, due to the complexity of Grid Engine's job submission commands and complicated resource management policies, the number of faulty job submissions in data centers increases with the number of jobs being submitted. To combat the increase in faulty jobs, Grid Engine allows administrators to design and implement Job Submission Verifiers (JSV) to verify jobs before they enter into Grid Engine. In this paper, we will discuss a Job Submission Verifier that was designed and implemented for Univa Grid Engine, a commercial version of Grid Engine, and thoroughly evaluated at the High Performance Computing Center of Texas Tech University. Our newly developed JSV communicates with Univa Grid Engine (UGE) components to verify whether a submitted job should be accepted as is, or modified then accepted, or rejected due to improper requests for resources. It had a substantial positive impact on reducing the number of faulty jobs submitted to UGE by far. For instance, it corrected 28.6% of job submissions and rejected 0.3% of total jobs from September 2018 to February 2019, that may otherwise lead to long or infinite waiting time in the job queue.more » « less
-
Traditionally, multi-tenant cloud and edge platforms use fair-share schedulers to fairly multiplex resources across applications. These schedulers ensure applications receive processing time proportional to a configurable share of the total time. Unfortunately, enforcing time-fairness across applications often violates energy-fairness, such that some applications consume more than their fair share of energy. This occurs because applications either do not fully utilize their resources or operate at a reduced frequency/voltage during their time-slice. The problem is particularly acute for machine learning (ML) applications using GPUs, where model size largely dictates utilization and energy usage. Enforcing energy-fairness is also important since energy is a costly and limited resource. For example, in cloud platforms, energy dominates operating costs and is limited by the power delivery infrastructure, while in edge platforms, energy is often scarce and limited by energy harvesting and battery constraints. To address the problem, we define the notion of Energy-Time Fairness (ETF), which enables a configurable tradeoff between energy and time fairness, and then design a scheduler that enforces it. We show that ETF satisfies many well-accepted fairness properties. ETF and the new tradeoff it offers are important, as some applications, especially ML models, are time/latency-sensitive and others are energy-sensitive. Thus, while enforcing pure energy-fairness starves time/latency-sensitive applications (of time) and enforcing pure time-fairness starves energy-sensitive applications (of energy), ETF is able to mind the gap between the two. We implement an ETF scheduler, and show that it improves fairness by up to 2x, incentivizes energy efficiency, and exposes a configurable knob to operate between energy- and time-fairness.more » « less
An official website of the United States government
