NSF PAR Search | NSF Public Access Repository

Simple Policies for Multiresource Job Scheduling

Chen, Zhongrui; Grosof, Isaac; Berg, Benjamin (September 2024, Association for Computing Machinery)

Data center workloads are composed of multiresource jobs requiring a variety of computational resources including CPU cores, memory, disk space, and hardware accelerators. Modern servers can run multiple jobs in parallel, but a set of jobs can only run in parallel if the server has sufficient resources to satisfy the demands of each job. It is generally hard to find sets of jobs that perfectly utilize all server resources, and choosing the wrong set of jobs can lead to low resource utilization. This raises the question of how to allocate resources across a stream of arriving multiresource jobs to minimize the mean response time across jobs, that is, the mean time from when a job arrives to the system until it is complete. Current policies for scheduling multiresource jobs are complex to analyze and hard to implement. We propose a class of simple policies, called Markovian Service Rate (MSR) policies. We show that the class of MSR policies is throughput-optimal, in that if a policy exists that can stabilize the system, then an MSR policy exists that stabilizes the system. We derive bounds on the mean response time under an MSR policy, and show how our bounds can be used to choose an MSR policy that minimizes mean response time.
Full Text Available
Statistical verification of autonomous system controllers under timing uncertainties

https://doi.org/10.1007/S11241-023-09417-X

Ghosh, Bineet; Hobbs, Clara; Xu, Shengjie; Smith, Don; Anderson, James H; Thiagarajan, P S; Berg, Benjamin; Duggirala, Parasara Sridhar; Chakraborty, Samarjit (March 2024, Real-Time Systems)

Full Text Available
The case for phase-aware scheduling of parallelizable jobs

https://doi.org/10.1016/j.peva.2021.102246

Berg, Benjamin; Whitehouse, Justin; Moseley, Benjamin; Wang, Weina; Harchol-Balter, Mor (October 2021, Performance Evaluation)
Full Text Available
Optimal Scheduling of Parallel Jobs with Unknown Service Requirements

https://doi.org/10.4018/978-1-7998-7156-9.ch003

Berg, Benjamin; Harchol-Balter, Mor (January 2021, Handbook of Research on Methodologies and Applications of Supercomputing)

Full Text Available
heSRPT: Parallel scheduling to minimize mean slowdown

https://doi.org/10.1016/j.peva.2020.102147

Berg, Benjamin; Vesilo, Rein; Harchol-Balter, Mor (December 2020, Performance Evaluation)
Full Text Available
Kangaroo: Caching Billions of Tiny Objects on Flash

https://doi.org/10.1145/3477132.3483568

McAllister, Sara; Berg, Benjamin; Tutuncu-Macias, Julian; Yang, Juncheng; Gunasekar, Sathya; Lu, Jimmy; Berger, Daniel S.; Beckmann, Nathan; Ganger, Gregory R. (October 2021, Symposium on Operating Systems Principles)
Full Text Available
Optimal Resource Allocation for Elastic and Inelastic Jobs

https://doi.org/10.1145/3350755.3400265

Berg, Benjamin; Harchol-Balter, Mor; Moseley, Benjamin; Wang, Weina; Whitehouse, Justin (July 2020, Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2020))

Full Text Available
heSRPT: Optimal Scheduling of Parallel Jobs with Known Sizes

https://doi.org/10.1145/3374888.3374896

Berg, Benjamin; Vesilo, Rein; Harchol-Balter, Mor (December 2019, ACM SIGMETRICS Performance Evaluation Review)

Full Text Available
The CacheLib Caching Engine: Design and Experiences at Scale

Berg, Benjamin; Berger, Daniel; McAllister, Sara; Grosof, Isaac; Gunasekar, Sathya; Lu, Jimmy; Uhlar, Michael; Carrig, Jim; Beckmann, Nathan; Harchol-Balter, Mor; et al (January 2020, 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2020))

Full Text Available
RobinHood: Tail Latency Aware Caching - Dynamic Reallocation from Cache-Rich to Cache-Poor

Berger, Daniel; Berg, Benjamin; Zhu, Timothy; Sen, Siddhartha; Harchol-Balter, Mor (October 2018, 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018)

Full Text Available
