Search for: All records

Award ID contains: 1838022

« Prev Next »

Total Resources

7

Resource Type
Conference Paper

3

Conference Proceeding

0

Dataset

0

Journal Article

4

Workshop Report

0

Availability
Full Text / Resource Available

6

Citation Only

1

Save Results
Excel (limit 2000)
CSV (limit 5000)
XML (limit 5000)

Have feedback or suggestions for a way to improve these results?
!

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Epidemic Spread Modeling for COVID-19 Using Cross-Fertilization of Mobility Data

https://doi.org/10.1109/TBDATA.2023.3248650

Schmedding, Anna ; Pinciroli, Riccardo ; Yang, Lishan ; Smirni, Evgenia ( October 2023 , IEEE Transactions on Big Data)

Free, publicly-accessible full text available October 1, 2024
Lifespan and Failures of SSDs and HDDs: Similarities, Differences, and Prediction Models

https://doi.org/10.1109/TDSC.2021.3131571

Pinciroli, Riccardo ; Yang, Lishan ; Alter, Jacob ; Smirni, Evgenia ( January 2023 , IEEE Transactions on Dependable and Secure Computing)

Data center downtime typically centers around IT equipment failure. Storage devices are the most frequently failing components in data centers. We present a comparative study of hard disk drives (HDDs) and solid state drives (SSDs) that constitute the typical storage in data centers. Using six-year field data of 100,000 HDDs of different models from the same manufacturer from the Backblaze dataset and six-year field data of 30,000 SSDs of three models from a Google data center, we characterize the workload conditions that lead to failures. We illustrate that their root failure causes differ from common expectations and that they remain difficult to discern. For the case of HDDs we observe that young and old drives do not present many differences in their failures. Instead, failures may be distinguished by discriminating drives based on the time spent for head positioning. For SSDs, we observe high levels of infant mortality and characterize the differences between infant and non-infant failures. We develop several machine learning failure prediction models that are shown to be surprisingly accurate, achieving high recall and low false positive rates. These models are used beyond simple prediction as they aid us to untangle the complex interaction of workload characteristics that lead to failures and identify failure root causes from monitored symptoms.
more » « less
Full Text Available
GeoSpread: an Epidemic Spread Modeling Tool for COVID-19 Using Mobility Data

https://doi.org/10.1145/3524458.3547257

Schmedding, Anna ; Yang, Lishan ; Pinciroli, Riccardo ; Smirni, Evgenia ( September 2022 , GoodIT 2022: {ACM} International Conference on Information Technology for Social Good, Limassol, Cyprus, September 7 - 9, 2022)
Mourlas, cotas ; Pacheco, Diego ; Pandi, Catia (Ed.)
We present an individual-centric agent-based model and a flexible tool, GeoSpread, for studying and predicting the spread of viruses and diseases in urban settings. Using COVID-19 data collected by the Korean Center for Disease Control & Prevention (KCDC), we analyze patient and route data of infected people from January 20, 2020, to May 31, 2020, and discover how infection clusters develop as a function of time. This analysis offers a statistical characterization of population mobility and is used to parameterize GeoSpread to capture the spread of the disease. We validate simulation predictions from GeoSpread with ground truth and we evaluate different what-if counter-measure scenarios to illustrate the usefulness and flexibility of the tool for epidemic modeling.
more » « less
Full Text Available
Optimizing inference serving on serverless platforms

https://doi.org/10.14778/3547305.3547313

Ali, Ahsan ; Pinciroli, Riccardo ; Yan, Feng ; Smirni, Evgenia ( June 2022 , Proceedings of the VLDB Endowment)

Serverless computing is gaining popularity for machine learning (ML) serving workload due to its autonomous resource scaling, easy to use and pay-per-use cost model. Existing serverless platforms work well for image-based ML inference, where requests are homogeneous in service demands. That said, recent advances in natural language processing could not fully benefit from existing serverless platforms as their requests are intrinsically heterogeneous. Batching requests for processing can significantly increase ML serving efficiency while reducing monetary cost, thanks to the pay-per-use pricing model adopted by serverless platforms. Yet, batching heterogeneous ML requests leads to additional computation overhead as small requests need to be "padded" to the same size as large requests within the same batch. Reaching effective batching decisions (i.e., which requests should be batched together and why) is non-trivial: the padding overhead coupled with the serverless auto-scaling forms a complex optimization problem. To address this, we develop Multi-Buffer Serving (MBS), a framework that optimizes the batching of heterogeneous ML inference serving requests to minimize their monetary cost while meeting their service level objectives (SLOs). The core of MBS is a performance and cost estimator driven by analytical models supercharged by a Bayesian optimizer. MBS is prototyped and evaluated on AWS using bursty workloads. Experimental results show that MBS preserves SLOs while outperforming the state-of-the-art by up to 8 x in terms of cost savings while minimizing the padding overhead by up to 37 x with 3 x less number of serverless function invocations.
more » « less
Full Text Available
CEDULE+: Resource Management for Burstable Cloud Instances Using Predictive Analytics

https://doi.org/10.1109/TNSM.2020.3039942

Pinciroli, Riccardo ; Ali, Ahsan ; Yan, Feng ; Smirni, Evgenia ( November 2020 , IEEE Transactions on Network and Service Management)
null (Ed.)
Nearly all principal cloud providers now provide burstable instances in their offerings. The main attraction of this type of instance is that it can boost its performance for a limited time to cope with workload variations. Although burstable instances are widely adopted, it is not clear how to efficiently manage them to avoid waste of resources. In this paper, we use predictive data analytics to optimize the management of burstable instances. We design CEDULE+, a data-driven framework that enables efficient resource management for burstable cloud instances by analyzing the system workload and latency data. CEDULE+ selects the most profitable instance type to process incoming requests and controls CPU, I/O, and network usage to minimize the resource waste without violating Service Level Objectives (SLOs). CEDULE+ uses lightweight profiling and quantile regression to build a data-driven prediction model that estimates system performance for all combinations of instance type, resource type, and system workload. CEDULE+ is evaluated on Amazon EC2, and its efficiency and high accuracy are assessed through real-case scenarios. CEDULE+ predicts application latency with errors less than 10%, extends the maximum performance period of a burstable instance up to 2.4 times, and decreases deployment costs by more than 50%.
more » « less
Full Text Available
BATCH: Machine Learning Inference Serving on Serverless Platforms with Adaptive Batching

https://doi.org/10.1109/SC41405.2020.00073

A. Ali, R. Pinciroli ( November 2020 , 2020 SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Atlanta, GA, US, 2020 pp. 972-986. doi: 10.1109/SC41405.2020.00073)

Serverless computing is a new pay-per-use cloud service paradigm that automates resource scaling for stateless functions and can potentially facilitate bursty machine learning serving. Batching is critical for latency performance and cost-effectiveness of machine learning inference, but unfortunately it is not supported by existing serverless platforms due to their stateless design. Our experiments show that without batching, machine learning serving cannot reap the benefits of serverless computing. In this paper, we present BATCH, a framework for supporting efficient machine learning serving on serverless platforms. BATCH uses an optimizer to provide inference tail latency guarantees and cost optimization and to enable adaptive batching support. We prototype BATCH atop of AWS Lambda and popular machine learning inference systems. The evaluation verifies the accuracy of the analytic optimizer and demonstrates performance and cost advantages over the state-of-the-art method MArk and the state-of-the-practice tool SageMaker.
more » « less
Full Text Available
Mining Multivariate Discrete Event Sequences for Knowledge Discovery and Anomaly Detection

https://doi.org/10.1109/DSN48063.2020.00067

Nie, Bin ; Xu, Jianwu ; Alter, Jacob ; Chen, Haifeng ; Smirni, Evgenia ( June 2020 , Proceedings of the 50th IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2020)
null (Ed.)
Modern physical systems deploy large numbers of sensors to record at different time-stamps the status of different systems components via measurements such as temperature, pressure, speed, but also the component's categorical state. Depending on the measurement values, there are two kinds of sequences: continuous and discrete. For continuous sequences, there is a host of state-of-the-art algorithms for anomaly detection based on time-series analysis, but there is a lack of effective methodologies that are tailored specifically to discrete event sequences. This paper proposes an analytics framework for discrete event sequences for knowledge discovery and anomaly detection. During the training phase, the framework extracts pairwise relationships among discrete event sequences using a neural machine translation model by viewing each discrete event sequence as a "natural language". The relationship between sequences is quantified by how well one discrete event sequence is "translated" into another sequence. These pairwise relationships among sequences are aggregated into a multivariate relationship graph that clusters the structural knowledge of the underlying system and essentially discovers the hidden relationships among discrete sequences. This graph quantifies system behavior during normal operation. During testing, if one or more pairwise relationships are violated, an anomaly is detected. The proposed framework is evaluated on two real-world datasets: a proprietary dataset collected from a physical plant where it is shown to be effective in extracting sensor pairwise relationships for knowledge discovery and anomaly detection, and a public hard disk drive dataset where its ability to effectively predict upcoming disk failures is illustrated.
more » « less
Full Text Available