-
Initialization profoundly affects evolutionary algorithm (EA) efficacy by dictating search trajectories and convergence. This study introduces a hybrid initialization strategy combining the empty-space search algorithm (ESA) and opposition-based learning (OBL). OBL first generates a diverse population, which ESA then augments by identifying under-explored regions of the search space. This synergy enhances population diversity, accelerates convergence, and improves EA performance on complex, high-dimensional optimization problems. Benchmark results demonstrate the proposed method's superiority in solution quality and convergence speed compared to conventional initialization techniques.
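A minimal sketch of this kind of hybrid initialization, assuming box constraints and a minimization objective: opposition-based learning seeds the population, and a simple farthest-from-population heuristic stands in for the ESA step. The function name, the `esa_fraction` parameter, and the Monte-Carlo void heuristic are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def obl_esa_init(fitness, dim, pop_size, lb, ub, esa_fraction=0.25, rng=None):
    """Hybrid initialization sketch: opposition-based learning (OBL)
    followed by a simple empty-space augmentation step (ESA stand-in)."""
    rng = np.random.default_rng(rng)

    # --- OBL: random population plus its opposite, keep the fittest half ---
    X = rng.uniform(lb, ub, size=(pop_size, dim))
    X_opp = lb + ub - X                              # opposite points
    union = np.vstack([X, X_opp])
    scores = np.apply_along_axis(fitness, 1, union)
    pop = union[np.argsort(scores)[:pop_size]]       # minimization assumed

    # --- ESA stand-in: replace the worst individuals with points that lie
    #     far from the current population (proxy for under-explored voids) ---
    n_esa = int(esa_fraction * pop_size)
    candidates = rng.uniform(lb, ub, size=(50 * n_esa, dim))
    dists = np.linalg.norm(candidates[:, None, :] - pop[None, :, :], axis=2).min(axis=1)
    pop[-n_esa:] = candidates[np.argsort(dists)[-n_esa:]]
    return pop

# Example: initialize 20 individuals for a 5-D sphere function on [-5, 5]^5
pop = obl_esa_init(lambda x: np.sum(x**2), dim=5, pop_size=20, lb=-5.0, ub=5.0)
```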
-
Identifying knee and elbow points in performance curves is a critical task in various domains, including machine learning and system design. These points represent optimal trade-offs between cost and performance, facilitating efficient decision-making and resource allocation. However, accurately determining the knees and elbows in curves poses a significant challenge. To address this challenge, we introduce Kneeliverse, an open-source library dedicated to knee/elbow point detection. Kneeliverse incorporates a suite of well-established knee-detection algorithms, including Menger, L-method, Kneedle, and DFDT. Additionally, Kneeliverse extends these algorithms to detect multiple knees and elbows in complex curves, employing a recursive approach. Kneeliverse further includes Z-Method, a recently developed algorithm specifically designed for multi-knee detection.
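Kneeliverse's own API is not reproduced here; as an illustration of one of the algorithms it bundles, below is a compact Kneedle-style detector for a single knee on a decreasing convex curve. Function and variable names are assumptions.

```python
import numpy as np

def kneedle_knee(x, y):
    """Kneedle-style knee detection sketch for a decreasing, convex curve
    (e.g., a miss-ratio or loss curve). Returns the index of the knee."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Normalize both axes to [0, 1]
    xn = (x - x.min()) / (x.max() - x.min())
    yn = (y - y.min()) / (y.max() - y.min())
    # Difference curve: distance of the normalized curve from the diagonal.
    # Flip y so the knee of the decreasing curve becomes a maximum.
    diff = (1.0 - yn) - xn
    return int(np.argmax(diff))

# Example: a 1/x-shaped curve; the knee sits where the curve flattens out (x ~= 10)
x = np.linspace(1, 100, 200)
y = 1.0 / x
print(x[kneedle_knee(x, y)])
```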
-
We present a comprehensive pipeline, integrated with a visual analytics system called GapMiner, capable of exploring and exploiting untapped opportunities within the empty regions of high-dimensional datasets. Our approach utilizes a novel Empty-Space Search Algorithm (ESA) to identify the center points of these uncharted voids, which represent reservoirs for potentially valuable new configurations. Initially, this process is guided by user interactions through GapMiner, which visualizes Empty-Space Configurations (ESCs) within the context of the dataset and allows domain experts to explore and refine ESCs for subsequent validation in domain experiments or simulations. These activities iteratively enhance the dataset and contribute to training a connected deep neural network (DNN). As training progresses, the DNN gradually assumes the role of identifying and validating high-potential ESCs, reducing the need for direct user involvement. Once the DNN achieves sufficient accuracy, it autonomously guides the exploration of optimal configurations by predicting performance and refining configurations through a combination of gradient ascent and improved empty-space searches. Domain experts were actively involved throughout the system’s development. Our findings demonstrate that this methodology consistently generates superior novel configurations compared to conventional randomization-based approaches. We illustrate its effectiveness in multiple case studies with diverse objectives.
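The ESA itself is not detailed in this abstract; the sketch below captures only the core intuition of searching empty regions, using a Monte-Carlo stand-in that returns candidate points far from every observed sample inside the data's bounding box. Names and parameters are illustrative assumptions.

```python
import numpy as np

def empty_space_candidates(data, n_candidates=10000, n_return=5, rng=None):
    """Monte-Carlo stand-in for an empty-space search: sample random points
    inside the data's bounding box and return those farthest from any
    observed sample (rough centers of large voids)."""
    rng = np.random.default_rng(rng)
    data = np.asarray(data, dtype=float)
    lo, hi = data.min(axis=0), data.max(axis=0)
    cand = rng.uniform(lo, hi, size=(n_candidates, data.shape[1]))
    # Distance from each candidate to its nearest data point
    d = np.linalg.norm(cand[:, None, :] - data[None, :, :], axis=2).min(axis=1)
    return cand[np.argsort(d)[-n_return:]]

# Example: 2-D data clustered in two corners; the returned points fall in the gap
data = np.vstack([np.random.rand(100, 2) * 0.3,
                  np.random.rand(100, 2) * 0.3 + 0.7])
print(empty_space_candidates(data, n_return=3))
```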
-
Storage cache hierarchies include diverse topologies, assorted parameters and policies, and devices with varied performance characteristics. Simulation enables efficient exploration of their configuration space while avoiding expensive physical experiments. Miss Ratio Curves (MRCs) efficiently characterize the performance of a cache over a range of cache sizes, revealing "key points" for cache simulation, such as knees in the curve that immediately follow sharp cliffs. Unfortunately, there are no automated techniques for efficiently finding key points in MRCs, and the cross-application of existing knee-detection algorithms yields inaccurate results. We present a multi-stage framework that identifies key points in any MRC, for both stack-based (e.g., LRU) and more sophisticated eviction algorithms (e.g., ARC). Our approach quickly locates candidates using efficient hash-based sampling, curve simplification, knee detection, and novel post-processing filters. We introduce Z-Method, a new multi-knee detection algorithm that employs statistical outlier detection to choose promising points robustly and efficiently. We evaluated our framework against seven other knee-detection algorithms, identifying key points in multi-tier MRCs with both ARC and LRU policies for 106 diverse real-world workloads. Compared to naïve approaches, our framework reduced the total number of points needed to accurately identify the best two-tier cache hierarchies by an average factor of approximately 5.5x for ARC and 7.7x for LRU. We also show how our framework can be used to seed the initial population for evolutionary algorithms. We ran 32,616 experiments requiring over three million cache simulations, on 151 samples, from three datasets, using a diverse set of population initialization techniques, evolutionary algorithms, knee-detection algorithms, cache replacement algorithms, and stopping criteria. Our results showed an overall acceleration rate of 34% across all configurations.
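The Z-Method itself is not reproduced here; the sketch below only illustrates the stated idea of choosing knee candidates via statistical outlier detection, using a z-score filter over a discrete-curvature proxy. The threshold and the curvature estimate are assumptions, not the published algorithm.

```python
import numpy as np

def zscore_knee_candidates(x, y, threshold=2.0):
    """Illustrative multi-knee selector: estimate discrete curvature at each
    interior point of the curve and keep points whose curvature is a
    statistical outlier (z-score above `threshold`)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Second difference as a rough curvature proxy
    curvature = np.abs(np.diff(y, 2) / (np.diff(x)[:-1] ** 2 + 1e-12))
    z = (curvature - curvature.mean()) / (curvature.std() + 1e-12)
    # Indices are offset by 1 because the second difference drops the endpoints
    return np.where(z > threshold)[0] + 1

# Example: a miss-ratio-like curve with two sharp drops ("cliffs")
x = np.arange(50, dtype=float)
y = np.where(x < 15, 0.9, np.where(x < 35, 0.5, 0.1))
print(zscore_knee_candidates(x, y))
```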
-
After over a decade of researcher anticipation for the arrival of persistent memory (PMem), the first shipments of 3D XPoint-based Intel Optane Memory in 2019 were quickly followed by its cancellation in 2022. Was this another case of an idea quickly fading from future to past tense, relegating work in this area to the graveyard of failed technologies? The recently introduced Compute Express Link (CXL) may offer a path forward, with its persistent memory profile offering a universal PMem attachment point. Yet new technologies for memory-speed persistence seem years off, and may never become competitive with evolving DRAM and flash speeds. Without persistent memory itself, is future PMem research doomed? We offer two arguments for why reports of the death of PMem research are greatly exaggerated. First, the bulk of persistent-memory research has not in fact addressed memory persistence, but rather in-memory crash consistency, which was never an issue in prior systems where CPUs could not observe post-crash memory states. CXL memory pooling allows multiple hosts to share a single memory, all in different failure domains, raising crash-consistency issues even with volatile memory. Second, we believe CXL necessitates a "disaggregation" of PMem research. Most work to date assumed a single technology and set of features, i.e., speed, byte addressability, and CPU load/store access. With an open interface allowing new topologies and diverse PMem technologies, we argue for the need to examine these features individually and in combination. While one form of PMem may have been canceled, we argue that the research problems it raised not only remain relevant but have expanded in a CXL-based future.
-
Simulating storage cache hierarchies enables efficient exploration of their configuration space, including diverse topologies, parameters and policies, and devices with varied performance characteristics, while avoiding expensive physical experiments. Miss Ratio Curves (MRCs) efficiently characterize the performance of a cache over a range of cache sizes. These useful tools reveal "key points" for cache simulation, such as knees in the curve that immediately follow sharp cliffs. Unfortunately, there are no automated techniques for efficiently finding key points in MRCs, and the cross-application of existing knee-detection algorithms yields inaccurate results. We present a multi-stage framework that identifies key points in any MRC, for both stack-based (e.g., LRU) and more sophisticated eviction algorithms (e.g., ARC). Our approach quickly locates candidates using efficient hash-based sampling, curve simplification, knee detection, and novel post-processing filters. We introduce Z-Method, a new multi-knee detection algorithm that employs statistical outlier detection to choose promising points robustly and efficiently. We evaluate our framework against seven other knee-detection algorithms, using both ARC and LRU MRCs from 106 diverse real-world workloads, and apply it to identify key points in multi-tier MRCs. Compared to naïve approaches, our framework reduces the total number of points needed to accurately identify the best two-tier cache hierarchies by an average factor of approximately 5.5x for ARC and 7.7x for LRU.
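As background for the MRCs discussed above, here is a minimal, naive way to compute an LRU miss-ratio curve from a block trace via stack distances. The framework itself relies on hash-based sampling and further stages, so this is only a conceptual sketch with illustrative names.

```python
def lru_mrc(trace, cache_sizes):
    """Compute an LRU miss-ratio curve from a trace of block IDs.
    Naive stack-distance computation: O(trace length * unique blocks)."""
    stack = []        # most recently used block is at the end
    distances = []    # stack distance of each reference (None = cold miss)
    for block in trace:
        if block in stack:
            depth = len(stack) - stack.index(block) - 1   # blocks used more recently
            distances.append(depth)
            stack.remove(block)
        else:
            distances.append(None)
        stack.append(block)

    total = len(trace)
    mrc = {}
    for size in cache_sizes:
        # A reference hits in an LRU cache of `size` blocks iff its stack distance < size
        misses = sum(1 for d in distances if d is None or d >= size)
        mrc[size] = misses / total
    return mrc

# Example: a small trace with heavy reuse of blocks 1-3
trace = [1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 1]
print(lru_mrc(trace, cache_sizes=[1, 2, 3, 4, 5]))
```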
-
Parallel coordinate plots (PCPs) have been widely used for high-dimensional (HD) data storytelling because they allow for presenting a large number of dimensions without distortions. The axes ordering in a PCP presents a particular story from the data based on the user's perception of PCP polylines. Existing works focus on directly optimizing PCP axes ordering based on some common analysis tasks like clustering, neighborhood, and correlation. However, direct optimization of PCP axes based on these common properties is restrictive because it does not account for multiple properties occurring between the axes, nor for local properties that occur in small regions of the data. Also, many of these techniques do not support the human-in-the-loop (HIL) paradigm, which is crucial (i) for explainability and (ii) in cases where no single reordering scheme fits the users' goals. To alleviate these problems, we present PC-Expo, a real-time visual analytics framework for all-in-one PCP line pattern detection and axes reordering. We studied the connection of line patterns in PCPs with different data analysis tasks and datasets. PC-Expo expands prior work on PCP axes reordering by developing real-time, local detection schemes for the 12 most common analysis tasks (properties). Users can choose the story they want to present with PCPs by optimizing directly over their choice of properties. These properties can be ranked, or combined using individual weights, creating a custom optimization scheme for axes reordering. Users can control the granularity at which they want to work with their detection scheme in the data, allowing exploration of local regions. PC-Expo also supports HIL axes reordering via local-property visualization, which shows the regions of granular activity for every axis pair. Local-property visualization is helpful for PCP axes reordering based on multiple properties, when no single reordering scheme fits the user goals. A comprehensive evaluation with real users and diverse datasets confirms the efficacy of PC-Expo in data storytelling with PCPs.
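PC-Expo optimizes axis orderings over twelve properties with user-supplied weights; the sketch below shows just one simple stand-in objective, a greedy ordering that places strongly correlated dimensions on adjacent axes, to make the idea of property-driven reordering concrete. It is not PC-Expo's algorithm, and the names are assumptions.

```python
import numpy as np

def greedy_correlation_order(data, columns):
    """Greedy PCP axis ordering sketch: start from the most correlated pair
    and repeatedly append the unused dimension most correlated with the
    current last axis. Illustrates one of many possible ordering objectives."""
    corr = np.abs(np.corrcoef(data, rowvar=False))
    np.fill_diagonal(corr, 0.0)
    # Seed with the strongest pair
    i, j = np.unravel_index(np.argmax(corr), corr.shape)
    order, remaining = [i, j], set(range(len(columns))) - {i, j}
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda k: corr[last, k])
        order.append(nxt)
        remaining.remove(nxt)
    return [columns[k] for k in order]

# Example: 4-D data where dimensions "a" and "c" are nearly identical
rng = np.random.default_rng(0)
d = rng.normal(size=(200, 4))
d[:, 2] = d[:, 0] + 0.05 * rng.normal(size=200)
print(greedy_correlation_order(d, ["a", "b", "c", "d"]))
```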
-
Modern cache hierarchies are tangled webs of complexity. Multiple tiers of heterogeneous physical and virtual devices, with many configurable parameters, all contend to optimally serve swarms of requests between local and remote applications. The challenge of effectively designing these systems is exacerbated by continuous advances in hardware, firmware, innovation in cache eviction algorithms, and evolving workloads and access patterns. This rapidly expanding configuration space has made it costly and time-consuming to physically experiment with numerous cache configurations for even a single stable workload. Current cache evaluation techniques (e.g., Miss Ratio Curves) are short-sighted: they analyze only a single tier of cache, focus primarily on performance, and fail to examine the critical relationships between metrics like throughput and monetary cost. Publicly available I/O cache simulators are also lacking: they can only simulate a fixed or limited number of cache tiers, are missing key features, or offer limited analyses. It is our position that best practices in cache analysis should include the evaluation of multi-tier configurations, coupled with more comprehensive metrics that reveal critical design trade-offs, especially monetary costs. We are developing an n-level I/O cache simulator that is general enough to model any cache hierarchy, captures many metrics, provides a robust set of analysis features, and is easily extendable to facilitate experimental research or production-level provisioning. To demonstrate the value of our proposed metrics and simulator, we extended an existing cache simulator (PyMimircache). We present several interesting and counter-intuitive results in this paper.
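This is not the proposed simulator; the toy two-tier LRU simulation below merely reports per-tier miss ratios next to a simple capacity-based cost figure, to make the argued-for multi-tier, multi-metric analysis concrete. The per-block prices and all names are made-up assumptions.

```python
from collections import OrderedDict

class LRUTier:
    """Minimal LRU cache tier; access() returns True on hit, False on miss."""
    def __init__(self, capacity):
        self.capacity, self.store = capacity, OrderedDict()

    def access(self, block):
        if block in self.store:
            self.store.move_to_end(block)
            return True
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)   # evict the least recently used block
        self.store[block] = True
        return False

def simulate_two_tier(trace, l1_size, l2_size, price_per_block=(1.00, 0.10)):
    """Toy two-tier simulation: misses in tier 1 fall through to tier 2.
    Reports per-tier miss ratios and a simple capacity-based cost metric."""
    l1, l2 = LRUTier(l1_size), LRUTier(l2_size)
    l1_misses = l2_misses = 0
    for block in trace:
        if not l1.access(block):
            l1_misses += 1
            if not l2.access(block):
                l2_misses += 1
    n = len(trace)
    cost = l1_size * price_per_block[0] + l2_size * price_per_block[1]
    return {"l1_miss_ratio": l1_misses / n,
            "l2_miss_ratio": l2_misses / n,   # misses that reached backing storage
            "capacity_cost": cost}

# Example: sweep a few size splits of the same rough budget for one workload
trace = [i % 8 for i in range(1000)] + [i % 64 for i in range(1000)]
for l1_size, l2_size in [(4, 64), (8, 32), (16, 16)]:
    print((l1_size, l2_size), simulate_two_tier(trace, l1_size, l2_size))
```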