Search for: All records

Award ID contains: 2008000

  1. Free, publicly-accessible full text available June 21, 2024
  2. The memory system is critical to architecture design and can significantly impact application performance. Concurrent Average Memory Access Time (C-AMAT) is a model for analyzing and optimizing memory system performance that defines memory access latency recursively along the memory hierarchy. The original C-AMAT model, however, lacks the granularity and flexibility needed to handle modern memory architectures with heterogeneous memory technologies and diverse system topologies. We propose to augment C-AMAT so that it accounts for the idiosyncrasies of individual cache and memory components as well as their topological arrangement in the memory architecture. Through trace-based simulation, we validate the augmented model and examine memory system performance with insight unavailable from the original C-AMAT model.
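The recursive latency definition mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the augmented model from the paper: it assumes the per-level form H/C_H + pMR × η × (latency of the next level) as C-AMAT is commonly stated in the literature, and the names `Level`, `eta`, and `c_amat` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Level:
    """One cache level in a linear hierarchy (hypothetical structure)."""
    hit_time: float         # H: hit latency in cycles
    hit_concurrency: float  # C_H: average number of overlapping hits
    pure_miss_rate: float   # pMR: fraction of accesses that are pure misses
    eta: float = 1.0        # assumed factor scaling the next level's latency


def c_amat(levels: Sequence[Level], memory_latency: float) -> float:
    """Recursively evaluate latency from the first level down to main memory."""
    if not levels:
        # Base case: below the last cache, every access pays the memory latency.
        return memory_latency
    head, rest = levels[0], levels[1:]
    hit_part = head.hit_time / head.hit_concurrency
    miss_part = head.pure_miss_rate * head.eta * c_amat(rest, memory_latency)
    return hit_part + miss_part


# Two cache levels in front of a 200-cycle memory (illustrative numbers only).
hierarchy = [Level(4, 2, 0.1), Level(12, 3, 0.5)]
latency = c_amat(hierarchy, 200.0)
```

With all concurrency values and `eta` set to 1, the recursion reduces to the classic sequential AMAT formula, which is a convenient sanity check for a sketch like this.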
  3. null (Ed.)
  4. This paper introduces Simulus, a full-fledged open-source discrete-event simulator that supports both the event-driven and the process-oriented simulation world-views. Simulus is implemented in Python and aspires to be part of the Python ecosystem for scientific computing. It provides several advanced modeling constructs to ease common simulation tasks (e.g., complex queuing models, interprocess synchronization, and message-passing communication). It also provides built-in support for running a time-synchronized group of simulators simultaneously, either sequentially or in parallel, which allows composable simulation in which individual simulators handle different aspects of a target system, and enables large-scale simulation on parallel computers. This paper describes the salient features of Simulus and examines its major design decisions.
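To make the two world-views named in this abstract concrete, here is a minimal standard-library sketch. It does not use or reproduce the Simulus API; `TinySim` and its methods are hypothetical names. Event-driven code schedules callbacks at future times, while process-oriented code is written as a generator that yields how long it wants to sleep.

```python
import heapq
import itertools


class TinySim:
    """Toy discrete-event simulator (illustrative only, not Simulus)."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = itertools.count()  # tie-breaker: FIFO order at equal times

    def sched(self, func, delay=0.0):
        """Event-driven view: run func() after `delay` time units."""
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), func))

    def process(self, gen_func):
        """Process-oriented view: gen_func(sim) yields sleep durations."""
        gen = gen_func(self)

        def step():
            try:
                delay = next(gen)  # run the process until its next sleep
            except StopIteration:
                return             # the process has finished
            self.sched(step, delay)

        self.sched(step)

    def run(self):
        """Pop events in timestamp order until none remain."""
        while self._queue:
            self.now, _, func = heapq.heappop(self._queue)
            func()


log = []


def ticker(sim):
    for _ in range(3):
        log.append(("tick", sim.now))
        yield 2.0  # sleep for two time units


sim = TinySim()
sim.sched(lambda: log.append(("event", sim.now)), delay=3.0)
sim.process(ticker)
sim.run()
```

After `run()` returns, `log` holds the ticks and the one-off event interleaved in simulated-time order, which shows how both styles share a single event queue.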
  5. Prefetching techniques have been studied for decades. However, few studies have examined how concurrent memory accesses affect prefetching effectiveness. When multiple memory requests are in flight concurrently, they can be classified into sub-classes by analyzing how they overlap. In this work, we first propose pure prefetch coverage (PPC), a novel prefetching metric that measures prefetch coverage accurately under the concurrent memory access model. We then propose APAC, an adaptive prefetch framework that uses the PPC metric to capture the dynamics of applications and adjust prefetching aggressiveness. Our experimental results show that the PPC metric correlates more strongly with IPC than the conventional prefetch coverage (PC) metric. For memory-intensive single-thread benchmarks, APAC improves performance by an average of 17.3% and 5.9% over the state-of-the-art adaptive prefetch frameworks FDP and NST, respectively. In a multi-core system, APAC outperforms FDP and NST by 8.5% and 5.0% IPC on average, respectively.
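The idea of steering prefetch aggressiveness from a coverage metric can be sketched as below. This is not APAC and it does not compute PPC, whose exact definition the abstract does not give. It uses the conventional coverage ratio (misses removed by prefetching over misses that would have occurred without it) and a hypothetical threshold policy; the function names, thresholds, and degree limits are all assumptions.

```python
def prefetch_coverage(useful_prefetches: int, remaining_misses: int) -> float:
    """Conventional coverage: share of would-be misses removed by prefetching."""
    total = useful_prefetches + remaining_misses
    return useful_prefetches / total if total else 0.0


def adjust_degree(degree: int, coverage: float,
                  low: float = 0.3, high: float = 0.7,
                  min_degree: int = 1, max_degree: int = 8) -> int:
    """Hypothetical policy: raise the prefetch degree when coverage is high,
    back off when it is low, and otherwise leave it unchanged."""
    if coverage >= high:
        return min(degree * 2, max_degree)
    if coverage <= low:
        return max(degree // 2, min_degree)
    return degree


# One adaptation step over an interval with illustrative counter values.
coverage = prefetch_coverage(useful_prefetches=80, remaining_misses=20)
new_degree = adjust_degree(4, coverage)
```

A framework built around a different metric, such as PPC, could keep the same control loop and swap only the function that produces `coverage`.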