Title: Online model swapping for architectural simulation
As systems and applications grow more complex, detailed computer architecture simulation takes an ever-increasing amount of time. Longer simulation times result in slower design iterations, which in turn force architects to use simpler models, such as spreadsheets, when they want to iterate quickly on a design. Simple models are not easy to work with, though: architects must rely on intuition to choose representative models, and the path from the simple models to a detailed hardware simulation is not always clear. In this work, we present a method of bridging the gap between simple and detailed simulation by monitoring simulation behavior online and automatically swapping out detailed models with simpler statistical approximations. We demonstrate the potential of our methodology by implementing it in the open-source simulator SVE-Cachesim to swap out the level one data cache (L1D) within a memory hierarchy. This proof of concept demonstrates that our technique can train simple models to match real program behavior in the L1D and can swap them in without destructive side effects on the performance of downstream models. Our models introduce only 8% error in the overall cycle count, while being used for over 90% of the simulation and requiring two to eight times less computation per cache access than the detailed models.
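The swapping idea in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from SVE-Cachesim or the paper: the class names (`DetailedL1D`, `StatisticalL1D`, `SwappingL1D`), the Bernoulli hit/miss approximation, and the swap criterion (hit rate stable across two consecutive windows within a tolerance) are all hypothetical stand-ins for whatever models and triggers the authors actually use.

```python
import random
from collections import OrderedDict


class DetailedL1D:
    """Set-associative LRU cache: the 'detailed' model in this sketch."""

    def __init__(self, num_sets=64, ways=4, line_bytes=64):
        self.num_sets, self.ways, self.line_bytes = num_sets, ways, line_bytes
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, addr):
        line = addr // self.line_bytes
        lru = self.sets[line % self.num_sets]
        if line in lru:
            lru.move_to_end(line)      # mark as most recently used
            return True
        if len(lru) >= self.ways:
            lru.popitem(last=False)    # evict the least recently used line
        lru[line] = None
        return False


class StatisticalL1D:
    """Hypothetical cheap replacement: each access hits with a fixed probability."""

    def __init__(self, hit_rate, seed=0):
        self.hit_rate = hit_rate
        self.rng = random.Random(seed)

    def access(self, addr):
        return self.rng.random() < self.hit_rate


class SwappingL1D:
    """Monitors the detailed model online and swaps in the statistical one
    once the windowed hit rate stops changing (assumed swap criterion)."""

    def __init__(self, detailed, window=1000, tolerance=0.02):
        self.model = detailed
        self.window, self.tolerance = window, tolerance
        self.hits = self.count = 0
        self.prev_rate = None
        self.swapped = False

    def access(self, addr):
        hit = self.model.access(addr)
        if not self.swapped:
            self._observe(hit)
        return hit

    def _observe(self, hit):
        self.hits += hit
        self.count += 1
        if self.count < self.window:
            return
        rate = self.hits / self.count
        stable = (self.prev_rate is not None
                  and abs(rate - self.prev_rate) <= self.tolerance)
        if stable:
            self.model = StatisticalL1D(rate)  # "train" = reuse observed rate
            self.swapped = True
        self.prev_rate = rate
        self.hits = self.count = 0


if __name__ == "__main__":
    cache = SwappingL1D(DetailedL1D())
    # Synthetic trace: loop over an 8 KiB working set, one access per line.
    hits = sum(cache.access((i * 64) % 8192) for i in range(5000))
    print("swapped:", cache.swapped, "hits:", hits)
```

Downstream models (for example, an L2 cache) would only see the hit/miss stream, which is why a wrapper with the same `access` interface is used here. A real implementation would likely also need to regenerate plausible miss addresses for the next level and to swap back when program behavior changes; this sketch does neither.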
Award ID(s):
1710371
PAR ID:
10294633
Journal Name:
The 18th International Conference on Computing Frontiers
Page Range / eLocation ID:
102 to 112
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Traditional caching models emphasize hit rate as the principal measure of performance for cache replacement algorithms. However, hit rate alone can be misleading in the presence of a phenomenon known as a delayed hit. Delayed hits occur in high-throughput systems when multiple requests for an object accumulate before the object can be fetched from the backing store. Prior work by Atre et al. has explored the impact of delayed hits in simple caching scenarios, namely single-tier caches with uniform object sizes. In this work, we extend that investigation to multi-level caches, such as those that might be found in a modern CDN. Furthermore, we extend MAD, the delayed-hits-aware policy proposed by Atre et al., so that it can be deployed in a multi-tier caching system. We evaluate the performance of MAD using a multi-tier cache simulator and an empirical cache configuration based on modern CDNs. Our initial results lead us to believe that delayed hits can still be a prominent factor in the performance of multi-level caches, although their effect may be reduced in comparison to simpler cache configurations.
  2. In this work, we set out to find the answers to the following questions: (1) Where are the bottlenecks in a state-of-the-art architectural simulator? (2) How much faster can architectural simulations run by tuning system configurations? (3) What are the opportunities in accelerating software simulation using hardware accelerators? We choose gem5 as the representative architectural simulator, run several simulations with various configurations, perform a detailed architectural analysis of the gem5 source code on different server platforms, tune both system and architectural settings for running simulations, and discuss the future opportunities in accelerating gem5 as an important application. Our detailed profiling of gem5 reveals that its performance is extremely sensitive to the size of the L1 cache. Our experimental results show that a RISC-V core with 32KB data and instruction caches improves gem5’s simulation speed by 31%-61% compared with a baseline core with 8KB L1 caches. Our paper is the first step toward building specialized hardware and software environments for accelerating software-based simulators.
  3. In architectural design, architects explore a vast number of design options to maximize various performance criteria while adhering to specific constraints. To assist architects in such a complex endeavour, we propose IDOME, an interactive system for computer-aided design optimization. Our approach balances automation and control by efficiently exploring, analyzing, and filtering space layouts to better inform architects' decision-making. At each design iteration, IDOME provides a set of alternative building layouts which satisfy user-defined constraints and optimality criteria concerning a user-defined space parametrization. When the user selects a design generated by IDOME, the system performs a similar optimization process with the same (or different) parameters and objectives. A user may iterate this exploration process as many times as needed. In this work, we focus on optimizing built environments using architectural metrics by improving the degree of visibility, accessibility, and information gaining for navigating a proposed space. This approach, however, can be extended to support other kinds of analysis as well. We demonstrate the capabilities of IDOME through a series of examples, performance analysis, user studies, and a usability test. The results indicate that IDOME successfully optimizes the proposed designs concerning the chosen metrics and offers a satisfactory experience for users with minimal training.
  4. Nonprofit organizations (NPOs) lack resources, hindering the quality and quantity of service they can deliver. Meanwhile, NPOs at times have underutilized or even spare resources due to the inability to scale expertise in staffing and tangible resources to meet temporally shifting service demands. These observations motivate us to propose a novel resource sharing system, SWAP, which, to the best of our knowledge, is the first resource sharing system that facilitates resource exchanges where NPOs can obtain resources by offering their own. SWAP consists of four elements: a collaborative auction-based sharing process, complete with an offering mechanism, a bidding mechanism, and the virtual currency, SWAPcredit, to facilitate liquidity in exchange; a central technology that represents the award determination problem with a multilateral exchange optimization model, generating resource exchange outcomes; an online platform, the SWAP Hub, where NPOs can offer and bid on available resources, and receive exchange results; and human-centric co-design, shaping the understanding and design decisions of a research collective that includes the authors and NPO professionals. We conduct a series of experiments using both empirical and simulated data to illustrate the benefits and potential of SWAP. Our results demonstrate that SWAP can address temporal resource needs in practice; show that optimal exchange outcomes can be generated even for large-scale SWAP markets; and provide strong evidence in support of guidance to inform the progression for future versions of SWAP. The SWAP system is presently implemented in Howard County, MD, USA, with ongoing enhancements and potential for future expansion.
  5. As Machine Learning (ML) applications become pervasive and computer architects further integrate hardware support, the need to rapidly explore trade-offs between algorithms and hardware becomes pressing. While prior work on hardware accelerators has led to tremendous performance and energy improvements, it can be difficult to generalize these approaches without resorting to special-purpose tools or even languages. Through object-oriented design principles, we describe a general and reusable approach for generating parameterized neural network hardware. Specifically, we describe our experiences with high-level hardware design objects for building neural network hardware based on the open-source Python HDL, PyRTL. By thinking at a higher level of abstraction than simple “hardware modules,” we open the door to a process by which hardware can be developed with software engineering principles. This creates new opportunities for a tight feedback loop between machine learning algorithm innovation and hardware design reality. Future work on hardware development for ML applications can benefit from our analysis of the costs and benefits of abstraction.