Energy efficiency has emerged as a key concern for modern processor design, especially when it comes to embedded and mobile devices. It is vital to accurately quantify the power consumption of different micro-architectural components in a CPU. Traditional RTL or gate-level power estimation is too slow for early design-space exploration studies. By contrast, existing architecture-level power models suffer from large inaccuracies. Recently, advanced machine learning techniques have been proposed for accurate power modeling. However, existing approaches still require slow RTL simulations, have large training overheads or have only been demonstrated for fixed-function accelerators and simple in-order cores with predictable behavior. In this work, we present a novel machine learning-based approach for microarchitecture-level power modeling of complex CPUs. Our approach requires only high-level activity traces obtained from microarchitecture simulations. We extract representative features and develop low-complexity learning formulations for different types of CPU-internal structures. Cycle-accurate models at the sub-component level are trained from a small number of gate-level simulations and hierarchically composed to build power models for complete CPUs. We apply our approach to both in-order and out-of-order RISC-V cores. Cross-validation results show that our models predict cycle-by-cycle power consumption to within 3% of a gate-level power estimation on average. In addition, our power model for the Berkeley Out-of-Order (BOOM) core trained on micro-benchmarks can predict the cycle-by-cycle power of real-world applications with less than 3.6% mean absolute error.
more »
« less
Synthesizing Formal Models of Hardware from RTL for Efficient Verification of Memory Model Implementations
Modern hardware complexity makes it challenging to determine if a given microarchitecture adheres to a particular memory consistency model (MCM). This observation inspired the Check tools, which formally check that a specific microarchitecture correctly implements an MCM with respect to a suite of litmus test pro-grams. Unfortunately, despite their effectiveness and efficiency, theCheck tools must be supplied a microarchitecture in the guise of a manually constructed axiomatic specification, called a 𝜇spec model. To facilitate MCM verification—and enable the Check tools to consume processor RTL directly—we introduce a methodology and associated tool, rtl2𝜇spec, for automatically synthesizing 𝜇spec models from processor designs written in Verilog or SystemVerilog, with the help of modest user-provided design metadata. As a case study, we use rtl2𝜇spec to facilitate the Check-based verification of the four-core RISC-V V-scale (multi-V-scale) processor’s MCM implementation. We show that rtl2𝜇spec can synthesize a complete, and proven correct by construction, 𝜇spec model from the SystemVerilog design of the multi-V-scale processor in 6.84 minutes. Subsequent Check-based MCM verification of the synthesized 𝜇spec model takes less than one second per litmus test.
more »
« less
- Award ID(s):
- 2017863
- NSF-PAR ID:
- 10299361
- Date Published:
- Journal Name:
- IEEE/ACM International Symposium on Microarchitecture
- Page Range / eLocation ID:
- 679 to 694
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Attacks which combine software vulnerabilities and hardware vulnerabilities are emerging security problems. Although the runtime verification or remote attestation can determine the correctness of a system, existing methods suffer from inflexible security policy setup and high performance overheads. Meanwhile, they rarely focus on addressing the threat in the RISC-V architecture, which provides an open Instruction Set Architecture (ISA) of the processsor. In this paper, we propose a comprehensive software and hardware co-verification method to protect the entire RISC-V system in the runtime. The proposed method adopts the Dynamic Information Flow Tracking (DIFT) framework to implement a new Verifier and Prover security architecture for supporting runtime software and hardware coverification. We realize a FPGA prototype on the Rocket-Chip, an RISC-V open-source processor core. The framework is implemented as a co-processor which do not change the architecture of main processor core and the new security architecture can be integrated with other RISC-V processors.more » « less
-
Implementation of a new instruction set architecture (ISA) is a non-trivial task which involves significant modifications to the system software, such as the compiler, the assembler, and the linker. This task also includes modifying and verifying functional and cycle accurate simulators to facilitate correct simulation and performance evaluation of programs under the new ISA. Isolating errors in these software components becomes extremely challenging and demands automated and semi-automated mechanisms since neither the compilation infrastructure nor the simulation infrastructure can be trusted as both parties have been heavily modified. Bootstrapping a new ISA is very common in embedded systems since there is a greater variety of embedded ISAs due to often not having a need to support backward compatibility of executables. In this paper, we present the tools and the verification mechanisms we have implemented to support the development of a number of related, but distinct ISAs. These ISAs are similar in complexity to the RISC-V ISA, and range from simple pipelined and superscalar processor ISAs, to a complete VLIW ISA. Our work in developing the system software and simulators for these ISAs demonstrate that a step-by-step semi-automated approach which relies on simple invariants can facilitate effective bootstrapping of the complete system software and the simulator infrastructure.more » « less
-
SRAM-based FPGAs are frequently used for critical functions in space applications. Soft processors implemented within these FPGAs are often needed to satisfy the mission requirements. The open ISA, RISC-V, has allowed for the development of a wide range of open source processors. Like all SRAM-based FPGA digital designs, these soft processors are susceptible to SEUs. This paper presents an investigation of the performances and relative SEU sensitivity of a selection of newly available open source RISC-V processors. Utilizing dynamic partial reconfiguration, this novel automatic test equipment rapidly deployed different implementations and evaluated SEU sensitivity through fault injection. Using BYU’s new SpyDrNet tools, fine-grain TMR was also applied to each processor with results ranging from a 20× to 500× reduction in sensitivity.more » « less
-
null (Ed.)Processing-in-memory (PIM) architectures attempt to overcome the von Neumann bottleneck by combining computation and storage logic into a single component. The content-addressable parallel processing paradigm (CAPP) from the seventies is an in-situ PIM architecture that leverages content-addressable memories to realize bit-serial arithmetic and logic operations, via sequences of search and update operations over multiple memory rows in parallel. In this paper, we set out to investigate whether the concepts behind classic CAPP can be used successfully to build an entirely CMOS-based, general-purpose microarchitecture that can deliver manyfold speedups while remaining highly programmable. We conduct a full-stack design of a Content-Addressable Processing Engine (CAPE), built out of dense push-rule 6T SRAM arrays. CAPE is programmable using the RISC-V ISA with standard vector extensions. Our experiments show that CAPE achieves an average speedup of 14 (up to 254) over an area-equivalent (slightly under 9mm^2 at 7nm) out-of-order processor core with three levels of caches.more » « less