skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: TNT: A Modular Approach to Traversing Physically Heterogeneous NOCs at Bare-wire Latency
The ideal latency for on-chip network traversal would be the delay incurred from wire traversal alone. Unfortunately, in a realistic modular network, the latency for a packet to traverse the network is significantly higher than this wire delay. The main limiter to achieving lower latency is the modular quantization of network traversal into hops. Beyond this, the physical heterogeneity in real-world systems further complicate the ability to reach ideal wire-only delay. In this work, we propose TNT or Transparent Network Traversal . TNT targets ideal network latency by attempting source to destination network traversal as a single multi-cycle ‘long-hop’, bypassing the quantization effects of intermediate routers via transparent data/information flow. TNT is built in a modular tile-scalable manner via a novel control path performing neighbor-to-neighbor interactions but enabling end-to-end transparent flit traversal. Further, TNT’s fine grained on-the-fly delay tracking allows it to cope with physical NOC heterogeneity across the chip. Analysis on Ligra graph workloads shows that TNT can reduce NOC latency by as much as 43% compared to the state of the art and allows efficiency gains up to 38%. Further, it can achieve more than 3x the benefits of the best/closest alternative research proposal, SMART [ 43 ].  more » « less
Award ID(s):
2010830
PAR ID:
10433361
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
ACM Transactions on Architecture and Code Optimization
Volume:
20
Issue:
3
ISSN:
1544-3566
Page Range / eLocation ID:
1 to 25
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We propose CURE, a deep reinforcement learning (DRL)-based NoC design framework that simultaneously reduces network latency, improves energy-efficiency, and tolerates transient errors and permanent faults. CURE has several architectural innovations and a DRL-based hardware controller to manage design complexity and optimize trade-offs. First, in CURE, we propose reversible multi-function adaptive channels (RMCs) to reduce NoC power consumption and network latency. Second, we implement a new fault-secure adaptive error correction hardware in each router to enhance reliability for both transient errors and permanent faults. Third, we propose a router power-gating and bypass design that powers off NoC components to reduce power and extend chip lifespan. Further, for the complex dynamic interactions of these techniques, we propose using DRL to train a proactive control policy to provide improved fault-tolerance, reduced power consumption, and improved performance. Simulation using the PARSEC benchmark shows that CURE reduces end-to-end packet latency by 39%, improves energy efficiency by 92%, and lowers static and dynamic power consumption by 24% and 38%, respectively, over conventional solutions. Using mean-time-to-failure, we show that CURE is 7.7x more reliable than the conventional NoC design. 
    more » « less
  2. null (Ed.)
    Multithreaded applications are capable of exploiting the full potential of many-core systems. However, network-on-chip (NoC)-based intercore communication in many-core systems is responsible for 60%-75% of the miss latency experienced by multithreaded applications. Delay in the arrival of critical data at the requesting core severely hampers performance. This brief presents some interesting insights about how critical data are requested from the memory by multithreaded applications. Then it investigates the cause of delay in NoC and how it affects the performance. Finally, this brief shows how NoC-aware memory access optimizations can significantly improve performance. Our experimental evaluation considers early restart memory access optimization and demonstrates that by exploiting available NoC resources, critical data can be prioritized to reduce miss penalty by 11% and improve overall system performance by 9%. 
    more » « less
  3. null (Ed.)
    System-on-Chips (SoCs) are designed using different Intellectual Property (IP) blocks from multiple third-party vendors to reduce design cost while meeting aggressive time-to-market constraints. Designing trustworthy SoCs need to address the increasing concerns related to supply-chain security vulnerabilities. Malicious implants on IPs, such as Hardware Trojans (HTs) are one of the significant security threats in designing trustworthy SoCs. It is a major challenge to detect Trojans in complex multi-processor SoCs using conventional pre- and post-silicon validation methodologies. Packet-based Network-on-Chip (NoC) is a widely used solution for on-chip communication between IPs in complex SoCs. The focus of this paper is to enable trusted NoC communication in the presence of potentially untrusted IPs. This paper makes three key contributions. (1) We model an HT in NoC router that activates misrouting of the packets to initiate denial of service, delay of service, and injection suppression. (2) We propose a dynamic shielding technique that isolates the identified HT infected IP. (3) We present a secure routing algorithm to bypass the HT infected NoC router. Experimental results on HT infected NoC demonstrate that the proposed method reduces effective average packet latency by 38% in real benchmarks and 48% in synthetic traffic patterns. Our method also increases throughput and reduces effective average deflected packet latency by 62% in real benchmarks and 97% in synthetic traffic patterns. 
    more » « less
  4. As technology scales, Network-on-Chips (NoCs), currently being used for on-chip communication in manycore architectures, face several problems including high network latency, excessive power consumption, and low reliability. Simultaneously addressing these problems is proving to be difficult due to the explosion of the design space and the complexity of handling many trade-offs. In this paper, we propose IntelliNoC, an intelligent NoC design framework which introduces architectural innovations and uses reinforcement learning to manage the design complexity and simultaneously optimize performance, energy-efficiency, and reliability in a holistic manner. IntelliNoC integrates three NoC architectural techniques: (1) multifunction adaptive channels (MFACs) to improve energy-efficiency; (2) adaptive error detection/correction and re-transmission control to enhance reliability; and (3) a stress-relaxing bypass feature which dynamically powers off NoC components to prevent overheating and fatigue. To handle the complex dynamic interactions induced by these techniques, we train a dynamic control policy using Q-learning, with the goal of providing improved fault-tolerance and performance while reducing power consumption and area overhead. Simulation using PARSEC benchmarks shows that our proposed IntelliNoC design improves energy-efficiency by 67% and mean-time-to-failure (MTTF) by 77%, and decreases end-to-end packet latency by 32% and area requirements by 25% over baseline NoC architecture. 
    more » « less
  5. Network-on-Chips (NoCs) have emerged as the standard on-chip communication fabrics for multi/many core systems and system on chips. However, as the number of cores on chip increases, so does power consumption. Recent studies have shown that NoC power consumption can reach up to 40% of the overall chip power. Considerable research efforts have been deployed to significantly reduce NoC power consumption. In this paper, we build on approximate computing techniques and propose an approximate communication methodology called DEC-NoC for reducing NoC power consumption. The proposed DEC-NoC leverages applications' error tolerance and dynamically reduces the amount of error checking and correction in packet transmission, which results in a significant reduction in the number of retransmitted packets. The reduction in packet retransmission results in reduced power consumption. Our cycle accurate simulation using PARSEC benchmark suites shows that DEC-NoC achieves up to 56% latency reduction and up to 58% dynamic power reduction compared to NoC architectures with conventional error control techniques. 
    more » « less