Chiplet integration using 2.5D packaging is gaining popularity nowadays which enables several interesting features like heterogeneous integration and drop-in design method. In the traditional die-by-die approach of designing a 2.5D system, each chiplet is designed independently without any knowledge of the package RDLs. In this paper, we propose a Chip-Package Co-Design flow for implementing 2.5D systems using existing commercial chip design tools. Our flow encompasses 2.5D-aware partitioning suitable for SoC design, Chip-Package Floorplanning, and post-design analysis and verification of the entire 2.5D system. We also designed our own package planners to route RDL layers on top of chiplet layers. We use an ARM Cortex-M0 SoC system to illustrate our flow and compare analysis results with a monolithic 2D implementation of the same system. We also compare two different 2.5D implementations of the same SoC system following the drop-in approach. Alongside the traditional die-by-die approach, our holistic flow enables design efficiency and flexibility with accurate cross-boundary parasitic extraction and design verification.
more »
« less
TAP-2.5D: A Thermally-Aware Chiplet Placement Methodology for 2.5D Systems
Heterogeneous systems are commonly used today to
sustain the historic benefits we have achieved through technology
scaling. 2.5D integration technology provides a cost-effective
solution for designing heterogeneous systems. The traditional
physical design of a 2.5D heterogeneous system closely packs the
chiplets to minimize wirelength, but this leads to a thermally-inefficient
design. We propose TAP-2.5D: the first open-source
network routing and thermally-aware chiplet placement methodology
for heterogeneous 2.5D systems. TAP-2.5D strategically inserts
spacing between chiplets to jointly minimize the temperature and
total wirelength, and in turn, increases the thermal design power
envelope of the overall system. We present three case studies
demonstrating the usage and efficacy of TAP-2.5D.
more »
« less
- Award ID(s):
- 1716352
- NSF-PAR ID:
- 10248937
- Date Published:
- Journal Name:
- Design, Automation and Test in Europe (DATE)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
design, both inter- and intra-chiplet, impacts overall system performance as well as its manufacturing cost and thermal feasibility. This paper introduces a cross-layer methodology for designing networks in 2.5D systems. We optimize the network design and chiplet placement jointly across logical, physical, and circuit layers to achieve an energy-efficient network, while maximizing system performance, minimizing manufacturing cost, and adhering to thermal constraints. In the logical layer, our co-optimization considers eight different network topologies. In the physical layer, we consider routing, microbump assignment, and microbump pitch constraints to account for the extra costs associated with microbump utilization in the inter-chiplet communication. In the circuit layer, we consider both passive and active links with five different link types, including a gas station link design. Using our cross-layer methodology results in more accurate determination of (superior) inter-chiplet network and 2.5D system designs compared to prior methods. Compared to 2D systems, our approach achieves 29% better performance with the same manufacturing cost, or 25% lower cost with the same performance.more » « less
-
Heterogeneous chiplets have been proposed for accelerating high-performance computing tasks. Integrated inside one package, CPU and GPU chiplets can share a common interconnection network that can be implemented through the interposer. However, CPU and GPU applications have very different traffic patterns in general. Without effective management of the network resource, some chiplets can suffer significant performance degradation because the network bandwidth is taken away by communication-intensive applications. Therefore, techniques need to be developed to effectively manage the shared network resources. In a chiplet-based system, resource management needs to not only react in real-time but also be cost-efficient. In this work, we propose a reconfigurable network architecture, leveraging Kalman Filter to make accurate predictions on network resources needed by the applications and then adaptively change the resource allocation. Using our design, the network bandwidth can be fairly allocated to avoid starvation or performance degradation. Our evaluation results show that the proposed reconfigurable interconnection network can dynamically react to the changes in traffic demand of the chiplets and improve the system performance with low cost and design complexity.more » « less
-
2.5D chiplet-based technology promises an efficient integration technique for advanced designs with more functionality and higher performance. Temperature and related thermal optimization, heat removal are of critical importance for temperature-aware physical synthesis for chiplets. This paper presents a novel graph convolutional networks (GCN) architecture to estimate the thermal map of the 2.5D chiplet-based systems with the thermal resistance networks built by the compact thermal model (CTM). First, we take the total power of all chiplets as an input feature, which is a global feature. This additional global information can overcome the limitation that the GCN can only extract local information via neighborhood aggregation. Second, inspired by convolutional neural networks (CNN), we add skip connection into the GCN to pass the global feature directly across the hidden layers with the concatenation operation. Third, to consider the edge embedding feature, we propose an edge-based attention mechanism based on the graph attention networks (GAT). Last, with the multiple aggregators and scalers of principle neighborhood aggregation (PNA) networks, we can further improve the modeling capacity of the novel GCN. The experimental results show that the proposed GCN model can achieve an average RMSE of 0.31 K and deliver a 2.6$\times$ speedup over the fast steady-state solver of open-source {\it HotSpot} based on SuperLU. More importantly, the GCN model demonstrates more useful generalization or transferable capability. Our results show that the trained GCN can be directly applied to predict thermal maps of six unseen datasets with acceptable mean RMSEs of less than 0.67 K without retraining via inductive learning.more » « less
-
Fast-evolving artificial intelligence (AI) algorithms such as large language models have been driving the ever increasing computing demands in today’s data centers. Heterogeneous computing with domain-specific architectures (DSAs) brings many opportunities when scaling up and scaling out the computing system. In particular, heterogeneous chiplet architecture is favored to keep scaling up and scaling out the system as well as to reduce the design complexity and the cost stemming from the traditional monolithic chip design. However, how to interconnect computing resources and orchestrate heterogeneous chiplets is the key to success. In this paper, we first discuss the diversity and evolving demands of different AI workloads. We discuss how chiplet brings better cost efficiency and shorter time to market. Then we discuss the challenges in establishing chiplet interface standards, packaging, and security issues. We further discuss the software programming challenges in chiplet systems.more » « less