Industry trends are moving toward increasing use of chiplets as a replacement for monolithic fabrication in many modern chips. Each chiplet is a separately produced silicon die, and a system-on-chip (SoC) is created by packaging the chiplets together on a silicon interposer or bridge. Chiplets enable IP reuse, heterogeneous integration, and better use of cost-appropriate process nodes. Yet creating systems from separately produced components also brings security risks, such as die swapping and susceptibility to interposer probing or tampering. In a zero-trust security posture, a chiplet should not blindly assume it is operating in a friendly environment. In this paper we propose a delay-based PUF for chiplets to verify system integrity. Our technique allows a single chiplet to initiate a protocol with its neighbors to measure unique variations in the propagation delays of incoming signals as part of an integrity check. We prototype our design on Xilinx Ultrascale+ FPGAs, which are constructed as multi-die systems on a silicon interposer, and which also emulate the general features of other industrial chiplet interfaces. We perform experiments on, and compare data from, dozens of Ultrascale+ FPGAs by making use of Amazon’s Elastic Compute Cloud (EC2) F1 instances as a testing platform. The PUF cells are shown to reject clock and temperature variation as common mode, and each cell produces approximately 5 ps of unique delay variation. For a design with 144 PUF cells, we measure the mean within-class and between-class distances to be 68.3 ps and 847.7 ps, respectively. The smallest between-class distance of 686.0 ps exceeds the largest within-class distance of 124.0 ps by more than 5x under nominal conditions, and the PUF is shown to be resilient to environmental changes. Our findings indicate the PUF can be used for authentication, and is potentially sensitive enough to detect picosecond-scale timing changes due to tampering.
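The gap between the within-class and between-class distances quoted above is what makes a simple threshold test plausible. The following is a minimal sketch of such a test, not the paper's protocol: the distance metric (sum of absolute per-cell differences), the midpoint threshold, the function names, and the synthetic Gaussian fingerprints and noise level are all assumptions made for illustration. Only the 144-cell count and the 124.0 ps and 686.0 ps extremes come from the abstract.

```python
import random

NUM_CELLS = 144  # PUF cell count quoted in the abstract

# Extremes quoted in the abstract (nominal conditions), in picoseconds.
MAX_WITHIN_CLASS_PS = 124.0
MIN_BETWEEN_CLASS_PS = 686.0
# Assumption: place the decision threshold midway between the two extremes.
THRESHOLD_PS = (MAX_WITHIN_CLASS_PS + MIN_BETWEEN_CLASS_PS) / 2


def distance_ps(a, b):
    """Sum of absolute per-cell delay differences (an assumed metric), in ps."""
    return sum(abs(x - y) for x, y in zip(a, b))


def authenticate(enrolled, measured, threshold_ps=THRESHOLD_PS):
    """Accept the measurement if it lies close enough to the enrolled fingerprint."""
    return distance_ps(enrolled, measured) < threshold_ps


def synthetic_fingerprint(rng, sigma_ps=5.0):
    """Hypothetical device: one Gaussian delay offset per PUF cell."""
    return [rng.gauss(0.0, sigma_ps) for _ in range(NUM_CELLS)]


def remeasure(fingerprint, rng, noise_ps=0.5):
    """Hypothetical repeat measurement: the same device plus small noise."""
    return [d + rng.gauss(0.0, noise_ps) for d in fingerprint]


rng = random.Random(0)
enrolled = synthetic_fingerprint(rng)
same_device = remeasure(enrolled, rng)
other_device = synthetic_fingerprint(rng)

print(authenticate(enrolled, same_device))   # repeat measurement of the enrolled device
print(authenticate(enrolled, other_device))  # a different synthetic device
```

A real deployment would derive the threshold from measured distributions across temperature and voltage rather than from two extreme values.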
This content will become publicly available on August 7, 2026
ChipletQuake: On-Die Digital Impedance Sensing for Chiplet and Interposer Verification
The increasing complexity and cost of manufacturing monolithic chips have driven the semiconductor industry toward chiplet-based designs, where smaller, modular chiplets are integrated onto a single interposer. While chiplet architectures offer significant advantages, such as improved yields, design flexibility, and cost efficiency, they introduce new security challenges in the horizontal hardware manufacturing supply chain. These challenges include risks of hardware Trojans, cross-die side-channel and fault injection attacks, probing of chiplet interfaces, and intellectual property theft. To address these concerns, this paper presents ChipletQuake, a novel on-chiplet framework for verifying the physical security and integrity of adjacent chiplets during the post-silicon stage. By sensing the impedance of the power delivery network (PDN) of the system, ChipletQuake detects tamper events in the interposer and neighboring chiplets without requiring any direct signal interface or additional hardware components. Fully compatible with the digital resources of FPGA-based chiplets, this framework demonstrates the ability to identify the insertion of passive and subtle malicious circuits, providing an effective solution to enhance the security of chiplet-based systems. To validate our claims, we showcase how our framework detects hardware Trojans and interposer tampering.
- Award ID(s): 2338069
- PAR ID: 10626804
- Publisher / Repository: Sensors
- Date Published:
- Journal Name: Sensors
- Volume: 25
- Issue: 15
- ISSN: 1424-8220
- Page Range / eLocation ID: 4861
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Heterogeneous chiplets have been proposed for accelerating high-performance computing tasks. Integrated inside one package, CPU and GPU chiplets can share a common interconnection network that can be implemented through the interposer. However, CPU and GPU applications generally have very different traffic patterns. Without effective management of the network resources, some chiplets can suffer significant performance degradation because communication-intensive applications take away the network bandwidth. Therefore, techniques need to be developed to effectively manage the shared network resources. In a chiplet-based system, resource management needs not only to react in real time but also to be cost-efficient. In this work, we propose a reconfigurable network architecture that leverages a Kalman filter to make accurate predictions of the network resources needed by the applications and then adaptively changes the resource allocation. Using our design, the network bandwidth can be fairly allocated to avoid starvation or performance degradation. Our evaluation results show that the proposed reconfigurable interconnection network can dynamically react to changes in the traffic demand of the chiplets and improve system performance with low cost and design complexity.
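The predict-then-reallocate idea in this abstract can be illustrated with a scalar Kalman filter per chiplet. This is a minimal sketch under stated assumptions, not the paper's design: the random-walk demand model, the noise parameters `q` and `r`, the proportional allocation rule, and all names (`ScalarKalman`, `allocate`) are hypothetical.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with an assumed random-walk demand model."""

    def __init__(self, x0=0.0, p0=1.0, q=0.01, r=1.0):
        self.x = x0  # current demand estimate
        self.p = p0  # variance of the estimate
        self.q = q   # process noise variance (assumed)
        self.r = r   # measurement noise variance (assumed)

    def step(self, z):
        """Fold in one traffic observation z and return the updated estimate."""
        p_pred = self.p + self.q          # predict: demand unchanged, uncertainty grows
        gain = p_pred / (p_pred + self.r)  # Kalman gain
        self.x = self.x + gain * (z - self.x)
        self.p = (1.0 - gain) * p_pred
        return self.x


def allocate(predictions, total_bandwidth):
    """Split bandwidth in proportion to predicted demand (assumed policy)."""
    demand = {name: max(value, 0.0) for name, value in predictions.items()}
    total = sum(demand.values())
    if total == 0.0:
        # No predicted demand anywhere: fall back to an equal split.
        share = total_bandwidth / len(demand)
        return {name: share for name in demand}
    return {name: total_bandwidth * value / total for name, value in demand.items()}


# Hypothetical traffic samples (arbitrary units) observed for two chiplets.
filters = {"cpu": ScalarKalman(), "gpu": ScalarKalman()}
samples = {"cpu": [2.0, 2.5, 1.5, 2.0], "gpu": [8.0, 9.0, 7.5, 8.5]}
for name, trace in samples.items():
    for z in trace:
        filters[name].step(z)

shares = allocate({name: f.x for name, f in filters.items()}, total_bandwidth=100.0)
print(shares)
```

A larger `q` makes the estimate track bursts faster at the cost of more jitter in the allocation, which is the main tuning trade-off in a sketch like this.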
-
Fast-evolving artificial intelligence (AI) algorithms such as large language models have been driving the ever-increasing computing demands in today’s data centers. Heterogeneous computing with domain-specific architectures (DSAs) brings many opportunities when scaling up and scaling out the computing system. In particular, heterogeneous chiplet architecture is favored to keep scaling up and scaling out the system as well as to reduce the design complexity and cost stemming from traditional monolithic chip design. However, how to interconnect computing resources and orchestrate heterogeneous chiplets is the key to success. In this paper, we first discuss the diversity and evolving demands of different AI workloads. We discuss how chiplets bring better cost efficiency and shorter time to market. Then we discuss the challenges in establishing chiplet interface standards, packaging, and security. We further discuss the software programming challenges in chiplet systems.
-
A core challenge for superconducting quantum computers is to scale up the number of qubits in each processor without increasing noise or cross-talk. Distributed quantum computing across small qubit arrays, known as chiplets, can address these challenges in a scalable manner. We propose a chiplet architecture over microwave links with potential to exceed monolithic performance on near-term hardware. Our methods of modeling and evaluating the chiplet architecture bridge the physical and network layers in these processors. We find evidence that distributing computation across chiplets may reduce the overall error rates associated with moving data across the device, despite higher error figures for transfers across links. Preliminary analyses suggest that latency is not substantially impacted, and that at least some applications and architectures may avoid bottlenecks around chiplet boundaries. In the long term, short-range networks may underlie quantum computers just as local area networks underlie classical datacenters and supercomputers today.
-
Modern Artificial Intelligence (AI) workloads demand computing systems with large silicon area to sustain throughput and competitive performance. However, prohibitive manufacturing costs and yield limitations at advanced technology nodes, together with die sizes reaching the reticle limit, make this difficult to achieve. With the recent innovations in advanced packaging technologies, chiplet-based architectures have gained significant attention in the AI hardware domain. However, the vast design space of chiplet-based AI accelerator design and the absence of a system- and package-level co-design methodology make it difficult for the designer to find the optimum design point regarding Power, Performance, Area, and manufacturing Cost (PPAC). This paper presents Chiplet-Gym, a Reinforcement Learning (RL)-based optimization framework to explore the vast design space of chiplet-based AI accelerators, encompassing the resource allocation, placement, and packaging architecture. We analytically model the PPAC of the chiplet-based AI accelerator and integrate it into an OpenAI gym environment to evaluate the design points. We also explore non-RL-based optimization approaches and combine these two approaches to ensure the robustness of the optimizer. The optimizer-suggested design point achieves 1.52× throughput, 0.27× energy, and 0.89× cost of its monolithic counterpart at iso-area.
