-
In the current noisy intermediate-scale quantum (NISQ) era, quantum computing faces significant challenges due to noise, which severely restricts the execution of complex algorithms. Superconducting quantum chips, one of the pioneering quantum computing technologies, introduce additional noise when qubits are moved to adjacent locations so that designated two-qubit gates can operate on them. Current compilers rely on decision models that either count the swap gates or multiply the gate errors when choosing swap paths at the routing stage. Our research reveals overlooked cases in which errors propagate through the circuit and accumulate, potentially affecting the final output. In this paper, we propose Error Propagation-Aware Routing (EPAR), designed to enhance compilation performance by considering accumulated errors in routing. EPAR’s effectiveness is validated through benchmarks on a 27-qubit machine and two simulated systems with different topologies. The results indicate an average success rate improvement of 10% on both real and simulated heavy-hex lattice topologies, along with a 16% improvement in a mesh topology simulation. These findings underscore the potential of EPAR to substantially advance quantum computing in the NISQ era. Free, publicly-accessible full text available June 12, 2025
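The two baseline decision models named in the abstract above (counting swap gates versus multiplying gate errors) can be contrasted with a minimal sketch. This is not EPAR and not any compiler's actual routing pass. The coupling map, the error rates, the helper names `best_path` and `neg_log_success`, and the simplifications that each traversed edge costs one SWAP and each SWAP decomposes into three CNOTs are all assumptions made for illustration.

```python
import heapq
import math

# Hypothetical coupling map: edge -> two-qubit gate error rate (made-up values).
EDGE_ERROR = {
    (0, 1): 0.05, (1, 4): 0.05,                 # short but noisy route 0-1-4
    (0, 2): 0.01, (2, 3): 0.01, (3, 4): 0.01,   # longer but cleaner route 0-2-3-4
}


def neighbors(node):
    """Yield (neighbor, error_rate) pairs for an undirected coupling map."""
    for (a, b), err in EDGE_ERROR.items():
        if a == node:
            yield b, err
        elif b == node:
            yield a, err


def best_path(src, dst, edge_cost):
    """Dijkstra over the coupling map with a pluggable per-edge cost."""
    queue = [(0.0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path, cost
        if node in visited:
            continue
        visited.add(node)
        for nxt, err in neighbors(node):
            if nxt not in visited:
                heapq.heappush(queue, (cost + edge_cost(err), nxt, path + [nxt]))
    return None, math.inf


# Model 1: count swaps, so every edge costs the same.
swap_count_path, hops = best_path(0, 4, lambda err: 1.0)


# Model 2: multiply gate errors, i.e. maximize the product of success
# probabilities. Summing -log(1 - err) turns the product into a shortest path.
def neg_log_success(err):
    return -3 * math.log(1.0 - err)  # assumption: 1 SWAP = 3 CNOTs


fidelity_path, neg_log = best_path(0, 4, neg_log_success)
path_success = math.exp(-neg_log)

print("swap-count choice:", swap_count_path)
print("error-product choice:", fidelity_path, "estimated success:", path_success)
```

With these made-up numbers the swap-count model picks the two-hop noisy route, while the error-product model prefers the three-hop cleaner one. Neither model tracks how an error introduced early is carried through later gates, which is the gap the abstract says EPAR addresses.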
-
With the wide adoption of deep neural network (DNN) models for various applications, enterprises and cloud providers have built deep learning clusters and increasingly deployed specialized accelerators, such as GPUs and TPUs, for DNN training jobs. Existing schedulers fall short in arbitrating cluster resources among multi-user jobs: they either lack fine-grained heterogeneity awareness or are hard to generalize to various scheduling policies. To fill this gap, we propose a novel design of a task-level heterogeneity-aware scheduler, Hadar, based on an online optimization framework that can express other scheduling algorithms. Hadar leverages the performance traits of DNN jobs on a heterogeneous cluster, characterizes the task-level performance heterogeneity in the optimization problem, and makes scheduling decisions across both spatial and temporal dimensions. The primal-dual framework is employed, with our design of a dual subroutine, to solve the optimization problem and guide the scheduling design. Extensive trace-driven simulations with representative DNN models demonstrate that Hadar improves the average job completion time (JCT) by 3× over an Apache YARN-based resource manager used in production. Moreover, Hadar outperforms Gavel [1], the state-of-the-art heterogeneity-aware scheduler, by 2.5× in average JCT, shortens the queuing delay by 13%, and improves finish-time fairness (FTF) by 1.5%. Free, publicly-accessible full text available May 27, 2025
-
Free, publicly-accessible full text available January 1, 2025
-
Quantum computers in the current noisy intermediate-scale quantum (NISQ) era face two major limitations: size and error vulnerability. Although quantum error correction (QEC) methods exist, they require thousands of qubits and are therefore not applicable to current machines, since NISQ systems have roughly one hundred qubits at most. One common approach to improving reliability is to adjust the compilation process to create a more reliable final circuit, where the two most critical compilation decisions are the qubit allocation and qubit routing problems. We focus on solving the qubit allocation problem and identifying initial layouts that reduce error. To identify these layouts, we combine reinforcement learning with a graph neural network (GNN)-based Q-network that processes the mesh topology of the quantum computer, known as the backend, and makes mapping decisions, creating a Graph Neural Network Assisted Quantum Compilation (GNAQC) strategy. We train the architecture using a set of four backends and six circuits and find that GNAQC improves output fidelity by roughly 12.7% over pre-existing allocation methods.
-
The proliferation of IoT devices, with various capabilities in sensing, monitoring, and controlling, has prompted diverse emerging applications that rely heavily on the effective delivery of sensitive information gathered at edge devices to remote controllers for timely responses. To deliver such information and status updates effectively, this paper undertakes a holistic study of the Age of Information (AoI) in multi-hop networks by considering relevant and realistic factors, aiming to optimize information freshness by rapidly shipping sensitive updates captured at a source to its destination. In particular, we consider multi-channel OFDM (orthogonal frequency-division multiplexing) spectrum access in multi-hop networks and develop a rigorous mathematical model to optimize AoI at destination nodes. Real-world factors, including orthogonal channel access, wireless interference, and the queuing model, are taken into account for the first time to explore their impacts on AoI. To this end, we propose two effective algorithms: the first approximates the optimal solution as closely as desired, while the second has polynomial time complexity with a guaranteed performance gap to the optimal solution. The developed model and algorithms enable in-depth studies of AoI optimization problems in OFDM-based multi-hop wireless networks. Numerical results demonstrate that our solutions achieve better AoI performance and that AoI is affected markedly by the realistic factors we consider.
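For readers unfamiliar with the metric, the sketch below computes the time-averaged AoI at a destination under the commonly used sawtooth definition: the age at time t is t minus the generation time of the freshest update delivered so far. It illustrates only the metric, not the paper's optimization model or algorithms. The function name `average_aoi`, its parameters, and the sample timestamps are hypothetical.

```python
def average_aoi(updates, horizon, initial_age=0.0):
    """Time-averaged AoI over [0, horizon].

    updates: list of (generated_at, delivered_at) pairs, in any order.
    horizon: end of the observation window (must be > 0).
    initial_age: assumed age of the destination's information at time 0.
    """
    # Process deliveries in time order; ignore those after the horizon.
    events = sorted((d, g) for g, d in updates if d <= horizon)

    area = 0.0           # integral of the age curve so far
    t_prev = 0.0         # time of the last processed event
    age_prev = initial_age  # age immediately after t_prev

    for delivered, generated in events:
        dt = delivered - t_prev
        # Between deliveries the age grows linearly with slope 1,
        # so each segment is a trapezoid.
        area += dt * age_prev + 0.5 * dt * dt
        # A delivery only helps if it is fresher than what we already have.
        age_prev = min(age_prev + dt, delivered - generated)
        t_prev = delivered

    # Tail segment from the last delivery to the end of the window.
    dt = horizon - t_prev
    area += dt * age_prev + 0.5 * dt * dt
    return area / horizon


# Two updates, each taking one time unit to cross the network.
example = average_aoi([(0.0, 1.0), (2.0, 3.0)], horizon=4.0)
print(example)
```

In a multi-hop setting the delivery time of each update would itself depend on channel assignment, interference, and queuing at intermediate nodes, which is where the factors listed in the abstract enter.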
-
With the growing effort to reduce power consumption in machines, fault tolerance becomes more of a concern. This holds particularly for large-scale computing, where execution failures due to soft faults waste excessive time and resources. These large-scale applications are normally parallel in nature and rely on control structures tailored specifically for parallel computing, such as locks and barriers. While there are many studies on resilient software, to our knowledge none focuses on protecting these parallel control structures. In this work, we present a method of ensuring the correct operation of both locks and barriers in parallel applications. Our method tracks the memory locations used within parallel sections and detects violations of the control structures. Upon detecting a violation, the violating thread is rolled back to the beginning of the structure and reattempts it, similar to rollback mechanisms in transactional memory systems. We test the method on representative samples of the BigDataBench kernels and find that it exhibits a mean error reduction of 93.6% for basic mutex locks and barriers, with a mean execution time overhead of 6.55% at 64 threads. Additionally, we provide a comparison to transactional memory methods and demonstrate a mean execution time overhead reduction of up to 57.5%.
-
A fall detection system is of critical importance in protecting elders: by promptly discovering fall accidents, it enables immediate medical assistance and can potentially save lives. This paper aims to develop a novel and lightweight fall detection system that relies solely on a home audio device, using inaudible acoustic sensing to recognize fall occurrences, for wide home deployment. In particular, we program the audio device so that its speaker emits a 20 kHz continuous wave while a microphone records the reflected signals to capture the Doppler shift caused by the fall. To account for interference from different factors, we first develop a set of solutions for removing it to obtain clean spectrograms and then apply the power burst curve to locate the time points at which human motions happen. A set of effective features is then extracted from the spectrograms to represent fall patterns that are distinguishable from normal activities. We further apply Singular Value Decomposition (SVD) and the K-means algorithm to reduce the feature dimensions and to cluster the data, respectively, before feeding them to a Hidden Markov Model for training and classification. Finally, our system is implemented and deployed in various environments for evaluation. The experimental results demonstrate that our system achieves superior performance in detecting fall accidents and is robust to environment changes, i.e., transferable to other environments after training in one environment.
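To give a sense of the signal the system looks for, the following worked calculation estimates the Doppler shift of a 20 kHz tone reflected off a moving body, using the standard two-way approximation Δf ≈ 2·v·f₀/c for speeds far below the speed of sound. It is a back-of-the-envelope sketch, not the paper's processing pipeline. The speed of sound (343 m/s, roughly room temperature), the sample body speed of 2 m/s, and the function name `doppler_shift_hz` are assumptions.

```python
SPEED_OF_SOUND = 343.0   # m/s, assumed value for air at about 20 °C
CARRIER_HZ = 20_000.0    # emitted tone, as stated in the abstract


def doppler_shift_hz(radial_speed_mps, carrier_hz=CARRIER_HZ, c=SPEED_OF_SOUND):
    """Approximate two-way Doppler shift for a reflector moving at
    radial_speed_mps toward (positive) or away from (negative) a
    co-located speaker and microphone. Valid for speeds much smaller than c.
    """
    return 2.0 * radial_speed_mps * carrier_hz / c


# Hypothetical radial speed of a falling torso.
shift = doppler_shift_hz(2.0)
print(f"{shift:.1f} Hz")
```

Under these assumptions a 2 m/s motion shifts the tone by roughly 233 Hz, small relative to the carrier but large enough to appear as a distinct trace around 20 kHz in a spectrogram.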