-
In this paper, we present a solution to the industrial challenge put forth by ARM in 2022. We systematically analyze the effect of shared resource contention on an augmented reality head-up display (AR-HUD) case-study application of the industrial challenge on a heterogeneous multicore platform, the NVIDIA Jetson Nano. We configure the AR-HUD application such that it can process incoming image frames in real time at 20 Hz on the platform. We use microarchitectural denial-of-service (DoS) attacks as aggressor workloads of the challenge and show that they can dramatically impact the latency and accuracy of the AR-HUD application. This results in significant deviations of the estimated trajectories from known ground truths, despite our best effort to mitigate their influence by using cache partitioning and real-time scheduling of the AR-HUD application. To address the challenge, we propose RT-Gang++, a partitioned real-time gang scheduling framework with last-level cache (LLC) and integrated GPU bandwidth throttling capabilities. By applying RT-Gang++, we are able to achieve the desired level of performance of the AR-HUD application even in the presence of fully loaded aggressor tasks. Free, publicly-accessible full text available January 1, 2025
-
Urban Air Mobility (UAM) applications, such as air taxis, will rely heavily on perception for situational awareness and safe operation. With recent advances in AI/ML, state-of-the-art perception systems can provide the high-fidelity information necessary for UAM systems. However, due to size, weight, power, and cost (SWaP-C) constraints, the available computing resources of the on-board computing platform in such UAM systems are limited. Therefore, real-time processing of sophisticated perception algorithms, along with guidance, navigation, and control (GNC) functions in a UAM system, is challenging and requires the careful allocation of computing resources. Furthermore, the optimal allocation of computing resources may change over time depending on the speed of the vehicle, environmental complexities, and other factors. For instance, a fast-moving air vehicle at low altitude would need a low-latency perception system, as a long delay in perception can negatively affect safety. Conversely, a slowly landing air vehicle in a complex urban environment would prefer a highly accurate perception system, even if it takes a little longer. However, most perception and control systems are not designed to support such dynamic reconfigurations necessary to maximize performance and safety. We advocate for developing “anytime” perception and control capabilities that can dynamically reconfigure the capabilities of perception and GNC algorithms at runtime to enable safe and intelligent UAM applications. The anytime approach will efficiently allocate the limited computing resources in ways that maximize mission success and ensure safety. The anytime capability is also valuable in the context of distributed sensing, enabling the efficient sharing of perception information across multiple sensor modalities between the nodes. Free, publicly-accessible full text available January 4, 2025
-
The paper discusses algorithmic priority inversion in mission-critical machine inference pipelines used in modern neural-network-based perception subsystems and describes a solution to mitigate its effect. In general, priority inversion occurs in computing systems when computations that are less important are performed together with or ahead of those that are more important. Significant priority inversion occurs in existing machine inference pipelines when they do not differentiate between critical and less critical data. We describe a framework to resolve this problem and demonstrate that it improves a perception system's ability to react to critical inputs, while at the same time reducing platform cost. Free, publicly-accessible full text available February 1, 2025
-
In this work, we set out to find the answers to the following questions: (1) Where are the bottlenecks in a state-of-the-art architectural simulator? (2) How much faster can architectural simulations run by tuning system configurations? (3) What are the opportunities in accelerating software simulation using hardware accelerators? We choose gem5 as the representative architectural simulator, run several simulations with various configurations, perform a detailed architectural analysis of the gem5 source code on different server platforms, tune both system and architectural settings for running simulations, and discuss the future opportunities in accelerating gem5 as an important application. Our detailed profiling of gem5 reveals that its performance is extremely sensitive to the size of the L1 cache. Our experimental results show that a RISC-V core with 32KB data and instruction caches improves gem5’s simulation speed by 31%-61% compared with a baseline core with 8KB L1 caches. Our paper is the first step toward building specialized hardware and software environments for accelerating software-based simulators.