NSF PAR Search | NSF Public Access Repository

Efficient Video Redaction at the Edge: Human Motion Tracking for Privacy Protection

https://doi.org/10.1145/3762994

Qiao, Haotian; Srinivas, Vidya; Dinda, Peter; Dick, Robert (August 2025, ACM Transactions on Embedded Computing Systems)

Computationally efficient, camera-based, real-time human position tracking on low-end edge devices would enable numerous applications, including privacy-preserving video redaction and analysis. Unfortunately, running most deep neural network based models in real time requires expensive hardware, making widespread deployment difficult, particularly on edge devices. Shifting inference to the cloud increases the attack surface, generally requiring that users trust cloud servers, and increases demands on wireless networks in deployment venues. Our goal is to determine the extreme to which edge video redaction efficiency can be taken, with a particular interest in enabling, for the first time, low-cost, real-time deployments with inexpensive commodity hardware. We present an efficient solution to the human detection (and redaction) problem based on singular value decomposition (SVD) background removal and describe a novel time- and energy-efficient sensor-fusion algorithm that leverages human position information in real-world coordinates to enable real-time visual human detection and tracking at the edge. These ideas are evaluated using a prototype built from (resource-constrained) commodity hardware representative of commonly used low-cost IoT edge devices. The speed and accuracy of the system are evaluated via a deployment study, and it is compared with the most advanced relevant alternatives. The multi-modal system operates at a frame rate ranging from 20 FPS to 60 FPS, achieves a wIoU_0.3 score (see Section 5.4) ranging from 0.71 to 0.79, and successfully performs complete redaction of privacy-sensitive pixels with a success rate of 91%–99% in human head regions and 77%–91% in upper body regions, depending on the number of individuals present in the field of view. These results demonstrate that it is possible to achieve adequate efficiency to enable real-time redaction on inexpensive, commodity edge hardware.
Free, publicly-accessible full text available August 27, 2026
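
The abstract above names SVD background removal as the basis of its detection step but does not give the details here. As a rough illustration of the general idea only (not the authors' pipeline), the sketch below stacks grayscale frames as columns of a matrix, treats a low-rank SVD approximation as the static background, and flags pixels with a large residual as foreground to be redacted. The function names, the rank of 1, and the 0.25 threshold are assumptions made for this example.

```python
import numpy as np

def svd_foreground_masks(frames, rank=1, threshold=0.25):
    """Flag pixels that differ from a low-rank background estimate.

    frames: float array of shape (T, H, W), grayscale values in [0, 1].
    rank and threshold are illustrative defaults, not values from the paper.
    Returns a boolean array of shape (T, H, W); True marks foreground.
    """
    t, h, w = frames.shape
    # One column per frame, one row per pixel.
    m = frames.reshape(t, h * w).T
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    # Keep only the top `rank` singular triplets as the background model.
    background = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    residual = np.abs(m - background)
    return (residual > threshold).T.reshape(t, h, w)

def redact(frames, masks, fill=0.0):
    """Return a copy of frames with foreground pixels overwritten by `fill`."""
    out = frames.copy()
    out[masks] = fill
    return out
```

A static scene is close to rank 1 in this layout because every column is nearly the same image, so a small moving object ends up mostly in the residual. A real deployment would also need to handle lighting changes and update the background over time, which this sketch does not attempt.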
CAMP: Compiler and Allocator-based Heap Memory Protection

Lin, Zhenpeng; Yu, Zheng; Guo, Ziyi; Campanoni, Simone; Dinda, Peter; Xing, Xinyu (August 2024, USENIX Security 2024)

Full Text Available
Getting a Handle on Unmanaged Memory

https://doi.org/10.1145/3620666.3651326

Wanninger, Nick; McMichen, Tommy; Campanoni, Simone; Dinda, Peter (April 2024, ACM)

Full Text Available
TrackFM: Far-out Compiler Support for a Far Memory World

https://doi.org/10.1145/3617232.3624856

Tauro, Brian R; Suchy, Brian; Campanoni, Simone; Dinda, Peter; Hale, Kyle C (April 2024, ACM)

Full Text Available
Compiling Loop-Based Nested Parallelism for Irregular Workloads

https://doi.org/10.1145/3620665.3640405

Su, Yian; Rainey, Mike; Wanninger, Nick; Dhiantravan, Nadharm; Liang, Jasper; Acar, Umut A; Dinda, Peter; Campanoni, Simone (April 2024, ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems)

Modern programming languages offer special syntax and semantics for logical fork-join parallelism in the form of parallel loops, allowing them to be nested, e.g., a parallel loop within another parallel loop. This expressiveness comes at a price, however: on modern multicore systems, realizing logical parallelism results in overheads due to the creation and management of parallel tasks, which can wipe out the benefits of parallelism. Today, we expect application programmers to cope with it by manually tuning and optimizing their code. Such tuning requires programmers to reason about architectural factors hidden behind layers of software abstractions, such as task scheduling and load balancing. Managing these factors is particularly challenging when workloads are irregular because their performance is input-sensitive. This paper presents HBC, the first compiler that translates C/C++ programs with high-level, fork-join constructs (e.g., OpenMP) to binaries capable of automatically controlling the cost of parallelism and dealing with irregular, input-sensitive workloads. The basis of our approach is Heartbeat Scheduling, a recent proposal for automatic granularity control, which is backed by formal guarantees on performance. HBC binaries outperform OpenMP binaries for workloads for which even entirely manual solutions struggle to find the right balance between parallelism and its costs.
Full Text Available
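
The HBC abstract above builds on Heartbeat Scheduling for automatic granularity control. A minimal sequential simulation of that general idea, assuming an iteration count stands in for the periodic heartbeat, is sketched below: a loop runs serially and only splits off the remaining half of its range as a new task once a heartbeat's worth of work has been done, so task-creation cost is amortized against useful work. The function name, the deque used as a task pool, and the default interval of 64 are hypothetical and are not taken from HBC.

```python
from collections import deque

def heartbeat_sum(values, heartbeat=64):
    """Sum `values`, creating a new task at most once per `heartbeat` iterations.

    Simulates tasks with a queue of [lo, hi) index ranges instead of threads.
    Returns (total, tasks_created).
    """
    ready = deque([(0, len(values))])  # hypothetical task pool
    total = 0
    tasks_created = 1  # the initial task covering the whole range
    while ready:
        lo, hi = ready.popleft()
        since_beat = 0
        while lo < hi:
            total += values[lo]
            lo += 1
            since_beat += 1
            # On a heartbeat, promote the second half of the remaining
            # range into its own task, if there is enough left to split.
            if since_beat >= heartbeat and hi - lo >= 2:
                mid = lo + (hi - lo) // 2
                ready.append((mid, hi))
                hi = mid
                tasks_created += 1
                since_beat = 0
    return total, tasks_created
```

Because each promotion is preceded by at least `heartbeat` iterations of work inside the same task, the number of tasks created is bounded by roughly the input length divided by the interval, regardless of how irregular the per-iteration cost is.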
CARAT KOP: Towards Protecting the Core HPC Kernel from Linux Kernel Modules

https://doi.org/10.1145/3624062.3624237

Filipiuk, Thomas; Wanninger, Nick; Dhiantravan, Nadharm; Surmeier, Carson; Bernat, Alex; Dinda, Peter (November 2023, Proceedings of the 13th International Workshop on Runtime and Operating Systems for Supercomputers (ROSS 2023))
Generalized Collective Algorithms for the Exascale Era

https://doi.org/10.1109/CLUSTER52292.2023.00013

Wilkins, Michael; Wang, Hanming; Liu, Peizhi; Pham, Bangyen; Guo, Yanfei; Thakur, Rajeev; Dinda, Peter; Hardavellas, Nikos (October 2023, IEEE)

Full Text Available
Evaluating Functional Memory-Managed Parallel Languages for HPC using the NAS Parallel Benchmarks

https://doi.org/10.1109/IPDPSW59300.2023.00072

Wilkins, Michael; Weil, Garrett; Arnold, Luke; Hardavellas, Nikos; Dinda, Peter (May 2023, Proceedings of the 28th HIPS workshop at IPDPS 2023)

Full Text Available
ACCLAiM: Advancing the Practicality of MPI Collective Communication Autotuning Using Machine Learning

https://doi.org/10.1109/CLUSTER51413.2022.00030

Wilkins, Michael; Guo, Yanfei; Thakur, Rajeev; Dinda, Peter; Hardavellas, Nikos (September 2022, Proceedings of the 2022 IEEE International Conference on Cluster Computing (CLUSTER))

Full Text Available
Program State Element Characterization

https://doi.org/10.1145/3579990.3580011

Deiana, Enrico Armenio; Suchy, Brian; Wilkins, Michael; Homerding, Brian; McMichen, Tommy; Dunajewski, Katarzyna; Dinda, Peter; Hardavellas, Nikos; Campanoni, Simone (February 2023, International Symposium on Code Generation and Optimization)

Modern programming languages offer abstractions that simplify software development and allow hardware to reach its full potential. These abstractions range from the well-established OpenMP language extensions to newer C++ features like smart pointers. To properly use these abstractions in an existing codebase, programmers must determine how a given source code region interacts with Program State Elements (PSEs) (i.e., the program's variables and memory locations). We call this process Program State Element Characterization (PSEC). Without tool support for PSEC, a programmer's only option is to manually study the entire codebase. We propose a profile-based approach that automates PSEC and provides abstraction recommendations to programmers. Because a profile-based approach incurs an impractical overhead, we introduce the Compiler and Runtime Memory Observation Tool (CARMOT), a PSEC-specific compiler co-designed with a parallel runtime. CARMOT reduces the overhead of PSEC by two orders of magnitude, making PSEC practical. We show that CARMOT's recommendations achieve the same speedup as hand-tuned OpenMP directives and avoid memory leaks with C++ smart pointers. From this, we argue that PSEC tools, such as CARMOT, can provide support for the rich ecosystem of modern programming language abstractions.
Full Text Available
