Neural codecs have demonstrated strong performance in high-fidelity compression of audio signals at low bitrates. The token-based representations produced by these codecs have proven particularly useful for generative modeling. While much research has focused on improvements in compression ratio and perceptual transparency, recent works have largely overlooked another desirable codec property: idempotence, the stability of compressed outputs under multiple rounds of encoding. We find that state-of-the-art neural codecs exhibit varied degrees of idempotence, with some degrading audio outputs significantly after as few as three encodings. We investigate possible causes of low idempotence and devise a method for improving idempotence through fine-tuning a codec model. We then examine the effect of idempotence on a simple conditional generative modeling task, and find that increased idempotence can be achieved without negatively impacting downstream modeling performance, potentially extending the usefulness of neural codecs for practical file compression and iterative generative modeling workflows.
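As a rough illustration of what idempotence means operationally, the sketch below re-encodes a signal several times and reports how far the output drifts from the first encoding. The toy codec (a uniform quantizer with an optional decoder gain), the function names, and all parameters are assumptions made for illustration; they are not the neural codecs or the metric used in the work summarized above.

```python
def round_trip(signal, step=0.1, gain=1.0):
    """Toy codec (hypothetical): quantize to integer codes, then rescale."""
    codes = [round(v / step) for v in signal]   # "encode"
    return [c * step * gain for c in codes]     # "decode"


def idempotence_drift(signal, rounds, **codec_kwargs):
    """Max absolute change between the first and the last re-encoding."""
    first = round_trip(signal, **codec_kwargs)
    current = first
    for _ in range(rounds - 1):
        current = round_trip(current, **codec_kwargs)
    return max(abs(a - b) for a, b in zip(first, current))


signal = [0.23, -0.51, 0.87, 0.04]
# gain=1.0: re-encoding a decoded signal reproduces the same codes.
stable = idempotence_drift(signal, rounds=3, gain=1.0)
# gain=0.9: every round shrinks the signal, so the output keeps drifting.
unstable = idempotence_drift(signal, rounds=3, gain=0.9)
print(stable, unstable)
```

A drift of zero indicates an idempotent round trip; a drift that grows with the number of rounds is the kind of degradation the abstract describes.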
Disruption-Resilient Real-Time Sensor Data Delivery via Neural Multiple Description Coding
In this paper we develop a novel disruption-resilient approach for real-time, high-resolution sensor data delivery over multiple wireless channels for military autonomous systems such as drones, autonomous vehicles and robots. We design two innovative neural multiple description codecs (neural MDCs) which compress and encode images into multiple independently decodable and mutually refinable streams. Our approach not only achieves high compression efficiency, but also enables the effective use of multiple diverse radio channels for real-time delivery of high-resolution sensor data while ensuring disruption resiliency. Using benchmark image/video sensor datasets as well as real-world 5G traces, we evaluate and demonstrate the efficacy of both neural MDC codecs for high-resolution sensor data streaming over multiple radio channels under various jamming scenarios.
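To make "independently decodable and mutually refinable" concrete, here is a minimal sketch of a classical, non-neural multiple description scheme: samples are split into even-indexed and odd-indexed streams, either stream alone yields a coarse reconstruction, and both together restore the input exactly. This is a textbook-style stand-in, not the neural MDCs proposed in the paper, and all names are hypothetical.

```python
def make_descriptions(samples):
    """Split a sequence into two descriptions: even- and odd-indexed samples."""
    return samples[0::2], samples[1::2]


def decode(even=None, odd=None):
    """Reconstruct from whichever descriptions arrived."""
    if even is None and odd is None:
        raise ValueError("at least one description is required")
    if even is not None and odd is not None:
        # Both streams arrived: interleave them to recover the input exactly.
        total = len(even) + len(odd)
        return [even[i // 2] if i % 2 == 0 else odd[i // 2] for i in range(total)]
    # Only one stream arrived: repeat each sample to fill the gaps (coarse).
    # For odd-length inputs this output length can differ from the original by one.
    received = even if even is not None else odd
    out = []
    for value in received:
        out.extend([value, value])
    return out


samples = [1, 2, 3, 4, 5, 6]
even, odd = make_descriptions(samples)
print(decode(even=even))            # coarse reconstruction from one channel
print(decode(even=even, odd=odd))   # exact reconstruction from both channels
```

If one radio channel is jammed, the receiver still obtains a usable, lower-quality reconstruction from the other, which is the property a multiple description codec is designed to provide.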
- PAR ID:
- 10653987
- Publisher / Repository:
- IEEE Milcom 2026
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Support for connected and autonomous vehicles (CAVs) is a major use case of 5G networks. Due to their large form factors, CAVs can be equipped with multiple radio antennas, cameras, LiDAR and other sensors. In other words, they are "giant" mobile integrated communications and sensing devices. The data collected can not only facilitate edge-assisted autonomous driving, but also enable intelligent radio resource allocation by cellular networks. In this paper we conduct an initial study to assess the feasibility of delivering multi-modal sensory data collected by vehicles over emerging commercial 5G networks. We carried out an "in-the-wild" drive test and data collection campaign between Minneapolis and Chicago using a vehicle equipped with a 360° camera, a LiDAR device, multiple smart phones and a professional 5G network measurement tool. Using the collected multi-modal data, we conduct trace-driven experiments in a local streaming testbed to analyze the requirements and performance of streaming multi-modal sensor data over existing 4G/5G networks. We reveal several notable findings and point out future research directions.
-
With the expansion of sensor nodes to newer avenues of technologies, such as the Internet of things (IoT), internet of bodies (IoB), augmented reality (AR), and mixed reality, the demand to support high-speed operations, such as audio and video, with a minimal increase in power consumption is gaining much traction. In this work, we focus on these nodes operating in audio-based AR (AAR) and explore the opportunity of supporting audio at a low power budget. For sensor nodes, communicating one bit of data usually consumes significantly higher power than the power associated with sensing and processing/computing one data bit. Compressing the number of communication bits at the expense of a few computation cycles considerably reduces the overall power consumption of the nodes. Audio codecs such as AAC and LDAC that currently perform compression and decompression of audio streams burn significant power and create a floor to the minimum power possible in these applications. Compressive sensing (CS), a powerful mathematical tool for compression, is often used in physiological signal sensing, such as EEG and ECG, and it can offer a promising low-power alternative to audio codecs. We introduce a new paradigm of using the CS-based approach to realize audio compression that can function as a new independent technique or augment the existing codecs for a higher level of compression. This work, CS-Audio, fabricated in TSMC 65-nm CMOS technology, presents the first CS-based compression, equipped with an ON-chip DWT sparsifier for non-sparse audio signals. The CS design, realized in a pipelined architecture, achieves high data rates and enables a wake-up implementation to bypass computation for insignificant input samples, reducing the power consumption of the hardware. 
The measurement results demonstrate a 3X-15X reduction in transmitted audio data without a perceivable degradation of audio quality, as indicated by the perceptual evaluation of audio quality mean opinion score (PEAQ MOS) >1.5. The hardware consumes 238 μW power at 0.65 V and 15 Mbps, which is roughly 20X-40X lower than audio codecs.
-
The rise of intelligent autonomous systems, especially in robotics and autonomous agents, has created a critical need for robust communication middleware that can ensure real-time processing of extensive sensor data. Current robotics middleware like Robot Operating System (ROS) 2 faces challenges with nondeterminism and high communication latency when dealing with large data across multiple subscribers on a multi-core compute platform. To address these issues, we present High-Performance Robotic Middleware (HPRM), built on top of the deterministic coordination language Lingua Franca (LF). HPRM employs optimizations including an in-memory object store for efficient zero-copy transfer of large payloads, adaptive serialization to minimize serialization overhead, and an eager protocol with real-time sockets to reduce handshake latency. Benchmarks show HPRM achieves up to 114x lower latency than ROS2 when broadcasting large messages to multiple nodes. We then demonstrate the benefits of HPRM by integrating it with the CARLA simulator and running reinforcement learning agents along with object detection workloads. In the CARLA autonomous driving application, HPRM attains 91.1% lower latency than ROS2. The deterministic coordination semantics of HPRM, combined with its optimized IPC mechanisms, enable efficient and predictable real-time communication for intelligent autonomous systems. Code and videos can be found on our project page: https://hprm-robotics.github.io/HPRM
-
In this paper, we provide an approach to data-driven control for artificial pancreas systems by learning neural network models of human insulin-glucose physiology from available patient data and using a mixed integer optimization approach to control blood glucose levels in real-time using the inferred models. First, our approach learns neural networks to predict the future blood glucose values from given data on insulin infusion and their resulting effects on blood glucose levels. However, to provide guarantees on the resulting model, we use quantile regression to fit multiple neural networks that predict upper and lower quantiles of the future blood glucose levels, in addition to the mean. Using the inferred set of neural networks, we formulate a model-predictive control scheme that adjusts both basal and bolus insulin delivery to ensure that the risks of harmful hypoglycemia and hyperglycemia are bounded using the quantile models while the mean prediction stays as close as possible to the desired target. We discuss how this scheme can handle disturbances from large unannounced meals as well as infeasibilities that result from situations where the uncertainties in future glucose predictions are too high. We experimentally evaluate this approach on data obtained from a set of 17 patients over a course of 40 nights per patient. Furthermore, we also test our approach using neural networks obtained from virtual patient models available through the UVA-Padova simulator for type-1 diabetes.
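Quantile regression of the kind mentioned above is commonly trained with the pinball (quantile) loss, which penalizes under- and over-prediction asymmetrically. The sketch below shows that standard loss in plain Python; whether the authors use exactly this formulation is an assumption, and the function name and example values are illustrative only.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss for quantile level tau, with 0 < tau < 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        err = y - p
        # Under-prediction (err >= 0) is weighted by tau,
        # over-prediction (err < 0) by (1 - tau).
        total += tau * err if err >= 0 else (tau - 1) * err
    return total / len(y_true)


# For an upper quantile (tau = 0.9), predicting too low costs more than
# predicting too high by the same margin.
under = pinball_loss([10.0], [8.0], tau=0.9)
over = pinball_loss([10.0], [12.0], tau=0.9)
print(under, over)
```

Minimizing this loss with a high `tau` pushes a model toward an upper bound on the target, and a low `tau` toward a lower bound, which is how a pair of quantile models can bracket future glucose values.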

