Edge-assisted video analytics is gaining momentum. In this work, we tackle an important problem: compressing video content live streamed from the device to the edge without sacrificing the accuracy and timeliness of its video analytics. We find that on-device processing can be tuned over a larger configuration space for more video compression, an opportunity that has been largely overlooked. Inspired by our pilot study, we design VPPlus to realize this potential, compressing the video as much as possible while preserving analytical accuracy. VPPlus incorporates two core modules, offline profiling and online adaptation, to generate proper feedback automatically and quickly to tune on-device processing. We validate the effectiveness and efficiency of VPPlus using five object detection tasks over two popular datasets; VPPlus outperforms the state-of-the-art approaches in almost all cases.
Profiling-free Configuration Adaptation and Latency-Aware Resource Scheduling for Video Analytics
With the growing deployment of cameras and rapid advances in computer vision, large-scale live video analytics has become feasible. However, analyzing videos is compute-intensive, and live video analytics must be performed in real time. In this paper, we design an edge server system for live video analytics. We propose to perform configuration adaptation without profiling video online: configurations are selected by a prediction model based on object movement features. In addition, we reduce latency through resource orchestration on video analytics servers. The key idea of resource orchestration is to batch inference tasks that use the same CNN model and to schedule tasks based on a priority value that estimates their impact on the total latency. We evaluate our system with two video analytics applications, road traffic monitoring and pose detection. The experimental results show that our profiling-free adaptation reduces the workload by 80% compared with state-of-the-art adaptation without lowering the accuracy. The average serving latency is reduced by up to 95% compared with profiling-based adaptation.
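The batching and priority idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's scheduler: the `Task` fields, the per-model time estimates, and in particular the `priority` formula (tasks served per second of estimated batch time) are assumptions chosen only to make the example concrete.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Task:
    stream_id: str
    model: str      # name of the CNN model the task needs
    arrival: float  # arrival time in seconds


def group_by_model(tasks):
    """Group pending tasks so that each batch uses a single CNN model."""
    batches = defaultdict(list)
    for task in tasks:
        batches[task.model].append(task)
    return dict(batches)


def priority(batch, est_batch_time):
    """Hypothetical priority: waiting tasks served per second of batch time.

    The abstract only says the priority estimates a task's impact on total
    latency; this particular formula is an assumption for illustration.
    """
    return len(batch) / est_batch_time


def schedule(tasks, est_time_per_model):
    """Return the batches ordered from highest to lowest priority."""
    batches = group_by_model(tasks)
    return sorted(
        batches.values(),
        key=lambda b: priority(b, est_time_per_model[b[0].model]),
        reverse=True,
    )


pending = [
    Task("cam1", "detector", 0.00),
    Task("cam2", "pose", 0.01),
    Task("cam3", "detector", 0.02),
    Task("cam4", "detector", 0.03),
]
# Assumed per-batch inference times in seconds (illustrative values).
est = {"detector": 0.05, "pose": 0.04}
order = [[t.stream_id for t in b] for b in schedule(pending, est)]
print(order)
```

Under these assumed numbers the three detector tasks run first as one batch, because serving them together clears more waiting tasks per unit of processing time than the single pose task.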
- Award ID(s): 1908536
- PAR ID: 10465136
- Date Published:
- Journal Name: 2022 IEEE International Conference on Big Data (Big Data)
- Page Range / eLocation ID: 1202 to 1211
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Fifth-generation and beyond (B5G) networks must handle stringent requirements for ultra-low latency, high reliability, and dynamic service provisioning across decentralized environments. While container-based live migration has emerged as a flexible mechanism to ensure service continuity during failures and overload scenarios, most proposed approaches are reactive and lack integration with the transport layer and automation through real-time resource orchestration. This work presents a proactive migration framework that tightly couples a 5G radio access network (RAN) architecture with an OpenROADM-compliant optical transport network (OTN) testbed. By leveraging dynamic optical wavelength service creation and container-based network function virtualization, the presented framework enables seamless live migration of the gNB (next generation) central unit–user plane (CU-UP) between two remote locations without disconnecting the supported mobile services. A custom xApp within the near-real-time RAN intelligent controller (Near-RT RIC) monitors system performance metrics and employs predictive analytics to trigger the proactive CU-UP container migration ahead of a probable server overload scenario. A robot framework-based automation platform ensures coordinated orchestration between the compute and transport layer resource allocation to achieve a successful live migration of the container running the CU-UP. Experimental results confirm that the proposed approach achieves near-zero mobile user service downtime, demonstrating its effectiveness in meeting the end-to-end quality of service (QoS) requirements of B5G applications.
- With the growing ubiquity of video content, efficient video analytics has become essential for applications such as surveillance, autonomous driving, and augmented reality. Yet, deploying video analytics models on resource-constrained edge devices and in low-bandwidth environments remains challenging. A dominant method for handling demanding video analytics tasks on edge devices has been to offload computation strategically from the edge device to servers. However, all prior solutions fail to offload under severely constrained, real-world network conditions (such as a few-Mbps satellite network) due to the much higher data rates associated with video tasks. We introduce ApproxBit, a system to optimize shared edge-to-cloud processing for video analytics tasks; the two that we experiment with are video action recognition and video question answering. ApproxBit integrates an encoder within the video model, uses learned binary codes to effectively compress and offload data, and adaptively decides on the offloading point depending on the network bandwidth. ApproxBit's adaptive and efficient data compression, which reduces the original feature map size by up to 2142.4×, makes it an ideal solution for video analytics on edge devices, especially with constrained networks. We evaluate ApproxBit on the two video tasks, across different model architectures (e.g., convolution- and Transformer-based) and multiple datasets (e.g., Something-Something-v2, Kinetics, and MSVD). Our latency and accuracy results are superior to the baselines: edge-only processing, server-only processing, DNN Surgery [ToCC ’23], full offloading of H.264-encoded videos, DeepCOD [SenSys ’20], neural video compression DCVC-FM [CVPR ’24], and LimitNet [MobiSys ’24]. We also demonstrate ApproxBit's adaptivity to changing network conditions, and generalization in a real-world user study.
- Serverless computing has become increasingly popular for cloud applications, due to its compelling properties of high-level abstractions, lightweight runtime, high elasticity and pay-per-use billing. In this revolutionary computing paradigm shift, challenges arise when adapting data analytics applications to the serverless environment, due to the lack of support for efficient state sharing, which attracts ever-growing research attention. In this paper, we aim to exploit the advantages of task-level orchestration and fine-grained resource provisioning for data analytics on serverless platforms, with the hope of fulfilling the promise of serverless deployment to the maximum extent. To this end, we present ACTS, an autonomous cost-efficient task orchestration framework for serverless analytics. ACTS judiciously schedules and coordinates function tasks to mitigate cold-start latency and state sharing overhead. In addition, ACTS explores the optimization space of fine-grained workload distribution and function resource configuration for cost efficiency. We have implemented and deployed ACTS on AWS Lambda and evaluated it with various data analytics workloads. Results from extensive experiments demonstrate that ACTS achieves up to 98% monetary cost reduction while maintaining superior job completion time performance, in comparison with the state-of-the-art baselines.
- Video analytics has many applications in traffic control, security monitoring, action/event analysis, etc. With the adoption of deep neural networks, the accuracy of video analytics in video streams has been greatly improved. However, deep neural networks for performing video analytics are compute-intensive. In order to reduce processing time, many systems switch to a lower frame rate or resolution. State-of-the-art switching approaches adjust configurations by profiling video clips on a large configuration space. Multiple configurations are tested periodically and the cheapest one with a desired accuracy is adopted. In this paper, we propose a method that adapts the configuration by analyzing past video analytics results instead of profiling candidate configurations. Our method adopts a lower/higher resolution or frame rate when objects move slowly/quickly. We train a model that automatically selects the best configuration. We evaluate our method with two real-world video analytics applications: traffic tracking and pose estimation. Compared to the periodic profiling method, our method achieves 3%-12% higher accuracy with the same resource cost and is 8-17x faster with comparable accuracy.
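As a rough illustration of adapting the configuration from past results rather than from profiling, the sketch below estimates object speed from two consecutive detection results and maps it to a frame rate and resolution. The abstract describes a trained selection model; the fixed threshold table `CONFIGS`, its values, and the function names here are hypothetical stand-ins, not the authors' method.

```python
import math

# Hypothetical table: (speed upper bound in pixels/frame, fps, frame height).
# The values are illustrative assumptions, not taken from the paper.
CONFIGS = [
    (2.0, 5, 480),
    (10.0, 15, 720),
    (math.inf, 30, 1080),
]


def mean_speed(prev_centers, curr_centers):
    """Average displacement of objects present in both analytics results.

    Each argument maps an object ID to its (x, y) bounding-box center.
    """
    common = prev_centers.keys() & curr_centers.keys()
    if not common:
        return 0.0  # nothing tracked across both results
    total = sum(math.dist(prev_centers[k], curr_centers[k]) for k in common)
    return total / len(common)


def pick_config(speed):
    """Slow objects get a cheaper configuration, fast objects a richer one."""
    for bound, fps, height in CONFIGS:
        if speed <= bound:
            return fps, height
    return CONFIGS[-1][1:]


prev = {1: (0.0, 0.0), 2: (10.0, 10.0)}
curr = {1: (3.0, 4.0), 2: (10.0, 10.0)}
speed = mean_speed(prev, curr)
print(speed, pick_config(speed))
```

A learned model would replace `pick_config`; the threshold rule only shows the direction of the adaptation, with lower cost when objects move slowly and higher fidelity when they move quickly.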

