With the growing deployment of cameras and rapid advances in computer vision, large-scale live video analytics is becoming feasible. However, analyzing video is compute-intensive, and live video analytics must be performed in real time. In this paper, we design an edge server system for live video analytics. We propose to perform configuration adaptation without profiling videos online, selecting configurations with a prediction model based on object movement features. In addition, we reduce latency through resource orchestration on the video analytics servers. The key idea of resource orchestration is to batch inference tasks that use the same CNN model and to schedule tasks based on a priority value that estimates their impact on total latency. We evaluate our system with two video analytics applications, road traffic monitoring and pose detection. The experimental results show that our profiling-free adaptation reduces the workload by 80% compared with state-of-the-art adaptation without lowering accuracy, and the average serving latency is reduced by up to 95% compared with profiling-based adaptation.
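To make the batching-and-priority idea concrete, here is a minimal Python sketch. The class names, the priority formula, and the deadline-based slack term are illustrative assumptions, not the system's actual scheduler: it groups queued tasks by CNN model and, at each step, runs one batched forward pass for the group whose estimated contribution to total latency is largest.

```python
import time
from collections import defaultdict

class InferenceTask:
    """A single frame-inference request bound to a specific CNN model."""
    def __init__(self, model_name, frame, deadline):
        self.model_name = model_name
        self.frame = frame
        self.deadline = deadline          # absolute time by which the result is needed
        self.enqueue_time = time.time()

class BatchScheduler:
    """Groups queued tasks by model and serves the group whose delay
    contributes most to total latency (illustrative priority rule)."""
    def __init__(self, models, max_batch=8):
        self.models = models              # {model_name: callable running batched inference}
        self.max_batch = max_batch
        self.queues = defaultdict(list)

    def submit(self, task):
        self.queues[task.model_name].append(task)

    def _priority(self, tasks):
        # Assumed priority: accumulated queueing delay, weighted by how
        # little slack remains before the batch's earliest deadline.
        now = time.time()
        waiting = sum(now - t.enqueue_time for t in tasks)
        slack = min(t.deadline - now for t in tasks)
        return waiting / max(slack, 1e-3)

    def step(self):
        """One scheduling decision: pick and execute the highest-priority batch."""
        candidates = {m: q[:self.max_batch] for m, q in self.queues.items() if q}
        if not candidates:
            return None
        model_name, batch = max(candidates.items(), key=lambda kv: self._priority(kv[1]))
        del self.queues[model_name][:len(batch)]
        frames = [t.frame for t in batch]
        return self.models[model_name](frames)   # one batched forward pass
```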
Popularity Prediction of Facebook Videos for Higher Quality Streaming
Streaming video algorithms dynamically select between different versions of a video to deliver the highest quality version that can be viewed without buffering over the client’s connection. To improve the quality for viewers, the backing video service can generate more and/or better versions, but at a significant computational overhead. Processing all videos uploaded to Facebook in the most intensive way would require a prohibitively large cluster. Facebook’s video popularity distribution is highly skewed, however, with analysis on sampled videos showing 1% of them accounting for 83% of the total watch time by users. Thus, if we can predict the future popularity of videos, we can focus the intensive processing on those videos that improve the quality of the most watch time. To address this challenge, we designed CHESS, the first popularity prediction algorithm that is both scalable and accurate. CHESS is scalable because, unlike the state-of-the-art approaches, it requires only constant space per video, enabling it to handle Facebook’s video workload. CHESS is accurate because it delivers superior predictions using a combination of historical access patterns with social signals in a unified online learning framework. We have built a video prediction service, CHESSVPS, using our new algorithm that can handle Facebook’s workload with only four machines. We find that re-encoding popular videos predicted by CHESSVPS enables a higher percentage of total user watch time to benefit from intensive encoding, with less overhead than a recent production heuristic, e.g., 80% of watch time with one-third as much overhead.
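For intuition about what "constant space per video" can look like, the following is a toy sketch, not the CHESS algorithm itself: each video keeps a fixed number of exponentially decayed watch-time counters plus a social-signal counter, and an online linear model trained by SGD (a stand-in for the unified online learning framework) maps those features to predicted future watch time. The class name, half-lives, and learning rate are illustrative assumptions.

```python
import math
import time

class ConstantSpacePredictor:
    """Fixed-size per-video state: decayed watch-time counters at a few
    time scales plus a share counter, mapped to a prediction by an
    online linear model (all parameters are illustrative)."""
    HALF_LIVES = (3600.0, 86400.0, 7 * 86400.0)   # 1 hour, 1 day, 1 week

    def __init__(self):
        self.counters = [0.0] * len(self.HALF_LIVES)
        self.social = 0.0                          # cumulative share count
        self.last_update = time.time()
        self.weights = [0.0] * (len(self.HALF_LIVES) + 1)
        self.lr = 1e-3

    def _decay(self, now):
        dt = now - self.last_update
        for i, h in enumerate(self.HALF_LIVES):
            self.counters[i] *= math.exp(-math.log(2) * dt / h)
        self.last_update = now

    def record_watch(self, seconds, now=None):
        self._decay(now or time.time())
        for i in range(len(self.counters)):
            self.counters[i] += seconds

    def record_share(self, now=None):
        self._decay(now or time.time())
        self.social += 1.0

    def features(self):
        return self.counters + [self.social]

    def predict(self):
        return sum(w * x for w, x in zip(self.weights, self.features()))

    def update(self, observed_future_watch):
        """Online SGD step once the true future watch time is observed."""
        err = self.predict() - observed_future_watch
        for i, x in enumerate(self.features()):
            self.weights[i] -= self.lr * err * x
```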
- Award ID(s): 1553579
- PAR ID: 10048563
- Date Published:
- Journal Name: USENIX Annual Technical Conference
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
As a part of road safety initiatives, surrogate road safety approaches have gained popularity due to the rapid advancement of video collection and processing technologies. This paper presents an end-to-end software pipeline for processing traffic videos and running a safety analysis based on surrogate safety measures. We developed algorithms and software to determine trajectory movement and phases that, when combined with signal timing data, enable us to perform accurate event detection and categorization in terms of the type of conflict for both pedestrian-vehicle and vehicle-vehicle interactions. Using this information, we introduce a new surrogate safety measure, “severe event,” which is quantified by multiple existing metrics such as time-to-collision (TTC) and post-encroachment time (PET) as recorded in the event, deceleration, and speed. We present an efficient multistage event filtering approach followed by a multi-attribute decision tree algorithm that prunes the extensive set of conflicting interactions to a robust set of severe events. The above pipeline was used to process traffic videos from several intersections in multiple cities to measure and compare pedestrian and vehicle safety. Detailed experimental results are presented to demonstrate the effectiveness of this pipeline.
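The sketch below illustrates the flavor of such a multistage filter in Python; the thresholds, field names, and the two-stage rule (proximity via TTC/PET, then severity via deceleration and speed) are hypothetical placeholders rather than the paper's calibrated criteria.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    """One road-user conflict candidate extracted from the trajectories."""
    ttc: float            # minimum time-to-collision observed (s)
    pet: float            # post-encroachment time (s)
    deceleration: float   # maximum deceleration of either road user (m/s^2)
    speed: float          # speed at the conflict point (m/s)

def is_severe(event, ttc_max=1.5, pet_max=2.0, decel_min=3.0, speed_min=5.0):
    """Two-stage filter with illustrative thresholds: an interaction must
    first pass a proximity test (TTC or PET), then a severity test based
    on evasive braking or speed."""
    proximate = event.ttc <= ttc_max or event.pet <= pet_max
    if not proximate:
        return False
    return event.deceleration >= decel_min or event.speed >= speed_min

# Example: prune a candidate set down to the severe events.
candidates = [
    Interaction(ttc=1.2, pet=3.5, deceleration=4.1, speed=8.0),
    Interaction(ttc=4.0, pet=5.0, deceleration=1.0, speed=6.0),
]
severe_events = [e for e in candidates if is_severe(e)]
```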
Accessing high-quality video content can be challenging due to insufficient and unstable network bandwidth. Recent advances in neural enhancement have shown promising results in improving the quality of degraded videos through deep learning. Neural-Enhanced Streaming (NES) incorporates this new approach into video streaming, allowing users to download low-quality video segments and then enhance them to obtain high-quality content without disrupting playback of the video stream. We introduce BONES, an NES control algorithm that jointly manages the network and computational resources to maximize the quality of experience (QoE) of the user. BONES formulates NES as a Lyapunov optimization problem and solves it in an online manner with near-optimal performance, making it the first NES algorithm to provide a theoretical performance guarantee. Comprehensive experimental results indicate that BONES increases QoE by 5% to 20% over state-of-the-art algorithms with minimal overhead. Our code is available at https://github.com/UMass-LIDS/bones.
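For intuition, here is a hedged sketch of the kind of per-segment drift-plus-penalty decision such a Lyapunov formulation could make; the cost expression, virtual-queue updates, and dictionary fields are illustrative assumptions, not the exact BONES control rule.

```python
def choose_action(download_options, enhance_options, q_net, q_cmp, V=1.0):
    """One drift-plus-penalty decision (illustrative): pick the
    (bitrate, enhancement) pair minimizing
        -V * quality + q_net * download_time + q_cmp * compute_time
    where q_net and q_cmp are virtual queues tracking how far the network
    and the enhancement backlog are from their stability constraints."""
    best, best_cost = None, float("inf")
    for d in download_options:          # each d: {'quality', 'download_time'}
        for e in enhance_options:       # each e: {'quality_gain', 'compute_time'}
            quality = d["quality"] + e["quality_gain"]
            cost = -V * quality + q_net * d["download_time"] + q_cmp * e["compute_time"]
            if cost < best_cost:
                best, best_cost = (d, e), cost
    return best

def update_queues(q_net, q_cmp, d, e, segment_duration):
    """Virtual-queue update: backlog grows with the chosen action's cost and
    drains by the real time that elapses while the segment plays."""
    q_net = max(q_net + d["download_time"] - segment_duration, 0.0)
    q_cmp = max(q_cmp + e["compute_time"] - segment_duration, 0.0)
    return q_net, q_cmp
```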
Growing interest in “flipped” classrooms has made video lessons an increasingly prominent component of post-secondary mathematics curricula. This format, where students watch videos outside of class, can be leveraged to create a more active learning environment during class. Thus, for very challenging but essential classes in STEM, like calculus, the use of video lessons can have a positive impact on student success. However, relatively little is known about how students watch and learn from calculus instructional videos. This research generates knowledge about how students engage with, make sense of, and learn from calculus instructional videos.
The need for mobile applications and mobile programming is increasing due to the continuous rise in the pervasiveness of mobile devices. Developers often refer to video programming tutorials to learn more about mobile programming topics. To find the right video to watch, developers typically skim over several videos, looking at their title, description, and video content in order to determine if they are relevant to their information needs. Unfortunately, the title and description do not always provide an accurate overview, and skimming over videos is time-consuming and can lead to missing important information. We propose a novel approach that locates and extracts the GUI screens showcased in a video tutorial, then selects and displays the most representative ones to provide a GUI-focused overview of the video. We believe this overview can be used by developers as an additional source of information for determining if a video contains the information they need. To evaluate our approach, we performed an empirical study on iOS and Android programming screencasts which investigates the accuracy of our automated GUI extraction. The results reveal that our approach can detect and extract GUI screens with an accuracy of 94%.
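As a rough illustration of how candidate GUI screens might be sampled from a screencast, the snippet below uses OpenCV frame differencing to keep a frame whenever the on-screen content changes substantially; the sampling rate, thresholds, and differencing heuristic are our own simplification, not the paper's detection and extraction approach.

```python
import cv2
import numpy as np

def extract_gui_screens(video_path, sample_every=30, change_thresh=12.0):
    """Illustrative frame-sampling sketch: sample frames and keep one
    whenever the mean pixel difference from the previously kept frame
    exceeds a threshold, i.e., the displayed GUI likely changed."""
    cap = cv2.VideoCapture(video_path)
    kept, last_gray, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every == 0:
            gray = cv2.cvtColor(cv2.resize(frame, (320, 180)), cv2.COLOR_BGR2GRAY)
            if last_gray is None or np.mean(cv2.absdiff(gray, last_gray)) > change_thresh:
                kept.append(frame)        # candidate representative GUI screen
                last_gray = gray
        idx += 1
    cap.release()
    return kept
```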

