Internet traffic load is not uniformly distributed through the day; it is significantly higher during peak periods, and the network is comparatively idle during off-peak periods. In this context, we present CacheFlix, a time-shifted edge-caching solution that prefetches Netflix content during off-peak periods of network connectivity. We focus on Netflix because it accounts for the largest share of global Internet traffic of any single application. We analyze a real-world dataset of Netflix viewing activity that we collected from 1060 users, spanning a 1-year period and comprising over 2.2 million Netflix TV shows and documentary series; we restrict the scope of our study to Netflix series, which account for 65% of a typical user's Netflix load in terms of bytes fetched. We present insights on users' viewing behavior, and develop an accurate and efficient prediction algorithm using LSTM networks that caches episodes of Netflix series on storage-constrained edge nodes, based on the user's past viewing activity. We evaluate CacheFlix on the collected dataset over various cache eviction policies, and find that CacheFlix is able to shift 70% of Netflix series traffic to off-peak hours.
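The idea of time-shifted prefetching with an eviction policy can be illustrated with a minimal sketch. This is not the CacheFlix implementation: the abstract describes an LSTM-based predictor, whereas the sketch below substitutes a trivial "next episode of the same series" rule, uses LRU as one example eviction policy, and measures cache capacity in episodes rather than bytes. All names (`EpisodeCache`, `predict_next`, `shifted_fraction`) are hypothetical.

```python
from collections import OrderedDict


class EpisodeCache:
    """Fixed-capacity LRU cache of (series, episode) identifiers."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def prefetch(self, episode):
        # Insert (or refresh) the episode, then evict least recently used entries.
        self._items[episode] = True
        self._items.move_to_end(episode)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def hit(self, episode):
        # A hit also marks the episode as most recently used.
        if episode in self._items:
            self._items.move_to_end(episode)
            return True
        return False


def predict_next(last_watched, k=1):
    """Placeholder predictor (assumption): the next k episodes of the same series."""
    series, number = last_watched
    return [(series, number + i) for i in range(1, k + 1)]


def shifted_fraction(days, capacity, k=1):
    """Fraction of views served from the cache.

    `days` is a list of per-day lists of (series, episode) views. Before each
    day (the off-peak window), episodes predicted from the previous day's
    views are prefetched.
    """
    cache = EpisodeCache(capacity)
    hits = total = 0
    previous = []
    for watched in days:
        for item in previous:
            for candidate in predict_next(item, k):
                cache.prefetch(candidate)
        for item in watched:
            total += 1
            hits += cache.hit(item)
        previous = watched
    return hits / total if total else 0.0


if __name__ == "__main__":
    views = [[("A", 1)], [("A", 2)], [("A", 3), ("B", 1)]]
    print(shifted_fraction(views, capacity=2))
```

In this toy trace, a view counts as "shifted" when the episode was already in the cache before the day began; a real evaluation would weight views by bytes and compare several eviction policies, as the abstract describes.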
A Real-world Dataset of Netflix Videos and User Watch-Behavior: Analysis and Insights
Netflix is the most popular video streaming site, contributing nearly a quarter of global video traffic. Given the dominance of Netflix in Internet traffic, understanding how individual users consume content on Netflix is of interest not only to the research community, but also to network operators, content creators and providers, users, and advertisers. In this context, we collect Netflix viewing activity from 1060 users spanning a 1-year period and consisting of over 1.7 million episodes and movies. We group the users based on their activity level, and provide key insights pertaining to users’ watch patterns, watch-session length, preferences, predictability, and watch-behavior continuation tendencies. We also implement and evaluate classifiers that predict a user’s engagement in a series based on their past behavioral patterns.
- Award ID(s): 1813242
- PAR ID: 10323133
- Date Published:
- Journal Name: IEEE International Conference on Communications
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Smart DNS (SDNS) services advertise access to geofenced content (typically, video streaming sites such as Netflix or Hulu) that is normally inaccessible unless the client is within a prescribed geographic region. SDNS is simple to use and involves no software installation. Instead, it requires only that users modify their DNS settings to point to an SDNS resolver. The SDNS resolver “smartly” identifies geofenced domains and, in lieu of their proper DNS resolutions, returns IP addresses of proxy servers located within the geofence. These servers then transparently proxy traffic between the users and their intended destinations, allowing for the bypass of these geographic restrictions. This paper presents the first academic study of SDNS services. We identify a number of serious and pervasive privacy vulnerabilities that expose information about the users of these systems. These include architectural weaknesses that enable content providers to identify which requesting clients use SDNS. Worse, we identify flaws in the design of some SDNS services that allow any arbitrary third party to enumerate these services’ users (by IP address), even if said users are currently offline. We present mitigation strategies to these attacks that have been adopted by at least one SDNS provider in response to our findings.
- Augmented/Virtual reality and video-based media play a vital role in the digital learning revolution to train novices in spatial tasks. However, creating content for these different media requires expertise in several fields. We present EditAR, a unified authoring and editing environment to create content for AR, VR, and video based on a single demonstration. EditAR captures the user’s interaction within an environment and creates a digital twin, enabling users without programming backgrounds to develop content. We conducted formative interviews with both subject and media experts to design the system. The prototype was developed and reviewed by experts. We also performed a user study comparing traditional video creation with 2D video creation from 3D recordings, via a 3D editor, which uses freehand interaction for in-headset editing. Users took 5 times less time to record instructions and preferred EditAR, along with giving significantly higher usability scores.
- YouTube is the most popular video sharing platform, with more than 2 billion active users and 1 billion hours of video content watched daily. The dominance of YouTube has had a big impact on the performance of Internet protocols, algorithms, and systems. Understanding the interaction of users with YouTube is thus of much interest to the research community. In this context, we collect YouTube watch history data from 243 users spanning a 1.5-year period. The dataset comprises a total of 1.8 million videos. We use the dataset to analyze and present key insights about user-level usage behavior. We also show that our analysis can be used by researchers to tackle a myriad of problems in the general domains of networking and communication. We present baseline characteristics and also substantiated directions to solve a few representative problems related to local caching techniques, prefetching strategies, the performance of YouTube's recommendation engine, the variability of users' video preferences, and application-specific load provisioning.
- Streaming video algorithms dynamically select between different versions of a video to deliver the highest quality version that can be viewed without buffering over the client’s connection. To improve the quality for viewers, the backing video service can generate more and/or better versions, but at a significant computational overhead. Processing all videos uploaded to Facebook in the most intensive way would require a prohibitively large cluster. Facebook’s video popularity distribution is highly skewed, however, with analysis on sampled videos showing 1% of them accounting for 83% of the total watch time by users. Thus, if we can predict the future popularity of videos, we can focus the intensive processing on those videos that improve the quality of the most watch time. To address this challenge, we designed CHESS, the first popularity prediction algorithm that is both scalable and accurate. CHESS is scalable because, unlike the state-of-the-art approaches, it requires only constant space per video, enabling it to handle Facebook’s video workload. CHESS is accurate because it delivers superior predictions using a combination of historical access patterns with social signals in a unified online learning framework. We have built a video prediction service, CHESSVPS, using our new algorithm that can handle Facebook’s workload with only four machines. We find that re-encoding popular videos predicted by CHESSVPS enables a higher percentage of total user watch time to benefit from intensive encoding, with less overhead than a recent production heuristic, e.g., 80% of watch time with one-third as much overhead.
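To make "constant space per video" concrete, the sketch below shows one generic way to summarize an access history in a fixed amount of state: an exponentially decayed counter that stores only a running value and the time of the last update. The abstract does not specify CHESS's internals, so this is an illustration of the constant-space idea under that assumption, not the CHESS algorithm; the class name `DecayedCounter` and the `half_life` parameter are hypothetical.

```python
import math


class DecayedCounter:
    """Exponentially decayed access counter using O(1) state per item."""

    def __init__(self, half_life):
        # Decay rate chosen so a contribution halves after `half_life` time units.
        self.rate = math.log(2) / half_life
        self.value = 0.0
        self.last_time = 0.0

    def add(self, t, weight=1.0):
        # Decay the stored value forward to time t, then add the new access.
        self.value = self.read(t) + weight
        self.last_time = t

    def read(self, t):
        # Current score at time t (t is assumed not to precede the last update).
        return self.value * math.exp(-self.rate * (t - self.last_time))


if __name__ == "__main__":
    counter = DecayedCounter(half_life=10.0)
    counter.add(0.0)
    counter.add(10.0)
    print(counter.read(20.0))
```

Because each update folds the entire past into a single number, memory per video stays constant regardless of how many accesses occur; several such counters with different half-lives could serve as features for a downstream predictor.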

