Video communication has been rapidly increasing over the past decade, with YouTube providing a medium where users can post, discover, share, and react to videos. There has also been an increase in the number of videos citing research articles, especially since it has become relatively commonplace for academic conferences to require video submissions. However, the relationship between research articles and YouTube videos is not clear, and the purpose of the present paper is to address this issue. We created new datasets using YouTube videos and mentions of research articles on various online platforms. We found that most of the articles cited in the videos are related to medicine and biochemistry. We analyzed these datasets through statistical techniques and visualization, and built machine learning models to predict (1) whether a research article is cited in videos, (2) whether a research article cited in a video achieves a level of popularity, and (3) whether a video citing a research article becomes popular. The best models achieved F1 scores between 80% and 94%. According to our results, research articles mentioned in more tweets and news coverage have a higher chance of receiving video citations. We also found that video views are important for predicting citations and increasing research articles’ popularity and public engagement with science. 
                    
                            
                            A YouTube Dataset with User-level Usage Data: Baseline Characteristics and Key Insights
                        
                    
    
YouTube is the most popular video sharing platform, with more than 2 billion active users and 1 billion hours of video content watched daily. The dominance of YouTube has had a big impact on the performance of Internet protocols, algorithms, and systems. Understanding how users interact with YouTube is thus of much interest to the research community. In this context, we collect YouTube watch history data from 243 users spanning a 1.5-year period. The dataset comprises a total of 1.8 million videos. We use the dataset to analyze and present key insights about user-level usage behavior. We also show that our analysis can be used by researchers to tackle a myriad of problems in the general domains of networking and communication. We present baseline characteristics as well as substantiated directions for solving a few representative problems related to local caching techniques, prefetching strategies, the performance of YouTube's recommendation engine, the variability of users' video preferences, and application-specific load provisioning.
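As a rough illustration of the local caching direction mentioned in the abstract, the sketch below replays one user's watch history against a least-recently-used (LRU) cache and reports the hit ratio. The LRU policy, the function name `lru_hit_ratio`, and the toy video IDs are assumptions made for illustration; the abstract does not state which caching policy or data layout the authors evaluated.

```python
from collections import OrderedDict

def lru_hit_ratio(watch_sequence, capacity):
    """Replay a sequence of video IDs against an LRU cache of `capacity`
    entries and return the fraction of requests served from the cache.

    Hypothetical helper: the policy and input format are assumptions,
    not the method used in the paper.
    """
    if not watch_sequence:
        return 0.0
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for video_id in watch_sequence:
        if video_id in cache:
            hits += 1
            cache.move_to_end(video_id)      # mark as most recently used
        else:
            cache[video_id] = True
            if len(cache) > capacity:
                cache.popitem(last=False)    # evict least recently used
    return hits / len(watch_sequence)

# Toy watch history (made-up IDs), cache holding two videos.
history = ["a", "b", "a", "c", "b", "a"]
ratio = lru_hit_ratio(history, capacity=2)
```

Sweeping `capacity` over a real per-user watch history would show how much repeat viewing a local cache of a given size could absorb.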
        
    
- Award ID(s): 1813242
- PAR ID: 10322913
- Date Published:
- Journal Name: IEEE International Conference on Communications
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- We consider a novel application of inverse reinforcement learning with behavioral economics constraints to model, learn and predict the commenting behavior of YouTube viewers. Each group of users is modeled as a rationally inattentive Bayesian agent which solves a contextual bandit problem. Our methodology integrates three key components. First, to identify distinct commenting patterns, we use deep embedded clustering to estimate framing information (essential extrinsic features) that clusters users into distinct groups. Second, we present an inverse reinforcement learning algorithm that uses Bayesian revealed preferences to test for rationality: does there exist a utility function that rationalizes the given data, and if yes, can it be used to predict commenting behavior? Finally, we impose behavioral economics constraints stemming from rational inattention to characterize the attention span of groups of users. The test imposes a Rényi mutual information cost constraint which impacts how the agent can select attention strategies to maximize their expected utility. After a careful analysis of a massive YouTube dataset, our surprising result is that in most YouTube user groups, the commenting behavior is consistent with optimizing a Bayesian utility with rationally inattentive constraints. The paper also highlights how the rational inattention model can accurately predict commenting behavior. The massive YouTube dataset and analysis used in this paper are available on GitHub and completely reproducible.
- Faculty often utilize homework problems as a means to help students practice problem solving. Recently, with textbook solutions manuals being freely available online, students are prone to copying/cheating, which can severely limit improvements in problem solving. One hypothesis is that YouTube problems could serve as alternatives to textbook problems to significantly reduce cheating and promote better problem solving. YouTube problems are student-written problems that were inspired by events in a video publicly available online. While our previous studies have showcased positive attitudes related to engineering, high engagement, and rigor of the YouTube problems, the current study examines a subset of problems related to one major course topic, namely vapor-liquid equilibrium. The cohorts include engineering students from a public university who were assigned homework problems as part of a material and energy balance course. Two constructs were explored: problem solving and perception of problem difficulty. The study adopted an established and validated rubric to quantify performance in relevant stages of problem solving, including problem identification, representation, organization, calculation, solution completion, and solution accuracy. Because problem solving can be influenced by perception of problem difficulty, the widely used NASA Task Load Index was adopted to measure problem rigor. This paper will compare textbook and YouTube problems with respect to overall problem-solving ability as well as in each stage of problem solving. Furthermore, we will investigate whether disparities exist in students’ perceptions when solving vapor-liquid equilibrium problems.
- As video traffic dominates the Internet, it is important for operators to detect video Quality of Experience (QoE) in order to ensure adequate support for video traffic. With wide deployment of end-to-end encryption, traditional deep packet inspection based traffic monitoring approaches are becoming ineffective. This poses a challenge for network operators to monitor user QoE and improve upon their experience. To resolve this issue, we develop and present a system for REal-time QUality of experience metric detection for Encrypted Traffic, Requet. Requet uses a detection algorithm we develop to identify video and audio chunks from the IP headers of encrypted traffic. Features extracted from the chunk statistics are used as input to a Machine Learning (ML) algorithm to predict QoE metrics, specifically, buffer warning (low buffer, high buffer), video state (buffer increase, buffer decay, steady, stall), and video resolution. We collect a large YouTube dataset consisting of diverse video assets delivered over various WiFi network conditions to evaluate the performance. We compare Requet with a baseline system based on previous work and show that Requet outperforms the baseline system in accuracy of predicting buffer low warning, video state, and video resolution by 1.12×, 1.53×, and 3.14×, respectively.
- Problem solving is a vital skill required to be successful in many engineering industries. One way for students to practice problem solving is through solving homework problems. However, solutions manuals for textbook problems are usually available online, and students can easily default to copying from the solutions manual. To address the solutions manual dilemma and promote better problem-solving ability, this study utilizes novel homework problems that integrate a video component as an alternative to text-only textbook problems. Building upon research showing visuals promote better learning, YouTube videos are reverse engineered by students to create new homework problems. Previous studies have catalogued student-written problems in a material and energy balance course, which are called YouTube problems. In this study, textbook homework problems were replaced with student-written YouTube problems. We additionally focused on examining learning attitudes after students solve YouTube problems. Data collection includes attitudinal survey responses using a validated instrument called CLASS (Colorado Learning Attitudes about Science Survey). Students completed the survey at the beginning and end of the course. Analysis compared gains in attitudes for participants in the treatment groups. Mean overall attitude of participants undergoing the YouTube intervention improved by a normalized gain factor of 0.15 with a small effect size (Hedges' g = 0.35). Improvement was most prominent in attitudes towards personal application and relation to real-world connection, with a normalized gain of 0.49 and a small effect size (Hedges' g = 0.38).
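The last abstract above reports a normalized gain and a Hedges' g effect size. As a minimal sketch of how such quantities are commonly computed, the code below uses the Hake-style normalized gain on percentage scores and the independent-groups form of Hedges' g with the usual small-sample correction. Both choices, the function names, and the example numbers are assumptions for illustration; the abstract does not say which exact formulas or sample sizes the authors used.

```python
import math

def normalized_gain(pre_pct, post_pct):
    """Hake-style normalized gain: the share of the possible improvement
    (from pre_pct up to 100%) that was actually achieved.
    Assumed definition; the paper may use a variant."""
    return (post_pct - pre_pct) / (100.0 - pre_pct)

def hedges_g(mean1, mean2, sd1, sd2, n1, n2):
    """Hedges' g for two independent groups: Cohen's d computed with the
    pooled standard deviation, multiplied by the small-sample correction
    1 - 3 / (4 * (n1 + n2) - 9). Assumed form, not confirmed by the source."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    cohens_d = (mean1 - mean2) / pooled_sd
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    return cohens_d * correction

# Made-up numbers, not taken from the study.
gain = normalized_gain(pre_pct=50.0, post_pct=75.0)
effect = hedges_g(mean1=10.0, mean2=8.0, sd1=2.0, sd2=2.0, n1=10, n2=10)
```

With these made-up inputs, half of the available improvement is achieved, and the correction factor shrinks Cohen's d of 1.0 slightly because the groups are small.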
 An official website of the United States government