

This content will become publicly available on November 17, 2025

Title: R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding
Award ID(s):
2239688
PAR ID:
10575135
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
Springer, Cham
Date Published:
ISSN:
0302-9743
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Querying video data has become increasingly popular and useful. Video queries can be complex, ranging from retrieval tasks (“find me the top videos that have … ”), to analytics (“how many videos contained object X per day?”), to excerpting tasks (“highlight and zoom into scenes with object X near object Y”), or combinations thereof. Results for video queries are still typically shown as either relational data or a primitive collection of clickable thumbnails on a web page. Presenting query results in this form is an impedance mismatch with the video medium: they are cumbersome to skim through and are in a different modality and information density compared to the source data. We describe V2V, a system to efficiently synthesize video results for video queries. V2V returns a fully-edited video, allowing the user to consume results in the same manner as the source videos. A key challenge is that synthesizing video results from a collection of videos is computationally intensive, especially within interactive query response times. To address this, V2V features a grammar to express video transformations in a declarative manner and a heuristic optimizer that improves the efficiency of V2V processing in a manner similar to how databases execute relational queries. Experiments show that our V2V optimizer enables video synthesis to run 3x faster. 
  2. As video traffic continues to dominate the Internet, interest in near-second low-latency streaming has increased. Existing low-latency streaming platforms rely on using tens of seconds of video in the buffer to offer a seamless experience. Striving for near-second latency requires the receiver to make quick decisions regarding the download bitrate and the playback speed. To cope with the challenges, we design a new adaptive bitrate (ABR) scheme, Stallion, for STAndard Low-LAtency vIdeo cONtrol. Stallion uses a sliding window to measure the mean and standard deviation of both the bandwidth and latency. We evaluate Stallion and compare it to the standard DASH DYNAMIC algorithm over a variety of networking conditions. Stallion shows 1.8x increase in bitrate, and 4.3x reduction in the number of stalls. 
  3. Live video (LV) communication tools (e.g., Zoom) have the potential to provide survey researchers with many of the benefits of in-person interviewing, while also greatly reducing data collection costs, given that interviewers do not need to travel and make in-person visits to sampled households. The COVID-19 pandemic has exposed the vulnerability of in-person data collection to public health crises, forcing survey researchers to explore remote data collection modes—such as LV interviewing—that seem likely to yield high-quality data without in-person interaction. Given the potential benefits of these technologies, the operational and methodological aspects of video interviewing have started to receive research attention from survey methodologists. Although it is remote, video interviewing still involves respondent–interviewer interaction that introduces the possibility of interviewer effects. No research to date has evaluated this potential threat to the quality of the data collected in video interviews. This research note presents an evaluation of interviewer effects in a recent experimental study of alternative approaches to video interviewing including both LV interviewing and the use of prerecorded videos of the same interviewers asking questions embedded in a web survey (“prerecorded video” interviewing). We find little evidence of significant interviewer effects when using these two approaches, which is a promising result. We also find that when interviewer effects were present, they tended to be slightly larger in the LV approach as would be expected in light of its being an interactive approach. We conclude with a discussion of the implications of these findings for future research using video interviewing. 