Title: MoVer: Motion Verification for Motion Graphics Animations
While large vision-language models can generate motion graphics animations from text prompts, they regularly fail to include all of the spatio-temporal properties described in the prompt. We introduce MoVer, a motion verification DSL based on first-order logic that can check spatio-temporal properties of a motion graphics animation. We identify a general set of such properties that people commonly use to describe animations (e.g., the direction and timing of motions, the relative positioning of objects, etc.). We implement these properties as predicates in MoVer and provide an execution engine that can apply a MoVer program to any input SVG-based motion graphics animation. We then demonstrate how MoVer can be used in an LLM-based synthesis and verification pipeline for iteratively refining motion graphics animations. Given a text prompt, our pipeline synthesizes a motion graphics animation and a corresponding MoVer program. Executing the verification program on the animation yields a report of the predicates that failed, and that report can be automatically fed back to the LLM to iteratively correct the animation. To evaluate our pipeline, we build a synthetic dataset of 5600 text prompts paired with ground-truth MoVer verification programs. We find that while our LLM-based pipeline automatically generates a correct motion graphics animation for 58.8% of the test prompts without any iteration, this number rises to 93.6% with up to 50 correction iterations. Our code and dataset are available at https://mover-dsl.github.io.
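To make the refinement loop concrete, here is a minimal sketch of the synthesize-verify-correct cycle described above. The names used (synthesize_animation, synthesize_mover_program, run_mover, VerificationReport) are illustrative placeholders rather than the published MoVer API; only the overall loop structure and the 50-iteration cap come from the abstract.

```python
from dataclasses import dataclass, field

@dataclass
class VerificationReport:
    failures: list = field(default_factory=list)   # MoVer predicates that failed

    @property
    def all_passed(self) -> bool:
        return not self.failures

def refine_animation(prompt: str, llm, run_mover, max_iterations: int = 50) -> str:
    """Synthesize an animation and iteratively correct it until its MoVer program passes."""
    animation = llm.synthesize_animation(prompt)      # SVG-based animation code
    verifier = llm.synthesize_mover_program(prompt)   # MoVer verification program
    for _ in range(max_iterations):
        report = run_mover(verifier, animation)       # evaluate predicates on the animation
        if report.all_passed:
            break
        # Feed the failed predicates back to the LLM to correct the animation.
        animation = llm.correct_animation(prompt, animation, report.failures)
    return animation
```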
Award ID(s): 2219865
PAR ID: 10656772
Publisher / Repository: ACM SIGGRAPH
Journal Name: ACM Transactions on Graphics
Volume: 44
Issue: 4
ISSN: 0730-0301
Page Range / eLocation ID: 1 to 17
Sponsoring Org: National Science Foundation
More Like this
  1. We introduce a new inverse modeling method to interactively design crowd animations. Few works focus on providing succinct, high-level, and large-scale crowd motion modeling. Our methodology is to read in real or virtual agent trajectory data and automatically infer a set of parameterized crowd motion models. Components of the motion models can then be mixed, matched, and altered, enabling new crowd motions to be produced rapidly. Our results show novel animations using real-world data, using synthetic data, and imitating real-world scenarios. Moreover, by combining our method with our interactive crowd trajectory sketching tool, we can create complex spatio-temporal crowd animations in about a minute.
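As a rough illustration of the mix-and-match idea in the abstract above, the sketch below blends two parameterized motion models. The parameter choices (speed, heading, dispersion) and the linear blend are assumptions made for the example, not the paper's actual model.

```python
from dataclasses import dataclass

@dataclass
class CrowdMotionModel:
    speed: float        # average agent speed
    heading: float      # dominant travel direction, in radians
    dispersion: float   # how widely agents spread around the dominant path

def mix(a: CrowdMotionModel, b: CrowdMotionModel, w: float) -> CrowdMotionModel:
    """Linearly blend two inferred motion models into a new crowd behavior."""
    lerp = lambda x, y: (1.0 - w) * x + w * y
    return CrowdMotionModel(
        speed=lerp(a.speed, b.speed),
        heading=lerp(a.heading, b.heading),
        dispersion=lerp(a.dispersion, b.dispersion),
    )
```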
  2. The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft" prompts, which consist of continuous feature vectors. These can be discovered using powerful optimization methods, but they cannot be easily interpreted, re-used across models, or plugged into a text-based interface. We describe an approach to robustly optimize hard text prompts through efficient gradient-based optimization. Our approach automatically generates hard text-based prompts for both text-to-image and text-to-text applications. In the text-to-image setting, the method creates hard prompts for diffusion models, allowing API users to easily generate, discover, and mix and match image concepts without prior knowledge of how to prompt the model. In the text-to-text setting, we show that hard prompts can be automatically discovered that are effective in tuning LMs for classification.
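A schematic sketch of the gradient-based optimization described above follows (this is not the authors' released code). It assumes a PyTorch embedding table and a task loss, and shows only the core idea: project a soft prompt onto the nearest vocabulary tokens, compute the loss at that hard prompt, and apply the resulting gradient back to the soft prompt.

```python
import torch

def nearest_tokens(soft_prompt: torch.Tensor, embedding_table: torch.Tensor):
    """Project each soft prompt vector onto the closest vocabulary embedding."""
    dists = torch.cdist(soft_prompt, embedding_table)   # (num_tokens, vocab_size)
    token_ids = dists.argmin(dim=-1)
    return token_ids, embedding_table[token_ids]

def optimization_step(soft_prompt, embedding_table, loss_fn, lr=0.1):
    """One projected-gradient step toward a better hard prompt."""
    token_ids, hard_embeds = nearest_tokens(soft_prompt, embedding_table)
    hard_embeds = hard_embeds.detach().requires_grad_(True)
    loss = loss_fn(hard_embeds)            # e.g., CLIP similarity or an LM loss
    loss.backward()
    with torch.no_grad():
        # Apply the gradient measured at the hard (projected) prompt to the soft prompt.
        soft_prompt -= lr * hard_embeds.grad
    return token_ids, loss.item()
```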
  3. Recent work has presented intriguing results examining the knowledge contained in language models (LMs) by having the LM fill in the blanks of prompts such as “Obama is a __ by profession”. These prompts are usually manually created, and quite possibly sub-optimal; another prompt such as “Obama worked as a __” may result in more accurately predicting the correct profession. Because of this, given an inappropriate prompt, we might fail to retrieve facts that the LM does know, and thus any given prompt only provides a lower bound estimate of the knowledge contained in an LM. In this paper, we attempt to more accurately estimate the knowledge contained in LMs by automatically discovering better prompts to use in this querying process. Specifically, we propose mining-based and paraphrasing-based methods to automatically generate high-quality and diverse prompts, as well as ensemble methods to combine answers from different prompts. Extensive experiments on the LAMA benchmark for extracting relational knowledge from LMs demonstrate that our methods can improve accuracy from 31.1% to 39.6%, providing a tighter lower bound on what LMs know. We have released the code and the resulting LM Prompt And Query Archive (LPAQA) at https://github.com/jzbjyb/LPAQA.
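The ensembling idea above can be sketched as follows. The prompt templates and the score_candidates scoring function are placeholders for illustration, not the mined or paraphrased prompts and learned weights from the paper.

```python
from collections import defaultdict

PROMPTS = [
    "[X] is a [Y] by profession.",
    "[X] worked as a [Y].",
    "The profession of [X] is [Y].",
]

def ensemble_predict(subject, score_candidates, weights=None):
    """Combine per-prompt LM scores for each candidate answer [Y]."""
    weights = weights or [1.0 / len(PROMPTS)] * len(PROMPTS)
    totals = defaultdict(float)
    for prompt, w in zip(PROMPTS, weights):
        query = prompt.replace("[X]", subject)
        for candidate, score in score_candidates(query):   # LM's score for each [Y]
            totals[candidate] += w * score
    return max(totals, key=totals.get)
```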
  4. Motion graphics videos are widely used in Web design, digital advertising, animated logos, and film title sequences to capture a viewer's attention. But editing such video is challenging because the video provides a low-level sequence of pixels and frames rather than higher-level structure such as the objects in the video with their corresponding motions and occlusions. We present a motion vectorization pipeline for converting motion graphics video into an SVG motion program that provides such structure. The resulting SVG program can be rendered using any SVG renderer (e.g., most Web browsers) and edited using any SVG editor. We also introduce a program transformation API that facilitates editing of an SVG motion program to create variations that adjust the timing, motions, and/or appearances of objects. We show how the API can be used to create a variety of effects, including retiming object motion to match a music beat, adding motion textures to objects, and collision-preserving appearance changes.
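One of the edits mentioned above, retiming object motion to match a music beat, might look like the sketch below. The Keyframe representation and the snap-to-nearest-beat rule are assumptions for illustration; they are not the paper's program transformation API.

```python
from dataclasses import dataclass

@dataclass
class Keyframe:
    time: float   # seconds
    x: float
    y: float

def retime_to_beats(keyframes: list[Keyframe], beat_times: list[float]) -> list[Keyframe]:
    """Snap each keyframe to the nearest music beat while keeping positions unchanged."""
    return [
        Keyframe(min(beat_times, key=lambda b: abs(b - kf.time)), kf.x, kf.y)
        for kf in keyframes
    ]
```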
  5. Procedural functionality enables visual creators to rapidly edit, explore alternatives, and fine-tune artwork in many domains including illustration, motion graphics, and interactive animation. Symbolic procedural tools, such as textual programming languages, are highly expressive but often limit direct manipulation of concrete artwork, whereas direct manipulation tools support some procedural expression but limit creators to pre-defined behaviors and inputs. Inspired by visions of using geometric input to create procedural relationships, we identify an opportunity to use vector geometry from artwork to specify expressive user-defined procedural functions. We present Drawing Transforms (DTs), a technique that enables the use of any drawing to procedurally transform the stylistic, spatial, and temporal properties of target artwork. We apply DTs in a prototype motion graphics system to author continuous and discrete transformations, modify multiple elements in a composition simultaneously, create animations, and control fine-grained procedural instantiation. We discuss how DTs can unify procedural authoring through direct manipulation across visual media domains.
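In the spirit of the technique above, the sketch below shows one way a drawn curve's geometry could drive a property of target artwork over time. The polyline sampling and the min-max normalization are assumptions made for the example, not the Drawing Transforms implementation.

```python
def curve_to_property(points, t, lo=0.0, hi=1.0):
    """Sample a drawn polyline at normalized time t and map its height to a property value.

    points: (x, y) vertices of the drawn curve, ordered left to right; t in [0, 1].
    """
    index = min(int(t * (len(points) - 1)), len(points) - 2)
    frac = t * (len(points) - 1) - index
    (_, y0), (_, y1) = points[index], points[index + 1]
    y = (1 - frac) * y0 + frac * y1                 # interpolate along the segment
    ys = [p[1] for p in points]
    span = (max(ys) - min(ys)) or 1.0               # avoid division by zero for flat curves
    normalized = (y - min(ys)) / span
    return lo + normalized * (hi - lo)

# Example: drive an object's opacity over one second from a hand-drawn ramp.
ramp = [(0, 0), (50, 30), (100, 90)]
opacity_at_half = curve_to_property(ramp, 0.5, lo=0.0, hi=1.0)
```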