Title: PVL: A Framework for Navigating the Precision-Variety Trade-off in Automated Animation of Smiles
Animating digital characters plays an important role in computer-assisted experiences, from video games to movies to interactive robotics. A critical challenge in the field is to generate animations that accurately reflect the state of the animated characters without looking repetitive or unnatural. In this work, we investigate the problem of procedurally generating a diverse variety of facial animations that express a given semantic quality (e.g., very happy). To that end, we introduce a new learning heuristic called Precision Variety Learning (PVL), which actively identifies and exploits the fundamental trade-off between precision (how accurate the positive labels are) and variety (how diverse the set of positive labels is). We identify conditions under which important theoretical properties can be guaranteed and show good empirical performance in a variety of settings. Lastly, we apply our PVL heuristic to our motivating problem of generating smile animations and perform several user studies to validate the ability of our method to produce a perceptually diverse variety of smiles for different target intensities.
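To make the precision-variety tension concrete, here is a minimal sketch (not the authors' PVL heuristic) showing how tightening the acceptance threshold of a hypothetical learned scorer raises the precision of the accepted animation set while shrinking its diversity; all data and names below are synthetic and illustrative.

```python
# Minimal sketch of the precision-variety trade-off described in the abstract.
# This is NOT the PVL heuristic itself; it only illustrates how a stricter
# acceptance threshold on a learned scorer raises precision (accepted samples
# really express the target quality) while shrinking variety (diversity of the
# accepted set). All data and names are synthetic.
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

rng = np.random.default_rng(0)

# Hypothetical animation parameter vectors: candidates that truly express the
# target quality ("very happy") cluster around a prototype in parameter space.
prototype = np.ones(8)
candidates = 1.5 * rng.normal(size=(500, 8))
dist_to_prototype = np.linalg.norm(candidates - prototype, axis=1)
truly_positive = dist_to_prototype < np.quantile(dist_to_prototype, 0.4)

# Imperfect learned scorer: higher score = more confident "very happy".
scores = 1.0 / (1.0 + dist_to_prototype) + 0.05 * rng.normal(size=500)

def precision_and_variety(threshold):
    accepted = scores >= threshold
    precision = truly_positive[accepted].mean()
    # Variety proxy: mean pairwise distance among accepted candidates.
    dists = euclidean_distances(candidates[accepted])
    variety = dists[np.triu_indices_from(dists, k=1)].mean()
    return precision, variety

for q in (0.0, 0.5, 0.8, 0.95):
    p, v = precision_and_variety(np.quantile(scores, q))
    print(f"accept top {100 * (1 - q):.0f}%  precision={p:.2f}  variety={v:.2f}")
```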
Award ID(s): 1748541, 1526693, 1544887
PAR ID: 10074257
Author(s) / Creator(s):
Date Published:
Journal Name: Proceedings of the ... AAAI Conference on Artificial Intelligence
ISSN: 2374-3468
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1.
    Visually similar characters, or homoglyphs, can be used to perform social engineering attacks or to evade spam and plagiarism detectors. It is thus important to understand an attacker's ability to identify homoglyphs, particularly ones that have not been previously spotted, and leverage them in attacks. We investigate a deep-learning model that uses embedding learning, transfer learning, and augmentation to determine the visual similarity of characters and thereby identify potential homoglyphs. Our approach uniquely takes advantage of weak labels that arise from the fact that most characters are not homoglyphs. Our model drastically outperforms the Normalized Compression Distance approach on pairwise homoglyph identification, achieving an average precision of 0.97. We also present the first attempt at clustering homoglyphs into sets of equivalence classes, which is more efficient than pairwise information for security practitioners who need to quickly look up homoglyphs or normalize confusable string encodings. To measure clustering performance, we propose a metric (mBIOU) that builds on the classic Intersection-Over-Union (IOU) metric. Our clustering method achieves 0.592 mBIOU, compared to 0.430 for the naive baseline. We also use our model to predict over 8,000 previously unknown homoglyphs, and find good early indications that many of these may be true positives. Source code and the list of predicted homoglyphs are available on GitHub: https://github.com/PerryXDeng/weaponizing_unicode.
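The abstract scores predicted homoglyph clusters against ground-truth equivalence classes with mBIOU. The sketch below shows one plausible "mean best IoU" computation in that spirit; the paper's exact definition may differ, and the clusters here are made up.

```python
# Illustrative "mean best IoU" comparison between predicted homoglyph clusters
# and ground-truth equivalence classes, in the spirit of the mBIOU metric named
# above. Not necessarily the paper's exact formulation.
def iou(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def mean_best_iou(predicted, ground_truth):
    # For each ground-truth class, take the best-matching predicted cluster.
    return sum(max(iou(gt, p) for p in predicted) for gt in ground_truth) / len(ground_truth)

ground_truth = [{"O", "0", "О"}, {"l", "1", "I"}]   # true confusable sets (incl. Cyrillic О)
predicted    = [{"O", "0"}, {"l", "1", "I", "|"}]   # a model's predicted clusters
print(mean_best_iou(predicted, ground_truth))       # (2/3 + 3/4) / 2 ≈ 0.708
```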
  2. Verbal labels for math concepts influence multiple aspects of math learning. In this study, we examined the influence of point labels (e.g., .42 as “point four two”), decomposed labels (e.g., “four tenths and two hundredths”), and common-unit labels (e.g., “forty-two hundredths”) on children’s processing and representation of decimal magnitudes. We randomly assigned 162 5th- and 6th-graders to briefly learn decomposed, common-unit, or point labels. Children then completed measures of decimal magnitude processing and representation. We found that the place-value labels (i.e., decomposed and common-unit labels) each showed unique advantages in reducing the whole-number bias, and that common-unit labels also reduced componential processing. No difference was found among the three conditions in the ratio effect, which served as an index of the precision of decimal magnitude representation. These findings add to our understanding of the role of verbal labels in math learning and have important implications for instructional practice.
  3. Narrative planning is the process of generating sequences of actions that form coherent and goal-oriented narratives. Classical implementations of narrative planning rely on heuristic search techniques to offer structured story generation but face challenges with scalability due to large branching factors and deep search requirements. Large Language Models (LLMs), with their extensive training on diverse linguistic datasets, excel in understanding and generating coherent narratives. However, their planning ability lacks the precision and structure needed for effective narrative planning. This paper explores a hybrid approach that uses LLMs as heuristic guides within classical search frameworks for narrative planning. We compare various prompt designs to generate LLM heuristic predictions and evaluate their performance against h+, hmax, and relaxed plan heuristics. Additionally, we analyze the ability of relaxed plans to predict the next action correctly, comparing it to the LLMs’ ability to make the same prediction. Our findings indicate that LLMs rarely exceed the accuracy of classical planning heuristics. 
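As an illustration of the hybrid idea above, the sketch below drops an LLM-derived estimate into ordinary best-first search. `llm_estimate` is a hypothetical placeholder (here just a goal-counting stub) rather than the paper's prompt designs, and the toy narrative domain is invented for the example.

```python
# Sketch of using an LLM score as the heuristic inside classical best-first
# search. `llm_estimate` is a stand-in: a real system would prompt an LLM with
# the narrative state and parse its numeric estimate of remaining actions.
import heapq

def llm_estimate(state, goal):
    # Placeholder heuristic: count goal facts not yet achieved.
    return len(goal - state)

def best_first_search(start, goal, successors):
    frontier = [(llm_estimate(start, goal), 0, start, [])]
    seen, counter = set(), 1
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if goal <= state:
            return plan
        if state in seen:
            continue
        seen.add(state)
        for action, next_state in successors(state):
            heapq.heappush(frontier,
                           (llm_estimate(next_state, goal), counter, next_state, plan + [action]))
            counter += 1
    return None

# Toy narrative domain: states are frozensets of story facts, actions add facts.
actions = {"meet": "hero_met_villain", "steal": "macguffin_stolen", "duel": "conflict_resolved"}
successors = lambda s: [(a, s | {f}) for a, f in actions.items() if f not in s]
print(best_first_search(frozenset(), {"macguffin_stolen", "conflict_resolved"}, successors))
```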
  4. Repetitive DNA (repeats) poses significant challenges for accurate and efficient genome assembly and sequence alignment. This is particularly true for metagenomic data, where genome dynamics such as horizontal gene transfer, gene duplication, and gene loss/gain complicate accurate genome assembly from metagenomic communities. Detecting repeats is a crucial first step in overcoming these challenges. To address this issue, we propose GraSSRep, a novel approach that leverages the structure of the assembly graph via graph neural networks (GNNs) within a self-supervised learning framework to classify DNA sequences into repetitive and non-repetitive categories. Specifically, we frame this problem as a node classification task within a metagenomic assembly graph. In a self-supervised fashion, we rely on a high-precision (but low-recall) heuristic to generate pseudo-labels for a small proportion of the nodes. We then use those pseudo-labels to train a GNN embedding and a random forest classifier to propagate the labels to the remaining nodes. In this way, GraSSRep combines sequencing features with predefined and learned graph features to achieve state-of-the-art performance in repeat detection. We evaluate our method using simulated and synthetic metagenomic datasets. The results on the simulated data highlight GraSSRep's robustness to repeat attributes, demonstrating its effectiveness in handling the complexity of repeated sequences. Additionally, our experiments with synthetic metagenomic datasets reveal that incorporating the graph structure and the GNN enhances detection performance. Finally, in comparative analyses, GraSSRep outperforms existing repeat detection tools in both precision and recall.
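The self-supervised recipe above (high-precision pseudo-labels on a few nodes, then a learned model propagating labels across the graph) can be illustrated with a toy pipeline. The sketch below is not the GraSSRep implementation: it substitutes hand-crafted graph features for the GNN embedding and uses a synthetic graph with a synthetic labeling heuristic.

```python
# Toy version of the self-supervised labeling idea: a high-precision heuristic
# pseudo-labels a few nodes of an "assembly graph", simple graph features stand
# in for learned GNN embeddings, and a random forest propagates labels to the
# remaining nodes. Everything here is synthetic and illustrative.
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic graph: "repeat" nodes tend to have high degree and coverage.
G = nx.barabasi_albert_graph(300, 2, seed=0)
degree = np.array([G.degree(n) for n in G.nodes])
coverage = degree * rng.uniform(0.8, 1.2, len(degree))   # fake read coverage
is_repeat = degree > np.percentile(degree, 80)            # hidden ground truth

# High-precision, low-recall heuristic: only label the most extreme nodes.
pseudo_pos = coverage > np.percentile(coverage, 95)
pseudo_neg = coverage < np.percentile(coverage, 40)
labeled = pseudo_pos | pseudo_neg

# Node features: degree, coverage, and mean neighbor degree (a crude stand-in
# for the structural information a GNN embedding would capture).
nbr_deg = np.array([np.mean([G.degree(m) for m in G[n]]) for n in G.nodes])
X = np.column_stack([degree, coverage, nbr_deg])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[labeled], pseudo_pos[labeled])
pred = clf.predict(X)

print(f"accuracy on unlabeled nodes: {(pred == is_repeat)[~labeled].mean():.2f}")
```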
  5. Furht, Borko; Khoshgoftaar, Taghi (Ed.)
    Acquiring labeled datasets often incurs substantial costs, primarily because expert human intervention is required to produce accurate and reliable class labels. In the modern data landscape, an overwhelming proportion of newly generated data is unlabeled. This is especially evident in domains such as credit card fraud detection, where the data are also highly class imbalanced, which poses further challenges for machine learning and classification. Our research addresses these challenges by extensively evaluating a novel methodology for synthesizing class labels for highly imbalanced credit card fraud data. The methodology uses an autoencoder as its underlying learner to learn from dataset features and produce an error metric for creating new binary class labels, with the aim of generating those labels automatically and with minimal expert input. The synthesized class labels are then used to train supervised classifiers for fraud detection. Our empirical results show that the labels are of high enough quality to produce classifiers that significantly outperform a baseline learner on area under the precision-recall curve (AUPRC). We also present results for varying levels of positive-labeled instances and their effect on classifier performance: AUPRC improves as more instances are labeled positive and belong to the minority class. Our methodology thereby effectively addresses high class imbalance in machine learning by creating new and effective class labels.
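The labeling pipeline described above can be sketched end to end. The code below substitutes a small input-reconstructing MLP for the paper's autoencoder and uses synthetic imbalanced data, so the features, thresholds, and results are illustrative only, not the paper's methodology.

```python
# Sketch of synthesizing class labels from reconstruction error: fit a
# reconstruction model on (mostly normal) transaction features, threshold its
# per-row error to create synthetic binary labels, train a supervised
# classifier on those labels, and score it with AUPRC against the hidden truth.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)

# Highly imbalanced synthetic data: normal rows lie near a low-dimensional
# subspace; the ~2% "fraud" rows do not.
n, d, k = 5000, 10, 3
fraud = rng.random(n) < 0.02
X = rng.normal(size=(n, k)) @ rng.normal(size=(k, d)) + 0.1 * rng.normal(size=(n, d))
X[fraud] = 3.0 * rng.normal(size=(fraud.sum(), d))

# Narrow-bottleneck reconstruction model (stand-in for the autoencoder); rows it
# reconstructs poorly receive the synthetic positive label.
ae = MLPRegressor(hidden_layer_sizes=(k,), activation="identity",
                  max_iter=3000, random_state=0)
ae.fit(X, X)
errors = ((ae.predict(X) - X) ** 2).mean(axis=1)
synthetic_labels = errors > np.quantile(errors, 0.98)

# Supervised classifier trained on the synthesized labels, evaluated with the
# area under the precision-recall curve against the hidden ground truth.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, synthetic_labels)
scores = clf.predict_proba(X)[:, 1]
print("AUPRC vs. ground truth:", round(average_precision_score(fraud, scores), 3))
```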