Title: Generative replay for multi-class modeling of human activities via sensor data from in-home robotic companion pets
Deploying socially assistive robots (SARs) in the home, such as robotic companion pets, can be useful for tracking behavioral and health-related changes in humans during lifestyle fluctuations over time, like those experienced during COVID-19. However, a fundamental problem in deploying autonomous agents such as SARs in people's everyday living spaces is understanding how users interact with those robots when not observed by researchers. One way to address this is to combine novel modeling methods based on the robot's sensor data with newer types of interaction evaluation, such as ecological momentary assessment (EMA), to recognize behavior modalities. This paper presents such a study of human-specific behavior classification based on data collected through EMA and sensors attached onboard a SAR, which was deployed in user homes. Classification was conducted using generative replay models, which use encoding/decoding methods to emulate how human dreaming is thought to create perturbations of the same experience in order to learn more efficiently from less data. Both multi-class and binary classification were explored for comparison, using several types of generative replay (variational autoencoders, generative adversarial networks, and semi-supervised GANs). The highest-performing binary model reached approximately 79% accuracy (AUC 0.83), though multi-class classification across all modalities attained only 33% accuracy (AUC 0.62, F1 0.25) despite various attempts to improve it. The paper highlights the strengths and weaknesses of using generative replay for modeling real-world human-robot interaction and suggests a number of research paths for future improvement.
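The generative-replay idea above can be illustrated with a toy sketch: a simple parametric generator is fit to each class of simulated sensor features and used to "replay" perturbed samples alongside the real data before training a classifier. Everything here is a hedged stand-in: the data is synthetic, a per-class diagonal Gaussian replaces the VAE/GAN generators the paper actually uses, and a nearest-centroid rule replaces its learned classifiers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for two interaction modalities' sensor features
# (hypothetical data; the paper's features come from EMA + onboard sensors).
X0 = rng.normal(-1.0, 1.0, size=(100, 4))
X1 = rng.normal(+1.0, 1.0, size=(100, 4))

def fit_gaussian(X):
    # Per-class "generative model": a diagonal Gaussian (a VAE or GAN
    # would learn a richer generator in the paper's setting).
    return X.mean(axis=0), X.std(axis=0) + 1e-6

def replay(mu, sigma, n):
    # "Dream" perturbed replays of past experience from the generator.
    return rng.normal(mu, sigma, size=(n, mu.shape[0]))

mu0, s0 = fit_gaussian(X0)
mu1, s1 = fit_gaussian(X1)
R0, R1 = replay(mu0, s0, 100), replay(mu1, s1, 100)

# Train a nearest-centroid classifier on real + replayed data.
C0 = np.vstack([X0, R0]).mean(axis=0)
C1 = np.vstack([X1, R1]).mean(axis=0)

def predict(X):
    d0 = np.linalg.norm(X - C0, axis=1)
    d1 = np.linalg.norm(X - C1, axis=1)
    return (d1 < d0).astype(int)

# Held-out simulated data for a binary evaluation.
Xtest = np.vstack([rng.normal(-1, 1, (50, 4)), rng.normal(1, 1, (50, 4))])
ytest = np.array([0] * 50 + [1] * 50)
acc = (predict(Xtest) == ytest).mean()
print(f"binary accuracy on held-out simulated data: {acc:.2f}")
```

The replayed samples let the classifier see more variation than the stored real data alone, which is the core appeal of generative replay when storage or observation time is limited.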
Award ID(s):
1900683
PAR ID:
10540517
Publisher / Repository:
Springer-Verlag
Date Published:
Journal Name:
Intelligent Service Robotics
Volume:
17
Issue:
2
ISSN:
1861-2776
Page Range / eLocation ID:
277-287
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Human mesenchymal stem cells (hMSCs) are multipotent progenitor cells with the potential to differentiate into various cell types, including osteoblasts, chondrocytes, and adipocytes. These cells have been extensively employed in cell-based therapies and regenerative medicine due to their inherent capacity for self-renewal and multipotency. Traditional approaches for assessing hMSC differentiation have relied heavily on labor-intensive techniques, such as RT-PCR, immunostaining, and Western blot, to identify specific biomarkers. However, these methods are not only time-consuming and costly but also require fixation of the cells, resulting in the loss of temporal data. Consequently, there is an emerging need for a more efficient and precise approach to predict hMSC differentiation in live cells, particularly osteogenic and adipogenic differentiation. In response, we developed approaches that combine live-cell imaging with deep learning, specifically convolutional neural networks (CNNs), to classify osteogenic and adipogenic differentiation. Four widely used pre-trained CNN models, VGG 19, Inception V3, ResNet 18, and ResNet 50, were fine-tuned and tested for identifying adipogenically and osteogenically differentiated cells based on changes in cell morphology. We rigorously evaluated the performance of these four models on binary and multi-class classification of differentiated cells at various time intervals, focusing on accuracy, area under the receiver operating characteristic curve (AUC), sensitivity, precision, and F1-score. Among the four models, ResNet 50 proved the most effective, with the highest accuracy (0.9572 binary, 0.9474 multi-class) and AUC (0.9958 binary, 0.9836 multi-class) in both tasks. Although VGG 19 matched the accuracy of ResNet 50 in both tasks, ResNet 50 consistently outperformed it on AUC, underscoring its superior effectiveness in identifying differentiated cells. Overall, our study demonstrates that a CNN approach can predict stem cell fate from morphology changes, which may provide insights for cell-based therapy applications and advance our understanding of regenerative medicine.
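The accuracy and AUC figures reported above can be computed for any classifier's held-out scores; a minimal sketch with made-up labels and scores (not the paper's data) shows both metrics, with AUC computed via its rank (Mann-Whitney) formulation:

```python
import numpy as np

# Hypothetical held-out labels and predicted scores for the positive class
# (e.g. osteogenic); illustrative values, not the paper's model outputs.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.45, 0.7, 0.9, 0.95])

# Accuracy at a 0.5 decision threshold.
acc = ((scores >= 0.5).astype(int) == y).mean()

# AUC as the probability that a randomly chosen positive example scores
# higher than a randomly chosen negative one (no ties in this toy data).
pos, neg = scores[y == 1], scores[y == 0]
auc = (pos[:, None] > neg[None, :]).mean()

print(f"accuracy={acc:.3f}  AUC={auc:.3f}")  # accuracy=0.750  AUC=0.875
```

The example also shows why the abstract's VGG 19 vs. ResNet 50 comparison can diverge: accuracy depends on one threshold, while AUC summarizes ranking quality across all thresholds.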
  2. Robots such as unmanned aerial vehicles (UAVs) deployed for search and rescue (SAR) can explore areas where human searchers cannot easily go and gather information on scales that can transform SAR strategy. Multi-UAV teams therefore have the potential to transform SAR by augmenting the capabilities of human teams and providing information that would otherwise be inaccessible. Our research aims to develop new theory and technologies for field deploying autonomous UAVs and managing multi-UAV teams working in concert with multi-human teams for SAR. Specifically, in this paper we summarize our work in progress towards these goals, including: (1) a multi-UAV search path planner that adapts to human behavior; (2) an in-field distributed computing prototype that supports multi-UAV computation and communication; (3) behavioral modeling that fields spatially localized predictions of lost person location; and (4) an interface between human searchers and UAVs that facilitates human-UAV interaction over a wide range of autonomy. 
  3. This paper presents an intensive case study of 10 participants in the US and South Korea interacting with a robotic companion pet in their own homes over several weeks. Participants were tracked every second of every day during that period. The fundamental goal was to determine whether there were significant differences in the types of interactions across those cultural settings, and how those differences affected modeling of the human-robot interactions. We collected a mix of quantitative and qualitative data through sensors onboard the robot, ecological momentary assessment (EMA), and participant interviews. Results showed significant differences in how participants in Korea interacted with the robotic pet relative to participants in the US, which impacted machine learning and deep learning models of the interactions. Moreover, those differences were connected to differences in participant perceptions of the robot, based on the qualitative interviews. The work suggests that it may be necessary to develop culturally specific models and/or sensor suites for human-robot interaction (HRI) in the future, and that simply adapting the same robot's behavior through cultural homophily may be insufficient.
  4. Globerson, A; Mackey, L; Belgrave, D; Fan, A; Paquet, U; Tomczak, J; Zhang, C (Ed.)
    Designing ligand-binding proteins, such as enzymes and biosensors, is essential in bioengineering and protein biology. One critical step in this process is designing protein pockets, the protein interface that binds the ligand. Current approaches to pocket generation often suffer from time-intensive physical computations or template-based methods, as well as compromised generation quality due to overlooked domain knowledge. To tackle these challenges, we propose PocketFlow, a generative model based on flow matching that incorporates protein-ligand interaction priors. During training, PocketFlow learns to model key types of protein-ligand interactions, such as hydrogen bonds. During sampling, PocketFlow leverages multi-granularity guidance (overall binding affinity and interaction geometry constraints) to generate high-affinity, valid pockets. Extensive experiments show that PocketFlow outperforms baselines on multiple benchmarks, e.g., achieving an average improvement of 1.29 in Vina Score and 0.05 in scRMSD. Moreover, modeling interactions makes PocketFlow a generalized generative model across multiple ligand modalities, including small molecules, peptides, and RNA.
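The flow-matching objective PocketFlow builds on can be sketched in one dimension: regress a velocity field onto the conditional target v = x1 - x0 along straight-line interpolants, then integrate the learned ODE to sample. Everything below is an illustrative stand-in: scalar Gaussians with a translation coupling replace PocketFlow's 3D pocket structures, and a linear least-squares model replaces its neural velocity network.

```python
import numpy as np

rng = np.random.default_rng(1)

# 1D flow matching toy problem: transport noise x0 ~ N(0,1) to "data"
# x1 = x0 + 3 (a translation coupling, chosen so the true velocity is
# constant and a linear model can represent it exactly).
n = 5000
x0 = rng.normal(0.0, 1.0, n)
x1 = x0 + 3.0
t = rng.uniform(0.0, 1.0, n)

xt = (1 - t) * x0 + t * x1   # straight-line interpolant between the pair
v_target = x1 - x0           # conditional flow-matching target (= 3 here)

# Regress a linear velocity model v(x, t) = a*x + b*t + c onto the target
# (a neural network plays this role in real flow-matching models).
A = np.stack([xt, t, np.ones(n)], axis=1)
(a, b, c), *_ = np.linalg.lstsq(A, v_target, rcond=None)

# Sample by Euler-integrating dx/dt = v(x, t) from t=0 to t=1.
x = rng.normal(0.0, 1.0, n)
for k in range(100):
    x = x + 0.01 * (a * x + b * (k / 100) + c)

print(f"generated mean ≈ {x.mean():.2f} (target 3.0)")
```

PocketFlow's contribution sits on top of this recipe: the velocity model is conditioned on protein-ligand interaction priors, and sampling adds affinity and geometry guidance.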
  5. Generative artificial intelligence has made significant strides, producing text indistinguishable from human prose and remarkably photorealistic images. Automatically measuring how close the generated data distribution is to the target distribution is central to diagnosing existing models and developing better ones. We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images. These scores are statistical summaries of divergence frontiers capturing two types of errors in generative modeling. We explore three approaches to statistically estimate these scores: vector quantization, non-parametric estimation, and classifier-based estimation. We provide statistical bounds for the vector quantization approach. Empirically, we find that the proposed scores paired with a range of f-divergences and statistical estimation methods can quantify the gaps between the distributions of human-written text and those of modern neural language models by correlating with human judgments and identifying known properties of the generated texts. We demonstrate in the vision domain that MAUVE can identify known properties of generated images on par with or better than existing metrics. In conclusion, we present practical recommendations for using MAUVE effectively with language and image modalities.
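The divergence-frontier idea behind MAUVE can be sketched in one dimension: quantize samples from both distributions, trace KL divergences of each against mixtures of the two, and summarize the frontier with an area. Histogram bins below stand in for the k-means vector quantization MAUVE applies to embeddings, and the two Gaussians are illustrative stand-ins for "human" and "model" data; this is a simplified sketch, not the reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins for human vs. model data distributions.
p_samples = rng.normal(0.0, 1.0, 20000)
q_samples = rng.normal(0.5, 1.2, 20000)

# Quantize both sample sets into a shared discrete support.
edges = np.linspace(-6, 6, 31)
P = np.histogram(p_samples, bins=edges)[0] + 1e-9
Q = np.histogram(q_samples, bins=edges)[0] + 1e-9
P, Q = P / P.sum(), Q / Q.sum()

def kl(a, b):
    # Kullback-Leibler divergence between two discrete distributions.
    return float(np.sum(a * np.log(a / b)))

# Divergence frontier: compare each distribution against mixtures
# R = l*P + (1-l)*Q, tracing the two error types as l sweeps (0, 1).
pts = []
for l in np.linspace(0.01, 0.99, 99):
    R = l * P + (1 - l) * Q
    pts.append((np.exp(-kl(Q, R)), np.exp(-kl(P, R))))

# Summarize with the area under the frontier: identical distributions
# push the curve toward the corner (1, 1) and the area toward 1.
pts = sorted(pts)
xs = [0.0] + [p[0] for p in pts]
ys = [1.0] + [p[1] for p in pts]
mauve_like = 0.0
for i in range(1, len(xs)):
    mauve_like += 0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1])
print(f"MAUVE-like similarity score: {mauve_like:.3f}")
```

Sweeping the mixture weight is what lets one scalar capture both error types at once: mass the model places where humans don't, and human mass the model misses.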