Designing and implementing human-robot interactions requires numerous skills, from having a rich understanding of social interactions and the capacity to articulate their subtle requirements, to the ability to then program a social robot with the many facets of such a complex interaction. Although designers are best suited to develop and implement these interactions due to their inherent understanding of the context and its requirements, these skills are a barrier to enabling designers to rapidly explore and prototype ideas: it is impractical for designers to also be experts on social interaction behaviors, and the technical challenges associated with programming a social robot are prohibitive. In this work, we introduce Synthé, which allows designers to act out, or bodystorm, multiple demonstrations of an interaction. These demonstrations are automatically captured and translated into prototypes for the design team using program synthesis. We evaluate Synthé in multiple design sessions involving pairs of designers bodystorming interactions and observing the resulting models on a robot. We build on the findings from these sessions to improve the capabilities of Synthé and demonstrate the use of these capabilities in a second design session.
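The abstract does not detail Synthé's synthesis algorithm, so the following is only a minimal illustrative sketch of the general idea it describes: merging several bodystormed demonstration traces into a single interaction prototype. All function, event, and action names here are hypothetical.

```python
# Illustrative sketch only: the abstract does not publish Synthé's synthesis
# procedure. This merely shows the general shape of turning several
# demonstration traces into one transition table; all names are hypothetical.
from collections import defaultdict

def synthesize_prototype(demonstrations):
    """Merge bodystormed traces into an observed-event -> robot-action table.

    Each demonstration is a list of (observed_event, robot_action) pairs.
    """
    transitions = defaultdict(lambda: defaultdict(int))
    for trace in demonstrations:
        for event, action in trace:
            transitions[event][action] += 1
    # Resolve conflicts across demonstrations by majority vote.
    return {event: max(actions, key=actions.get)
            for event, actions in transitions.items()}

demos = [
    [("person_greets", "wave"), ("person_asks_question", "answer")],
    [("person_greets", "wave"), ("person_leaves", "say_goodbye")],
]
print(synthesize_prototype(demos))
```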
Learning Turn-Taking Behavior from Human Demonstrations for Social Human-Robot Interactions
Turn-taking is a fundamental behavior during human interactions, and robots must be capable of turn-taking to interact with humans. Current state-of-the-art approaches to turn-taking focus on developing general models to predict the end of turn (EoT) across all contexts. This demands an all-inclusive verbal and non-verbal behavioral dataset covering all possible contexts of interaction. Before robot deployment, gathering such a dataset may be infeasible and/or impractical. More importantly, a robot needs not only to predict the EoT but also to decide on the best time to take a turn (i.e., start speaking). In this research, we present a learning from demonstration (LfD) system that enables a robot to learn, from demonstrations gathered after it has been deployed, when to take a turn within specific social interaction contexts. The system captures demonstrations of turn-taking during social interactions and uses them to train an LSTM RNN-based model that replicates the turn-taking behavior of the demonstrator. We evaluate the system on teaching the turn-taking behavior of an interviewer in a job interview context. Furthermore, we investigate the efficacy of verbal, prosodic, and gestural cues for deciding when to begin a turn.
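As a hedged sketch of the kind of model the abstract describes, the snippet below shows an LSTM that consumes per-frame multimodal features (verbal, prosodic, and gestural cues) and outputs a take-turn probability at each timestep. The feature dimensionality, layer sizes, and decision threshold are assumptions, not values from the paper.

```python
# Hedged sketch of an LSTM-based turn-taking decision model: per-frame fused
# cue features in, per-frame take-turn probability out. Hyperparameters and
# the 0.5 threshold are assumptions for illustration.
import torch
import torch.nn as nn

class TurnTakingLSTM(nn.Module):
    def __init__(self, feature_dim=32, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, features):            # features: (batch, time, feature_dim)
        hidden, _ = self.lstm(features)
        return torch.sigmoid(self.head(hidden)).squeeze(-1)  # (batch, time)

model = TurnTakingLSTM()
frames = torch.randn(1, 100, 32)            # 100 frames of fused cue features
take_turn_prob = model(frames)
robot_should_speak = take_turn_prob[0, -1] > 0.5   # decision at the latest frame
```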
- Award ID(s):
- 1948224
- PAR ID:
- 10442501
- Date Published:
- Journal Name:
- 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- Page Range / eLocation ID:
- 7643 to 7649
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Understanding human perceptions of robot performance is crucial for designing socially intelligent robots that can adapt to human expectations. Current approaches often rely on surveys, which can disrupt ongoing human–robot interactions. As an alternative, we explore predicting people’s perceptions of robot performance using non-verbal behavioral cues and machine learning techniques. We contribute the SEAN TOGETHER Dataset consisting of observations of an interaction between a person and a mobile robot in Virtual Reality, together with perceptions of robot performance provided by users on a 5-point scale. We then analyze how well humans and supervised learning techniques can predict perceived robot performance based on different observation types (like facial expression and spatial behavior features). Our results suggest that facial expressions alone provide useful information, but in the navigation scenarios that we considered, reasoning about spatial features in context is critical for the prediction task. Also, supervised learning techniques outperformed humans’ predictions in most cases. Further, when predicting robot performance as a binary classification task on unseen users’ data, the F1-Score of machine learning models more than doubled that of predictions on a 5-point scale. This suggested good generalization capabilities, particularly in identifying performance directionality over exact ratings. Based on these findings, we conducted a real-world demonstration where a mobile robot uses a machine learning model to predict how a human who follows it perceives it. Finally, we discuss the implications of our results for implementing these supervised learning models in real-world navigation. Our work paves the path to automatically enhancing robot behavior based on observations of users and inferences about their perceptions of a robot.
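For illustration only, here is a minimal sketch of the kind of supervised prediction task described above: binary classification of perceived robot performance from facial-expression and spatial features, evaluated with F1. The features and labels below are random placeholders rather than the SEAN TOGETHER data, and the model choice is an assumption.

```python
# Minimal sketch: classify perceived robot performance (good vs. not good)
# from facial-expression and spatial features, evaluated with F1.
# Features and labels are random placeholders, not the SEAN TOGETHER data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))           # e.g. facial AUs + human-robot distance, heading
y = (rng.random(500) > 0.5).astype(int)  # binarized performance rating

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("F1:", f1_score(y_test, clf.predict(X_test)))
```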
-
Introduction
This dataset was gathered during the Vid2Real online video-based study, which investigates humans’ perception of robots’ intelligence in the context of an incidental human-robot encounter. The dataset contains participants’ questionnaire responses to four video study conditions, namely Baseline, Verbal, Body Language, and Body Language + Verbal. The videos depict a scenario where a pedestrian incidentally encounters a quadruped robot trying to enter a building. The robot uses verbal commands or body language to ask the pedestrian for help in the different study conditions. The differences between conditions were manipulated using the robot’s verbal and expressive movement functionalities.
Dataset Purpose
The dataset includes the responses of human subjects about the robot’s social intelligence, used to validate the hypothesis that robot social intelligence is positively correlated with human compliance in an incidental human-robot encounter context. The video-based dataset was also developed to obtain empirical evidence that can be used to design future real-world HRI studies.
Dataset Contents
- Four videos, each corresponding to a study condition.
- Four sets of Perceived Social Intelligence Scale data, each corresponding to one study condition.
- Four sets of compliance likelihood questions; each set includes one Likert question and one free-form question.
- One set of Godspeed questionnaire data.
- One set of Anthropomorphism questionnaire data.
- A csv file containing the participants’ demographic data, Likert scale data, and text responses.
- A data dictionary explaining the meaning of each field in the csv file.
Study Conditions
There are four videos (i.e., study conditions); the scenarios are as follows.
- Baseline: The robot walks up to the entrance and waits for the pedestrian to open the door without any additional behaviors. This is also the "control" condition.
- Verbal: The robot walks up to the entrance, says "Can you please open the door for me" to the pedestrian while facing the same direction, then waits for the pedestrian to open the door.
- Body Language: The robot walks up to the entrance, turns its head to look at the pedestrian, then turns its head to face the door, and waits for the pedestrian to open the door.
- Body Language + Verbal: The robot walks up to the entrance, turns its head to look at the pedestrian, says "Can you open the door for me" to the pedestrian, then waits for the pedestrian to open the door.
[Images: the Verbal condition and the Body Language condition.]
A within-subject design was adopted, and all participants experienced all conditions. The order of the videos, as well as the PSI scales, was randomized. After giving consent, participants were presented with one video, followed by the PSI questions and the two exploratory (compliance likelihood) questions described above. This set was repeated four times, after which participants reported their general perceptions of the robot via the Godspeed and AMPH questionnaires. Each video was around 20 seconds, and the total study time was around 10 minutes.
Video as a Study Method
A video-based study is a common data-collection method in human-robot interaction research. Videos can easily be distributed via online participant recruiting platforms and can reach a larger sample than in-person or lab-based studies. It is therefore a fast and easy data-collection method for research aiming to obtain empirical evidence.
Video Filming
The videos were filmed from a first-person point of view to maximize the alignment between the video and real-world settings. The recording device was an iPhone 12 Pro, and the videos were shot in 4K at 60 fps. For better accessibility, the videos have been converted to lower resolutions.
Instruments
The questionnaires used in the study include the Perceived Social Intelligence Scale (PSI), the Godspeed Questionnaire, and the Anthropomorphism Questionnaire (AMPH). In addition, a 5-point Likert question and a free-text response measuring human compliance were added for the purpose of the video-based study. Participant demographic data was also collected. Questionnaire items are attached as part of this dataset.
Human Subjects
Participants were recruited through Prolific and are therefore Prolific users. They were restricted to people currently living in the United States, fluent in English, and without hearing or visual impairments; no other restrictions were imposed. Among the 385 participants, 194 identified as female and 191 as male; ages ranged from 19 to 75 (M = 38.53, SD = 12.86). Human subjects remained anonymous. Participants were compensated with $4 upon submission approval. This study was reviewed and approved by the UT Austin Institutional Review Board.
Robot
The dataset contains data about humans’ perceived social intelligence of a Boston Dynamics quadruped robot, Spot (Explorer model). The robot was selected because quadruped robots are gradually being adopted to provide services such as delivery, surveillance, and rescue. However, there are still obstacles that robots cannot easily overcome by themselves, for which they will have to ask for help from nearby humans. It is therefore important to understand how humans react to a quadruped robot that they incidentally encounter. For the purposes of this video study, the robot operation was semi-autonomous, with the navigation manually teleoperated by an operator and supplemented by a few standalone autonomous modules.
Data Collection
The data was collected through Qualtrics, a survey development platform. After data collection was complete, the data was downloaded as a csv file.
Data Quality Control
Qualtrics automatically detects bots, so any response flagged as a bot was discarded. All incomplete and duplicate responses were also discarded.
Data Usage
This dataset can be used to conduct a meta-analysis on robots’ perceived intelligence. Please note that the data is coupled with this study design; users interested in data reuse will have to assess whether this dataset is in line with their own study design.
Acknowledgement
This study was funded through NSF Award #2219236, GCR: Community-Embedded Robotics: Understanding Sociotechnical Interactions with Long-term Autonomous Deployments.
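A hypothetical usage sketch for the csv described above: load the participant responses and compare perceived-social-intelligence scores across the four conditions. The file name and column names ("condition", "psi_score") are placeholders; the actual field names are defined in the dataset's data dictionary.

```python
# Hypothetical usage sketch: aggregate PSI scores per study condition.
# File and column names are placeholders; consult the data dictionary
# shipped with the dataset for the real field names.
import pandas as pd

responses = pd.read_csv("vid2real_responses.csv")
summary = (responses
           .groupby("condition")["psi_score"]
           .agg(["mean", "std", "count"]))
print(summary)   # e.g. Baseline vs. Verbal vs. Body Language vs. both
```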
-
Collaborative robots offer promising solutions to human–robot cooperative tasks. In this paper, we present a comprehensive review of two significant topics in human–robot interaction: robot learning from demonstrations and human comfort. The quality of collaboration between humans and robots has been improved largely by taking advantage of robot learning from demonstrations. Human teaching and robot learning approaches, together with their corresponding applications, are surveyed in this review. We also discuss several important issues that need to be addressed in the human–robot teaching–learning process. We then describe and discuss the factors that may affect human comfort in human–robot interaction, as well as the measures used to improve human acceptance of robots and human comfort in human–robot interaction.
-
Deploying robots in the wild is critical for studying human-robot interaction, since human behavior differs between lab settings and public settings. Although robots that have been used in the wild exist, many of them are proprietary, expensive, or unavailable. We introduce Shutter, a low-cost, flexible social robot platform for in-the-wild experiments on human-robot interaction. Our demonstration will include a Shutter robot, which consists of a 4-DOF arm with a face screen and a Kinect sensor. We will demonstrate two different interactions with Shutter: a photo-taking interaction and an embodied explanations interaction. Both interactions have been publicly deployed on the Shutter system.