

Title: AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions
Perceiving and interacting with 3D articulated objects, such as cabinets, doors, and faucets, poses particular challenges for future home-assistant robots performing daily tasks in human environments. Besides parsing the articulated parts and joint parameters, researchers have recently advocated learning manipulation affordance over the input shape geometry, which is more task-aware and geometrically fine-grained. However, taking only passive observations as inputs, these methods ignore many hidden but important kinematic constraints (e.g., joint location and limits) and dynamic factors (e.g., joint friction and restitution), and therefore lose significant accuracy on test cases with such uncertainties. In this paper, we propose a novel framework, named AdaAfford, that learns to perform very few test-time interactions to quickly adapt the affordance priors to more accurate, instance-specific posteriors. We conduct large-scale experiments using the PartNet-Mobility dataset and show that our system outperforms baseline methods.
Award ID(s): 1763268
NSF-PAR ID: 10381767
Journal Name: European Conference on Computer Vision 2022
Sponsoring Org: National Science Foundation
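The abstract's core idea, that a handful of test-time interactions can pull a generic affordance prior toward an instance-specific posterior, can be caricatured in a few lines. This is an illustrative sketch only: the array-based `adapt_affordance` update below stands in for the learned adaptation network the paper describes, and all names are hypothetical.

```python
import numpy as np

def adapt_affordance(prior, interactions, lr=0.5):
    """Nudge a per-point affordance prior toward observed interaction
    outcomes (1.0 = success, 0.0 = failure).

    prior        -- (N,) array of predicted success probabilities
    interactions -- list of (point_index, outcome) from test-time probes
    """
    posterior = prior.copy()
    for idx, outcome in interactions:
        # move the prediction at the probed point toward the observed result
        posterior[idx] += lr * (outcome - posterior[idx])
    return posterior

prior = np.full(5, 0.5)        # uninformative prior over 5 contact points
probes = [(0, 1.0), (3, 0.0)]  # two few-shot test-time interactions
post = adapt_affordance(prior, probes)
```

Unprobed points keep the prior's value; probed points shift toward what the interaction actually revealed, which is the prior-to-posterior adaptation the paper learns end to end.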
More Like this
  1. Perceiving and manipulating 3D articulated objects (e.g., cabinets, doors) in human environments is an important yet challenging task for future home-assistant robots. The space of 3D articulated objects is exceptionally rich in its myriad semantic categories, diverse shape geometry, and complicated part functionality. Previous works mostly abstract kinematic structure, with estimated joint parameters and part poses serving as the visual representations for manipulating 3D articulated objects. In this paper, we propose object-centric actionable visual priors as a novel perception-interaction handshaking point: the perception system outputs more actionable guidance than kinematic structure estimation, by predicting dense, geometry-aware, interaction-aware, and task-aware visual action affordance and trajectory proposals. We design an interaction-for-perception framework, VAT-Mart, to learn such actionable visual representations by simultaneously training a curiosity-driven reinforcement learning policy that explores diverse interaction trajectories and a perception module that summarizes and generalizes the explored knowledge for pointwise predictions among diverse shapes. Experiments on the large-scale PartNet-Mobility dataset in the SAPIEN environment demonstrate the effectiveness of the proposed approach and show promising generalization capabilities to novel test shapes, unseen object categories, and real-world data.
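The interaction-for-perception loop described above, where a curiosity-driven explorer probes what the perception module cannot yet predict and the perception module summarizes the outcomes, can be sketched with a toy scalar model. Everything here is illustrative: VAT-Mart trains neural networks for both roles, not the running-average estimate used below.

```python
import numpy as np

rng = np.random.default_rng(0)

def curiosity_reward(predicted, actual):
    """Intrinsic reward = the perception module's prediction error,
    so the policy is drawn toward interactions it cannot yet explain."""
    return float(np.abs(predicted - actual))

# Hidden per-point interaction success rates the explorer must discover.
true_afford = np.array([0.9, 0.1, 0.6])
estimate = np.full(3, 0.5)        # perception module's current belief

for _ in range(200):
    p = int(rng.integers(3))                          # probe a point
    outcome = float(rng.random() < true_afford[p])    # interact, observe
    r = curiosity_reward(estimate[p], outcome)        # high where belief is wrong
    estimate[p] += 0.1 * (outcome - estimate[p])      # summarize into belief
```

As interactions accumulate, the estimate converges toward the hidden affordances and the curiosity reward shrinks, which is the handshake the paper formalizes between exploration and perception.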
  2. Abstract This article describes the development and evaluation of our passively actuated closed-loop articulated wearable (CLAW) that uses a common slider to passively drive its exo-fingers for use in physical training of people with limited hand mobility. Our design approach utilizes physiological tasks for dimensional synthesis and yields a variety of design candidates that fulfill the desired fingertip precision grasping trajectory. Once it is ensured that the synthesized fingertip motion is close to the physiological fingertip grasping trajectories, performance assessment criteria related to user–device interference and natural joint angle movement are taken into account. After the most preferred design for each finger is chosen, minor modifications are made related to substituting the backbone chain with the wearer’s limb to provide the skeletal structure for the customized passive device. Subsequently, we evaluate it for natural joint motion based on a novel design candidate assessment method. A hand prototype is printed, and its preliminary performance regarding natural joint motion, wearability, and scalability is assessed. The pilot experimental test on a range of healthy subjects with different hand/finger sizes shows that the CLAW hand is easy to operate and guides the user’s fingers without causing any discomfort. It also ensures both precision and power grasping in a natural manner. This study establishes the importance of incorporating novel design candidate assessment techniques, based on human finger kinematic models, at the conceptual design level, which can assist in finding design candidates for natural joint motion coordination.
  3. We present an end-to-end method for capturing the dynamics of 3D human characters and translating them for synthesizing new, visually-realistic motion sequences. Conventional methods employ sophisticated, but generic, control approaches for driving the joints of articulated characters, paying little attention to the distinct dynamics of human joint movements. In contrast, our approach attempts to synthesize human-like joint movements by exploiting a biologically-plausible, compact network of spiking neurons that drive joint control in primates and rodents. We adapt the controller architecture by introducing learnable components and propose an evolutionary algorithm for training the spiking neural network architectures and capturing diverse joint dynamics. Our method requires only a few samples for capturing the dynamic properties of a joint's motion and exploits the biologically-inspired, trained controller for its reconstruction. More importantly, it can transfer the captured dynamics to new visually-plausible motion sequences. To enable user-dependent tailoring of the resulting motion sequences, we develop an interactive framework that allows for editing and real-time visualization of the controlled 3D character. We also demonstrate the applicability of our method to real human motion capture data by learning the hand joint dynamics from a gesture dataset and using our framework to reconstruct the gestures with our 3D animated character. The compact architecture of our joint controller emerging from its biologically-realistic design, and the inherent capacity of our evolutionary learning algorithm for parallelization, suggest that our approach could provide an efficient and scalable alternative for synthesizing 3D character animations with diverse and visually-realistic motion dynamics.
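The evolutionary training loop sketched in that abstract, mutating controller parameters and keeping whichever variant best reproduces a target joint trajectory, can be illustrated with a minimal hill-climbing evolution strategy. The damped-to-a-sinusoid "controller" below is a deliberately trivial stand-in for the paper's spiking-neuron controller; names and the fitness setup are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def rollout(params, t):
    """Toy 'controller': a sinusoid whose amplitude and frequency are
    the evolvable parameters (stands in for a spiking-network rollout)."""
    amp, freq = params
    return amp * np.sin(freq * t)

def evolve(target, t, pop=32, gens=60, sigma=0.1):
    """Greedy (1+lambda)-style evolution: mutate the best candidate,
    keep any mutant that lowers trajectory reconstruction error."""
    best = np.array([1.0, 1.0])
    best_err = np.mean((rollout(best, t) - target) ** 2)
    for _ in range(gens):
        for _ in range(pop):
            cand = best + sigma * rng.normal(size=2)
            err = np.mean((rollout(cand, t) - target) ** 2)
            if err < best_err:
                best, best_err = cand, err
    return best, best_err

t = np.linspace(0, 2 * np.pi, 100)
target = 1.2 * np.sin(1.3 * t)     # target joint trajectory to capture
params, err = evolve(target, t)
```

The greedy accept rule makes error monotonically non-increasing, and because every mutant is evaluated independently, the inner loop parallelizes naturally, the scalability property the abstract highlights.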

  4. Abstract Study of life history strategies may help predict the performance of microorganisms in nature by organizing the complexity of microbial communities into groups of organisms with similar strategies. Here, we tested the extent to which one common application of life history theory, the copiotroph-oligotroph framework, could predict the relative population growth rate of bacterial taxa in soils from four different ecosystems. We measured the change of in situ relative growth rate to added glucose and ammonium using both 18O–H2O and 13C quantitative stable isotope probing to test whether bacterial taxa sorted into copiotrophic and oligotrophic groups. We saw considerable overlap in nutrient responses across most bacteria regardless of phylum, with many taxa growing slowly and few taxa that grew quickly. To define plausible life history boundaries based on in situ relative growth rates, we applied Gaussian mixture models to organisms’ joint 18O–13C signatures and found that across experimental replicates, few taxa could consistently be assigned as copiotrophs, despite their potential for fast growth. When life history classifications were assigned based on average relative growth rate at varying taxonomic levels, finer resolutions (e.g., genus level) were significantly more effective in capturing changes in nutrient response than broad taxonomic resolution (e.g., phylum level). Our results demonstrate the difficulty in generalizing bacterial life history strategies to broad lineages, and even to single organisms across a range of soils and experimental conditions. We conclude that there is a continued need for the direct measurement of microbial communities in soil to advance ecologically realistic frameworks.
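The classification step described above, splitting taxa into putative fast-growing (copiotroph) and slow-growing (oligotroph) groups from their joint 18O–13C growth signatures, can be sketched with a minimal two-cluster routine. This is a toy stand-in: the study used Gaussian mixture models, while the sketch below uses a simpler 2-means clustering on synthetic data, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def two_means(X, iters=20):
    """Minimal 2-cluster k-means: assign each taxon's (18O, 13C) growth
    signature to the nearer cluster center, then recompute centers."""
    centers = X[[0, -1]].copy()      # deterministic seeds: one from each end
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels, centers

# Synthetic signatures mirroring the abstract's finding: many slow
# growers, few fast growers (units arbitrary).
slow = rng.normal([0.2, 0.2], 0.05, size=(40, 2))
fast = rng.normal([0.9, 0.8], 0.05, size=(5, 2))
X = np.vstack([slow, fast])
labels, centers = two_means(X)
```

With well-separated synthetic blobs the split is clean; the abstract's point is precisely that real soil data showed too much overlap for such assignments to be consistent.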
  5.
    Out-of-distribution (OOD) testing is increasingly popular for evaluating a machine learning system's ability to generalize beyond the biases of a training set. OOD benchmarks are designed to present a different joint distribution of data and labels between training and test time. VQA-CP has become the standard OOD benchmark for visual question answering, but we discovered three troubling practices in its current use. First, most published methods rely on explicit knowledge of the construction of the OOD splits. They often rely on "inverting" the distribution of labels, e.g., answering mostly 'yes' when the common training answer is 'no'. Second, the OOD test set is used for model selection. Third, a model's in-domain performance is assessed after retraining it on in-domain splits (VQA v2) that exhibit a more balanced distribution of labels. These three practices defeat the objective of evaluating generalization and call into question the value of methods specifically designed for this dataset. We show that embarrassingly simple methods, including one that generates answers at random, surpass the state of the art on some question types. We provide short- and long-term solutions to avoid these pitfalls and realize the benefits of OOD evaluation.
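The "inversion" shortcut criticized above is easy to make concrete: because the VQA-CP test split is constructed so that its answer distribution differs from training, simply answering the least common training answer can look deceptively strong without any real generalization. The few lines below are an illustrative caricature of that shortcut, not a method from the paper.

```python
from collections import Counter

def inverted_prior_answer(train_answers):
    """Return the *least* frequent training answer: a degenerate
    strategy that exploits how the VQA-CP OOD split was constructed
    rather than understanding any question."""
    counts = Counter(train_answers)
    return min(counts, key=counts.get)

# Toy training split dominated by 'yes', as in many yes/no question types.
train = ["yes"] * 80 + ["no"] * 20
shortcut_answer = inverted_prior_answer(train)
```

A method scoring well this way has only memorized the split's construction, which is exactly why the authors argue such results defeat the purpose of OOD evaluation.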