In this work, we consider the multi-image object matching problem in distributed networks of robots. Multi-image feature matching is a cornerstone of many applications, including Simultaneous Localization and Mapping, homography estimation, object detection, and Structure from Motion. We first review the QuickMatch algorithm for multi-image feature matching. We then present NetMatch, an algorithm for distributing sets of features across computational units (agents) that largely preserves feature-match quality and minimizes communication between agents (avoiding, in particular, the need to flood all data to all agents). Finally, we present an experimental application of both QuickMatch and NetMatch to an object matching test with low-quality images. The QuickMatch and NetMatch algorithms are compared with other standard matching algorithms in terms of preservation of match consistency. Our experiments show that QuickMatch and NetMatch scale to larger numbers of images and features, and match more accurately, than standard techniques.
Consistent Multi-Robot Object Matching via QuickMatch
In this work, we present a novel solution to the multi-image object matching problem, together with its experimental verification. We first review the QuickMatch algorithm for multi-image feature matching and then show how it applies to an object matching test case. The presented experiment matches features across a large number of images more frequently and accurately than standard techniques. This experiment demonstrates the advantages of rapid multi-image matching, not only for improving existing algorithms, but also for new applications such as object discovery and localization.
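Neither abstract reproduces the algorithm itself. As a rough sketch of the density-based clustering idea behind QuickMatch: stack the descriptors from all images, link each feature to its nearest higher-density neighbor (a QuickShift-style tree), and break edges that are too long or that would match two features from the same image. The bandwidth rule, kernel, and threshold `rho` below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from scipy.spatial.distance import cdist

def quickmatch_sketch(descriptors, image_ids, rho=0.5):
    """Cluster stacked feature descriptors from many images into multi-image
    matches via QuickShift-style density clustering (illustrative only)."""
    image_ids = np.asarray(image_ids)
    n = len(descriptors)
    dist = cdist(descriptors, descriptors)

    # Per-feature bandwidth: distance to the nearest feature in the SAME image,
    # since two features from one image should never be matched together.
    bandwidth = np.empty(n)
    for i in range(n):
        same = image_ids == image_ids[i]
        same[i] = False
        bandwidth[i] = dist[i, same].min() if same.any() else dist[i].max()

    # Kernel density estimate over descriptor space.
    density = np.exp(-((dist / bandwidth[None, :]) ** 2)).sum(axis=1)

    # QuickShift tree: each feature points to its nearest higher-density neighbor.
    parent = np.arange(n)
    parent_dist = np.full(n, np.inf)
    for i in range(n):
        higher = np.where(density > density[i])[0]
        if higher.size:
            j = higher[np.argmin(dist[i, higher])]
            parent[i], parent_dist[i] = j, dist[i, j]

    # Merge along tree edges, shortest first, skipping edges that are too long
    # or that would place two features from the same image in one cluster.
    root = np.arange(n)

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path compression
            i = root[i]
        return i

    images_in = {i: {image_ids[i]} for i in range(n)}
    for i in np.argsort(parent_dist):
        if parent[i] == i or parent_dist[i] > rho * min(bandwidth[i], bandwidth[parent[i]]):
            continue
        a, b = find(i), find(parent[i])
        if a != b and images_in[a].isdisjoint(images_in[b]):
            root[a] = b
            images_in[b] |= images_in.pop(a)

    return np.array([find(i) for i in range(n)])  # match label per feature
```

Because the output is a partition of the features, every cluster induces matches that are consistent across images by construction, which is the consistency property the experiments above evaluate.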
- Award ID(s): 1717656
- PAR ID: 10106596
- Date Published:
- Journal Name: International Symposium on Experimental Robotics
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- In this paper, we tackle an important task in computer vision: any-view object recognition. In both training and testing, for each object instance, we are given only its 2D image viewed from an unknown angle. We propose a computational framework of object- and viewer-centered neural networks (OVCNet) to recognize an object instance viewed from an arbitrary unknown angle. OVCNet consists of three branches that respectively implement object-centered, 3D viewer-centered, and in-plane viewer-centered recognition. We evaluate OVCNet using two metrics with unseen views from both seen and novel object instances. Experimental results demonstrate the advantages of OVCNet over classic 2D-image-based CNN classifiers, classifiers over 3D objects inferred from 2D images, and competing multi-view approaches. It gives rise to a viable and practical computing framework that combines viewpoint-dependent and viewpoint-independent features for object recognition from any view. (A toy sketch of the three-branch idea appears after this list.)
- The success of supervised learning requires large-scale ground-truth labels, which are expensive, time-consuming, or require special skills to annotate. To address this issue, many self-supervised and unsupervised methods have been developed. Unlike most existing self-supervised methods, which learn only 2D image features or only 3D point cloud features, this paper presents a novel and effective self-supervised learning approach that jointly learns both 2D image features and 3D point cloud features by exploiting cross-modality and cross-view correspondences, without using any human-annotated labels. Specifically, 2D image features of rendered images from different views are extracted by a 2D convolutional neural network, and 3D point cloud features are extracted by a graph convolutional neural network. The two types of features are fed into a two-layer fully connected neural network to estimate the cross-modality correspondence. The three networks are jointly trained (i.e., cross-modality) by verifying whether two sampled data of different modalities belong to the same object; meanwhile, the 2D convolutional neural network is additionally optimized by minimizing intra-object distance and maximizing inter-object distance of rendered images in different views (i.e., cross-view). The effectiveness of the learned 2D and 3D features is evaluated by transferring them to five different tasks: multi-view 2D shape recognition, 3D shape recognition, multi-view 2D shape retrieval, 3D shape retrieval, and 3D part segmentation. Extensive evaluations on all five tasks across different datasets demonstrate the strong generalization and effectiveness of the learned 2D and 3D features. (See the simplified encoder sketch after this list.)
- We present a novel approach, MAGIC (manipulation analogies for generalizable intelligent contacts), for one-shot learning of manipulation strategies with fast and extensive generalization to novel objects. By leveraging a reference action trajectory, MAGIC effectively identifies similar contact points and sequences of actions on novel objects to replicate a demonstrated strategy, such as using different hooks to retrieve distant objects of different shapes and sizes. Our method is based on a two-stage contact-point matching process that combines global shape matching using pretrained neural features with local curvature analysis to ensure precise and physically plausible contact points. We experiment with three tasks: scooping, hanging, and hooking objects. MAGIC demonstrates superior performance over existing methods, achieving significant improvements in runtime speed and generalization to different object categories. Website: https://magic-2024.github.io/. (A minimal contact-scoring sketch appears after this list.)
- Biodiversity image repositories are crucial sources of training data for machine learning approaches to biological research. Metadata, specifically metadata about object quality, is putatively an important prerequisite for selecting sample subsets for these experiments. This study demonstrates the importance of image quality metadata in a species classification experiment involving a corpus of 1935 fish specimen images annotated with 22 metadata quality properties. A small subset of high-quality images produced an F1 score of 0.41, compared to 0.35 for a taxonomically matched subset of low-quality images, when used by a convolutional neural network for species identification. Using the full corpus of images revealed that image quality differed between correctly classified and misclassified images. We found that the visibility of all anatomical features was the most important quality property for classification accuracy. We suggest that biodiversity image repositories consider adopting a minimal set of image quality metadata to support future machine learning projects. (A small metadata-filtering sketch follows this list.)
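The OVCNet abstract above describes three recognition branches fused into one prediction. As a toy illustration only, the sketch below wires an object-centered head, a viewer-centered head over discretized viewpoint bins, and an in-plane head onto a shared backbone; the backbone choice, bin count, and additive late fusion are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn
from torchvision import models

class OVCSketch(nn.Module):
    """Illustrative three-branch classifier in the spirit of OVCNet."""

    def __init__(self, num_classes, num_view_bins=12):
        super().__init__()
        self.num_classes, self.num_view_bins = num_classes, num_view_bins
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()                     # shared 512-d image feature
        self.backbone = backbone
        self.object_head = nn.Linear(512, num_classes)  # viewpoint-independent scores
        self.viewer_head = nn.Linear(512, num_classes * num_view_bins)  # joint (class, view)
        self.inplane_head = nn.Linear(512, num_classes) # stand-in for the in-plane branch

    def forward(self, x):
        f = self.backbone(x)
        viewer = self.viewer_head(f).view(-1, self.num_classes, self.num_view_bins)
        # Marginalize over viewpoint bins, then fuse the three branches additively.
        return self.object_head(f) + viewer.max(dim=2).values + self.inplane_head(f)

logits = OVCSketch(num_classes=10)(torch.randn(2, 3, 224, 224))  # shape (2, 10)
```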
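The cross-modality abstract specifies two encoders plus a two-layer correspondence head trained to verify whether an image and a point cloud come from the same object. The sketch below follows that wiring with deliberately simplified encoders (a tiny CNN, and a per-point MLP standing in for the paper's graph convolutional network); all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class CrossModalSketch(nn.Module):
    """Toy image/point-cloud pair verifier for self-supervised pretraining."""

    def __init__(self, feat_dim=128):
        super().__init__()
        self.img_enc = nn.Sequential(           # tiny CNN over rendered views
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim),
        )
        self.pc_enc = nn.Sequential(            # per-point MLP; stand-in for the GCN
            nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, feat_dim),
        )
        self.match_head = nn.Sequential(        # two-layer correspondence head
            nn.Linear(2 * feat_dim, 64), nn.ReLU(), nn.Linear(64, 1),
        )

    def forward(self, image, points):
        f_img = self.img_enc(image)                     # (B, feat_dim)
        f_pc = self.pc_enc(points).max(dim=1).values    # max-pool over the points
        return self.match_head(torch.cat([f_img, f_pc], dim=1))  # same-object logit

# Same-object verification signal: positives share an object, negatives do not.
model = CrossModalSketch()
logit = model(torch.randn(4, 3, 64, 64), torch.randn(4, 1024, 3))
loss = nn.functional.binary_cross_entropy_with_logits(logit, torch.ones(4, 1))
```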
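MAGIC's contact-point matching combines global neural-feature similarity with local curvature agreement. Below is a minimal scoring sketch built on those two cues; the cosine and exponential metrics and the mixing weight `w` are assumptions, not the paper's formulation.

```python
import numpy as np

def match_contact_point(ref_feat, ref_curv, cand_feats, cand_curvs, w=0.5):
    """Pick the candidate contact point on a novel object that best matches a
    demonstrated contact point (illustrative scoring only)."""
    # Global matching: cosine similarity between pretrained shape features.
    feat_sim = cand_feats @ ref_feat / (
        np.linalg.norm(cand_feats, axis=1) * np.linalg.norm(ref_feat) + 1e-8)
    # Local analysis: penalize mismatch in local principal curvatures.
    curv_sim = np.exp(-np.linalg.norm(cand_curvs - ref_curv, axis=1))
    return int(np.argmax(w * feat_sim + (1 - w) * curv_sim))
```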
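The biodiversity study above selects training subsets by image-quality metadata. A minimal sketch of that selection step with pandas follows; the column names are hypothetical stand-ins, not the study's actual 22 quality properties.

```python
import pandas as pd

# Hypothetical quality-metadata table: one row per specimen image, with
# boolean quality properties (illustrative names only).
meta = pd.DataFrame({
    "image": ["fish_001.jpg", "fish_002.jpg", "fish_003.jpg"],
    "all_parts_visible": [True, False, True],
    "in_focus": [True, True, False],
})

# High-quality subset: every anatomical feature visible and the image in focus,
# mirroring the kind of subset selection the study reports.
high_quality = meta.loc[meta["all_parts_visible"] & meta["in_focus"], "image"].tolist()
print(high_quality)  # ['fish_001.jpg']
```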