

Title: Linking WordNet to 3D shapes
We describe a project to link the Princeton WordNet to 3D representations of real objects and scenes. The goal is to establish a dataset that helps us understand how people categorize common everyday objects via their parts, attributes, and context. This paper describes the annotation and data collection effort so far, as well as ideas for future work.
Award ID(s):
1729205
NSF-PAR ID:
10081573
Author(s) / Creator(s):
Date Published:
Journal Name:
Global WordNet Conference
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents an approach to detect out-of-context (OOC) objects in an image. Given an image with a set of objects, our goal is to determine if an object is inconsistent with the scene context and detect the OOC object with a bounding box. In this work, we consider commonly explored contextual relations such as co-occurrence relations, the relative size of an object with respect to other objects, and the position of the object in the scene. We posit that contextual cues are useful for determining object labels for in-context objects, while inconsistent context cues are detrimental to determining object labels for out-of-context objects. To realize this hypothesis, we propose a graph contextual reasoning network (GCRN) to detect OOC objects. GCRN consists of two separate graphs to predict object labels based on the contextual cues in the image: 1) a representation graph to learn object features based on the neighboring objects and 2) a context graph to explicitly capture contextual cues from the neighboring objects. GCRN explicitly captures the contextual cues to improve the detection of in-context objects and identify objects that violate contextual relations. In order to evaluate our approach, we create a large-scale dataset by adding OOC object instances to the COCO images. We also evaluate on the recent OCD benchmark. Our results show that GCRN outperforms competitive baselines in detecting OOC objects and correctly detecting in-context objects.
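The two-graph GCRN architecture described above is far richer than any toy example, but its core intuition — scoring each object by how well it co-occurs with its scene neighbors, and flagging the weakest as out-of-context — can be sketched with simple co-occurrence counts. All scenes, labels, and the scoring rule below are hypothetical illustrations, not the paper's method:

```python
from collections import Counter
from itertools import combinations

# Toy "context" statistics: counts of label pairs seen together in
# training scenes (hypothetical data, for illustration only).
TRAIN_SCENES = [
    ["person", "car", "road"],
    ["person", "bicycle", "road"],
    ["car", "road", "traffic light"],
    ["sofa", "tv", "lamp"],
    ["sofa", "lamp", "person"],
]

cooc = Counter()
for scene in TRAIN_SCENES:
    for a, b in combinations(sorted(set(scene)), 2):
        cooc[(a, b)] += 1

def context_score(label, neighbors):
    """Mean co-occurrence of `label` with the other objects in the scene.
    A low score marks the object as a potential out-of-context candidate."""
    if not neighbors:
        return 0.0
    total = sum(cooc.get(tuple(sorted((label, n))), 0) for n in neighbors)
    return total / len(neighbors)

# A street scene with one object that violates its context:
scene = ["person", "car", "road", "sofa"]
scores = {o: context_score(o, [n for n in scene if n != o]) for o in scene}
ooc = min(scores, key=scores.get)
print(ooc)  # → sofa (weakest context support in this street scene)
```

A learned graph network replaces these raw counts with message passing over object features, but the decision signal — contextual consistency with neighbors — is the same.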
  3. Early object name learning is often conceptualized as a problem of mapping heard names to referents. However, infants do not hear object names as discrete events but rather in extended interactions organized around goal-directed actions on objects. The present study examined the statistical structure of the nonlinguistic events that surround parent naming of objects. Parents and 12-month-old infants were left alone in a room for 10 minutes with 32 objects available for exploration. Parent and infant handling of objects and parent naming of objects were coded. The four measured statistics were from measures used in the study of coherent discourse: (i) a frequency distribution in which actions were frequently directed to a few objects and more rarely to other objects; (ii) repeated returns to the high-frequency objects over the 10-minute play period; (iii) clustered repetitions and continuity of actions on objects; and (iv) structured networks of transitions among objects in play that connected all the played-with objects. Parent naming was infrequent but related to the statistics of object-directed actions. The implications of the discourse-like stream of actions are discussed in terms of learning mechanisms that could support rapid learning of object names from relatively few name-object co-occurrences.
  4. We see the external world as consisting not only of objects and their parts, but also of relations that hold between them. Visual analogy, which depends on similarities between relations, provides a clear example of how perception supports reasoning. Here we report an experiment in which we quantitatively measured the human ability to find analogical mappings between parts of different objects, where the objects to be compared were drawn either from the same category (e.g., images of two mammals, such as a dog and a horse), or from two dissimilar categories (e.g., a chair image mapped to a cat image). Humans showed systematic mapping patterns, but with greater variability in mapping responses when objects were drawn from dissimilar categories. We simulated the human response of analogical mapping using a computational model of mapping between 3D objects, visiPAM (visual Probabilistic Analogical Mapping). VisiPAM takes point-cloud representations of two 3D objects as inputs, and outputs the mapping between analogous parts of the two objects. VisiPAM consists of a visual module that constructs structural representations of individual objects, and a reasoning module that identifies a probabilistic mapping between parts of the two 3D objects. Model simulations not only capture the qualitative pattern of human mapping performance across conditions, but also approach human-level reliability in solving visual analogy problems.
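VisiPAM's probabilistic graph matching over point-cloud representations is well beyond a few lines, but the underlying idea — find the part correspondence that best preserves each part's structural role — can be sketched with brute-force assignment over toy part features. The parts, their two-number "features" (relative height, relative size), and the distance-based cost are all hypothetical stand-ins, not the model's actual representation:

```python
import math
from itertools import permutations

# Hypothetical part features: (relative height, relative size) per part,
# standing in for visiPAM's point-cloud-derived structural features.
dog = {"head": (0.9, 0.3), "torso": (0.5, 1.0), "legs": (0.1, 0.5)}
chair = {"back": (0.9, 0.4), "seat": (0.5, 1.0), "legs": (0.1, 0.6)}

def best_mapping(parts_a, parts_b):
    """Brute-force the part correspondence minimizing total feature distance
    (a toy stand-in for probabilistic analogical mapping)."""
    a_names = list(parts_a)
    best, best_cost = None, float("inf")
    for perm in permutations(parts_b, len(a_names)):
        cost = sum(math.dist(parts_a[a], parts_b[b])
                   for a, b in zip(a_names, perm))
        if cost < best_cost:
            best, best_cost = dict(zip(a_names, perm)), cost
    return best

mapping = best_mapping(dog, chair)
print(mapping)  # → {'head': 'back', 'torso': 'seat', 'legs': 'legs'}
```

Note the cross-category mapping (dog to chair) still succeeds here because the features encode relational structure rather than appearance — the same reason human mappings remain systematic, if more variable, across dissimilar categories.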
  5. Abstract

    The discovery of two interstellar objects passing through the solar system, 1I/‘Oumuamua and 2I/Borisov, implies that a galactic population exists with a spatial number density of order ∼0.1 au⁻³. The forthcoming Rubin Observatory Legacy Survey of Space and Time (LSST) has been predicted to detect more asteroidal interstellar objects like 1I/‘Oumuamua. We apply recently developed methods to simulate a suite of galactic populations of interstellar objects with a range of assumed kinematics, albedos, and size–frequency distributions (SFDs). We incorporate these populations into the objectsInField algorithm, which simulates detections of moving objects by an arbitrary survey. We find that the LSST should detect between ∼0 and 70 asteroidal interstellar objects every year (assuming the implied number density), with sensitive dependence on the SFD slope and characteristic albedo of the host population. The apparent rate of motion on the sky—along with the associated trailing loss—appears to be the largest barrier to detecting interstellar objects. Specifically, a relatively large number of synthetic objects would be detectable by the LSST if not for their rapid sky motion (>0.5° day⁻¹). Therefore, algorithms that could successfully link and detect rapidly moving objects would significantly increase the number of interstellar object discoveries with the LSST (and in general). The mean diameter of detectable, inactive interstellar objects ranges from ∼50 to 600 m and depends sensitively on the SFD slope and albedo.
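The abstract's sensitive dependence of detectable diameter on albedo follows from the standard asteroid relation between absolute magnitude H, diameter D, and geometric albedo p: H = 5·log₁₀(1329 km / (D·√p)). A short sketch makes the scaling concrete; the limiting magnitude H = 22 used below is an illustrative round number, not a value from the paper:

```python
import math

def absolute_magnitude(diameter_m, albedo):
    """Standard asteroid relation: H = 5*log10(1329 km / (D_km * sqrt(p)))."""
    d_km = diameter_m / 1000.0
    return 5 * math.log10(1329.0 / (d_km * math.sqrt(albedo)))

def diameter_from_H(H, albedo):
    """Invert the relation: diameter in meters for a given H and albedo p."""
    return 1329.0 / (math.sqrt(albedo) * 10 ** (H / 5)) * 1000.0

# For a fixed brightness limit, a darker surface implies a larger object:
print(round(diameter_from_H(22.0, 0.1)))   # ~167 m at albedo 0.1
print(round(diameter_from_H(22.0, 0.01)))  # ~529 m at albedo 0.01
```

A factor-of-10 drop in assumed albedo roughly triples the diameter at fixed H (D ∝ p⁻¹ᐟ²), which is why the detectable-size range quoted above (∼50 to 600 m) spans an order of magnitude across plausible albedos.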
