Title: Hierarchical Grouping of Simple Visual Scenes
Human visual grouping processes consolidate independent visual objects into grouped visual features on the basis of shared characteristics; these visual features can themselves be grouped, resulting in a hierarchical representation of visual grouping information. This “grouping hierarchy” promotes efficient attention in support of goal-directed behavior, but improper grouping of elements of a visual scene can also result in critical behavioral errors. Understanding how characteristics of visual objects and features, such as size and form, influence the perception of hierarchical visual groups can further theory of human visual grouping behavior and contribute to effective interface design. In the present study, participants provided free-response groupings of a set of stimuli that contained consistent structural relationships between a limited set of visual features. These grouping patterns were evaluated for relationships between specific characteristics of the constituent visual features and the distribution of features across levels of the indicated grouping hierarchy. We observed that while the relative size of the visual features differentiated groupings across levels of the grouping hierarchy, the form of visual objects and features was more likely to distinguish separate groups within a particular level of the hierarchy. These consistent relationships between visual feature characteristics and placement within a grouping hierarchy can be leveraged to advance computational theories of human visual grouping behavior, which can in turn be applied to effective design for interfaces such as voter ballots.
Award ID(s):
1920513
PAR ID:
10464962
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the Forty-Fifth Annual Meeting of the Cognitive Science Society
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Human visual grouping processes consolidate independent visual objects into grouped visual features on the basis of shared characteristics; these visual features can themselves be grouped, resulting in a hierarchical representation of visual grouping information. This “grouping hierarchy” promotes efficient attention in support of goal-directed behavior, but improper grouping of elements of a visual scene can also result in critical behavioral errors. Understanding how characteristics of visual objects and features, such as size and form, influence the perception of hierarchical visual groups can further theory of human visual grouping behavior and contribute to effective interface design. In the present study, participants provided free-response groupings of a set of stimuli that contained consistent structural relationships between a limited set of visual features. These grouping patterns were evaluated for relationships between specific characteristics of the constituent visual features and the distribution of features across levels of the indicated grouping hierarchy. We observed that while the relative size of the visual features differentiated groupings across levels of the grouping hierarchy, the form of visual objects and features was more likely to distinguish separate groups within a particular level of the hierarchy. These consistent relationships between visual feature characteristics and placement within a grouping hierarchy can be leveraged to advance computational theories of human visual grouping behavior, which can in turn be applied to effective design for interfaces such as voter ballots.
  2. Grouping is inherently ambiguous due to the multiple levels of granularity in which one can decompose a scene -- should the wheels of an excavator be considered separate or part of the whole? We present Group Anything with Radiance Fields (GARField), an approach for decomposing 3D scenes into a hierarchy of semantically meaningful groups from posed image inputs. To do this we embrace group ambiguity through physical scale: by optimizing a scale-conditioned 3D affinity feature field, a point in the world can belong to different groups of different sizes. We optimize this field from a set of 2D masks provided by Segment Anything (SAM) in a way that respects coarse-to-fine hierarchy, using scale to consistently fuse conflicting masks from different viewpoints. From this field we can derive a hierarchy of possible groupings via automatic tree construction or user interaction. We evaluate GARField on a variety of in-the-wild scenes and find it effectively extracts groups at many levels: clusters of objects, objects, and various subparts. GARField inherently represents multi-view consistent groupings and produces higher fidelity groups than the input SAM masks. GARField's hierarchical grouping could have exciting downstream applications such as 3D asset extraction or dynamic scene understanding. See the project website at https://www.garfield.studio/ 
  3. Perceptual organization offers an elegant framework to group low-level features that are likely to come from a single object. We offer a novel strategy to adapt this grouping process to objects in a domain. Given a set of training images of objects in context, the associated learning process decides on the relative importance of the basic salient relationships such as proximity, parallelness, continuity, junctions, and common region toward segregating the objects from the background. The parameters of the grouping process are cast as probabilistic specifications of Bayesian networks that need to be learned. This learning is accomplished using a team of stochastic automata in an N-player cooperative game framework. The grouping process, which is based on graph partitioning, is able to form large groups from relationships defined over a small set of primitives and is fast. We statistically demonstrate the robust performance of the grouping and the learning frameworks on a variety of real images. Among the interesting conclusions are the significant role of photometric attributes in grouping and the ability to form large salient groups from a set of local relations, each defined over a small number of primitives.
  4. Social touch provides a rich non-verbal communication channel between humans and robots. Prior work has identified a set of touch gestures for human-robot interaction and described them with natural language labels (e.g., stroking, patting). Yet, no data exists on the semantic relationships between the touch gestures in users’ minds. To endow robots with touch intelligence, we investigated how people perceive the similarities of social touch labels from the literature. In an online study, 45 participants grouped 36 social touch labels based on their perceived similarities and annotated their groupings with descriptive names. We derived quantitative similarities of the gestures from these groupings and analyzed the similarities using hierarchical clustering. The analysis resulted in 9 clusters of touch gestures formed around the social, emotional, and contact characteristics of the gestures. We discuss the implications of our results for designing and evaluating touch sensing and interactions with social robots. 
  5. Neocortical computations underlying vision are performed by a distributed network of functionally specialized areas. Mouse visual cortex, a dense interareal network that exhibits hierarchical properties, comprises subnetworks interconnecting distinct processing streams. To determine the layout of the mouse visual hierarchy, we have evaluated the laminar patterns formed by interareal axonal projections originating in each of ten areas. Reciprocally connected pairs of areas exhibit feedforward/feedback relationships consistent with a hierarchical organization. Beta regression analyses, which estimate a continuous hierarchical distance measure, indicate that the network comprises multiple nonhierarchical circuits embedded in a hierarchical organization of overlapping levels. Single-unit recordings in anaesthetized mice show that receptive field sizes are generally consistent with the hierarchy, with the ventral stream exhibiting a stricter hierarchy than the dorsal stream. Together, the results provide an anatomical metric for hierarchical distance, and reveal both hierarchical and nonhierarchical motifs in mouse visual cortex.