NSF PAR Search | NSF Public Access Repository

Geometry in Style: 3D Stylization via Surface Normal Deformation

Dinh, Nam Anh; Lang, Itai; Kim, Hyunwoo; Stein, Oded; Hanocka, Rana (June 2025, The Computer Vision Foundation)

Free, publicly-accessible full text available June 15, 2026
Geometry in Style: 3D Stylization via Surface Normal Deformation

https://doi.org/10.1109/CVPR52734.2025.02650

Dinh, Nam Anh; Lang, Itai; Kim, Hyunwoo; Stein, Oded; Hanocka, Rana (June 2025, IEEE)

Free, publicly-accessible full text available June 10, 2026
GeoCode: Interpretable Shape Programs

https://doi.org/10.1111/cgf.15276

Pearl, Ofek; Lang, Itai; Hu, Yuhua; Yeh, Raymond A; Hanocka, Rana (February 2025, Computer Graphics Forum)

The task of crafting procedural programs capable of generating structurally valid 3D shapes easily and intuitively remains an elusive goal in computer vision and graphics. Within the graphics community, generating procedural 3D models has shifted to using node graph systems. They allow the artist to create complex shapes and animations through visual programming. Being a high‐level design tool, they made procedural 3D modelling more accessible. However, crafting those node graphs demands expertise and training. We present GeoCode, a novel framework designed to extend an existing node graph system and significantly lower the bar for the creation of new procedural 3D shape programs. Our approach meticulously balances expressiveness and generalization for part‐based shapes. We propose a curated set of new geometric building blocks that are expressive and reusable across domains. We showcase three innovative and expressive programs developed through our technique and geometric building blocks. Our programs enforce intricate rules, empowering users to execute intuitive high‐level parameter edits that seamlessly propagate throughout the entire shape at a lower level while maintaining its validity. To evaluate the user‐friendliness of our geometric building blocks among non‐experts, we conduct a user study that demonstrates their ease of use and highlights their applicability across diverse domains. Empirical evidence shows the superior accuracy of GeoCode in inferring and recovering 3D shapes compared to an existing competitor. Furthermore, our method demonstrates superior expressiveness compared to alternatives that utilize coarse primitives. Notably, we illustrate the ability to execute controllable local and global shape manipulations. Our code, programs, datasets and Blender add‐on are available at https://github.com/threedle/GeoCode.
Free, publicly-accessible full text available February 1, 2026
iSeg: Interactive 3D Segmentation via Interactive Attention

https://doi.org/10.1145/3680528.3687605

Lang, Itai; Xu, Fei; Decatur, Dale; Babu, Sudarshan; Hanocka, Rana (December 2024, ACM)

We present iSeg, a new interactive technique for segmenting 3D shapes. Previous works have focused mainly on leveraging pre-trained 2D foundation models for 3D segmentation based on text. However, text may be insufficient for accurately describing fine-grained spatial segmentations. Moreover, achieving a consistent 3D segmentation using a 2D model is highly challenging, since occluded areas of the same semantic region may not be visible together from any 2D view. Thus, we design a segmentation method conditioned on fine user clicks, which operates entirely in 3D. Our system accepts user clicks directly on the shape's surface, indicating the inclusion or exclusion of regions from the desired shape partition. To accommodate various click settings, we propose a novel interactive attention module capable of processing different numbers and types of clicks, enabling the training of a single unified interactive segmentation model. We apply iSeg to a myriad of shapes from different domains, demonstrating its versatility and faithfulness to the user's specifications. Our project page is at https://threedle.github.io/iSeg/.
Full Text Available
TEDi: Temporally-Entangled Diffusion for Long-Term Motion Synthesis

https://doi.org/10.1145/3641519.3657515

Zhang, Zihan; Liu, Richard; Hanocka, Rana; Aberman, Kfir (July 2024, ACM)

The gradual nature of a diffusion process that synthesizes samples in small increments constitutes a key ingredient of Denoising Diffusion Probabilistic Models (DDPM), which have presented unprecedented quality in image synthesis and been recently explored in the motion domain. In this work, we propose to adapt the gradual diffusion concept (operating along a diffusion time-axis) into the temporal-axis of the motion sequence. Our key idea is to extend the DDPM framework to support temporally varying denoising, thereby entangling the two axes. Using our special formulation, we iteratively denoise a motion buffer that contains a set of increasingly-noised poses, which auto-regressively produces an arbitrarily long stream of frames. With a stationary diffusion time-axis, in each diffusion step we increment only the temporal-axis of the motion such that the framework produces a new, clean frame which is removed from the beginning of the buffer, followed by a newly drawn noise vector that is appended to it. This new mechanism paves the way towards a new framework for long-term motion synthesis with applications to character animation and other domains.
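The buffer mechanism described in this abstract can be illustrated with a minimal toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the function names `toy_denoise` and `stream_frames`, the buffer length, the pose dimension, the pure-noise initialization of the buffer, and above all the "denoiser", which simply shrinks a pose toward zero instead of running a learned network. The sketch shows only the bookkeeping: slot i of the buffer holds a pose at noise level i + 1, every step denoises each slot by one level, the now-clean first pose is removed, and fresh noise is appended at the end.

```python
import random
from collections import deque


def toy_denoise(pose, level):
    """Hypothetical stand-in for a learned denoiser.

    Scales the pose by (level - 1) / level, so a pose at noise level 1
    becomes the all-zero "clean" pose. A real model would predict the pose.
    """
    scale = (level - 1) / level
    return [scale * v for v in pose]


def stream_frames(num_frames, buffer_len=4, pose_dim=3, denoise=toy_denoise, seed=0):
    """Produce `num_frames` clean frames from a fixed-length motion buffer."""
    rng = random.Random(seed)

    def draw_noise():
        return [rng.gauss(0.0, 1.0) for _ in range(pose_dim)]

    # Slot i is treated as holding a pose at noise level i + 1.
    # Assumption: the buffer starts as pure noise, for simplicity.
    buffer = deque(draw_noise() for _ in range(buffer_len))
    frames = []
    for _ in range(num_frames):
        # One denoising step for every slot lowers each noise level by one.
        buffer = deque(denoise(p, i + 1) for i, p in enumerate(buffer))
        # The first slot is now clean: remove it and emit it as a frame.
        frames.append(buffer.popleft())
        # Append a newly drawn noise vector so the buffer length is unchanged.
        buffer.append(draw_noise())
    return frames


frames = stream_frames(num_frames=10)
```

Because the buffer length never changes, the loop can run for any number of frames, which is the property the abstract refers to as producing an arbitrarily long stream.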
Full Text Available
3D Paintbrush: Local Stylization of 3D Shapes with Cascaded Score Distillation

https://doi.org/10.1109/CVPR52733.2024.00428

Decatur, Dale; Lang, Itai; Aberman, Kfir; Hanocka, Rana (June 2024, IEEE)

We present 3D Paintbrush, a technique for automatically texturing local semantic regions on meshes via text descriptions. Our method is designed to operate directly on meshes, producing texture maps which seamlessly integrate into standard graphics pipelines. We opt to simultaneously produce a localization map (to specify the edit region) and a texture map which conforms to it. This approach improves the quality of both the localization and the stylization. To enhance the details and resolution of the textured area, we leverage multiple stages of a cascaded diffusion model to supervise our local editing technique with generative priors learned from images at different resolutions. Our technique, referred to as Cascaded Score Distillation (CSD), simultaneously distills scores at multiple resolutions in a cascaded fashion, enabling control over both the granularity and global understanding of the supervision. We demonstrate the effectiveness of 3D Paintbrush to locally texture different semantic regions on a variety of shapes. Project page: https://threedle.github.io/3d-paintbrush
Full Text Available
HyperFields: Towards Zero-Shot Generation of NeRFs from Text

Babu, Sudarshan; Liu, Richard; Zhou, Avery; Maire, Michael; Shakhnarovich, Greg; Hanocka, Rana (July 2024, ICML 2024 in OpenReview.net)

Full Text Available
HyperFields: Towards zero-shot generation of NeRFs from text

Babu, Sudarshan; Liu, Richard; Zhou, Avery; Maire, Michael; Shakhnarovich, Greg; Hanocka, Rana (June 2024, ICML)

We introduce HyperFields, a method for generating text-conditioned Neural Radiance Fields (NeRFs) with a single forward pass and (optionally) some fine-tuning. Key to our approach are: (i) a dynamic hypernetwork, which learns a smooth mapping from text token embeddings to the space of NeRFs; (ii) NeRF distillation training, which distills scenes encoded in individual NeRFs into one dynamic hypernetwork. These techniques enable a single network to fit over a hundred unique scenes. We further demonstrate that HyperFields learns a more general map between text and NeRFs, and consequently is capable of predicting novel in-distribution and out-of-distribution scenes--either zero-shot or with a few finetuning steps. Finetuning HyperFields benefits from accelerated convergence thanks to the learned general map, and is capable of synthesizing novel scenes 5 to 10 times faster than existing neural optimization-based methods. Our ablation experiments show that both the dynamic architecture and NeRF distillation are critical to the expressivity of HyperFields.
Full Text Available
HaLo‐NeRF: Learning Geometry‐Guided Semantics for Exploring Unconstrained Photo Collections

https://doi.org/10.1111/cgf.15006

Dudai, Chen; Alper, Morris; Bezalel, Hana; Hanocka, Rana; Lang, Itai; Averbuch‐Elor, Hadar (May 2024, Computer Graphics Forum)

Internet image collections containing photos captured by crowds of photographers show promise for enabling digital exploration of large‐scale tourist landmarks. However, prior works focus primarily on geometric reconstruction and visualization, neglecting the key role of language in providing a semantic interface for navigation and fine‐grained understanding. In more constrained 3D domains, recent methods have leveraged modern vision‐and‐language models as a strong prior of 2D visual semantics. While these models display an excellent understanding of broad visual semantics, they struggle with unconstrained photo collections depicting such tourist landmarks, as they lack expert knowledge of the architectural domain and fail to exploit the geometric consistency of images capturing multiple views of such scenes. In this work, we present a localization system that connects neural representations of scenes depicting large‐scale landmarks with text describing a semantic region within the scene, by harnessing the power of SOTA vision‐and‐language models with adaptations for understanding landmark scene semantics. To bolster such models with fine‐grained knowledge, we leverage large‐scale Internet data containing images of similar landmarks along with weakly‐related textual information. Our approach is built upon the premise that images physically grounded in space can provide a powerful supervision signal for localizing new concepts, whose semantics may be unlocked from Internet textual metadata with large language models. We use correspondences between views of scenes to bootstrap spatial understanding of these semantics, providing guidance for 3D‐compatible segmentation that ultimately lifts to a volumetric scene representation. To evaluate our method, we present a new benchmark dataset containing large‐scale scenes with ground‐truth segmentations for multiple semantic concepts. 
Our results show that HaLo‐NeRF can accurately localize a variety of semantic concepts related to architectural landmarks, surpassing the results of other 3D models as well as strong 2D segmentation baselines. Our code and data are publicly available at https://tau-vailab.github.io/HaLo-NeRF/
Full Text Available
3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions

Decatur, Dale; Lang, Itai; Hanocka, Rana (June 2023, IEEE Conference on Computer Vision and Pattern Recognition)
