Title: 3D Paintbrush: Local Stylization of 3D Shapes with Cascaded Score Distillation
We present 3D Paintbrush, a technique for automatically texturing local semantic regions on meshes via text descriptions. Our method is designed to operate directly on meshes, producing texture maps which seamlessly integrate into standard graphics pipelines. We opt to simultaneously produce a localization map (to specify the edit region) and a texture map which conforms to it. This approach improves the quality of both the localization and the stylization. To enhance the details and resolution of the textured area, we leverage multiple stages of a cascaded diffusion model to supervise our local editing technique with generative priors learned from images at different resolutions. Our technique, referred to as Cascaded Score Distillation (CSD), simultaneously distills scores at multiple resolutions in a cascaded fashion, enabling control over both the granularity and global understanding of the supervision. We demonstrate the effectiveness of 3D Paintbrush to locally texture different semantic regions on a variety of shapes. Project page: https://threedle.github.io/3d-paintbrush
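The core idea of Cascaded Score Distillation, combining supervision signals distilled at several resolutions of a cascaded diffusion model, can be pictured as a weighted blend of per-stage score maps. The sketch below is a simplified illustration under our own assumptions: the function name `cascaded_score_blend`, the nearest-neighbor upsampling, and the scalar stage weights are ours, not the paper's implementation.

```python
import numpy as np

def cascaded_score_blend(scores, weights):
    """Combine score estimates from multiple cascade stages.

    `scores` is a list of (H_i, W_i) arrays, one per diffusion stage,
    ordered coarse to fine; `weights` sets each stage's influence.
    Assumes each coarse resolution evenly divides the finest one.
    """
    target = scores[-1].shape  # upsample everything to the finest stage
    total = np.zeros(target)
    for s, w in zip(scores, weights):
        # nearest-neighbor upsample of a coarser score map
        ry = target[0] // s.shape[0]
        rx = target[1] // s.shape[1]
        up = np.repeat(np.repeat(s, ry, axis=0), rx, axis=1)
        total += w * up
    return total / sum(weights)
```

With equal weights, a coarse stage contributes a low-frequency, globally consistent signal while the fine stage adds detail, which is the trade-off the abstract describes between global understanding and granularity.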
Award ID(s):
2304481
PAR ID:
10572455
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-5300-6
Page Range / eLocation ID:
4473 to 4483
Format(s):
Medium: X
Location:
Seattle, WA, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. We provide an approach to reconstruct spatiotemporal 3D models of aging objects such as fruit containing time-varying shape and appearance using multi-view time-lapse videos captured by a microenvironment of Raspberry Pi cameras. Our approach represents the 3D structure of the object prior to aging using a static 3D mesh reconstructed from multiple photographs of the object captured using a rotating camera track. We manually align the 3D mesh to the images at the first time instant. Our approach automatically deforms the aligned 3D mesh to match the object across the multi-viewpoint time-lapse videos. We texture map the deformed 3D meshes with intensities from the frames at each time instant to create the spatiotemporal 3D model of the object. Our results reveal the time dependence of volume loss due to transpiration and color transformation due to enzymatic browning on banana peels and in exposed parts of bitten fruit. 
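Measuring volume loss from the deformed, texture-mapped meshes at each time instant amounts to computing the enclosed volume of a triangle mesh. A minimal sketch using the standard signed-tetrahedra (divergence theorem) formula; the function name and triangle-soup input format are our assumptions, not the authors' code.

```python
import numpy as np

def mesh_volume(vertices, faces):
    """Enclosed volume of a closed, consistently oriented triangle mesh.

    `vertices` is (V, 3); `faces` is (F, 3) vertex indices.
    Each triangle forms a signed tetrahedron with the origin.
    """
    v = vertices[faces]  # (F, 3, 3): the three corners of each face
    signed = np.einsum('ij,ij->i', v[:, 0], np.cross(v[:, 1], v[:, 2]))
    return np.abs(signed.sum()) / 6.0
```

Evaluating this per time instant on the deformed meshes yields a volume-versus-time curve of the kind the abstract attributes to transpiration.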
  2. 3D CT point clouds reconstructed from the original CT images are naturally represented in real-world coordinates. Compared with CT images, 3D CT point clouds contain invariant geometric features with irregular spatial distributions from multiple viewpoints. This paper rethinks pulmonary nodule detection in CT point cloud representations. We first extract multi-view features from a sparse convolutional (SparseConv) encoder by rotating the point clouds through different angles in the world coordinate system. Then, to simultaneously learn discriminative and robust spatial features from various viewpoints, a nodule proposal optimization schema is proposed to obtain coarse nodule regions by aggregating consistent nodule proposal predictions from the multi-view features. Last, the multi-level features and semantic segmentation features extracted from a SparseConv decoder are concatenated with the multi-view features for final nodule region regression. Experiments on the benchmark dataset (LUNA16) demonstrate the feasibility of applying CT point clouds to the lung nodule detection task. Furthermore, we observe that combining multi-view predictions greatly improves the performance of the proposed framework compared to a single view, while the interior texture features of nodules from images are more suitable for detecting small nodules. 
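The multi-view step, generating several views of one CT point cloud by rotating it in world coordinates before encoding each copy, reduces to applying rotation matrices at evenly spaced angles. A minimal sketch; the helper names and the choice of the z-axis are our assumptions for illustration.

```python
import numpy as np

def rotate_z(points, angle):
    """Rotate an (N, 3) point cloud about the world z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return points @ R.T

def multi_view_points(points, n_views):
    """Produce n_views rotated copies at evenly spaced angles."""
    return [rotate_z(points, 2 * np.pi * k / n_views) for k in range(n_views)]
```

Each rotated copy would then be fed through the shared SparseConv encoder, and per-view proposals aggregated as the abstract describes.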
  3. We present 3D Highlighter, a technique for localizing semantic regions on a mesh using text as input. A key feature of our system is the ability to interpret “out-of-domain” localizations. Our system demonstrates the ability to reason about where to place non-obviously related concepts on an input 3D shape, such as adding clothing to a bare 3D animal model. Our method contextualizes the text description using a neural field and colors the corresponding region of the shape using a probability-weighted blend. Our neural optimization is guided by a pre-trained CLIP encoder, which bypasses the need for any 3D datasets or 3D annotations. Thus, 3D Highlighter is highly flexible, general, and capable of producing localizations on a myriad of input shapes. Our code is publicly available at https://github.com/threedle/3DHighlighter. 
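The probability-weighted blend that 3D Highlighter uses to color the localized region is a per-vertex linear interpolation between a highlight color and the base shape color, weighted by the predicted localization probability. A minimal sketch under our own naming; `blend_colors` is illustrative, not the released API.

```python
import numpy as np

def blend_colors(probs, highlight, base):
    """Per-vertex color = p * highlight + (1 - p) * base.

    `probs` is (N,) in [0, 1]; `highlight` and `base` are RGB triples.
    """
    p = probs[:, None]  # broadcast over the color channels
    return p * highlight + (1 - p) * base
```

Because the blend is differentiable in `probs`, a CLIP-guided loss on renderings can be backpropagated into the localization network, which is what makes the optimization in the abstract possible without 3D annotations.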
  4. We present iSeg, a new interactive technique for segmenting 3D shapes. Previous works have focused mainly on leveraging pre-trained 2D foundation models for 3D segmentation based on text. However, text may be insufficient for accurately describing fine-grained spatial segmentations. Moreover, achieving a consistent 3D segmentation using a 2D model is highly challenging, since occluded areas of the same semantic region may not be visible together from any 2D view. Thus, we design a segmentation method conditioned on fine user clicks, which operates entirely in 3D. Our system accepts user clicks directly on the shape's surface, indicating the inclusion or exclusion of regions from the desired shape partition. To accommodate various click settings, we propose a novel interactive attention module capable of processing different numbers and types of clicks, enabling the training of a single unified interactive segmentation model. We apply iSeg to a myriad of shapes from different domains, demonstrating its versatility and faithfulness to the user's specifications. Our project page is at https://threedle.github.io/iSeg/. 
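The key property of an interactive attention module that handles "different numbers and types of clicks" is that attention pools over a variable-length set: each surface point attends over however many click embeddings exist. A minimal cross-attention sketch; the function name, the sign encoding of include/exclude clicks, and the raw-feature values are our assumptions, not iSeg's architecture.

```python
import numpy as np

def click_attention(point_feats, click_feats, click_signs):
    """Cross-attention from surface points to user clicks.

    `point_feats` is (N, d), `click_feats` is (M, d) for any M,
    `click_signs` is (M,) with +1 for include and -1 for exclude clicks.
    """
    d = point_feats.shape[1]
    logits = point_feats @ click_feats.T / np.sqrt(d)  # (N, M)
    # row-wise softmax over the clicks (numerically stabilized)
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    # signed values let exclusion clicks push features the other way
    return w @ (click_feats * click_signs[:, None])
```

Because the softmax normalizes over whatever M happens to be, one set of weights serves every click configuration, which is why a single unified model can be trained.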
  5. Frequency-modulated continuous wave (FMCW) light detection and ranging (LiDAR) is an emerging 3D ranging technology that offers high sensitivity and ranging precision. Due to the limited bandwidth of digitizers and the speed limitations of beam steering using mechanical scanners, meter-scale FMCW LiDAR systems typically suffer from a low 3D frame rate, which greatly restricts their applications in real-time imaging of dynamic scenes. In this work, we report a high-speed FMCW-based 3D imaging system, combining a grating for beam steering with a compressed time-frequency analysis approach for depth retrieval. We thoroughly investigate the localization accuracy and precision of our system both theoretically and experimentally. Finally, we demonstrate 3D imaging results of multiple static and moving objects, including a flexing human hand. The demonstrated technique achieves submillimeter localization accuracy over a tens-of-centimeter imaging range with an overall depth voxel acquisition rate of 7.6 MHz, enabling densely sampled 3D imaging at video rate. 
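Depth retrieval in FMCW ranging rests on a standard relation: the beat frequency between the transmitted and returned chirps is proportional to target range, R = c · f_beat · T / (2B), where B is the chirp bandwidth and T its duration. A minimal sketch of that textbook formula (the function name and example parameter values are ours, not the paper's system parameters):

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def fmcw_range(f_beat, chirp_bandwidth, chirp_duration):
    """Target range from the FMCW beat frequency.

    f_beat in Hz, chirp_bandwidth in Hz, chirp_duration in s;
    returns range in meters. Assumes a linear chirp and ignores
    the Doppler term for a static target.
    """
    return C * f_beat * chirp_duration / (2.0 * chirp_bandwidth)
```

For example, a 1 MHz beat with a 1 GHz chirp swept over 10 µs corresponds to roughly 1.5 m, which illustrates why digitizer bandwidth bounds the usable range at a given frame rate.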