Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Neural image representations offer the possibility of high fidelity, compact storage, and resolution-independent accuracy, providing an attractive alternative to traditional pixel- and grid-based representations. However, coordinate neural networks fail to capture discontinuities present in the image and tend to blur across them; we aim to address this challenge. In many cases, such as rendered images, vector graphics, diffusion curves, or solutions to partial differential equations, the locations of the discontinuities are known. We take those locations as input, represented as linear, quadratic, or cubic Bézier curves, and construct a feature field that is discontinuous across these locations and smooth everywhere else. Finally, we use a shallow multi-layer perceptron to decode the features into the signal value. To construct the feature field, we develop a new data structure based on a curved triangular mesh, with features stored on the vertices and on a subset of the edges that are marked as discontinuous. We show that our method can be used to compress a 100, 0002-pixel rendered image into a 25MB file; can be used as a new diffusion-curve solver by combining with Monte-Carlo-based methods or directly supervised by the diffusion-curve energy; or can be used for compressing 2D physics simulation data.more » « less
-
Separating an image into meaningful underlying components is a crucial first step for both editing and understanding images. We present a method capable of selecting the regions of a photograph exhibiting the same material as an artist-chosen area. Our proposed approach is robust to shading, specular highlights, and cast shadows, enabling selection in real images. As we do not rely on semantic segmentation (different woods or metal should not be selected together), we formulate the problem as a similarity-based grouping problem based on a user-provided image location. In particular, we propose to leverage the unsupervised DINO [Caron et al. 2021] features coupled with a proposed Cross-Similarity Feature Weighting module and an MLP head to extract material similarities in an image. We train our model on a new synthetic image dataset, that we release. We show that our method generalizes well to real-world images. We carefully analyze our model's behavior on varying material properties and lighting. Additionally, we evaluate it against a hand-annotated benchmark of 50 real photographs. We further demonstrate our model on a set of applications, including material editing, in-video selection, and retrieval of object photographs with similar materials.more » « less
-
We introduce the task of spotting temporally precise, fine-grained events in video (detecting the precise moment in time events occur). Precise spotting requires models to reason globally about the full-time scale of actions and locally to identify subtle frame-to-frame appearance and motion differences that identify events during these actions. Surprisingly, we find that top performing solutions to prior video understanding tasks such as action detection and segmentation do not simultaneously meet both requirements. In response, we propose E2E-Spot, a compact, end-to-end model that performs well on the precise spotting task and can be trained quickly on a single GPU. We demonstrate that E2E-Spot significantly outperforms recent baselines adapted from the video action detection, segmentation, and spotting literature to the precise spotting task. Finally, we contribute new annotations and splits to several fine-grained sports action datasets to make these datasets suitable for future work on precise spotting.more » « less
An official website of the United States government
