Creators/Authors contains: "Huang, Qixing"

  1. Free, publicly-accessible full text available December 10, 2024
  2. We study how to optimize the latent space of neural shape generators that map latent codes to 3D deformable shapes. The key idea is to view a deformable shape generator from a differential geometry perspective. We define a Riemannian metric based on as-rigid-as-possible and as-conformal-as-possible deformation energies. Under this metric, we study two desired properties of the latent space: 1) straight-line interpolations in latent codes follow geodesic curves; 2) latent codes disentangle pose and shape variations at different scales. However, the geometric interpolation property can be enforced strictly only if the metric matrix is constant. We show how to achieve this property approximately by enforcing that geodesic interpolations are axis-aligned, i.e., interpolations along the coordinate axes follow geodesic curves. In addition, we introduce a novel approach that decouples pose and shape variations via generalized eigendecomposition. We also study efficient regularization terms for learning deformable shape generators, e.g., terms that promote smooth interpolations. Experimental results on benchmark datasets show that our approach leads to interpretable latent codes, improves the generalizability of synthetic shapes, and enhances performance in geodesic interpolation and geodesic shooting.

    Free, publicly-accessible full text available December 5, 2024
  3. High-quality large-scale scene rendering requires a scalable representation and accurate camera poses. This research combines tile-based hybrid neural fields with parallel distributive optimization to improve bundle-adjusting neural radiance fields. The proposed method scales with a divide-and-conquer strategy. We partition scenes into tiles, each with a multi-resolution hash feature grid and shallow chained diffuse and specular multilayer perceptrons (MLPs). Tiles unify foreground and background via a spatial contraction function that accommodates both distant objects in outdoor scenes and planar reflections as virtual images outside the tile. Decomposing appearance with the specular MLP allows a specular-aware warping loss to provide a second optimization path for camera poses. We apply the alternating direction method of multipliers (ADMM) to achieve consensus among camera poses while maintaining parallel tile optimization. Experimental results show that our method outperforms state-of-the-art neural scene rendering methods by 5% to 10% in PSNR, maintaining sharp distant objects and view-dependent reflections across six indoor and outdoor scenes.

    Free, publicly-accessible full text available December 5, 2024
  4. Free, publicly-accessible full text available July 1, 2024
  5. Human skeleton-based action recognition offers a valuable means to understand the intricacies of human behavior because it can handle the complex relationships between physical constraints and intention. Although several studies have focused on encoding a skeleton, less attention has been paid to embedding this information into the latent representations of human action. We propose InfoGCN, a learning framework for action recognition that combines a novel learning objective and an encoding method. First, we design an information bottleneck-based learning objective to guide the model to learn informative but compact latent representations. To provide discriminative information for classifying actions, we introduce attention-based graph convolution that captures the context-dependent intrinsic topology of human action. In addition, we present a multi-modal representation of the skeleton using the relative position of joints, designed to provide complementary spatial information for joints. InfoGCN, for which code is available, surpasses the known state of the art on multiple skeleton-based action recognition benchmarks, with accuracies of 93.0% on the NTU RGB+D 60 cross-subject split, 89.8% on the NTU RGB+D 120 cross-subject split, and 97.0% on NW-UCLA.
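As a rough illustration of the generalized eigendecomposition mentioned in the deformable shape generator abstract above, the following NumPy sketch solves A v = λ B v for two symmetric metric matrices on a small latent space. The names `generalized_eigh`, `G_arap`, and `G_acap`, the random stand-in Jacobians, and the small ridge term are assumptions made for illustration only. The abstract does not say how the two metrics are assembled or how the resulting directions are assigned to pose or shape, so this shows only the linear-algebra step.

```python
import numpy as np

def generalized_eigh(A, B):
    """Solve A v = lam * B v for symmetric A and symmetric positive-definite B.

    Uses Cholesky whitening (B = L L^T) to reduce the problem to a
    standard symmetric eigenproblem.
    """
    L = np.linalg.cholesky(B)
    L_inv = np.linalg.inv(L)
    C = L_inv @ A @ L_inv.T
    C = 0.5 * (C + C.T)               # remove round-off asymmetry
    lam, U = np.linalg.eigh(C)        # eigenvalues in ascending order
    V = L_inv.T @ U                   # map back: v = L^{-T} u
    return lam, V

# Hypothetical stand-ins for the two latent-space metrics (assumption):
# each is J^T J for a random "Jacobian" J, so both are symmetric PSD.
rng = np.random.default_rng(0)
k = 4                                 # latent dimension (illustrative)
J_a = rng.standard_normal((12, k))
J_b = rng.standard_normal((12, k))
G_arap = J_a.T @ J_a
G_acap = J_b.T @ J_b + 1e-6 * np.eye(k)   # ridge keeps B positive definite

lam, V = generalized_eigh(G_arap, G_acap)
# Each column v of V satisfies G_arap v = lam * G_acap v, and lam is the
# ratio (v^T G_arap v) / (v^T G_acap v) along that latent direction.
```

The columns of `V` are orthonormal with respect to `G_acap`, so sorting by `lam` orders latent directions by how much the first energy changes relative to the second.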
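The consensus step described in the large-scale scene rendering abstract above can be sketched with textbook consensus ADMM. This is a minimal sketch under strong assumptions, not the authors' implementation: each tile's local objective is replaced by a toy weighted quadratic, camera poses are treated as flat vectors rather than elements of SE(3), and the names `consensus_admm`, `targets`, `weights`, and `rho` are hypothetical.

```python
import numpy as np

def consensus_admm(targets, weights, rho=1.0, iters=500):
    """Consensus ADMM for f_i(x) = 0.5 * w_i * ||x - a_i||^2.

    Each row of `targets` is one tile's preferred value a_i for a shared
    variable (a toy stand-in for a camera pose). Tiles update their local
    copies x_i independently; z is the consensus value they agree on.
    """
    targets = np.asarray(targets, dtype=float)           # (n_tiles, d)
    w = np.asarray(weights, dtype=float)[:, None]        # (n_tiles, 1)
    x = np.zeros_like(targets)                           # local copies
    u = np.zeros_like(targets)                           # scaled duals
    z = np.zeros(targets.shape[1])                       # consensus
    for _ in range(iters):
        # Local step (parallel over tiles): closed-form minimizer of
        # f_i(x_i) + (rho / 2) * ||x_i - z + u_i||^2.
        x = (w * targets + rho * (z - u)) / (w + rho)
        # Consensus step: average of local copies plus duals.
        z = (x + u).mean(axis=0)
        # Dual step: accumulate each tile's disagreement with z.
        u = u + x - z
    return z, x

# Four hypothetical tiles observing the same 3-parameter quantity.
targets = np.array([[1.0, 0.0, 2.0],
                    [1.2, 0.1, 1.9],
                    [0.9, -0.1, 2.1],
                    [1.1, 0.0, 2.0]])
weights = np.array([0.5, 1.0, 2.0, 1.5])
z, x = consensus_admm(targets, weights)
```

For this toy objective the consensus value converges to the weighted mean of the per-tile targets, which makes the sketch easy to check. In a real system the local step would be a few optimizer iterations on each tile's rendering loss.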
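To make the attention-based graph convolution in the skeleton action recognition abstract above more concrete, here is a generic single-layer sketch in NumPy. It is an assumption-laden illustration rather than InfoGCN's actual layer: the function name `attention_graph_conv`, the way the fixed skeleton adjacency is added to the attention map, the five-joint chain, and all weight shapes are made up for the example.

```python
import numpy as np

def softmax(a, axis=-1):
    """Numerically stable softmax along one axis."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def attention_graph_conv(X, A_skel, Wq, Wk, Wv):
    """One graph-convolution layer with a data-dependent adjacency.

    X      : (V, C) per-joint features
    A_skel : (V, V) fixed skeleton adjacency, including self-loops
    Returns the (V, C_out) output features and the (V, V) attention map.
    """
    Q, K = X @ Wq, X @ Wk
    # Scaled dot-product attention between joints: a context-dependent
    # topology inferred from the current features.
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[1]), axis=-1)
    # Row-normalize the fixed skeleton graph so each row sums to one.
    A_norm = A_skel / A_skel.sum(axis=1, keepdims=True)
    # Aggregate projected features over both graphs.
    out = (A_norm + attn) @ (X @ Wv)
    return out, attn

# Hypothetical 5-joint chain skeleton with self-loops.
V = 5
A_skel = np.eye(V) + np.eye(V, k=1) + np.eye(V, k=-1)

rng = np.random.default_rng(0)
X = rng.standard_normal((V, 3))            # e.g. 3D joint coordinates
Wq = rng.standard_normal((3, 4))
Wk = rng.standard_normal((3, 4))
Wv = rng.standard_normal((3, 8))

out, attn = attention_graph_conv(X, A_skel, Wq, Wk, Wv)
```

Each row of `attn` is a probability distribution over joints, so a joint can draw on joints that are not physically connected to it, while `A_skel` keeps the anatomical links in the aggregation.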