NSF PAR Search | NSF Public Access Repository

Use of Motion Capture Technology to Study Extrinsic Laryngeal Muscle Tension and Hyperfunction

https://doi.org/10.1002/lary.30829

Hogue, Steven; Guo, Xiaohu; Morrison, Robert A.; McDowell, Sarah; Shembel, Adrianna C. (June 2023, The Laryngoscope)

Objectives: Patients with primary muscle tension dysphonia (pMTD) commonly report paralaryngeal pain and discomfort, and extrinsic laryngeal muscle (ELM) tension and hyperfunction are commonly implicated. However, quantitative physiological metrics to study ELM movement patterns for the characterization of pMTD diagnosis and monitoring of treatment progress are lacking. The objectives of this study were to validate motion capture (MoCap) technology to study ELM kinematics, determine whether MoCap could distinguish ELM tension and hyperfunction between individuals with and without pMTD, and investigate relationships between common clinical voice metrics and ELM kinematics.
Methods: Thirty subjects (15 with pMTD and 15 controls) were recruited for the study. Sixteen markers were placed on different anatomical landmarks on the chin and anterior neck. Movements across these regions were tracked during four voice and speech tasks using two three‐dimensional cameras. Movement displacement and variability were determined based on 16 key‐points and 53 edges.
Results: Intraclass correlation coefficients demonstrated high intra‐ and inter‐rater reliability (p's < 0.001). Other than greater movement displacements around the thyrohyoid space during longer phrasing (reading passage, 30‐s diadochokinetics) and more movement variability in patients with pMTD, kinematic patterns between groups were similar across the 53 edges for the four voice and speech tasks. There were also no significant correlations between ELM kinematics and standard voice metrics.
Conclusion: Results demonstrate the feasibility and reliability of MoCap for the study of ELM kinematics.
Level of Evidence: 3. Laryngoscope, 133:3472–3481, 2023
MusicFace: Music-driven expressive singing face synthesis

https://doi.org/10.1007/s41095-023-0343-7

Liu, Pengfei; Deng, Wenjin; Li, Hengda; Wang, Jintai; Zheng, Yinglin; Ding, Yiwei; Guo, Xiaohu; Zeng, Ming (February 2024, Computational Visual Media)

It remains an interesting and challenging problem to synthesize a vivid and realistic singing face driven by music. In this paper, we present a method for this task with natural motions for the lips, facial expression, head pose, and eyes. Due to the coupling of mixed information for the human voice and backing music in common music audio signals, we design a decouple-and-fuse strategy to tackle the challenge. We first decompose the input music audio into a human voice stream and a backing music stream. Due to the implicit and complicated correlation between the two-stream input signals and the dynamics of the facial expressions, head motions, and eye states, we model their relationship with an attention scheme, where the effects of the two streams are fused seamlessly. Furthermore, to improve the expressiveness of the generated results, we decompose head movement generation in terms of speed and direction, and decompose eye state generation into short-term blinking and long-term eye closing, modeling them separately. We have also built a novel dataset, SingingFace, to support training and evaluation of models for this task, including future work on this topic. Extensive experiments and a user study show that our proposed method is capable of synthesizing vivid singing faces, qualitatively and quantitatively better than the prior state-of-the-art.
Full Text Available
MATTopo: Topology-preserving Medial Axis Transform with Restricted Power Diagram

https://doi.org/10.1145/3687763

Wang, Ningna; Huang, Hui; Song, Shibo; Wang, Bin; Wang, Wenping; Guo, Xiaohu (December 2024, ACM Transactions on Graphics)

We present a novel topology-preserving 3D medial axis computation framework based on volumetric restricted power diagram (RPD), while preserving the medial features and geometric convergence simultaneously, for both 3D CAD and organic shapes. The volumetric RPD discretizes the input 3D volume into sub-regions given a set of medial spheres. With this intermediate structure, we convert the homotopy equivalency between the generated medial mesh and the input 3D shape into a localized contractibility checking for each restricted element (power cell, power face, power edge), by checking their connected components and Euler characteristics. We further propose a fractional Euler characteristic algorithm for efficient GPU-based computation of Euler characteristic for each restricted element on the fly while computing the volumetric RPD. Compared with existing voxel-based or point-cloud-based methods, our approach is the first to adaptively and directly revise the medial mesh without globally modifying the dependent structure, such as voxel size or sampling density, while preserving its topology and medial features. In comparison with the feature preservation method MATFP [Wang et al. 2022], our method provides geometrically comparable results with fewer spheres and more robustly captures the topology of the input 3D shape.
Free, publicly-accessible full text available December 19, 2025
NASM: Neural Anisotropic Surface Meshing

https://doi.org/10.1145/3680528.3687700

Li, Hongbo; Zhu, Haikuan; Zhong, Sikai; Wang, Ningna; Lin, Cheng; Guo, Xiaohu; Xin, Shiqing; Wang, Wenping; Hua, Jing; Zhong, Zichun (December 2024, ACM)

Free, publicly-accessible full text available December 3, 2025
SurgicalGaussian: Deformable 3D Gaussians for High-Fidelity Surgical Scene Reconstruction

Xie, Weixing; Yao, Junfeng; Cao, Xianpeng; Lin, Qiqin; Tang, Zerui; Dong, Xiao; Guo, Xiaohu (October 2024, Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. Lecture Notes in Computer Science, vol 15006. Springer.)

Full Text Available
CWF: Consolidating Weak Features in High-quality Mesh Simplification

https://doi.org/10.1145/3658159

Xu, Rui; Liu, Longdu; Wang, Ningna; Chen, Shuangmin; Xin, Shiqing; Guo, Xiaohu; Zhong, Zichun; Komura, Taku; Wang, Wenping; Tu, Changhe (July 2024, ACM Transactions on Graphics)

In mesh simplification, common requirements like accuracy, triangle quality, and feature alignment are often considered as a trade-off. Existing algorithms concentrate on just one or a few specific aspects of these requirements. For example, the well-known Quadric Error Metrics (QEM) approach [Garland and Heckbert 1997] prioritizes accuracy and can preserve strong feature lines/points as well, but falls short in ensuring high triangle quality and may degrade weak features that are not as distinctive as strong ones. In this paper, we propose a smooth functional that simultaneously considers all of these requirements. The functional comprises a normal anisotropy term and a Centroidal Voronoi Tessellation (CVT) [Du et al. 1999] energy term, with the variables being a set of movable points lying on the surface. The former inherits the spirit of QEM but operates in a continuous setting, while the latter encourages even point distribution, allowing various surface metrics. We further introduce a decaying weight to automatically balance the two terms. We selected 100 CAD models from the ABC dataset [Koch et al. 2019], along with 21 organic models, to compare the existing mesh simplification algorithms with ours. Experimental results reveal an important observation: the introduction of a decaying weight effectively reduces the conflict between the two terms and enables the alignment of weak features. This distinctive feature sets our approach apart from most existing mesh simplification methods and demonstrates significant potential in shape understanding. Please refer to the teaser figure for illustration.
Full Text Available
DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures

https://doi.org/10.1109/CVPRW63382.2024.00198

Hogue, Steven; Zhang, Chenxu; Daruger, Hamza; Tian, Yapeng; Guo, Xiaohu (June 2024, IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW))

Full Text Available
DRSM: Efficient Neural 4D Decomposition for Dynamic Reconstruction in Stationary Monocular Cameras

https://doi.org/10.1109/ICASSP48485.2024.10447270

Xie, Weixing; Dong, Xiao; Yang, Yong; Lin, Qiqin; Chen, Jingze; Yao, Junfeng; Guo, Xiaohu (April 2024, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))

Full Text Available
DR²: Disentangled Recurrent Representation Learning for Data-efficient Speech Video Synthesis

https://doi.org/10.1109/WACV57701.2024.00609

Zhang, Chenxu; Wang, Chao; Zhao, Yifan; Cheng, Shuo; Luo, Linjie; Guo, Xiaohu (January 2024, IEEE/CVF Winter Conference on Applications of Computer Vision (WACV))

Full Text Available
Computational Design of Wiring Layout on Tight Suits with Minimal Motion Resistance

https://doi.org/10.1145/3610548.3618200

Wang, Kai; Xu, Xiaoyu; Zheng, Yinping; Zhou, Da; Guo, Shihui; Qin, Yipeng; Guo, Xiaohu (December 2023, ACM SIGGRAPH Asia 2023 Conference Papers)

Full Text Available
