NSF PAR Search | NSF Public Access Repository

FreqMixFormerV2: Lightweight Frequency-aware Mixed Transformer for Human Skeleton Action Recognition

Wu, Wenhan; Wang, Pengfei; Chen, Chen; Lu, Aidong (January 2025, IEEE International Conference on Automatic Face and Gesture Recognition)

Transformer-based human skeleton action recognition has been studied for years. However, the complexity and high parameter counts of these models hinder their practical application, especially in resource-constrained environments. In this work, we propose FreqMixFormerV2, which builds upon the Frequency-aware Mixed Transformer (FreqMixFormer) and its pioneering frequency-domain analysis for identifying subtle and discriminative actions. We design a lightweight architecture that maintains robust performance while significantly reducing model complexity. This is achieved through a redesigned frequency operator that optimizes high-frequency and low-frequency parameter adjustments, and a simplified frequency-aware attention module. These improvements result in a substantial reduction in model parameters, enabling efficient deployment with only a minimal sacrifice in accuracy. Comprehensive evaluations on standard datasets (NTU RGB+D, NTU RGB+D 120, and NW-UCLA) demonstrate that the proposed model achieves a superior balance between efficiency and accuracy, outperforming state-of-the-art methods with only 60% of the parameters.
Free, publicly-accessible full text available January 30, 2026
Frequency Guidance Matters: Skeletal Action Recognition by Frequency-Aware Mixed Transformer

Wu, Wenhan; Zheng, Ce; Yang, Zihao; Chen, Chen; Das, Srijan; Lu, Aidong (July 2024, ACM Multimedia)

Recently, transformers have demonstrated great potential for modeling long-term dependencies in skeleton sequences and have thereby gained ever-increasing attention in skeleton action recognition. However, existing transformer-based approaches rely heavily on the naive attention mechanism to capture spatiotemporal features, which falls short in learning discriminative representations of actions that exhibit similar motion patterns. To address this challenge, we introduce the Frequency-aware Mixed Transformer (FreqMixFormer), specifically designed for recognizing similar skeletal actions with subtle discriminative motions. First, we introduce a frequency-aware attention module to unweave skeleton frequency representations by embedding joint features into frequency attention maps, aiming to distinguish the discriminative movements based on their frequency coefficients. Subsequently, we develop a mixed transformer architecture that combines spatial features with frequency features to model comprehensive frequency-spatial patterns. Additionally, a temporal transformer is proposed to extract the global correlations across frames. Extensive experiments show that FreqMixFormer outperforms the state of the art on three popular skeleton action recognition datasets: NTU RGB+D, NTU RGB+D 120, and NW-UCLA. Our project is publicly available at: https://github.com/wenhanwu95/FreqMixFormer.
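As a rough illustration of the "frequency coefficients" mentioned in the abstract above, the sketch below transforms one joint coordinate over time into the frequency domain and splits the result into low and high bands. It assumes a discrete cosine transform (DCT-II), which the abstract does not name. The function names `dct_ii` and `split_bands`, the cutoff, and the sample trajectory are hypothetical and are not taken from the FreqMixFormer code.

```python
import math


def dct_ii(signal):
    """Unnormalized DCT-II of a 1-D sequence (assumed transform, not from the paper)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (t + 0.5) * k / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def split_bands(coeffs, cutoff):
    """Split coefficients into a low band (index < cutoff) and a high band."""
    return coeffs[:cutoff], coeffs[cutoff:]


# Hypothetical x-coordinate of one joint over 8 frames:
# a slow drift plus a small frame-to-frame jitter.
trajectory = [0.1 * t + 0.02 * (-1) ** t for t in range(8)]

coeffs = dct_ii(trajectory)
low, high = split_bands(coeffs, cutoff=4)
```

In this toy setup the low band mostly reflects the slow drift of the joint, while the jitter shows up in the high band, which is the kind of subtle motion cue a frequency-aware model might attend to.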
Full Text Available
A Review of Motion Data Privacy in Virtual Reality

Xu, Depeng; Wang, Weichao; Lu, Aidong (June 2024, IEEE International Conference on Meta Computing)

As the metaverse grows with advances in new technologies, a number of researchers have raised concerns about the privacy of motion data in virtual reality (VR). It is becoming clear that motion data can reveal essential information about people, such as user identity. However, fundamental questions about which types of motion data are involved, how they are processed, and which ranges of VR applications are affected remain underexplored. This work summarizes research on motion data privacy along these aspects from the fields of both VR and data privacy. Our results demonstrate that researchers from both fields have recognized the importance of the problem, although their approaches differ because of the problems each field focuses on. A variety of VR studies have been used for user identification, and the results are affected by the application types and ranges of involved actions. We also review the biometrics work from related fields, including the behaviors of keystrokes and waist as well as data of skeleton, face, and fingerprint. Finally, we discuss our findings and suggest future work to protect the privacy of motion data.
Full Text Available
Part Aware Contrastive Learning for Self-Supervised Action Recognition

Yilei Hua, Wenhan Wu (October 2023, The 32nd International Joint Conference on Artificial Intelligence)

In recent years, remarkable results have been achieved in self-supervised action recognition using skeleton sequences with contrastive learning. It has been observed that the semantic distinction of human action features is often represented by local body parts, such as legs or hands, which are advantageous for skeleton-based action recognition. This paper proposes an attention-based contrastive learning framework for skeleton representation learning, called SkeAttnCLR, which integrates local similarity and global features for skeleton-based action representations. To achieve this, a multi-head attention mask module is employed to learn the soft attention mask features from the skeletons, suppressing non-salient local features while accentuating local salient features, thereby bringing similar local features closer in the feature space. Additionally, ample contrastive pairs are generated by expanding contrastive pairs based on salient and non-salient features with global features, which guide the network to learn the semantic representations of the entire skeleton. Therefore, with the attention mask mechanism, SkeAttnCLR learns local features under different data augmentation views. The experiment results demonstrate that the inclusion of local feature similarity significantly enhances skeleton-based action representation. Our proposed SkeAttnCLR outperforms state-of-the-art methods on NTURGB+D, NTU120-RGB+D, and PKU-MMD datasets. The code and settings are available at this repository: https://github.com/GitHubOfHyl97/SkeAttnCLR.
Full Text Available
SkeletonMAE: Spatial-Temporal Masked Autoencoders for Self-supervised Skeleton Action Recognition

Wenhan Wu, Yilei Hua (July 2023, Workshop in International Conference on Multimedia and Expo (ICME), 2023)

Self-supervised skeleton-based action recognition has attracted increasing attention in recent years. By utilizing unlabeled data, more generalizable features can be learned to alleviate the overfitting problem and reduce the demand for massive labeled training data. Inspired by MAE [1], we propose a spatial-temporal masked autoencoder framework for self-supervised 3D skeleton-based action recognition (SkeletonMAE). Following MAE's masking and reconstruction pipeline, we utilize a skeleton-based encoder-decoder transformer architecture to reconstruct the masked skeleton sequences. A novel masking strategy, named Spatial-Temporal Masking, is applied to the skeleton sequence at both the joint level and the frame level. This pre-training strategy makes the encoder output generalizable skeleton features with spatial and temporal dependencies. Given the unmasked skeleton sequence, the encoder is fine-tuned for the action recognition task. Extensive experiments show that our SkeletonMAE achieves remarkable performance and outperforms the state-of-the-art methods on both the NTU RGB+D 60 and NTU RGB+D 120 datasets.
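The joint-level and frame-level masking described above can be pictured with a small sketch. It is only one possible reading of the abstract, under stated assumptions: whole frames and whole joints are chosen uniformly at random, and a position is masked if either its frame or its joint was chosen. The function name `spatial_temporal_mask`, the ratios, and the 10-frame, 25-joint shape are hypothetical and are not taken from the SkeletonMAE implementation.

```python
import random


def spatial_temporal_mask(num_frames, num_joints, frame_ratio, joint_ratio, seed=0):
    """Return a num_frames x num_joints grid of booleans; True means masked.

    Assumption: frame-level masking hides every joint in the chosen frames,
    and joint-level masking hides the chosen joints in every frame.
    """
    rng = random.Random(seed)
    masked_frames = set(rng.sample(range(num_frames), int(num_frames * frame_ratio)))
    masked_joints = set(rng.sample(range(num_joints), int(num_joints * joint_ratio)))
    return [
        [(t in masked_frames) or (j in masked_joints) for j in range(num_joints)]
        for t in range(num_frames)
    ]


# Hypothetical sequence: 10 frames, 25 joints, 40% of frames and 20% of joints masked.
mask = spatial_temporal_mask(10, 25, frame_ratio=0.4, joint_ratio=0.2)
masked_positions = sum(cell for row in mask for cell in row)
```

In an MAE-style pipeline, only the unmasked positions would be passed to the encoder, and the decoder would be trained to reconstruct the masked ones.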
Full Text Available
Protection of Network Security Selector Secrecy in Outsourced Network Testing

Sultan Alasmari, Weichao Wang (July 2023, The 32nd International Conference on Computer Communications and Networks (ICCCN 2023))

With the emergence and fast development of cloud computing and outsourced services, more and more companies are starting to use managed security service providers (MSSP) as their security service team. This approach saves the cost of maintaining an in-house security team and relies on security professionals to protect the company's infrastructure and intellectual property. However, it also gives the MSSP the opportunity to honor only a part of the security service level agreement. To prevent this from happening, researchers propose to use outsourced network testing to verify the execution of the security policies. During this procedure, the end customer has to design network testing traffic and provide it to the testers. Since the testing traffic is designed based on the security rules and selectors, external testers could derive the customer's network security setup and conduct subsequent attacks based on the learned knowledge. To protect the secrecy of the network security configuration in outsourced testing, in this paper we propose different methods to hide the accurate information. For regex-based security selectors, we propose to introduce fake testing traffic to confuse the testers. For exact-match and range-based selectors, we propose to use a NAT VM to hide the accurate information. We conduct simulations to show the protection effectiveness under different scenarios. We also discuss the advantages of our approaches and the potential challenges.
Full Text Available
Linkage Attack on Skeleton-based Motion Visualization

Thomas Carr, Aidong Lu (January 2023, ACM International Conference on Information and Knowledge Management)

Skeleton-based motion capture and visualization is an important computer vision task, especially in the virtual reality (VR) environment. It has grown increasingly popular due to the ease of gathering skeleton data and the high demand for virtual socialization. The captured skeleton data seems anonymous but can still be used to extract personally identifiable information (PII). This can lead to unintended privacy leakage inside a VR metaverse. We propose a novel linkage attack on skeleton-based motion visualization. It detects whether a target and a reference skeleton belong to the same individual. The proposed model, called Linkage Attack Neural Network (LAN), is based on the principles of a Siamese Network. It incorporates deep neural networks to embed the relevant PII and then uses a classifier to match the reference and target skeletons. We also employ classical and deep motion retargeting (MR) to cast the target skeleton onto a dummy skeleton such that the motion sequence is anonymized for privacy protection. Our evaluation shows the effectiveness of LAN in the linkage attack and the effectiveness of MR in anonymization. The source code is available at https://github.com/Thomasc33/Linkage-Attack
Full Text Available
PaWLA: PPG-based Weight Lifting Assessment

A B M Mohaimenur Rahman, Pu Wang (October 2022, IEEE International Performance Computing and Communications Conference (IPCCC))

Full Text Available
Toward cross-platform immersive visualization for indoor navigation and collaboration with augmented reality

https://doi.org/10.1007/s12650-022-00852-9

Ayyanchira, Akshay; Mahfoud, Elias; Wang, Weichao; Lu, Aidong (June 2022, Journal of Visualization)

Full Text Available
A Study of Real-time Information on User Behaviors during Search and Rescue (SAR) Training of Firefighters

https://doi.org/10.1109/VRW55335.2022.00085

Doroudian, Shahin; Wu, Zekun; Wang, Weichao; Galati, Alexia; Lu, Aidong (March 2022, IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW))

Full Text Available