Title: Enhancing Robotic Arm Activity Recognition with Vision Transformers and Wavelet-Transformed Channel State Information
Vision-based methods are commonly used in robotic arm activity recognition. These approaches typically rely on line-of-sight (LoS) and raise privacy concerns, particularly in smart home applications. Passive Wi-Fi sensing represents a new paradigm for recognizing human and robotic arm activities, utilizing channel state information (CSI) measurements to identify activities in indoor environments. In this paper, a novel machine learning approach based on the discrete wavelet transform and vision transformers is proposed for robotic arm activity recognition from CSI measurements in indoor settings. This method outperforms convolutional neural network (CNN) and long short-term memory (LSTM) models in robotic arm activity recognition, particularly when LoS is obstructed by barriers, without relying on external or internal sensors or visual aids. Experiments are conducted using four different data collection scenarios and four different robotic arm activities. Performance results demonstrate that the wavelet transform can significantly enhance the accuracy of vision transformer networks in robotic arm activity recognition.
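To make the wavelet step concrete, here is a minimal sketch of a single-level discrete wavelet transform applied along the time axis of a CSI amplitude matrix. The abstract does not name the wavelet family, the decomposition level, or the data layout, so the Haar wavelet, the (time samples x subcarriers) shape, and the synthetic data below are all assumptions made for illustration, not the authors' pipeline.

```python
import numpy as np


def haar_dwt(x):
    """Single-level Haar DWT along axis 0 (time).

    Returns (approximation, detail) coefficients, each with half as many
    rows as x. The Haar wavelet is an assumption; the source does not say
    which wavelet was used.
    """
    x = np.asarray(x, dtype=float)
    if x.shape[0] % 2 != 0:
        raise ValueError("number of time samples must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass: smoothed trend
    detail = (even - odd) / np.sqrt(2.0)  # high-pass: fast fluctuations
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt, used here only to check the transform."""
    even = (approx + detail) / np.sqrt(2.0)
    odd = (approx - detail) / np.sqrt(2.0)
    out = np.empty((2 * approx.shape[0],) + approx.shape[1:])
    out[0::2], out[1::2] = even, odd
    return out


# Hypothetical CSI amplitudes: 256 time samples x 64 subcarriers (synthetic).
rng = np.random.default_rng(0)
csi = rng.standard_normal((256, 64)) + 1j * rng.standard_normal((256, 64))
csi_amplitude = np.abs(csi)

approx, detail = haar_dwt(csi_amplitude)
print(approx.shape, detail.shape)
```

In a pipeline of the kind the abstract describes, coefficient maps like `approx` and `detail` could be arranged as image-like inputs to a vision transformer; how the paper actually arranges them is not stated here.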
Award ID(s): 2121121
PAR ID: 10536200
Publisher / Repository: IEEE
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Despite the current surge of interest in autonomous robotic systems, robot activity recognition within restricted indoor environments remains a formidable challenge. Conventional methods for detecting and recognizing robotic arms’ activities often rely on vision-based or light detection and ranging (LiDAR) sensors, which require line-of-sight (LoS) access and may raise privacy concerns, for example, in nursing facilities. This research pioneers an innovative approach harnessing channel state information (CSI) measured from WiFi signals, subtly influenced by the activity of robotic arms. We developed an attention-based network to classify eight distinct activities performed by a Franka Emika robotic arm in different situations. Our proposed bidirectional vision transformer-concatenated (BiVTC) methodology aims to predict robotic arm activities accurately, even when trained on activities with different velocities, all without dependency on external or internal sensors or visual aids. The high dependency of CSI data on the environment motivated us to study the problem of sniffer location selection by systematically changing the sniffer’s location and collecting different sets of data. Finally, this paper also marks the first publication of the CSI data of eight distinct robotic arm activities, collectively referred to as RoboFiSense. This initiative aims to provide a benchmark dataset and baselines to the research community, fostering advancements in the field of robotics sensing.
  2. Channel state information (CSI)-based fingerprinting via neural networks (NNs) is a promising approach to enable accurate indoor and outdoor positioning of user equipments (UEs), even under challenging propagation conditions. In this paper, we propose a positioning pipeline for wireless LAN MIMO-OFDM systems which uses uplink CSI measurements obtained from one or more unsynchronized access points (APs). For each AP receiver, novel features are first extracted from the CSI that are robust to system impairments arising in real-world transceivers. These features are the inputs to a NN that extracts a probability map indicating the likelihood of a UE being at a given grid point. The NN output is then fused across multiple APs to provide a final position estimate. We provide experimental results with real-world indoor measurements under line-of-sight (LoS) and non-LoS propagation conditions for an 80 MHz bandwidth IEEE 802.11ac system using a two-antenna transmit UE and two AP receivers each with four antennas. Our approach is shown to achieve centimeter-level median distance error, an order of magnitude improvement over a conventional baseline.
  3. Localization of wireless transmitters based on channel state information (CSI) fingerprinting finds widespread use in indoor as well as outdoor scenarios. Fingerprinting localization first builds a database containing CSI with measured location information. One then searches for the most similar CSI in this database to approximate the position of wireless transmitters. In this paper, we investigate the efficacy of locality-sensitive hashing (LSH) to reduce the complexity of the nearest-neighbor search (NNS) required by conventional fingerprinting localization systems. More specifically, we propose a low-complexity and memory-efficient LSH function based on the sum-to-one (STOne) transform and use approximate hash matches. We evaluate the accuracy and complexity (in terms of the number of searches and storage requirements) of our approach for line-of-sight (LoS) and non-LoS channels, and we show that LSH enables low-complexity fingerprinting localization with comparable accuracy to methods relying on exact NNS or deep neural networks.
  4. Recent channel state information (CSI)-based positioning pipelines rely on deep neural networks (DNNs) in order to learn a mapping from estimated CSI to position. Since real-world communication transceivers suffer from hardware impairments, CSI-based positioning systems typically rely on features that are designed by hand. In this paper, we propose a CSI-based positioning pipeline that directly takes raw CSI measurements and learns features using a structured DNN in order to generate probability maps describing the likelihood of the transmitter being at pre-defined grid points. To further improve the positioning accuracy of moving user equipments, we propose to fuse a time series of learned CSI features or a time series of probability maps. To demonstrate the efficacy of our methods, we perform experiments with real-world indoor line-of-sight (LoS) and non-LoS channel measurements. We show that CSI feature learning and time-series fusion can reduce the mean distance error by up to 2.5× compared to the state of the art.
  5. Driven by advances in machine learning and wireless techniques, considerable research effort has gone into human activity recognition (HAR). Although various deep learning algorithms can achieve high accuracy in recognizing human activities, existing works lack a theoretical performance upper bound, that is, the best accuracy achievable regardless of the HAR algorithm, limited only by influencing factors in wireless networks such as the indoor physical environment and the settings of the wireless sensing devices. Without an understanding of this upper bound, misconfiguring the influencing factors can drastically reduce HAR accuracy no matter which deep learning algorithm is used. In this paper, we propose an HAR performance upper bound, given by the minimum classification error probability, which does not depend on any HAR algorithm and can be considered a function of the influencing factors in wireless sensing networks for CSI-based human activity recognition. Since the performance upper bound captures the impact of the influencing factors on HAR accuracy, we further analyze the influence of those factors in varying situations, such as through-the-wall HAR and different human activities, using MATLAB simulations.
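The fingerprint lookup described in item 3 above can be illustrated with a small locality-sensitive hashing sketch. Note the substitution: item 3 builds its hash on the sum-to-one (STOne) transform, whereas this sketch uses generic random-hyperplane (sign) hashing, a different and simpler LSH family. The class name `HyperplaneLSH`, the 16-dimensional real-valued fingerprints, and the synthetic database are all hypothetical.

```python
from collections import defaultdict

import numpy as np


class HyperplaneLSH:
    """Random-hyperplane LSH index (an assumption; not the STOne-based hash)."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        # Each row is the normal vector of one random hyperplane.
        self.planes = rng.standard_normal((n_bits, dim))
        self.buckets = defaultdict(list)

    def _key(self, v):
        # One bit per hyperplane: which side of the plane v falls on.
        bits = (self.planes @ v) >= 0
        return bits.tobytes()

    def add(self, fingerprint, position):
        self.buckets[self._key(fingerprint)].append((fingerprint, position))

    def query(self, fingerprint):
        # Search only the matching bucket instead of the whole database.
        candidates = self.buckets.get(self._key(fingerprint), [])
        if not candidates:
            return None  # no hash match; a real system would need a fallback
        best = min(candidates, key=lambda c: np.linalg.norm(c[0] - fingerprint))
        return best[1]


# Hypothetical database: 100 fingerprints with known 2-D positions (synthetic).
rng = np.random.default_rng(1)
fingerprints = rng.standard_normal((100, 16))
positions = rng.uniform(0.0, 10.0, size=(100, 2))

index = HyperplaneLSH(dim=16)
for f, p in zip(fingerprints, positions):
    index.add(f, p)

estimate = index.query(fingerprints[7])
```

The trade-off is the one item 3 evaluates: each query compares against a single bucket rather than every stored fingerprint, at the cost of occasionally missing the true nearest neighbor when it hashes to a different bucket.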