
Title: Privacy-Preserving Motor Intent Classification via Feature Disentanglement
Recent studies have revealed that sensitive and private attributes can be decoded from surface electromyogram (sEMG) signals, which poses significant privacy threats to the users of sEMG applications. Most research so far has focused on improving the accuracy and reliability of sEMG models, and much less attention has been paid to their privacy. To fill this gap, this paper implements and evaluates a framework to optimize the sEMG-based data-sharing mechanism. Our primary goal is to remove sensitive attributes from the sEMG features before sharing them with primary tasks while maintaining data utility. We disentangle identity-invariant, task-relevant representations from the original sEMG features and share them with downstream pattern recognition tasks to reduce the chance of sensitive attributes being inferred by potential attackers. The proposed method was evaluated on data from twenty subjects, with training and testing data acquired 3-25 days apart. Experimental results show that the disentangled representations significantly lower the success rate of identity inference attacks compared to the original features and their sparse representations generated by state-of-the-art feature projection methods. Furthermore, the utility of the disentangled representations is evaluated in hand gesture recognition tasks, showing superior performance over other methods. This work shows that disentangled representations of sEMG signals are a promising solution for privacy-preserving applications.
Journal Name:
11th International IEEE EMBS Conference on Neural Engineering
Sponsoring Org:
National Science Foundation
More Like this
  1. Federated learning (FL) has been widely studied recently because it allows devices to collaboratively train a model without sharing their raw data. Nevertheless, recent studies show that an adversary can still infer private information about devices' data, e.g., sensitive attributes such as income, race, and sexual orientation. To mitigate attribute inference attacks, various existing privacy-preserving FL methods can be adopted or adapted. However, all these existing methods have key limitations: they need to know the FL task in advance, incur intolerable computational overheads or utility losses, or lack provable privacy guarantees. We address these issues and design a task-agnostic privacy-preserving representation learning method for FL (TAPPFL) against attribute inference attacks. TAPPFL is formulated via information theory. Specifically, TAPPFL has two mutual information goals: one learns task-agnostic data representations that contain the least information about the private attribute in each device's data, and the other ensures that the learnt data representations include as much information as possible about the device data to maintain FL utility. We also derive privacy guarantees of TAPPFL against worst-case attribute inference attacks, as well as the inherent tradeoff between utility preservation and privacy protection. Extensive results on multiple datasets and applications validate the effectiveness of TAPPFL in protecting data privacy, maintaining FL utility, and remaining efficient. Experimental results also show that TAPPFL outperforms the existing defenses.

  2. Security monitoring is crucial for maintaining a strong IT infrastructure by protecting against emerging threats, identifying vulnerabilities, and detecting potential points of failure. It involves deploying advanced tools to continuously monitor networks, systems, and configurations. However, organizations face challenges in adopting modern techniques like Machine Learning (ML) due to privacy and security risks associated with sharing internal data. Compliance with regulations like GDPR further complicates data sharing. To promote external knowledge sharing, a secure and privacy-preserving method for organizations to share data is necessary. Privacy-preserving data generation involves creating new data that maintains privacy while preserving key characteristics and properties of the original data, so that it is still useful in creating downstream models of attacks. Generative models, such as Generative Adversarial Networks (GANs), have been proposed as a solution for privacy-preserving synthetic data generation. However, standard GANs are limited in their capabilities to generate realistic system data. System data have inherent constraints, e.g., the list of legitimate IP addresses and port numbers is limited, and protocols dictate a valid sequence of network events. Standard generative models do not account for such constraints and do not utilize domain knowledge in their generation process. Additionally, they are limited by the attribute values present in the training data. This poses a major privacy risk, as sensitive discrete attribute values are repeated by GANs. To address these limitations, we propose a novel model for Knowledge Infused Privacy Preserving Data Generation. A privacy-preserving GAN is trained on system data to generate synthetic datasets that can replace original data for downstream tasks while protecting sensitive data. Knowledge from domain-specific knowledge graphs is used to guide the data generation process, check the validity of generated values, and enrich the dataset by diversifying the values of attributes. We specifically demonstrate this model by synthesizing network data captured by the network capture tool Wireshark. We establish that the synthetic dataset holds up to the constraints of the network-specific datasets and can replace the original dataset in downstream tasks.
  3. User authentication plays an important role in securing systems and devices by preventing unauthorized access. Although surface electromyogram (sEMG) has been widely applied in human-machine interface (HMI) applications, it has seen only very limited use for user authentication. In this paper, we investigate the use of multi-channel sEMG signals of hand gestures for user authentication. We propose a new deep anomaly detection-based user authentication method that employs sEMG images generated from multi-channel sEMG signals. The deep anomaly detection model takes sEMG images as input and classifies the user performing the hand gesture as client or imposter. Different sEMG image generation methods are studied in this paper. The performance of the proposed method is evaluated with a high-density hand gesture sEMG (HD-sEMG) dataset and a sparse-density hand gesture sEMG (SD-sEMG) dataset under three authentication test scenarios. Among the sEMG image generation methods, the root mean square (RMS) map achieves significantly better performance than the others. The proposed method with the RMS map also greatly outperforms the reference method, especially when using SD-sEMG signals. The results demonstrate the validity of the proposed method with the RMS map for user authentication.
  4. Big Data empowers the farming community with the information needed to optimize resource usage, increase productivity, and enhance the sustainability of agricultural practices. The use of Big Data in farming requires the collection and analysis of data from various sources such as sensors, satellites, and farmer surveys. While Big Data can provide the farming community with valuable insights and improve efficiency, there is significant concern regarding the security of this data as well as the privacy of the participants. Privacy regulations, such as the European Union’s General Data Protection Regulation (GDPR), the EU Code of Conduct on agricultural data sharing by contractual agreement, and the proposed EU AI law, have been created to address the issue of data privacy and provide specific guidelines on when and how data can be shared between organizations. To make confidential agricultural data widely available for Big Data analysis without violating the privacy of the data subjects, we consider privacy-preserving methods of data sharing in agriculture. Synthetic data that retains the statistical properties of the original data but does not include actual individuals’ information provides a suitable alternative to sharing sensitive datasets. Deep learning-based synthetic data generation has been proposed for privacy-preserving data sharing. However, such privacy-preserving efforts often do not comply with documented data privacy policies. In this study, we propose a novel framework for enforcing privacy policy rules in privacy-preserving data generation algorithms. We explore several available agricultural codes of conduct, extract knowledge related to the privacy constraints in data, and use the extracted knowledge to define privacy bounds in a privacy-preserving generative model. We use our framework to generate synthetic agricultural data and present experimental results that demonstrate the utility of the synthetic dataset in downstream tasks. We also show that our framework can mitigate potential threats, such as re-identification and linkage issues, and secure data based on applicable regulatory policy rules.
  5. Graph embedding techniques are pivotal in real-world machine learning tasks that operate on graph-structured data, such as social recommendation and protein structure modeling. Embeddings are mostly performed at the node level to learn a representation of each node. Since the formation of a graph is inevitably affected by certain sensitive node attributes, the node embeddings can inherit such sensitive information and introduce undesirable biases in downstream tasks. Most existing works impose ad-hoc constraints on the node embeddings to restrict their distributions for unbiasedness/fairness, which, however, compromises the utility of the resulting embeddings. In this paper, we propose a principled new way to obtain unbiased graph embeddings by learning node embeddings from an underlying bias-free graph, which is not influenced by sensitive node attributes. Motivated by this new perspective, we propose two complementary methods for uncovering such an underlying graph, with the goal of minimizing the impact on the utility of the embeddings. Both our theoretical justification and extensive experimental comparisons against state-of-the-art solutions demonstrate the effectiveness of our proposed methods.