This content will become publicly available on December 15, 2024
- NSF-PAR ID: 10478491
- Publisher / Repository: IEEE
- Date Published:
- Journal Name: IEEE Big Data 2023
- Page Range / eLocation ID: 10
- Subject(s) / Keyword(s): Cyber-bullying; Deep Learning; Neural Networks; Natural Language Processing
- Format(s): Medium: X
- Location: Sorrento, Italy
- Sponsoring Org: National Science Foundation
More Like this
-
Learning explicit and implicit patterns in human trajectories plays an important role in many Location-Based Social Network (LBSN) applications, such as trajectory classification (e.g., walking or driving), trajectory-user linking, and friend recommendation. A particular problem that has attracted much attention recently – and is the focus of our work – is Trajectory-based Social Circle Inference (TSCI), which aims to infer user social circles (mainly social friendship) from motion trajectories, without any explicit social network information. Existing approaches to TSCI lack satisfactory results due to challenges related to data sparsity, accessibility, and model efficiency. Motivated by the recent success of machine learning in trajectory mining, in this paper we formulate TSCI as a novel multi-label classification problem and develop a Recurrent Neural Network (RNN)-based framework called DeepTSCI that uses human mobility patterns to infer the corresponding social circles. We propose three methods to learn the latent representations of trajectories, based on: (1) bidirectional Long Short-Term Memory (LSTM); (2) Autoencoder; and (3) Variational autoencoder. Experiments conducted on real-world datasets demonstrate that our proposed methods perform well and achieve significant improvement in terms of macro-R, macro-F1 and accuracy when compared to baselines.
-
Social media platforms are playing increasingly critical roles in disaster response and rescue operations. During emergencies, users can post rescue requests along with their addresses on social media, while volunteers can search for those messages and send help. However, efficiently leveraging social media in rescue operations remains challenging because of the lack of tools to identify rescue request messages on social media automatically and rapidly. Analyzing social media data, such as Twitter data, relies heavily on Natural Language Processing (NLP) algorithms to extract information from texts. Bidirectional transformer models, such as the Bidirectional Encoder Representations from Transformers (BERT) model, have significantly outperformed previous NLP models in numerous text analysis tasks, providing new opportunities to precisely understand and classify social media data for diverse applications. This study developed and compared ten VictimFinder models for identifying rescue request tweets, three based on milestone NLP algorithms and seven BERT-based. A total of 3191 manually labeled disaster-related tweets posted during 2017 Hurricane Harvey were used as the training and testing datasets. We evaluated the performance of each model by classification accuracy, computation cost, and model stability. Experiment results show that all BERT-based models significantly increased the accuracy of categorizing rescue-related tweets. The best model for identifying rescue request tweets is a customized BERT-based model with a Convolutional Neural Network (CNN) classifier. Its F1-score is 0.919, which outperforms the baseline model by 10.6%. The developed models can promote social media use for rescue operations in future disaster events.
-
Anomaly detection in time-series data is an integral part of the Internet of Things (IoT). In particular, with the advent of sophisticated deep and machine learning-based techniques, this line of research has attracted many researchers to develop more accurate anomaly detection algorithms. The problem itself has been a long-standing challenge in security, especially in malware detection and data tampering. The advancement of the IoT paradigm, together with the increasing number of cyber attacks on IoT networks worldwide, raises the question of whether flexible and simple yet accurate anomaly detection techniques exist. In this paper, we investigate the performance of deep learning-based models including recurrent neural network-based Bidirectional LSTM (BI-LSTM), Long Short-Term Memory (LSTM), CNN-based Temporal Convolutional Network (TCN), and CuDNN-LSTM, which is a fast LSTM implementation supported by CuDNN. In particular, we assess the performance of these models with respect to accuracy and the training time needed to build such models. According to our experiment, using different timestamps (i.e., 15, 20, and 30 min), we observe that in terms of performance, the CuDNN-LSTM model outperforms other models, whereas in terms of training time, the TCN-based model is trained faster. We report the results of experiments comparing these four models with various look-back values.
-
Abstract. Objective. This paper presents data-driven solutions to address two challenges in the problem of linking neural data and behavior: (1) unsupervised analysis of behavioral data and automatic label generation from behavioral observations, and (2) extraction of subject-invariant features for the development of generalized neural decoding models. Approach. For behavioral analysis and label generation, an unsupervised method is presented that employs an autoencoder to transform behavioral data into a cluster-friendly feature space. The model iteratively refines the assigned clusters with a soft clustering assignment loss, and gradually improves the learned feature representations. To address subject variability in decoding neural activity, adversarial learning in combination with a long short-term memory-based adversarial variational autoencoder (LSTM-AVAE) model is employed. By using an adversary network to constrain the latent representations, the model captures shared information among subjects’ neural activity, making it suitable for cross-subject transfer learning. Main results. The proposed approach is evaluated using cortical recordings of Thy1-GCaMP6s transgenic mice obtained via widefield calcium imaging during a motivational licking behavioral experiment. The results show that the proposed model achieves an accuracy of 89.7% in cross-subject neural decoding, outperforming other well-known autoencoder-based feature learning models. These findings suggest that incorporating an adversary network eliminates subject dependency in representations, leading to improved cross-subject transfer learning performance, while also demonstrating the effectiveness of LSTM-based models in capturing the temporal dependencies within neural data. Significance. Results demonstrate the feasibility of the proposed framework in unsupervised clustering and label generation of behavioral data, as well as achieving high accuracy in cross-subject neural decoding, indicating its potential for relating neural activity to behavior.
-
The element of repetition in cyberbullying behavior has directed recent computational studies toward detecting cyberbullying based on a social media session. In contrast to a single text, a session may consist of an initial post and an associated sequence of comments. Yet, emerging efforts to enhance the performance of session-based cyberbullying detection have largely overlooked unintended social biases in existing cyberbullying datasets. For example, a session containing certain demographic-identity terms (e.g., “gay” or “black”) is more likely to be classified as an instance of cyberbullying. In this paper, we first show evidence of such bias in models trained on sessions collected from different social media platforms (e.g., Instagram). We then propose a context-aware and model-agnostic debiasing strategy that leverages a reinforcement learning technique, without requiring any extra resources or annotations apart from a pre-defined set of sensitive triggers commonly used for identifying cyberbullying instances. Empirical evaluations show that the proposed strategy can simultaneously alleviate the impacts of the unintended biases and improve the detection performance.