skip to main content


Title: Evidence Fusion for Malicious Bot Detection in IoT
Billions of devices in the Internet of Things (IoT) are inter-connected over the internet and communicate with each other or end users. IoT devices communicate through messaging bots. These bots are important in IoT systems to automate and better manage the work flows. IoT devices are usually spread across many applications and are able to capture or generate substantial influx of big data. The integration of IoT with cloud computing to handle and manage big data, requires considerable security measures in order to prevent cyber attackers from adversarial use of such large amount of data. An attacker can simply utilize the messaging bots to perform malicious activities on a number of devices and thus bots pose serious cybersecurity hazards for IoT devices. Hence, it is important to detect the presence of malicious bots in the network. In this paper we propose an evidence theory-based approach for malicious bot detection. Evidence Theory, a.k.a. Dempster Shafer Theory (DST) is a probabilistic reasoning tool and has the unique ability to handle uncertainty, i.e. in the absence of evidence. It can be applied efficiently to identify a bot, especially when the bots have dynamic or polymorphic behavior. The key characteristic of DST is that the detection system may not need any prior information about the malicious signatures and profiles. In this work, we propose to analyze the network flow characteristics to extract key evidence for bot traces. We then quantify these pieces of evidence using apriori algorithm and apply DST to detect the presence of the bots.  more » « less
Award ID(s):
1821560
NSF-PAR ID:
10186560
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2018 IEEE International Conference on Big Data (Big Data)
Page Range / eLocation ID:
4545 to 4548
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. As malicious bots reside in a network to disrupt network stability, graph neural networks (GNNs) have emerged as one of the most popular bot detection methods. However, in most cases these graphs are significantly class-imbalanced. To address this issue, graph oversampling has recently been proposed to synthesize nodes and edges, which still suffers from graph heterophily, leading to suboptimal performance. In this paper, we propose HOVER, which implements Homophilic Oversampling Via Edge Removal for bot detection on graphs. Instead of oversampling nodes and edges within initial graph structure, HOVER designs a simple edge removal method with heuristic criteria to mitigate heterophily and learn distinguishable node embeddings, which are then used to oversample minority bots to generate a balanced class distribution without edge synthesis. Experiments on TON IoT networks demonstrate the state-of-the-art performance of HOVER on bot detection with high graph heterophily and extreme class imbalance. 
    more » « less
  2. Twitter bot detection is vital in combating misinformation and safeguarding the integrity of social media discourse. While malicious bots are becoming more and more sophisticated and personalized, standard bot detection approaches are still agnostic to social environments (henceforth, communities) the bots operate at. In this work, we introduce community-specific bot detection, estimating the percentage of bots given the context of a community. Our method{---}BotPercent{---}is an amalgamation of Twitter bot detection datasets and feature-, text-, and graph-based models, adjusted to a particular community on Twitter. We introduce an approach that performs confidence calibration across bot detection models, which addresses generalization issues in existing community-agnostic models targeting individual bots and leads to more accurate community-level bot estimations. Experiments demonstrate that BotPercent achieves state-of-the-art performance in community-level Twitter bot detection across both balanced and imbalanced class distribution settings, presenting a less biased estimator of Twitter bot populations within the communities we analyze. We then analyze bot rates in several Twitter groups, including users who engage with partisan news media, political communities in different countries, and more. Our results reveal that the presence of Twitter bots is not homogeneous, but exhibiting a spatial-temporal distribution with considerable heterogeneity that should be taken into account for content moderation and social media policy making. 
    more » « less
  3. null (Ed.)
    Internet of Things (IoT) devices, web browsers, phones, and even cars may be fingerprinted for tracking, and their connections routed through or to malicious entities. When IoT devices interact with a remote service, the integrity or authentication of that service is not guaranteed. IoT and other edge devices could be subject to man-in-the-middle (MiTM) attacks, with IoT devices attempting to connect to remote services. It is also straight-forward to use phishing or pharming to convince a user to accept a connection to a potentially malicious unfamiliar device. These risks could be mitigated by leveraging information on the edge of the network about the path to and destination of a connection. In this work we sample packets, then use packet analysis and local history to identify risky or suspicious connections. In contrast to other machine learning and big data approaches, the use of local data enables risk detection without loss of privacy. 
    more » « less
  4. Artificial Intelligence (AI) bots receive much attention and usage in industry manufacturing and even store cashier applications. Our research is to train AI bots to be software engineering assistants, specifically to detect biases and errors inside AI software applications. An example application is an AI machine learning system that sorts and classifies people according to various attributes, such as the algorithms involved in criminal sentencing, hiring, and admission practices. Biases, unfair decisions, and flaws in terms of the equity, diversity, and justice presence, in such systems could have severe consequences. As a Hispanic-Serving Institution, we are concerned about underrepresented groups and devoted an extended amount of our time to implementing “An Assure AI” (AAAI) Bot to detect biases and errors in AI applications. Our state-of-the-art AI Bot was developed based on our previous accumulated research in AI and Deep Learning (DL). The key differentiator is that we are taking a unique approach: instead of cleaning the input data, filtering it out and minimizing its biases, we trained our deep Neural Networks (NN) to detect and mitigate biases of existing AI models. The backend of our bot uses the Detection Transformer (DETR) framework, developed by Facebook, 
    more » « less
  5. null (Ed.)
    Internet of Things (IoT) devices are becoming increasingly popular and offer a wide range of services and functionality to their users. However, there are significant privacy and security risks associated with these devices. IoT devices can infringe users' privacy by ex-filtrating their private information to third parties, often without their knowledge. In this work we investigate the possibility to identify IoT devices and their location in an Internet Service Provider's network. By analyzing data from a large Internet Service Provider (ISP), we show that it is possible to recognize specific IoT devices, their vendors, and sometimes even their specific model, and to infer their location in the network. This is possible even with sparsely sampled flow data that are often the only datasets readily available at an ISP. We evaluate our proposed methodology to infer IoT devices at subscriber lines of a large ISP. Given ground truth information on IoT devices location and models, we were able to detect more than 77% of the studied IoT devices from sampled flow data in the wild. 
    more » « less