skip to main content


Title: Capture the Bot: Using Adversarial Examples to Improve CAPTCHA Robustness to Bot Attacks
Award ID(s):
1822094
NSF-PAR ID:
10334467
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
IEEE Intelligent Systems
Volume:
36
Issue:
5
ISSN:
1541-1672
Page Range / eLocation ID:
104 to 112
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    As the web keeps increasing in size, the number of vulnerable and poorly-managed websites increases commensurately. Attackers rely on armies of malicious bots to discover these vulnerable websites, compromising their servers, and exfiltrating sensitive user data. It is therefore crucial for the security of the web to understand the population and behavior of malicious bots. In this paper, we report on the design, implementation, and results of Aristaeus, a system for deploying large numbers of honeysites, i.e., websites that exist for the sole purpose of attracting and recording bot traffic. Through a seven-month-long experiment with 100 dedicated honeysites, Aristaeus recorded 26.4 million requests sent by more than 287K unique IP addresses, with 76K of them belonging to clearly malicious bots. By analyzing the type of requests and payloads that these bots send, we discover that the average honeysite received more than 37K requests each month, with more than 50% of these requests attempting to brute-force credentials, fingerprint the deployed web applications, and exploit large numbers of different vulnerabilities. By comparing the declared identity of these bots with their TLS handshakes and HTTP headers, we uncover that more than 86.2% of bots claiming to be Mozilla Firefox and Google Chrome are lying about their identity and are instead built on HTTP libraries and command-line tools. 
    more » « less
  2. Artificial Intelligence (AI) bots receive much attention and usage in industry manufacturing and even store cashier applications. Our research is to train AI bots to be software engineering assistants, specifically to detect biases and errors inside AI software applications. An example application is an AI machine learning system that sorts and classifies people according to various attributes, such as the algorithms involved in criminal sentencing, hiring, and admission practices. Biases, unfair decisions, and flaws in terms of the equity, diversity, and justice presence, in such systems could have severe consequences. As a Hispanic-Serving Institution, we are concerned about underrepresented groups and devoted an extended amount of our time to implementing “An Assure AI” (AAAI) Bot to detect biases and errors in AI applications. Our state-of-the-art AI Bot was developed based on our previous accumulated research in AI and Deep Learning (DL). The key differentiator is that we are taking a unique approach: instead of cleaning the input data, filtering it out and minimizing its biases, we trained our deep Neural Networks (NN) to detect and mitigate biases of existing AI models. The backend of our bot uses the Detection Transformer (DETR) framework, developed by Facebook, 
    more » « less
  3. As malicious bots reside in a network to disrupt network stability, graph neural networks (GNNs) have emerged as one of the most popular bot detection methods. However, in most cases these graphs are significantly class-imbalanced. To address this issue, graph oversampling has recently been proposed to synthesize nodes and edges, which still suffers from graph heterophily, leading to suboptimal performance. In this paper, we propose HOVER, which implements Homophilic Oversampling Via Edge Removal for bot detection on graphs. Instead of oversampling nodes and edges within initial graph structure, HOVER designs a simple edge removal method with heuristic criteria to mitigate heterophily and learn distinguishable node embeddings, which are then used to oversample minority bots to generate a balanced class distribution without edge synthesis. Experiments on TON IoT networks demonstrate the state-of-the-art performance of HOVER on bot detection with high graph heterophily and extreme class imbalance. 
    more » « less
  4. Billions of devices in the Internet of Things (IoT) are inter-connected over the internet and communicate with each other or end users. IoT devices communicate through messaging bots. These bots are important in IoT systems to automate and better manage the work flows. IoT devices are usually spread across many applications and are able to capture or generate substantial influx of big data. The integration of IoT with cloud computing to handle and manage big data, requires considerable security measures in order to prevent cyber attackers from adversarial use of such large amount of data. An attacker can simply utilize the messaging bots to perform malicious activities on a number of devices and thus bots pose serious cybersecurity hazards for IoT devices. Hence, it is important to detect the presence of malicious bots in the network. In this paper we propose an evidence theory-based approach for malicious bot detection. Evidence Theory, a.k.a. Dempster Shafer Theory (DST) is a probabilistic reasoning tool and has the unique ability to handle uncertainty, i.e. in the absence of evidence. It can be applied efficiently to identify a bot, especially when the bots have dynamic or polymorphic behavior. The key characteristic of DST is that the detection system may not need any prior information about the malicious signatures and profiles. In this work, we propose to analyze the network flow characteristics to extract key evidence for bot traces. We then quantify these pieces of evidence using apriori algorithm and apply DST to detect the presence of the bots. 
    more » « less
  5. Machine learning (ML) operations or MLOps advocates for integration of DevOps- related practices into the ML development and deployment process. Adoption of MLOps can be hampered due to a lack of knowledge related to how development tasks can be automated. A characterization of bot usage in ML projects can help practitioners on the types of tasks that can be automated with bots, and apply that knowledge into their ML development and deployment process. To that end, we conduct a preliminary empirical study with 135 issues reported mined from 3 libraries related to deep learning: Keras, PyTorch, and Tensorflow. From our empirical study we observe 9 categories of tasks that are automated with bots. We conclude our work-in-progress paper by providing a list of lessons that we learned from our empirical study. 
    more » « less