Billions of devices in the Internet of Things (IoT) are inter-connected over the internet and communicate with each other or end users. IoT devices communicate through messaging bots. These bots are important in IoT systems to automate and better manage the work flows. IoT devices are usually spread across many applications and are able to capture or generate substantial influx of big data. The integration of IoT with cloud computing to handle and manage big data, requires considerable security measures in order to prevent cyber attackers from adversarial use of such large amount of data. An attacker can simply utilize the messaging bots to perform malicious activities on a number of devices and thus bots pose serious cybersecurity hazards for IoT devices. Hence, it is important to detect the presence of malicious bots in the network. In this paper we propose an evidence theory-based approach for malicious bot detection. Evidence Theory, a.k.a. Dempster Shafer Theory (DST) is a probabilistic reasoning tool and has the unique ability to handle uncertainty, i.e. in the absence of evidence. It can be applied efficiently to identify a bot, especially when the bots have dynamic or polymorphic behavior. The key characteristic of DST is that the detection system may not need any prior information about the malicious signatures and profiles. In this work, we propose to analyze the network flow characteristics to extract key evidence for bot traces. We then quantify these pieces of evidence using apriori algorithm and apply DST to detect the presence of the bots.
more »
« less
An Assure AI Bot (AAAI bot)
Artificial Intelligence (AI) bots receive much attention and usage in industry manufacturing and even store cashier applications. Our research is to train AI bots to be software engineering assistants, specifically to detect biases and errors inside AI software applications. An example application is an AI machine learning system that sorts and classifies people according to various attributes, such as the algorithms involved in criminal sentencing, hiring, and admission practices. Biases, unfair decisions, and flaws in terms of the equity, diversity, and justice presence, in such systems could have severe consequences. As a Hispanic-Serving Institution, we are concerned about underrepresented groups and devoted an extended amount of our time to implementing “An Assure AI” (AAAI) Bot to detect biases and errors in AI applications. Our state-of-the-art AI Bot was developed based on our previous accumulated research in AI and Deep Learning (DL). The key differentiator is that we are taking a unique approach: instead of cleaning the input data, filtering it out and minimizing its biases, we trained our deep Neural Networks (NN) to detect and mitigate biases of existing AI models. The backend of our bot uses the Detection Transformer (DETR) framework, developed by Facebook,
more »
« less
- Award ID(s):
- 1722563
- PAR ID:
- 10351915
- Date Published:
- Journal Name:
- Proceedings of ISNCC
- Volume:
- 1
- Issue:
- 1
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Twitter bot detection is vital in combating misinformation and safeguarding the integrity of social media discourse. While malicious bots are becoming more and more sophisticated and personalized, standard bot detection approaches are still agnostic to social environments (henceforth, communities) the bots operate at. In this work, we introduce community-specific bot detection, estimating the percentage of bots given the context of a community. Our method{---}BotPercent{---}is an amalgamation of Twitter bot detection datasets and feature-, text-, and graph-based models, adjusted to a particular community on Twitter. We introduce an approach that performs confidence calibration across bot detection models, which addresses generalization issues in existing community-agnostic models targeting individual bots and leads to more accurate community-level bot estimations. Experiments demonstrate that BotPercent achieves state-of-the-art performance in community-level Twitter bot detection across both balanced and imbalanced class distribution settings, presenting a less biased estimator of Twitter bot populations within the communities we analyze. We then analyze bot rates in several Twitter groups, including users who engage with partisan news media, political communities in different countries, and more. Our results reveal that the presence of Twitter bots is not homogeneous, but exhibiting a spatial-temporal distribution with considerable heterogeneity that should be taken into account for content moderation and social media policy making.more » « less
-
Background: Bots help automate many of the tasks performed by software developers and are widely used to commit code in various social coding platforms. At present, it is not clear what types of activities these bots perform and understanding it may help design better bots, and find application areas which might benefit from bot adoption. Aim: We aim to categorize the Bot Commits by the type of change (files added, deleted, or modified), find the more commonly changed file types, and identify the groups of file types that tend to get updated together. Method: 12,326,137 commits made by 461 popular bots (that made at least 1000 commits) were examined to identify the frequency and the type of files added/ deleted/ modified by the commits, and association rule mining was used to identify the types of files modified together. Result: Majority of the bot commits modify an existing file, a few of them add new files, while deletion of a file is very rare. Commits involving more than one type of operation are even rarer. Files containing data, configuration, and documentation are most frequently updated, while HTML is the most common type in terms of the number of files added, deleted, and modified. Files of the type "Markdown", "Ignore List", "YAML", "JSON" were the types that are updated together with other types of files most frequently. Conclusion: We observe that majority of bot commits involve single file modifications, and bots primarily work with data, configuration, and documentation files. A better understanding if this is a limitation of the bots and, if overcome, would lead to different kinds of bots remains an open question.more » « less
-
Of the hundreds of billions of dollars spent on developer wages, up to 25% accounts for fixing bugs. Companies like Google save significant human effort and engineering costs with automatic bug detection tools, yet automatically fixing them is still a nascent endeavour. Very recent work (including our own) demonstrates the feasibility of automatic program repair in practice. As automated repair technology matures, it presents great appeal for integration into developer workflows. We believe software bots are a promising vehicle for realizing this integration, as they bridge the gap between human software development and automated processes. We envision repair bots orchestrating automated refactoring and bug fixing. To this end, we explore what building a repair bot entails. We draw on our understanding of patch generation, validation, and real world software development interactions to identify six principles that bear on engineering repair bots and discuss related design challenges for integrating human workflows. Ultimately, this work aims to foster critical focus and interest for making repair bots a reality.more » « less
-
null (Ed.)As the web keeps increasing in size, the number of vulnerable and poorly-managed websites increases commensurately. Attackers rely on armies of malicious bots to discover these vulnerable websites, compromising their servers, and exfiltrating sensitive user data. It is therefore crucial for the security of the web to understand the population and behavior of malicious bots. In this paper, we report on the design, implementation, and results of Aristaeus, a system for deploying large numbers of honeysites, i.e., websites that exist for the sole purpose of attracting and recording bot traffic. Through a seven-month-long experiment with 100 dedicated honeysites, Aristaeus recorded 26.4 million requests sent by more than 287K unique IP addresses, with 76K of them belonging to clearly malicious bots. By analyzing the type of requests and payloads that these bots send, we discover that the average honeysite received more than 37K requests each month, with more than 50% of these requests attempting to brute-force credentials, fingerprint the deployed web applications, and exploit large numbers of different vulnerabilities. By comparing the declared identity of these bots with their TLS handshakes and HTTP headers, we uncover that more than 86.2% of bots claiming to be Mozilla Firefox and Google Chrome are lying about their identity and are instead built on HTTP libraries and command-line tools.more » « less
An official website of the United States government

