Billions of devices in the Internet of Things (IoT) are interconnected over the internet and communicate with each other or with end users. IoT devices communicate through messaging bots, which are important in IoT systems for automating and better managing workflows. IoT devices are usually spread across many applications and can capture or generate a substantial influx of big data. Integrating IoT with cloud computing to handle and manage this big data requires considerable security measures to prevent cyber attackers from making adversarial use of such large amounts of data. An attacker can simply use the messaging bots to perform malicious activities on a number of devices, so bots pose a serious cybersecurity hazard for IoT devices. Hence, it is important to detect the presence of malicious bots in the network. In this paper, we propose an evidence theory-based approach for malicious bot detection. Evidence theory, also known as Dempster-Shafer Theory (DST), is a probabilistic reasoning tool with the unique ability to handle uncertainty, i.e., to reason in the absence of complete evidence. It can be applied efficiently to identify a bot, especially when the bots exhibit dynamic or polymorphic behavior. A key characteristic of DST is that the detection system may not need any prior information about malicious signatures and profiles. In this work, we analyze network flow characteristics to extract key pieces of evidence for bot traces, quantify these pieces of evidence using the Apriori algorithm, and apply DST to detect the presence of bots.
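To make the evidence-combination step concrete, below is a minimal sketch of Dempster's rule of combination over the frame {bot, benign}. The mass values are hypothetical placeholders; in the approach described above they would be derived from network-flow evidence mined with the Apriori algorithm.

```python
# A minimal sketch of Dempster's rule of combination for two pieces of
# evidence over the frame {bot, benign}. The mass values below are
# hypothetical placeholders, not values from the paper.
from itertools import product

FRAME = frozenset({"bot", "benign"})

def combine(m1, m2):
    """Combine two mass functions (dicts mapping frozensets to masses)."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    # Normalize by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Evidence 1: unusually periodic flow timing weakly suggests a bot;
# the mass on FRAME represents remaining uncertainty.
m1 = {frozenset({"bot"}): 0.6, FRAME: 0.4}
# Evidence 2: destination reputation is inconclusive.
m2 = {frozenset({"bot"}): 0.3, frozenset({"benign"}): 0.2, FRAME: 0.5}

print(combine(m1, m2))  # combined belief concentrates on {bot}
```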
Good Bot, Bad Bot: Characterizing Automated Browsing Activity
As the web keeps increasing in size, the number of vulnerable and poorly managed websites increases commensurately. Attackers rely on armies of malicious bots to discover these vulnerable websites, compromise their servers, and exfiltrate sensitive user data. It is therefore crucial for the security of the web to understand the population and behavior of malicious bots. In this paper, we report on the design, implementation, and results of Aristaeus, a system for deploying large numbers of honeysites, i.e., websites that exist for the sole purpose of attracting and recording bot traffic. Through a seven-month-long experiment with 100 dedicated honeysites, Aristaeus recorded 26.4 million requests sent by more than 287K unique IP addresses, 76K of which belonged to clearly malicious bots. By analyzing the types of requests and payloads that these bots send, we discover that the average honeysite received more than 37K requests each month, with more than 50% of these requests attempting to brute-force credentials, fingerprint the deployed web applications, and exploit large numbers of different vulnerabilities. By comparing the declared identity of these bots with their TLS handshakes and HTTP headers, we uncover that more than 86.2% of bots claiming to be Mozilla Firefox and Google Chrome are lying about their identity and are instead built on HTTP libraries and command-line tools.
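The identity cross-check described above can be sketched as follows: compare the browser family a client declares in its User-Agent header with the family implied by its TLS fingerprint. This is a simplified illustration; the JA3-style hashes and the lookup table are hypothetical placeholders, not Aristaeus's actual data.

```python
# A minimal sketch of a TLS-vs-User-Agent identity cross-check.
# The fingerprint table below is a hypothetical placeholder mapping
# JA3-style hashes to the browser family observed to produce them.
KNOWN_TLS_FINGERPRINTS = {
    "b32309a26951912be7dba376398abc3b": "Chrome",   # assumed
    "6734f37431670b3ab4292b8f60f29984": "Firefox",  # assumed
    "3b5074b1b5d032e5620f69f9f700ff0e": "curl",     # assumed
}

def claimed_family(user_agent: str) -> str:
    """Crude User-Agent parsing; real parsers are far more thorough."""
    ua = user_agent.lower()
    if "firefox" in ua:
        return "Firefox"
    if "chrome" in ua:
        return "Chrome"
    return "other"

def is_lying(user_agent: str, ja3_hash: str) -> bool:
    """True when the TLS stack contradicts the declared browser identity."""
    actual = KNOWN_TLS_FINGERPRINTS.get(ja3_hash, "unknown")
    return actual not in ("unknown", claimed_family(user_agent))

# A bot claiming Firefox but speaking TLS like curl is flagged as lying.
print(is_lying("Mozilla/5.0 ... Firefox/91.0",
               "3b5074b1b5d032e5620f69f9f700ff0e"))  # -> True
```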
- PAR ID: 10250016
- Journal Name: Proceedings of the 42nd IEEE Symposium on Security and Privacy
- Sponsoring Org: National Science Foundation
More Like this
Modern websites often run web applications on the server to handle HTTP requests from users and generate dynamic responses. Due to their concurrent nature, web applications are vulnerable to server-side request races, and the problem becomes more severe with the ever-increasing popularity of web applications. We first conduct a comprehensive characteristic study of 157 real-world server-side request races collected from different, popular types of web applications. The findings of this study can guide future development support for combating server-side request races. Guided by our study results, we develop a dynamic framework, ReqRacer, for detecting and exposing server-side request races in web applications. We propose novel approaches to model happens-before relationships between HTTP requests, which are essential to web applications. Our evaluation shows that ReqRacer can effectively and efficiently detect known and unknown request races.
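To illustrate the happens-before idea, here is a minimal sketch of one plausible ordering rule (assumed for illustration, not ReqRacer's exact model): one request happens after another when the same client sent it only after receiving the earlier response; request pairs unordered in both directions are candidate races.

```python
# A simplified happens-before sketch for HTTP requests. The single rule
# used here (same client, response received before the next send) is an
# assumed stand-in for the richer models a tool like ReqRacer would use.
from dataclasses import dataclass
from itertools import combinations

@dataclass
class Request:
    rid: str
    client: str
    sent: float       # time the request left the client
    responded: float  # time the client received the response

def happens_before(a: Request, b: Request) -> bool:
    return a.client == b.client and a.responded <= b.sent

def candidate_races(requests):
    """Pairs with no happens-before order in either direction."""
    return [
        (a.rid, b.rid)
        for a, b in combinations(requests, 2)
        if not happens_before(a, b) and not happens_before(b, a)
    ]

reqs = [
    Request("r1", "alice", sent=0.0, responded=1.0),
    Request("r2", "alice", sent=2.0, responded=3.0),  # ordered after r1
    Request("r3", "bob",   sent=0.5, responded=2.5),  # concurrent with both
]
print(candidate_races(reqs))  # -> [('r1', 'r3'), ('r2', 'r3')]
```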
Twitter bot detection is vital in combating misinformation and safeguarding the integrity of social media discourse. While malicious bots are becoming increasingly sophisticated and personalized, standard bot detection approaches remain agnostic to the social environments (henceforth, communities) in which the bots operate. In this work, we introduce community-specific bot detection, estimating the percentage of bots given the context of a community. Our method, BotPercent, is an amalgamation of Twitter bot detection datasets and feature-, text-, and graph-based models, adjusted to a particular community on Twitter. We introduce an approach that performs confidence calibration across bot detection models, which addresses generalization issues in existing community-agnostic models targeting individual bots and leads to more accurate community-level bot estimations. Experiments demonstrate that BotPercent achieves state-of-the-art performance in community-level Twitter bot detection across both balanced and imbalanced class distribution settings, providing a less biased estimator of Twitter bot populations within the communities we analyze. We then analyze bot rates in several Twitter groups, including users who engage with partisan news media, political communities in different countries, and more. Our results reveal that the presence of Twitter bots is not homogeneous but exhibits a spatiotemporal distribution with considerable heterogeneity that should be taken into account for content moderation and social media policymaking.
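The calibrate-then-aggregate idea can be sketched as follows. Temperature scaling stands in here for whatever calibration BotPercent actually uses, and the logits and temperatures are hypothetical values for illustration.

```python
# A minimal sketch of community-level bot estimation: calibrate each
# model's raw scores (temperature scaling here is an assumed stand-in
# for BotPercent's actual calibration) and average the calibrated bot
# probabilities over a community's accounts.
import numpy as np

def temperature_scale(logits: np.ndarray, T: float) -> np.ndarray:
    """Sigmoid over temperature-scaled logits; T > 1 softens confidence."""
    return 1.0 / (1.0 + np.exp(-logits / T))

def community_bot_percent(model_logits: dict, temperatures: dict) -> float:
    # Average calibrated per-account probabilities across models,
    # then average across the community to get a bot rate.
    probs = np.mean(
        [temperature_scale(l, temperatures[m]) for m, l in model_logits.items()],
        axis=0,
    )
    return float(probs.mean() * 100.0)

# Hypothetical logits for five accounts from two detectors.
logits = {
    "feature_model": np.array([2.0, -1.0, 0.5, 3.0, -2.5]),
    "graph_model":   np.array([1.5, -0.5, 1.0, 2.0, -3.0]),
}
temps = {"feature_model": 1.5, "graph_model": 2.0}  # fit on held-out data
print(f"estimated bot rate: {community_bot_percent(logits, temps):.1f}%")
```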
Deadlocks in database-backed web applications can involve different numbers of HTTP requests, and they can be caused by locks explicitly requested in application code or implicitly requested by databases during query execution. To help developers understand these deadlocks and to guide the design of tools for combating them, we conduct a characteristic study of 49 deadlocks collected from real-world web applications developed under different programming paradigms. We provide categorization results based on HTTP request numbers and resource types, with a special focus on categorizing deadlocks on database locks. We expect our results to be useful for application developers seeking to understand web-application deadlocks and for tool researchers designing comprehensive support for combating them.
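As a concrete illustration of the lock-ordering pattern behind many such deadlocks (a generic example, not one of the paper's 49 cases), two concurrent request handlers that lock the same rows in opposite order can each end up waiting on the other; acquiring locks in a deterministic global order avoids the cycle. Threading locks stand in for database row locks below, and the accounts and amounts are hypothetical.

```python
# Two concurrent "request handlers" transferring in opposite directions.
# A deadlock-prone version would lock src first and dst second, so each
# handler could hold one row lock while waiting on the other. The fix
# shown here acquires locks in a deterministic global order (by id).
import threading

row_locks = {1: threading.Lock(), 2: threading.Lock()}  # simulated row locks
balances = {1: 100, 2: 100}                             # hypothetical data

def transfer(src: int, dst: int, amount: int) -> None:
    first, second = sorted((src, dst))  # consistent lock order, any direction
    with row_locks[first], row_locks[second]:
        balances[src] -= amount
        balances[dst] += amount

a = threading.Thread(target=transfer, args=(1, 2, 30))
b = threading.Thread(target=transfer, args=(2, 1, 10))
a.start(); b.start(); a.join(); b.join()
print(balances)  # -> {1: 80, 2: 120}, with no risk of deadlock
```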
Background: Bots help automate many of the tasks performed by software developers and are widely used to commit code on various social coding platforms. At present, it is not clear what types of activities these bots perform, and understanding this may help design better bots and identify application areas that might benefit from bot adoption. Aim: We aim to categorize bot commits by the type of change (files added, deleted, or modified), find the most commonly changed file types, and identify the groups of file types that tend to be updated together. Method: We examined 12,326,137 commits made by 461 popular bots (each with at least 1,000 commits) to identify the frequency and type of files added, deleted, or modified by the commits, and used association rule mining to identify the types of files modified together. Result: The majority of bot commits modify an existing file, a few add new files, and deletion of a file is very rare; commits involving more than one type of operation are even rarer. Files containing data, configuration, and documentation are the most frequently updated, while HTML is the most common type in terms of the number of files added, deleted, and modified. Files of type "Markdown", "Ignore List", "YAML", and "JSON" were most frequently updated together with other file types. Conclusion: We observe that the majority of bot commits involve single-file modifications and that bots primarily work with data, configuration, and documentation files. Whether this is a limitation of the bots and, if overcome, would lead to different kinds of bots remains an open question.
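The association-rule step can be sketched with a small support/confidence computation over the sets of file types each commit touches. The commit data below is hypothetical, and this pure-Python counting is a stand-in for a full Apriori implementation.

```python
# A minimal sketch of mining which file types are modified together:
# compute support for type pairs and confidence for rules such as
# {"Markdown"} -> {"YAML"}. The commit data is hypothetical.
from itertools import combinations
from collections import Counter

commits = [  # file types modified together in each commit (assumed data)
    {"Markdown", "YAML"},
    {"Markdown", "YAML", "JSON"},
    {"HTML"},
    {"Markdown", "Ignore List"},
    {"YAML", "JSON"},
]

pair_counts = Counter()
item_counts = Counter()
for types in commits:
    item_counts.update(types)
    pair_counts.update(frozenset(p) for p in combinations(sorted(types), 2))

n = len(commits)
for pair, cnt in pair_counts.most_common(3):
    a, b = sorted(pair)
    support = cnt / n                  # fraction of commits with both types
    confidence = cnt / item_counts[a]  # confidence of the rule {a} -> {b}
    print(f"{{{a}}} -> {{{b}}}: support={support:.2f}, "
          f"confidence={confidence:.2f}")
```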