skip to main content


Title: Internet-scale Insecurity of Consumer Internet of Things: An Empirical Measurements Perspective
The number of Internet-of-Things (IoT) devices actively communicating across the Internet is continually increasing, as these devices are deployed across a variety of sectors, constantly transferring private data across the Internet. Due to the extensive deployment of such devices, the continuous discovery and persistence of IoT-centric vulnerabilities in protocols, applications, hardware, and the improper management of such IoT devices has resulted in the rampant, uncontrolled spread of malware threatening consumer IoT devices. To this end, this work adopts a novel, macroscopic methodology for fingerprinting Internet-scale compromised IoT devices, revealing crucial cyber threat intelligence on the insecurity of consumer IoT devices. By developing data-driven techniques rooted in machine learning methods and analyzing 3.6 TB of network traffic data, we discover 855,916 compromised IP addresses, with 310,164 fingerprinted as IoT. Further analysis reveals China and Brazil to be hosting the most significant population of compromised IoT devices (100,000 and 55,000, respectively). Additionally, we provide a longitudinal analysis on data from one year ago against this work, revealing the evolving trends of IoT exploitation, such as the increased number of vendors targeted by malware, rising from 50 to 131. Moreover, countries such as China (420% increased infected IoT count) and Indonesia (177% increased infected IoT count) have seen notably high increases in infection rates. Last, we compare our geographic results against Global Cybersecurity Index (GCI) ratings, verifying that countries with high GCI ratings, such as the Netherlands and Germany, had relatively low infection rates. However, upon further inspection, we find that the GCI rate does not accurately represent the consumer IoT market, with countries such as China and Russia being rated with “high” CGI scores, yet hosting a large population of infected consumer IoT devices.  more » « less
Award ID(s):
1953051
NSF-PAR ID:
10249467
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
ACM Transactions on Management Information Systems
Volume:
11
Issue:
4
ISSN:
2158-656X
Page Range / eLocation ID:
1 to 24
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract This project is funded by the US National Science Foundation (NSF) through their NSF RAPID program under the title “Modeling Corona Spread Using Big Data Analytics.” The project is a joint effort between the Department of Computer & Electrical Engineering and Computer Science at FAU and a research group from LexisNexis Risk Solutions. The novel coronavirus Covid-19 originated in China in early December 2019 and has rapidly spread to many countries around the globe, with the number of confirmed cases increasing every day. Covid-19 is officially a pandemic. It is a novel infection with serious clinical manifestations, including death, and it has reached at least 124 countries and territories. Although the ultimate course and impact of Covid-19 are uncertain, it is not merely possible but likely that the disease will produce enough severe illness to overwhelm the worldwide health care infrastructure. Emerging viral pandemics can place extraordinary and sustained demands on public health and health systems and on providers of essential community services. Modeling the Covid-19 pandemic spread is challenging. But there are data that can be used to project resource demands. Estimates of the reproductive number (R) of SARS-CoV-2 show that at the beginning of the epidemic, each infected person spreads the virus to at least two others, on average (Emanuel et al. in N Engl J Med. 2020, Livingston and Bucher in JAMA 323(14):1335, 2020). A conservatively low estimate is that 5 % of the population could become infected within 3 months. Preliminary data from China and Italy regarding the distribution of case severity and fatality vary widely (Wu and McGoogan in JAMA 323(13):1239–42, 2020). A recent large-scale analysis from China suggests that 80 % of those infected either are asymptomatic or have mild symptoms; a finding that implies that demand for advanced medical services might apply to only 20 % of the total infected. Of patients infected with Covid-19, about 15 % have severe illness and 5 % have critical illness (Emanuel et al. in N Engl J Med. 2020). Overall, mortality ranges from 0.25 % to as high as 3.0 % (Emanuel et al. in N Engl J Med. 2020, Wilson et al. in Emerg Infect Dis 26(6):1339, 2020). Case fatality rates are much higher for vulnerable populations, such as persons over the age of 80 years (> 14 %) and those with coexisting conditions (10 % for those with cardiovascular disease and 7 % for those with diabetes) (Emanuel et al. in N Engl J Med. 2020). Overall, Covid-19 is substantially deadlier than seasonal influenza, which has a mortality of roughly 0.1 %. Public health efforts depend heavily on predicting how diseases such as those caused by Covid-19 spread across the globe. During the early days of a new outbreak, when reliable data are still scarce, researchers turn to mathematical models that can predict where people who could be infected are going and how likely they are to bring the disease with them. These computational methods use known statistical equations that calculate the probability of individuals transmitting the illness. Modern computational power allows these models to quickly incorporate multiple inputs, such as a given disease’s ability to pass from person to person and the movement patterns of potentially infected people traveling by air and land. This process sometimes involves making assumptions about unknown factors, such as an individual’s exact travel pattern. By plugging in different possible versions of each input, however, researchers can update the models as new information becomes available and compare their results to observed patterns for the illness. In this paper we describe the development a model of Corona spread by using innovative big data analytics techniques and tools. We leveraged our experience from research in modeling Ebola spread (Shaw et al. Modeling Ebola Spread and Using HPCC/KEL System. In: Big Data Technologies and Applications 2016 (pp. 347-385). Springer, Cham) to successfully model Corona spread, we will obtain new results, and help in reducing the number of Corona patients. We closely collaborated with LexisNexis, which is a leading US data analytics company and a member of our NSF I/UCRC for Advanced Knowledge Enablement. The lack of a comprehensive view and informative analysis of the status of the pandemic can also cause panic and instability within society. Our work proposes the HPCC Systems Covid-19 tracker, which provides a multi-level view of the pandemic with the informative virus spreading indicators in a timely manner. The system embeds a classical epidemiological model known as SIR and spreading indicators based on causal model. The data solution of the tracker is built on top of the Big Data processing platform HPCC Systems, from ingesting and tracking of various data sources to fast delivery of the data to the public. The HPCC Systems Covid-19 tracker presents the Covid-19 data on a daily, weekly, and cumulative basis up to global-level and down to the county-level. It also provides statistical analysis for each level such as new cases per 100,000 population. The primary analysis such as Contagion Risk and Infection State is based on causal model with a seven-day sliding window. Our work has been released as a publicly available website to the world and attracted a great volume of traffic. The project is open-sourced and available on GitHub. The system was developed on the LexisNexis HPCC Systems, which is briefly described in the paper. 
    more » « less
  2. The NTT (Nippon Telegraph and Telephone) Data Corporation report found that 80% of U.S. consumers are concerned about their smart home data security. The Internet of Things (IoT) technology brings many benefits to people's homes, and more people across the world are heavily dependent on the technology and its devices. However, many IoT devices are deployed without considering security, increasing the number of attack vectors available to attackers. Numerous Internet of Things devices lacking security features have been compromised by attackers, resulting in many security incidents. Attackers can infiltrate these smart home devices and control the home via turning off the lights, controlling the alarm systems, and unlocking the smart locks, to name a few. Attackers have also been able to access the smart home network, leading to data exfiltration. There are many threats that smart homes face, such as the Man-in-the-Middle (MIM) attacks, data and identity theft, and Denial of Service (DoS) attacks. The hardware vulnerabilities often targeted by attackers are SPI, UART, JTAG, USB, etc. Therefore, to enhance the security of the smart devices used in our daily lives, threat modeling should be implemented early on in developing any given system. This past Spring semester, Morgan State University launched a (senior) capstone project targeting undergraduate (electrical) engineering students who were thus allowed to research with the Cybersecurity Assurance and Policy (CAP) center for four months. The primary purpose of the capstone was to help students further develop both hardware and software skills while researching. For this project, the students mainly focused on the Arduino Mega Board. Some of the expected outcomes for this capstone project include: 1) understanding the physical board components, 2) learning how to attack the board through the STRIDE technique, 3) generating a Data Flow Diagram (DFD) of the system using the Microsoft threat modeling tool, 4) understanding the attack patterns, and 5) generating the threat based on the user's input. To prevent future threats and attacks from taking advantage of systems vulnerabilities, the practice of "threat modeling" is implemented. This method allows the analysis of potential attackers, including their goals and techniques, while also providing solutions and mitigation strategies. Although Threat modeling can be performed throughout the development of a system, implementing it during developmental stages will prevent further problems in the future. Threat Modeling is crucial because it will help identify any potential threat before it propagates in the system. Identifying threats and providing countermeasures will save both time and money while also keeping the consumers safe. As a result, students must grow to understand how essential detecting and preventing attacks are to protect consumer information systems and networks. At the end of this capstone project, students should take away hands-on skills in cyber defense. 
    more » « less
  3. Software vulnerabilities in emerging systems, such as the Internet of Things (IoT), allow for multiple attack vectors that are exploited by adversaries for malicious intents. One of such vectors is malware, where limited efforts have been dedicated to IoT malware analysis, characterization, and understanding. In this paper, we analyze recent IoT malware through the lenses of static analysis. Towards this, we reverse-engineer and perform a detailed analysis of almost 2,900 IoT malware samples of eight different architectures across multiple analysis directions. We conduct string analysis, unveiling operation, unique textual characteristics, and network dependencies. Through the control flow graph analysis, we unveil unique graph-theoretic features. Through the function analysis, we address obfuscation by function approximation. We then pursue two applications based on our analysis: 1) Combining various analysis aspects, we reconstruct the infection lifecycle of various prominent malware families, and 2) using multiple classes of features obtained from our static analysis, we design a machine learning-based detection model with features that are robust and an average detection rate of 99.8%. 
    more » « less
  4. Prof. Ninghui Li Editor in Chief, ACM Transactions (Ed.)
    Malware analysis is an essential task to understand infection campaigns, the behavior of malicious codes, and possible ways to mitigate threats. Malware analysis also allows better assessment of attacker’s capabilities, techniques, and processes. Although a substantial amount of previous work provided a comprehensive analysis of the international malware ecosystem, research on regionalized, country, and population-specific malware campaigns have been scarce. Moving towards addressing this gap, we conducted a longitudinal (2012-2020) and comprehensive (encompassing an entire population of online banking users) study of MS Windows desktop malware that actually infected Brazilian bank’s users. We found that the Brazilian financial desktop malware has been evolving quickly: it started to make use of a variety of file formats instead of typical PE binaries, relied on native system resources, and abused obfuscation technique to bypass detection mechanisms. Our study on the threats targeting a significant population on the ecosystem of the largest and most populous country in Latin America can provide invaluable insights that may be applied to other countries’ user populations, especially those in the developing world that might face cultural peculiarities similar to Brazil’s. With this evaluation, we expect to motivate the security community/industry to seriously considering a deeper level of customization during the development of next generation anti-malware solutions, as well as to raise awareness towards regionalized and targeted Internet threats. 
    more » « less
  5. Recent years have witnessed the rise of Internet-of-Things (IoT) based cyber attacks. These attacks, as expected, are launched from compromised IoT devices by exploiting security flaws already known. Less clear, however, are the fundamental causes of the pervasiveness of IoT device vulnerabilities and their security implications, particularly in how they affect ongoing cybercrimes. To better understand the problems and seek effective means to suppress the wave of IoT-based attacks, we conduct a comprehensive study based on a large number of real-world attack traces collected from our honeypots, attack tools purchased from the underground, and information collected from high-profile IoT attacks. This study sheds new light on the device vulnerabilities of today’s IoT systems and their security implications: ongoing cyber attacks heavily rely on these known vulnerabilities and the attack code released through their reports; on the other hand, such a reliance on known vulnerabilities can actually be used against adversaries. The same bug reports that enable the development of an attack at an exceedingly low cost can also be leveraged to extract vulnerability-specific features that help stop the attack. In particular, we leverage Natural Language Processing (NLP) to automatically collect and analyze more than 7,500 security reports (with 12,286 security critical IoT flaws in total) scattered across bug-reporting blogs, forums, and mailing lists on the Internet. We show that signatures can be automatically generated through an NLP-based report analysis, and be used by intrusion detection or firewall systems to effectively mitigate the threats from today’s IoT-based attacks. 
    more » « less