Title: Cracking Wall of Confinement: Understanding and Analyzing Malicious Domain Takedowns
Take-down operations aim to disrupt cybercrime involving malicious domains. In the past decade, many successful take-down operations have been reported, including those against the Conficker worm, and most recently, against VPNFilter. Although it plays an important role in fighting cybercrime, the domain take-down procedure is still surprisingly opaque. There seems to be no in-depth understanding of how the take-down operation works and whether there is due diligence to ensure its security and reliability. In this paper, we report the first systematic study of domain take-down. Our study was made possible by a large collection of data, including various sinkhole feeds and blacklists, passive DNS data spanning six years, and historical WHOIS information. Over these datasets, we built a unique methodology that extensively used various reverse lookups and other data analysis techniques to address the challenges in identifying taken-down domains, sinkhole operators, and take-down durations. Applying the methodology to the data, we discovered over 620K taken-down domains and conducted a longitudinal analysis of the take-down process, thus facilitating a better understanding of the operation and its weaknesses. We found that more than 14% of domains taken down over the past ten months have been released back to the domain market and that some of the released domains have been repurchased by malicious actors before again being captured and seized, by either the same or a different sinkhole. In addition, we showed that the misconfiguration of DNS records corresponding to the sinkholed domains allowed us to hijack a domain that was seized by the FBI. Further, we found that expired sinkholes have caused the transfer of around 30K taken-down domains whose traffic is now under the control of new owners.
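To make the ideas of take-down duration and release concrete, the following is a minimal sketch and not the authors' methodology. It assumes passive DNS observations are available as simple records and that the set of sinkhole nameservers is already known. The `PdnsRecord` type, the `summarize_takedowns` function, and every nameserver name are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical sinkhole nameservers; in practice these would come from sinkhole feeds.
SINKHOLE_NS = {"ns1.sinkhole.example", "ns2.sinkhole.example"}


@dataclass(frozen=True)
class PdnsRecord:
    """One passive DNS observation of a domain's nameserver (assumed schema)."""
    domain: str
    nameserver: str
    first_seen: date
    last_seen: date


def summarize_takedowns(records):
    """Return {domain: (takedown_days, released)} for domains seen on a sinkhole."""
    by_domain = {}
    for rec in records:
        by_domain.setdefault(rec.domain, []).append(rec)

    summary = {}
    for domain, recs in by_domain.items():
        sink = [r for r in recs if r.nameserver in SINKHOLE_NS]
        if not sink:
            continue  # never observed on a sinkhole nameserver
        start = min(r.first_seen for r in sink)
        end = max(r.last_seen for r in sink)
        # Treat the domain as released if a non-sinkhole nameserver
        # first appears after the sinkhole period has ended.
        released = any(
            r.nameserver not in SINKHOLE_NS and r.first_seen > end for r in recs
        )
        summary[domain] = ((end - start).days, released)
    return summary
```

Calling `summarize_takedowns` on a list of such records yields, per sinkholed domain, the number of days it was observed on a sinkhole nameserver and a flag for whether a different nameserver took over afterwards. A real analysis would also need WHOIS history to tell a re-registration apart from a change of sinkhole operator.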
Award ID(s):
1801432
NSF-PAR ID:
10097935
Journal Name:
the 26th Annual Network and Distributed System Security Symposium
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Successful malware campaigns often rely on the ability of infected hosts to locate and contact their command-and-control (C2) servers. Malware campaigns often use DNS domains for this purpose, but DNS domains may be taken down by the registrar that sold them. In response to this threat, malware operators have begun using blockchain-based naming systems to store C2 server names. Blockchain naming systems are a threat to malware defenders because they are not subject to a centralized authority, such as a registrar, that can take down abused domains, either voluntarily or under legal pressure. In fact, blockchains are robust against a variety of interventions that work on DNS domains, which is bad news for defenders. We analyze the ecosystem of blockchain naming systems and identify new locations for defenders to stage interventions against malware. In particular, we find that malware is obligated to use centralized or semi-centralized infrastructure to connect to blockchain naming systems and modify the records stored within. In fact, scattered interventions have already been staged against this centralized infrastructure: we present case studies of several such instances. We also present a study of how blockchain naming systems are currently abused by malware operators, and discuss the factors that would cause a blockchain naming system to become an unstoppable threat. We conclude that existing blockchain naming systems still provide opportunities for defenders to prevent malware from contacting its C2 servers. 
  2. Sinkholes are the most abundant surface features in karst areas worldwide. Understanding sinkhole occurrences and characteristics is critical for studying karst aquifers and mitigating sinkhole‐related hazards. Most sinkholes appear on the land surface as depressions or cover collapses and are commonly mapped from elevation data, such as digital elevation models (DEMs). Existing methods for identifying sinkholes from DEMs often require two steps: locating surface depressions and separating sinkholes from non‐sinkhole depressions. In this study, we explored deep learning to directly identify sinkholes from DEM data and aerial imagery. A key contribution of our study is an evaluation of various ways of integrating these two types of raster data. We used an image segmentation model, U‐Net, to locate sinkholes. We trained separate U‐Net models based on four input images of elevation data: a DEM image, a slope image, a DEM gradient image, and a DEM‐shaded relief image. Three normalization techniques (Global, Gaussian, and Instance) were applied to improve the model performance. Model results suggest that deep learning is a viable method to identify sinkholes directly from the images of elevation data. In particular, DEM gradient data provided the best input for U‐Net image segmentation models to locate sinkholes. The model using the DEM gradient image with Gaussian normalization achieved the best performance with a sinkhole intersection‐over‐union (IoU) of 45.38% on the unseen test set. Aerial images, however, were not useful in training deep learning models for sinkholes, as the models using an aerial image as input achieved sinkhole IoUs below 3%.

     
  3. Obeid, Iyad Selesnick (Ed.)
    The Temple University Hospital EEG Corpus (TUEG) [1] is the largest publicly available EEG corpus of its type and currently has over 5,000 subscribers (we currently average 35 new subscribers a week). Several valuable subsets of this corpus have been developed including the Temple University Hospital EEG Seizure Corpus (TUSZ) [2] and the Temple University Hospital EEG Artifact Corpus (TUAR) [3]. TUSZ contains manually annotated seizure events and has been widely used to develop seizure detection and prediction technology [4]. TUAR contains manually annotated artifacts and has been used to improve machine learning performance on seizure detection tasks [5]. In this poster, we will discuss recent improvements made to both corpora that are creating opportunities to improve machine learning performance. Two major concerns that were raised when v1.5.2 of TUSZ was released for the Neureka 2020 Epilepsy Challenge were: (1) the subjects contained in the training, development (validation) and blind evaluation sets were not mutually exclusive, and (2) high frequency seizures were not accurately annotated in all files. Regarding (1), there were 50 subjects in dev, 50 subjects in eval, and 592 subjects in train. There was one subject common to dev and eval, five subjects common to dev and train, and 13 subjects common between eval and train. Though this does not substantially influence performance for the current generation of technology, it could be a problem down the line as technology improves. Therefore, we have rebuilt the partitions of the data so that this overlap was removed. This required augmenting the evaluation and development data sets with new subjects that had not been previously annotated so that the size of these subsets remained approximately the same. Since these annotations were done by a new group of annotators, special care was taken to make sure the new annotators followed the same practices as the previous generations of annotators. 
Part of our quality control process was to have the new annotators review all previous annotations. This rigorous training coupled with a strict quality control process where annotators review a significant amount of each other’s work ensured that there is high interrater agreement between the two groups (kappa statistic greater than 0.8) [6]. In the process of reviewing this data, we also decided to split long files into a series of smaller segments to facilitate processing of the data. Some subscribers found it difficult to process long files using Python code, which tends to be very memory intensive. We also found it inefficient to manipulate these long files in our annotation tool. In this release, the maximum duration of any single file is limited to 60 mins. This increased the number of edf files in the dev set from 1012 to 1832. Regarding (2), as part of discussions of several issues raised by a few subscribers, we discovered some files only had low frequency epileptiform events annotated (defined as events that ranged in frequency from 2.5 Hz to 3 Hz), while others had events annotated that contained significant frequency content above 3 Hz. Though there were not many files that had this type of activity, it was enough of a concern to necessitate reviewing the entire corpus. An example of an epileptiform seizure event with frequency content higher than 3 Hz is shown in Figure 1. Annotating these additional events slightly increased the number of seizure events. In v1.5.2, there were 673 seizures, while in v1.5.3 there are 1239 events. One of the fertile areas for technology improvements is artifact reduction. Artifacts and slowing constitute the two major error modalities in seizure detection [3]. This was a major reason we developed TUAR. It can be used to evaluate artifact detection and suppression technology as well as multimodal background models that explicitly model artifacts. 
An issue with TUAR was the practicality of the annotation tags used when there are multiple simultaneous events. An example of such an event is shown in Figure 2. In this section of the file, there is an overlap of eye movement, electrode artifact, and muscle artifact events. We previously annotated such events using a convention that included annotating background along with any artifact that is present. The artifacts present would either be annotated with a single tag (e.g., MUSC) or a coupled artifact tag (e.g., MUSC+ELEC). When multiple channels have background, the tags become crowded and difficult to identify. This is one reason we now support a hierarchical annotation format using XML: annotations can be arbitrarily complex and support overlaps in time. Our annotators also reviewed specific eye movement artifacts (e.g., eye flutter, eyeblinks). Eye movements are often mistaken for seizures due to their similar morphology [7][8]. Our improved understanding of ocular events has allowed us to annotate artifacts in the corpus more carefully. In this poster, we will present statistics on the newest releases of these corpora and discuss the impact these improvements have had on machine learning research. We will compare TUSZ v1.5.3 and TUAR v2.0.0 with previous versions of these corpora. We will release v1.5.3 of TUSZ and v2.0.0 of TUAR in Fall 2021 prior to the symposium.

ACKNOWLEDGMENTS
Research reported in this publication was most recently supported by the National Science Foundation’s Industrial Innovation and Partnerships (IIP) Research Experience for Undergraduates award number 1827565. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the official views of any of these organizations.

REFERENCES
[1] I. Obeid and J. Picone, “The Temple University Hospital EEG Data Corpus,” in Augmentation of Brain Function: Facts, Fiction and Controversy. Volume I: Brain-Machine Interfaces, 1st ed., vol. 10, M. A. Lebedev, Ed. Lausanne, Switzerland: Frontiers Media S.A., 2016, pp. 394–398. https://doi.org/10.3389/fnins.2016.00196.
[2] V. Shah et al., “The Temple University Hospital Seizure Detection Corpus,” Frontiers in Neuroinformatics, vol. 12, pp. 1–6, 2018. https://doi.org/10.3389/fninf.2018.00083.
[3] A. Hamid et al., “The Temple University Artifact Corpus: An Annotated Corpus of EEG Artifacts,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2020, pp. 1–3. https://ieeexplore.ieee.org/document/9353647.
[4] Y. Roy, R. Iskander, and J. Picone, “The Neureka™ 2020 Epilepsy Challenge,” NeuroTechX, 2020. [Online]. Available: https://neureka-challenge.com/. [Accessed: 01-Dec-2021].
[5] S. Rahman, A. Hamid, D. Ochal, I. Obeid, and J. Picone, “Improving the Quality of the TUSZ Corpus,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2020, pp. 1–5. https://ieeexplore.ieee.org/document/9353635.
[6] V. Shah, E. von Weltin, T. Ahsan, I. Obeid, and J. Picone, “On the Use of Non-Experts for Generation of High-Quality Annotations of Seizure Events,” Available: https://www.isip.piconepress.com/publications/unpublished/journals/2019/elsevier_cn/ira. [Accessed: 01-Dec-2021].
[7] D. Ochal, S. Rahman, S. Ferrell, T. Elseify, I. Obeid, and J. Picone, “The Temple University Hospital EEG Corpus: Annotation Guidelines,” Philadelphia, Pennsylvania, USA, 2020. https://www.isip.piconepress.com/publications/reports/2020/tuh_eeg/annotations/.
[8] D. Strayhorn, “The Atlas of Adult Electroencephalography,” EEG Atlas Online, 2014. [Online]. Availabl
  4. Sinkholes develop on carbonate landscapes when caves collapse and can subsequently become lake-like environments if they are flooded by local groundwater. Sediment cores retrieved from sinkholes have yielded high-resolution reconstructions of past environmental change, hydroclimate, and hurricane activity. However, our understanding of the internal sedimentary processes of these systems remains incomplete. Here, we use a multiproxy approach including sedimentology (stratigraphy, coarse-grained particle density, bulk organic matter content), micropaleontology (ostracods), and geochemistry (δ¹³C and δ²H on n-alkanoic acids) to reconstruct evidence for paleolimnology and regional hydroclimate from a continuous stratigraphic record (Emerald Pond sinkhole) in the northern Bahamas that spans the middle to late Holocene. Basal peat at 8.9 m below modern sea level documents the maximum sea-level position at ~8200 cal. yr BP. Subsequent upward vertical migration of the local aquifer caused by regional sea-level rise promoted carbonate-marl deposition from ~8300 to 1700 cal. yr BP. A shift in coarse particle deposition and ostracods at 5500 cal. yr BP suggests some environmental change, which may be related to one or multiple internal or external drivers. Sapropel deposition from ~1700 to 1300 cal. yr BP indicates a fundamental change in limnology to promote increased organic matter preservation, perhaps related to the regional cooling during the Dark Ages Cold Period. We find δ²H₂₈ values are largely invariant from 7700 to 6150 cal. yr BP, suggesting a generally stable hydroclimate (mean −133‰, 1σ = 5‰). The shift to more depleted values (−156‰, 1σ = 19‰) at ~6000–4800 cal. yr BP may be linked to a weakened (eastern displaced) North Atlantic Subtropical High. Nevertheless, additional local hydroclimate records are needed to better disentangle uncertainties from either internal or external influences on the resultant measurements.
  5. Society has long been exposed to naturally occurring nanoparticles (NPs). Because these particles are ubiquitous, biological systems have adapted and built protection against their potential effects. Over the past decades, however, there has been an onslaught of newly engineered nanoparticles released into the environment whose effects on ecosystems are not yet known. Although these materials offer distinct advantages in manufacturing processes, such as odor-free fabric or controlled drug delivery, their fate in nature has yet to be thoroughly investigated. As the already large NP market is expected to grow, due to advances in synthetic biology, it is vital that we increase our understanding of their impacts on human, food and natural ecosystems. Recent studies have shown that NPs affect phytoplankton biomass and diversity in the ocean, solely by regulating micronutrient bioavailability. These types of changes could ultimately impact several biogeochemical cycles, as phytoplankton are responsible for almost half of the primary production on earth. Consequently, this study was designed to evaluate the impact of various concentrations (0 μM, 20 μM, 40 μM, 80 μM and 100 μM) of several manufactured nanoparticles (gold, carbon and iron) on the dynamics of four economically important microalgae strains. Responses such as chlorophyll content, protein, lipid content, lipid profile, biomass and cell morphology were monitored over a period of two weeks. No significant acute toxicity was exhibited within the first 24 hours of exposure. However, after 4 days, a remarkably high mortality rate was detected with increasing NP concentrations of Fe60, C80 and Au60. Iron suspensions were found to be more toxic to the microalgae strains tested than those of gold and carbon under comparable regimes. Further investigations with other nanoparticles, either positively or negatively charged, should provide a deeper understanding of the impacts of these engineered materials on our ecosystems.