skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: A Review of Dark Web: Trends and Future Directions
The dark web is often discussed in taboo by many who are unfamiliar with the subject. However, this paper takes a dive into the skeleton of what constructs the dark web by compiling the research of published essays. The Onion Router (TOR) and other discussed browsers are specialized web browsers that provide anonymity by going through multiple servers and encrypted networks between the host and client, hiding the IP address of both ends. This provides difficulty in terms of controlling or monitoring the dark web, leading to its popularity in criminal underworlds. In this work, we provide an overview of data mining and penetration testing tools that are being widely used to crawl and collect data. We compare the tools to provide strengths and weaknesses of the tools while providing challenges of harnessing massive data from dark web using crawlers and penetration testing tools including machine learning (ML) techniques. Despite the effort to crawl dark web has progressed, there are still rooms to advance existing approaches to combat the ever-changing landscape of the dark web.  more » « less
Award ID(s):
2100115 1723578
PAR ID:
10347029
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEE Conference on Computers, Software & Applications
Page Range / eLocation ID:
1780-1785
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Online trackers are invasive as they track our digital footprints, many of which are sensitive in nature, and when aggregated over time, they can help infer intricate details about our lifestyles and habits. Although much research has been conducted to understand the effectiveness of existing countermeasures for the desktop platform, little is known about how mobile browsers have evolved to handle online trackers. With mobile devices now generating more web traffic than their desktop counterparts, we fill this research gap through a large-scale comparative analysis of mobile web browsers. We crawl 10K valid websites from the Tranco list on real mobile devices. Our data collection process covers both popular generic browsers (e.g., Chrome, Firefox, and Safari) as well as privacy-focused browsers (e.g., Brave, Duck Duck Go, and Firefox-Focus). We use dynamic analysis of runtime execution traces and static analysis of source codes to highlight the tracking behavior of invasive fingerprinters. We also find evidence of tailored content being served to different browsers. In particular, we note that Firefox Focus sees altered script code, whereas Brave and Duck Duck Go have highly similar content. To test the privacy protection of browsers, we measure the responses of each browser in blocking trackers and advertisers and note the strengths and weaknesses of privacy browsers. To establish ground truth, we use well-known block lists, including EasyList, EasyPrivacy, Disconnect and WhoTracksMe and find that Brave generally blocks the highest number of content that should be blocked as per these lists. Focus performs better against social trackers, and Duck Duck Go restricts third-party trackers that perform email-based tracking. 
    more » « less
  2. Abstract Over half of all visits to websites now take place in a mobile browser, yet the majority of web privacy studies take the vantage point of desktop browsers, use emulated mobile browsers, or focus on just a single mobile browser instead. In this paper, we present a comprehensive web-tracking measurement study on mobile browsers and privacy-focused mobile browsers. Our study leverages a new web measurement infrastructure, OmniCrawl, which we develop to drive browsers on desktop computers and smartphones located on two continents. We capture web tracking measurements using 42 different non-emulated browsers simultaneously. We find that the third-party advertising and tracking ecosystem of mobile browsers is more similar to that of desktop browsers than previous findings suggested. We study privacy-focused browsers and find their protections differ significantly and in general are less for lower-ranked sites. Our findings also show that common methodological choices made by web measurement studies, such as the use of emulated mobile browsers and Selenium, can lead to website behavior that deviates from what actual users experience. 
    more » « less
  3. Fodor, Anthony (Ed.)
    ABSTRACT Are two adjacent genes in the same operon? What are the order and spacing between several transcription factor binding sites? Genome browsers are software data visualization and exploration tools that enable biologists to answer questions such as these. In this paper, we report on a major update to our browser, Genome Explorer, that provides nearly instantaneous scaling and traversing of a genome, enabling users to quickly and easily zoom into an area of interest. The user can rapidly move between scales that depict the entire genome, individual genes, and the sequence; Genome Explorer presents the most relevant detail and context for each scale. By downloading the data for the entire genome to the user’s web browser and dynamically generating visualizations locally, we enable fine control of zoom and pan functions and real-time redrawing of the visualization, resulting in smoother and more intuitive exploration of a genome than is possible with other browsers. Further, genome features are presented together, in-line, using familiar graphical depictions. In contrast, many other browsers depict genome features using data tracks, which have low information density and can visually obscure the relative positions of features. Genome Explorer diagrams have a high information density that provides larger amounts of genome context and sequence information to be presented in a given-sized monitor than for tracks-based browsers. Genome Explorer provides optional data tracks for the analysis of large-scale data sets and a unique comparative mode that aligns genomes at orthologous genes with synchronized zooming. IMPORTANCEGenome browsers provide graphical depictions of genome information to speed the uptake of complex genome data by scientists. They provide search operations to help scientists find information and zoom operations to enable scientists to view genome features at different resolutions. We introduce the Genome Explorer browser, which provides extremely fast zooming and panning of genome visualizations and displays with high information density. 
    more » « less
  4. International dark web platforms operating within multiple geopolitical regions and languages host a myriad of hacker assets such as malware, hacking tools, hacking tutorials, and malicious source code. Cybersecurity analytics organizations employ machine learning models trained on human-labeled data to automatically detect these assets and bolster their situational awareness. However, the lack of human-labeled training data is prohibitive when analyzing foreign-language dark web content. In this research note, we adopt the computational design science paradigm to develop a novel IT artifact for cross-lingual hacker asset detection(CLHAD). CLHAD automatically leverages the knowledge learned from English content to detect hacker assets in non-English dark web platforms. CLHAD encompasses a novel Adversarial deep representation learning (ADREL) method, which generates multilingual text representations using generative adversarial networks (GANs). Drawing upon the state of the art in cross-lingual knowledge transfer, ADREL is a novel approach to automatically extract transferable text representations and facilitate the analysis of multilingual content. We evaluate CLHAD on Russian, French, and Italian dark web platforms and demonstrate its practical utility in hacker asset profiling, and conduct a proof-of-concept case study. Our analysis suggests that cybersecurity managers may benefit more from focusing on Russian to identify sophisticated hacking assets. In contrast, financial hacker assets are scattered among several dominant dark web languages. Managerial insights for security managers are discussed at operational and strategic levels. 
    more » « less
  5. Abstract Current genome sequencing technologies have made it possible to generate highly contiguous genome assemblies for non-model animal species. Despite advances in genome assembly methods, there is still room for improvement in the delineation of specific gene features in the genomes. Here we present genome visualization and annotation tools to support seven livestock species (bovine, chicken, goat, horse, pig, sheep, and water buffalo), available in a new resource called AgAnimalGenomes. In addition to supporting the manual refinement of gene models, these browsers provide visualization tracks for hundreds of RNAseq experiments, as well as data generated by the Functional Annotation of Animal Genomes (FAANG) Consortium. For species with predicted gene sets from both Ensembl and RefSeq, the browsers provide special tracks showing the thousands of protein-coding genes that disagree across the two gene sources, serving as a valuable resource to alert researchers to gene model issues that may affect data interpretation. We describe the data and search methods available in the new genome browsers and how to use the provided tools to edit and create new gene models. 
    more » « less