Creators/Authors contains: "Chai, Y."


  1. Black hat hackers use malicious exploits to circumvent security controls and take advantage of system vulnerabilities worldwide, costing the global economy over $450 billion annually. While many organizations are increasingly turning to cyber threat intelligence (CTI) to help prioritize their vulnerabilities, extant CTI processes are often criticized as being reactive to known exploits. One promising data source that can help develop proactive CTI is the vast and ever-evolving Dark Web. In this study, we adopted the computational design science paradigm to design a novel deep learning (DL)-based exploit-vulnerability attention deep structured semantic model (EVA-DSSM) that includes bidirectional processing and attention mechanisms to automatically link exploits from the Dark Web to vulnerabilities. We also devised a novel device vulnerability severity metric (DVSM) that incorporates the exploit post date and vulnerability severity to help cybersecurity professionals with their device prioritization and risk management efforts. We rigorously evaluated the EVA-DSSM against state-of-the-art non-DL and DL-based methods for short text matching on 52,590 exploit-vulnerability linkages across four testbeds: web application, remote, local, and denial of service. Results of these evaluations indicate that the proposed EVA-DSSM achieves precision-at-1 scores 20%-41% higher than non-DL approaches and 4%-10% higher than DL-based approaches. We demonstrated the EVA-DSSM’s and DVSM’s practical utility with two CTI case studies: openly accessible systems in the top eight U.S. hospitals and over 20,000 Supervisory Control and Data Acquisition (SCADA) systems worldwide. A complementary user evaluation of the case study results indicated that 45 cybersecurity professionals found the EVA-DSSM and DVSM results more useful for exploit-vulnerability linking and risk prioritization activities than those produced by prevailing approaches. Given the rising cost of cyberattacks, the EVA-DSSM and DVSM have important implications for analysts in security operations centers, incident response teams, and cybersecurity vendors.
  3. International dark web platforms operating within multiple geopolitical regions and languages host a myriad of hacker assets such as malware, hacking tools, hacking tutorials, and malicious source code. Cybersecurity analytics organizations employ machine learning models trained on human-labeled data to automatically detect these assets and bolster their situational awareness. However, the lack of human-labeled training data is prohibitive when analyzing foreign-language dark web content. In this research note, we adopt the computational design science paradigm to develop a novel IT artifact for cross-lingual hacker asset detection (CLHAD). CLHAD automatically leverages the knowledge learned from English content to detect hacker assets in non-English dark web platforms. CLHAD encompasses a novel adversarial deep representation learning (ADREL) method, which generates multilingual text representations using generative adversarial networks (GANs). Drawing upon the state of the art in cross-lingual knowledge transfer, ADREL is a novel approach to automatically extract transferable text representations and facilitate the analysis of multilingual content. We evaluate CLHAD on Russian, French, and Italian dark web platforms, demonstrate its practical utility in hacker asset profiling, and conduct a proof-of-concept case study. Our analysis suggests that cybersecurity managers may benefit more from focusing on Russian to identify sophisticated hacking assets. In contrast, financial hacker assets are scattered among several dominant dark web languages. Managerial insights for security managers are discussed at operational and strategic levels.
  4. The regularity of devastating cyber-attacks has made cybersecurity a grand societal challenge. Many cybersecurity professionals are closely examining the international Dark Web to proactively pinpoint potential cyber threats. Despite its potential, the Dark Web contains hundreds of thousands of non-English posts. While machine translation (MT) is the prevailing approach to processing non-English text, applying MT to hacker forum text results in mistranslations. In this study, we draw upon Long Short-Term Memory (LSTM), Cross-Lingual Knowledge Transfer (CLKT), and Generative Adversarial Networks (GANs) principles to design a novel Adversarial CLKT (A-CLKT) approach. A-CLKT operates on untranslated text to retain the original semantics of the language and leverages the collective knowledge about cyber threats across languages to create a language-invariant representation without any manual feature engineering or external resources. Three experiments demonstrate how A-CLKT outperforms state-of-the-art machine learning, deep learning, and CLKT algorithms in identifying cyber threats in French and Russian forums.
  5. ABSTRACT PG 1553+113 is one of the few blazars with a convincing quasi-periodic emission in the gamma-ray band. The source is also a very high energy (VHE; >100 GeV) gamma-ray emitter. To better understand its properties and identify the underlying physical processes driving its variability, the MAGIC Collaboration initiated a multiyear, multiwavelength monitoring campaign in 2015 involving the OVRO 40-m and Medicina radio telescopes, REM, KVA, and the MAGIC telescopes, Swift and Fermi satellites, and the WEBT network. The analysis presented in this paper uses data until 2017 and focuses on the characterization of the variability. The gamma-ray data show a (hint of a) periodic signal compatible with literature, but the X-ray and VHE gamma-ray data do not show statistical evidence for a periodic signal. In other bands, the data are compatible with the gamma-ray period, but with a relatively high p-value. The complex connection between the low- and high-energy emission and the non-monochromatic modulation and changes in flux suggests that a simple one-zone model is unable to explain all the variability. Instead, a model including a periodic component along with multiple emission zones is required.
  6. Abstract We report on a long-lasting, elevated gamma-ray flux state from VER J0521+211 observed by VERITAS, MAGIC, and Fermi-LAT in 2013 and 2014. The peak integral flux above 200 GeV measured with the nightly binned light curve is (8.8 ± 0.4) × 10⁻⁷ photons m⁻² s⁻¹, or ∼37% of the Crab Nebula flux. Multiwavelength observations from X-ray, UV, and optical instruments are also presented. A moderate correlation between the X-ray and TeV gamma-ray fluxes was observed, and the X-ray spectrum appeared harder when the flux was higher. Using the gamma-ray spectrum and four models of the extragalactic background light (EBL), a conservative 95% confidence upper limit on the redshift of the source was found to be z ≤ 0.31. Unlike the gamma-ray and X-ray bands, the optical flux did not increase significantly during the studied period compared to the archival low-state flux. The spectral variability from optical to X-ray bands suggests that the synchrotron peak of the spectral energy distribution (SED) may become broader during flaring states, which can be adequately described with a one-zone synchrotron self-Compton model varying the high-energy end of the underlying particle spectrum. The synchrotron peak frequency of the SED and the radio morphology of the jet from the MOJAVE program are consistent with the source being an intermediate-frequency-peaked BL Lac object.
  7. ABSTRACT MAXI J1820+070 is a low-mass X-ray binary with a black hole (BH) as a compact object. This binary underwent an exceptionally bright X-ray outburst from 2018 March to October, showing evidence of a non-thermal particle population through its radio emission during this whole period. The combined results of 59.5 h of observations of the MAXI J1820+070 outburst with the H.E.S.S., MAGIC and VERITAS experiments at energies above 200 GeV are presented, together with Fermi-LAT data between 0.1 and 500 GeV, and multiwavelength observations from radio to X-rays. Gamma-ray emission is not detected from MAXI J1820+070, but the obtained upper limits and the multiwavelength data allow us to put meaningful constraints on the source properties under reasonable assumptions regarding the non-thermal particle population and the jet synchrotron spectrum. In particular, it is possible to show that, if a high-energy (HE) gamma-ray emitting region is present during the hard state of the source, its predicted flux should be at most a factor of 20 below the obtained Fermi-LAT upper limits, and closer to them for magnetic fields significantly below equipartition. During the state transitions, under the plausible assumption that electrons are accelerated up to ∼500 GeV, the multiwavelength data and the gamma-ray upper limits lead consistently to the conclusion that a potential HE and very-HE gamma-ray emitting region should be located at a distance from the BH ranging between 10¹¹ and 10¹³ cm. Similar outbursts from low-mass X-ray binaries might be detectable in the near future with upcoming instruments such as CTA.
  8. Abstract The results of gamma-ray observations of the binary system HESS J0632+057 collected during 450 hr over 15 yr, between 2004 and 2019, are presented. Data taken with the atmospheric Cherenkov telescopes H.E.S.S., MAGIC, and VERITAS at energies above 350 GeV were used together with observations at X-ray energies obtained with Swift-XRT, Chandra, XMM-Newton, NuSTAR, and Suzaku. Some of these observations were accompanied by measurements of the Hα emission line. A significant detection of the modulation of the very high-energy gamma-ray fluxes with a period of 316.7 ± 4.4 days is reported, consistent with the period of 317.3 ± 0.7 days obtained with a refined analysis of X-ray data. The analysis of data from four orbital cycles with dense observational coverage reveals short-timescale variability, with flux-decay timescales of less than 20 days at very high energies. Flux variations observed over a timescale of several years indicate orbit-to-orbit variability. The analysis confirms the previously reported correlation of X-ray and gamma-ray emission from the system at very high significance, but cannot find any correlation of optical Hα parameters with fluxes at X-ray or gamma-ray energies in simultaneous observations. The key finding is that the emission of HESS J0632+057 in the X-ray and gamma-ray energy bands is highly variable on different timescales. The ratio of gamma-ray to X-ray flux shows the equality or even dominance of the gamma-ray energy range. This wealth of new data is interpreted taking into account the insufficient knowledge of the ephemeris of the system, and discussed in the context of results reported on other gamma-ray binary systems.
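
The HESS J0632+057 abstract calls the gamma-ray period (316.7 ± 4.4 days) consistent with the X-ray period (317.3 ± 0.7 days). As a rough illustration of that statement, the sketch below expresses the difference between the two values in units of their combined uncertainty. It assumes independent, Gaussian uncertainties added in quadrature; the abstract does not say how consistency was assessed, and the function name `tension_sigma` is hypothetical.

```python
import math


def tension_sigma(a: float, sigma_a: float, b: float, sigma_b: float) -> float:
    """Difference between two measurements in units of their combined
    uncertainty, assuming independent Gaussian errors (an assumption here)."""
    return abs(a - b) / math.hypot(sigma_a, sigma_b)


# Periods in days, as quoted in the abstract.
z = tension_sigma(316.7, 4.4, 317.3, 0.7)
print(f"{z:.2f} sigma")  # prints "0.13 sigma"
```

Under these assumptions the two periods differ by far less than one standard deviation, in line with the abstract's wording.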
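
The EVA-DSSM abstract reports results as precision at 1, the fraction of queries whose top-ranked candidate is the correct match. The sketch below shows how that metric is typically computed for a linking task. The exploit and vulnerability identifiers are made-up toy data, and the function is a generic illustration, not the authors' evaluation code.

```python
from typing import Dict, List


def precision_at_1(rankings: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Fraction of queries whose first-ranked candidate equals the gold match."""
    if not rankings:
        raise ValueError("no queries to score")
    hits = sum(
        1
        for query, ranked in rankings.items()
        if ranked and ranked[0] == gold.get(query)
    )
    return hits / len(rankings)


# Hypothetical toy data: each exploit maps to a ranked list of candidate vulnerabilities.
rankings = {
    "exploit_a": ["vuln_1", "vuln_2"],
    "exploit_b": ["vuln_3", "vuln_2"],
    "exploit_c": ["vuln_5", "vuln_4"],
    "exploit_d": ["vuln_6"],
}
gold = {
    "exploit_a": "vuln_1",
    "exploit_b": "vuln_2",
    "exploit_c": "vuln_4",
    "exploit_d": "vuln_6",
}

score = precision_at_1(rankings, gold)
print(score)  # prints 0.5: two of the four top-ranked candidates are correct
```

Only the first position of each ranked list is scored, so a correct match ranked second (as for `exploit_b`) earns no credit.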