Search for: All records

Creators/Authors contains: "Chai, Y"

  1. Black hat hackers use malicious exploits to circumvent security controls and take advantage of system vulnerabilities worldwide, costing the global economy over $450 billion annually. While many organizations are increasingly turning to cyber threat intelligence (CTI) to help prioritize their vulnerabilities, extant CTI processes are often criticized as being reactive to known exploits. One promising data source that can help develop proactive CTI is the vast and ever-evolving Dark Web. In this study, we adopted the computational design science paradigm to design a novel deep learning (DL)-based exploit-vulnerability attention deep structured semantic model (EVA-DSSM) that includes bidirectional processing and attention mechanisms to automatically link exploits from the Dark Web to vulnerabilities. We also devised a novel device vulnerability severity metric (DVSM) that incorporates the exploit post date and vulnerability severity to help cybersecurity professionals with their device prioritization and risk management efforts. We rigorously evaluated the EVA-DSSM against state-of-the-art non-DL and DL-based methods for short text matching on 52,590 exploit-vulnerability linkages across four testbeds: web application, remote, local, and denial of service. Results of these evaluations indicate that the proposed EVA-DSSM achieves precision at 1 scores 20%-41% higher than non-DL approaches and 4%-10% higher than DL-based approaches. We demonstrated the EVA-DSSM’s and DVSM’s practical utility with two CTI case studies: openly accessible systems in the top eight U.S. hospitals and over 20,000 Supervisory Control and Data Acquisition (SCADA) systems worldwide. A complementary user evaluation of the case study results indicated that 45 cybersecurity professionals found the EVA-DSSM and DVSM results more useful for exploit-vulnerability linking and risk prioritization activities than those produced by prevailing approaches.
Given the rising cost of cyberattacks, the EVA-DSSM and DVSM have important implications for analysts in security operations centers, incident response teams, and cybersecurity vendors. 
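
The abstract above reports results as "precision at 1" scores. As a minimal sketch of how that metric is usually computed for a ranking task (this is the generic definition, not the authors' evaluation code), the function below counts the fraction of exploits whose top-ranked vulnerability is a correct link. The function name, the dictionary layout, and the `exploit_*` / `vuln_*` identifiers are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def precision_at_1(ranked: Dict[str, List[str]],
                   relevant: Dict[str, Set[str]]) -> float:
    """Fraction of queries whose top-ranked candidate is a correct match.

    ranked   maps each query (here: an exploit) to its candidates
             (here: vulnerabilities), ordered best first.
    relevant maps each query to the set of correct candidates.
    """
    if not ranked:
        raise ValueError("at least one query is required")
    hits = 0
    for query, candidates in ranked.items():
        # Only the single top-ranked candidate counts for precision at 1.
        if candidates and candidates[0] in relevant.get(query, set()):
            hits += 1
    return hits / len(ranked)


# Hypothetical toy data: two exploits, each with a ranked candidate list.
example_ranked = {
    "exploit_a": ["vuln_1", "vuln_2"],  # top candidate is correct
    "exploit_b": ["vuln_3", "vuln_2"],  # correct candidate is ranked second
}
example_relevant = {
    "exploit_a": {"vuln_1"},
    "exploit_b": {"vuln_2"},
}
example_score = precision_at_1(example_ranked, example_relevant)
```

In this toy example one of the two exploits has a correct top-ranked match, so the score is 0.5. A candidate ranked second contributes nothing, which is why the metric suits a setting where an analyst acts on the first suggested link.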
  2. Black hat hackers use malicious exploits to circumvent security controls and take advantage of system vulnerabilities worldwide, costing the global economy over $450 billion annually. While many organizations are increasingly turning to cyber threat intelligence (CTI) to help prioritize their vulnerabilities, extant CTI processes are often criticized as being reactive to known exploits. One promising data source that can help develop proactive CTI is the vast and ever-evolving Dark Web. In this study, we adopted the computational design science paradigm to design a novel deep learning (DL)-based exploit-vulnerability attention deep structured semantic model (EVA-DSSM) that includes bidirectional processing and attention mechanisms to automatically link exploits from the Dark Web to vulnerabilities. We also devised a novel device vulnerability severity metric (DVSM) that incorporates the exploit post date and vulnerability severity to help cybersecurity professionals with their device prioritization and risk management efforts. We rigorously evaluated the EVA-DSSM against state-of-the-art non-DL and DL-based methods for short text matching on 52,590 exploit-vulnerability linkages across four testbeds: web application, remote, local, and denial of service. Results of these evaluations indicate that the proposed EVA-DSSM achieves precision at 1 scores 20%-41% higher than non-DL approaches and 4%-10% higher than DL-based approaches. We demonstrated the EVA-DSSM's and DVSM's practical utility with two CTI case studies: openly accessible systems in the top eight U.S. hospitals and over 20,000 Supervisory Control and Data Acquisition (SCADA) systems worldwide. A complementary user evaluation of the case study results indicated that 45 cybersecurity professionals found the EVA-DSSM and DVSM results more useful for exploit-vulnerability linking and risk prioritization activities than those produced by prevailing approaches. 
Given the rising cost of cyberattacks, the EVA-DSSM and DVSM have important implications for analysts in security operations centers, incident response teams, and cybersecurity vendors. 
  3. International dark web platforms operating within multiple geopolitical regions and languages host a myriad of hacker assets such as malware, hacking tools, hacking tutorials, and malicious source code. Cybersecurity analytics organizations employ machine learning models trained on human-labeled data to automatically detect these assets and bolster their situational awareness. However, the lack of human-labeled training data is prohibitive when analyzing foreign-language dark web content. In this research note, we adopt the computational design science paradigm to develop a novel IT artifact for cross-lingual hacker asset detection (CLHAD). CLHAD automatically leverages the knowledge learned from English content to detect hacker assets in non-English dark web platforms. CLHAD encompasses a novel adversarial deep representation learning (ADREL) method, which generates multilingual text representations using generative adversarial networks (GANs). Drawing upon the state of the art in cross-lingual knowledge transfer, ADREL is a novel approach to automatically extract transferable text representations and facilitate the analysis of multilingual content. We evaluate CLHAD on Russian, French, and Italian dark web platforms, demonstrate its practical utility in hacker asset profiling, and conduct a proof-of-concept case study. Our analysis suggests that cybersecurity managers may benefit more from focusing on Russian to identify sophisticated hacking assets. In contrast, financial hacker assets are scattered among several dominant dark web languages. Managerial insights for security managers are discussed at operational and strategic levels.
  4. The regularity of devastating cyber-attacks has made cybersecurity a grand societal challenge. Many cybersecurity professionals are closely examining the international Dark Web to proactively pinpoint potential cyber threats. Despite its potential, the Dark Web contains hundreds of thousands of non-English posts. While machine translation (MT) is the prevailing approach to process non-English text, applying MT to hacker forum text results in mistranslations. In this study, we draw upon Long Short-Term Memory (LSTM), Cross-Lingual Knowledge Transfer (CLKT), and Generative Adversarial Networks (GANs) principles to design a novel Adversarial CLKT (A-CLKT) approach. A-CLKT operates on untranslated text to retain the original semantics of the language and leverages the collective knowledge about cyber threats across languages to create a language-invariant representation without any manual feature engineering or external resources. Three experiments demonstrate how A-CLKT outperforms state-of-the-art machine learning, deep learning, and CLKT algorithms in identifying cyber threats in French and Russian forums.
  5. OT 081 is a well-known, luminous blazar that is remarkably variable in many energy bands. We present the first broadband study of the source, which includes very high energy (VHE, $$E > 100$$ GeV) $$\gamma$$-ray data taken by the MAGIC (Major Atmospheric Gamma-ray Imaging Cherenkov telescopes) and H.E.S.S. (High Energy Stereoscopic System) imaging Cherenkov telescopes. The discovery of VHE $$\gamma$$-ray emission happened during a high state of $$\gamma$$-ray activity in July 2016, observed by many instruments from radio to VHE $$\gamma$$-rays. We identify four states of activity of the source, one of which includes VHE $$\gamma$$-ray emission. Variability in the VHE domain is found on daily time-scales. The intrinsic VHE spectrum can be described by a power law with index $$3.27\pm 0.44_{\rm stat}\pm 0.15_{\rm sys}$$ (MAGIC) and $$3.39\pm 0.58_{\rm stat}\pm 0.64_{\rm sys}$$ (H.E.S.S.) in the energy range of 55–300 and 120–500 GeV, respectively. The broadband emission cannot be successfully reproduced by a simple one-zone synchrotron self-Compton model. Instead, an additional external Compton component is required. We test a lepto-hadronic model that reproduces the data set well and a proton-synchrotron-dominated model that requires an extreme proton luminosity. Emission models that are able to successfully represent the data place the emitting region well outside of the broad-line region to a location at which the radiative environment is dominated by the infrared thermal radiation field of the dusty torus. In the scenario described by this flaring activity, the source appears to be a flat spectrum radio quasar (FSRQ), in contrast with past categorizations. This suggests that the source can be considered to be a transitional blazar, intermediate between BL Lac and FSRQ objects.
    Free, publicly-accessible full text available May 15, 2026
  6. Aims. Mrk 421 was in its most active state around early 2010, which led to the highest TeV gamma-ray flux ever recorded from any active galactic nucleus (AGN). We aim to characterize the multiwavelength behavior during this exceptional year for Mrk 421, and evaluate whether it is consistent with the picture derived with data from other less exceptional years. Methods. We investigated the period from November 5, 2009 (MJD 55140) until July 3, 2010 (MJD 55380) with extensive coverage from very-high-energy (VHE; E > 100 GeV) gamma rays to radio with MAGIC, VERITAS, Fermi-LAT, RXTE, Swift, GASP-WEBT, VLBA, and a variety of additional optical and radio telescopes. We characterized the variability by deriving fractional variabilities as well as power spectral densities (PSDs). In addition, we investigated images of the jet taken with VLBA and the correlation behavior among different energy bands. Results. Mrk 421 was in widely different states of activity throughout the campaign, ranging from a low-emission state to its highest VHE flux ever recorded. We find the strongest variability in X-rays and VHE gamma rays, and PSDs compatible with power-law functions with indices around 1.5. We observe strong correlations between X-rays and VHE gamma rays at zero time lag with varying characteristics depending on the exact energy band. We also report a marginally significant (∼3σ) positive correlation between high-energy (HE; E > 100 MeV) gamma rays and the ultraviolet band. We detected marginally significant (∼3σ) correlations between the HE and VHE gamma rays, and between HE gamma rays and X-rays, that disappear when the large flare in February 2010 is excluded from the correlation study, hence indicating the exceptionality of this flaring event in comparison with the rest of the campaign. The 2010 violent activity of Mrk 421 also yielded the first ejection of features in the VLBA images of the jet of Mrk 421.
Yet the large uncertainties in the ejection times of these unprecedented radio features prevent us from firmly associating them with the specific flares recorded during the 2010 campaign. We also show that the collected multi-instrument data are consistent with a scenario where the emission is dominated by two regions, a compact zone and an extended zone, which could be considered as a simplified implementation of an energy-stratified jet as suggested by recent IXPE observations.
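
The abstract above mentions deriving fractional variabilities but does not state the formula. A minimal sketch follows, assuming the widely used excess-variance definition F_var = sqrt((S^2 - <sigma_err^2>) / <x>^2), where S^2 is the sample variance of the fluxes, <sigma_err^2> is the mean squared measurement error, and <x> is the mean flux. Whether the authors used exactly this form is an assumption, and the function name and input values are hypothetical.

```python
import math
from statistics import mean, variance
from typing import Sequence


def fractional_variability(flux: Sequence[float],
                           flux_err: Sequence[float]) -> float:
    """Excess-variance estimate of the fractional variability of a light curve.

    Returns NaN when the measurement noise exceeds the observed variance,
    in which case intrinsic variability cannot be established this way.
    """
    if len(flux) != len(flux_err) or len(flux) < 2:
        raise ValueError("need at least two flux points with matching errors")
    sample_var = variance(flux)                       # S^2, uses n - 1
    mean_sq_err = mean([e * e for e in flux_err])     # <sigma_err^2>
    excess = sample_var - mean_sq_err
    if excess < 0:
        return math.nan
    return math.sqrt(excess) / abs(mean(flux))


# Hypothetical light curve in arbitrary flux units.
example_fvar = fractional_variability([1.0, 2.0, 3.0], [0.1, 0.1, 0.1])
```

Subtracting the mean squared error removes the scatter expected from measurement noise alone, so bands with large uncertainties are not credited with spurious variability.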
    Free, publicly-accessible full text available February 1, 2026
  7. The BL Lacertae object VER J0521+211 underwent a notable flaring episode in February 2020. A short-term monitoring campaign, led by the MAGIC (Major Atmospheric Gamma Imaging Cherenkov) collaboration, covering a wide energy range from radio to very high-energy (VHE, 100 GeV < E < 100 TeV) gamma rays was organised to study its evolution. These observations resulted in a consistent detection of the source over six consecutive nights in the VHE gamma-ray domain. Combining these nightly observations with an extensive set of multi-wavelength data made modelling of the blazar’s spectral energy distribution (SED) possible during the flare. This modelling was performed with a focus on two plausible emission mechanisms: (i) a leptonic two-zone synchrotron-self-Compton scenario, and (ii) a lepto-hadronic one-zone scenario. Both models effectively replicated the observed SED from radio to the VHE gamma-ray band. Furthermore, by introducing a set of evolving parameters, both models were successful in reproducing the evolution of the fluxes measured in different bands throughout the observing campaign. Notably, the lepto-hadronic model predicts enhanced photon and neutrino fluxes at ultra-high energies (E > 100 TeV). While the photon component, generated via decay of neutral pions, is not directly observable as it is subject to intense pair production (and therefore extinction) through interactions with the cosmic microwave background photons, neutrino detectors (e.g. IceCube) can probe the predicted neutrino component. Finally, the analysis of the gamma-ray spectra, observed by MAGIC and the Fermi-LAT telescopes, yielded a conservative 95% confidence upper limit of z ≤ 0.244 for the redshift of this blazar.
    Free, publicly-accessible full text available February 1, 2026
  8. Context. Blazars exhibit strong variability across the entire electromagnetic spectrum, including periods of high-flux states commonly known as flares. The physical mechanisms in blazar jets responsible for flares remain poorly understood to date. Aims. Our aim is to better understand the emission mechanisms during blazar flares using X-ray polarimetry and broadband observations from the archetypical TeV blazar Mrk 421, which can be studied with higher accuracy than other blazars that are dimmer and/or located farther away. Methods. We studied a flaring activity from December 2023 that was characterized from radio to very high-energy (VHE; E > 0.1 TeV) gamma rays with MAGIC, Fermi-LAT, Swift, XMM-Newton, and several optical and radio telescopes. These observations included, for the first time for a gamma-ray flare of a blazar, simultaneous X-ray polarization measurements with IXPE, in addition to optical and radio polarimetry data. We quantify the variability and correlations among the multi-band flux and polarization measurements, and describe the varying broadband emission within a theoretical scenario constrained by the polarization data. Results. We find substantial variability in both X-rays and VHE gamma rays throughout the campaign, with the highest VHE flux above 0.2 TeV occurring during the IXPE observing window, and exceeding twice the flux of the Crab Nebula. However, the VHE and X-ray spectra are on average softer, and the correlation between these two bands is weaker than those reported in the previous flares of Mrk 421. IXPE reveals an X-ray polarization degree significantly higher than that at radio and optical frequencies, similar to previous results for Mrk 421 and other high synchrotron peaked blazars. In contrast to past observations, the X-ray polarization angle varies by ∼100° on timescales of days, and the polarization degree changes by more than a factor of 4.
The highest X-ray polarization degree, analyzed in 12 h time intervals, reaches 26 ± 2%, around which an X-ray counter-clockwise hysteresis loop is measured with XMM-Newton. It suggests that the X-ray emission comes from particles close to the high-energy cutoff, hence possibly probing an extreme case of the Turbulent Extreme Multi-Zone model for which the chromatic trend in the polarization may be more pronounced than theoretically predicted. We model the broadband emission with a simplified stratified jet model throughout the flare. The polarization measurements imply an electron distribution in the X-ray emitting region with a very high minimum Lorentz factor ($$\gamma'_{\mathrm{min}} \gtrsim 10^4$$), which is expected in electron-ion plasma, as well as a variation of the emitting region size of up to a factor of 3 during the flaring activity. We find no correlation between the fluxes and the evolution of the model parameters, which indicates a stochastic nature of the underlying physical mechanism that likely explains the lack of a tight X-ray/VHE correlation during this flaring activity. Such behavior would be expected in a highly turbulent electron-ion plasma crossing a shock front.
    Free, publicly-accessible full text available March 1, 2026
  9. Aims. We have performed the first broadband study of Mrk 421 from radio to TeV gamma rays with simultaneous measurements of the X-ray polarization from IXPE. Methods. The data were collected as part of an extensive multiwavelength campaign carried out between May and June 2022 using MAGIC, Fermi-LAT, NuSTAR, XMM-Newton, Swift, and several optical and radio telescopes to complement IXPE data. Results. During the IXPE exposures, the measured 0.2–1 TeV flux was close to the quiescent state and ranged from 25% to 50% of the Crab Nebula without intra-night variability. Throughout the campaign, the very high-energy (VHE) and X-ray emission are positively correlated at a 4σ significance level. The IXPE measurements reveal an X-ray polarization degree that is a factor of 2–5 higher than in the optical/radio bands; that implies an energy-stratified jet in which the VHE photons are emitted co-spatially with the X-rays, in the vicinity of a shock front. The June 2022 observations exhibit a rotation of the X-ray polarization angle. Despite no simultaneous VHE coverage being available during a large fraction of the swing, the Swift-XRT monitoring reveals an X-ray flux increase with a clear spectral hardening. This suggests that flares in high synchrotron peaked blazars can be accompanied by a polarization angle rotation, as observed in some flat spectrum radio quasars. Finally, during the polarization angle rotation, NuSTAR data reveal two contiguous spectral hysteresis loops in opposite directions (clockwise and counterclockwise), implying important changes in the particle acceleration efficiency on approximately hour timescales.
  10. Context. The nearby elliptical galaxy M87 contains one of only two supermassive black holes whose emission surrounding the event horizon has been imaged by the Event Horizon Telescope (EHT). In 2018, more than two dozen multi-wavelength (MWL) facilities (from radio to γ-ray energies) took part in the second M87 EHT campaign. Aims. The goal of this extensive MWL campaign was to better understand the physics of the accreting black hole M87*, the relationship between the inflow and inner jets, and the high-energy particle acceleration. Understanding the complex astrophysics is also a necessary first step towards performing further tests of general relativity. Methods. The MWL campaign took place in April 2018, overlapping with the EHT M87* observations. We present a new, contemporaneous spectral energy distribution (SED) ranging from radio to very high-energy (VHE) γ-rays as well as details of the individual observations and light curves. We also conducted phenomenological modelling to investigate the basic source properties. Results. We present the first VHE γ-ray flare from M87 detected since 2010. The flux above 350 GeV more than doubled within a period of ≈36 hours. We find that the X-ray flux is enhanced by about a factor of two compared to 2017, while the radio and millimetre core fluxes are consistent between 2017 and 2018. We detect evidence for a monotonically increasing jet position angle that corresponds to variations in the bright spot of the EHT image. Conclusions. Our results show the value of continued MWL monitoring together with precision imaging for addressing the origins of high-energy particle acceleration. While we cannot currently pinpoint the precise location where such acceleration takes place, the new VHE γ-ray flare already presents a challenge to simple one-zone leptonic emission model approaches, and it emphasises the need for combined image and spectral modelling.
    Free, publicly-accessible full text available December 1, 2025