Search for: All records

Creators/Authors contains: "Chen, Hsinchun"

Counteracting Dark Web Text-Based CAPTCHA with Generative Adversarial Learning for Proactive Cyber Threat Intelligence

https://doi.org/10.1145/3505226

Zhang, Ning ; Ebrahimi, Mohammadreza ; Li, Weifeng ; Chen, Hsinchun ( June 2022 , ACM Transactions on Management Information Systems)

Automated monitoring of dark web (DW) platforms on a large scale is the first step toward developing proactive Cyber Threat Intelligence (CTI). While there are efficient methods for collecting data from the surface web, large-scale dark web data collection is often hindered by anti-crawling measures. In particular, text-based CAPTCHA serves as the most prevalent and prohibitive type of these measures in the dark web. Text-based CAPTCHA identifies and blocks automated crawlers by forcing the user to enter a combination of hard-to-recognize alphanumeric characters. In the dark web, CAPTCHA images are meticulously designed with additional background noise and variable character length to prevent automated CAPTCHA breaking. Existing automated CAPTCHA breaking methods have difficulties in overcoming these dark web challenges. As such, solving dark web text-based CAPTCHA has relied heavily on human involvement, which is labor-intensive and time-consuming. In this study, we propose a novel framework for automated breaking of dark web CAPTCHA to facilitate dark web data collection. This framework encompasses a novel generative method to recognize dark web text-based CAPTCHA with noisy background and variable character length. To eliminate the need for human involvement, the proposed framework utilizes a Generative Adversarial Network (GAN) to counteract dark web background noise and leverages an enhanced character segmentation algorithm to handle CAPTCHA images with variable character length. Our proposed framework, DW-GAN, was systematically evaluated on multiple dark web CAPTCHA testbeds. DW-GAN significantly outperformed the state-of-the-art benchmark methods on all datasets, achieving a success rate of over 94.4% on a carefully collected real-world dark web dataset. We further conducted a case study on an emergent Dark Net Marketplace (DNM) to demonstrate that DW-GAN eliminated human involvement by automatically solving CAPTCHA challenges with no more than three attempts. Our research enables the CTI community to develop advanced, large-scale dark web monitoring. We make DW-GAN code available to the community as an open-source tool on GitHub.
Full Text Available
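
The abstract above mentions a character segmentation step for CAPTCHA images with a variable number of characters but does not describe it. As a rough illustration only, the sketch below shows a basic vertical-projection segmentation, a common baseline and not the DW-GAN algorithm. It assumes the image has already been denoised and binarized (1 for ink, 0 for background); the function name `segment_columns` and the `min_width` parameter are hypothetical.

```python
from typing import List, Tuple


def segment_columns(image: List[List[int]], min_width: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column spans of candidate characters.

    image[r][c] is 1 for ink and 0 for background (an assumed input format).
    A span is a maximal run of columns containing at least one ink pixel;
    `end` is exclusive. Spans narrower than `min_width` are dropped.
    """
    if not image:
        return []
    width = len(image[0])
    # Vertical projection: number of ink pixels in each column.
    projection = [sum(row[c] for row in image) for c in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None
    for c, count in enumerate(projection):
        if count > 0 and start is None:
            start = c  # a character begins at this column
        elif count == 0 and start is not None:
            if c - start >= min_width:
                spans.append((start, c))  # the character ended before column c
            start = None
    # Close a span that runs to the right edge of the image.
    if start is not None and width - start >= min_width:
        spans.append((start, width))
    return spans
```

Because the number of spans is discovered from the image rather than fixed in advance, this kind of approach does not need to know the character count beforehand. It breaks down when characters touch or overlap, which is presumably one reason a more elaborate method is needed in practice.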
ACM KDD AI4Cyber/MLHat: Workshop on AI-enabled Cybersecurity Analytics and Deployable Defense

https://doi.org/10.1145/3534678.3542894

Samtani, Sagar ; Wang, Gang ; Ahmadzadeh, Ali ; Ciptadi, Arridhana ; Yang, Shanchieh ; Chen, Hsinchun ( August 2022 , ACM KDD AI4Cyber/MLHat)

Federal funding agencies and industry entities are seeking innovative approaches to address the ever-growing cybersecurity crisis. Increasingly, numerous cybersecurity thought leaders are indicating that Artificial Intelligence (AI)-enabled analytics can help tackle key cybersecurity tasks and deploy defenses. This half-day workshop, co-located with ACM KDD, sought to attain significant research contributions to various aspects of AI-enabled analytics for cybersecurity applications and deployable defense solutions from academics and practitioners. This workshop was a joint workshop of the 2021 AI-enabled Cybersecurity Analytics and 2021 International Workshop on Deployable Machine Learning for Security Defense. As such, we developed an interdisciplinary Program Committee with significant experience in various aspects of AI, cybersecurity, and/or deployable defense.
Full Text Available
Key Factors Affecting User Adoption of Open-Access Data Repositories in Intelligence and Security Informatics: An Affordance Perspective

https://doi.org/10.1145/3460823

Wen, Bo ; Hu, Paul Jen-Hwa ; Ebrahimi, Mohammadreza ; Chen, Hsinchun ( March 2022 , ACM Transactions on Management Information Systems)

Rich, diverse cybersecurity data are critical for efforts by the intelligence and security informatics (ISI) community. Although open-access data repositories (OADRs) provide tremendous benefits for ISI researchers and practitioners, determinants of their adoption remain understudied. Drawing on affordance theory and extant ISI literature, this study proposes a factor model to explain how the essential and unique affordances of an OADR (i.e., relevance, accessibility, and integration) affect individual professionals' intentions to use and collaborate with AZSecure, a major OADR. A survey study designed to test the model and hypotheses reveals that the effects of affordances on ISI professionals' intentions to use and collaborate are mediated by perceived usefulness and ease of use, which then jointly determine their perceived value. This study advances ISI research by specifying three important affordances of OADRs; it also contributes to extant technology adoption literature by scrutinizing and affirming the interplay of essential user acceptance and value perceptions to explain ISI professionals' adoptions of OADRs.
Full Text Available
Distilling Contextual Embeddings Into A Static Word Embedding For Improving Hacker Forum Analytics

https://doi.org/10.1109/ISI53945.2021.9624848

Ampel, Benjamin ; Chen, Hsinchun ( November 2021 , Proceedings of 2021 IEEE International Conference on Intelligence and Security Informatics (IEEE ISI 2021))

Hacker forums provide malicious actors with a large database of tutorials, goods, and assets to leverage for cyber-attacks. Careful research of these forums can provide tremendous benefit to the cybersecurity community through trend identification and exploit categorization. This study aims to provide a novel static word embedding, Hack2Vec, to improve performance on hacker forum classification tasks. Our proposed Hack2Vec model distills contextual representations from the seminal pre-trained language model BERT to a continuous bag-of-words model to create a highly targeted hacker forum static word embedding. The results of our experimental design indicate that Hack2Vec improves performance over prominent embeddings in accuracy, precision, recall, and F1-score for a benchmark hacker forum classification task.
Full Text Available
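
To make the idea of deriving a static embedding from contextual representations more concrete, the sketch below averages the contextual vectors observed for each token into a single vector per token. This is a simple stand-in and not the Hack2Vec procedure, which according to the abstract distills BERT representations into a continuous bag-of-words model. The input format (pairs of token and vector, for example taken from a contextual model's output layer) and the name `pool_static_embeddings` are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pool_static_embeddings(
    occurrences: Iterable[Tuple[str, List[float]]]
) -> Dict[str, List[float]]:
    """Average per-occurrence contextual vectors into one static vector per token.

    `occurrences` yields (token, vector) pairs; all vectors are assumed to
    have the same dimensionality.
    """
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for token, vec in occurrences:
        if token not in sums:
            sums[token] = list(vec)  # copy so the caller's data is not mutated
        else:
            acc = sums[token]
            for i, value in enumerate(vec):
                acc[i] += value
        counts[token] += 1
    # Divide each accumulated sum by the number of occurrences of that token.
    return {tok: [v / counts[tok] for v in total] for tok, total in sums.items()}
```

The resulting table can be looked up like any static word embedding, at the cost of collapsing the different senses a token takes in different contexts.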
A Multi-Disciplinary Perspective for Conducting Artificial Intelligence-enabled Privacy Analytics: Connecting Data, Algorithms, and Systems

https://doi.org/10.1145/3447507

Samtani, Sagar ; Kantarcioglu, Murat ; Chen, Hsinchun ( March 2021 , ACM Transactions on Management Information Systems)
Events such as the Facebook-Cambridge Analytica scandal and data aggregation efforts by technology providers have illustrated how vulnerable modern society is to privacy violations. Internationally recognized entities such as the National Science Foundation (NSF) have indicated that Artificial Intelligence (AI)-enabled models, artifacts, and systems can efficiently and effectively sift through large quantities of data from legal documents, social media, Dark Web sites, and other sources to curb privacy violations. Yet considerable efforts are still required for understanding prevailing data sources, systematically developing AI-enabled privacy analytics to tackle emerging challenges, and deploying systems to address critical privacy needs. To this end, we provide an overview of prevailing data sources that can support AI-enabled privacy analytics; a multi-disciplinary research framework that connects data, algorithms, and systems to tackle emerging AI-enabled privacy analytics challenges such as entity resolution, privacy assistance systems, privacy risk modeling, and more; a summary of selected funding sources to support high-impact privacy analytics research; and an overview of prevailing conference and journal venues that can be leveraged to share and archive privacy analytics research. We conclude this paper with an introduction to the papers included in this special issue.
Full Text Available
Binary Black-Box Attacks Against Static Malware Detectors with Reinforcement Learning in Discrete Action Spaces

https://doi.org/10.1109/SPW53761.2021.00021

Ebrahimi, Mohammadreza ; Pacheco, Jason ; Li, Weifeng ; Hu, James Lee ; Chen, Hsinchun ( May 2021 , IEEE S&P Workshop on Deep Learning and Security (DLS))
Full Text Available
Trailblazing the Artificial Intelligence for Cybersecurity Discipline: A Multi-Disciplinary Research Roadmap

https://doi.org/10.1145/3430360

Samtani, Sagar ; Kantarcioglu, Murat ; Chen, Hsinchun ( December 2020 , ACM Transactions on Management Information Systems)
Cybersecurity has rapidly emerged as a grand societal challenge of the 21st century. Innovative solutions to proactively tackle emerging cybersecurity challenges are essential to ensuring a safe and secure society. Artificial Intelligence (AI) has rapidly emerged as a viable approach for sifting through terabytes of heterogeneous cybersecurity data to execute fundamental cybersecurity tasks, such as asset prioritization, control allocation, vulnerability management, and threat detection, with unprecedented efficiency and effectiveness. Despite this initial promise, AI and cybersecurity have traditionally been siloed disciplines that relied on disparate knowledge and methodologies. Consequently, the AI for Cybersecurity discipline is in its nascency. In this article, we aim to provide an important step to progress the AI for Cybersecurity discipline. We first provide an overview of prevailing cybersecurity data, summarize extant AI for Cybersecurity application areas, and identify key limitations in the prevailing landscape. Based on these key issues, we offer a multi-disciplinary AI for Cybersecurity roadmap that centers on major themes such as cybersecurity applications and data, advanced AI methodologies for cybersecurity, and AI-enabled decision making. To help scholars and practitioners make significant headway in tackling these grand AI for Cybersecurity issues, we summarize promising funding mechanisms from the National Science Foundation (NSF) that can support long-term, systematic research programs. We conclude this article with an introduction to the articles included in this special issue.
Full Text Available
A Generative Adversarial Learning Framework for Breaking Text-Based CAPTCHA in the Dark Web

https://doi.org/10.1109/ISI49825.2020.9280537

Zhang, Ning ; Ebrahimi, Mohammadreza ; Li, Weifeng ; Chen, Hsinchun ( November 2020 , IEEE International Conference on Intelligence and Security Informatics (IEEE ISI 2020))
Full Text Available
Binary Black-box Evasion Attacks Against Deep Learning-based Static Malware Detectors with Adversarial Byte-Level Language Model

https://arxiv.org/abs/2012.07994

Ebrahimi, Mohammadreza ; Zhang, Ning ; Hu, James ; Raza, Muhammad Taqi ; Chen, Hsinchun ( January 2021 , AAAI Workshop on Robust, Secure and Efficient Machine Learning (RSEML))
Full Text Available
Proactively Identifying Emerging Hacker Threats from the Dark Web: A Diachronic Graph Embedding Framework (D-GEF)

https://doi.org/10.1145/3409289

Samtani, Sagar ; Zhu, Hongyi ; Chen, Hsinchun ( August 2020 , ACM Transactions on Privacy and Security)
Cybersecurity experts have appraised the total global cost of malicious hacking activities to be $450 billion annually. Cyber Threat Intelligence (CTI) has emerged as a viable approach to combat this societal issue. However, existing processes are criticized as inherently reactive to known threats. To combat these concerns, CTI experts have suggested proactively examining emerging threats in the vast, international online hacker community. In this study, we aim to develop proactive CTI capabilities by exploring online hacker forums to identify emerging threats in terms of popularity and tool functionality. To achieve these goals, we create a novel Diachronic Graph Embedding Framework (D-GEF). D-GEF operates on a Graph-of-Words (GoW) representation of hacker forum text to generate word embeddings in an unsupervised manner. Semantic displacement measures adopted from diachronic linguistics literature identify how terminology evolves. A series of benchmark experiments illustrate D-GEF's ability to generate higher-quality embeddings than state-of-the-art word embedding models (e.g., word2vec) in tasks pertaining to semantic analogy, clustering, and threat classification. D-GEF's practical utility is illustrated with in-depth case studies on web application and denial of service threats targeting PHP and Windows technologies, respectively. We also discuss the implications of the proposed framework for strategic, operational, and tactical CTI scenarios. All datasets and code are publicly released to facilitate scientific reproducibility and extensions of this work.
Full Text Available
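
Two ideas from the D-GEF abstract above can be illustrated in a few lines: a Graph-of-Words built from token co-occurrence, and a semantic displacement score between a word's embeddings in two time periods. The sketch below is a minimal illustration under stated assumptions and not the authors' implementation. It assumes a sliding-window co-occurrence graph and uses cosine distance as the displacement measure (a common choice in diachronic linguistics; the abstract does not name the specific measures). It also assumes the two embeddings already live in a shared vector space. The function names and the `window` parameter are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def graph_of_words(tokens: Sequence[str], window: int = 3) -> Dict[Tuple[str, str], int]:
    """Count undirected co-occurrence edges between tokens in a sliding window.

    Each token is linked to the next `window - 1` tokens. Edge keys are
    alphabetically sorted pairs so (a, b) and (b, a) share one count.
    """
    edges: Counter = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + window]:
            if left != right:  # skip self-loops
                edges[tuple(sorted((left, right)))] += 1
    return dict(edges)


def cosine_displacement(earlier: List[float], later: List[float]) -> float:
    """Return 1 - cosine similarity between two embeddings of the same word.

    0 means the direction is unchanged; larger values mean more drift.
    """
    dot = sum(a * b for a, b in zip(earlier, later))
    norm_earlier = math.sqrt(sum(a * a for a in earlier))
    norm_later = math.sqrt(sum(b * b for b in later))
    if norm_earlier == 0 or norm_later == 0:
        raise ValueError("displacement is undefined for a zero vector")
    return 1.0 - dot / (norm_earlier * norm_later)
```

Ranking terms by displacement across consecutive time slices is one way to surface vocabulary whose usage is shifting, which is the kind of signal the abstract associates with emerging threats.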