<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>JSMBox—A Runtime Monitoring Framework for Analyzing and Classifying Malicious JavaScript</title></titleStmt>
			<publicationStmt>
				<publisher>Springer Nature Switzerland</publisher>
				<date>10/19/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10582492</idno>
					<idno type="doi">10.1007/978-3-031-75201-8_8</idno>
					
					<author>Phu H Phung</author><author>Allen Varghese</author><author>Bojue Wang</author><author>Yu Zhao</author><author>Chong Yu</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[In recent years, there has been a notable increase in the prevalence of malicious websites, leading to a majority of cyber-attacks and data breaches. Malicious websites often incorporate JavaScript code to execute attacks on web browsers. Despite existing methodologies documented in the literature, the analysis and detection of malicious JavaScript pose significant challenges due to the dynamic nature of JavaScript and the use of advanced evasion techniques. These challenges motivate the need for an innovative and efficient approach to comprehensively analyze the code to identify its malicious intent. In this paper, we introduce a monitoring approach for analyzing JavaScript code, which can capture all of the code’s features at runtime. Our method leverages the security reference monitor technique to mediate JavaScript security-sensitive executions, including function calls and property accesses. Therefore, the proposed method can capture behaviors at runtime regardless of how the code is written, even with recent advanced evasion techniques like WebAssembly diversification. We have implemented our approach as a JavaScript dynamic analysis framework called JSMBox in a Chromium-based browser extension. Our experiments demonstrated that JSMBox is capable of effectively countering sophisticated evasion techniques found in modern malicious JavaScript code, including WebAssembly diversification. We have also evaluated the framework’s ability to classify malicious behaviors based on a large-scale raw dataset comprising about 20,000 malicious and benign webpages. Our developed tool automatically launches the browser to execute these webpages, records JavaScript code execution events, and captures their execution frequency as extracted features. 
We have tested the extracted dataset with various machine-learning models, yielding promising experimental results that confirm the effectiveness of our approach and achieve a high accuracy rate.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The ubiquity of JavaScript in web development, as highlighted by W3Techs<ref type="foot">foot_0</ref>, presents both opportunities and risks. While JavaScript enhances user interaction and web functionality, it has also become a vector for cyberattacks, particularly through malicious code on webpages. Indeed, malicious webpages with JavaScript code that launch attacks on web browsers have become increasingly problematic in recent years, carrying out threats against the user's browser, such as stealing the user's credentials or downloading additional malware. Unfortunately, the dynamic nature of the JavaScript language and its tight integration with the browser make it challenging to detect and block malicious JavaScript code. JavaScript-based attacks on webpages are a recent trend and rank among the top Internet security threats <ref type="bibr">[1]</ref>; they can defeat the traditional signature-based approaches used in anti-virus tools <ref type="bibr">[2]</ref>.</p><p>Analyzing and detecting malicious JavaScript has received considerable attention and remains an active research direction <ref type="bibr">[3]</ref>, with existing work employing static analysis, dynamic analysis, or a combination of the two <ref type="bibr">[4]</ref>. Static analysis is a traditional approach that typically extracts the semantic structure of the source code, abstract syntax tree, strings, objects, and functions to provide features for detection or machine learning algorithms. However, conventional static analysis methods typically fail to deal with dynamically generated code and evasion techniques used by attackers to hide malicious code <ref type="bibr">[5]</ref>. On the other hand, dynamic analysis techniques execute JavaScript code; therefore, they can capture dynamically generated code and runtime behaviors that static analysis methods might miss. 
Although JavaScript dynamic analysis approaches offer advantages in behavioral analysis and runtime features, their realizations suffer from shortcomings <ref type="bibr">[6]</ref>. For example, several methods <ref type="bibr">[7,</ref><ref type="bibr">8]</ref> leverage platform-specific tools such as Windows-based in-browser debuggers that are not always available on other systems. Cova et al. <ref type="bibr">[2]</ref> extract dynamic features from execution traces using the HtmlUnit with Rhino engine simulation environment, which attackers can bypass by leveraging the differences between the emulated environment and a real browser. Recent malicious JavaScript code employs advanced evasion techniques to detect and subvert dynamic analysis methods <ref type="bibr">[6]</ref>. Notably, none of the existing work can tackle evasion techniques using WebAssembly <ref type="bibr">[9,</ref><ref type="bibr">10]</ref>.</p><p>The challenges mentioned above highlight the need for a robust analysis method that can capture dynamic behaviors in potentially malicious JavaScript code, especially in the presence of advanced evasion techniques. To this end, we propose a novel JavaScript runtime analysis method and framework encompassing all JavaScript executions, including code generated on the fly and code that uses advanced evasion techniques. Our approach mediates JavaScript's security-sensitive operations, including function calls and property accesses at runtime, by leveraging the traditional security reference monitor technique <ref type="bibr">[11]</ref>. Since we monitor the code execution, our method can capture the code behaviors regardless of the code's structure or evasion techniques. 
Specifically, the contributions of our work are as follows:</p><p>-We introduce a novel runtime analysis method and framework by leveraging the inlined security reference monitor technique to execute JavaScript code in webpages to capture its behaviors for maliciousness classification and detection. Our framework is designed to allow customization and fine-tuning of the feature extraction process, providing the most important features for machine learning models to improve their accuracy and reliability.</p><p>-We have developed the proposed method as a JavaScript library, utilizing the language's flexibility and platform independence to create a lightweight runtime monitor. This allows us to efficiently capture all executions and their contexts, regardless of how the code appears. We have implemented the framework as a browser extension, meeting the essential requirements for security reference monitors and preventing evasive code from concealing its behaviors.</p><p>-We have demonstrated that our framework is highly proficient in extracting runtime features that are crucial for machine learning models to accurately classify malicious JavaScript on large-scale raw datasets. As detailed in Sect. 4, our proposed method offers a more effective feature extraction solution than traditional static analysis techniques and advances beyond recent dynamic or hybrid analysis approaches in dealing with malicious code that employs sophisticated evasion tactics, including the latest WebAssembly obfuscation and diversification.</p><p>The remainder of this paper is structured as follows. In Sect. 2, we provide an overview of the background, review the literature, and discuss related work. Sect. 3 then presents a running example that motivates our work and describes our approach to developing and implementing the framework. In Sect. 4, we outline the evaluation of our proposed framework in comparison with closely related work. 
Finally, we conclude our contributions and outline potential future work in Sect. 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Background and Literature Review</head><p>This section briefly describes the background of JavaScript and its analysis methods. We also discuss challenges in detecting malicious JavaScript code with evasion techniques and provide examples. Finally, we review the literature and compare related work with our JSMBox framework.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">JavaScript and Malicious Webpages</head><p>JavaScript is one of the most popular and versatile scripting languages, primarily used for web development, where it enables interactive and dynamic elements on webpages. When a browser renders a webpage, it executes embedded JavaScript code, whether inlined, sourced from the same host, or retrieved externally. This code can access and alter the webpage's content and data stored in the browser. Furthermore, JavaScript can dynamically generate and execute new code, as well as load and run external scripts in real time. These dynamic features can be leveraged by both developers and attackers <ref type="bibr">[12]</ref>. By inserting harmful JavaScript code, attackers can craft webpages to exploit vulnerabilities in users' web browsers. These pages can contain various types of malicious content, such as malware, phishing forms, or other forms of harmful information. Despite existing detection tools, JavaScript-based attacks on webpages remain a prominent Internet security concern <ref type="bibr">[1]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">JavaScript Analysis Methods</head><p>Existing works propose solutions from several approaches, including static analysis, dynamic analysis, and hybrid analysis <ref type="bibr">[4]</ref> to analyze and detect malicious JavaScript code. Specifically, static analysis is a traditional approach that typically extracts the semantic structure of the code to provide features for detection. Unlike static analysis, which analyzes code without executing it, dynamic analysis runs the code and observes its interactions with the environment in real-time <ref type="bibr">[13]</ref>. By executing code, dynamic analysis can discern malicious activities that static analysis might overlook <ref type="bibr">[13]</ref>. Furthermore, some existing works use a hybrid approach, combining static and dynamic analysis techniques. These works typically utilize static analysis to help identify known patterns and vulnerabilities before execution while using dynamic analysis to provide real-time insights into the actual behavior of the code during runtime execution <ref type="bibr">[14]</ref>. We discuss these methods in detail in the related work sub-section (Sect. 2.4).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Evasion Techniques</head><p>Evasion techniques of malicious JavaScript code are a critical aspect of contemporary cyber threats, wherein attackers employ sophisticated strategies to evade detection mechanisms and execute malicious actions within web environments. These techniques circumvent traditional security measures, including antivirus software, intrusion detection systems, and web application firewalls, posing significant challenges to cybersecurity professionals and researchers <ref type="bibr">[15]</ref>. Obfuscation is one of the commonly used evasion techniques. This technique complicates the readability and analysis of code by altering its structure and appearance to obscure its intended functionality. Techniques such as variable obfuscation, string obfuscation, property encryption, control flow flattening, dead code injection, debugging protection, self-defending, and polymorphic mutation are often utilized to impede code analysis <ref type="bibr">[5,</ref><ref type="bibr">16,</ref><ref type="bibr">17]</ref>. Research indicates that 71% of examined malicious samples employ obfuscation techniques <ref type="bibr">[18]</ref>. We describe common JavaScript evasion techniques below.</p><p>Obfuscation Techniques JavaScript obfuscation is a technique designed to make JavaScript code difficult to understand and analyze. This mechanism enhances the protection of the code and makes it more challenging to reverse engineer or replicate. The primary purpose of obfuscation is to conceal the true intent and structure of the code without altering its functionality. Various obfuscation techniques are available for different aspects of JavaScript code. 
Below, we list common obfuscation methods identified in the literature, together with their code snippets to illustrate their techniques.</p><p>-Variable and Function Renaming: Changing the names of variables and functions to make the code more challenging to understand and analyze <ref type="bibr">[15]</ref>.</p><p>-Code Compression (Minification): Compression is a technique used to reduce the size of data or code by encoding information in a more compact format. In the context of software development and obfuscation, code compression involves removing unnecessary characters, spaces, and lines from the source code to make it more concise <ref type="bibr">[19]</ref>.</p><p>-Code Transformation: Altering the structure and form of the code, such as changing the form of conditional statements or using ternary operators <ref type="bibr">[20]</ref>.</p><p>-Dead code injection: Dead code injection is a technique used to inject unused or non-executing code into a program. This technique can be employed as a form of obfuscation to make the code more complex and difficult to analyze. Injecting dead code does not affect the program's functionality but can confuse and deter reverse engineers, making it more challenging for them to understand the program's logic and structure <ref type="bibr">[21]</ref>.</p><p>-String Encoding: Converting string literals into other forms, like using Unicode encoding or Base64 encoding <ref type="bibr">[22]</ref>.</p><p>-Indirect method call: Indirect method call is a programming technique that allows the determination of which method or function to call dynamically at runtime. This is typically achieved using function pointers, callback functions, or function objects. 
While indirect method calls enhance code flexibility, they may also increase code complexity and difficulty of understanding <ref type="bibr">[23]</ref>.</p><p>-Instruction substitution: Instruction substitution is an obfuscation technique that involves replacing original instructions in a program with equivalent instructions that have a different structure or form, thereby increasing the complexity and difficulty of understanding the code while maintaining its functionality <ref type="bibr">[24]</ref>.</p><p>-Non-alphanumeric code: Non-alphanumeric code is an obfuscation technique primarily used to replace alphabetic and numeric characters in code with non-alphanumeric characters, such as symbols and special characters, to increase the complexity and difficulty of understanding the code <ref type="bibr">[25]</ref>.</p><p>-String splitting: This method involves separating a string or function name into multiple smaller fragments and then reassembling them at runtime <ref type="bibr">[5]</ref>.</p></div>
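<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make several of the techniques above concrete, the following minimal sketch shows different surface forms of the same call. It is an illustration under stated assumptions, not a sample taken from real malware: the object target and its method getSecret are hypothetical stand-ins for a sensitive browser object, and the global atob function is assumed to be available (as it is in browsers and in recent Node.js releases). Every variant reaches the same method at runtime, which is why a monitor placed at the call site observes identical behavior regardless of the source text.</p>
<p><code lang="javascript"><![CDATA[
```javascript
// Hypothetical stand-in for a sensitive browser object (e.g. document).
const target = {
  getSecret() {
    return "secret-value";
  },
};

// Each entry reaches the same method; only the surface syntax differs.
const variants = {
  // Baseline: the method name appears literally in the source.
  plain: () => target.getSecret(),
  // String splitting: the name is reassembled from fragments at runtime.
  stringSplitting: () => target["get" + "Sec" + "ret"](),
  // String encoding (Base64): "Z2V0U2VjcmV0" decodes to "getSecret".
  base64: () => target[atob("Z2V0U2VjcmV0")](),
  // String encoding (character codes spelling "getSecret").
  charCodes: () =>
    target[String.fromCharCode(103, 101, 116, 83, 101, 99, 114, 101, 116)](),
  // Indirect method call through an alias resolved at runtime.
  indirectCall: () => {
    const alias = target.getSecret;
    return alias.call(target);
  },
  // Dead code injection: the first branch can never execute.
  deadCode: () => {
    if (1 > 2) {
      return "never";
    }
    return target.getSecret();
  },
};

// Run every variant and collect the value each one returns.
const results = Object.fromEntries(
  Object.entries(variants).map(([name, run]) => [name, run()])
);
console.log(results);
```
]]></code></p>
<p>A text-based scanner looking for the literal name getSecret would match only the baseline and dead-code variants, whereas all six variants return the same value when executed.</p></div>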
<div xmlns="http://www.tei-c.org/ns/1.0"><head>WebAssembly obfuscation and diversification</head><p>WebAssembly (Wasm) is a binary instruction format designed to be executed in a web browser. It aims to provide a portable, high-performance target for web applications that leverage existing codebases and libraries written in common programming languages other than JavaScript.</p><p>With that design, WebAssembly has quickly become an essential part of the Web, providing a great alternative to JavaScript <ref type="bibr">[26]</ref>. On the other hand, WebAssembly has also been used as a sophisticated evasion technique to conceal malicious code in webpages and evade code analysis and detection techniques. Wobfuscator <ref type="bibr">[9]</ref> is a recent research tool demonstrating a WebAssembly obfuscation technique that transforms parts of the JavaScript computation into WebAssembly and evades JavaScript malware detection tools. In <ref type="bibr">[10]</ref>, the authors developed an automatic binary WebAssembly diversification technique that evades detection in most cases by popular detectors such as VirusTotal and MINOS. These findings motivate new approaches that can cope with modern Web technologies.</p><p>Browser Fingerprinting Browser fingerprinting is a technology that creates a unique identifier (fingerprint) by collecting various attributes and behaviors of the client's web browser, allowing for user identification and tracking. These attributes may include the browser's user agent string, operating system, screen resolution, and installed plugins. 
Browser fingerprinting is commonly used for purposes such as user tracking, personalized advertising, and security verification <ref type="bibr">[27]</ref><ref type="bibr">[28]</ref><ref type="bibr">[29]</ref>.</p><p>Attackers can utilize this technology to examine specific attributes or configurations of the client's web browser to determine if they meet the conditions for an attack. For example, attackers may inspect the browser's user agent string or other characteristics to determine if it is the target browser and then execute malicious code or attacks accordingly. This type of inspection may be conducted to ensure the success of an attack or to tailor different attack strategies based on the targeted browser.</p><p>Browser fingerprinting or detection helps attackers ensure that their exploit is only triggered on the intended target browsers. This technology can be used to detect not only the browser's version but also client-side content; it also possesses strong anti-detection capabilities, making it immune to analysis methods <ref type="bibr">[30]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Related Work</head><p>Methods in the literature for analyzing webpages and JavaScript code to classify and detect malicious JavaScript fall into three categories: static analysis, dynamic analysis, and a hybrid combination of static and dynamic analysis <ref type="bibr">[4]</ref>. In this section, we briefly discuss the static analysis approaches and review the dynamic approaches in more detail, comparing them with our approach.</p><p>Static Analysis Traditional methods of static analysis are engineered to identify malicious JavaScript code without executing the source code. These methods extract the features of malicious code to build a malicious feature library. Subsequently, they evaluate the code under inspection to determine if it matches the features within this malicious feature library. More recent approaches employ machine learning and deep learning to improve the detection rate. These works typically transform the code under inspection into vectors using various methods, such as fixed-length vector representation, abstract syntax tree (AST), Control Flow Graph, and Program Dependency Graph <ref type="bibr">[31]</ref><ref type="bibr">[32]</ref><ref type="bibr">[33]</ref><ref type="bibr">[34]</ref><ref type="bibr">[35]</ref>. Based on these representations, detection models are built using machine learning classifier algorithms, including Random Forest, Naive Bayes, and Support Vector Machine (SVM). For example, ZOZZLE <ref type="bibr">[36]</ref> generates features based on the hierarchical structure of the JavaScript AST and employs rapid pattern matching and a Naive Bayes classifier for detection. JStap <ref type="bibr">[33]</ref> is a static malicious JavaScript detector that uses lexical analysis, AST, control flow, and data flow information, utilizing a Random Forest classifier. Ren et al. 
<ref type="bibr">[15]</ref> study the effects of obfuscation on existing malicious JavaScript detectors, employing a range of classifiers. However, conventional static analysis methods typically fail to deal with dynamically generated code and evasion techniques (e.g., code obfuscation) used by attackers to conceal malicious code <ref type="bibr">[5]</ref>. In a recent study <ref type="bibr">[15]</ref> of representative static analysis-based approaches for detecting obfuscated code, the authors find "the feature spaces of existing detectors can only reflect shallow differences in code, not about the nature of benign and malicious, which can be easily affected by the differences brought by obfuscation." In other words, state-of-the-art static analysis-based approaches are still unable to accurately detect malicious code that employs evasion techniques.</p><p>Dynamic Analysis Dynamic analysis-based methods involve executing the program to uncover specific behaviors, even when the program is obfuscated, as indicated by Kim et al. <ref type="bibr">[30]</ref>. Researchers employ these approaches to extract behavioral features during the execution of code for the classification of malicious code within test environments, including sandboxes <ref type="bibr">[2,</ref><ref type="bibr">7,</ref><ref type="bibr">37,</ref><ref type="bibr">38]</ref>, honeypots <ref type="bibr">[39,</ref><ref type="bibr">40]</ref>, and browsers <ref type="bibr">[6]</ref>. Snyder et al. <ref type="bibr">[41]</ref> investigated the usage patterns of JavaScript features in modern web browsers, revealing that most features are rarely used and are often blocked by ad and tracking blockers. Based on how third-party trackers manipulate browser state, Roesner et al. <ref type="bibr">[42]</ref> developed an in-band client-side method for detecting and classifying five kinds of third-party trackers. Yagemann et al. 
<ref type="bibr">[43]</ref> designed an offline control flow analysis method for attack detection using deep learning on hardware execution traces to model a program's behavior and detect control flow anomalies. In addition, Ratanaworabhan et al. <ref type="bibr">[44]</ref> introduced a runtime heap-spraying detector that examines individual objects in the heap, interpreting them as code and performing a static analysis to detect malicious intent.</p><p>However, similar to static analysis, one limitation of these methods is their inefficiency in detecting evasion techniques.</p><p>Many studies <ref type="bibr">[4,</ref><ref type="bibr">[45]</ref><ref type="bibr">[46]</ref><ref type="bibr">[47]</ref><ref type="bibr">[48]</ref><ref type="bibr">[49]</ref> have focused on addressing obfuscated code to overcome this limitation. Li et al. <ref type="bibr">[45]</ref> proposed a forensic engine that can efficiently record fine-grained details on the execution of JavaScript programs within the browser. Additionally, Fang et al. <ref type="bibr">[46]</ref> proposed a malicious JavaScript detection model based on LSTM that extracts features from the semantic level of bytecode and optimizes the word vector. Furthermore, Song et al. <ref type="bibr">[4]</ref> constructed the Program Dependency Graph to generate semantic slices. Based on this, they designed a malicious JavaScript detection model utilizing bidirectional LSTM. Neasbitt et al. <ref type="bibr">[47]</ref> presented an online forensic data collection system that records enough detailed information to enable a full reconstruction of web security incidents, including phishing attacks. Moreover, Wang et al. <ref type="bibr">[49]</ref> designed a deep learning framework that integrates sparse random projection, a deep learning model, and logistic regression to detect malicious JavaScript code. Rieck et al. 
<ref type="bibr">[48]</ref> inspected web pages to block the delivery of malicious JavaScript code and collected static and dynamic code features for malicious pattern analysis. Besides, Jueckstock et al. <ref type="bibr">[6]</ref> proposed a dynamic analysis framework hosted inside V8, the JavaScript engine of the Chrome browser, that logs native function or property accesses during any JavaScript execution to monitor browser behaviors. Compared with the other methods, this one proves significantly more efficient in detecting evasion techniques, as it can deal with both obfuscated code and browser fingerprinting. However, none of the existing work can address all evasion techniques discussed previously.</p><p>In contrast to the aforementioned research, our proposed method addresses the challenges of analyzing malicious JavaScript arising from dynamic JavaScript features and all advanced evasion techniques by capturing the behaviors of both static and dynamically generated code. We present our technical approach and discuss how it can tackle the challenges in the next section.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Technical Approach and Implementation</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">A Motivating Example</head><p>To present our approach, we consider the JavaScript snippet in Listing 3.1, which illustrates the key concepts underlying our proposed approach. The example highlights the code obfuscation of the HTMLCanvasElement.prototype.toDataURL method, typically used in malicious code that exploits fingerprinting attacks to identify and track users <ref type="bibr">[50]</ref>. We note that actual malicious obfuscated JavaScript code is substantially more sophisticated.</p><p>Since static analysis-based techniques do not execute the code, they encounter challenges in recognizing these obfuscated scripts <ref type="bibr">[5]</ref>. This limitation stems from the inadequacies of current machine-learning-based static analysis techniques to accurately identify malicious code that employs evasion strategies, as highlighted in recent empirical studies <ref type="bibr">[15]</ref>. As a result, the obfuscated JavaScript is executed, activating the HTMLCanvasElement.prototype.toDataURL method, which malicious actors can manipulate. To address the challenges in detecting the malicious intent of obfuscated code, various dynamic analysis strategies <ref type="bibr">[2,</ref><ref type="bibr">6,</ref><ref type="bibr">7,</ref><ref type="bibr">[37]</ref><ref type="bibr">[38]</ref><ref type="bibr">[39]</ref><ref type="bibr">[40]</ref> have been proposed. These strategies aim to monitor the runtime behavior of JavaScript code because obfuscated code cannot disguise its activities during execution. However, a notable gap in existing research is the lack of focus on monitoring the potentially malicious use of the HTMLCanvasElement.prototype.toDataURL method and on applying machine learning techniques to determine its maliciousness <ref type="bibr">[50,</ref><ref type="bibr">51]</ref>.</p></div>
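<div xmlns="http://www.tei-c.org/ns/1.0"><p>The contrast between a text-based check and runtime observation can be sketched as follows. This is an illustrative stand-in and not the listing referred to above: FakeCanvas is a hypothetical replacement for HTMLCanvasElement so that the sketch runs outside a browser, the obfuscated script is a deliberately simple string-splitting example, and the static check is a naive substring search rather than any published detector.</p>
<p><code lang="javascript"><![CDATA[
```javascript
// Hypothetical stand-in for HTMLCanvasElement so the sketch runs outside a browser.
class FakeCanvas {
  toDataURL() {
    return "data:image/png;base64,AAAA";
  }
}

// Obfuscated script text: the method name never appears as one literal.
const obfuscatedSource =
  'var k = ["to", "Data", "URL"].join("");' +
  "var c = new FakeCanvas();" +
  "return c[k]();";

// A naive static check that searches the source text for the API name.
const staticHit = obfuscatedSource.includes("toDataURL");

// Runtime monitor: wrap the method before the script runs and count calls.
let toDataURLCalls = 0;
const originalToDataURL = FakeCanvas.prototype.toDataURL;
FakeCanvas.prototype.toDataURL = function (...args) {
  toDataURLCalls += 1;
  return Reflect.apply(originalToDataURL, this, args);
};

// Execute the obfuscated script with FakeCanvas in scope.
const scriptResult = new Function("FakeCanvas", obfuscatedSource)(FakeCanvas);

console.log({ staticHit, toDataURLCalls, scriptResult });
```
]]></code></p>
<p>In this sketch the substring search does not find the method name, while the wrapper records the call once the script executes and the script still receives the original return value.</p></div>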
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Approach and the Overview of the Proposed Framework</head><p>The running example discussed above is one of many challenges in JavaScript code analysis that motivate our work. To address these challenges, we leverage the runtime monitoring approach that executes JavaScript code to log its behaviors, regardless of the code's appearance or evasion techniques. Specifically, we propose JSMBox, a dynamic analysis JavaScript framework that adopts a behavioral sandbox approach <ref type="bibr">[52]</ref>. Our proposed approach aims to monitor and record JavaScript code execution by intercepting its operations, such as property access and method calls, within the JavaScript execution environment. This method enables JSMBox to conduct real-time, dynamic analysis of the code, extracting its execution trace for data engineering and machine learning models. Our primary objective is to develop a robust and effective technique for analyzing malware capable of circumventing the sophisticated evasion methods employed by modern malicious webpages. In pursuit of this goal, we employ the traditional security reference monitor technique <ref type="bibr">[11]</ref> to oversee code execution, implemented exclusively in JavaScript, thus providing a more dependable and holistic solution that addresses the limitations of existing static and dynamic analysis techniques. Furthermore, our approach is lightweight and platform-independent, allowing for flexible deployment and feature customization. To the best of our knowledge, no prior research has utilized the reference monitoring method in JavaScript code for dynamic malware analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 1. Approach Overview</head><p>Figure <ref type="figure">1</ref> depicts the overview of our proposed framework JSMBox. Within this framework, we incorporate a reference monitor that runs before the browser loads and executes any other JavaScript code on a given webpage. This mechanism ensures the monitor maintains a unique and original reference to the intercepted JavaScript events, i.e., function calls or property accesses, implemented in the browser. This approach effectively preserves the original functionality of the webpage while mitigating potential detection techniques employed in evasive malicious JavaScript code <ref type="bibr">[53]</ref>. 
The monitor uses configuration data, which defines the intercepted JavaScript events and the properties of the extracted features, to record and retain the code execution details, such as the frequency of event execution, in a log file. This log file is then employed as input for a machine-learning algorithm to classify the maliciousness of the code. We discuss key components of our behavioral sandbox approach below.</p></div>
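<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of how logged events could become execution-frequency features, the sketch below counts occurrences of each monitored event and emits them in a fixed order. The event names, the flat array used as a log, and the function toFeatureVector are all assumptions made for this example; the actual log format and feature set of the framework are not specified here.</p>
<p><code lang="javascript"><![CDATA[
```javascript
// Hypothetical list of monitored events; its order fixes the feature order.
const featureNames = [
  "document.cookie:read",
  "document.write",
  "eval",
  "HTMLCanvasElement.prototype.toDataURL",
];

// Turn a sequence of logged event names into a frequency vector.
function toFeatureVector(eventLog, names) {
  const counts = new Map(names.map((name) => [name, 0]));
  for (const event of eventLog) {
    // Events outside the configured list are ignored.
    if (counts.has(event)) {
      counts.set(event, counts.get(event) + 1);
    }
  }
  return names.map((name) => counts.get(name));
}

// Hypothetical log recorded while one page was loading.
const eventLog = [
  "document.cookie:read",
  "eval",
  "document.cookie:read",
  "document.write",
  "eval",
  "eval",
];

const vector = toFeatureVector(eventLog, featureNames);
console.log(vector);
```
]]></code></p>
<p>Keeping the feature order tied to the configuration means vectors from different pages are directly comparable, with zero for any monitored event that a page never triggers.</p></div>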
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">The Monitor Initialization and Protection</head><p>We developed the monitor using pure JavaScript as a library within an anonymous function to preserve local references to all original built-in methods that will be utilized later in the monitoring process, along with other behavioral events to be monitored. By encapsulating these references within an anonymous function, external code cannot access them since local variables are protected within an anonymous function. As the library is the first code to be executed in the browser, we have the advantage of safeguarding against potential malicious attempts to subvert these built-in methods or monitored functions. This mechanism empowers the monitor to regulate all subsequent JavaScript code execution, ensuring complete mediation. Moreover, we can define policies to detect and prevent malicious code that attempts to bypass the monitoring at runtime. These mechanisms make our approach tamper-proof <ref type="bibr">[54]</ref> and shielded from evasive detection methods <ref type="bibr">[53]</ref>. In addition, they allow us to adapt event monitoring and policies to tackle potential new evasion techniques over time.</p><p>To make our framework more flexible and customizable, we can define JavaScript execution events, such as function calls and property accesses, as well as behaviors like the call frequency or sequence, in a configuration file. The monitor will then load this file and create wrapper functions based on the information provided. We will demonstrate the initialization steps using pseudo-code in Listing 3.2.</p></div>
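<div xmlns="http://www.tei-c.org/ns/1.0"><p>The initialization idea described in this subsection can be sketched as follows. This is a minimal sketch under stated assumptions rather than the framework's own code: the page object is a hypothetical stand-in for the browser's global object, the configuration format (a list of dotted method paths) is invented for the example, and only call events are recorded.</p>
<p><code lang="javascript"><![CDATA[
```javascript
// Hypothetical stand-in for the page's global object (a browser would use window).
const page = {
  document: {
    write(text) {
      return "wrote:" + text;
    },
  },
  decode(text) {
    return text.split("").reverse().join("");
  },
};

// Hypothetical configuration: dotted paths of the methods to monitor.
const config = { methods: ["document.write", "decode"] };

// The monitor lives in an anonymous function; only a log reader escapes.
const readLog = (function (root, cfg) {
  // Local references taken before any page code runs.
  const apply = Reflect.apply;
  const records = [];

  for (const path of cfg.methods) {
    const parts = path.split(".");
    const name = parts.pop();
    let owner = root;
    for (const part of parts) {
      owner = owner[part];
    }
    const original = owner[name];
    // Replace the method with a wrapper that logs, then calls the original.
    owner[name] = function (...args) {
      // Indexed write avoids depending on a replaceable Array.prototype.push.
      records[records.length] = path;
      return apply(original, this, args);
    };
  }
  // Hand out copies so callers cannot edit the recorded events.
  return () => records.slice();
})(page, config);

// Code running afterwards sees the same API but every call is recorded.
const written = page.document.write("hi");
const decoded = page.decode("cba");
console.log(written, decoded, readLog());
```
]]></code></p>
<p>Because records and original are local to the anonymous function, later code can call the wrapped methods but has no name through which to reach the saved originals or the log itself.</p></div>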
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Intercepting Execution Events</head><p>We intercept JavaScript functions and properties of global objects, such as document and window, to monitor their invocations and accesses, i.e., execution events. For every method call specified in the configuration file, we start by preserving the original reference and its aliases. This mechanism ensures that the monitor captures any existing prototype inheritance chain of the reference to prevent possible attacks in malicious code <ref type="bibr">[54]</ref>. Semi-pseudo code in Listing 3.3 illustrates this interception process. For property accesses, e.g., document.cookie, we leverage the standard Object.defineProperty(..) API and define handler functions that are invoked whenever the property is accessed, i.e., read or written.</p></div>
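<div xmlns="http://www.tei-c.org/ns/1.0"><p>The property interception described above can be illustrated with a short sketch. It is not the paper's Listing 3.3: monitorProperty, the event labels, and fakeDocument are hypothetical, and the cookie accessor is simulated on a prototype object so that the example runs without a browser.</p><ab><![CDATA[
```javascript
// Hypothetical helper; monitorProperty and the ':read'/':write' labels are
// illustrative names, not part of JSMBox.
function monitorProperty(owner, prop, onEvent) {
  // Find the existing descriptor on the object or along its prototype chain.
  let holder = owner;
  let desc;
  while (holder && !(desc = Object.getOwnPropertyDescriptor(holder, prop))) {
    holder = Object.getPrototypeOf(holder);
  }
  const isAccessor = Boolean(desc && (desc.get || desc.set));
  let stored = desc && !isAccessor ? desc.value : undefined;

  Object.defineProperty(owner, prop, {
    configurable: false, // later scripts cannot redefine the monitored property
    enumerable: desc ? desc.enumerable : true,
    get() {
      onEvent(prop + ':read');
      if (!isAccessor) return stored;
      return desc.get ? desc.get.call(this) : undefined;
    },
    set(value) {
      onEvent(prop + ':write');
      if (!isAccessor) { stored = value; return; }
      if (desc.set) desc.set.call(this, value);
    },
  });
}

// Simulated document whose cookie accessor is inherited from a prototype.
const cookieJar = [];
const fakeDocument = Object.create({
  get cookie() { return cookieJar.join('; '); },
  set cookie(value) { cookieJar.push(value); },
});
const propertyEvents = [];
monitorProperty(fakeDocument, 'cookie', (event) => propertyEvents.push(event));
fakeDocument.cookie = 'sid=1';
const cookieSeen = fakeDocument.cookie;
```
]]></ab><p>The handlers log the access and then delegate to the preserved getter or setter, so the page keeps working while every read and write is observed.</p></div>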
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5">Implementation</head><p>Developed as a pure JavaScript library, our JSMBox framework can be injected into a website in multiple ways to monitor the JavaScript code execution on the site. For example, we can inject the library as the first script to be executed in a webpage using webpage instrumentation, a web proxy, or a browser extension/add-on. As a framework for JavaScript analysis, we implement JSMBox as a browser extension so that we can effectively collect the logged data and automate the browser with our extension on a large-scale dataset. A browser extension or add-on is additional code that can be loaded into a browser to modify and enhance its capability. Major browsers, including Chromium-based browsers (such as Google Chrome, Brave, Microsoft Edge, Opera, and Vivaldi) and Firefox, support browser extensions or add-ons <ref type="bibr">[55]</ref>. Since our main code is written in JavaScript, it should be deployable in any browser supporting extensions/add-ons. We implement and test our code in the Chromium browser, whose codebase is widely used by many other browsers. As noted in <ref type="bibr">[55]</ref>, Chromium-based browser extensions can run in Firefox with just a few changes. To ensure that our code is executed before the browser loads a webpage and executes its JavaScript code, we place our code in the background script of the browser extension. As discussed in Sect. 3.2, we have confirmed this mechanism by performing experimental tests.</p><p>Before loading a webpage upon request, the browser executes our code, which intercepts the defined methods and properties. When a webpage is loaded in the browser, any JavaScript event that triggers these methods and properties invokes our code, which logs the event and then invokes the original reference. This mechanism ensures that our code monitors the behaviors of scripts at runtime, right from the moment a page begins to load. 
Since we monitor the code at runtime, potential performance overhead exists. While we have not studied the overhead in this work, prior work demonstrated that the JavaScript inlined reference monitor approach incurs lightweight performance overhead <ref type="bibr">[52,</ref><ref type="bibr">56,</ref><ref type="bibr">57]</ref>.</p><p>Although hundreds of commonly used JavaScript method calls exist, not all are susceptible to misuse by malicious JavaScript <ref type="bibr">[58,</ref><ref type="bibr">59]</ref>. Monitoring an excessively broad range of events can introduce noise and increase system overhead. Noise in the extracted features reduces the accuracy of machine learning-based detection. In our current implementation of JSMBox, we have curated a selection of the most critical events, comprising 59 method calls and property accesses. Benign JavaScript behaviors are selected from those most commonly used by websites to remain dynamic and functional. The malicious ones are selected based on their potential to be misused in web-based attacks, e.g., eval for executing unauthorized code or scripts, window.open for unwanted pop-up ads, and navigator.sendBeacon for unauthorized tracking. For instance, the charCodeAt(..) method of String is considered susceptible to misuse as it can be employed to encode data or generate obfuscated code that is difficult to decipher, thereby facilitating evasion techniques. These methods were identified through analysis of human-labeled malicious JavaScript code, in which they are used in conjunction with each other <ref type="bibr">[60]</ref><ref type="bibr">[61]</ref><ref type="bibr">[62]</ref><ref type="bibr">[63]</ref><ref type="bibr">[64]</ref><ref type="bibr">[65]</ref><ref type="bibr">[66]</ref>. 
Table <ref type="table">1</ref> lists 12 representative misused function calls implemented in JSMBox, with their descriptions.</p><p>Our current JSMBox prototype monitors each event execution, i.e., behavior, and accumulates its frequency within a session to log the data as features. The resulting counts are transformed into a feature vector [a_1, a_2, a_3, a_4, a_5, ..., a_59], where a_i denotes the frequency of the i-th behavior. As discussed in Sect. 3.2, our JSMBox framework supports customization of input events and features. However, we leave the prototype implementation of this customization for future work.</p></div>
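<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a small illustration of how logged events could be turned into such a frequency vector, consider the sketch below. The function toFeatureVector, the three-event list, and the sample log are assumptions made for the example; the actual prototype tracks 59 events, and its log format is not reproduced here.</p><ab><![CDATA[
```javascript
// Hypothetical event list: JSMBox monitors 59 events, only three are shown.
const MONITORED_EVENTS = ['eval', 'window.open', 'navigator.sendBeacon'];

// Count how often each monitored event occurs in a session log.
// Position i of the result is the frequency of the i-th monitored event.
function toFeatureVector(eventLog, monitoredEvents) {
  const index = new Map(monitoredEvents.map((name, i) => [name, i]));
  const vector = new Array(monitoredEvents.length).fill(0);
  for (const name of eventLog) {
    const i = index.get(name);
    if (i !== undefined) vector[i] += 1; // events outside the list are ignored
  }
  return vector;
}

// Made-up session log, for illustration only.
const sessionLog = ['eval', 'window.open', 'eval', 'eval', 'unknown.call'];
const features = toFeatureVector(sessionLog, MONITORED_EVENTS);
// One CSV row: the frequencies followed by the label of the source list.
const csvRow = [...features, 'malicious'].join(',');
```
]]></ab><p>For the sample log above, the vector is [3, 1, 0] and the CSV row is 3,1,0,malicious.</p></div>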
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Evaluation and Experimental Results</head><p>To evaluate our approach and the proposed JSMBox framework, we consider two research questions: RQ2: What is the performance of JSMBox as a dynamic analysis tool for malicious webpage classification in machine-learning models?</p><p>We present and discuss our evaluation and experimental results for each research question below.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Defeating JavaScript Evasion Techniques</head><p>To evaluate how our JSMBox framework copes with the sophisticated evasion techniques discussed in Sect. 2.3 while still monitoring the defined behaviors for analysis and data extraction, we replicate these techniques in webpages. We load these webpages in the browser with our extension and observe that JSMBox captures all behaviors hidden in obfuscated code or behind other advanced evasion techniques.</p><p>For instance, we simulate the fingerprinting attack by obfuscating the method as discussed in the running example in Listing 3.1, which existing analysis approaches failed to capture <ref type="bibr">[50,</ref><ref type="bibr">51]</ref>. When this attack runs in the browser, our JSMBox framework still tracks and logs its execution.</p><p>Another notable example is the WebAssembly evasion technique discussed earlier. To confirm that our approach can capture this new technique, we develop a simple C program that invokes a JavaScript method, illustrated in Listing 4.1. The C program is compiled into WebAssembly binary code (in a .wasm file, shown in Fig. <ref type="figure">2</ref>) and embedded into a webpage. When the webpage is loaded in the browser with JSMBox, the JavaScript method invoked from the WebAssembly code is executed and logged by our framework, as demonstrated in Fig. <ref type="figure">3</ref>.</p></div>
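<div xmlns="http://www.tei-c.org/ns/1.0"><p>The WebAssembly case can be approximated with a self-contained sketch. It does not reproduce the C program of Listing 4.1: the module below is hand-assembled for illustration, imports a single function env.f, and exports run, whose body only calls that import. The fakeNavigator object, the beacon URL, and the wrapper are likewise assumptions. The point shown is that a call originating in WebAssembly reaches a browser API through JavaScript glue and therefore passes through the wrapped method.</p><ab><![CDATA[
```javascript
// Hand-assembled module (an assumption for illustration, not the paper's binary):
// imports env.f of type () -> () and exports run(), which calls env.f once.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // magic, version
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,                               // type 0: () -> ()
  0x02, 0x09, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x01, 0x66, 0x00, 0x00, // import env.f
  0x03, 0x02, 0x01, 0x00,                                           // one local function
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,             // export run = func 1
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x10, 0x00, 0x0b,                   // body: call 0, end
]);

// Stand-in for the browser API, wrapped the way a reference monitor would wrap it.
const wasmLog = [];
const fakeNavigator = { sendBeacon(url) { return typeof url === 'string'; } };
const originalSendBeacon = fakeNavigator.sendBeacon;
fakeNavigator.sendBeacon = function (...args) {
  wasmLog.push('navigator.sendBeacon');
  return Reflect.apply(originalSendBeacon, this, args);
};

// Glue code: the module can reach the API only through this imported function.
const wasmImports = {
  env: { f: () => { fakeNavigator.sendBeacon('https://example.invalid/track'); } },
};
const wasmInstance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes), wasmImports);
wasmInstance.exports.run();
```
]]></ab><p>After run() returns, the log holds one navigator.sendBeacon entry, even though the call was triggered from inside the WebAssembly module.</p></div>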
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Maliciousness Classification</head><p>In this section, we outline our experiments designed to assess the performance of our framework using machine learning models. Our experiments were conducted in a CyberRange environment on a virtual machine with Ubuntu 22.04, 12 CPUs, 32 GB of RAM, and a 500 GB hard drive.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Dataset and feature extraction</head><p>We collected a large number of malicious and benign website samples from two different datasets: the URLhaus database <ref type="bibr">[67]</ref> for malicious websites, and the Tranco dataset <ref type="bibr">[68]</ref> for benign websites. The URLhaus database, which is part of the Abuse.ch project, is well-known for its comprehensive collection of malicious URLs and is used by organizations such as the FBI, demonstrating its reliability. Tranco, on the other hand, provides a strong website ranking by combining various data sources to ensure stability and resistance to manipulation, making it an excellent source of benign websites <ref type="bibr">[68]</ref>. Our initial dataset consists of over 200,000 benign websites and over 200,000 malicious websites from these two sources. We maintain these websites in two lists to label and evaluate them separately.</p><p>Each website must be loaded into the browser with our extension so that its JavaScript code can be executed and captured by our framework at runtime. To automate this process for large-scale datasets, we leverage Puppeteer<ref type="foot">foot_2</ref>, a Node.js library for controlling Chromium-based or Firefox browsers, which is particularly helpful for browser automation and data collection. We develop and run a script that takes a list of websites and, for each one, uses Puppeteer to launch a new browser instance with the browser extension loaded. Once a website is fully loaded, our code checks for captured data and writes it into a CSV file labeled as malicious or benign. This process resulted in a total of approximately 10,000 records from the malicious list, as well as a similar number of records from the benign list. The data from the two CSV files are combined to create extracted features for further analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Machine-learning models</head><p>We have utilized eight well-known machine learning models, which include Support Vector Machine (SVM), Logistic Regression, Na&#239;ve Bayes, K-Nearest Neighbors (KNN), Decision Tree, Random Forest, XGBoost, and Ensemble methods <ref type="bibr">[69,</ref><ref type="bibr">70]</ref>, to assess our collected dataset. The models demonstrate different levels of accuracy in detecting malicious JavaScript, based on feature vectors extracted from the JavaScript code. Our comprehensive evaluation results enable users to choose the most suitable model for optimizing the detection process.</p><p>The metadata for our machine learning models can be found in Table <ref type="table">2</ref>. We have fine-tuned specific hyperparameters as outlined in the second column of the table. One crucial aspect of this fine-tuning process involves optimizing model hyperparameters with the GridSearchCV method <ref type="bibr">[71,</ref><ref type="bibr">72]</ref>. GridSearchCV conducts an exhaustive search over a specified parameter grid: it trains the model on every combination of hyperparameters and selects the best combination based on cross-validation performance <ref type="bibr">[71]</ref>. In addition to hyperparameter tuning, our training pipeline incorporates the Synthetic Minority Over-sampling Technique (SMOTE) <ref type="bibr">[73]</ref> to address class imbalance by generating synthetic samples within the feature space, which enhances model training effectiveness. We also use the Standard Scaler as a preprocessing step to standardize the features by removing the mean and scaling to unit variance. 
Additionally, we employ the Support Vector Classifier (SVC) <ref type="bibr">[74,</ref><ref type="bibr">75]</ref>, an adaptation of the Support Vector Machine algorithm, for classification tasks.</p></div><div xmlns="http://www.tei-c.org/ns/1.0"><head>JSMBox's effectiveness in classifying malicious JavaScript</head><p>We have chosen specific performance metrics to directly address our research objectives, including accuracy, precision, recall, and F-scores <ref type="bibr">[76,</ref><ref type="bibr">77]</ref>. This methodology enables us to systematically assign each JavaScript snippet, whether malicious or benign, to one of four potential outcomes. Taking the malicious class as the positive class: (1) a snippet classified as malicious that indeed contains malicious code is a true positive (TP); (2) a snippet classified as malicious that is, in reality, benign is a false positive (FP); (3) a snippet classified as benign that contains malicious code is a false negative (FN); (4) a snippet classified as benign that contains no malicious code is a true negative (TN).</p><p>-Accuracy: the number of instances correctly classified over the total number of instances.</p><p>Table <ref type="table">3</ref> shows the performance of the eight machine learning models. According to the F1-score, their effectiveness in detecting malicious JavaScript code ranges from 0.77 to 0.88 and, for benign code, from 0.69 to 0.88. The Ensemble model identifies malicious code with the greatest precision, 0.89. Meanwhile, the Random Forest model is the best at catching malicious code, achieving the highest recall of 0.88. On the F1-score, which balances precision and recall, Random Forest comes out on top for identifying malicious code, with the highest score of 0.88. 
Similarly, for identifying benign code, both the Random Forest and Ensemble models are the best choices, each with a top F1-score of 0.88. The accuracy column provides an overall measure of a model's performance. High accuracy indicates that the model effectively distinguishes benign from malicious JavaScript content. As observed from the table, the Random Forest model achieved the highest accuracy of 0.88, closely followed by the Ensemble model at 0.87.</p></div>
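<div xmlns="http://www.tei-c.org/ns/1.0"><p>For reference, the four metrics named in this section follow from the TP, FP, FN, and TN counts through their standard definitions: accuracy is (TP + TN) over all instances, precision is TP / (TP + FP), recall is TP / (TP + FN), and the F1-score is the harmonic mean of precision and recall. The sketch below computes them; the function name classificationMetrics and the example counts are made up for illustration and are not results from Table 3.</p><ab><![CDATA[
```javascript
// Safe division: returns 0 when the denominator is 0.
function ratio(numerator, denominator) {
  return denominator === 0 ? 0 : numerator / denominator;
}

// Standard metric definitions for one positive class (here: malicious).
function classificationMetrics({ tp, fp, fn, tn }) {
  const accuracy = ratio(tp + tn, tp + fp + fn + tn);
  const precision = ratio(tp, tp + fp);
  const recall = ratio(tp, tp + fn);
  const f1 = ratio(2 * precision * recall, precision + recall);
  return { accuracy, precision, recall, f1 };
}

// Made-up confusion counts, for illustration only (not taken from Table 3).
const exampleMetrics = classificationMetrics({ tp: 80, fp: 20, fn: 20, tn: 80 });
```
]]></ab><p>With these example counts, all four metrics come out at approximately 0.8.</p></div>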
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion and Future Work</head><p>In this paper, we introduced JSMBox, a novel behavioral sandbox approach designed to address the challenges of analyzing and classifying malicious JavaScript code. Our method addresses the limitations of traditional static and dynamic analysis techniques by monitoring and controlling JavaScript code behavior at runtime. By leveraging an inlined security reference monitor, our approach captures the behaviors of both static and dynamically generated code, including code employing advanced evasion techniques. We implemented JSMBox as a runtime JavaScript analysis framework, which can monitor customizable events and their behaviors. The experimental results demonstrated the effectiveness of our method, with machine learning models trained on features extracted by the framework achieving high accuracy rates, even when advanced evasion techniques are used to conceal malicious behaviors. Future work will focus on enhancing the range of features extracted by the framework, including more sophisticated behaviors and different behavioral patterns. We will also investigate how to extend the implementation of JSMBox to support multiple web browsers, ensuring its effectiveness and usability across different browsing environments, or to integrate it directly into core browsers for more seamless and comprehensive monitoring. Experiments with a wider variety of datasets, including JavaScript files and newer web technologies like WebAssembly, will also be conducted to ensure the robustness and adaptability of our approach.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0"><p>According to the World Wide Web Technology Surveys in July 2024 (https://w3techs.com), 98.9% of all websites contain JavaScript code, which is loaded and executed in a browser on the end user's side.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_2"><p>https://pptr.dev/</p></note>
		</body>
		</text>
</TEI>
