<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>GUIFuzz++: Unleashing Grey-box Fuzzing on Desktop Graphical User Interfacing Applications</title></titleStmt>
			<publicationStmt>
				<publisher>IEEE</publisher>
				<date>11/16/2025</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10639404</idno>
					<idno type="doi"></idno>
					
					<author>Dillon Otto</author><author>Tanner Rowlett</author><author>Stefan Nagy</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Desktop applications represent one of today's largest software ecosystems, accounting for over 96% of workplace computing and supporting essential operations across critical sectors such as healthcare, commerce, industry, and government. Though modern software is increasingly being vetted through fuzzing (an automated testing technique for large-scale bug discovery), a major component of desktop applications remains universally under-vetted: the Graphical User Interface (GUI). Existing desktop-based fuzzers like AFL++ and libFuzzer are limited to non-GUI interfaces (e.g., file- or buffer-based inputs), rendering them wholly incompatible with GUIs. Conversely, mobile app GUI fuzzers like Android's Monkey and iOS's XCMonkey rely on platform-specific SDKs and event-handling, rendering them fundamentally unportable to the broader, more complex landscape of desktop software. For these reasons, desktop GUI code remains largely under-tested, burdening users with numerous GUI-induced errors that should, in principle, be just as discoverable as any other well-fuzzed class of software bugs. This paper introduces GUIFUZZ++: the first general-purpose fuzzer for desktop GUI software. Unlike desktop fuzzers that randomly mutate file- or buffer-based inputs, GUIFUZZ++ exclusively targets GUI interactions (clicks, scrolls, key presses, window navigation, and more) to uncover complex event sequences triggering GUI-induced program errors. Central to our approach is a novel GUI Interaction Interpreter: a middle-layer translating fuzzer-generated random inputs into distinct GUI operations, enabling successful non-GUI fuzzers like AFL++ to be easily ported to testing GUIs. 
Beyond supporting today's most popular GUI development frameworks like Qt, GTK, and Xorg, we introduce a suite of enhancements capitalizing on ubiquitous Software Accessibility Technologies, significantly boosting GUI fuzzing precision as well as GUI bug-finding effectiveness. We integrate GUIFUZZ++ as a prototype atop state-of-the-art GUI-agnostic fuzzer AFL++, and perform a large-scale ablation study of its fundamental components and enhancements. In an evaluation across 12 popular, real-world GUI applications, GUIFUZZ++ uncovers 23 previously-unknown GUI-induced bugs, with 14 thus far confirmed or fixed by developers.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Desktop software (applications deployed on personal computers or workstations) plays an ever-growing role in our modern digital age, representing over 96% of workplace computing today <ref type="bibr">[4]</ref>. As desktop software spans important domains like healthcare, commerce, industry, and government, its correctness directly impacts many of society's most critical services. Unfortunately, program bugs remain a significant challenge across today's desktop computing ecosystems (e.g., Linux, macOS, and Windows), burdening users with outright failures, and developers with costly remediation efforts. In the race to proactively thwart bugs before they emerge post-deployment, developers are increasingly turning to fuzzing: an automated testing technique that scrutinizes software by generating massive amounts of randomly-mutated test cases.</p><p>Fuzzers are uniquely engineered to target specific software interfaces: the channels by which user input is passed into a program and ultimately triggers its bugs. For example, popular fuzzer AFL <ref type="bibr">[54]</ref> and its successor, AFL++ <ref type="bibr">[10]</ref>, both focus on file-based interfaces, mutating on-disk files and subsequently re-executing the target program on each to uncover its aberrant runtime behaviors. Others, such as libFuzzer <ref type="bibr">[44]</ref> and Nyx <ref type="bibr">[42]</ref>, instead target memory-based interfaces, such as API functions that consume buffered data. As nearly all desktop software fuzzers are merely derivations of these few "mother" fuzzers <ref type="bibr">[37]</ref> (commonly AFL++ and libFuzzer), they subsequently target the very same interfaces. 
Yet, one crucial interface remains universally under-tested across today's ever-growing desktop software ecosystems: the Graphical User Interface (GUI).</p><p>Given their prevalence among desktop software, GUIs are unsurprisingly responsible for many program bugs. Public issue trackers reveal numerous crashes stemming from unhandled GUI-induced edge cases, plaguing applications as simple as calculators <ref type="bibr">[19]</ref> to those as complex as image editors <ref type="bibr">[7]</ref>, 3-D modeling tools <ref type="bibr">[16]</ref>, and web browsers <ref type="bibr">[2]</ref>. Unfortunately, existing desktop-based fuzzers like AFL++ are isolated to non-GUI interfaces, with zero direct support for GUIs <ref type="bibr">[10]</ref>. While mobile SDKs offer built-in GUI testing (e.g., Android's Monkey <ref type="bibr">[24]</ref>, iOS's XCMonkey <ref type="bibr">[22]</ref>), the diverse landscape of GUI software in commodity desktop OSes, coupled with incompatible system-level event handling between these platforms, impedes direct porting of mobile GUI fuzzers to desktop ecosystems. Consequently, desktop GUI fuzzing currently remains limited to cost-prohibitive commercial offerings (e.g., Ranorex <ref type="bibr">[18]</ref>, Squish <ref type="bibr">[17]</ref>) or one-off, target-specific fuzzers (e.g., GUIFuzz <ref type="bibr">[9]</ref> for calc.exe), leaving an untold number of GUI-induced bugs hidden among today's critical desktop GUI software ecosystems.</p><p>To overcome these challenges and unleash large-scale GUI testing on desktop applications, this paper introduces GUIFUZZ++: the first general-purpose grey-box fuzzer for desktop GUI software. Unlike typical file- or buffer-mutating fuzzers, GUIFUZZ++ systematically explores GUIs by mutating GUI interactions (clicks, scrolls, key presses, window navigation, and more), facilitating discovery of complex GUI-induced errors in diverse desktop applications. 
Central to GUIFUZZ++ is a novel Interaction Interpreter: a middle-layer for translating fuzzers' randomly-generated test cases into distinct GUI operations, enabling conventional file- or buffer-mutating fuzzers to be repurposed for desktop GUI fuzzing. We further bolster GUIFUZZ++ by harnessing widely-available Software Accessibility Technologies <ref type="bibr">[13]</ref>, significantly enhancing GUI fuzzing precision and bug-finding effectiveness.</p><p>We implement GUIFUZZ++ atop today's leading non-GUI grey-box fuzzer, AFL++ <ref type="bibr">[10]</ref>, and evaluate its efficacy across a diverse corpus of 12 real-world desktop applications on Linux spanning popular GUI development frameworks such as Qt <ref type="bibr">[6]</ref>, GTK <ref type="bibr">[12]</ref>, and Xorg <ref type="bibr">[15]</ref>. We empirically evaluate GUIFUZZ++'s contributions and enhancements through a series of ablation studies, showing how its combined components create an effective platform for discovering GUI bugs in desktop software. Notably, GUIFUZZ++ reveals 23 previously-unknown GUI-induced bugs across 11 desktop applications, of which 14 are so far confirmed or fixed by their developers.</p><p>Through the following contributions, this paper introduces the first general approach for uncovering GUI-induced bugs in today's vast ecosystems of desktop GUI software:</p><p>&#8226; We examine the challenges of extending fuzzing to GUI-based applications on desktop platforms such as Linux, macOS, and Windows. We survey existing state-of-the-art fuzzing approaches, and weigh their shortcomings with respect to enabling systematic desktop app GUI fuzzing.</p><p>&#8226; We leverage our insights to design GUIFUZZ++: the first general-purpose grey-box fuzzer for desktop GUI software. 
We detail how GUIFUZZ++'s design facilitates practical and far-reaching desktop GUI fuzzing, maintaining high precision toward effective GUI bug discovery.</p><p>&#8226; We evaluate GUIFUZZ++'s capabilities through a series of ablation studies across 12 popular Linux GUI applications spanning various software domains. We show that GUIFUZZ++ enables effective GUI bug discovery, culminating in the identification of 23 new GUI-induced crashes, of which 14 are so far confirmed or fixed.</p><p>&#8226; We release GUIFUZZ++ in addition to all of our evaluation artifacts and benchmarks at the following URL: <ref type="url">https://github.com/FuturesLab/GUIFuzzPlusPlus</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. BACKGROUND, RELATED WORK, AND MOTIVATION</head><p>This section introduces the fundamental topics related to GUIFUZZ++: software GUIs, GUI-induced bugs, and the challenges of fuzzing desktop GUI software.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. GUIs: Graphical User Interfaces</head><p>Graphical User Interfaces (GUIs) are among the most prevalent features in modern software, enabling complex applications to offer rich and intuitive user interactions: clicking, dragging-and-dropping, scrolling, menu navigation, and much more. Today's GUI development market, valued at $885 million, is projected to surpass $2 billion by 2031 <ref type="bibr">[47]</ref>. To craft these interfaces, developers typically utilize dedicated GUI development frameworks. Common examples include cross-platform libraries like Qt <ref type="bibr">[6]</ref> and GTK <ref type="bibr">[12]</ref>, Android's Jetpack <ref type="bibr">[26]</ref>, and Apple's UIKit <ref type="bibr">[23]</ref>.</p>
<table>
<row role="label"><cell>Program</cell><cell>Crash Type</cell><cell>Brief Description</cell><cell>Bug ID</cell></row>
<row><cell>Glaxnimate <ref type="bibr">[3]</ref></cell><cell>Abort</cell><cell>Text object properties</cell><cell>#408</cell></row>
<row><cell>KolourPaint <ref type="bibr">[30]</ref></cell><cell>Abort</cell><cell>Double undo in new window</cell><cell>#457915</cell></row>
<row><cell>LabPlot <ref type="bibr">[31]</ref></cell><cell>Abort</cell><cell>Fitting function data</cell><cell>#372834</cell></row>
<row><cell>LibreCAD <ref type="bibr">[36]</ref></cell><cell>Segfault</cell><cell>Right click with move/copy</cell><cell>#235</cell></row>
<row><cell>MATE-calc <ref type="bibr">[38]</ref></cell><cell>Segfault</cell><cell>"Not" on long hex value</cell><cell>#114</cell></row>
<row><cell>PlotJuggler <ref type="bibr">[8]</ref></cell><cell>Segfault</cell><cell>Apply filter on curve</cell><cell>#603</cell></row>
<row><cell>Umbrello <ref type="bibr">[33]</ref></cell><cell>Segfault</cell><cell>Cancelling seq diagram class</cell><cell>#443580</cell></row>
</table>
<p>TABLE I: Examples of known GUI-triggered bugs in desktop software.</p><p>Most GUI development frameworks employ a similar multithreaded architecture: a dedicated "main" thread updates the user interface and dispatches GUI-issued events to the application's back-end, while one or more "worker" threads process the application's back-end operations. This separation is key to ensuring that the interface remains responsive, even as complex operations are handled in the background.</p><p>However, this design also brings unique challenges: thread coordination can introduce subtle concurrency bugs <ref type="bibr">[35]</ref>, while complex GUI component lifecycles can trigger temporal memory errors, with both often surfacing only under specific interaction sequences (Table <ref type="table">I</ref>). 
Moreover, the inherent complexity of GUIs, ranging from diverse user interactions and nested sub-menus to transient pop-up screens and other application-specific bottlenecks, poses major challenges to proactive bug discovery. To address this, a substantial body of research has emerged targeting GUI-induced errors in mobile app ecosystems such as Android and iOS <ref type="bibr">[24]</ref>, <ref type="bibr">[22]</ref>. Yet, while these approaches have achieved great success in uncovering GUI bugs within mobile apps, today's ever-growing desktop software ecosystems spanning Linux, macOS, and Windows remain completely overlooked, with no comparable solutions for discovering their GUI-related software defects.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Why Fuzzers Fail on Desktop GUI Software</head><p>Among today's most proven approaches for software bug discovery is fuzzing: an automated software testing technique that uncovers bugs by generating and mutating massive amounts of test cases. Despite the variety of fuzzing techniques available currently <ref type="bibr">[10]</ref>, <ref type="bibr">[44]</ref>, desktop applications, which account for over 96% of workplace computing needs today <ref type="bibr">[4]</ref>, lack any practical fuzzing solutions for uncovering GUI-induced bugs. In the following, we survey contemporary fuzzing solutions, assessing their key shortcomings with respect to supporting GUI fuzzing for desktop-based software ecosystems.</p><p>Desktop Application Fuzzers: Popular application fuzzers like AFL <ref type="bibr">[54]</ref>, AFL++ <ref type="bibr">[10]</ref>, and honggFuzz <ref type="bibr">[50]</ref> all target file interfaces, mutating test cases as on-disk files that are subsequently each fed to the program under test. Others such as libFuzzer <ref type="bibr">[44]</ref> and Nyx <ref type="bibr">[42]</ref> instead target memory-based interfaces, mutating in-memory data that is ultimately read by API functions. Despite their proven success, these mainstream grey-box fuzzers (the foundational frameworks for most modern fuzzers <ref type="bibr">[37]</ref>) lack any support for GUI testing, instead concentrating on traditional file- and memory-based program interfaces. 
Though recent advancements attempt to bypass GUIs via automated program slicing (e.g., Winnie <ref type="bibr">[28]</ref>), these methods merely redirect testing to typical file- or buffer-based interfaces, leaving bugs caused by GUI interactions still undiscoverable.</p><p>Desktop Environment Fuzzing: EnvFuzz <ref type="bibr">[39]</ref>, a recent grey-box fuzzing approach, instead mutates desktop applications' environment-level interfaces such as configuration files, fonts, themes, and sockets. While EnvFuzz has indeed been applied to desktop GUI applications (e.g., calculators <ref type="bibr">[20]</ref>), it inherits conventional desktop fuzzing's <ref type="bibr">[10]</ref> limitation of supporting only data-level interfaces, leaving it unable to explore GUI interactions whatsoever. This limitation is further reflected in EnvFuzz's failure to uncover any genuine GUI-induced bugs across its tested GUI applications <ref type="bibr">[39]</ref>.</p><p>Mobile App GUI Fuzzers: Although GUI testing continues to see adoption in mobile ecosystems, fundamental differences between underlying GUI frameworks, event-handling models, and application architectures leave mobile app GUI fuzzers unusable on desktop software. Mobile platforms typically offer well-defined UI lifecycles and standardized GUI development APIs, and consequently, mobile GUI fuzzers remain tightly coupled with platform-specific SDKs (e.g., Android's Monkey <ref type="bibr">[24]</ref>, iOS's XCMonkey <ref type="bibr">[22]</ref>). In contrast, desktop environments are highly heterogeneous, with diverse GUI frameworks (e.g., Qt <ref type="bibr">[6]</ref>, GTK <ref type="bibr">[12]</ref>) and interaction paradigms (e.g., first-party vs. third-party windows) that lack centralized control mechanisms. Moreover, desktop applications often rely on complex, multi-window workflows as well as non-touch-based inputs, further complicating automated testing. 
For these reasons, mobile app fuzzers are neither currently used nor practically adaptable for testing desktop-based GUIs.</p><p>Motivation: the need for a desktop-based GUI fuzzer. As today's fuzzers are universally unable to test desktop GUI software, we aim to bridge the gaps between widely-successful fuzzing platforms and desktop GUI targets. We envision a world where desktop application GUIs are fuzzable just as any other software interface, and thus design a fuzzer to meet these capabilities: GUIFUZZ++.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. GUIFUZZ++: CHALLENGES AND SOLUTIONS</head><p>To bridge the long-standing gap between conventional fuzzing and GUI-based software, we introduce GUIFUZZ++: the first system to extend general-purpose desktop fuzzing platforms to support today's diverse and complex desktop GUI application ecosystems. In the following, we outline the fundamental challenges that motivate GUIFUZZ++'s core design (Figure <ref type="figure">1</ref>), along with our corresponding solutions aimed at enabling effective and scalable GUI fuzzing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Challenge 1: Making Desktop Fuzzers Interact with GUIs</head><p>Unlike prior GUI testing tools, which are tied to specific targets <ref type="bibr">[9]</ref> or platforms <ref type="bibr">[24]</ref>, GUIFUZZ++ aims for breadth in supporting a wide range of desktop GUI software and ecosystems. To achieve this, we draw inspiration from mainstream desktop-based fuzzers like AFL++ <ref type="bibr">[10]</ref> which, while incompatible with GUIs, are by far today's most far-reaching and ubiquitous desktop software fuzzing tools in practice <ref type="bibr">[45]</ref>. Our goal, thus, is extending these general-purpose fuzzers to desktop GUIs, with the least modification necessary, via a novel mechanism for directly translating their random inputs into concrete GUI events: our GUI Interaction Interpreter.</p><p>Interpreting Fuzzer Inputs as GUI Events: Contemporary desktop fuzzers such as AFL++ <ref type="bibr">[10]</ref> and libFuzzer <ref type="bibr">[44]</ref> operate at the byte level, generating a continuously-growing corpus of random, string-based inputs. Using widely-available GUI automation APIs <ref type="bibr">[49]</ref>, we thus see an opportunity to reinterpret these inputs as sequences of GUI operations, analogous to the randomized "monkey"-style GUI interaction testing that has historically proven effective in mobile GUI fuzzing <ref type="bibr">[24]</ref>.</p><p>To enable this, we formalize fundamental GUI actions via a minimal grammar of three-byte instructions, shown in Table <ref type="table">II</ref>.</p>
<table>
<row role="label"><cell>Op Structure</cell><cell>Description of GUI Interaction</cell></row>
<row><cell>00 FF FF</cell><cell>Close currently-active window, ignoring both operands.</cell></row>
<row><cell>01 CC FF</cell><cell>Input the key press corresponding to the extended ASCII encoding of primary operand CC, ignoring the second operand. Ex: 01 7F FF &#8594; input extended ASCII key press "DEL".</cell></row>
<row><cell>02 XX YY</cell><cell>Click the location (X%, Y%) relative to the current window's dimensions, offset from its bottom-left coordinate (0,0). Ex: 02 A0 1B &#8594; click relative position (62.5%, 10.5%).</cell></row>
<row><cell>03 XX YY</cell><cell>Drag the cursor from its current position to the new position (X%, Y%) relative to the current window's dimensions, offset from its bottom-left coordinate (0,0). Ex: 03 00 80 &#8594; drag to relative position (0%, 50%).</cell></row>
<row><cell>NN AA BB</cell><cell>All higher opcodes (i.e., 04-FF): normalize the opcode via (NN % 4), reinterpreting the transformed opcode accordingly. Ex: B2 2C 9F &#8594; reinterpret as click operation 02 2C 9F.</cell></row>
</table>
<p>TABLE II: Overview of GUIFUZZ++'s core GUI operation grammar.</p><p>Upon receiving an input from the fuzzer, GUIFUZZ++ invokes its Interaction Interpreter (a component fully independent of the fuzzer), parsing the input bytes according to our GUI event grammar and dispatching corresponding actions. To keep our grammar compact, our instruction set (Table <ref type="table">II</ref>) defines four core opcodes (i.e., 00-03), with any higher opcodes normalized via modulo and mapped back into this defined range. For instance, a three-byte sequence beginning with opcode 06 is normalized to 02 (6 % 4 = 2), and thus interpreted as a click event. This simple normalization enables seamless integration with the random byte sequences produced by existing fuzzers, requiring no changes to their input generation logic.</p><p>Facilitating GUI Interaction: Central to our approach is leveraging existing GUI automation and introspection capabilities. Because many Table <ref type="table">II</ref> instructions require window-relative positioning, GUIFUZZ++ first queries the target application's window dimensions using native windowing APIs provided by the host OS (e.g., x11-utils <ref type="bibr">[14]</ref> on Linux). It then executes the parsed GUI actions using cross-platform automation libraries (e.g., PyAutoGUI <ref type="bibr">[49]</ref>), which expose generic primitives for mouse, keyboard, and window management. These readily available APIs allow GUIFUZZ++ to issue platform-agnostic GUI interactions, including higher-level actions such as spawning or closing windows, without the need for any application-specific instrumentation.</p><p>Breadth of GUI Interactions: Because GUIFUZZ++ aims to avoid the overhead of program-specific static analysis or tailoring, it operates using a minimal yet expressive set of core events: window-closing, key presses, clicks, and drags. These primitives form a functional superset capable of emulating a wide range of GUI element-specific interactions. For example, scrolling through a menu can be accomplished entirely via a drag operation, allowing GUIFUZZ++ to explore substantial interface behavior using just these basic inputs. Moreover, our Interaction Interpreter is designed for easy extensibility with new opcodes and, as we demonstrate in &#167; III-C, it readily supports the integration of more targeted, element-specific mutators toward higher-precision GUI exploration.</p><p>Solution 1: GUIFUZZ++ introduces a GUI Interaction Interpreter that enables conventional, GUI-agnostic desktop-based fuzzers like AFL++ to be fully repurposed for GUI fuzzing, without any need for costly reengineering.</p></div>
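<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the three-byte grammar of Table II concrete, the following is a minimal Python sketch of how an interpreter of this kind could decode fuzzer bytes. It is an illustration under stated assumptions, not the released implementation: the names CORE_OPS, GuiOp, parse_ops, percent, and to_screen are hypothetical, the divisor of 256 is inferred from the examples in Table II, and discarding a trailing partial instruction is our own choice. Dispatching the decoded operations (e.g., via PyAutoGUI) is deliberately left out.</p>

```python
from dataclasses import dataclass
from typing import List, Tuple

# The four core operations of Table II, indexed by normalized opcode.
# The names are illustrative labels, not identifiers from the released tool.
CORE_OPS = ("close_window", "key_press", "click", "drag")
OP_SIZE = 3  # one opcode byte followed by two operand bytes


@dataclass(frozen=True)
class GuiOp:
    name: str  # entry of CORE_OPS
    a: int     # first operand byte (0-255)
    b: int     # second operand byte (0-255)


def parse_ops(data: bytes) -> List[GuiOp]:
    """Decode fuzzer bytes into GUI operations, normalizing opcodes via modulo."""
    usable = len(data) - len(data) % OP_SIZE  # assumption: ignore a trailing partial op
    ops = []
    for i in range(0, usable, OP_SIZE):
        opcode, a, b = data[i], data[i + 1], data[i + 2]
        ops.append(GuiOp(CORE_OPS[opcode % len(CORE_OPS)], a, b))
    return ops


def percent(operand: int) -> float:
    """Operand byte as a window-relative percentage (assumes a divisor of 256)."""
    return operand * 100 / 256


def to_screen(op: GuiOp, left: int, top: int, width: int, height: int) -> Tuple[int, int]:
    """Map (a, b) to screen pixels for a window whose top-left corner is (left, top).

    Table II measures from the window's bottom-left corner, so y is flipped
    to match the top-left screen origin that PyAutoGUI uses.
    """
    x = left + op.a * width // 256
    y = top + height - op.b * height // 256
    return x, y
```

<p>Under these assumptions, parse_ops(bytes.fromhex("B22C9F")) yields a single click with operands 0x2C and 0x9F, matching the last example in Table II, and percent(0xA0) evaluates to 62.5.</p></div>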
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Challenge 2: Handling Desktop-specific Window Obstacles</head><p>Compared to mobile platforms, desktop GUI fuzzing faces significantly more obstacles from the proliferation of unwanted third- and first-party windows, which disrupt fuzzing workflows and pollute the interaction space. On mobile OSes, apps are typically sandboxed with strict lifecycle control <ref type="bibr">[1]</ref>, where only one app is active in the foreground at a time, and pop-ups or overlays are generally constrained by platform-level guidelines and permission models. In contrast, desktop environments allow multiple overlapping windows from different processes (e.g., update dialogs, crash reporters, or unrelated apps). Even within a single program, modal dialogs, system alerts, and nested windows may appear unpredictably. These extraneous windows can intercept input, obscure the target interface, or cause unintended side effects during fuzzing. In the following, we detail GUIFUZZ++'s mechanisms for mitigating unwanted third- and first-party windows to ensure interactions remain focused on the intended application GUI.</p><p>Tackling Third-party Window Interference: To ensure that GUI fuzzing remains confined to the intended application window, GUIFUZZ++ records the target process's PID, and initiates GUI interactions only when the currently displayed window matches that PID. Once a test case completes its sequence of interactions, GUIFUZZ++ sends an interrupt signal (SIGINT) to terminate the window, returning control to the fuzzer to begin the next iteration. If the target or OS spawns unwanted third-party windows (e.g., an update dialog or the web browser), GUIFUZZ++ collects their PIDs and similarly terminates each via SIGINT. 
Nearly all of GUIFUZZ++'s window management logic resides within the GUI Interaction Interpreter, with only two additional lines of code added to AFL++ to capture the target application's PID.</p><p>Tackling First-party Window Interference: While third-party interference accounts for the majority of window-related disruptions, we also observe several cases where first-party windows (those spawned by the fuzzed application itself) can impede fuzzing. Unlike third-party windows, these originate from the target process, and thus cannot be filtered out using our PID-based third-party window filtering. The most common example involves file browser dialogs, which are often triggered by GUI actions such as clicking a SAVE or OPEN button. These dialogs pose a particularly dangerous risk: if the fuzzer inadvertently interacts with them, it may initiate unintended operations on the host file system.</p><p>To mitigate this, we extend our use of GUI introspection APIs (&#167; III-A) to heuristically detect and suppress such dialogs. Specifically, we scan for window titles containing common file-related keywords like LOAD, SAVE, and FILE, allowing us to identify and preemptively close or bypass most file-browser windows before they interfere with fuzzing execution. As with our third-party window filtering, this mechanism resides entirely within GUIFUZZ++'s Interaction Interpreter, requiring no additional customization to the fuzzer itself.</p><p>Solution 2: GUIFUZZ++ mitigates unwanted window interference with minimal changes to the underlying fuzzer, leveraging its core window introspection APIs to find and suppress signs of disruptive or extraneous GUI activity.</p></div>
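<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two window filters described in this section can be sketched in a few lines of Python. This is a hedged illustration rather than the tool's actual logic: the function names are hypothetical, case-insensitive substring matching is our assumption, and the keyword list contains only the three keywords named above. How the active window's PID and title are obtained is platform-specific and not shown.</p>

```python
import os
import signal

# File-related keywords named in the text; a real list may be longer.
FILE_KEYWORDS = ("LOAD", "SAVE", "FILE")


def is_file_dialog(title: str) -> bool:
    """Heuristic: treat a window as a file browser if its title has a keyword."""
    upper = title.upper()  # assumption: matching is case-insensitive
    return any(word in upper for word in FILE_KEYWORDS)


def classify_window(window_pid: int, title: str, target_pid: int) -> str:
    """Decide how the interpreter should treat the currently displayed window."""
    if window_pid != target_pid:
        return "third_party"  # foreign process: terminate it, do not interact
    if is_file_dialog(title):
        return "file_dialog"  # first-party but risky: close or bypass
    return "target"           # safe to dispatch GUI operations


def interrupt(pid: int) -> None:
    """Send SIGINT to a process, as done for third-party windows and test teardown."""
    os.kill(pid, signal.SIGINT)
```

<p>Plain substring matching is simple but coarse: a title such as "Profile" contains FILE and would be flagged. Whether that trade-off is acceptable depends on the target, which is why we describe the check as a heuristic.</p></div>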
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Challenge 3: Maintaining Precise Desktop GUI Interaction</head><p>A key challenge in GUI fuzzing is the disconnect between screen coordinates and interactive elements, making arbitrary interaction likely to miss actionable GUI components. While mobile platforms aid fuzzers <ref type="bibr">[24]</ref>, <ref type="bibr">[22]</ref> with built-in GUI introspection APIs (e.g., Android's AccessibilityService <ref type="bibr">[25]</ref>), desktop OSes lack such centralized mechanisms for introspecting GUI elements, leaving GUIFUZZ++'s interactions unlikely to drive meaningful fuzzing progress. To overcome this, we introduce a suite of enhancements leveraging emerging desktop-based Software Accessibility Technologies <ref type="bibr">[13]</ref>, providing GUIFUZZ++ with a powerful means of directly targeting GUI components toward more effective fuzzing.</p><p>Achieving GUI Introspection via AT-SPI: To meet the needs of assistive devices such as screen readers, magnifiers, and braille displays, recent years have seen the widespread adoption of the Assistive Technology Service Provider Interface (AT-SPI) <ref type="bibr">[13]</ref>, the primary accessibility framework for Linux desktop environments, with emerging support extending to platforms like macOS. At a high level, AT-SPI exposes a hierarchical view of an application's GUI elements (e.g., buttons, menus, and text fields), enabling accessibility tools to support meaningful, non-visual navigation. With built-in support for popular GUI toolkits such as GTK and Qt, AT-SPI offers a robust foundation for external tools to inspect and reason about interface structure, making it a natural fit for introspection-driven enhancements within GUIFUZZ++.</p><p>Leveraging AT-SPI in GUI Fuzzing: To improve GUIFUZZ++'s GUI-interaction precision, we leverage AT-SPI's built-in recognition of standard GUI elements <ref type="bibr">[13]</ref>. 
Accordingly, we extend our GUI Interaction Interpreter with 11 new operators targeting the fundamental classes of interactable GUI elements (Table <ref type="table">III</ref>): toggleables, selections, movables, as well as general push buttons and user-controllable text fields. Continuing from Table II, we assign each of these 11 new instructions a unique opcode, with normalization similarly applied via modulo to bring any fuzzer-generated higher-opcode instructions (e.g., 17) within GUIFUZZ++'s full expanded instruction opcode range (i.e., 00-14).</p>
<table>
<row role="label"><cell>Category</cell><cell>Op Structure</cell><cell>Element Type</cell></row>
<row><cell>General</cell><cell>04 AA BB</cell><cell>Pushable Button</cell></row>
<row><cell>General</cell><cell>05 AA BB</cell><cell>Text Entry Field</cell></row>
<row><cell>Toggleables</cell><cell>06 AA BB</cell><cell>Checkbox Button</cell></row>
<row><cell>Toggleables</cell><cell>07 AA BB</cell><cell>On/Off Button</cell></row>
<row><cell>Selections</cell><cell>08 AA BB</cell><cell>Radio Button</cell></row>
<row><cell>Selections</cell><cell>09 AA BB</cell><cell>Spinner Button</cell></row>
<row><cell>Selections</cell><cell>10 AA BB</cell><cell>Table Cell Button</cell></row>
<row><cell>Selections</cell><cell>11 AA BB</cell><cell>Drop-down Item</cell></row>
<row><cell>Selections</cell><cell>12 AA BB</cell><cell>Combination Box</cell></row>
<row><cell>Movable</cell><cell>13 AA BB</cell><cell>Scrollable Field</cell></row>
<row><cell>Movable</cell><cell>14 AA BB</cell><cell>Sliding Selection</cell></row>
</table>
<p>TABLE III: GUIFUZZ++'s AT-SPI enhanced interactions. As in Table II, every operation is mapped to a three-byte structure: a single opcode, followed by two operands AA and BB that reference the GUI element's position within the type-specific element list that is exposed by AT-SPI and updated alongside the GUI. Similarly, higher opcodes are normalized to range 00-14 via (NN % 15). For example, sequence 58 00 04 is normalized to 13 00 04, and thus interpreted as an interaction on the fourth-indexed scrollable field element.</p><p>[Figure 2: a drawing app's window alongside its AT-SPI element tree. Nodes shown: Draw App (Frame); Menu (Menu Bar); File, Edit, View, Help (Menu Items); Main Toolbar (Toolbar); New, Open, Donate (Push Buttons); 20% (Combo Box); a Separator; Brush Tools (Table); Pen (Table Cell).]</p><p>Fig. <ref type="figure">2</ref>: Example visualization of an AT-SPI <ref type="bibr">[13]</ref> dynamically-generated GUI element tree for a simple drawing app. To use the AT-SPI tree, GUIFUZZ++ flattens it into one list per element type (e.g., push button and menu item from Table <ref type="table">III</ref>). Different operators select different types of elements, and the operands are subsequently used to index into the list.</p><p>Although GUIFUZZ++'s core clicking operation (Table II) relies on window-relative positioning to find where to click, AT-SPI exposes a deterministic tree of all currently-visible GUI elements, enabling operand-guided targeting for GUIFUZZ++'s enhanced interactions as well. Namely, each element type in Table III is accessed via a click, with the instruction's final two operands used to index into a type-specific list of matching elements (Figure <ref type="figure">2</ref>). GUIFUZZ++ constructs these lists dynamically, re-polling AT-SPI's tree of GUI elements after each dispatched GUI interaction and flattening all same-type nodes from the AT-SPI tree, enabling fast, deterministic element selection. If the resulting index exceeds the list's bounds, GUIFUZZ++ wraps it via modulo to ensure a valid target. This design allows GUIFUZZ++ to precisely fetch specific GUI elements even as the interface evolves at runtime, offering far more fruitful GUI fuzzing compared to blind, on-screen pixel clicking.</p><p>Solution 3: GUIFUZZ++ overcomes GUI interaction imprecision through a suite of AT-SPI-assisted operations, enabling targeted, element-aware interactions that significantly improve the effectiveness of desktop GUI fuzzing.</p></div>
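<div xmlns="http://www.tei-c.org/ns/1.0"><p>The flatten-then-index scheme described above can be illustrated with a small, self-contained Python sketch. It does not call AT-SPI itself: the Node class is a hypothetical stand-in for an accessibility tree, the example hierarchy is our own guess loosely modeled on Figure 2, and combining the two operands into one 16-bit index (AA * 256 + BB) is an assumption, since the text states only that both operands index into the type-specific list.</p>

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for one element of an accessibility tree."""
    name: str
    role: str
    children: List["Node"] = field(default_factory=list)


def flatten_by_role(root: Node) -> Dict[str, List[Node]]:
    """Collect all nodes into one list per role, in pre-order traversal order."""
    by_role: Dict[str, List[Node]] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        by_role.setdefault(node.role, []).append(node)
        stack.extend(reversed(node.children))  # reversed so children pop left to right
    return by_role


def select(by_role: Dict[str, List[Node]], role: str, a: int, b: int) -> Optional[Node]:
    """Pick the element addressed by operands (a, b), wrapping out-of-range indices."""
    candidates = by_role.get(role, [])
    if not candidates:
        return None  # no element of this type is currently visible
    index = a * 256 + b  # assumption: operands form a big-endian 16-bit index
    return candidates[index % len(candidates)]


# Example tree; the parent/child layout is illustrative only.
tree = Node("Draw App", "frame", [
    Node("Menu", "menu bar", [Node("File", "menu item"), Node("Edit", "menu item")]),
    Node("Main Toolbar", "toolbar", [
        Node("New", "push button"),
        Node("Open", "push button"),
        Node("Donate", "push button"),
    ]),
])
lists = flatten_by_role(tree)
```

<p>With three push buttons in the flattened list, operands 00 04 wrap to index 1 (4 % 3), so select(lists, "push button", 0, 4) returns the Open button. In a real deployment the lists would be rebuilt after every dispatched interaction, as described above.</p></div>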
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. IMPLEMENTATION</head><p>We implement GUIFUZZ++ atop state-of-the-art grey-box fuzzer AFL++ <ref type="bibr">[10]</ref> v4.21c, allowing GUIFUZZ++ to inherit AFL++'s rich ecosystem of fuzzing enhancements. Below, we detail the technical integration of GUIFUZZ++'s core components within the AFL++ platform.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Fuzzing Process Execution</head><p>Like all AFL-based fuzzers, GUIFUZZ++ resets the target process for new test cases via forkserver-based process cloning <ref type="bibr">[53]</ref>, enabling higher fuzzing throughput than slower from-scratch process creation <ref type="bibr">[46]</ref>. Beyond executing the target, we configure AFL++ to additionally launch our GUI Interaction Interpreter (&#167; III-A), which we implement via Python's PyAutoGUI framework <ref type="bibr">[49]</ref>. All other fuzzer execution steps (code coverage collection, crash recognition, and inter-process communication) are left as-is in AFL++'s core, underscoring GUIFUZZ++'s lightweight design. In total, our changes to AFL++'s core span just eight lines of code.</p><p>Importantly, GUIFUZZ++ supports any AFL-compatible bug oracle (e.g., flagging error-revealing process signals like SIGFPE) or sanitizer (e.g., AddressSanitizer <ref type="bibr">[43]</ref>), following the same compile-time instrumentation and target preparation steps as conventional non-GUI fuzzing workflows <ref type="bibr">[10]</ref>, <ref type="bibr">[44]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Test Case Mutation and Trimming</head><p>Since AFL++'s <ref type="bibr">[10]</ref> in-house mutators modify test cases at the bit and byte level, they will overwhelmingly break the three-byte structure of our GUI operations (&#167; III-A), leading to invalid interactions and fruitless fuzzing. To address this, we implement a GUI-aware mutator, ensuring that mutations (e.g., modifications, insertions, and splices) occur strictly on well-formed GUI interactions. We further extend this to test case trimming, ensuring that incremental deletions similarly preserve GUI operation structures. As GUIFUZZ++'s mutation and trimming are both implemented via AFL++'s Custom Mutator API <ref type="bibr">[10]</ref>, no changes are needed to AFL++ itself.</p></div>
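<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate what structure-preserving mutation can look like, the sketch below treats a test case as a sequence of whole three-byte operations and only ever modifies, inserts, or splices at operation boundaries. It is an illustrative approximation, not the authors' mutator: the helpers split_ops, random_op, and mutate_ops are hypothetical, the choice of mutations is simplified, and the init and fuzz hooks follow the shape of AFL++'s Python custom mutator interface as we understand it.</p>

```python
import random

OP_SIZE = 3  # one GUI operation occupies three bytes, as described above


def split_ops(buf):
    """Split a test case into whole three-byte operations, dropping any tail."""
    usable = len(buf) - len(buf) % OP_SIZE
    return [bytes(buf[i:i + OP_SIZE]) for i in range(0, usable, OP_SIZE)]


def random_op(rng):
    """Generate one random, well-formed three-byte operation."""
    return bytes(rng.randrange(256) for _ in range(OP_SIZE))


def mutate_ops(ops, other_ops, rng, max_ops):
    """Apply one mutation that keeps every operation boundary intact."""
    ops = list(ops)
    other_ops = list(other_ops)
    choice = rng.choice(["modify", "insert", "splice"])
    if choice == "modify" and ops:
        # Replace one whole operation instead of flipping isolated bits.
        ops[rng.randrange(len(ops))] = random_op(rng)
    elif choice == "insert" or not ops:
        ops.insert(rng.randint(0, len(ops)), random_op(rng))
    else:
        # Splice: keep a prefix of this test case, append a suffix of another.
        cut = rng.randint(0, len(ops))
        tail_start = rng.randint(0, len(other_ops))
        ops = ops[:cut] + other_ops[tail_start:]
    ops = ops[:max_ops]
    return ops if ops else [random_op(rng)]


_rng = random.Random()


def init(seed):
    """Custom mutator hook: seed the random number generator."""
    _rng.seed(seed)


def fuzz(buf, add_buf, max_size):
    """Custom mutator hook: return a mutated, operation-aligned test case."""
    max_ops = max(1, max_size // OP_SIZE)
    ops = mutate_ops(split_ops(buf), split_ops(add_buf), _rng, max_ops)
    return bytearray(b"".join(ops))
```

<p>A trimming routine could reuse split_ops in the same way, removing whole operations one at a time so that every candidate remains a valid sequence of GUI interactions.</p></div>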
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Supported Software and Desktop Platforms</head><p>While our current prototype of GUIFUZZ++ targets GUI software in Linux environments, we anticipate that it is portable to other platforms supported by both AFL++ and PyAutoGUI <ref type="bibr">[49]</ref>, such as macOS, and extensible to AFL-like fuzzers on platforms not directly supported by AFL++, such as WinAFL <ref type="bibr">[55]</ref> for Windows. We posit that the only platform-specific component of GUIFUZZ++ is the retrieval of the active window's PID and dimensions. Fortunately, nearly all modern OSes expose APIs for this functionality via their respective windowing subsystems (e.g., x11-utils on Linux <ref type="bibr">[14]</ref>, PyObjC/Quartz on macOS <ref type="bibr">[40]</ref>, and PyWin32 on Windows <ref type="bibr">[21]</ref>), so porting GUIFUZZ++ merely requires swapping out these components.</p><p>Although the accessibility framework driving GUIFUZZ++'s higher-precision fuzzing (&#167; III-C), AT-SPI <ref type="bibr">[13]</ref>, sees best support for GTK- <ref type="bibr">[12]</ref> and Qt-based <ref type="bibr">[6]</ref> GUIs, GUIFUZZ++ remains fully functional without it. This allows grey-box GUI fuzzing to be deployed across a wider range of targets, even in the absence of accessibility integration.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>V. EVALUATION</head><p>Our evaluation of GUIFUZZ++'s desktop GUI fuzzing capabilities is guided by the following fundamental questions:</p><p>Q1: How does GUIFUZZ++'s grey-box fuzzing compare to traditional black-box GUI fuzzing?</p><p>Q2: To what extent does GUIFUZZ++'s AT-SPI-enhanced interaction improve GUI fuzzing?</p><p>Q3: Is GUIFUZZ++ effective at finding new GUI-induced bugs in desktop GUI software?</p><p>Benchmarks: Table <ref type="table">IV</ref> shows our evaluation benchmarks. We evaluate GUIFUZZ++ on 12 open-source Linux-based GUI programs spanning a variety of application domains. To assess GUIFUZZ++'s support across today's diverse desktop GUI ecosystems, we include benchmarks spanning three distinct GUI development frameworks: Qt <ref type="bibr">[6]</ref>, GTK <ref type="bibr">[12]</ref>, and Xorg <ref type="bibr">[15]</ref>. We compile all applications with AFL++'s built-in source-level compilers (e.g., afl-clang-fast).</p></div>
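<div xmlns="http://www.tei-c.org/ns/1.0"><table><head>TABLE IV: Our desktop-based GUI fuzzing evaluation benchmarks.</head>
<row role="label"><cell>Program</cell><cell>Description</cell><cell>Base GUI</cell></row>
<row><cell>Dia <ref type="bibr">[27]</ref></cell><cell>Graphic Design</cell><cell>GTK</cell></row>
<row><cell>Glaxnimate <ref type="bibr">[3]</ref></cell><cell>Animation</cell><cell>Qt</cell></row>
<row><cell>KCalc <ref type="bibr">[29]</ref></cell><cell>Calculator</cell><cell>Qt</cell></row>
<row><cell>KolourPaint <ref type="bibr">[30]</ref></cell><cell>Image Editor</cell><cell>Qt</cell></row>
<row><cell>LabPlot <ref type="bibr">[31]</ref></cell><cell>Data Plotting</cell><cell>Qt</cell></row>
<row><cell>LibreCAD <ref type="bibr">[36]</ref></cell><cell>3-D Modeling</cell><cell>Qt</cell></row>
<row><cell>MATE-calc <ref type="bibr">[38]</ref></cell><cell>Calculator</cell><cell>GTK</cell></row>
<row><cell>PlotJuggler <ref type="bibr">[8]</ref></cell><cell>Data Plotting</cell><cell>Qt</cell></row>
<row><cell>QCAD <ref type="bibr">[41]</ref></cell><cell>3-D Modeling</cell><cell>Qt</cell></row>
<row><cell>Skrooge <ref type="bibr">[32]</ref></cell><cell>Finance</cell><cell>Qt</cell></row>
<row><cell>Umbrello <ref type="bibr">[33]</ref></cell><cell>UML Editor</cell><cell>Qt</cell></row>
<row><cell>XCalc <ref type="bibr">[52]</ref></cell><cell>Calculator</cell><cell>Xorg</cell></row>
</table><p>Experiment Setup &amp; Infrastructure: As there are zero fuzzers broadly supportive of desktop GUI software today, our evaluation seeks to understand the influence of GUIFUZZ++'s key constituent parts. Accordingly, we set up GUIFUZZ++ in four fundamental configurations: (C1) grey-box mode with AT-SPI, (C2) grey-box mode without AT-SPI, (C3) black-box mode with AT-SPI, and (C4) black-box mode without AT-SPI.</p><table><head>TABLE V: Per-configuration mean fuzzing speed (i.e., test case throughput), mean target code coverage (i.e., control-flow edges), and total manually-deduplicated bugs per benchmark. Bolded values indicate the best-performing configuration with respect to each evaluated metric per benchmark.</head>
<row role="label"><cell>Program</cell><cell cols="3">C1: Grey-box w/ AT-SPI</cell><cell cols="3">C2: Grey-box w/o AT-SPI</cell><cell cols="3">C3: Black-box w/ AT-SPI</cell><cell cols="3">C4: Black-box w/o AT-SPI</cell></row>
<row role="label"><cell/><cell>Speed</cell><cell>CodeCov</cell><cell>Bugs</cell><cell>Speed</cell><cell>CodeCov</cell><cell>Bugs</cell><cell>Speed</cell><cell>CodeCov</cell><cell>Bugs</cell><cell>Speed</cell><cell>CodeCov</cell><cell>Bugs</cell></row>
<row><cell>Dia</cell><cell>2491.2</cell><cell>12990.4</cell><cell>0</cell><cell>1631.6</cell><cell>12074.8</cell><cell>1</cell><cell>3154.0</cell><cell>12101.4</cell><cell>0</cell><cell>3047.0</cell><cell>11364.8</cell><cell>0</cell></row>
<row><cell>Glaxnimate</cell><cell>2562.6</cell><cell>34312.2</cell><cell>2</cell><cell>2850.2</cell><cell>30537.2</cell><cell>0</cell><cell>6625.6</cell><cell>30890.0</cell><cell>0</cell><cell>7374.0</cell><cell>29405.3</cell><cell>0</cell></row>
<row><cell>KCalc</cell><cell>2546.8</cell><cell>8013.8</cell><cell>1</cell><cell>2914.5</cell><cell>7616.4</cell><cell>1</cell><cell>7212.6</cell><cell>6625.0</cell><cell>1</cell><cell>6364.6</cell><cell>5473.8</cell><cell>2</cell></row>
<row><cell>KolourPaint</cell><cell>1859.8</cell><cell>5845.2</cell><cell>3</cell><cell>2019.4</cell><cell>4880.0</cell><cell>1</cell><cell>5579.4</cell><cell>5615.2</cell><cell>2</cell><cell>7595.8</cell><cell>4988.8</cell><cell>1</cell></row>
<row><cell>LabPlot</cell><cell>2190.0</cell><cell>25170.6</cell><cell>3</cell><cell>2588.8</cell><cell>25114.0</cell><cell>2</cell><cell>4647.8</cell><cell>26402.6</cell><cell>2</cell><cell>6037.0</cell><cell>27732.2</cell><cell>1</cell></row>
<row><cell>LibreCAD</cell><cell>2001.6</cell><cell>39613.6</cell><cell>1</cell><cell>2453.3</cell><cell>38703.3</cell><cell>1</cell><cell>6760.2</cell><cell>39076.8</cell><cell>0</cell><cell>8453.4</cell><cell>36529.4</cell><cell>0</cell></row>
<row><cell>MATE-calc</cell><cell>1989.0</cell><cell>1196.6</cell><cell>0</cell><cell>2421.0</cell><cell>1349.0</cell><cell>0</cell><cell>5542.8</cell><cell>1353.4</cell><cell>0</cell><cell>6490.0</cell><cell>1359.6</cell><cell>2</cell></row>
<row><cell>PlotJuggler</cell><cell>2411.2</cell><cell>17979.8</cell><cell>1</cell><cell>1968.2</cell><cell>16342.2</cell><cell>1</cell><cell>5389.2</cell><cell>17083.6</cell><cell>1</cell><cell>5697.6</cell><cell>16201.0</cell><cell>1</cell></row>
<row><cell>QCAD</cell><cell>2647.0</cell><cell>61598.0</cell><cell>0</cell><cell>3020.2</cell><cell>61568.4</cell><cell>1</cell><cell>7075.6</cell><cell>61734.2</cell><cell>0</cell><cell>6928.2</cell><cell>61626.4</cell><cell>0</cell></row>
<row><cell>Skrooge</cell><cell>1754.8</cell><cell>33794.2</cell><cell>0</cell><cell>1575.0</cell><cell>34052.6</cell><cell>0</cell><cell>6040.4</cell><cell>33550.4</cell><cell>0</cell><cell>7822.2</cell><cell>32642.4</cell><cell>0</cell></row>
<row><cell>Umbrello</cell><cell>2279.0</cell><cell>23431.0</cell><cell>5</cell><cell>3243.4</cell><cell>17900.8</cell><cell>3</cell><cell>6854.0</cell><cell>20335.8</cell><cell>1</cell><cell>8110.0</cell><cell>16693.6</cell><cell>2</cell></row>
<row><cell>XCalc</cell><cell>6344.8</cell><cell>441.8</cell><cell>2</cell><cell>5560.0</cell><cell>441.4</cell><cell>2</cell><cell>28078.8</cell><cell>440.6</cell><cell>2</cell><cell>27887.2</cell><cell>439.8</cell><cell>2</cell></row>
<row><cell>GEOMEAN:</cell><cell>2431.4</cell><cell>12079.5</cell><cell>1.9</cell><cell>2536.9</cell><cell>11409.2</cell><cell>1.3</cell><cell>6572</cell><cell>11635.3</cell><cell>1.4</cell><cell>7555.6</cell><cell>10961</cell><cell>1.5</cell></row>
</table><p>We seed all fuzzing campaigns with a single sequence of 33 randomly-generated GUI interactions. Following the fuzzing evaluation standard established by Klees et al. <ref type="bibr">[34]</ref>, we fuzz each benchmark for five 24-hour trials per configuration. We deploy all fuzzing experiments within 20 KVM-based Ubuntu virtual machines, run atop a 24-core Ubuntu 22.04 workstation with 64G RAM and an Intel i9-12900K CPU.</p><p>Post-processing Results: We evaluate all GUIFUZZ++ configurations on the following metrics: speed via test case throughput, depth via code coverage, and true GUI-induced bugs uncovered. We measure code coverage as control-flow edges via AFL++'s built-in afl-showmap tool <ref type="bibr">[10]</ref>, and further plot mean normalized coverage in Figure <ref type="figure">3</ref>. For bugs, we perform manual analysis to deduplicate fuzzer-found crashes into their per-configuration unique bugs. We calculate and report all means as geometric means.</p></div>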
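<div xmlns="http://www.tei-c.org/ns/1.0"><p>For reference, a geometric mean over a set of per-trial measurements can be computed as in the sketch below. The trial values are hypothetical and chosen only for illustration; they are not taken from the evaluation. How zero-valued entries (such as zero bug counts) are treated is not stated in the text, so this sketch simply rejects non-positive inputs.</p>

```python
import math


def geometric_mean(values):
    """Geometric mean computed as exp(mean(log(x))); inputs must be positive."""
    values = list(values)
    if not values or not min(values) > 0:
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-trial throughput values for one benchmark and configuration.
trial_speeds = [2000.0, 2500.0, 3125.0]
mean_speed = geometric_mean(trial_speeds)  # approximately 2500.0
```

<p>Compared with an arithmetic mean, the geometric mean is less dominated by a single unusually large value, which matters when benchmarks differ in scale by an order of magnitude.</p></div>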
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Q1: Grey-box vs. Black-box Desktop GUI Fuzzing</head><p>As GUI fuzzing has historically relied heavily on black-box testing <ref type="bibr">[24]</ref>, which eschews any target-level feedback about test case significance (e.g., code coverage), we evaluate GUIFUZZ++'s performance in both traditional black-box and more modern grey-box (i.e., coverage-guided <ref type="bibr">[10]</ref>) fuzzing modes. Since GUIFUZZ++'s underlying fuzzer, AFL++, is inherently not a black-box fuzzer, we modify it to emulate black-box behavior by continuing to save all coverage-increasing test cases on disk, but refraining from actually using that coverage information during the fuzzing process. This strategy enables us to accurately measure code coverage even in the absence of coverage-guided exploration. Beyond code coverage, we also compare both modes' execution speeds (measured by test case throughput) and their total fuzzer-reported unique crashes. Table <ref type="table">V</ref> summarizes these results side-by-side per benchmark.</p><p>Outcomes: As shown in Table V, GUIFUZZ++'s black-box configurations (C3 and C4) consistently achieve the highest test case throughput across all benchmarks, with grey-box configurations (C1 and C2) never surpassing them in speed. However, this performance advantage does not translate to effectiveness: black-box fuzzing yields lower code coverage (outperformed in 9 of 12 benchmarks) and fewer bugs, with grey-box mode discovering more issues in 7 of 12 applications. Interestingly, we observe that the few cases where black-box mode excels in bug discovery involve simpler applications with more constrained interaction spaces: namely, the MATE-calc <ref type="bibr">[38]</ref> and KCalc <ref type="bibr">[29]</ref> calculators. These results suggest that black-box fuzzing's speed is best suited for lightweight GUIs, whereas grey-box fuzzing is better equipped to navigate complex, bug-prone paths in larger applications. Overall, by supporting both modes, GUIFUZZ++ enables flexible, target-tailored fuzzing strategies that adapt to the structure and complexity of today's diverse desktop GUI applications.</p><p>Q1: GUIFUZZ++ adapts to target GUI complexity, facilitating black-box speed for simple apps, and grey-box precision for deeper bug discovery in more complex ones.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Q2: Impact of GUIFUZZ++'s Enhanced GUI Interaction</head><p>To determine whether GUIFUZZ++'s AT-SPI-enhanced GUI interaction (&#167; III-C) improves desktop GUI fuzzing effectiveness, we further evaluate GUIFUZZ++ both with and without AT-SPI enabled. As in &#167; V-A, the test case throughput, code coverage, and fuzzer-reported unique crashes per configuration-benchmark pairing are shown in Table <ref type="table">V</ref>.</p><p>Outcomes: As Table <ref type="table">V</ref> shows, GUIFUZZ++'s AT-SPI-assisted configurations (C1 and C3) face significant runtime overhead, being outperformed in throughput by their AT-SPI-agnostic counterparts (C2 and C4) on 8 of 12 benchmarks. We posit this is expected, as AT-SPI integration incurs additional costs from both rendering the accessibility tree as well as invoking GUIFUZZ++'s targeted GUI-aware mutators.</p><p>Despite this cost, AT-SPI-assisted fuzzing proves valuable: it achieves higher code coverage in 9 of 12 applications and, in many cases, sustains consistently higher coverage throughout fuzzing (Figure <ref type="figure">3</ref>), indicating that richer GUI introspection enables exploration of deeper, more complex interaction paths that are otherwise missed by GUI-agnostic modes. This benefit extends to bug discovery as well: when paired with grey-box guidance, AT-SPI-enhanced fuzzing finds the most bugs on four benchmarks, surpassing the next-best configuration, grey-box without AT-SPI (C2), which leads on only two. Altogether, these results demonstrate that while AT-SPI slows execution, it greatly boosts both code coverage and bug-finding in interface-rich applications, solidifying GUIFUZZ++'s effectiveness in GUI-focused fuzzing campaigns.</p><p>Q2: GUIFUZZ++'s AT-SPI integration trades raw speed for deeper exploration, unearthing more bugs and behaviors invisible to GUI-agnostic desktop GUI fuzzing.</p></div>
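<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Q3: Discovery of GUI-induced Bugs in Desktop Apps</head><p>Lastly, we manually deduplicate all fuzzer-reported unique crashes per each of GUIFUZZ++'s evaluated configurations, obtaining the final set of real-world bugs uncovered by GUIFUZZ++. We follow standard practice in fuzzing literature for crash deduplication <ref type="bibr">[34]</ref>, employing AddressSanitizer-based <ref type="bibr">[43]</ref> stack-trace bucketing. Table VI lists all GUI-induced bugs found in our evaluation, alongside their revealing GUIFUZZ++ configuration and current reporting status.</p><p>Outcomes: As shown in Table <ref type="table">VI</ref>, GUIFUZZ++ uncovers a total of 25 GUI-induced bugs across 11 real-world desktop GUI applications, with 23 being previously-unreported GUI-induced errors. Figure <ref type="figure">4</ref> breaks down the distribution of unique bugs across different fuzzing configurations, highlighting that each configuration surfaces distinct issues and underscoring the importance of GUIFUZZ++'s support for tailoring fuzzing strategies to the characteristics of the target application. In the remainder of this section, we present several representative case studies that showcase the depth and diversity of bugs exposed by GUIFUZZ++.</p><table><head>TABLE VI: All GUI-induced bugs uncovered by GUIFUZZ++, alongside a brief description of their bug-triggering semantics and reporting statuses ( = fixed by developers, &#334; = confirmed and waiting fixing, &#557; = pending).</head>
<row role="label"><cell>ID</cell><cell>Program</cell><cell>Bug Type</cell><cell>Brief Desc.</cell><cell>New</cell><cell>Status</cell></row>
<row><cell>01</cell><cell>Dia</cell><cell>Bad Free</cell><cell>Color area (transient)</cell><cell>7</cell><cell/></row>
<row><cell>02</cell><cell>Glaxnimate</cell><cell>Segfault</cell><cell>Improper closing</cell><cell>7</cell><cell>&#557;</cell></row>
<row><cell>03</cell><cell>Glaxnimate</cell><cell>Segfault</cell><cell>Invalid cut/pastes</cell><cell>7</cell><cell>&#557;</cell></row>
<row><cell>04</cell><cell>KCalc</cell><cell>Invalid Ptr</cell><cell>Inserting open parent</cell><cell>7</cell><cell>&#334;</cell></row>
<row><cell>05</cell><cell>KCalc</cell><cell>Segfault</cell><cell>Left bit shift overflow</cell><cell/><cell/></row>
<row><cell>06</cell><cell>KolourPaint</cell><cell>Heap UAF</cell><cell>Specific tools with undo</cell><cell>7</cell><cell>&#334;</cell></row>
<row><cell>07</cell><cell>KolourPaint</cell><cell>Segfault</cell><cell>Buggy bug report menu</cell><cell>7</cell><cell>&#557;</cell></row>
<row><cell>08</cell><cell>KolourPaint</cell><cell>Segfault</cell><cell>Shortcut settings dropdowns</cell><cell>7</cell><cell>&#557;</cell></row>
<row><cell>09</cell><cell>KolourPaint</cell><cell>Segfault</cell><cell>Print preview zooming</cell><cell>7</cell><cell>&#557;</cell></row>
<row><cell>10</cell><cell>LabPlot</cell><cell>Invalid Ptr</cell><cell>Invalid column insert</cell><cell>7</cell><cell/></row>
<row><cell>11</cell><cell>LabPlot</cell><cell>Heap UAF</cell><cell>Pinning spreadsheets</cell><cell>7</cell><cell/></row>
<row><cell>12</cell><cell>LabPlot</cell><cell>Heap UAF</cell><cell>Pinning matrices</cell><cell>7</cell><cell/></row>
<row><cell>13</cell><cell>LibreCAD</cell><cell>Heap UAF</cell><cell>Invalid plugin usage</cell><cell>7</cell><cell/></row>
<row><cell>14</cell><cell>LibreCAD</cell><cell>Heap UAF</cell><cell>Consecutive points</cell><cell>7</cell><cell/></row>
<row><cell>15</cell><cell>MATE-calc</cell><cell>Bad Free</cell><cell>Invalid square roots</cell><cell/><cell/></row>
<row><cell>16</cell><cell>MATE-calc</cell><cell>Bad Free</cell><cell>Empty inverse trig functions</cell><cell>7</cell><cell/></row>
<row><cell>17</cell><cell>PlotJuggler</cell><cell>Segfault</cell><cell>Quickly close button docker</cell><cell/><cell>&#557;</cell></row>
<row><cell>18</cell><cell>QCAD</cell><cell>Segfault</cell><cell>Tool use in multiple sheets</cell><cell>7</cell><cell>&#557;</cell></row>
<row><cell>19</cell><cell>Umbrello</cell><cell>Segfault</cell><cell>Birds eye after discard</cell><cell>7</cell><cell/></row>
<row><cell>20</cell><cell>Umbrello</cell><cell>Heap UAF</cell><cell>Multiple sequence diagrams</cell><cell>7</cell><cell/></row>
<row><cell>21</cell><cell>Umbrello</cell><cell>Heap UAF</cell><cell>Undo after discard</cell><cell>7</cell><cell/></row>
<row><cell>22</cell><cell>Umbrello</cell><cell>Segfault</cell><cell>Print Preview after discard</cell><cell>7</cell><cell/></row>
<row><cell>23</cell><cell>Umbrello</cell><cell>Segfault</cell><cell>Cut on empty diagram</cell><cell>7</cell><cell/></row>
<row><cell>24</cell><cell>XCalc</cell><cell>FPE</cell><cell>Invalid modulus</cell><cell>7</cell><cell>&#557;</cell></row>
<row><cell>25</cell><cell>XCalc</cell><cell>FPE</cell><cell>Invalid modulus</cell><cell>7</cell><cell>&#557;</cell></row>
</table><p>For brevity, we classify bugs #04 and #10 as "Invalid Pointer" errors, and follow AddressSanitizer's <ref type="bibr">[43]</ref> crash taxonomy for all remaining bugs.</p><p>Case Study 1: LibreCAD (bug #13). 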
GUIFUZZ++ uncovered a heap use-after-free in LibreCAD [36] triggered by the following high-level operations: selecting the "Same Properties" plugin from the Plugins menu, placing an angle on the canvas using the Angle tool, and then beginning a two-point line with the Two Points tool, spanning a total of eight distinct GUI interactions (Figure <ref type="figure">5</ref>).</p><p>This bug was confirmed and patched by developers on the same day it was reported. The root cause lies in the "Same Properties" plugin, which attempts to copy attributes from one diagram entity to another. A call to the plugin's finish() method triggers a segfault, as it prematurely deallocates objects still in use by the partially drawn two-point line.</p><p>Because this line is incomplete, the plugin encounters invalid state during cleanup. Notably, this bug was only exposed under GUIFUZZ++'s AT-SPI-enhanced configuration. This is likely due to the need for precise sequencing of submenu interactions, such as selecting transient plugin entries, which are rarely triggered through random UI exploration alone.</p><p>Case Study 2: LabPlot (bug #11). GUIFUZZ++ finds another use-after-free in LabPlot <ref type="bibr">[31]</ref> triggered by the following four-sequence interaction (Figure <ref type="figure">6</ref>): clicking Pin Active Tab, opening the Spreadsheet menu, and selecting column #2 in the data browser. This bug, confirmed by the LabPlot developers one week after reporting, stems from incorrect usage of the Qt <ref type="bibr">[6]</ref> docking API. When a dock widget is pinned, it is removed from its parent QDockArea. If this removal leaves the dock area empty, the area itself is destroyed. Later, when interacting with the spreadsheet column, LabPlot attempts to access this now-destroyed dock area, resulting in a segmentation fault. 
This issue was uncovered only in GUIFUZZ++'s grey-box configuration, suggesting that coverage-guided exploration was necessary to navigate the specific conditions leading to the crash.</p><p>Case   Each of these operations assumes a valid diagram is present, triggering distinct SIGSEGV faults when dereferencing the null pointer. The exact crash varies by action, revealing a broader flaw in Umbrello's state management following diagram disposal. While these bugs were discovered across all configurations, a greater number surfaced during grey-box runs (Table <ref type="table">V</ref>), suggesting that coverage-guided exploration is more effective at uncovering deeper, state-dependent bugs.</p><p>Developer Responses: While several bugs remain under review at the time of writing (Table VI), many were acknowledged and confirmed by developers within days to a week of reporting, underscoring their reproducibility and practical significance. Collectively, these case studies showcase GUIFUZZ++'s effectiveness at surfacing real, high-impact GUI bugs in desktop applications, many of which stem from complex GUI interaction sequences that GUIFUZZ++'s supported fuzzing modes are uniquely equipped to explore.</p><p>Q3: GUIFUZZ++'s structured GUI interactions and application introspection enable its discovery of non-trivial GUI bugs, broadening the reach of fuzzing into the rich, event-driven behaviors of modern desktop applications.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. DISCUSSION &amp; THREATS TO VALIDITY</head><p>In the following, we weigh several limitations of our prototype implementation of GUIFUZZ++.</p><p>Improving Runtime Throughput: While conventional grey-box application fuzzing has seen numerous advancements for increasing speed toward accelerating bug discovery <ref type="bibr">[46]</ref>, <ref type="bibr">[56]</ref>, GUI fuzzing is inescapably much slower due to GUI-specific bottlenecks, namely the high resource cost of rendering the GUI itself. Though some application-specific runtime bottlenecks seem mitigable (e.g., on-launch splash screens), we posit that non-trivial engineering is needed to build a generalizable way of (1) pinpointing and (2) successfully cutting out these components without breaking the overall application. As such, we leave exploring potential optimizations for desktop GUI fuzzing to future work.</p><p>Supporting Other Interactions: Our prototype's click-oriented GUI interactions (e.g., Table <ref type="table">III</ref>) currently focus on left-clicks, though extending GUIFUZZ++ to right-clicks is straightforward. In practice, however, we observe that right-click menus largely expose redundant functionality (e.g., Cut, Copy, Paste) that is already accessible through left-click-navigable menus (e.g., Edit &#8594; Cut), so omitting right-clicking does not meaningfully impact most GUI fuzzing. Likewise, while GUIFUZZ++'s key-pressing interactions currently issue just single-key presses, extending support to multi-key combinations requires only minimal modification. We plan to explore these in future extensions to GUIFUZZ++.</p><p>Generality to Other Platforms: Although GUIFUZZ++'s current prototype remains Linux-specific, many of its components are already cross-platform, or are easily swapped out with their platform-specific counterparts, as we discuss in &#167; IV-C. 
Importantly, GUIFUZZ++ specifically targets desktop-based OSes, and intentionally avoids mobile OSes (e.g., Android, iOS) or tablet ones (e.g., iPadOS), as we expect that these platforms' already-mature GUI fuzzing tools <ref type="bibr">[22]</ref>, <ref type="bibr">[24]</ref> remain a better fit for their respective GUI app ecosystems. While we do plan to explore GUIFUZZ++'s porting to other desktop platforms like macOS and Windows, we leave this engineering and requisite re-evaluation to future work.</p><p>Generality to Other Applications: To mitigate risk of bias and overfitting, we evaluate GUIFUZZ++ across a wide range of targets (Table <ref type="table">IV</ref>) spanning diverse types of GUI applications: calculators, multimedia, financial, data analysis, and computer modeling. Additionally, these applications are built atop three of today's most popular GUI development frameworks (Qt <ref type="bibr">[6]</ref>, GTK <ref type="bibr">[12]</ref>, and Xorg <ref type="bibr">[15]</ref>), further underscoring the generality of GUIFUZZ++'s fundamental approach across diverse ecosystems of desktop GUI software.</p><p>Since GUIFUZZ++ builds on existing fuzzing frameworks <ref type="bibr">[10]</ref>, it can be applied to closed-source GUI binaries as well (e.g., Tesla's Qt-based infotainment software <ref type="bibr">[51]</ref>), provided that the base fuzzer and underlying program instrumentation support them. However, we expect that certain GUIFUZZ++ modes (e.g., grey-box mode, AT-SPI <ref type="bibr">[13]</ref>) will likely require additional adaptation to work on closed-source targets.</p><p>Further, while we foresee many opportunities in extending GUIFUZZ++ to other GUI-based application domains beyond our present evaluation set (such as video games, web browsers, or aircraft avionics software), many domain-specific orthogonal challenges need solving first. 
For example, web browsers are large, multi-threaded codebases, yet multi-threading often deteriorates the stability of general grey-box fuzzing <ref type="bibr">[5]</ref>. Similarly, video games likely require many new game-tailored bug oracles for non-crashing bugs (e.g., glitching into Super Mario's "Minus World" <ref type="bibr">[11]</ref>, <ref type="bibr">[48]</ref>). Others, such as avionics software, call for porting from their niche platforms (e.g., PowerPC) to more tool-friendly ones like Linux. Nonetheless, with solutions to these obstacles, we anticipate that GUIFUZZ++ and future follow-on solutions will be ready to fuzz these and other important GUI domains.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VII. CONCLUSION</head><p>In this work, we introduce GUIFUZZ++, a general-purpose framework that bridges the longstanding disconnect between modern grey-box fuzzing and desktop GUI software. While traditional fuzzers operate primarily over byte-level or spatial input domains, GUIFUZZ++ formalizes GUI interactions as operand-driven instructions, systematically translating random bytes into actionable, logic-exercising GUI events. Through a minimal yet expressive core instruction set and integration with accessibility frameworks such as AT-SPI, GUIFUZZ++ supports both pixel-based and element-aware interaction modes, along with built-in mechanisms to isolate fuzzing to the intended application context. Altogether, GUIFUZZ++ enables mainstream fuzzers like AFL++ to be seamlessly extended to GUI fuzzing with negligible intervention, providing the first scalable platform for effectively uncovering GUI-induced defects in today's rapidly evolving desktop software ecosystems, as demonstrated by the 23 previously-unknown bugs it has discovered thus far.</p></div></body>
		</text>
</TEI>
