<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Interference Management For Next-Generation Dynamic Spectrum Sharing</title></titleStmt>
			<publicationStmt>
				<publisher>Washington University in St. Louis</publisher>
				<date>04/28/2025</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10648473</idno>
					<idno type="doi">10.7936/tf7j-td52</idno>
					
					<author>Jie Wang</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[base models and spatial multipath effects for the modeling components in CELF. It also verifies analytically and numerically the explainability of CELF, and validates its robust performance in complex environments. The FDMonitor system uses a bidirectional coupler, a two-port receiver, and a new source separation algorithm to simultaneously and adaptively estimate the transmitted signal and the signal incident on the antenna. FDMonitor has been running on POWDER, a large-scale wireless experimental testbed, since 2021, monitoring 19 SDR platforms accessible by outside experimenters. Results show that it achieves a low false alarm rate over 27 months of operation. Together, these solutions, supported by statistical models and extensive experimental validation, offer scalable, efficient, and trustworthy interference management strategies to enhance spectrum efficiency and boost openness in wireless applications like radio dynamic zones and private cellular networks.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>List of Figures</head><p>Figure 2.7: Variance reduction vs. CELF's hyperparameters on the indoor dataset.</p><p>Figure 4.5: The custom bidirectional coupler. While extremely wideband and low loss, the isolation between coupled ports can be as low as 10 dB.</p><p>List of Tables</p><p>Table 2.1: Specifications for the indoor and outdoor datasets.</p><p>Table 2.2: Model hyperparameters for CELF.</p><p>Table 2.3: Running time comparison for training and testing among Okumura-Hata, ML models and the CELF algorithm.</p><p>Table 2.4: Data variance when the portable transmitter is stationary or rotating with a radius &#8593; 1&#969; f and the maximum variance reduction space for CELF. The sum of the data variance approximates the lowest possible modeling error variance.</p><p>Table 3.1: Specifications for the indoor and three outdoor datasets. Note that the outdoor SLC1 dataset, due to uncalibrated receivers, is treated as 4 different subdatasets.</p><p>Table 3.2: The four subdatasets from the SLC1 outdoor dataset due to uncalibrated receivers.</p><p>Table 3.3: Model hyperparameters of CELF for the indoor and SLC1-Rooftop dataset.</p><p>Table 3.4: Running time comparison for training and testing among the three ML models and the CELF algorithm. We use the training time for the entire training set to analyze CELF's loss field learning efficiency. Instead, the testing time in &#181;s/link is used. It is to evaluate how fast CELF can predict channel loss for one unseen arbitrary link.</p><p>Table 3.5: The modeling error variances of TIREM and the log-distance path loss as the base models and variance reductions on outdoor test datasets via CELF.</p><p>Table 4.1: TTR and IIR for signals of three modulation types. SMC increases signal isolation in CW scenarios only. FDMonitor provides more isolation across all signal types.</p><p>Table 4.2: FDMonitor alert accuracy during continuous monitoring of 19 shared SDR platforms for 27 months since 2021.</p><p>Finishing this thesis took a lot of effort, and I could not have done it without the help and encouragement of so many people. It is hard to know where to start, but I will try to thank all the people who made this possible.</p><p>First, I am deeply grateful to my advisor, Professor Neal Patwari, for his steady guidance and belief in me. He is always patient and encouraging. His feedback was always spot-on, and he pushed me to dig deeper into my ideas while keeping me on track. Working with him has been a real privilege, and I have learned so much from his knowledge and approach. My collaborators and friends, Meles G. Weldegebriel, Frost Mitchell, Alex Orange, Leigh Stoller, Gary Wong, Professor Sneha Kumar Kasera, and Professor Jacobus Van der Merwe, deserve a huge thank you as well. Each of you brought something special to this thesis, whether it was technical know-how, creative solutions, or just keeping the momentum going. Your skills, ideas, and energy made this work stronger, and I am proud to have been part of the team.</p><p>The flux research team for the POWDER testbed also deserves a lot of credit. Letting me test the proposed systems and models on a real-world platform was a game-changer, and your willingness to troubleshoot, share insights, and make it work was amazing. 
This research would not have gotten off the ground without that practical testing, so thank you for opening that door.</p><p>On a personal note, I want to thank my family and friends for sticking by me through this long process. My parents have been my foundation, always there with love, advice, and a reminder to keep going. My friends, especially those who put up with my late-night rants about my research or dragged me out for a break when I needed it, kept me grounded and smiling. Most importantly, I want to thank my husband, Jonathan Gornet. He has been my biggest supporter, listening when I needed to vent, looking after me when I was sick, and believing in me even when I doubted myself. His patience and kindness carried me through, and I am so lucky to have met him during this journey.</p><p>Finally, I want to thank my committee members for their time and wisdom. Your thoughtful suggestions pushed me to refine my approach and see things from new angles. Your encouragement and big-picture perspective kept me motivated and focused on why this matters. Each of you made this thesis stronger, and I am truly grateful for your support throughout this process.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Jie Wang</head><p>Washington University in St. Louis, May 2025</p><p>To deadlines, the real MVP of my PhD rollercoaster.</p><p>ABSTRACT OF THE DISSERTATION</p><p>Interference Management For Next-Generation Dynamic Spectrum Sharing, by Jie Wang. Doctor of Philosophy in Electrical Engineering, Washington University in St. Louis, 2025. Professor Neal Patwari, Chair</p><p>The exponential growth of wireless devices and bandwidth-intensive applications has intensified the demand for efficient spectrum utilization, exposing the limitations of traditional static spectrum allocation schemes. Next-generation spectrum sharing has emerged as a transformative solution to enhance spectrum efficiency by enabling multiple wireless systems to coexist opportunistically in the same frequency band and geographic area. However, such dynamism introduces significant challenges, particularly in interference management, which is critical to enabling reliable coexistence, ensuring access priority, and building mutual trust across diverse applications.</p><p>This thesis advances interference management by developing techniques for interference prediction, monitoring, and control in dynamic spectrum sharing environments. Key contributions include the Channel Estimation via Loss Field (CELF) model for accurate and rapid channel loss prediction, the augmented CELF model to enhance explainability and robustness under uncertainty, and a Full-Duplex spectrum Monitoring system (FDMonitor) developed and deployed on experimental testbeds. The CELF work uses channel loss measurements from a deployed network area and a Bayesian linear regression method to estimate a site-specific loss field for the area. Real-world indoor and outdoor datasets validate that CELF is more accurate than common machine learning (ML) benchmarks and is less computationally complex to train. Continued work on CELF explores different base models and spatial multipath effects for the modeling components in CELF.</p><p>Chapter 1 Introduction</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1">Motivation</head><p>The rapid increase in data traffic due to wireless device expansion and new bandwidth-intensive applications has driven a surge of technology innovations to make spectrum use more dynamic and efficient <ref type="bibr">[103,</ref><ref type="bibr">62]</ref>. Traditional spectrum allocation schemes assign this critical, scarce resource exclusively and statically nationwide, which has resulted in temporal and geographical underutilization of spectrum <ref type="bibr">[7]</ref>. This inefficiency further hinders next-generation wireless networks from being more open, flexible, and intelligent, as specific frequency bands are exclusively assigned to fixed services. As a result, dynamic spectrum sharing has been discussed as a promising solution and is identified as a fundamental approach in the national spectrum strategy to meet the growing demand for spectrum <ref type="bibr">[78,</ref><ref type="bibr">102]</ref>.</p><p>Dynamic spectrum sharing aims to maximize spectrum utilization by enabling two or more wireless systems to operate opportunistically in the same frequency band and the same geographic area without causing interference to other users <ref type="bibr">[102]</ref>. Interference in this thesis refers to unwanted co-channel signals, intentional or unintentional, that arrive at the intended wireless communication link from other transmitters. Secondary users with a lower access priority can use a shared band when it is unoccupied but have to adjust the radio frequency (RF) parameters of ongoing transmissions, e.g., operating channel or transmit power, in response to possible interference to primary users.</p><p>Commercial spectrum sharing use cases include the 3.5 GHz band <ref type="bibr">[40]</ref> and the 6 GHz band <ref type="bibr">[39]</ref>. In the 3.5 GHz band, known as the Citizens Broadband Radio Service (CBRS) band, incumbents such as Department of Defense naval radars have the highest access priority and are guaranteed channel access at any time. 
They are protected against interference via a Spectrum Access System (SAS), an automated frequency coordinator that senses incumbent activities in real time and frees or suspends shared channel access accordingly <ref type="bibr">[112]</ref>. The 6 GHz band, i.e., 5.925-7.125 GHz, is allocated by the Federal Communications Commission (FCC) for sharing between license-exempt operators, e.g., Wi-Fi, and incumbents, e.g., fixed microwave links and satellite services <ref type="bibr">[42]</ref>. Unlike the CBRS band, the 6 GHz band has no middle-tier licensing market for short-term spectrum leases. Incumbents in this band are protected by restricting unlicensed devices to be either (1) low power and indoor or (2) standard power but coordinated via an Automatic Frequency Control (AFC) system <ref type="bibr">[83]</ref>. An AFC is a database-driven approach that, given an unlicensed device's location, assigns permissible channels and maximum power levels on a daily basis <ref type="bibr">[47]</ref>.</p><p>Joint industrial and academic efforts have shown that dynamic spectrum sharing is a transformative enabler of many next-generation wireless systems by unlocking flexible and cost-effective spectrum use. Examples include private cellular networks <ref type="bibr">[5]</ref> and Radio Dynamic Zones (RDZs) <ref type="bibr">[27]</ref>. A private cellular network is a dedicated local network that uses 5G-and-beyond technologies to support an enterprise's specific needs for security, control, and latency. Dynamic spectrum sharing supports the private wireless industry by providing affordable, flexible access to shared spectrum, without the high cost and complexity of exclusive licenses <ref type="bibr">[135]</ref>. 
The other example, an RDZ, is envisioned as a geographic area where radio spectrum is dynamically shared in real time across various active and passive wireless systems in a controlled manner <ref type="bibr">[144]</ref>. Dynamic spectrum sharing is therefore the foundation of RDZ design, given its focus on spectrum coexistence.</p><p>Economically, dynamic spectrum sharing enhances the next-generation wireless ecosystem by monetizing underutilized resources and fostering a more competitive and innovative market <ref type="bibr">[28]</ref>. Private 5G networks are widely implemented on the shared CBRS band, which significantly reduces operational costs relative to exclusive spectrum licenses and increases network reliability over traditional unlicensed bands <ref type="bibr">[135]</ref>. Wi-Fi 6E and Wi-Fi 7 systems, operating on the newly shared 6 GHz band, greatly expand wireless connectivity and mitigate congestion in existing unlicensed spectrum including the 2.4 GHz and 5 GHz bands <ref type="bibr">[83]</ref>. These applications show that dynamic spectrum sharing is essential for breaking down spectrum barriers, driving competition, and creating low-cost opportunities for market innovation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1.1">Private Cellular Networks</head><p>Private cellular networks are local area networks built on cellular technologies like Long-Term Evolution (LTE) and 5G to meet the unique needs of enterprises for performance, privacy, and control. Unlike public cellular networks, which serve broad consumer bases over wide areas, private cellular networks are owned and managed by individual organizations within a specific geographic area such as an airport or a hospital <ref type="bibr">[79]</ref>. Their robust coverage and reliable service also make private cellular networks an ideal solution for environments where public networks are insufficient, such as mining sites, distribution centers, and remote warehouses.</p><p>One of the core advancements in private cellular networks is the use of shared spectrum, particularly the CBRS band <ref type="bibr">[99]</ref>. Dynamic sharing of the CBRS band enables enterprises to deploy a private cellular network without the time-consuming and costly acquisition of exclusive licenses. A network can access the band either via a second-tier Priority Access License (PAL) or, for free, via General Authorized Access (GAA). Such cost-effectiveness makes private cellular networks easier to build and operate for enterprises of varying sizes.</p><p>In contrast to Wi-Fi systems operating in unlicensed spectrum, private cellular networks in the CBRS band deliver superior link reliability and enhanced data privacy <ref type="bibr">[26]</ref>. The CBRS sharing framework mandates that all devices register with a SAS prior to access, which boosts a private network's resilience against interference. Meanwhile, unlicensed bands, such as the 2.4 GHz, 5 GHz, and 6 GHz bands, lack the same level of regulatory structure as CBRS, which increases the risk of interference in dense environments. 
Furthermore, the use of cellular standards, including robust authentication and built-in security features, provides an additional layer of security and privacy that is typically not available in Wi-Fi systems.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1.2">Radio Dynamic Zones</head><p>RDZs are geographically defined areas within which spectrum resources are dynamically shared in a controlled manner across time, space, and frequency to enable advanced wireless applications and services <ref type="bibr">[70]</ref>. RDZs are designed to support a wide range of use cases, from space-air-sea communication to public safety and defense <ref type="bibr">[27]</ref>. By leveraging dynamic spectrum sharing, RDZs provide a flexible and efficient framework for managing spectrum resources in complex and dynamic environments.</p><p>RDZs play a critical role in next-generation wireless applications that require low latency, massive connectivity, and high mobility. For example, in smart cities, RDZs can support a variety of Internet of Things (IoT) devices, from traffic sensors and smart meters to autonomous vehicles and drones <ref type="bibr">[76]</ref>. By dynamically managing spectrum resources, RDZs allow these devices to operate adaptively while keeping channel congestion low. In mission-critical situations such as natural disasters, communication networks often become congested, making it difficult for rescue forces to coordinate their efforts. RDZs can dynamically allocate spectrum resources to public safety networks so that first responders can access more wireless channels when needed <ref type="bibr">[111]</ref>.</p><p>A critical component of RDZs is the Zone Management System (ZMS). It acts as the digital twin intelligence for managing spectrum resources within the zone and protecting regular wireless systems outside the zone <ref type="bibr">[36]</ref>. Various players in an RDZ, including spectrum consumers and spectrum sensors, interact with the ZMS in real time for spectrum policy compliance. 
ZMS functionality is realized via programmable interfaces, which support a diversity of options for interference rules, sharing fairness, and user priority <ref type="bibr">[67]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2">Problem Statement</head><p>Dynamic spectrum sharing, despite offering huge benefits to wireless innovation and numerous vertical applications, presents a core challenge that must be effectively addressed: interference management for secondary users. The coexistence of multiple secondary networks in shared spectrum environments significantly increases the risk of interference, which can degrade network performance and compromise the Quality of Service (QoS) for critical applications <ref type="bibr">[17]</ref>. The aforementioned approaches for interference management, including SAS for the 3.5 GHz band and AFC for the 6 GHz band, only protect primary users from secondary users, as shown in Figure <ref type="figure">1</ref>.3. SAS and AFC do not protect secondary users against interference from each other and thus leave GAA users in the CBRS band and unlicensed devices in the 6 GHz band vulnerable to mutual interference <ref type="bibr">[123,</ref><ref type="bibr">56]</ref>. Reports have shown that there are currently over 400,000 active CBRS base station devices (CBSDs), among which more than 71% are GAA-only devices and 94.9% have both GAA and PAL grants <ref type="bibr">[21]</ref>. The large number of secondary users and the growing interference risk show an urgent need to manage secondary users so that they do not cause harmful interference to one another. 
Unlike primary user protection, for which the SAS and AFC architectures exist, interference management for secondary users remains an open research problem: how to predict, monitor, and control interference before and during shared channel access.</p><p>A robust management strategy must be efficient, accurate, and scalable in managing the coexistence of all users, secondary and/or primary, which varies across time, frequency, and location.</p><p>Figure 1.3: The architecture of SAS and AFC for the shared 3.5 GHz band and the 6 GHz band, respectively, for managing interference from secondary users. The spectrum sensing capability is required for SAS to monitor mobile primary users like naval radars, whereas geolocations are sufficient for AFC to protect fixed primary users.</p><p>To address such requirements, three related categories of research have been explored in the past few decades: (1) interference prediction, (2) interference monitoring, and (3) interference source control. Next, we describe these areas, including their importance, past methods, and the new challenges introduced by dynamic spectrum sharing.</p><p>&#8226; Interference prediction: Interference prediction uses requested transmit powers and path loss models to predict possible interference to existing users before accessing a shared frequency band. It facilitates channel allocation and interference mitigation by ensuring the required Signal-to-Interference-plus-Noise Ratios (SINRs) for all transmitter-receiver pairs.</p><p>Path loss models have a long history of development. 
They utilize propagation physics <ref type="bibr">[140]</ref>, site-specific information <ref type="bibr">[37]</ref>, historical measurements <ref type="bibr">[57]</ref>, or data from the area of deployment <ref type="bibr">[107]</ref> for RF interference prediction. However, current methods are insufficient to ensure reliable spectrum sharing for the following reasons: (1) Current path loss models used by SAS and AFC can be inaccurate and conservative. For example, the Irregular Terrain Model (ITM) can have a prediction error of 0-20 dB for links over 1000 m <ref type="bibr">[63]</ref>. Interference-to-Noise (I/N) safety margins of up to 12 dB are also applied by AFC for extra protection of public safety communications <ref type="bibr">[41]</ref>.</p><p>(2) Multiple users in the environment share the same spectrum, and any new request requires accurate interference prediction for a large number of co-channel links. (3) Path losses for most links change when one or more users move, which demands fast recomputation of all affected channel losses.</p><p>To address these limitations, path loss models can be refined with real-world measurements and Machine Learning (ML) techniques, yielding more accurate channel estimation and higher spectrum sharing efficiency <ref type="bibr">[48,</ref><ref type="bibr">49]</ref>.</p><p>&#8226; Interference monitoring: Real-time interference monitoring senses the actual spectrum use of targeted transmissions to prevent co-channel or adjacent-channel interference. It is critical for identifying and characterizing interference sources.</p><p>Spectrum sensing techniques <ref type="bibr">[141]</ref> can be repurposed for monitoring shared spectrum use. Common methods include direct sensing <ref type="bibr">[139]</ref> and cooperative sensing <ref type="bibr">[8]</ref>, which use one sensor and collaborative radios, respectively, for monitoring purposes. 
They rely on either blind received power <ref type="bibr">[14]</ref> or primary transmitted signal priors <ref type="bibr">[141]</ref> for transmission detection. These methods have two disadvantages: (1) detection based on blind received power is insufficient, as it cannot reliably separate co-channel signals, and (2) it is unrealistic to require a prior on the transmitted signal or on other signals in the environment.</p><p>To address these restrictions, estimating the actual RF signal from each user without any prior can be a significant intermediate step toward accurately monitoring each user's spectrum use while it accesses a shared channel.</p><p>&#8226; Interference source control: Effective interference management also involves reactive handling of interference sources. It first identifies the user causing interference and then resolves the conflict by demanding a change to the source's transmission.</p><p>Identification of interference sources is commonly achieved via centralized database approaches such as SAS and AFC <ref type="bibr">[17]</ref>. They require different networks in a shared spectrum environment to register and periodically update their locations and RF parameters for compliance checks. When interference is possible, the targeted transmitter is notified to either stop or adapt via frequency adjustment, power control <ref type="bibr">[74]</ref>, or beamforming <ref type="bibr">[138]</ref>. However, two additional challenges remain to be addressed for effective interference source control: (1) The worst-case interference, despite path loss models for interference prediction, can still occur due to model errors or unpredictable events like airplanes flying by <ref type="bibr">[133]</ref>. 
No mechanism is in place to identify and mitigate such interference; (2) There is currently no guarantee that all transmitters in violation will stop their harmful interference, or that they will do so promptly.</p><p>To tackle the two challenges, interference monitoring can be incorporated into interference source control to provide users' actual spectrum use as forensic evidence for identifying any worst-case interference. A closed-loop control scheme can effectively address the second issue by automatically disabling the interference source when needed.</p><p>To summarize, the integration of predictive analytics, real-time monitoring, and source control ensures reliable management of interference in a shared spectrum. These strategies form the foundation of interference management, but current methods in these areas are not well-matched to the new needs of advanced dynamic spectrum sharing. Therefore, this thesis aims to explore accurate, efficient, and time-continuous techniques for interference management to ensure spectrum compliance, improve spectrum use efficiency, and facilitate trust across coexistent stakeholders.</p></div>
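<div xmlns="http://www.tei-c.org/ns/1.0"><p>The interference prediction step described in this section can be illustrated with a minimal Python sketch. It is not the method used by SAS, AFC, or this thesis: it assumes a simple log-distance path loss model, and the function names, the path loss parameters, the noise floor, and the 10 dB SINR requirement are all hypothetical values chosen for illustration.</p>
<p><![CDATA[
```python
import math


def log_distance_path_loss_db(dist_m, ref_loss_db=40.0, exponent=3.0, ref_dist_m=1.0):
    """Log-distance path loss in dB. All default values are illustrative."""
    dist_m = max(dist_m, ref_dist_m)  # clamp to the reference distance
    return ref_loss_db + 10.0 * exponent * math.log10(dist_m / ref_dist_m)


def dbm_to_mw(power_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


def predicted_sinr_db(tx_dbm, link_dist_m, interferers, noise_dbm=-100.0):
    """Predicted SINR in dB at the receiver of a protected link.

    interferers is a list of (tx_power_dbm, distance_to_receiver_m) pairs
    for co-channel transmitters, including any newly requested one.
    """
    signal_mw = dbm_to_mw(tx_dbm - log_distance_path_loss_db(link_dist_m))
    # Interference powers add in linear units, not in dB.
    interference_mw = sum(
        dbm_to_mw(p_dbm - log_distance_path_loss_db(d_m)) for p_dbm, d_m in interferers
    )
    return 10.0 * math.log10(signal_mw / (interference_mw + dbm_to_mw(noise_dbm)))


def can_grant(tx_dbm, link_dist_m, existing, request, required_sinr_db=10.0):
    """Return True if the protected link keeps its required SINR after the request."""
    sinr_db = predicted_sinr_db(tx_dbm, link_dist_m, existing + [request])
    return sinr_db >= required_sinr_db
```
]]></p>
<p>For example, calling can_grant(20.0, 100.0, [], (20.0, 1000.0)) checks whether a hypothetical 20 dBm request located 1 km from the receiver can share the channel with a 20 dBm link over 100 m. In practice this check has to be repeated for every co-channel link and rerun whenever a user moves, which is why the accuracy and speed of the path loss model matter.</p></div>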
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.3">Contributions</head><p>This section gives an overview of the thesis, its research outcomes, and the resulting publications, all aimed at predicting, monitoring, and controlling RF interference throughout the sharing process.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.3.1">Thesis Overview</head><p>This thesis focuses on predicting, monitoring, and closed-loop control of wireless interference in spectrum sharing environments.</p><p>&#8226; Channel Estimation Modeling for Interference Prediction. This work addresses the interference prediction challenge by proposing Channel Estimation via Loss Field (CELF), a data-driven Bayesian model for improved propagation modeling in shared spectrum environments. CELF characterizes the median-scale shadowing loss due to obstructions as a spatial loss field where the loss at each location represents the additional radio shadowing loss over a base channel model. The spatial loss field is learned via site-specific channel measurements from the area of interest and a Bayesian linear regression method. For any transmitter-receiver pair, CELF sums the learned loss field near the link to estimate its additional shadowing loss. Extensive measurements in indoor and outdoor environments verify that CELF increases the accuracy over a base channel model for both scenarios and outperforms three popular ML methods in terms of accuracy and training efficiency.</p><p>&#8226; Channel Estimation Model Augmentation for Interference Prediction. The work above describes the modeling methodology and an initial set of results for CELF performance. This work proposes several enhancements to augment the CELF model. First, we introduce a different channel base model, the Terrain-Integrated Rough Earth Model (TIREM), which uses not only empirical measurements but also physical mechanisms to estimate channel losses. This lets us examine whether, and by how much, CELF can improve the base model's accuracy for both over- and underpredicted modeling errors.</p><p>Second, an alternative spatial multipath model for the weight matrix in CELF is included to describe the impact of the different geometric shapes represented by the multipath models on the accuracy of CELF. 
Third, more real-world datasets, one collected in a shared frequency band and the other from a large urban environment, are used to evaluate the robustness of CELF's performance in complex environments. Last, a detailed analysis of CELF's explainability is given to verify numerically that CELF is an explainable learning approach. Crucially, CELF's low prediction time for unseen links highlights its potential to provide efficient channel estimation and interference prediction for all users, whether primary or secondary, in the shared spectrum.</p><p>&#8226; Interference Monitoring and Source Control for Software-Defined Radio Platforms. This work develops a real-time spectrum monitoring system for Software-Defined Radio (SDR) base stations, or platforms, to address the interference monitoring and source control challenges. SDR platforms, due to their programmable feature, have been a key enabler of dynamic spectrum sharing. Numerous wireless experimental testbeds deploy SDR platforms for advanced wireless research. However, their generated signals may, intentionally or unintentionally, be transmitted in frequency bands and at power levels that violate spectrum policies. This work thus presents the FDMonitor system, which is attached between a transmitter's power amplifier and its antenna to simultaneously monitor the transmitted RF signal and the signal(s) incident on the antenna. FDMonitor has been deployed to monitor 19 SDR platforms on the Platform for Open Wireless Data-driven Experimental Research (POWDER), a city-scale wireless testbed, since 2021. Its closed-loop feature sends alerts in real time whenever a violation is observed, and automatically turns off the interference source as necessary.</p></div>
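<div xmlns="http://www.tei-c.org/ns/1.0"><p>The loss field idea summarized above can be sketched in a few lines of Python with NumPy. This is a simplified illustration under stated assumptions, not the CELF implementation: the actual weight matrix, prior, and base models are defined in Chapters 2 and 3. Here the region near a link is assumed to be an ellipse with foci at the transmitter and receiver, the prior on the loss field is assumed to be zero-mean Gaussian with independent pixels, and all function names and parameter values are hypothetical.</p>
<p><![CDATA[
```python
import numpy as np


def ellipse_weights(tx, rx, pixels, excess_m=1.0):
    """Binary weights: 1.0 for pixels inside an ellipse with foci at tx and rx.

    The ellipse is an assumed stand-in for the spatial multipath model. A pixel
    counts as near the link if the detour through it is at most excess_m longer
    than the direct path.
    """
    tx = np.asarray(tx, dtype=float)
    rx = np.asarray(rx, dtype=float)
    pixels = np.asarray(pixels, dtype=float)
    direct = np.linalg.norm(tx - rx)
    detour = np.linalg.norm(pixels - tx, axis=1) + np.linalg.norm(pixels - rx, axis=1)
    return (detour <= direct + excess_m).astype(float)


def fit_loss_field(weights, residual_db, noise_var=1.0, prior_var=1.0):
    """Posterior mean of the loss field under a zero-mean Gaussian prior.

    weights has one row per measured link and one column per pixel.
    residual_db is the measured loss minus the base-model loss for each link.
    """
    weights = np.asarray(weights, dtype=float)
    residual_db = np.asarray(residual_db, dtype=float)
    n_pixels = weights.shape[1]
    precision = weights.T @ weights / noise_var + np.eye(n_pixels) / prior_var
    return np.linalg.solve(precision, weights.T @ residual_db / noise_var)


def predict_loss_db(base_loss_db, link_weights, loss_field):
    """Base-model loss plus the summed loss field near the link."""
    return base_loss_db + float(np.dot(link_weights, loss_field))
```
]]></p>
<p>In this sketch, training reduces to one linear solve, and predicting an unseen link needs only one weight vector and one dot product. That structure is consistent with the fast training and low per-link prediction time reported for CELF, although the real model differs in its details.</p></div>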
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.3.2">Research Outcomes</head><p>&#8226; Models: Development of statistical models for channel estimation and spectrum monitoring <ref type="bibr">[130,</ref><ref type="bibr">126]</ref>.</p><p>&#8226; Experimental measurements: Extensive real-world measurements under four different RF settings, i.e., signal modulations, carrier frequency, bandwidth, and transmit power, for model validation <ref type="bibr">[126]</ref>.</p><p>&#8226; Open-source tools: Public software for the closed-loop spectrum monitoring system and the data-driven channel loss learning algorithm <ref type="bibr">[130,</ref><ref type="bibr">131,</ref><ref type="bibr">126]</ref>.</p><p>&#8226; Real-world system deployment: Deployment and testing of the spectrum monitoring system on an experimental wireless testbed for continuous spectrum monitoring since 2021 <ref type="bibr">[126,</ref><ref type="bibr">127]</ref>.</p><p>&#8226; Collaborative efforts: Transmitter localization via deep learning in spectrum sharing environments and active-passive spectrum coexistence <ref type="bibr">[81,</ref><ref type="bibr">133]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.3.3">Publications</head><p>Some material included in this dissertation has been previously published in peer-reviewed journals and conferences. The following list gives each publication and the chapter in which it is presented.</p><p>&#8226; J. Wang, M. G. Weldegebriel, and N. Patwari. "Channel Estimation via Loss Field: Accurate Site-Trained Modeling for Shadowing Prediction," IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), pp. 312-321, 2024. This work is presented in Chapter 2.</p><p>&#8226; J. Wang, M. G. Weldegebriel, and N. Patwari. "Augmenting Channel Estimation via Loss Field: Site-trained Bayesian Modeling and Comparative Analysis," Computer Networks Journal, vol. 258, 2025. This work is presented in Chapter 3.</p><p>&#8226; J. Wang, J. Gornet, A. Orange, L. Stoller, G. Wong, J. Van der Merwe, S. K. Kasera, and N. Patwari. "Two Measure is Two Know: Calibration-free Full Duplex Monitoring for Software Radio Platforms," IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), pp. 267-276, 2024. This work is presented in Chapter 4.</p><p>&#8226; J. Wang, J. Van der Merwe, and N. Patwari. "A Compliance Monitoring System for Open SDR Platforms," Proceedings of the 19th ACM Conference on Embedded Networked Sensor Systems (SenSys), pp. 351-352, 2021. This work is presented in Chapter 4.</p><p>Other works related to dynamic spectrum sharing research and its applications are:</p><p>&#8226; M. G. Weldegebriel, J. Wang, G. Hellbourg, N. Zhang, and N. Patwari. "Amplitude Based Spread Spectrum for Low SNR Watermark Detection in Pseudonymetry," to be submitted, 2025.</p><p>&#8226; F. Mitchell, J. Wang, S. K. Kasera, and A. Bhaskara. "Utilizing Confidence in Localization Predictions for Improved Spectrum Management," IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), pp. 483-492, 2024.</p><p>&#8226; F. Mitchell, J. Wang, N. Patwari, and A. Bhaskara. "Less is More: Improved Path Loss Prediction Using Simple Interpolation Models," IEEE DySPAN Workshop on Field Trials for Advanced Spectrum Sharing (FAST), pp. 139-144, 2024.</p><p>&#8226; M. G. Weldegebriel, J. Wang, N. Zhang, G. Hellbourg, and N. Patwari. "Watermarking of OFDM for Pseudonymetry: Analysis and Experimental Results," IEEE ICC Workshop on Catalyzing Spectrum Sharing via Active-Passive Coexistence (CSSAPC), pp. 317-322, 2024.</p><p>&#8226; M. G. Weldegebriel, J. Wang, N. Zhang, and N. Patwari. "Pseudonymetry: Precise, Private Closed Loop Control for Spectrum Reuse with Passive Receivers," IEEE International Conference on RFID (RFID), pp. 91-96, 2022.</p><p>&#8226; M. A. Varner, F. Mitchell, J. Wang, K. Webb, G. D. Durgin. "Enhanced RF Modeling Accuracy Using Simple Minimum Mean-Squared Error Correction Factors," IEEE 2nd International Conference on Digital Twins and Parallel Intelligence (DTPI), pp. 1-5, 2022.</p><p>The work related to RF sensing and wireless networking is:</p><p>&#8226; J. Wang, A. S. Abrar, N. Patwari. "Received Power Based Vital Sign Monitoring," Contactless Vital Signs Monitoring, pp. 205-230, Academic Press, 2022.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.4">Structure of the Thesis</head><p>This thesis focuses on predicting, monitoring, and closed-loop control of wireless interference in spectrum sharing environments. It is organized into the following chapters:</p><p>&#8226; Chapter 2: Channel Estimation via Loss Field. This chapter presents the CELF model to address the first interference management challenge for dynamic spectrum sharing: channel estimation before and during shared spectrum access. It offers a new type of explainable learning model for accurate and fast site-trained radio channel loss estimation.</p><p>&#8226; Chapter 3: Augmented CELF Model. This chapter augments the CELF work by discussing its Bayesian modeling components, its explainability, and its performance robustness under uncertainties. It explores different channel base models and spatial multipath models, CELF explainability, and performance in complex environments for comprehensive analysis and practical real-world application.</p><p>&#8226; Chapter 4: Spectrum Monitoring for Software-Defined Radio Platforms. This chapter addresses the interference monitoring and interference source control challenges by developing a real-time spectrum monitoring system. It estimates the actual transmitted signal for spectrum use awareness and includes closed-loop control for stopping transmissions that are in violation.</p><p>&#8226; Chapter 5: Conclusion and Future Work. This chapter summarizes the key findings of the thesis and outlines potential directions for future research.</p><p>Chapter 2</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>CELF: Channel Estimation via Loss Field</head><p>This chapter<hi rend="superscript">1</hi> addresses the first interference management challenge for dynamic spectrum sharing: channel estimation before and during transmission in the shared spectrum for interference mitigation and access coordination. Wireless networks that share spectrum dynamically among groups of mobile users will require fast and accurate channel estimation in order to guarantee signal-to-interference-plus-noise ratio (SINR) requirements for co-channel links.</p><p>There is a need for channel models with low computational complexity and high accuracy that adapt to the particular area of deployment while preserving explainability. Therefore, we propose the Channel Estimation via Loss Field (CELF) model, which uses channel loss measurements from a deployed network and a Bayesian linear regression method to estimate a site-specific loss field for the area. The loss field is explainable as a site map of additional radio "shadowing", compared to a base channel model, but it requires no site-specific terrain or building information. For an arbitrary pair of transmitter and receiver positions, CELF sums the loss field near the link line to estimate its shadowing loss. We use extensive measurements to show that CELF lowers the variance of channel estimates by up to 56% compared to the path loss exponent model, and outperforms three popular machine learning methods in variance reduction and training efficiency. CELF offers a new type of explainable learning model for accurate and fast site-specific radio channel loss estimation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Figure <ref type="figure">2</ref>.1: Assigning channels and transmit powers to ensure the required SINRs among links between T transmitters (TX) and R receivers (RX) demands RT channel loss estimates, which must be recomputed as users move. (Diagram labels: TX, RX, Group A, Group B, Signal, Interference.)</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Introduction</head><p>Spectrum allocation is becoming increasingly dynamic and shareable in order to meet the growing demand <ref type="bibr">[4,</ref><ref type="bibr">18]</ref>. Examples include the CBRS band <ref type="bibr">[112]</ref> and the radio dynamic zone <ref type="bibr">[70,</ref><ref type="bibr">144]</ref>. A major part of the challenge to achieve reliable dynamic spectrum allocation is to accurately and efficiently predict signal and interference powers between all pairs of proximate mobile transmitters and receivers, as shown in Fig. <ref type="figure">2</ref>.1, to ensure that SINRs are sufficient for all groups.</p><p>Current channel models are not well-matched to the needs of dynamic spectrum management in mobile networks. Many path loss prediction models require computing losses due to propagation mechanisms such as reflection and diffraction in the particular geometry of the network deployment area. For example, TIREM <ref type="bibr">[37]</ref> computes diffraction losses based on the terrain features and building heights extracted for each transmitter and receiver pair. Ray tracing models <ref type="bibr">[140]</ref> additionally require high-resolution environmental databases and are highly computationally complex. Such site-specific models have high accuracy compared to general-purpose models which curve-fit to empirical data, such as the Okumura-Hata <ref type="bibr">[57]</ref> and log-distance path loss <ref type="bibr">[101]</ref> models. However, if real-time dynamic spectrum management requires high-resolution site clutter data and significant computational resources, it will limit who can perform this management <ref type="bibr">[31]</ref>.</p><p>Emerging ML channel models can be both accurate and fast during testing but require very large datasets and computational resources during model training <ref type="bibr">[107,</ref><ref type="bibr">73]</ref>. 
Further, ML models suffer from the black-box problem, in which no human-understandable explanation or reasoning for their predictions is possible <ref type="bibr">[45]</ref>.<note place="foot" n="1">This chapter was published in the IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), 2024.</note> This prevents system engineers from diagnosing problems when a model performs poorly. Updating an ML channel model over time does not allow engineers to explain how (or if) the model has been impacted by changes in the environment, e.g., a new building having been constructed. Current and future regulations may require model explanations for legal purposes <ref type="bibr">[60]</ref>: if some system is harmed by path loss prediction model errors, a human-understandable explanation must be provided.</p><p>In this work, we develop and validate a new type of channel learning model, the CELF model, which is simultaneously more accurate than current ML channel models trained with the same data, explainable, and less computationally complex to train. CELF formulates the link fading loss as a linear function of a shadowing loss field. This loss field is connected to the underlying wave propagation physics in that it accounts for the physical mechanism of shadowing due to obstacles in the spatial domain, and is viewable as a simple image map.</p><p>Prior research has also used this type of model to estimate the shadowing attenuation caused by people <ref type="bibr">[136]</ref> and walls <ref type="bibr">[68]</ref>. As shown in Fig. <ref type="figure">2</ref>.2, the loss field is learned from training measurements via Bayesian linear regression, but training requires less computation than a general-purpose ML model. Using training measurements allows the model to fit the particular site of deployment. These measurements can be collected by sensors deployed as part of a radio dynamic zone, or by nodes using the spectrum as part of the dynamic spectrum access protocol. 
Training data quantities can be low in comparison with other ML methods. We also discuss Bayesian regression's stability and optimization for more robust and efficient learning of the loss field. To predict shadowing loss for a new link, CELF computes a weighted sum of a small number of pixels of the learned loss field image.</p><p>The implementation of CELF is in <ref type="bibr">[128]</ref>.</p><p>We use outdoor and indoor datasets to experimentally quantify how accurately and efficiently CELF performs. We compare CELF with three general-purpose ML methods: Support Vector Regression (SVR), Random Forests, and Multi-Layer Perceptron Artificial Neural Network (MLP-ANN), in terms of (1) variance reduction compared to a baseline model, (2) training efficiency, and (3) prediction efficiency. CELF reduces the variance of total fading loss estimates by up to 56% outdoors and 40% indoors. In comparison to the ML-based methods, CELF achieves larger variance reductions. The MLP-ANN model is the most accurate model out of the three ML-based methods, but it requires three times more time than CELF for model training. For shadowing loss prediction, CELF is faster than SVR but slower than MLP-ANN, as the test dataset size and the loss field size severely impact CELF's prediction efficiency.</p><p>For perspective, path loss models do not predict small-scale fading effects, i.e., those caused by sub-wavelength (cm-level) changes in the position of the transmitter or receiver. Small-scale fading is severe, e.g., more than 20 dB for 1% of the time in a Rayleigh fading channel <ref type="bibr">[101]</ref>. Path loss models do not know the device and environmental obstruction positions to the required level of accuracy. Instead, channel path loss models like our proposed model predict large-scale fading (caused by increasing distance) and medium-scale or shadow fading (caused by obstructions) <ref type="bibr">[55]</ref>. 
Since training and testing measurements include small-scale fading but our model cannot predict it, we cannot reduce the path loss variance to zero. Instead, we judge models by how much they can reduce fading variance compared to a standard statistical channel model. We find that CELF shows larger variance reductions across all of our experiments than any other model, ML or otherwise.</p></div>
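<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough check on the Rayleigh figure quoted above, the following sketch computes the probability of a deep fade. It assumes that the fade depth is measured relative to the mean received power and that the received power in a Rayleigh channel is exponentially distributed; the function name is illustrative and not part of the CELF implementation.</p><p xml:space="preserve">
```python
import math


def rayleigh_fade_probability(fade_depth_db: float) -> float:
    """Probability that received power is at least fade_depth_db below its mean.

    Assumes Rayleigh fading, i.e., exponentially distributed power, for which
    P(power / mean_power is below x) = 1 - exp(-x).
    """
    x = 10.0 ** (-fade_depth_db / 10.0)  # linear power ratio for the fade depth
    return 1.0 - math.exp(-x)


prob_20db = rayleigh_fade_probability(20.0)
print(f"P(fade deeper than 20 dB) = {prob_20db:.4%}")
```
</p><p>Under these assumptions the result is very close to 1%, in line with the statement in the text.</p></div>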
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Related Work</head><p>Path loss prediction has an extensive disciplinary history over several decades. Models used today vary by what they rely on:</p><p>1. the physical mechanisms of radio propagation, e.g., reflection and diffraction;</p><p>2. information about the site, e.g., terrain and building geometry data;</p><p>3. curve-fitting to empirical data recorded in past measurements;</p><p>4. fitting or learning using empirical data collected in the area of deployment.</p><p>While some models do not characterize the probability distribution of the channel loss, statistical models describe a distribution for the loss variation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.1">Physics-based Models</head><p>Physics-based models aim to accurately characterize radio wave propagation effects such as reflection and diffraction. The most fundamental is the free-space path loss model <ref type="bibr">[101]</ref>, but it models only unobstructed channels, and is thus limited to satellite communication and unobstructed microwave relay links. The two-ray ground reflection model accounts for both the Line-Of-Sight (LOS) and the ground-reflected paths <ref type="bibr">[97]</ref>, and is typically used in flat clutter-free areas like plains <ref type="bibr">[50]</ref>. When more multipath must be modeled, ray tracing is both the most accurate and most complicated model for path loss <ref type="bibr">[140]</ref>. Ray tracing requires site-specific building databases, i.e., building layout, heights, and dielectric properties, as well as detailed terrain and ground use data, so that each wave path can be traced using geometrical optics <ref type="bibr">[61]</ref>. Its computational complexity and need for high-resolution site-specific data make it impractical for large-scale, real-time applications.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.2">General Empirical Models</head><p>General empirical models are based on an analysis of measurements taken from an environment similar in use to the area of interest, e.g., urban or suburban. The Okumura-Hata model is based on measurements from Tokyo in the 1960s as formulated by Hata <ref type="bibr">[57]</ref>. It uses curve-fitting to model the effect of signal frequency, antenna heights, path length, and environment type on the channel loss. The COST-231 Hata model extends the Okumura-Hata model to data from some European cities <ref type="bibr">[109]</ref>. The benefits of such models are their simple closed-form formulas and the fact that they need no data from the site of interest. However, they are restricted to certain frequency and distance ranges, and most critically, they are most accurate in the environments from which the measurements came <ref type="bibr">[50]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.3">Hybrid Empirical/Physical Models</head><p>The Longley-Rice model, also called the Irregular Terrain Model (ITM), combines empirical modeling and physical principles for ground reflection, knife-edge and far-field diffraction, and troposcatter predictions <ref type="bibr">[63]</ref>. This model considers environmental factors including surface refractivity, ground conductivity, atmospheric parameters, and terrain irregularities for path loss prediction <ref type="bibr">[140]</ref>. It is in use today in systems like SAS <ref type="bibr">[113]</ref>. The TIREM model <ref type="bibr">[37]</ref> considers a profile of the terrain features and building heights <ref type="bibr">[124]</ref>. A further hybrid model is the International Telecommunication Union's (ITU)-R P.1812 model, which uses detailed terrain profiles to target path-specific predictions. It has been widely used for terrestrial wireless systems <ref type="bibr">[108]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.4">Statistical Models</head><p>Statistical models characterize the statistical distribution of the channel losses, rather than only the average loss value. The most common model is the log-normal shadowing model, which models shadowing loss as normally distributed in dB <ref type="bibr">[50]</ref>. Other models explain the statistical correlation between the shadowing loss on two proximate links <ref type="bibr">[114,</ref><ref type="bibr">72,</ref><ref type="bibr">93]</ref>, which become correlated by passing through the same or similar obstructions. CELF models this correlation implicitly via its loss field. Other distributions for shadowing include the Gamma <ref type="bibr">[1]</ref> and inverse Gamma <ref type="bibr">[100]</ref> distributions. We note that the most well-known distributions, Rayleigh and Rician, are models for small-scale fading loss, and are thus not further discussed in this work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.5">ML Channel Models</head><p>Another popular class is ML channel models, which are designed using general-purpose ML architectures and extensive datasets <ref type="bibr">[107,</ref><ref type="bibr">66,</ref><ref type="bibr">142]</ref>. We categorize these models as: (1) SVR, K-Nearest-Neighbors (KNN), and ensemble learning methods such as random forests <ref type="bibr">[142]</ref>;</p><p>(2) ANN models including MLP-ANN models <ref type="bibr">[137,</ref><ref type="bibr">89]</ref> and radial basis function-ANN models (RBF-ANN) <ref type="bibr">[87]</ref>; and (3) more complex Deep Neural Network (DNN) models <ref type="bibr">[73,</ref><ref type="bibr">122]</ref>. For example, the RadioUNet model in <ref type="bibr">[73]</ref> utilizes large datasets and environmental geometry as input to U-Net, a special Convolutional Neural Network (CNN) architecture, for path loss modeling.</p><p>ML-based methods can provide higher prediction accuracy than domain-specific models at the cost of extensive datasets or detailed environmental information. Additionally, the high complexity of model training and updating will result in significant latency. The lack of interpretability of ML methods is a particular challenge, as RF engineers can find it difficult to diagnose a problem when the model performs poorly. Further, regulation increasingly requires businesses to be able to explain why an algorithm's prediction was made <ref type="bibr">[60]</ref>.</p><p>CELF is also a learning-based model which uses site measurements to train. It requires no knowledge about the environment and can be trained with fewer measurements than a general-purpose ML model. Further, CELF explains its estimates via the shadowing field image, which should correlate to the attenuating obstructions in the area.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Channel Estimation via Loss Field</head><p>In this section, we present the CELF model in three parts. First, we describe the idea of a base model, and describe what is used in this work. Next, we describe how CELF augments the base model for better path loss estimation, using a spatial loss field. Finally, we explain how to estimate the loss field from training measurements.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.1">First-order Channel Estimation</head><p>CELF predicts the additional path loss compared to a base model, an arbitrary path loss model. The base model could be any model described in the related work (Section 2.2), but presumably something simple to compute. CELF's role is to augment the estimates from the base model by additionally accounting for the natural spatial correlations in the path loss.</p><p>In this work, we use the log-distance path loss model as the base model, which states that the ensemble average power $\bar{P}(d_l)$ along a link $l = (i, j)$ between node $i$ and node $j$ reduces in a logarithmic manner with increasing distance <ref type="bibr">[101]</ref>:</p><formula notation="TeX">\bar{P}(d_l) = P_T - \Pi_0 - 10\, n_p \log_{10} \frac{d_l}{\Delta_0}, \qquad (2.1)</formula><p>where $P_T$ is the transmitted power in dBm, $d_l$ is the link distance, $\Pi_0$ is a constant specifying the dB loss at a reference distance $\Delta_0$, and the path loss exponent $n_p$ indicates the level of environmental clutter.</p><p>Given the same distance $d_l$, the received power measurements vary around the average $\bar{P}(d_l)$ due to shadow fading and small-scale fading <ref type="bibr">[50]</ref>. As a result, the received power $P(d_l)$ along the link $l$ can be written as:</p><formula notation="TeX">P(d_l) = \bar{P}(d_l) - Z_l, \qquad (2.2)</formula><p>where $Z_l$ is the total fading loss, which consists of independent shadowing loss $X_l$ and small-scale fading loss $Y_l$, i.e., $Z_l = X_l + Y_l$ <ref type="bibr">[136]</ref>.</p></div>
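<div xmlns="http://www.tei-c.org/ns/1.0"><p>The first-order step described in this subsection can be sketched as follows. This is a minimal illustration, not the CELF implementation: it assumes the standard log-distance form (mean received power equals transmit power minus a reference loss minus ten times the path loss exponent times the base-10 logarithm of normalized distance), and it assumes that the reference loss and exponent are fitted by ordinary least squares. All function and parameter names are hypothetical.</p><p xml:space="preserve">
```python
import numpy as np


def mean_received_power_dbm(d, p_t_dbm, pi0_db, n_p, delta0=1.0):
    """Log-distance base model: ensemble-average received power in dBm."""
    d = np.asarray(d, dtype=float)
    return p_t_dbm - pi0_db - 10.0 * n_p * np.log10(d / delta0)


def fit_base_model(d, p_rx_dbm, p_t_dbm, delta0=1.0):
    """Least-squares fit of the reference loss (dB) and path loss exponent.

    The fitting procedure is an assumption made for this sketch.
    """
    d = np.asarray(d, dtype=float)
    loss_db = p_t_dbm - np.asarray(p_rx_dbm, dtype=float)  # measured path loss
    design = np.column_stack([np.ones_like(d), 10.0 * np.log10(d / delta0)])
    coeffs, *_ = np.linalg.lstsq(design, loss_db, rcond=None)
    return float(coeffs[0]), float(coeffs[1])


def total_fading_loss(d, p_rx_dbm, p_t_dbm, pi0_db, n_p, delta0=1.0):
    """Total fading loss per link: base-model prediction minus measurement."""
    predicted = mean_received_power_dbm(d, p_t_dbm, pi0_db, n_p, delta0)
    return predicted - np.asarray(p_rx_dbm, dtype=float)
```
</p><p>The residual returned by total_fading_loss is the quantity that the loss field is later asked to explain; a positive value means the link received less power than the base model predicted.</p></div>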
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.2">Network Shadowing Model for Shadowing Correlation</head><p>The total fading loss $Z_l$ is commonly modeled as independent and identically distributed (i.i.d.) across links <ref type="bibr">[15,</ref><ref type="bibr">58,</ref><ref type="bibr">29]</ref>. However, that simplification disagrees with the empirical observation that shadowing losses along two links are correlated due to obstructions, e.g., outdoor buildings and terrain variations, and indoor walls and furniture <ref type="bibr">[52,</ref><ref type="bibr">3,</ref><ref type="bibr">72]</ref>.</p><p>In order to simultaneously model the correlations in shadow fading that exist across multiple link pairs in a network, we use the network shadowing model <ref type="bibr">[92]</ref>. Let $\mathcal{L}$ be a set of link pairs in a wireless network, and $L = |\mathcal{L}|$, where $|\cdot|$ counts the number of elements in the set. We assume that each link is different in either transmitter or receiver location from the other links in the set $\mathcal{L}$. The network shadowing model describes the joint link fading loss as:</p><formula notation="TeX">\mathbf{z} = W \mathbf{p} + \boldsymbol{\epsilon}, \qquad (2.3)</formula><p>where $\mathbf{z} = [Z_1, Z_2, \ldots, Z_L]^T \in \mathbb{R}^{L \times 1}$ is the total fading loss vector, $W \in \mathbb{R}^{L \times M}$ is a weight matrix, $\mathbf{p} \in \mathbb{R}^{M \times 1}$ is a discretized loss field in dB, and $\boldsymbol{\epsilon} \in \mathbb{R}^{L \times 1}$ is a noise vector. Their details are given below.</p><p>[Figure <ref type="figure">2</ref>.3: the ellipse weight model. Diagram labels: loss field, link, valid pixel, invalid pixel.]</p><p>Spatial loss field $\mathbf{p}$. The spatial loss field of <ref type="bibr">[92]</ref> characterizes the environment of interest as a Gaussian random field that is isotropic wide-sense stationary. It has zero mean and an exponentially decaying spatial covariance function:</p><formula notation="TeX">C_p(m, n) = \frac{\sigma_X^2}{\delta} \exp\left(-\frac{d_{m,n}}{\delta}\right), \qquad (2.4)</formula><p>where $d_{m,n}$ is the Euclidean distance between the centers of pixels $m$ and $n$, $\sigma_X^2$ is the variance of the shadowing loss, and $\delta$ is a space constant. The shadowing loss $X_l$ on link $l$ is then a weighted sum of the loss field $\mathbf{p}$ over the pixels that lie near the link $l$.</p><p>Weight matrix model for $W$. A popular ellipse model in <ref type="bibr">[136]</ref> is adopted for the weight matrix $W$, as shown in Fig. <ref type="figure">2</ref>.3. It considers the two ends of link $l$ as the foci and utilizes a tunable parameter $\lambda$ to determine the ellipse width. A pixel is viewed as valid if it falls within the ellipse, and the corresponding weight in $W$ will have a non-zero contribution to the shadowing loss of link $l$. Past studies <ref type="bibr">[92,</ref><ref type="bibr">3,</ref><ref type="bibr">136]</ref> construct the weight as:</p><formula notation="TeX">w_{l,m} = \frac{1}{\sqrt{d_l}} \begin{cases} 1, &amp; \text{if } d_{i,m} + d_{j,m} &lt; d_l + \lambda, \\ 0, &amp; \text{otherwise}, \end{cases} \qquad (2.5)</formula><p>where $d_{i,m}$ and $d_{j,m}$ are the distances from the center of pixel $m$ to the two ends of link $l$, $d_l$ is the link distance, and $\lambda$ is the ellipse width parameter.</p><p>Small-scale fading and noise. The Gaussian noise $\boldsymbol{\epsilon}$ is the sum of small-scale fading loss and measurement noise, which are independent of each other. Measurement noise is first assumed to be i.i.d. Gaussian. Small-scale fading, also known as multipath fading, describes the attenuation that occurs from constructive and destructive addition of multipath phasors <ref type="bibr">[50]</ref>. While small-scale fading may have multiple distributions depending on the number of significant-amplitude multipath components <ref type="bibr">[35]</ref>, here we model it as i.i.d. Gaussian in dB. The rationale is in <ref type="bibr">[3]</ref>, which approximates small-scale fading to be (1) uncorrelated for mobile network nodes that are typically many wavelengths separated and (2) Gaussian distributed by averaging it across many frequencies. Note that without the Gaussian noise assumption, Bayesian linear regression still applies for loss field learning and the analytical solution given below can be adjusted accordingly.</p></div>
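<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two building blocks of the network shadowing model, the ellipse weight matrix and the prior covariance of the loss field, can be sketched as below. This is an illustrative sketch rather than the CELF code: the pixel grid layout, the function names, and the scaling of the covariance by the shadowing variance divided by the space constant are assumptions, and every link is assumed to have distinct endpoints.</p><p xml:space="preserve">
```python
import numpy as np


def pixel_centers(x_max, y_max, pixel_width):
    """Centers of a regular pixel grid covering [0, x_max] x [0, y_max]."""
    xs = np.arange(pixel_width / 2.0, x_max, pixel_width)
    ys = np.arange(pixel_width / 2.0, y_max, pixel_width)
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])  # shape (M, 2)


def ellipse_weights(tx, rx, centers, lam):
    """Weight matrix W of shape (L, M) from the ellipse model.

    A pixel is valid for a link when the sum of its distances to the two
    link ends is smaller than the link length plus lam; valid pixels get
    weight 1 / sqrt(link length), all others get 0.
    """
    tx = np.asarray(tx, dtype=float)
    rx = np.asarray(rx, dtype=float)
    d_l = np.linalg.norm(tx - rx, axis=1)                                # (L,)
    d_im = np.linalg.norm(centers[None, :, :] - tx[:, None, :], axis=2)  # (L, M)
    d_jm = np.linalg.norm(centers[None, :, :] - rx[:, None, :], axis=2)  # (L, M)
    inside = (d_l[:, None] + lam) > (d_im + d_jm)
    return inside / np.sqrt(d_l)[:, None]


def prior_covariance(centers, sigma_x2, delta):
    """Exponentially decaying covariance between all pairs of pixels."""
    d_mn = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    return (sigma_x2 / delta) * np.exp(-d_mn / delta)
```
</p><p>Each row of the weight matrix has only a handful of non-zero entries, which is why predicting the shadowing loss of a new link reduces to a weighted sum over a small number of pixels.</p></div>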
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.3">Loss Field Learning</head><p>Bayesian linear regression. Given the linear joint link model in (2.3) and the Gaussian loss field prior, we reconstruct the loss field $\mathbf{p}$ via Bayesian linear regression. We note the likelihood function of the total fading loss vector is</p><formula notation="TeX">f(\mathbf{z} \mid \mathbf{p}) = \mathcal{N}(W\mathbf{p}, \sigma_\epsilon^2 I_L), \qquad (2.6)</formula><p>where $\mathcal{N}(W\mathbf{p}, \sigma_\epsilon^2 I_L)$ is a Gaussian distribution with a mean of $W\mathbf{p}$ and a covariance of $\sigma_\epsilon^2 I_L$. Next, the loss field prior $f(\mathbf{p})$ is modeled as Gaussian,</p><formula notation="TeX">f(\mathbf{p}) = \mathcal{N}(\mathbf{0}, C_p), \qquad (2.7)</formula><p>where $C_p$ is the covariance matrix formed by <ref type="bibr">(2.4)</ref>. Therefore the posterior pdf of $\mathbf{p}$ is multivariate Gaussian as</p><formula notation="TeX">f(\mathbf{p} \mid \mathbf{z}) = \mathcal{N}(\boldsymbol{\mu}_{p|z}, \Sigma_{p|z}), \qquad (2.8)</formula><p>where</p><formula notation="TeX">\boldsymbol{\mu}_{p|z} = \left(W^T W + \sigma_\epsilon^2 C_p^{-1}\right)^{-1} W^T \mathbf{z}, \qquad \Sigma_{p|z} = \sigma_\epsilon^2 \left(W^T W + \sigma_\epsilon^2 C_p^{-1}\right)^{-1}. \qquad (2.9)</formula><p>As a result, we can acquire the Maximum A Posteriori (MAP) estimator $\hat{\mathbf{p}}$ as the posterior mean $\boldsymbol{\mu}_{p|z}$ in (2.9).</p><p>Solution stability. The linear regression, however, is an ill-posed problem, i.e., the attenuation image estimate $\hat{\mathbf{p}}$ from the measurement vector is not unique, and/or (2.9) may not exist. Such ill-posedness is due to two main factors:</p><p>1. L &lt; M: there are more pixels to be estimated than link measurements, thus the problem is underdetermined;</p><p>2. L &gt; M but with a sparse $W$: only a few pixels are assigned non-zero weights for each link and thus $W$ is rank-deficient regardless of the number of link samples.</p><p>For a stable solution, the regularization constant $\varrho$ is introduced such that the estimator $\hat{\mathbf{p}}$ is expressed as:</p><formula notation="TeX">\hat{\mathbf{p}} = \left(W^T W + \varrho\, C_p^{-1}\right)^{-1} W^T \mathbf{z}, \qquad (2.10)</formula><p>where $\sigma_\epsilon$ is absorbed into $\varrho$. In doing so, the estimator is robust to rank deficiency in the weight matrix, and the inverse term in (2.10) always exists.</p><p>Solution efficiency. Latency is the other concern: the computation is heavy for a large number of co-channel users, especially secondary users, so the solution must be made more efficient. If L &lt; M, we can view the problem as sparse linear regression and adopt the common Minimum Norm Estimator (MNE), which calculates an inverse of only an $L \times L$ matrix rather than an $M \times M$ matrix.</p><p>If L &gt; M, we leverage the Cholesky decomposition <ref type="bibr">[51]</ref> to lower the latency. It is based on the fact that $(W^T W + \varrho\, C_p^{-1})$ in (2.10) is symmetric and positive definite. Let $A = W^T W + \varrho\, C_p^{-1}$, and $\mathbf{b} = W^T \mathbf{z}$. We first calculate the triangular matrix $S$ via the Cholesky factorization:</p><formula notation="TeX">A = S S^T.</formula><p>By reformulating the problem as $S S^T \hat{\mathbf{p}} = \mathbf{b}$, the loss field estimate $\hat{\mathbf{p}}$ can be obtained via forward-backward substitution <ref type="bibr">[51]</ref>. According to <ref type="bibr">[44]</ref>, the Cholesky decomposition can be twice as efficient as the general Lower-Upper (LU) decomposition. </p></div>
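<div xmlns="http://www.tei-c.org/ns/1.0"><p>The regularized estimator and its Cholesky-based solution can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the function names and the parameter name reg (the regularization constant) are hypothetical, and NumPy's general solver stands in for dedicated forward and backward substitution routines. The second function is not taken from the text; it is an algebraically equivalent rearrangement of the same estimator (via the matrix inversion lemma) that only solves an L-by-L system, shown here as one possible way to exploit the case of fewer links than pixels.</p><p xml:space="preserve">
```python
import numpy as np


def learn_loss_field(W, z, C_p, reg):
    """Estimate the loss field from A p = b with A = W^T W + reg * inv(C_p).

    A is symmetric positive definite, so it is factored as A = S S^T and the
    system is solved with one forward and one backward substitution.
    """
    A = W.T @ W + reg * np.linalg.inv(C_p)
    b = W.T @ z
    S = np.linalg.cholesky(A)      # lower-triangular factor
    y = np.linalg.solve(S, b)      # forward substitution: S y = b
    return np.linalg.solve(S.T, y) # backward substitution: S^T p = y


def learn_loss_field_small_system(W, z, C_p, reg):
    """Equivalent estimate that only solves an L x L linear system.

    Uses the identity
    inv(W^T W + reg * inv(C_p)) W^T = C_p W^T inv(W C_p W^T + reg * I).
    """
    n_links = W.shape[0]
    G = W @ C_p @ W.T + reg * np.eye(n_links)
    return C_p @ W.T @ np.linalg.solve(G, z)
```
</p><p>Both functions return the same estimate up to numerical precision; which one is cheaper depends on whether the number of links or the number of pixels is smaller.</p></div>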
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Evaluation Methodology</head><p>In this section, we describe one outdoor and one indoor real-world received power dataset, three popular ML-based methods, and two evaluation metrics for assessing the performance of the CELF algorithm.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.1">Real-world Received Power Datasets</head><p>Outdoor Dataset. This dataset <ref type="bibr">[80]</ref> is collected from a 2200 m &#215; 2100 m university campus area. A portable commercial radio is used as the transmitter, and the receivers are 25 SDR nodes with omnidirectional antennas deployed on POWDER, an open wireless experimental testbed <ref type="bibr">[22]</ref>. The carrier frequency is 462.7 MHz and the transmit power is 1 W. Each receiver is one of four types, Rooftop, Fixed, Mobile, and Dense, which differ in radio, antenna, and placement. Table <ref type="table">2</ref>.1 gives specifications for each receiver type. Fig. <ref type="figure">2</ref>.4a and 2.4b show the GPS coordinates of the transmitter and all the receivers on the campus map. As the four types of receivers are heterogeneous and uncalibrated, this work treats the data collected by each type as a separate dataset.</p><p>Indoor Dataset. This dataset <ref type="bibr">[91]</ref> is from an indoor office area, a 17.5 m &#215; 15 m space surrounded by 1.8 m high cubicle walls, as shown in Fig. <ref type="figure">2</ref>.4c. Channels between all pairs of 44 device locations are measured by transmitting a pseudo-noise code with a 40-MHz chip rate at 2443 MHz. The transmit power is 10 mW. Thus, this indoor dataset has in total 44 &#215; 43 &#215; 0.5 = 946 measurements, assuming link reciprocity, as described in <ref type="bibr">[94]</ref>.</p><p>Train-Test Split. Each dataset needs to be split without overlap for loss field estimation (training) and shadowing loss prediction (testing) purposes. We choose the link index as the criterion to partition the datasets. Each dataset is split with a 7:3 ratio, and each data point is randomly assigned to either training or testing.</p></div>
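<div xmlns="http://www.tei-c.org/ns/1.0"><p>The link count of the indoor dataset and the random 7:3 partition by link index can be illustrated with the short sketch below. The function names, the rounding rule, and the random seed are assumptions made for the example; the text does not specify them.</p><p xml:space="preserve">
```python
import numpy as np


def count_reciprocal_links(n_nodes: int) -> int:
    """Number of unordered node pairs, i.e., links under reciprocity."""
    return n_nodes * (n_nodes - 1) // 2


def split_link_indices(n_links: int, train_fraction: float = 0.7, seed: int = 0):
    """Randomly assign each link index to training or testing, no overlap."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_links)
    n_train = int(round(train_fraction * n_links))
    return np.sort(order[:n_train]), np.sort(order[n_train:])


n_indoor_links = count_reciprocal_links(44)
train_idx, test_idx = split_link_indices(n_indoor_links)
print(n_indoor_links, len(train_idx), len(test_idx))
```
</p><p>Splitting on the link index keeps all data for a given link on one side of the partition, so the test links are unseen during loss field estimation.</p></div>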
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.2">Methods for Comparison</head><p>We adopt the Okumura-Hata model and three general-purpose ML models, Random Forest, SVR, and MLP-ANN, in this work for performance comparison. The rationale behind such choices is: (1) they represent the two main categories in the related work -non-learning and learning approaches; (2) they require neither site-specific terrain information nor large-scale datasets, unlike complex deep learning models such as RadioUNet <ref type="bibr">[73]</ref> and Path Loss Generative Adversarial Networks (PL-GAN) <ref type="bibr">[77]</ref>; (3) they have been widely used as benchmarks for path loss prediction <ref type="bibr">[50,</ref><ref type="bibr">134,</ref><ref type="bibr">142,</ref><ref type="bibr">137]</ref>, and the Okumura-Hata model particularly has been in use for the CBRS band sharing and analysis <ref type="bibr">[33]</ref>.</p><p>&#8226; Okumura-Hata <ref type="bibr">[57]</ref>: It provides a closed-form empirical formula for path loss computation over 150-1500 MHz frequency range. This model is only compared across outdoor datasets as it does not capture indoor environments.</p><p>&#8226; Random Forest <ref type="bibr">[88]</ref>: It is an ensemble learning approach that first constructs multiple decision trees on random subsets of the dataset and then combines them to improve the accuracy and robustness of the model.</p><p>&#8226; SVR <ref type="bibr">[82]</ref>: It is a variation of support vector machines used for regression. Unlike traditional squared error minimization, SVR fits a line or a curve by minimizing the error within a margin of tolerance.</p><p>&#8226; MLP-ANN <ref type="bibr">[137]</ref>: It is a feedforward neural network that consists of an input layer, an output layer, and multiple hidden layers. It is trained iteratively using algorithms like stochastic gradient descent for minimizing mean squared error.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.3">Evaluation Metrics</head><p>We adopt two evaluation metrics, variance reduction and running time, to quantify the performance of the tested algorithms. Specifically, variance reduction is defined as the percentage decrease of the fading loss variance, i.e.,</p><formula notation="TeX">\text{Variance reduction} = \frac{\sigma_{z_T}^2 - \sigma_{\mathrm{CELF}}^2}{\sigma_{z_T}^2} \times 100\%,</formula><p>where $\sigma_{z_T}^2$ is the fading loss variance of a dataset $T$ after the first-order channel estimation in Section 2.3.1, and $\sigma_{\mathrm{CELF}}^2$ is the error variance after shadowing loss subtraction, which is computed as the Mean-Squared Error (MSE):</p><formula notation="TeX">\sigma_{\mathrm{CELF}}^2 = \frac{1}{N_T} \left\lVert \mathbf{z}_T - W_T \hat{\mathbf{p}} \right\rVert^2,</formula><p>where $\hat{\mathbf{p}}$ is the spatial loss field learned from Section 2.3.3, $W_T$ is the weight matrix using the ellipse model, and $N_T = |T|$ is the size of the dataset $T$.</p><p>The other metric, running time, is a measure of the computational efficiency of the proposed CELF algorithm. It is crucial in time-sensitive applications such as real-time spectrum access and management systems <ref type="bibr">[36]</ref>. This metric includes the execution time for loss field learning and shadowing loss prediction. Note that the terms "learning" and "training", and likewise "prediction" and "testing", are used interchangeably when comparing CELF to the selected approaches in Section 2.5.</p><p>We take the following three steps to ensure result comparability. First, all the models are trained and tested on the same partitioned datasets. Second, the inputs of these ML models are the 2D coordinates of transmitters and receivers, to be consistent with CELF. Lastly, all the results are obtained by running the algorithms on the same Linux system with a 16-core Intel Xeon Gold 6130 processor.</p></div>
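<div xmlns="http://www.tei-c.org/ns/1.0"><p>The variance reduction metric can be computed as in the sketch below. It assumes that the baseline is the sample variance of the fading losses remaining after the base model, and that the CELF error variance is the mean-squared residual after subtracting the predicted shadowing loss; the function name is hypothetical.</p><p xml:space="preserve">
```python
import numpy as np


def variance_reduction_percent(z, W, p_hat):
    """Percentage decrease of the fading loss variance achieved by a loss field.

    z     : fading losses after first-order (base model) estimation, shape (N,)
    W     : ellipse-model weight matrix for the same links, shape (N, M)
    p_hat : learned loss field, shape (M,)
    """
    z = np.asarray(z, dtype=float)
    baseline_variance = np.var(z)                 # fading loss variance
    residual = z - W @ np.asarray(p_hat, dtype=float)
    error_variance = np.mean(residual ** 2)       # MSE after shadowing subtraction
    return 100.0 * (baseline_variance - error_variance) / baseline_variance
```
</p><p>A value of 100 would mean that the loss field explains all of the fading variation, which is not attainable in practice because small-scale fading and measurement noise remain in the residual.</p></div>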
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5">Results</head><p>Experimental results of the proposed CELF algorithm are given in this section. We first present two loss field image examples which are learned from the datasets in Section 2.4.1.</p><p>We then compare CELF with the chosen models via variance reduction and latency from Section 2.4. Finally, we present results on the measurement noise variance and the small-scale fading loss variance. The combination of the two approximates the total noise variance, which serves as a lower bound for the fading loss variance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.1">Example Loss Field Images</head><p>This subsection presents two example loss field images obtained with the log-distance path loss model in Section 2.3.1 and the proposed CELF algorithm in Section 2.3.3. They are learned from the Rooftop outdoor and the indoor training datasets, respectively. We choose the Rooftop dataset because its receivers, deployed high above the ground, give better coverage of the campus area. We select both outdoor and indoor datasets to discuss CELF's practical use in different types of environments. The image boundaries are the same as in Fig. <ref type="figure">2</ref>.4a and 2.4c.</p><p>The statistical analysis follows four steps. First, we determine the path loss exponent of the log-distance base model. Second, we tune the hyperparameters of CELF and interpret their values, which are given in Table <ref type="table">2</ref>.2. The first hyperparameter, <formula notation="TeX">\delta_p</formula>, denotes the attenuation image resolution and impacts both computation time and prediction accuracy. The second, the shadowing variance ratio <formula notation="TeX">\sigma_X^2/\sigma_Z^2</formula>, represents the contribution of shadowing loss to the total fading loss. In comparison to outdoor environments, indoor surroundings have more multipath components, as indoor obstacles that obstruct radio wave propagation, like walls, are relatively uniformly placed throughout the area. Therefore the indoor dataset shows less variation in shadowing. The third, the space constant &#948;, indicates the obstruction size in the environment <ref type="bibr">[3]</ref>. We expect obstacles to be smaller in the indoor area. Indeed, &#948; for the Rooftop dataset is 35 m, larger than the 2.5 m for the indoor dataset. The next hyperparameter, &#955;, is introduced by the ellipse weight model to select valid pixels for each link. It is determined by the area size and the pixel width. The last hyperparameter, &#1009;, balances the loss field prior and the data from the area of interest. We notice that &#1009; of the indoor dataset is about 100 times larger than that of the outdoor case. 
This can be explained by the <formula notation="TeX">1/\sqrt{d_l}</formula> weight in (2.5). The path lengths <formula notation="TeX">d_l</formula> of the indoor measurements are about 100 times shorter, which makes &#1009; about 100 times larger to balance the <formula notation="TeX">1/d_l</formula> discrepancy in (2.10).</p><p>Next, we derive the weight matrix and estimate the loss image via Bayesian linear regression. Fig. <ref type="figure">2</ref>.5 shows the two trained loss images with the site maps as a reference. Their spatial losses range from &#8722;24 to 24 dB and from &#8722;1.25 to 1.00 dB, respectively. Higher losses can be seen at higher obstructions, such as the marked rectangular areas in Fig. <ref type="figure">2</ref>.5a and near the cubicle walls in Fig. <ref type="figure">2</ref>.5b.</p><p>The red ellipse in Fig. <ref type="figure">2</ref>.5a highlights a mismatch between the two estimated high-loss regions and the single high obstruction on the site map. The loss image estimate is in fact the more accurate one because the terrain profile is outdated; a new building recently constructed at the star (&#8902;) location was not in the database used to generate the left image in Fig. <ref type="figure">2</ref>.5a. Note that CELF does not use any terrain or building information. Collecting and maintaining a site-specific terrain dataset can be time-consuming and expensive, whereas CELF needs only channel loss measurements for accurate and cost-effective loss field estimation.</p><p>The correlation between obstructions and spatial losses has also been proposed for wall imaging <ref type="bibr">[68]</ref>. Fig. <ref type="figure">2</ref>.5b presents the loss image of the indoor office together with the cubicle locations. It can be observed that desks, computers, and bookcases are generally positioned close to the cubicle walls. Correspondingly, the estimated loss is lower in the middle of each cubicle and higher close to the cubicle walls, where these obstructions are more often placed. 
Similarly, the vertical corridor region at x &#8776; 3.2 m experiences lower losses than either side of the corridor. The edges of the loss image are generally close to zero because few measurements cover them, so the estimates in that region rely mostly on the field prior. The match between the environment and the loss image validates that the proposed CELF approach has the potential for spatial loss field learning and further shadowing loss prediction.</p><p>The final step is to quantitatively assess the accuracy of the learned loss image via variance reduction. For the Rooftop training dataset, the fading loss variance after the log-distance path loss model is 58.4 dB&#178;. After the estimated shadowing loss is subtracted, the MSE decreases to 30.7 dB&#178;, which is 47.4% less than that of the base model. For the indoor training dataset, the fading loss variance reduces from 19.8 dB&#178; to 10.1 dB&#178;, which corresponds to a 49.3% reduction.</p></div>
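<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a quick arithmetic check of the percentages quoted above, assuming that variance reduction is the relative drop from the base-model variance to the MSE and using the rounded variances as printed:</p>
<p><![CDATA[
```latex
\frac{58.4 - 30.7}{58.4} \times 100\% \approx 47.4\%,
\qquad
\frac{19.8 - 10.1}{19.8} \times 100\% \approx 49.0\%.
```
]]></p>
<p>The Rooftop value matches the reported figure. The indoor value recomputed from the rounded variances is about 49.0%, slightly below the reported 49.3%; the gap is presumably due to rounding of the quoted variances.</p></div>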
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.2">Accuracy Analysis</head><p>After obtaining the loss image, we evaluate CELF's performance on the test datasets, starting with accuracy as measured by the variance reduction metric. Fig. <ref type="figure">2</ref>.6 shows the variance reduction results on the outdoor and indoor test datasets. Note that the Okumura-Hata model predicts path loss directly without first-order channel estimation, and thus its error variance is computed as the unbiased path loss variance. Fig. <ref type="figure">2</ref>.6 shows that all methods except the Okumura-Hata model lower the fading loss variance to some degree. MLP-ANN gives the largest variance reduction among the three ML-based methods. However, CELF outperforms all the ML models across the test datasets. On the Rooftop dataset, for instance, CELF achieves a 42.3% variance reduction, higher than MLP-ANN's 39.6%. In summary, CELF outperforms the three ML methods in terms of variance reduction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.4">Effect of Hyperparameters</head><p>CELF's hyperparameters play a significant role in its performance. Here we present variance reduction as a function of CELF's three major hyperparameters on the indoor dataset.</p><p>Figure 2.7: Variance reduction vs. CELF's hyperparameters on the indoor dataset. (a) The effect of the pixel width <formula notation="TeX">\delta_p</formula>. (b) The effect of the space constant &#948;. (c) The effect of the excess length &#955;.</p><p>Fig. <ref type="figure">2</ref>.7b presents the effect of the space constant &#948;. The variance reductions for both training and testing decrease as &#948; increases from 0.5 m to 15 m. As &#948; approximates the obstruction size, unreasonably large space constants give lower variance reduction for training and testing. Fig. <ref type="figure">2</ref>.7c presents the effect of the excess length &#955; on variance reduction. An overly large excess length includes too many pixels in the loss field estimation and thus leads to lower variance reduction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.5">Lower Variance Bound Approximation</head><p>To understand the lowest modeling error variance that CELF could reach, we analyze a subset of the outdoor dataset collected while the FM transmitter is either stationary or rotating with a radius less than or equal to one wavelength (<formula notation="TeX">\lambda_f</formula>). The subset has 14,026 received power observations. The variation in the stationary data approximates the measurement noise variance, and the data for link distances changing on the order of the signal wavelength can estimate the small-scale fading loss <ref type="bibr">[50]</ref>. Hence the sum of the two gives a sense of the lower bound on the total fading loss variance <formula notation="TeX">\sigma_{\varepsilon}^2</formula>. Table <ref type="table">2</ref>.4 lists the variances of the two measurement sets. Note that the Mobile dataset is not applicable, as its receivers are constantly moving. For the Dense dataset, the variance reduction upper limit is 58.8% which, based on Fig. <ref type="figure">2</ref>.6, is 12.9% higher than the result of CELF. Comparing Table <ref type="table">2</ref>.4 and Fig. <ref type="figure">2</ref>.6, we conclude that there is still room to lower the shadowing loss variance, but the proposed method comes closer to the limits than the other tested methods.</p></div>
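<div xmlns="http://www.tei-c.org/ns/1.0"><p>The bound described above can be sketched in a few lines of Python. The sketch assumes that the irreducible floor is the sum of the measurement noise variance and the small-scale fading variance, and that the ceiling on variance reduction is the relative gap between the base-model variance and that floor. The function name and the numbers are hypothetical; they are not the values of Table 2.4.</p>
<p><![CDATA[
```python
def variance_reduction_ceiling(total_var, noise_var, small_scale_var):
    """Largest achievable variance reduction (%) given an irreducible floor."""
    floor = noise_var + small_scale_var   # variance that no shadowing model can remove
    if floor > total_var:
        raise ValueError("floor exceeds the total variance")
    return 100.0 * (total_var - floor) / total_var


# Hypothetical variances in dB^2, chosen only for illustration.
ceiling = variance_reduction_ceiling(total_var=50.0, noise_var=5.0, small_scale_var=15.0)
print(round(ceiling, 1))
```
]]></p>
<p>Comparing such a ceiling with the measured variance reduction shows how much of the removable variance a model has actually captured.</p></div>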
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.6">Summary</head><p>This work proposes CELF, which learns a spatial loss field and uses it to predict shadowing loss on any new link in a deployment area. It formulates total fading loss as a discretized linear model and applies Bayesian linear regression for loss field estimation.</p><p>The proposed method has been validated with two evaluation metrics: variance reduction, and running time for training and prediction. It is tested on one outdoor and one indoor real-world dataset. The Okumura-Hata model and three ML-based methods, SVR, random forest, and MLP-ANN, are used for performance comparison. Experimental results demonstrate that CELF achieves larger variance reductions than all the other methods and also estimates the loss image more efficiently than MLP-ANN, the most accurate of the ML baselines.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Chapter 3</head><p>Augmenting CELF: Site-trained Bayesian Modeling and Comparative Analysis</p><p>In this chapter, we augment the CELF work by discussing the Bayesian modeling components, its explainability, and performance robustness under uncertainties<ref type="foot">foot_0</ref>. Chapter 2 describes the idea of a base model and the linear additive modeling of the loss field. The initial set of results has demonstrated CELF's feasibility as a new site-trained channel estimation method. However, the modeling components, such as the base model choice and the weight matrix model, require more comprehensive analysis and evaluation for practical and broad use in the real world. To that end, a different channel base model, TIREM, is presented, and numerical results show that CELF can reduce the test variance by up to 63%. Another spatial multipath model for the weight matrix is also included to compare with the ellipse weight model used in the initial work; results show similar accuracy improvements. Next, we discuss the explainability of CELF in two aspects: model design and model output. The first, model design, describes the spatial loss field modeling and the Bayesian learning method. The second, model output, focuses on why and how loss field estimates can be explained. Finally, two new complex environments, shared spectrum access and a large urban area, are included to validate the robustness of CELF's accuracy and efficiency. To summarize, analytical and numerical analyses are given in this chapter to verify that CELF offers a new type of explainable learning model for data-driven site-trained channel estimation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Introduction</head><p>Efficient and accurate channel estimation is a fundamental requirement for future wireless systems to achieve reliable dynamic spectrum sharing <ref type="bibr">[102]</ref>. Operational spectrum sharing examples include the 3.5 GHz band <ref type="bibr">[112]</ref> and the 6 GHz band <ref type="bibr">[39]</ref>, both of which require periodic channel re-estimation to ensure minimal interference caused by secondary users to incumbent licensees <ref type="bibr">[123,</ref><ref type="bibr">56]</ref>. Such a need imposes new complexities on channel models. First, when a new group of users requests shared spectrum access, co-channel interference from all existing and intended transmissions must be calculated to guarantee sufficient SINRs for all links. Second, as users move, spectrum management systems demand recomputation of the changing co-channel losses to adjust the frequency of operation or transmit powers when needed. Last, current channel models are not well matched to the needs of dynamic spectrum management. Many path loss models experience significant delays in initiation and performance degradation over time due to their reliance on site-specific environmental databases <ref type="bibr">[140,</ref><ref type="bibr">37,</ref><ref type="bibr">31]</ref>. Given the real-time and continuous requirement of spectrum management, new channel models satisfying higher efficiency and accuracy standards must be developed to enable adaptive and reliable spectrum allocation.</p><p>Our initial CELF work <ref type="bibr">[130]</ref> develops a new type of channel learning model which is simultaneously explainable, less computationally complex to train, and more accurate than current ML channel models trained with the same data. CELF formulates modeling errors after an arbitrary channel base model as a linear function of a shadowing loss field. 
This loss field is connected to the underlying wave propagation physics in that it accounts for the physical mechanism of shadowing due to obstacles in the spatial domain, and is viewable as a simple image map. A long history of radio propagation research has applied this linear additive loss modeling to estimate particular objects <ref type="bibr">[68]</ref> or image motion <ref type="bibr">[136]</ref>. However, we are unaware of any work using it as a foundation for a site-agnostic learning-based channel model. As shown in Fig. <ref type="figure">3</ref>.1, the loss field is learned from training measurements via Bayesian linear regression, and training requires less computation than for a general-purpose ML model. Using training measurements allows the model to fit the particular site of deployment. Sensors deployed as part of a radio dynamic zone, or the dynamic spectrum access protocol, can be used to collect these measurements. Training data quantities can be low in comparison with other ML methods. To predict shadowing loss for a new link, the learned loss field is weighted by a weight matrix. Different weight matrix models formulate different multipath environments, e.g., ellipse for reflected multipath <ref type="bibr">[95,</ref><ref type="bibr">136]</ref> and Cassini oval for scattered multipath <ref type="bibr">[90,</ref><ref type="bibr">85]</ref>. We hence incorporate the ellipse and Cassini oval models in CELF for comparison. Numerical results show that both are valid weight matrix models and result in similar accuracy improvements on the test datasets.</p><p>The initial work on CELF appears in Chapter 2. The major additional contributions in comparison to Chapter 2 are as follows:</p><p>&#8226; A hybrid empirical/physical model, TIREM, is included as a different channel base model. 
The analysis verifies that CELF can robustly improve the accuracy of various channel models.</p><p>&#8226; Two spatial multipath models for the weight matrix in CELF are presented to discuss different geometric shapes over which the spatial loss field contributes to the link loss. Numerical results validate both spatial models, which result in similar accuracy improvements.</p><p>&#8226; Two more real-world datasets, one collected in a shared frequency band and the other from a large urban environment, are experimentally analyzed to evaluate CELF's accuracy and efficiency under the new scenarios.</p><p>&#8226; A detailed discussion is provided on the explainability of CELF modeling and the learned loss field. Numerical tests verify that the increase in the loss field near known obstructions is statistically significant in both indoor and outdoor environments, and hence the loss field can be explained by the locations of the obstructions in the area.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">CELF Augmentation</head><p>In this section, we present the CELF model in three parts. First, we describe the idea of a base model and the base models used in this work. Next, we describe how CELF augments the base model for better path loss estimation using a spatial loss field. Finally, we explain how to estimate the loss field from training measurements.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1">Channel Base Model</head><p>CELF predicts the additional path loss compared to a channel base model. The channel base model could be any model described in the related work (Section 2.2). CELF's role is to augment the estimates from the base model by additionally accounting for the natural spatial correlations in the path loss that are not modeled by the base model. To verify the effectiveness of CELF, we explore two different models in this work as the base model: log-distance path loss and TIREM.</p><p>Log-distance path loss model. This model states that the received power estimate in dBm, <formula notation="TeX">\hat{P}(d_l)</formula>, along a link <formula notation="TeX">l = (i, j)</formula> between node i and node j decreases logarithmically with increasing distance <ref type="bibr">[101]</ref>:</p><formula notation="TeX">\hat{P}(d_l) = P_T - \Pi_0 - 10\, n_p \log_{10} \frac{d_l}{\Delta_0}, \tag{3.1}</formula><p>where <formula notation="TeX">P_T</formula> is the transmitted power in dBm, <formula notation="TeX">d_l</formula> is the link distance, <formula notation="TeX">\Pi_0</formula> is a constant specifying the dB loss at a reference distance <formula notation="TeX">\Delta_0</formula>, and the path loss exponent <formula notation="TeX">n_p</formula> indicates the level of environmental clutter.</p><p>Given the same distance <formula notation="TeX">d_l</formula>, the received power measurements vary around the estimate <formula notation="TeX">\hat{P}(d_l)</formula> due to shadow fading and small-scale fading <ref type="bibr">[50]</ref>. As a result, the modeling errors between the measurement <formula notation="TeX">P(d_l)</formula> and the estimate <formula notation="TeX">\hat{P}(d_l)</formula> can be written as:</p><p>where <formula notation="TeX">Z_{l,\text{log-dist.}}</formula> is the modeling error, which consists of independent shadowing loss and small-scale fading loss <ref type="bibr">[136]</ref>.</p><p>TIREM. The second model, TIREM, is an extension of the widely applied Longley-Rice model. It utilizes site-specific terrain features and electromagnetic theory to estimate path loss for frequencies between 1 MHz and 1 THz and distances up to 30 km <ref type="bibr">[37]</ref>. One disadvantage, however, is that it over-predicts shadowing loss, as obstacles are assumed to be infinitely long knife edges <ref type="bibr">[124]</ref>. 
We denote the difference between the received power measurements and the TIREM estimates as:</p><p>where <formula notation="TeX">\hat{P}_{\mathrm{TIREM}}(d_l)</formula> is the estimated power given by TIREM, and <formula notation="TeX">Z_{l,\mathrm{TIREM}}</formula> is the over-prediction error. To simplify the notation used throughout the rest of this work, <formula notation="TeX">Z_l</formula> is used to represent the total modeling error of an arbitrary base model.</p></div>
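<div xmlns="http://www.tei-c.org/ns/1.0"><p>The log-distance base model can be fitted to measurements with ordinary least squares. The sketch below is one possible way to do this and is not necessarily the procedure used in this work: the function name, the reference distance d0 = 1, the synthetic data, and the sign convention of the returned residual (measurement minus estimate) are all assumptions.</p>
<p><![CDATA[
```python
import numpy as np


def fit_log_distance(d, p_rx, p_tx, d0=1.0):
    """Least-squares fit of Pi_0 and n_p in P_hat = P_T - Pi_0 - 10 n_p log10(d / d0)."""
    d = np.asarray(d, dtype=float)
    p_rx = np.asarray(p_rx, dtype=float)
    x = 10.0 * np.log10(d / d0)
    A = np.column_stack([np.ones_like(x), x])   # columns multiply Pi_0 and n_p
    loss = p_tx - p_rx                           # measured path loss in dB
    coef, *_ = np.linalg.lstsq(A, loss, rcond=None)
    pi0, n_p = coef
    p_hat = p_tx - pi0 - n_p * x                 # base-model estimate per link
    return pi0, n_p, p_rx - p_hat                # residual sign is an assumption


# Synthetic, noiseless links generated with Pi_0 = 40 dB and n_p = 3.
d = np.array([1.0, 10.0, 100.0, 1000.0])
p_rx = 30.0 - 40.0 - 10.0 * 3.0 * np.log10(d)
pi0, n_p, residual = fit_log_distance(d, p_rx, p_tx=30.0)
print(pi0, n_p)
```
]]></p>
<p>On real measurements the residual is not zero; it is the per-link modeling error that the loss field is then asked to explain.</p></div>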
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2">Joint Link Model for Loss Field</head><p>The total modeling error <formula notation="TeX">Z_l</formula> is commonly modeled as independent and identically distributed (i.i.d.) across links <ref type="bibr">[15,</ref><ref type="bibr">58,</ref><ref type="bibr">29]</ref>. However, that simplification disagrees with the empirical observation that shadowing losses along two links are correlated due to obstructions, e.g., outdoor buildings and terrain variations, and indoor walls and furniture <ref type="bibr">[52,</ref><ref type="bibr">3,</ref><ref type="bibr">72]</ref>.</p><p>In order to simultaneously model the correlations in shadow fading that exist across multiple link pairs in a network, we use the network shadowing model <ref type="bibr">[92]</ref>. Let <formula notation="TeX">\mathcal{L}</formula> be a set of link pairs in a wireless network, and <formula notation="TeX">L = |\mathcal{L}|</formula>, where <formula notation="TeX">|\cdot|</formula> counts the number of elements in the set. We assume that each link differs in either transmitter or receiver location from the other links in the set <formula notation="TeX">\mathcal{L}</formula>. The network shadowing model describes the joint link modeling error as:</p><formula notation="TeX">z = W p + \varepsilon, \tag{3.4}</formula><p>where <formula notation="TeX">z = [Z_1, Z_2, \ldots, Z_L]^T \in \mathbb{R}^{L \times 1}</formula> is the total modeling error after the channel base model, <formula notation="TeX">W \in \mathbb{R}^{L \times M}</formula> is a weight matrix, <formula notation="TeX">p \in \mathbb{R}^{M \times 1}</formula> is a discretized loss field in dB, and <formula notation="TeX">\varepsilon \in \mathbb{R}^{L \times 1}</formula> is the linear model error. Their details are given below.</p><p>Spatial loss field p. The spatial loss field of <ref type="bibr">[92,</ref><ref type="bibr">3]</ref> characterizes the environment of interest as a Gaussian random field in dB that is isotropic and wide-sense stationary. It has zero mean and an exponentially decaying spatial covariance function:</p><p>where <formula notation="TeX">d_{m,n}</formula> is the Euclidean distance between the centers of pixels m and n, <formula notation="TeX">\sigma_X^2</formula> is the variance of the shadowing loss, and &#948; is a space constant. 
The modeling error <formula notation="TeX">Z_l</formula> on link l is then a weighted sum of the loss field p over the pixels that lie near the link l.</p><p>Weight matrix model for W. A weight matrix model formulates a spatial area near the link which has a non-zero contribution to the variation in modeling errors. In this work, we consider a popular ellipse model <ref type="bibr">[136]</ref> and a Cassini oval model <ref type="bibr">[90]</ref> for the weight matrix W. The rationale is that these two models can theoretically describe reflection- and scattering-dominant modeling errors, respectively <ref type="bibr">[95]</ref>.</p><p>Figure 3.2 (labels: loss field, link, valid pixel): (a) The ellipse model. (b) The Cassini oval model.</p><p>The ellipse model, as shown in Fig. <ref type="figure">3</ref>.2a, considers the two ends i and j of link l as the foci and utilizes a tunable parameter <formula notation="TeX">\lambda_{\mathrm{ellipse}}</formula> to determine the ellipse width. A pixel is viewed as valid if it falls within the ellipse, and the corresponding weight in W will have a nonzero contribution to the shadowing loss of link l. Past studies <ref type="bibr">[92,</ref><ref type="bibr">3,</ref><ref type="bibr">136]</ref> construct the weight as:</p><formula notation="TeX">w_{l,m} = \begin{cases} 1/\sqrt{d_l}, &amp; d_{i,m} + d_{j,m} &lt; d_l + \lambda_{\mathrm{ellipse}}, \\ 0, &amp; \text{otherwise}, \end{cases} \tag{3.6}</formula><p>where <formula notation="TeX">d_{i,m}</formula> and <formula notation="TeX">d_{j,m}</formula> are the distances from the center of pixel m to the two foci i and j, <formula notation="TeX">d_l</formula> is the link distance, and <formula notation="TeX">\lambda_{\mathrm{ellipse}}</formula> is the ellipse width parameter.</p><p>The Cassini oval model, as shown in Fig. <ref type="figure">3</ref>.2b, has the same transmitter and receiver of link l as the foci and uses <formula notation="TeX">\lambda_{\mathrm{Cassini}}</formula> to decide the oval shape. The weights of the Cassini oval model are constructed as:</p><p>The total modeling error z. The total modeling error is the sum of errors from multiple sources: 1) shadow fading, 2) small-scale fading, and 3) measurement error due to thermal noise, which are independent of each other. The distribution for shadow fading, according to the linear additive loss field modeling, is spatially correlated Gaussian in dB. 
We can also assume the measurement error due to thermal noise to be i.i.d. Gaussian in dB. For small-scale fading, Rayleigh and Rician are the common models; nevertheless, the sum of all these errors can be assumed to be approximately Gaussian in dB, according to the Central Limit Theorem <ref type="bibr">[110]</ref>.</p></div>
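<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ellipse weight model described above can be sketched as follows. The sketch assumes 2D coordinates, a weight of 1/sqrt(d_l) for every pixel center inside the ellipse and zero elsewhere, and a strict inequality at the ellipse boundary. The function name, the argument layout, and the example coordinates are hypothetical.</p>
<p><![CDATA[
```python
import numpy as np


def ellipse_weights(tx, rx, pixels, lam):
    """Weight matrix W (L x M) for L links and M pixel centers.

    tx, rx : (L, 2) transmitter and receiver coordinates (the ellipse foci).
    pixels : (M, 2) pixel center coordinates.
    lam    : excess path length that sets the ellipse width.
    """
    tx = np.asarray(tx, dtype=float)
    rx = np.asarray(rx, dtype=float)
    pixels = np.asarray(pixels, dtype=float)
    d_l = np.linalg.norm(tx - rx, axis=1)                                # (L,)
    d_tx = np.linalg.norm(pixels[None, :, :] - tx[:, None, :], axis=2)   # (L, M)
    d_rx = np.linalg.norm(pixels[None, :, :] - rx[:, None, :], axis=2)   # (L, M)
    inside = (d_tx + d_rx) < (d_l[:, None] + lam)                        # valid pixels
    return inside / np.sqrt(d_l)[:, None]


# One 4 m link; the first pixel lies on the link, the second is far away.
W = ellipse_weights(tx=[[0.0, 0.0]], rx=[[4.0, 0.0]],
                    pixels=[[2.0, 0.0], [2.0, 5.0]], lam=0.5)
print(W)
```
]]></p>
<p>Each row of W selects the pixels near one link, so the product of W and a loss field gives the modeled shadowing loss per link. A Cassini oval variant would change only the condition that defines the valid pixels.</p></div>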
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3">Loss Field Learning</head><p>Bayesian linear regression. Given the linear joint link model in (3.4) and the Gaussian loss field prior, we reconstruct the loss field p via Bayesian linear regression. A stable estimator of the loss field, <formula notation="TeX">\hat{p}</formula>, is given as:</p><p>where &#1009; is a regularization constant which includes <formula notation="TeX">\sigma_{\varepsilon}^2</formula>. More details can be found in Section 2.3.3 of Chapter 2.</p></div>
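<div xmlns="http://www.tei-c.org/ns/1.0"><p>For a linear model with a zero-mean Gaussian prior, the standard regularized estimator has the form (W^T W + rho C_p^{-1})^{-1} W^T z. The sketch below implements that textbook form as an illustration; whether it matches the estimator of Section 2.3.3 term by term is an assumption. The covariance prefactor sigma_x2 / delta, the name rho for the regularization constant, and both function names are also assumptions.</p>
<p><![CDATA[
```python
import numpy as np


def exponential_covariance(pixels, sigma_x2, delta):
    """Exponentially decaying prior covariance between pixel centers (assumed form)."""
    pixels = np.asarray(pixels, dtype=float)
    d = np.linalg.norm(pixels[:, None, :] - pixels[None, :, :], axis=2)
    return (sigma_x2 / delta) * np.exp(-d / delta)


def estimate_loss_field(W, z, C_p, rho):
    """Regularized least squares: solve (W^T W + rho C_p^{-1}) p = W^T z."""
    W = np.asarray(W, dtype=float)
    z = np.asarray(z, dtype=float)
    A = W.T @ W + rho * np.linalg.inv(C_p)
    return np.linalg.solve(A, W.T @ z)


# Tiny example: two pixels, two links, identity weights and identity prior.
C_demo = exponential_covariance([[0.0, 0.0], [1.0, 0.0]], sigma_x2=2.0, delta=1.0)
p_hat = estimate_loss_field(np.eye(2), [2.0, 4.0], np.eye(2), rho=1.0)
print(p_hat)
```
]]></p>
<p>A larger rho pulls the estimate toward the zero-mean prior, while a smaller rho trusts the measurements more, which is the balance described for this hyperparameter.</p></div>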
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.4">Explainable Learning Method</head><p>Explainability is a significant feature of CELF, as it provides reasoning for the learning model construction and contextualization for the learning goal, the loss field, both in a human-understandable manner. In contrast, ML techniques, especially deep learning models, are often black-box solutions that are hard to diagnose <ref type="bibr">[53]</ref> and vulnerable to adversarial attacks <ref type="bibr">[43]</ref>. As a result, CELF as an explainable learning approach can greatly enhance engineers' and regulators' trust in the channel model <ref type="bibr">[132]</ref>.</p><figure type="table"><head>Table <ref type="table">3</ref>.1: Specifications for the indoor and three outdoor datasets. Note that the outdoor SLC1 dataset, due to uncalibrated receivers, is treated as 4 different subdatasets.</head><table><row><cell>Dataset</cell><cell>Indoor</cell><cell>SLC1</cell><cell>SLC2</cell><cell>ANTW</cell></row><row><cell>Freq. (MHz)</cell><cell>2443</cell><cell>462.7</cell><cell>3543</cell><cell>868</cell></row><row><cell>Area</cell><cell>17.5 &#215; 15 m&#178;</cell><cell>2.2 &#215; 2.1 km&#178;</cell><cell>2.2 &#215; 2.1 km&#178;</cell><cell>4.1 &#215; 5.6 km&#178;</cell></row><row><cell>Sample Size</cell><cell>9,460</cell><cell>59,323</cell><cell>84,682</cell><cell>162,568</cell></row><row><cell>TX Power</cell><cell>10 mW</cell><cell>1 W</cell><cell>1 W</cell><cell>unknown</cell></row></table></figure><p>CELF embraces explainability in two aspects: model design and model output. For the model design, we first adopt an inherently explainable linear model to describe the additive relation between link modeling errors and the spatial loss field <ref type="bibr">[132]</ref>. Second, a Gaussian random field with an exponential kernel as the loss field prior can be justified by the fact that (1) shadow fading is commonly modeled and experimentally verified as Gaussian in dB <ref type="bibr">[101,</ref><ref type="bibr">50]</ref>, and (2) shadowing loss due to obstructions across space experiences a distance-based decay <ref type="bibr">[3]</ref>. 
For the model output, we show in Section 3.4.2 that the spatial loss field can be explained via the physical mechanism of shadowing due to obstacles and can further be numerically validated when the locations of obstructions in the area are known. Such explainability enables a closer tie between a digital spectrum twin and the real environment it is meant to represent <ref type="bibr">[115]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Real-world Datasets and Evaluation</head><p>In this section, we describe one indoor and three outdoor real-world received power datasets.</p><p>The system-related details of each dataset are provided in Table <ref type="table">3</ref>.1. Commonly used methods for channel estimation, including one empirical and three ML-based methods, are presented next. Finally, we describe two evaluation metrics for assessing the performance of the CELF algorithm.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.1">Datasets</head><p>Indoor Dataset. This dataset <ref type="bibr">[91]</ref> is from an indoor office area of 17.5 &#215; 15 m&#178; (Table <ref type="table">3</ref>.1). With the received power measured 5 times for each link, this indoor dataset has in total 44 &#215; 43 &#215; 5 = 9460 measurements, as described in <ref type="bibr">[94]</ref>.</p><p>SLC1 Outdoor Dataset. The first set of measurements <ref type="bibr">[80]</ref> is collected from a 2.2 &#215; 2.1 km&#178; university campus area in Salt Lake City (SLC). A portable commercial radio is used as the transmitter, and the receivers are 25 SDR nodes with omnidirectional antennas deployed on POWDER, an open wireless experimental testbed <ref type="bibr">[22]</ref>. The carrier frequency is 462.7 MHz and the transmit power is 1 W. The receivers are one of 4 types, Rooftop, Fixed, Mobile, and Dense, which differ in radio, antenna, and placement. Table <ref type="table">3</ref>.2 gives specifications for each receiver type. Fig. <ref type="figure">3</ref>.3b and 3.3c show the GPS coordinates of the transmitter and all the receivers on the campus map. As the four types of receivers are heterogeneous and uncalibrated, this work treats the data collected by each type as a separate dataset.</p><p>SLC2 Outdoor Dataset. The second dataset <ref type="bibr">[116]</ref> is collected on the same 2.2 &#215; 2.1 km&#178; University of Utah campus in SLC. However, it differs from the SLC1 outdoor dataset in that the center frequency is 3534 MHz, which is in the CBRS band for shared spectrum use. Including this dataset helps evaluate how precisely and efficiently CELF can perform in a real-world dynamic spectrum sharing scenario. According to <ref type="bibr">[116]</ref>, the SLC2 outdoor dataset uses 5 Dense nodes on POWDER to transmit a Continuous Wave (CW) signal at 1 W transmit power, while a portable SDR receiver is carried on foot and by car for sample collection. The transmitter and receiver locations are shown in Fig. 
<ref type="figure">3</ref>.3d.</p><p>ANTW Outdoor Dataset. The last outdoor dataset <ref type="bibr">[2]</ref> is the largest in terms of data size and coverage area. The measurements are taken in the city center of Antwerp (ANTW), Belgium, by stationary cell towers. The mobile transmitters are carried by Antwerp's postal service vehicles while transmitting LoRaWAN messages at 868 MHz. This dataset is added to evaluate CELF in a large urban area. Fig. <ref type="figure">3</ref>.3e presents the 11 stationary receivers and the GPS coordinates of the mobile transmitter.</p><p>Train-Test Split. Each dataset is split into non-overlapping parts for loss field estimation (training) and shadowing loss prediction (testing). We use the link index as the criterion to partition the datasets. Each dataset is split with a 7:3 ratio, and each data point is randomly assigned to training or testing.</p></div>
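<div xmlns="http://www.tei-c.org/ns/1.0"><p>A minimal sketch of a random, non-overlapping 7:3 split over link indices. It assumes that every measurement carries an integer link index from 0 to num_links - 1 and that a shuffled permutation is an acceptable way to assign indices; the function name and the seed are illustrative.</p>
<p><![CDATA[
```python
import numpy as np


def split_by_link_index(num_links, train_fraction=0.7, seed=0):
    """Randomly assign link indices to disjoint training and testing sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_links)               # shuffled link indices
    n_train = int(round(train_fraction * num_links))
    return np.sort(order[:n_train]), np.sort(order[n_train:])


train_idx, test_idx = split_by_link_index(10)
print(train_idx, test_idx)
```
]]></p>
<p>Fixing the seed keeps the partition identical across models, which is what makes the later comparisons on the same partitioned datasets possible.</p></div>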
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.2">ML Baseline Methods</head><p>We adopt three general-purpose ML models, Random Forest, SVR, and MLP-ANN, in this work for performance comparisons. The rationale behind such choices is that they require neither site-specific terrain information nor large-scale datasets, unlike complex deep learning models such as RadioUNet <ref type="bibr">[73]</ref> and PL-GAN <ref type="bibr">[77]</ref>. They have also been widely used as benchmarks for path loss prediction <ref type="bibr">[50,</ref><ref type="bibr">134,</ref><ref type="bibr">142,</ref><ref type="bibr">137]</ref>.</p><p>&#8226; Random Forest <ref type="bibr">[88]</ref>: It is an ensemble learning approach that first constructs multiple decision trees on random subsets of the dataset and then combines them to improve the accuracy and robustness of the model.</p><p>&#8226; SVR <ref type="bibr">[82]</ref>: It is a variation of support vector machines used for regression. Unlike traditional squared error minimization, SVR fits a line or a curve by minimizing the error within a margin of tolerance.</p><p>&#8226; MLP-ANN <ref type="bibr">[137]</ref>: It is a feedforward neural network that consists of an input layer, an output layer, and multiple hidden layers. It is trained iteratively using algorithms like stochastic gradient descent for squared error minimization.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.3">Evaluation Metrics</head><p>We adopt two evaluation metrics, variance reduction and running time, to quantify the performance of the tested algorithms. Variance reduction is defined as the percentage decrease of the modeling error variance, i.e.,</p><formula notation="TeX">\text{Variance reduction} = \frac{\sigma_{z_{\mathcal{T}}}^2 - \sigma_{\mathrm{CELF}}^2}{\sigma_{z_{\mathcal{T}}}^2} \times 100\%, \tag{3.9}</formula><p>where <formula notation="TeX">\sigma_{z_{\mathcal{T}}}^2</formula> is the modeling error variance of a dataset <formula notation="TeX">\mathcal{T}</formula> after the first-order channel estimation in Section 2.3.1, and <formula notation="TeX">\sigma_{\mathrm{CELF}}^2</formula> is the final variance after applying the learned loss field via CELF for shadowing prediction, which is computed as the MSE:</p><formula notation="TeX">\sigma_{\mathrm{CELF}}^2 = \frac{1}{N_{\mathcal{T}}} \left\| z_{\mathcal{T}} - W_{\mathcal{T}} \hat{p} \right\|^2, \tag{3.10}</formula><p>where <formula notation="TeX">\hat{p}</formula> is the attenuation image learned in Section 2.3.3, <formula notation="TeX">W_{\mathcal{T}}</formula> is the weight matrix model, and <formula notation="TeX">N_{\mathcal{T}} = |\mathcal{T}|</formula> is the size of the dataset <formula notation="TeX">\mathcal{T}</formula>.</p><p>Note that the variance reduction metric in (3.9) is treated in this work as a measure of accuracy although, by definition, it describes precision. The rationale is that 1) the ground truth for the loss field p is unavailable, so we cannot directly quantify its accuracy, and 2) the datasets for training and testing are assumed to be sampled from identical distributions, and thus no sample bias is involved.</p><p>The other metric, running time, measures the computational efficiency of the proposed CELF algorithm. It is crucial in time-sensitive applications such as real-time spectrum access and management systems <ref type="bibr">[36]</ref>. This metric includes the execution time for loss field learning and shadowing loss prediction. Note that "learning" and "training", and likewise "prediction" and "testing", are used interchangeably when comparing CELF to the selected approaches in Section 2.5.</p><p>We take the following three steps to ensure that the results are comparable. First, all the models are trained and tested on the same partitioned datasets. Second, the inputs of the ML models are the 2D coordinates of transmitters and receivers, to be consistent with CELF. 
Lastly, all the results are obtained by running the algorithm on the same Linux system with a 16-core Intel Xeon Gold 6130 processor. </p></div>
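<div xmlns="http://www.tei-c.org/ns/1.0"><p>The testing time per link can be measured with a wall-clock timer. The sketch below is an illustration only: it assumes that prediction is a single call taking the test weight matrix, takes the best of a few repetitions to reduce timer noise, and reports microseconds per link. The function name, the repeat count, and the random test data are hypothetical.</p>
<p><![CDATA[
```python
import time

import numpy as np


def time_per_link_us(predict, W_test, repeats=5):
    """Best-of-N wall-clock prediction time, in microseconds per link."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict(W_test)                         # shadowing prediction for all test links
        best = min(best, time.perf_counter() - start)
    return 1e6 * best / W_test.shape[0]


# Hypothetical test set: 1000 links, 50 pixels, and a random loss field.
rng = np.random.default_rng(0)
W_test = rng.random((1000, 50))
p_hat = rng.standard_normal(50)
us_per_link = time_per_link_us(lambda W: W @ p_hat, W_test)
print(us_per_link)
```
]]></p>
<p>Running every model through the same timing helper on the same machine keeps the running time comparison consistent with the comparability steps listed above.</p></div>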
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Results</head><p>Experimental results of the proposed CELF algorithm are given in this section. We first present three example loss field images learned from the indoor and SLC1 outdoor datasets in Section 2.4.1. We then compare CELF with the chosen methods via the variance reduction and running time metrics from Section 2.4.3. We further explore CELF's accuracy improvement and robustness when TIREM is used as the channel base model. A discussion of the ellipse and Cassini oval weight matrix models is presented next. Finally, we discuss the impact of the hyperparameters on accuracy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.1">Example Loss Fields</head><p>This subsection presents two example loss field images using the log-distance path loss model as the channel base model in Section 2.3.1 and the ellipse weight matrix model in Section 3.2.2. We choose the Rooftop dataset because its receivers, deployed high above the ground, give better coverage of the campus area. We select both indoor and outdoor datasets to discuss CELF's practical use in various types of environments. The image boundaries are the same as Fig. <ref type="figure">2</ref>.4c and Fig. <ref type="figure">2</ref>.4a.</p><p>The statistical analysis follows four steps. First, we determine the path loss exponent. Second, we tune hyperparameters for CELF and interpret their values. The model hyperparameters are selected via 5-fold cross-validation. This procedure randomly samples 1/5 of the training dataset for hyperparameter validation, which helps prevent overfitting. Their descriptions and values are given in Table <ref type="table">3</ref>.3. The first hyperparameter, &#982; p , denotes the attenuation image resolution and impacts both computation time and prediction accuracy. The second, the shadowing variance ratio $\sigma^2_X / \sigma^2_Z$, represents the contribution of shadowing loss to the total modeling error. In comparison to outdoor environments, indoor surroundings have more multipath components, as indoor obstacles that obstruct radio wave propagation are relatively uniformly placed throughout the area. Therefore, indoor environments have more significant small-scale fading <ref type="bibr">[3]</ref>. The third, the space constant &#982;, indicates the obstruction size in the environment <ref type="bibr">[3]</ref>. We expect obstacles to be smaller in the indoor area. Indeed, &#982; for the SLC1-Rooftop dataset is 35 m, larger than the 2.5 m for the indoor dataset. 
The next hyperparameter &#969; is introduced by the ellipse weight model to select valid pixels for each link. It is determined by the area size and the pixel width. The last hyperparameter &#1009; balances the loss field prior and the data from the area of interest. We notice that &#1009; of the indoor dataset is about 100 times larger than that of the outdoor case. This can be explained by the $1/\sqrt{d_l}$ weight in (2.5). The path lengths $d_l$ of the indoor measurements are about 100 times smaller, which makes &#1009; about 100 times larger to balance the $1/d_l$ discrepancy in (2.10).</p><p>Next, we derive the weight matrix and estimate the loss image via Bayesian linear regression. Fig. <ref type="figure">3</ref>.4 shows the two trained loss images, with the site maps as references. They have spatial loss ranges of &#8722;1.25 to 1.00 dB and &#8722;24 to 24 dB, respectively. Higher losses can be seen at higher obstructions, such as near the cubicle walls in Fig. <ref type="figure">3</ref>.4a and the marked rectangle areas in Fig. <ref type="figure">3</ref>.4b.</p><p>The red oval area of Fig. <ref type="figure">3</ref>.4b highlights a mismatch between the two estimated high-loss regions and the single high obstruction in the site map. The loss image estimate is in fact more accurate because the terrain profile is outdated; a new building recently constructed at the star (&#9733;) location was not in the database used to generate the left image in Fig. <ref type="figure">3</ref>.4b. Note that CELF does not use any terrain or building information. Collecting and maintaining a site-specific terrain dataset can be time-consuming and expensive, whereas CELF uses channel loss measurements for accurate and cost-effective loss field estimation.</p><p>The last step of loss field learning is to quantitatively assess the training accuracy of the learned loss image via variance reduction. 
For the Rooftop training dataset, the modeling error variance after the log-distance path loss model is 58.4 dB&#178;. The shadowing loss estimates via CELF decrease the channel loss variance to 30.7 dB&#178;, which is 47.4% less than that of the base model. For the indoor training dataset, the modeling error variance reduces from 19.8 dB&#178; to 10.1 dB&#178;, which corresponds to a 49.3% reduction.</p></div>
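<div xmlns="http://www.tei-c.org/ns/1.0"><p>The weight matrix and regression steps above can be illustrated with a small synthetic sketch. It assumes an ellipse weight that equals the inverse square root of the link length for pixels inside an ellipse whose foci are the transmitter and receiver, a zero-mean Gaussian prior with an exponential spatial covariance, and a standard maximum a posteriori solution. The names lam, delta, prior_var, and noise_var, the grid, and all numeric values are hypothetical stand-ins for the hyperparameters discussed above, not the values in Table 3.3.</p>

```python
import numpy as np

def ellipse_weight_row(tx, rx, pixels, lam):
    """Weight 1/sqrt(d) for pixels inside the ellipse with foci tx and rx, else 0."""
    d = np.linalg.norm(tx - rx)
    d_via_pixel = (np.linalg.norm(pixels - tx, axis=1)
                   + np.linalg.norm(pixels - rx, axis=1))
    inside = (d + lam) > d_via_pixel
    return inside.astype(float) / np.sqrt(d)

def learn_loss_field(W, z, pixels, delta, prior_var, noise_var):
    """MAP loss image for z = W p + noise with a zero-mean Gaussian prior on p."""
    gaps = np.linalg.norm(pixels[:, None, :] - pixels[None, :, :], axis=2)
    C = prior_var * np.exp(-gaps / delta)  # assumed exponential spatial covariance
    lhs = W.T @ W + noise_var * np.linalg.inv(C)
    return np.linalg.solve(lhs, W.T @ z)

# Hypothetical 10 m x 10 m area with 1 m pixels and one 3 dB obstruction block.
rng = np.random.default_rng(0)
centers = np.arange(0.5, 10.0, 1.0)
pixels = np.array([(x, y) for x in centers for y in centers])
p_true = np.where(1.6 > np.abs(pixels - 5.0).max(axis=1), 3.0, 0.0)

# Random links and their synthetic modeling errors.
links = rng.uniform(0.0, 10.0, size=(300, 2, 2))
W = np.vstack([ellipse_weight_row(tx, rx, pixels, lam=0.5) for tx, rx in links])
z = W @ p_true + rng.normal(0.0, 0.5, size=len(links))

p_hat = learn_loss_field(W, z, pixels, delta=2.0, prior_var=1.0, noise_var=0.25)
residual = z - W @ p_hat
reduction = 100.0 * (1.0 - np.mean(residual ** 2) / np.mean(z ** 2))
print(f"training error power reduction: {reduction:.1f}%")
```

<p>Because the all-zero image is always a feasible solution, the regularized fit can never leave more residual power than the raw modeling errors, which is why the training reduction in this sketch is positive.</p></div>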
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.2">Loss Fields' Explainability</head><p>The estimated images are called shadowing loss fields as they can be explained as site maps of additional radio shadowing due to obstacles in the environment. As an example, consider the loss image of the indoor office in Fig. <ref type="figure">3</ref>.4a. We can see from the photo of the area that desks, computers, and bookcases are generally positioned close to the cubicle walls, and our site data includes the wall locations. We can see in Fig. <ref type="figure">3</ref>.4a that the estimated loss field is lower in the middle of each cubicle and higher close to the cubicle walls. Similarly, the vertical corridor region at x &#8776; 3.2 m experiences lower loss than the inside of the cubicles. It is intuitively clear that the shadowing loss field can be explained by the locations of the obstructions in the area.</p><p>We further quantify this argument by showing that the increase in the loss field near known obstructions is, in fact, statistically significant in both our indoor and outdoor environments. We apply the two-sample t-test to verify numerically that the loss field is higher near known obstructions. Let the shadowing loss field near the known obstructions be $p_O$ and the field values at the remaining positions be $p_R$. Our hypotheses to test are:</p><p>where $\mu_{p_O}$ and $\mu_{p_R}$ are the sample means of the two populations. To identify obstructions in the SLC1-Rooftop building map, we use any location where the building height is over 30 m, which is the average antenna height of the Rooftop receivers. For the indoor set, we use the cubicle walls as the obstruction locations. Note that the obstruction locations include not only the exact coordinates, but also any neighboring pixel within 3% of the field width. This distance accounts for any potential error between our "ground truth" obstruction location and the location of high attenuation in the loss field estimate. 
This 3% of the field width translates to 1.4 and 2.4 times the pixel width for the indoor and SLC1-Rooftop cases, respectively. The two-sample t-tests for the loss field estimates shown in Fig. <ref type="figure">3</ref>.4a and Fig. <ref type="figure">3</ref>.4b result in p-values of $8 \times 10^{-133}$ and $3 \times 10^{-3}$, respectively. Therefore, we reject $H_0$ in both cases at a significance level of 0.01. We can further conclude that loss field values are in fact statistically higher near actual attenuating obstructions in the environment. This validates that the loss field learned via the proposed CELF is explainable in part by the locations of significant obstructions in the area, providing a key feature to its users.</p></div>
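<div xmlns="http://www.tei-c.org/ns/1.0"><p>The hypothesis test above can be sketched with the standard library alone. The sketch assumes a one-sided alternative (the mean near obstructions is larger), uses the unequal-variance (Welch) form of the statistic, and approximates the p-value with a normal distribution, which is reasonable only for large samples. The text does not state which variant of the two-sample t-test was used, so these choices, the function name, and the synthetic pixel values are assumptions.</p>

```python
import math
import random
from statistics import NormalDist, fmean, variance

def welch_t_one_sided(near, rest):
    """Welch t statistic and a large-sample p-value for H1: mean(near) > mean(rest)."""
    std_err = math.sqrt(variance(near) / len(near) + variance(rest) / len(rest))
    t_stat = (fmean(near) - fmean(rest)) / std_err
    p_value = 1.0 - NormalDist().cdf(t_stat)  # normal approximation, large samples
    return t_stat, p_value

# Synthetic loss-field pixels (hypothetical): pixels near obstructions are
# drawn with a 1 dB higher mean than the remaining pixels.
random.seed(0)
p_near = [random.gauss(1.0, 1.0) for _ in range(200)]
p_rest = [random.gauss(0.0, 1.0) for _ in range(800)]

t_stat, p_value = welch_t_one_sided(p_near, p_rest)
reject_h0 = 0.01 > p_value
print(f"t = {t_stat:.2f}, reject H0 at the 0.01 level: {reject_h0}")
```

<p>For small pixel counts, an exact Student t distribution should replace the normal approximation used here.</p></div>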
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.3">Accuracy Analysis</head><p>Upon obtaining the loss field, we evaluate CELF's performance on the test datasets. The first evaluation is an accuracy analysis using the variance reduction metric. Fig. <ref type="figure">3</ref>.5 shows the modeling error variance reductions on the indoor and outdoor test datasets using three ML methods and CELF. It can be seen from Fig. <ref type="figure">3</ref>.5 that all ML methods lower the modeling error variance to a certain degree. MLP-ANN gives the largest variance reduction among the three compared ML-based methods. However, CELF outperforms all the ML models across the test datasets. Take the indoor dataset, for instance: CELF achieves a 40.0% variance reduction, which is higher than MLP-ANN's 32.1%. It can be further seen that CELF achieves its largest variance reduction on the SLC2 outdoor dataset, which validates CELF for channel modeling in a real-world spectrum sharing scenario. Overall, these results show that the CELF algorithm outperforms the three ML methods in terms of variance reduction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.4">Efficiency Analysis</head><p>We compare the training and testing efficiency of the methods via running time. The results are shown in Table <ref type="table">3</ref>.4. First, MLP-ANN, among the remaining methods, is the most efficient for shadowing loss prediction but the most computationally expensive for training.</p><p>Second, SVR is the slowest model for testing, except on the indoor dataset. Last, CELF is approximately 3 times faster than MLP-ANN for shadowing loss field learning. As a result, it can update the model with new measurements or learn the spatial loss of a new environment at much lower computational cost.</p><p>Unlike the training time, which is measured over the entire training set, the prediction time is given in &#181;s/link. This measure directly describes, given the learned shadowing loss field, how efficiently CELF predicts channel loss for a single unseen link. We can see from Table <ref type="table">3</ref>.4 that CELF needs on average only 76 &#181;s for one link prediction. However, it is slower than MLP-ANN across all the datasets. This is due to the time-expensive weight matrix computation for each link. Optimizing the weight model to improve prediction efficiency remains future work.</p></div>
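<div xmlns="http://www.tei-c.org/ns/1.0"><p>A per-link prediction time in microseconds can be measured along the following lines. This is only a sketch of the measurement idea: the helper name, the best-of-several-repeats policy, and the placeholder log-distance predictor are assumptions, and the timing procedure actually used for Table 3.4 is not described in the text.</p>

```python
import math
import time

def microseconds_per_link(predict, links, repeats=5):
    """Best-of-N wall-clock time for predicting every link, in microseconds per link."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for link in links:
            predict(link)
        best = min(best, time.perf_counter() - start)
    return 1e6 * best / len(links)

def log_distance_loss(link, exponent=3.0, ref_loss_db=40.0):
    """Placeholder predictor (hypothetical parameters): log-distance path loss in dB."""
    (tx_x, tx_y), (rx_x, rx_y) = link
    distance = max(math.hypot(tx_x - rx_x, tx_y - rx_y), 1.0)
    return ref_loss_db + 10.0 * exponent * math.log10(distance)

links = [((0.0, 0.0), (float(i), 2.0 * i)) for i in range(1, 1001)]
us_per_link = microseconds_per_link(log_distance_loss, links)
print(f"{us_per_link:.2f} microseconds per link")
```

<p>Reporting the fastest of several repeats reduces the influence of unrelated system load, which matters when the per-link cost is only tens of microseconds.</p></div>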
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.5">TIREM Enhancement</head><p>We use TIREM, a hybrid empirical/physical model, as another channel base model to discuss whether CELF can robustly enhance the channel estimation performance of a base model that requires no measurements from the area of deployment. A known problem of TIREM is its over-prediction of shadowing loss, as it does not consider signal reflection and diffraction around obstacles <ref type="bibr">[124]</ref>. In this section, we learn a new loss field from TIREM's modeling errors and present CELF's accuracy improvement. The value at each pixel presents the loss estimate given by TIREM. We can see from Fig. <ref type="figure">3</ref>.6 several rays radiating outward from the transmitter due to the LOS paths at different angles. However, regions at a -270 degree angle relative to the transmitter, e.g., at (900, 1200), show a sharp decline in channel loss. This corresponds to TIREM's lack of consideration of signal reflection and diffraction.</p><table><row role="label"><cell>Dataset</cell><cell>Size</cell><cell>Log-distance path loss: Modeling Error Var (dB&#178;)</cell><cell>Log-distance path loss: CELF Var (dB&#178;) / Reduction (%)</cell><cell>TIREM: Modeling Error Var (dB&#178;)</cell><cell>TIREM: CELF Var (dB&#178;) / Reduction (%)</cell></row><row><cell>SLC1-Rooftop</cell><cell>3,935</cell><cell>58.6</cell><cell>33.8 (42.3%)</cell><cell>176.2</cell><cell>118.0 (33.1%)</cell></row><row><cell>SLC1-Fixed</cell><cell>7,276</cell><cell>60.0</cell><cell>26.1 (56.5%)</cell><cell>209.4</cell><cell>129.6 (38.1%)</cell></row><row><cell>SLC1-Mobile</cell><cell>2,607</cell><cell>40.1</cell><cell>29.3 (27.1%)</cell><cell>175.6</cell><cell>135.2 (23.0%)</cell></row><row><cell>SLC1-Dense</cell><cell>3,981</cell><cell>26.0</cell><cell>14.0 (46.0%)</cell><cell>163.7</cell><cell>108.9 (33.5%)</cell></row><row><cell>SLC2</cell><cell>25,405</cell><cell>67.6</cell><cell>22.0 (67.5%)</cell><cell>166.8</cell><cell>65.6 (60.7%)</cell></row><row><cell>ANTW</cell><cell>48,771</cell><cell>34.3</cell><cell>18.5 (46.1%)</cell><cell>237.7</cell><cell>87.8 (63.1%)</cell></row></table><p>Table <ref type="table">3</ref>.5: The modeling error variances of TIREM and the log-distance path loss as the base models and variance reductions on outdoor test datasets via CELF.</p><p>Table <ref type="table">3</ref>.5 shows the data size, the modeling error variances of the two base models, and the variance reduction results, all on the outdoor test datasets. The indoor dataset is not included as the terrain features are unavailable for TIREM. 
First, we can see higher variances using the TIREM base model. This is because the terrain profile available for TIREM is outdated and TIREM does not rely on measurements from the area of interest. Therefore, TIREM shows larger variations in the modeling errors than the fitting-based log-distance path loss model. Second, CELF decreases the variances across all the datasets, by up to 67.5% for the log-distance path loss model and up to 63.1% for TIREM. Third, we observe smaller variance reductions using TIREM as the base model for all test datasets except ANTW. A possible reason is that, unlike the log-distance path loss model, TIREM does characterize shadowing loss but has the over-prediction problem. As a result, the modeling errors in TIREM contain less shadowing loss and are not as highly spatially correlated. In summary, these results verify that CELF can robustly improve different base models using the explainable shadowing loss field.</p></div>
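<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a worked check, the reduction percentages in Table 3.5 can be recomputed from the rounded variances printed in the same table. The sketch below assumes the reduction is the relative drop from the modeling error variance to the CELF variance; the variable names are illustrative. Because the printed variances are rounded to one decimal place, small deviations from the printed percentages are expected.</p>

```python
# Each entry: (modeling error variance, CELF variance, reported reduction in %)
# for the log-distance base model and then the TIREM base model, from Table 3.5.
table_3_5 = {
    "SLC1-Rooftop": [(58.6, 33.8, 42.3), (176.2, 118.0, 33.1)],
    "SLC1-Fixed": [(60.0, 26.1, 56.5), (209.4, 129.6, 38.1)],
    "SLC1-Mobile": [(40.1, 29.3, 27.1), (175.6, 135.2, 23.0)],
    "SLC1-Dense": [(26.0, 14.0, 46.0), (163.7, 108.9, 33.5)],
    "SLC2": [(67.6, 22.0, 67.5), (166.8, 65.6, 60.7)],
    "ANTW": [(34.3, 18.5, 46.1), (237.7, 87.8, 63.1)],
}

def reduction_percent(base_var, celf_var):
    """Relative variance decrease in percent."""
    return 100.0 * (base_var - celf_var) / base_var

max_gap = 0.0
for dataset, rows in table_3_5.items():
    for base_model, (base_var, celf_var, reported) in zip(("log-distance", "TIREM"), rows):
        recomputed = reduction_percent(base_var, celf_var)
        max_gap = max(max_gap, abs(recomputed - reported))
        print(f"{dataset:13s} {base_model:12s} {recomputed:5.1f}% (reported {reported}%)")

print(f"largest deviation: {max_gap:.2f} percentage points")
```

<p>All recomputed values agree with the printed ones to within about 0.2 percentage points, which is consistent with rounding of the tabulated variances.</p></div>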
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.6">Impact of Weight Matrix Models</head><p>Different weight matrix models formulate distinct spatial patterns that contribute to the variations in channel losses due to multipath propagation. That is, obstacles in an environment, e.g., outdoor buildings and indoor walls, can reflect, diffract, or scatter a transmitted signal wave, which causes the channel loss to be dependent on spatial locations in the field <ref type="bibr">[50]</ref>. Here we compare the Cassini oval weight model to the ellipse and discuss their impact on CELF's accuracy improvement across datasets.</p><p>We first present the loss fields learned from the SLC1-Rooftop dataset with the Cassini oval weight model. As seen in Fig. <ref type="figure">3</ref>.7, both loss fields share the same spatial loss range of &#8722;24 to 24 dB. Comparing the marked rectangle areas between the obstruction map in Fig. <ref type="figure">3</ref>.7b and the loss field by the Cassini oval in Fig. <ref type="figure">3</ref>.7c, we can still see the correlation, i.e., higher losses at higher buildings.</p><p>A new white rectangle region in Fig. <ref type="figure">3</ref>.7a and Fig. <ref type="figure">3</ref>.7c highlights a difference in the learned spatial loss given by the weight models. It can be clearly seen that the loss field via the Cassini oval has a higher correlation with the reference map, which demonstrates the potential of using Cassini ovals as the weight matrix model.</p><p>We next describe the accuracy improvement results on the test datasets in Fig. <ref type="figure">3</ref>.8. It can be seen that CELF with the Cassini oval model performs better on the SLC1-Rooftop, SLC1-Mobile, SLC1-Dense, and ANTW test datasets. Counterintuitively, CELF with the Cassini oval reduces more variance in the SLC1-Dense dataset but less in the SLC2 dataset, even though both are collected using the same POWDER nodes in the same geographic area. 
This could be due to the coverage difference between the datasets or the different carrier frequencies. Despite the difference, CELF with the two weight models results in similar variance reduction.</p><p>To summarize, we find no systematic improvement in the Cassini oval model compared to the ellipse model. However, there may be environments and deployment types in which the Cassini oval is a better fit. Regardless, we can report that both the ellipse and Cassini oval models are valid for the weight matrix and can lower the test variance along with CELF.</p></div>
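<div xmlns="http://www.tei-c.org/ns/1.0"><p>The geometric difference between the two pixel-selection shapes can be illustrated as follows. An ellipse bounds the sum of the distances from a pixel to the two link endpoints, whereas a Cassini oval bounds their product. How the thesis parameterizes the Cassini oval is not shown in this section, so the parameter b, the excess-length parameter lam, and the coordinates below are hypothetical.</p>

```python
import math

def in_ellipse(pixel, tx, rx, lam):
    """True if the path tx -> pixel -> rx exceeds the direct path by less than lam."""
    return math.dist(tx, rx) + lam > math.dist(pixel, tx) + math.dist(pixel, rx)

def in_cassini_oval(pixel, tx, rx, b):
    """True if the product of the distances to tx and rx is below b squared."""
    return b * b > math.dist(pixel, tx) * math.dist(pixel, rx)

tx, rx = (0.0, 0.0), (10.0, 0.0)  # foci 10 m apart, so the half-separation is 5 m
midpoint = (5.0, 0.0)
near_tx = (0.5, 0.0)

# The ellipse always contains the whole direct path, including the midpoint.
print(in_ellipse(midpoint, tx, rx, lam=0.5))
# With b below the half-separation, the Cassini oval splits into two lobes
# around the endpoints and excludes the midpoint.
print(in_cassini_oval(midpoint, tx, rx, b=4.0))
print(in_cassini_oval(near_tx, tx, rx, b=4.0))
# With b above the half-separation, it is a single closed curve again.
print(in_cassini_oval(midpoint, tx, rx, b=6.0))
```

<p>Depending on b, a Cassini oval can therefore emphasize obstructions close to the link endpoints more strongly than an ellipse does, which is one possible reason the two models yield different loss fields in some regions.</p></div>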
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5">Summary</head><p>This study augments CELF by discussing the Bayesian modeling components, its explainability, and performance robustness under uncertainties. CELF models the total modeling error of a channel base model as a discretized linear model of a spatial loss field and employs Bayesian linear regression for loss field learning. The learned spatial loss field is then used to predict additional loss for any new link within a deployment area.</p><p>CELF's performance is validated via one indoor and three outdoor real-world datasets which vary across spatial scale, radio frequency, and hardware types. Experimental results demonstrate that, in comparison to three ML-based methods (SVR, Random Forest, and MLP-ANN), CELF achieves larger variance reductions than all the other methods and can also estimate the loss field more efficiently than the most accurate of them, the MLP-ANN model. Two-sample t-test results verify a statistically higher loss field at locations near known attenuating obstructions in the area, which validates the explainability feature of CELF.</p><p>CELF is further tested on a different channel base model, TIREM, and a different weight matrix model, the Cassini oval, for a comprehensive discussion. Numerical results show that, with TIREM, CELF can robustly reduce the test modeling error variance by up to 63%, and that, with the Cassini oval, CELF shows accuracy improvement similar to that of the ellipse weight model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Chapter 4</head><p>FDMonitor: Full Duplex Monitoring for Software-Defined Radio Platforms</p><p>This chapter addresses two further interference management challenges in spectrum sharing environments: real-time spectrum monitoring for spectrum use awareness and closed-loop control for stopping transmissions in violation. Driven by the demand for spectrum and flexibility, SDR platforms have been widely deployed across academia and industry for next-generation wireless experimental research. However, SDR platforms, despite being an enabler of spectrum and infrastructure sharing, must ensure that their transmitted signals comply with spectrum rules. Violations may occur because of malware, misconfiguration, bugs in software, or RF frontend nonlinearities. We present FDMonitor, a full-duplex monitoring system attached between a transmitter's power amplifier and its antenna to monitor and control each SDR in the Platform for Open Wireless Data-driven Experimental Research (POWDER), an open city-scale SDR-based testbed.</p><p>FDMonitor uses a bidirectional coupler, a two-port receiver, and a new source separation algorithm to simultaneously and adaptively estimate the transmitted signal and the signal incident on the antenna. FDMonitor has been running on POWDER since 2021, monitoring 19 SDR platforms accessible by outside experimenters. Its closed-loop feature sends alerts in real time whenever a violation is observed, and automatically turns off the SDR as necessary.</p><p>Our experimental results show that FDMonitor accurately separates signals across a range of critical RF parameters. We further validate the system-wide performance of FDMonitor with 27 months of observation. Over this period, it achieves a positive predictive value of 95%, with a total of 45 false alerts. Beyond its use on POWDER, FDMonitor, as a novel spectrum policy compliance solution, can be a key enabler of more dynamic sharing applications.</p><p>Full access gives users complete control. 
It further allows complete privacy of users' (potentially) proprietary waveforms and software stacks, which can be commercial users' intellectual property. Our monitoring solution therefore cannot require access to user software.</p><p>Additionally, software-based monitoring is insufficient to ensure compliance. Power Amplifiers (PAs) and RF hardware in general have nonlinearities that can induce spurious emissions that violate spectrum rules <ref type="bibr">[125]</ref>. The nonlinearities are difficult to characterize perfectly. As a result, the exact analog signal transmitted from the antenna is largely unknown. Further, monitoring only in software opens the door to attackers that could evade spectrum violation detection.</p><p>RF-based spectrum monitoring. For these reasons, this work proposes RF-based spectrum monitoring instead of software-based monitoring for full spectrum awareness. We describe our monitoring system as a Full-Duplex Monitoring system (FDMonitor). The implementation of FDMonitor is available in <ref type="bibr">[129]</ref>. We use the term full-duplex to emphasize that it continuously and simultaneously estimates both the signal transmitted by the SDR and external signals incident to the SDR's antenna. As a result, FDMonitor simultaneously provides two critical functions to POWDER operators:</p><p>&#8226; User monitoring: detect any violation of spectrum rules from transmitters in the testbed to ensure compliance.</p><p>&#8226; Environment monitoring: track potential interference from the environment to POWDER experimenters.</p><p>Being the source of interference could harm POWDER's relationships with other (licensed) wireless operators. In fact, POWDER has periodically received inquiries from licensed operators about interference they observe. 
Our monitoring datasets have been critical to demonstrate, as forensic evidence, that our platform was not the cause.</p><p>However, FDMonitor has to address a critical and significant challenge posed by co-located transmitted signals, as shown in Fig. 4.1. POWDER's base station antennas are deployed on cellular towers by a tower provider. The tower space is commonly leased to multiple operators, and antennas used by other Mobile Network Operators (MNOs) may transmit at high power (e.g., 50 W) on the same tower. Since an antenna is a two-way device, some of a co-located MNO's signal impinges on the POWDER SDR's antenna. If we directly cable a spectrum monitor to the RF line before the antenna, it records both transmitted and incident signals, and cannot distinguish them. In this case, the monitor would wrongly conclude that the user is transmitting in the band owned by the MNO and, for spectrum compliance, turn off the POWDER SDR.</p><p>Hardware design of FDMonitor. FDMonitor's bidirectional isolation hardware is the first step to address the co-located signal mixture problem. As shown in Fig. <ref type="figure">4</ref>.2, a bidirectional coupler measures the forward and backward traveling signals on the RF path in two different linear combinations. However, a wideband bidirectional coupler does not perfectly isolate these two signals; the overall system can provide only a 10-15 dB difference in the power of one source between the two coupled outputs. This is because RF subsystems are not perfectly matched over the wide bandwidths across which frequency-agile SDR platforms must be able to operate. 
Counterintuitively, the platform's transmitted signal and the incident signal are carried in both directions on the RF chain, so a directional coupler can only do so much.</p><p>Source Separation in FDMonitor. Separation of the transmitted and incident signals from the combinations above is the second step to realize accurate RF-based spectrum monitoring. The problems of such separation, however, are: (1) the exact linear mixture model is unknown and time-varying, and (2) neither the transmitted nor the incident signal is known to FDMonitor. Solutions like model calibration are time-intensive and require frequent manual effort.</p><p>We utilize a frequency-domain Independent Component Analysis (ICA)-based source separation algorithm to address the problem. It requires no information about the model or the signals and adaptively estimates the transmitted signal, the incident signal, and the linear mixture model, all on the fly. Our algorithm also tackles the resulting scaling and permutation ambiguities of ICA so that the separated signals are at correct power levels for violation detection and are identified correctly as "transmitted" or "incident".</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Summary:</head><p>The major contributions are as follows:</p><p>&#8226; Introduce the spectrum violation risks of SDR wireless testbeds and the need for shared SDR platform monitoring.</p><p>&#8226; Propose FDMonitor as a systems solution that separates mixed source signals, sends spectrum violation alerts, and automatically turns off the transmitters as necessary.</p><p>&#8226; Implement FDMonitor and deploy it on 19 shared SDR platforms available to researchers on POWDER.</p><p>&#8226; Evaluate FDMonitor's separation performance thoroughly over ranges of four RF parameters: modulation type, carrier frequency, bandwidth, and transmit power.</p><p>&#8226; Run FDMonitor continuously on POWDER since 2021, where it achieves a 95% positive predictive value over all reported violations across 27 months of operation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Related Work</head><p>Full-duplex monitoring of the shared SDR platforms is at the intersection of relevant research on (i) full-duplex communication and (ii) spectrum sensing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.2">Spectrum sensing</head><p>A shared platform's transmission could be monitored by repurposed spectrum sensing. Spectrum sensing <ref type="bibr">[141]</ref> was proposed to sense primary users for opportunistic spectrum reuse, but it is, in essence, an approach to remotely detect transmission. Repurposed spectrum sensing can be categorized as (i) direct sensing and (ii) cooperative sensing.</p><p>Direct sensing uses one node to locally sense a user's transmission <ref type="bibr">[139]</ref>. It can be 1) transmitted signal prior-based sensing or 2) blind detection. The first type includes the likelihood ratio test <ref type="bibr">[141]</ref>, cyclostationarity detection <ref type="bibr">[54]</ref>, waveform-based sensing <ref type="bibr">[117]</ref>, and matched filtering <ref type="bibr">[24]</ref>. These methods correlate priors such as signal distributions, cyclostationarity, and preamble and pilot patterns of the transmitted signals with the received signal for signal presence detection. Blind detection, in contrast, does not require a prior <ref type="bibr">[14]</ref>. It includes Energy Detection (ED) <ref type="bibr">[12]</ref> and eigenvalue/covariance-based detection <ref type="bibr">[13]</ref>. ED measures the direct energy output, whereas the latter uses the covariance matrix as an indicator of the received signal strength for presence classification.</p><p>Cooperative sensing shares measured signals among collaborating radios to enhance the transmission sensing performance <ref type="bibr">[141]</ref>. The spatial distribution of multiple nodes effectively avoids hidden node problems and ameliorates degradation due to multipath fading and shadowing <ref type="bibr">[46]</ref>. 
While cooperative sensing can be centralized, distributed, or cluster-based <ref type="bibr">[8]</ref>, the fundamental sensing method is still direct sensing.</p><p>The above sensing techniques can be repurposed to monitor targeted transmissions of the shared SDR platform. However, it is unrealistic for us to require priors on the transmitted or incident signals, and blind detection cannot reliably separate co-channel signals.</p><p>In comparison, FDMonitor can precisely separate and identify both the transmitted signal and the incident signal, even if they are on the same channel. It does use two measurements; thus, like cooperative methods, it benefits from redundant measurements.</p><p>Bidirectional sensing. The work described in <ref type="bibr">[119]</ref> reports on spectrum monitoring using a bidirectional coupler. That method assumes a known system model for estimating the transmitted signal. However, system model calibration requires time-intensive manual effort. Furthermore, weather changes result in system variations which, if the system is not recalibrated, degrade the separation performance. Experimentally, we also find that the approach cannot sufficiently separate the transmitted and incident signals when they overlap in the frequency domain. In comparison to <ref type="bibr">[119]</ref>, FDMonitor provides several new benefits: 1) it is robust across signal type, carrier frequency, bandwidth, and transmit power, 2) it enables mixing matrix estimation on the fly without system calibration, and 3) in addition to estimating the transmitted signal, it also estimates the incident signal, which enables full-duplex monitoring.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">System Design</head><p>We describe the design of FDMonitor, as shown in Fig. <ref type="figure">4</ref>.3, that can separate signals without a signal prior. The inputs to FDMonitor are the In-phase and Quadrature (I/Q) sampled signals at the two receiver ports. Whenever the power in either receiver port is higher than the noise floor, we use the proposed algorithm to separate the signal into two sources. Given the Power Spectral Density (PSD) limits defined by the testbed operator, FDMonitor determines whether a violation occurred and reacts accordingly.</p></div>
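<div xmlns="http://www.tei-c.org/ns/1.0"><p>The final decision step described above, comparing the separated transmitted-signal PSD with operator-defined limits, can be sketched as follows. The data layout (a mapping from frequency in MHz to PSD in dBm/MHz), the function name, and every numeric value are hypothetical; the sketch only illustrates the comparison logic and is not the deployed implementation.</p>

```python
def find_violations(tx_psd, psd_limits, noise_floor):
    """Return the frequency bins (MHz) where the transmitted PSD exceeds its limit.

    tx_psd and psd_limits map a frequency in MHz to a level in dBm/MHz.
    Bins at or below the noise floor are skipped, and bins without an
    explicit limit are treated as not allowed at any level above the floor.
    """
    violations = []
    for freq_mhz, level in sorted(tx_psd.items()):
        if noise_floor >= level:
            continue  # nothing detectable in this bin
        limit = psd_limits.get(freq_mhz, float("-inf"))
        if level > limit:
            violations.append(freq_mhz)
    return violations

# Hypothetical operator limits: two permitted bins and one protected bin.
psd_limits = {3550: -10.0, 3560: -10.0, 3570: -60.0}
# Hypothetical PSD estimate of the separated transmitted signal.
tx_psd = {3550: -20.0, 3560: -5.0, 3570: -40.0, 3580: -95.0}

violating_bins = find_violations(tx_psd, psd_limits, noise_floor=-90.0)
print("violating bins (MHz):", violating_bins)
```

<p>Applying the check to the separated transmitted-signal estimate, rather than to a raw port measurement, is what keeps a strong co-located signal from being attributed to the platform user.</p></div>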
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.1">Problem Formulation</head><p>The overall goal of FDMonitor is to monitor the entire frequency range of the platform, in our case 100-6000 MHz. The monitor samples from one channel at a time, each with an RF bandwidth limited by the monitoring device capability, in our case 27.65 MHz. We describe, without loss of generality, how FDMonitor operates on a single frequency channel.</p><p>FDMonitor collects bidimensional samples $r_i(n)$ for samples $n = 0, 1, \ldots, N-1$ from ports $i = 0, 1$. Denoting the source signals as $x_i(n)$ with $i = 0, 1$, we describe the bidimensional observations in the form:</p><p>where $r(n) = [r_0(n), r_1(n)]^T$ and $v(n)$ is zero-mean, uncorrelated additive Gaussian noise, i.e., $v(n) \sim \mathcal{CN}(0, \sigma^2 I)$. We now present the assumptions based on (4.1) and discuss the major problem that needs to be addressed. Signal mixtures can be instantaneous or convolutive <ref type="bibr">[30,</ref><ref type="bibr">96]</ref>. We assume the former because FDMonitor collects I/Q samples within hundreds of microseconds, during which the linear model remains static. We assume no prior knowledge of the system matrix A because (1) this removes the need for calibration, and (2) weather and other changing conditions alter A in practice. Instead, FDMonitor adaptively estimates the linear model on the fly. As the transmitted and incident signals come from different sources, the SDR platform and the outside world, one signal does not affect the other, leading to mutual independence. FDMonitor has no knowledge about the source signals, as their properties are designed by platform users and may even be proprietary and confidential. Given that digital signals are mostly non-Gaussian, assumption 2 holds.</p><p>Thus we design FDMonitor to solve the following problem: The separated estimates after JADE are X and the mixing matrix estimate is &#195;. 
Note that X has neither the same magnitude as the raw samples nor the correct labeling of "transmitted" vs. "incident", due to two common ICA ambiguities that we discuss in the next section.</p></div>
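<div xmlns="http://www.tei-c.org/ns/1.0"><p>The instantaneous two-port mixture above can be simulated with a small sketch. It reads the observation model as r(n) = A x(n) + v(n), which is an assumption consistent with the definitions of r(n), the system matrix A, and the noise v(n). The symmetric mixing matrix with 10 dB of isolation, the QPSK incident signal, and the noise level are hypothetical. The example shows why a raw port measurement is ambiguous: with the platform silent, the port that nominally measures the transmitted signal still sees the incident signal only 10 dB down.</p>

```python
import cmath
import math
import random

random.seed(1)
N = 2000
g = 10 ** (-10 / 20)       # amplitude leakage for 10 dB isolation (assumed)
A = [[1.0, g], [g, 1.0]]   # hypothetical mixing matrix

def qpsk(n):
    """Unit-power QPSK symbols."""
    return [cmath.exp(1j * (math.pi / 4 + math.pi / 2 * random.randrange(4)))
            for _ in range(n)]

def noise(n, sigma):
    """Circularly symmetric complex Gaussian noise with variance sigma squared."""
    s = sigma / math.sqrt(2)
    return [complex(random.gauss(0.0, s), random.gauss(0.0, s)) for _ in range(n)]

def power_db(samples):
    return 10 * math.log10(sum(abs(v) ** 2 for v in samples) / len(samples))

x0 = [0j] * N   # transmitted signal: the platform SDR is silent
x1 = qpsk(N)    # incident signal from a co-located transmitter
v0, v1 = noise(N, 0.01), noise(N, 0.01)

# Instantaneous mixture observed at the two receiver ports.
r0 = [A[0][0] * a + A[0][1] * b + n for a, b, n in zip(x0, x1, v0)]
r1 = [A[1][0] * a + A[1][1] * b + n for a, b, n in zip(x0, x1, v1)]

port0_db, port1_db = power_db(r0), power_db(r1)
print(f"port 0: {port0_db:.1f} dB, port 1: {port1_db:.1f} dB")
```

<p>A monitor that thresholds the port 0 power directly would flag a transmission here, which is the false alarm that source separation is meant to avoid.</p></div>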
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.4">Scaling and Permutation Alignment</head><p>ICA methods have two common problems: scaling ambiguities and permutation ambiguities. First, ICA solutions are scaled by an unknown constant. Second, the two signal estimates are arbitrarily assigned, and hence it is not known which signal was "transmitted" and which was "incident." These ambiguities pose a major challenge for violation detection, as FDMonitor does not know which estimate to examine for violation detection and which for environmental spectrum monitoring.</p><p>Assume that the two ICA estimates are each scaled by a multiplicative factor and may be permuted (i.e., swapped). These changes are modeled via the mixing matrix as:</p><p>where # is a diagonal scaling matrix, &#194; is the ultimate mixing matrix to be obtained, and W is either the 2 &#215; 2 identity matrix or, if the estimates are permuted, the 2 &#215; 2 exchange matrix.</p><p>In the next sections, we first recover the scale via the estimated mixing matrix and then address the permutation ambiguity using correlation coefficients and power differences.</p></div>
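<div xmlns="http://www.tei-c.org/ns/1.0"><p>Why these two ambiguities are unavoidable can be checked numerically: scaling and swapping the sources, while compensating in the mixing matrix, leaves the observations unchanged, so the observations alone cannot tell the two descriptions apart. The sketch assumes the estimated mixing matrix is the true one multiplied by a diagonal scaling matrix and then by an exchange matrix; the order of the factors and all numeric values are illustrative assumptions.</p>

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(a, x):
    """Product of a 2 x 2 matrix and a length-2 vector."""
    return [a[i][0] * x[0] + a[i][1] * x[1] for i in range(2)]

A = [[1.0 + 0.0j, 0.3 - 0.1j], [0.2 + 0.25j, 1.0 + 0.0j]]  # hypothetical true mixing
scale = [[2.0j, 0j], [0j, 0.5 + 0j]]                       # unknown diagonal scaling
scale_inv = [[1 / 2.0j, 0j], [0j, 2.0 + 0j]]
swap = [[0j, 1 + 0j], [1 + 0j, 0j]]                        # exchange matrix
x = [0.7 - 0.7j, -0.2 + 0.4j]                              # one sample of the sources

# An equally valid ICA answer: a rescaled, swapped mixing matrix together with
# correspondingly rescaled, swapped sources.
A_tilde = matmul(matmul(A, scale), swap)
x_tilde = matvec(matmul(swap, scale_inv), x)

r_true = matvec(A, x)
r_alt = matvec(A_tilde, x_tilde)
max_gap = max(abs(u - w) for u, w in zip(r_true, r_alt))
print(f"largest observation difference: {max_gap:.2e}")
print("first source estimate is a scaled copy of the second true source:",
      1e-12 > abs(x_tilde[0] - 2.0 * x[1]))
```

<p>Resolving the ambiguity therefore requires side information beyond the observations, such as the physical structure of the coupler reflected in the mixing matrix.</p></div>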
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Scaling Alignment</head><p>We observe, from (4.4), that &#195; having a norm larger than 1 essentially causes the scaling ambiguity challenge. To recover the scale, we first diagonalize the mixing matrix &#195; to obtain a complex-valued diagonal matrix $( &#195;):</p><p>Xi (k), for i = 0, 1, as:</p><p>We then define the Transmit in port 0 to Transmit in port 1 Ratio (TTR) and the Incident in port 1 to Incident in port 0 Ratio (IIR) as:</p><p>,</p><p>Similar to the signal-to-interference ratio (SIR), our TTR and IIR values measure a power ratio. However, TTR and IIR do not require exact knowledge of the true transmitted and incident signals X 0 (k) and X 1 (k) for all k, which is unavailable, even during experiments. Our metrics, TTR and IIR, focus specifically on the isolation performance of FDMonitor rather than the quality of X0 (k) and X1 (k) individually.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Implementation</head><p>In this section, we present the implementation of FDMonitor on our large-scale wireless testbed, POWDER.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.1">Monitoring Hardware</head><p>We use the following hardware in our experiments: NI USRP B210s, which can transmit and receive from 70 to 6000 MHz <ref type="bibr">[38]</ref> at sample rates up to 61 megasamples per second (Msps). The antenna is a TAOGLAS wide-band 4G LTE I-Bar, effective across the 698-6000 MHz band <ref type="bibr">[118]</ref>.</p><p>A critical component of FDMonitor, a bidirectional coupler, is designed and built as shown in Fig. 4.5a and 4.5b. It has four ports: P1 and P3 are the input and output ports representing direct transmission, whereas P2 and P4 are coupled ports that capture mixed signals at different scales. To show the directionality of the coupler when it is isolated, we measure its S-parameters across the 100-6000 MHz frequency range, as shown in Fig. 4.5c. S 11 stays below &#8722;10 dB across the band, indicating low reflection at the input. S 13 is close to 0 dB across the band, indicating little power loss of the direct transmitted signal from P1 to P3. In addition, S 12 and S 14 show that P2 and P4 receive a copy of the transmitted signal that is at least 10 dB</p><p>Algorithm 1: Algorithmic Operation of FDMonitor. Result: TX/incident signals, alert notification. Initialize user PSD limits vs. frequency; Initialize the list of channel center frequencies f list ; while True do for f in f list do Sample r i</p><p>[Figure panels: (a) Signal Type, (b) Freq. Overlap, (c) Signal Bandwidth, (d) Signal Power]</p><p>We further compare FDMonitor to SMC with typical digital signal types, CW, BPSK, and OFDM, using TTR and IIR. The results in Table <ref type="table">4</ref>.1 show that both methods increase the isolation of the transmitted signal in X1 . However, we observe two disadvantages of SMC. First, it matches FDMonitor's TTR improvement only in CW transmission scenarios. For modulated transmitted signals, the large TTR differences expose the inability of SMC to remove the transmitted signal from X1 .
Second, SMC shows only a small IIR increase when the incident signal is CW, and inadvertently reduces the isolation for other signals.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Carrier and Center Frequency</head><p>Can FDMonitor separate signals that overlap in the frequency domain? We experiment with two overlapping OFDM signals: an incident signal (5758-5762 MHz) and a transmitted signal (5756-5760 MHz), which overlap in 5758-5760 MHz. The separation results in Fig. <ref type="figure">4</ref>.6b show the complete removal of the incident signal from X0 and of the transmitted signal from X1 . Furthermore, our experience indicates that FDMonitor can separate signals that fully overlap in the frequency domain as long as the modulations are sufficiently different.</p><p>To check separation performance across center frequencies, we conduct experiments in the 2.4 GHz and 5.8 GHz ISM bands and the 3.6 GHz CBRS band while transmitting non-overlapping CW signals. Fig. <ref type="figure">4</ref>.7 shows that both methods increase the TTR across frequency, which indicates that center frequency has little impact on either algorithm. However, the large difference in IIR indicates poor incident signal separation by SMC and robust estimation by FDMonitor.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Signal Bandwidth</head><p>The transmitted and incident signal properties are unknown to FDMonitor, and both signals might occupy large bandwidths. We next run tests to explore how the performance of FDMonitor is affected by signal bandwidth.</p><p>The experiment considers two OFDM signals: the transmitted signal is 10 MHz wide, centered at 2454 MHz, while the incident signal is at 2442 MHz with 4 MHz bandwidth. Fig. <ref type="figure">4</ref>.6c shows that the incident signal is removed from X0 . Likewise, the transmitted signal has been mostly eliminated in X1 . Notably, 4 dB of the edges of the transmitted signal remains in X1 .</p><p>Additionally, the spike at 2458 MHz was verified to be an environmental interference signal. This observation indicates that FDMonitor can perform source separation in the presence of multiple incident signals.</p><p>Fig. <ref type="figure">4</ref>.8 shows how TTR and IIR change as the transmitted signal bandwidth varies from 1 to 10 MHz (10 MHz is the maximum bandwidth a user can reserve for one experiment on the POWDER testbed). FDMonitor is stable across bandwidths, but the TTR of SMC decreases as bandwidth grows. Additionally, the IIR for SMC is stable but much lower than the IIR reference, whereas FDMonitor produces higher IIRs. Reduced IIR means there is a higher level of the incident signal in X0 than in R 0 , a negative result.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Signal Power</head><p>Power difference is used for solving the permutation ambiguity (Section 4.3.4). However, close power levels could, in theory, confuse FDMonitor. Thus, as the final experiment, we vary the transmitter gain and evaluate separation performance as a function of signal power, as shown in Fig. 4.9. The TTR and IIR references are approximately 6 dB and 4 dB, respectively. After running SMC and FDMonitor, we make the following observations: (1) TTR for both methods increases as gain increases from 10 to 55 dB; (2) FDMonitor obtains higher TTR than SMC at each gain setting; (3) SMC provides an IIR around 2-3 dB lower than the reference, while FDMonitor increases IIR by 7-16 dB.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5.2">Algorithm Efficiency</head><p>We compare the two systems' efficiency via latency. Latency in our work refers to the elapsed time for frequency tuning, data collection, and further analysis. Wideband monitoring of RF transmissions involves two main components: frequency sweeping and source separation. The former tunes the center frequency of the receiver, while the latter provides transmitted signal estimation in each 27.65 MHz channel. To assess the efficiency of the methods, we measure latency in each monitoring channel.</p><p>Fig. <ref type="figure">4</ref>.10 shows that, for monitoring one channel, the median latency of FDMonitor is 0.17 s, whereas SMC requires 0.32 s. Given the median latency, the total time spent by FDMonitor to sweep the 100-6000 MHz spectrum is 36.4 s, about half that of SMC.</p></div>
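<div xmlns="http://www.tei-c.org/ns/1.0"><p>The sweep time follows directly from the per-channel latency. The short Python sketch below reproduces the arithmetic, assuming the 100-6000 MHz band is tiled by non-overlapping 27.65 MHz channels and that the channel count is rounded up; the function name sweep_time_s is hypothetical.</p>
<p>
```python
import math

def sweep_time_s(f_start_mhz, f_stop_mhz, channel_mhz, latency_s):
    """Return (number of channels, total sweep time in seconds)."""
    # Assumption: channels tile the band without overlap, rounded up.
    n_channels = math.ceil((f_stop_mhz - f_start_mhz) / channel_mhz)
    return n_channels, n_channels * latency_s

n_ch, t_fdmonitor = sweep_time_s(100, 6000, 27.65, 0.17)
_, t_smc = sweep_time_s(100, 6000, 27.65, 0.32)
print(n_ch, round(t_fdmonitor, 1), round(t_smc, 1))  # prints: 214 36.4 68.5
```
</p>
<p>With 214 channels, the median latencies give about 36.4 s per sweep for FDMonitor and about 68.5 s for SMC.</p></div>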
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5.3">Mixing Matrix Evaluation</head><p>We further evaluate the system performance via the mixing matrix &#194; &#8712; &#8450;^{2&#215;2}, which describes the linear system model for source separation and is estimated on the fly. We collect mixing matrix and precipitation data for 29 days while continuously transmitting CW signals from both sources. Fig. <ref type="figure">4</ref>.11 shows the mixing matrix magnitude vs. precipitation and monitoring hours. First, we observe that the matrix magnitude is relatively stable across the 29 days. Both a 11 and a 22 are centered at 0.01 dB with 0.002 dB standard deviation. The noisier a 12 and a 21 are around -19.29 and -7.10 dB, respectively, with 0.094 and 0.075 dB standard deviation. Additionally, we notice that the matrix varies with rainfall. At around 72, 187, and 348 hours, the magnitudes of a 12 and a 21 decrease when rain starts and later return to the pre-rain level. This can be explained by the change in the radio propagation environment for wet vs. dry surfaces <ref type="bibr">[59]</ref>. This is another reason why the baseline SMC method, which learns the system matrix only once during calibration, is less robust than FDMonitor.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5.4">Adversarial Behavior</head><p>If FDMonitor sequentially monitors frequency channels to cover the entire 100-6000 MHz band, adversarial users can potentially transmit in violation while hopping between channels to avoid detection. To model adversarial behaviors, we use the following notation: 1) $T</p><p>Attack model. We propose the following attack model: 1) an attacker can use any channel at any given time, 2) in each time slot $T , an attacker chooses 1 of the N C channels to transmit, 3) an attacker does not know which channel is being monitored.</p><p>Countermeasure. To address this attack model, FDMonitor can no longer use a predictable monitoring scheme. Instead, we propose a countermeasure that randomizes the order: 1) in each monitoring cycle, FDMonitor generates a randomly permuted channel sequence of length N C for spectrum monitoring, 2) all channels are measured by FDMonitor in each cycle. The probability of first detecting an attacker at cycle T using the proposed countermeasure is:</p><p>(4.12)</p><p>Figure <ref type="figure">4</ref>.12: Probability of detecting violation in the first cycle.</p><p>P D (1) converges asymptotically to 1 &#8722; 1/e, or about 63%, as N C &#8594; &#8734;. The average number of cycles for attacker detection is 1/P D (1), which for high N C is 1.58 cycles. The proof is in Appendix A. For validation, we show in Fig. <ref type="figure">4</ref>.12 the results of a simulation run 10^4 times at each N C . In FDMonitor, N C = 214, and thus P D (1) = 65.18%.</p><p>We describe one attack model above and will investigate more adversarial behaviors in our future work.</p></div>
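<div xmlns="http://www.tei-c.org/ns/1.0"><p>A simulation of this kind can be sketched in a few lines of Python. The version below is a simplified reading of the attack model and countermeasure, not the simulation behind Fig. 4.12: it assumes that exactly one channel is monitored per time slot, that a cycle lasts N C slots, and that the attacker draws a channel uniformly and independently in every slot. The function names and the closed form in closed_form follow from these assumptions and are not taken from Eq. (4.12).</p>
<p>
```python
import math
import random

def simulate_first_cycle_detection(n_channels, n_trials, seed=0):
    """Estimate the probability that an attacker is caught within one cycle."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(n_trials):
        # The monitor visits every channel once per cycle, in random order.
        order = list(range(n_channels))
        rng.shuffle(order)
        # Assumption: one monitored channel per slot, and the attacker
        # picks a fresh uniformly random channel in each slot.
        if any(ch == rng.randrange(n_channels) for ch in order):
            detected += 1
    return detected / n_trials

def closed_form(n_channels):
    """Per-cycle detection probability under the assumptions of this sketch."""
    return 1.0 - (1.0 - 1.0 / n_channels) ** n_channels

for n in (2, 10, 50):
    estimate = simulate_first_cycle_detection(n, 5000)
    print(n, round(estimate, 3), round(closed_form(n), 3))
print("limit for many channels:", round(1.0 - math.exp(-1.0), 3))
```
</p>
<p>Under these assumptions, the per-cycle detection probability approaches 1 &#8722; 1/e as the number of channels grows, which agrees with the asymptotic behavior stated above; values at finite N C depend on the exact timing assumptions and may differ from those reported in the text.</p></div>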
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5.5">FDMonitor Case Study</head><p>We demonstrate a case study of the FDMonitor workflow in Figure <ref type="figure">4</ref>.13. At Timepoint 1, the SDR transmitter stays silent. Two types of incident signals, one unknown signal from the environment and one from a known external source, are present in the PSDs of both directions. A radio monitoring graph given by FDMonitor visualizes the separated transmitted signal, i.e., X0 , across 100-6000 MHz. We can see that all incident signals are correctly separated, and thus FDMonitor reports no spectrum violation. At Timepoint 2, the SDR transmitter operates in an unauthorized band while the incident signals remain the same. We observe both incident and transmitted signals in the PSDs of the direct coupler outputs. The updated monitoring graph confirms that FDMonitor correctly estimates the transmitted signal. Combining the signal estimate and the spectral mask, FDMonitor alerts the user of an illegitimate transmission via its closed-loop control scheme.</p></div>
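<div xmlns="http://www.tei-c.org/ns/1.0"><p>The final step of the case study, comparing the estimated transmitted signal against a spectral mask, can be illustrated with the Python sketch below. It is a hypothetical illustration rather than FDMonitor's implementation: the band tuple format, the default limit outside declared bands, the numeric values, and the names limit_at and find_violations are all assumptions.</p>
<p>
```python
DEFAULT_LIMIT_DB = -90.0  # hypothetical PSD limit outside all declared bands

def limit_at(freq_mhz, declared_bands):
    """Return the PSD limit (dB) that applies at the given frequency."""
    for low_mhz, high_mhz, max_psd_db in declared_bands:
        if high_mhz >= freq_mhz >= low_mhz:
            return max_psd_db
    return DEFAULT_LIMIT_DB

def find_violations(freqs_mhz, tx_psd_db, declared_bands):
    """List the frequencies where the estimated transmit PSD exceeds its limit."""
    return [f for f, p in zip(freqs_mhz, tx_psd_db)
            if p > limit_at(f, declared_bands)]

# Hypothetical user declaration: 3550-3560 MHz with a -40 dB limit.
declared = [(3550.0, 3560.0, -40.0)]
freqs = [3540.0, 3555.0, 3570.0]
estimated_tx_psd = [-100.0, -50.0, -60.0]

violations = find_violations(freqs, estimated_tx_psd, declared)
print("violating frequencies (MHz):", violations)  # prints: [3570.0]
```
</p>
<p>In this toy example, only the emission at 3570 MHz exceeds its limit, so an alert would be raised for that frequency alone; an empty list corresponds to the no-violation outcome at Timepoint 1.</p></div>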
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5.6">System-wide 27-Month Deployment</head><p>FDMonitor has been continuously monitoring 19 SDR platforms on the testbed, all deployed at different geographical locations, for 27 months since 2021. We evaluate its system-wide performance by investigating each violation alarm it generated during the period and assessing the alarm accuracy. A violation alarm notifies the user via email of detected signals being transmitted outside the declared spectrum. It can be a true detection of RF emission misbehavior, or a false alarm if the alert did not correspond to a user violating spectrum rules. We store measurements from each alert and request information from the user about their setup in order to determine the ground truth about spectrum use. Table <ref type="table">4</ref>.2 shows our analysis of the 989 total alerts. In summary, we observe only 45 false alarms, among which 28 alerts occurred because the user-declared frequency was, due to a software error, not recorded in the FDMonitor user PSD limits database. The other 17 false discovery emails were triggered by incorrectly resolved permutation ambiguities. Even so, the 95.4% Positive Predictive Value (PPV) represents high accuracy and robustness of FDMonitor across a large variety of real users, their signals, and the varying weather seen by the platform.</p><p>No false negative cases (when a user's violating transmission is not detected as a violation) have been reported by POWDER users, and there were no false negatives in our experimental violation tests. Future work should investigate other methods to quantify the false negative rate during normal user operation.</p><p>Finally, we note that FDMonitor ran about 1.9 million cycles in the 27 months. The 45 false discoveries in this period correspond to a false alarm rate of approximately 2 &#215; 10^{-5} per cycle.</p></div>
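<div xmlns="http://www.tei-c.org/ns/1.0"><p>The summary statistics above can be recomputed from the reported counts. The Python sketch below uses only the numbers stated in this section (989 alerts, 28 and 17 false alarms, and about 1.9 million cycles); the variable names are illustrative.</p>
<p>
```python
total_alerts = 989
false_alarms = 28 + 17  # software-error cases plus permutation-error cases
cycles = 1.9e6          # approximate number of monitoring cycles

ppv = (total_alerts - false_alarms) / total_alerts  # positive predictive value
fdr = false_alarms / total_alerts                   # false discovery rate
false_alarm_rate = false_alarms / cycles            # false alarms per cycle

print(f"PPV = {ppv:.2%}, FDR = {fdr:.2%}, FAR = {false_alarm_rate:.1e}")
```
</p>
<p>The PPV and the false discovery rate sum to one by construction, and the per-cycle false alarm rate is on the order of 2 &#215; 10^{-5}, consistent with the figures reported above.</p></div>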
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.6">Discussion</head><p>We describe some limitations of FDMonitor and discuss the implications for future research.</p><p>Non-Gaussian constraint. One constraint of FDMonitor is Assumption 2 that at most one of the sources is Gaussian. One possible solution is that, if both signals are found to be</p><p>Critically, our approach does not require extensive calibration, which would be very challenging to implement at the rate at which calibration becomes obsolete. Its performance is extensively validated with four different types of RF signal experiments, across communication signal modulations, carrier frequency, bandwidth, and transmit power. We further validate 27 months of live system performance, which generates a low 4.6% FDR, with 45 false alerts.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Chapter 5 Conclusion and Future Work</head><p>This thesis investigates interference management strategies for next-generation spectrum sharing, an essential aspect of ensuring efficient and reliable wireless communication systems. In this chapter, we conclude the thesis and discuss open issues for future directions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Conclusion</head><p>This thesis has explored the critical role of interference management in enabling dynamic spectrum sharing in next-generation wireless networks. By addressing the inefficiencies of traditional static spectrum allocation, dynamic spectrum sharing offers a pathway to maximize spectrum utilization, supporting the growing demand driven by wireless device proliferation and emerging applications. However, the coexistence of multiple systems in shared spectrum environments increases the risk of interference, necessitating robust strategies to predict, monitor, and control it effectively. The works presented in this thesis have tackled these challenges through initial modeling for channel estimation via loss field, the augmented CELF model for enhanced explainability and robustness under uncertainty, and a real-time spectrum monitoring system for closed-loop control of SDR base stations.</p><p>The development of the CELF model represents a step forward in interference prediction. By leveraging site-specific, data-driven techniques, CELF provides accurate and efficient channel loss estimation, overcoming the limitations of traditional path loss models in dynamic, multiuser settings. The augmented CELF work discusses Bayesian modeling and explainability, making it a viable tool for real-world spectrum access decisions. Last, the real-time spectrum monitoring system, deployed on the POWDER testbed since 2021, has demonstrated the practical use of continuous spectrum awareness and closed-loop control.</p><p>These works aim to improve spectrum efficiency and further enable broader innovations, such as Open Radio Access Networks (Open RAN) and RDZs, which can utilize flexible spectrum management to unlock their full potential. Economically, the ability to monetize underutilized spectrum and reduce operational costs for wireless networks highlights the transformative impact of this work.
In summary, this thesis explores techniques for reliable, scalable, and efficient interference management to drive a more dynamic and inclusive wireless ecosystem.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Future Work</head><p>This thesis presents three works on interference management for next-generation spectrum sharing, but many other research problems have not yet been explored. A future wireless ecosystem may advance dynamic spectrum sharing by addressing the following questions: (1) What dynamic and resilient spectrum sharing schemes can automate frequency band access on the order of minutes instead of years across geographic areas of different sizes? (2) How can interference prediction, monitoring, and source control systems or models be improved to adapt easily to shared spectrum use in different geographic areas, time-varying environments, and a variety of terrestrial and non-terrestrial networks? (3) What is the socioeconomic impact of dynamic spectrum sharing, and how can the existing business models for spectrum sharing be advanced?</p><p>Driven by the ecosystem vision and previous works, key areas of future exploration include (1) extensions and improvements of existing interference prediction, monitoring, and source control works, (2) spectrum sharing solutions for Open RAN, and (3) socioeconomic implications of dynamic spectrum sharing and new business models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.1">Interference Management Advancement</head><p>While this thesis discusses several methods to advance interference management, a number of challenges remain open for exploration, including:</p><p>&#8226; Spatial-temporal channel modeling. First, channels are inherently time-varying due to foliage variation in a geographic space. Second, urban and rural environments differ significantly in coverage, population, and building density. As a result, future work can add spatial and temporal dynamics to channel models such as CELF for broader and more practical use in the real world.</p><p>&#8226; Efficient and scalable management systems for secondary users. While SAS and AFC effectively protect incumbents and priority users, they do not manage interference for secondary users, e.g., GAA users in 3.5 GHz and unlicensed Wi-Fi devices in 6 GHz. As more channel models like CELF address the prediction efficiency challenge for a large number of secondary users, future work can explore, implement, and deploy management systems to coordinate shared access of secondary users without harmful interference.</p><p>&#8226; Expansion of the spectrum monitoring capability. The proposed FDMonitor system design requires direct coupling to each base station and focuses on terrestrial networks. Future work can advance the system for monitoring spectrum sharing among satellite, terrestrial, and underwater networks. Another direction is to utilize distributed sensor networks for monitoring purposes, which would lower costs and enhance scalability.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.2">Spectrum Sharing with Open RAN</head><p>Open RAN <ref type="bibr">[98]</ref> represents a transition to a new paradigm of wireless networks. Unlike traditional integrated RAN solutions, which are proprietary and vendor-specific, Open RAN promotes a programmable, interoperable, and intelligent architecture such that edge resources, e.g., radio spectrum and edge compute, can be managed efficiently <ref type="bibr">[20]</ref>. Such flexibility opens research opportunities to combine Open RAN with dynamic spectrum sharing. Dynamic spectrum sharing can advance the Open RAN vision by enabling flexible and adaptive spectrum and interference management for heterogeneous wireless services and applications.</p><p>Open RAN embraces four key principles to address the issues of traditional RAN: (1) disaggregation, (2) virtualization, (3) open interfaces, and (4) intelligent controllers <ref type="bibr">[98]</ref>. First, RAN disaggregation splits base stations into three major blocks: an Open Radio Unit (O-RU), an Open Distributed Unit (O-DU), and an Open Centralized Unit (O-CU). O-RUs are generally deployed at the cell site, whereas the other units are at the RAN edge to bridge the O-RUs and the core network. Second, RAN virtualization refers to decoupling network functions from dedicated hardware. It allows various network functions to run on general-purpose servers and enables dynamic RF signal configuration via SDR. Third, open interfaces in Open RAN standardize communication protocols between different RAN units to be vendor-neutral. Doing so facilitates multi-vendor interoperability and reduces hardware costs. Last, RAN intelligent controllers are software-defined components that are capable of orchestrating and optimizing RAN functions with closed-loop control. Different service requirements for QoS can be configured as different apps on the controller.
Open RAN's architectural principles meet the hardware requirement for dynamic spectrum sharing and, in turn, dynamic spectrum sharing boosts Open RAN's dynamism and network efficiency. By allowing spectrum resources to be flexibly allocated and managed, dynamic spectrum sharing can support different RAN slices based on customer demands or service requirements. For example, video gaming applications require high-speed connectivity, whereas low bandwidths are sufficient for low-power Internet of Things (IoT) networks.</p><p>Future work can explore the shared spectrum efficiency and network performance with Open RAN. One example is to investigate how Open RAN's programmable interfaces and real-time controllers can optimize spectrum utilization, predict interference patterns, and enhance network performance in shared frequency bands. Another example is to examine the integration of Open RAN with ML-driven interference mitigation techniques and its scalability in large-scale, heterogeneous networks. By addressing these aspects, future work can unlock the full potential of Open RAN in enabling efficient and adaptive spectrum sharing for 5G and beyond.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.3">Socioeconomics and Business Models for Spectrum Sharing</head><p>In addition to technical explorations of spectrum sharing, future research can look into market models, regulatory frameworks, and incentive structures to accelerate the deployment of shared spectrum solutions. Collaborative efforts with industry and policymakers could translate these technical advancements into practical and widely adopted standards.</p><p>The significance of studying business models and the socioeconomic impacts of dynamic spectrum sharing lies in its potential to reshape wireless markets and societal access to connectivity. Shared radio spectrum, such as the CBRS band and the TV White Space (TVWS) band, can promote competition and bridge digital divides by lowering barriers to entry for smaller players, e.g., rural broadband via TVWS <ref type="bibr">[64]</ref>. Socioeconomically, dynamic spectrum sharing can drive economic growth through productivity gains in smart factories and IoT applications <ref type="bibr">[17]</ref>. However, a constant challenge lies in how to incentivize more primary users, like weather radar systems, to engage in spectrum sharing initiatives. Understanding these socioeconomic and business dynamics is essential for creating sustainable frameworks that balance the interests of all stakeholders.</p><p>More research directions on socioeconomic and business models for dynamic spectrum sharing include:</p><p>&#8226; Decentralized spectrum markets. Decentralized markets such as a blockchain-based market could enable transparent and efficient spectrum trading among multiple stakeholders. This approach would reduce transaction costs and improve trust, particularly in environments with diverse users.</p><p>&#8226; Design of dynamic pricing mechanisms.
Advanced dynamic pricing can adapt to real-time demand and supply conditions, ensuring fair access and maximizing revenue for stakeholders.</p><p>&#8226; Socioeconomic equality evaluation. Existing spectrum sharing models can be analyzed from the socioeconomic perspective to ensure equal benefits of dynamic spectrum sharing across different geographic areas.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0"><p>This was published in the Computer Networks Journal, 2025.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1"><p>This was published in IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), 2024.</p></note>
		</body>
		</text>
</TEI>
