<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Meteor Head Echo Detection at Multiple High‐Power Large‐Aperture Radar Facilities via a Convolutional Neural Network Trained on Synthetic Radar Data</title></titleStmt>
			<publicationStmt>
				<publisher>AGU</publisher>
				<date>04/01/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10519092</idno>
					<idno type="doi">10.1029/2023JA032204</idno>
					<title level='j'>Journal of Geophysical Research: Space Physics</title>
<idno>2169-9380</idno>
<biblScope unit="volume">129</biblScope>
<biblScope unit="issue">4</biblScope>					

					<author>T Hedges</author><author>N Lee</author><author>S Elschot</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<title>Abstract</title> <p>High‐power large‐aperture radar instruments are capable of detecting thousands of meteor head echoes within hours of observation, and manually identifying every head echo is prohibitively time‐consuming. Previous work has demonstrated that convolutional neural networks (CNNs) accurately detect head echoes, but training a CNN requires thousands of head echo examples manually identified at the same facility and with similar experiment parameters. Since pre‐labeled data is often unavailable, a method is developed to simulate head echo observations at any given frequency and pulse code. Real instances of radar clutter, noise, or ionospheric phenomena such as the equatorial electrojet are additively combined with synthetic head echo examples. This enables the CNN to differentiate between head echoes and other phenomena. CNNs are trained using tens of thousands of simulated head echoes at each of three radar facilities, where concurrent meteor observations were performed in October 2019. Each CNN is tested on a subset of actual data containing hundreds of head echoes, and demonstrates greater than 97% classification accuracy at each facility. The CNNs are capable of identifying a comprehensive set of head echoes, with over 70% sensitivity at all three facilities, including when the equatorial electrojet is present. The CNN demonstrates greater sensitivity to head echoes with higher signal strength, but still detects more than half of head echoes with maximum signal strength below 20dB that would likely be missed during manual detection. These results demonstrate the ability of the synthetic data approach to train a machine learning algorithm to detect head echoes.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>High-power large-aperture (HPLA) radar instruments have been used for decades to observe meteoroid particles undergoing hypersonic entry into Earth's atmosphere <ref type="bibr">(Evans, 1965)</ref>. Many meteors observed via such instruments originate from particles that are micrometers in diameter <ref type="bibr">(Close et al., 2000)</ref> and micrograms in mass <ref type="bibr">(Janches et al., 2008)</ref>, smaller than those observed via sky cameras or optical systems <ref type="bibr">(Brown et al., 2017;</ref><ref type="bibr">Campbell-Brown &amp; Close, 2007)</ref>, but larger than those observed on space-based impact detectors <ref type="bibr">(Baggaley et al., 2007)</ref>. HPLA instruments are capable of observing head echoes and nonspecular trail echoes, in addition to the specular trail echoes observed by traditional meteor radars <ref type="bibr">(Stober et al., 2021)</ref>. By understanding the plasma physics that drives meteor radar observations, one can estimate properties of the parent meteoroid such as mass and bulk density from its radar observations <ref type="bibr">(Close et al., 2005,</ref> <ref type="bibr">2012)</ref>, quantifying potential hazards to spacecraft and astronauts <ref type="bibr">(Close et al., 2010)</ref>. Furthermore, meteors act as a probe of the neutral atmosphere through which they travel, enabling estimation of the atmospheric density (A. <ref type="bibr">Li &amp; Close, 2016;</ref><ref type="bibr">Limonta et al., 2020)</ref>. Meteors are particularly useful above the altitudes where weather balloons can perform measurements, but below the altitude where satellites in low-Earth orbit experience drag, from which atmospheric density can be estimated (A. 
<ref type="bibr">Li &amp; Close, 2015)</ref>.</p><p>As a meteoroid undergoes entry at altitudes between 80 and 130 km, its material vaporizes due to sputtering and thermal ablation <ref type="bibr">(Guttormsen et al., 2020;</ref><ref type="bibr">Popova et al., 2001)</ref>. The resulting neutral particles, traveling at between 11 and 73 km per second depending on the incoming trajectory of the meteoroid relative to Earth <ref type="bibr">(Blanchard et al., 2022)</ref>, undergo high-energy collisions with atmospheric molecules that are sufficient to ionize many of the meteoric and surrounding atmospheric particles. A plasma cap is formed in the immediate vicinity of the meteoroid, which moves with the meteoroid and reflects radio waves <ref type="bibr">(Dyrud et al., 2008;</ref><ref type="bibr">Marshall et al., 2017;</ref><ref type="bibr">Sugar et al., 2021)</ref>. This produces the head echoes frequently observed at HPLA radar facilities that transmit power on the order of megawatts <ref type="bibr">(Janches &amp; Chau, 2005)</ref>. More recently, non-HPLA radar facilities, such as the Southern Argentine Agile Meteor Radar, have gained enhanced capabilities that enable head echo observations at sub-megawatt power <ref type="bibr">(Janches et al., 2014;</ref><ref type="bibr">Panka et al., 2021)</ref>.</p><p>As plasma undergoes momentum-scattering collisions with the atmospheric neutral particles and slows down relative to the surrounding atmosphere, it forms a trail along the path traveled by the entering meteoroid. The motion of the plasma relative to the stationary neutrals can cause a Farley-Buneman gradient-drift instability to develop into plasma turbulence <ref type="bibr">(Oppenheim &amp; Dimant, 2015;</ref><ref type="bibr">Oppenheim et al., 2000)</ref> and form field-aligned irregularities (FAI) capable of scattering radio waves incident perpendicular to the background magnetic field. 
This is one mechanism that creates meteor radar signatures known as nonspecular trail echoes <ref type="bibr">(Close et al., 2008;</ref><ref type="bibr">Dyrud et al., 2005)</ref>. In many cases, nonspecular trails are observed at equatorial radars that point directly zenithward, but non-equatorial radars can also observe these signatures if the beam is pointed appropriately. Non-FAI nonspecular trail echoes have also been observed at higher latitudes <ref type="bibr">(Chau et al., 2014;</ref><ref type="bibr">Kozlovsky et al., 2020)</ref>, and research efforts to explain these observations are ongoing. This work focuses specifically on meteor head echo signatures, which are the most frequently observed meteor radar signatures in HPLA radar data. At facilities including Jicamarca Radio Observatory and Millstone Hill Observatory, head echoes can appear thousands of times per hour <ref type="bibr">(Erickson et al., 2001;</ref><ref type="bibr">Y. Li et al., 2020)</ref>. In general, the head echo observation rate, and the detectability of any particular head echo, depends on factors including facility carrier frequency, power, geographic location and beam pointing direction, in addition to the incoming meteor population at a given time of day and year <ref type="bibr">(Sparks et al., 2009)</ref>. Taking into account these factors, one can use information from the head echo population as a whole to estimate physical quantities such as the atmospheric neutral density (A. <ref type="bibr">Li &amp; Close, 2016)</ref>. One can also utilize the observations of head echoes to more efficiently identify trail echoes, since trail echoes frequently appear after head echoes.</p><p>Many prior meteor experiments rely on manual searches of radar data to identify head echoes, using heuristics such as received signal power and tools such as Doppler-shifted matched filter banks <ref type="bibr">(Hedges et al., 2022)</ref>. 
However, these approaches require one to either search through vast sets of data, which is prohibitively time consuming for most experiments, or only consider signals above a certain threshold. The threshold approach does not differentiate head echoes from phenomena such as sporadic E and the equatorial electrojet, which are extremely common at radar facilities especially near the equator. Furthermore, this approach neglects the weakest head echoes below the signal threshold chosen, effectively neglecting the low end of the meteoroid mass range observable by HPLA radar instruments. To address these issues, Y. <ref type="bibr">Li et al. (2020)</ref> developed a rigorous method that calculates the probability that a head echo occurs at every radar pulse and range, without relying on a signal threshold. However, the method is computationally expensive and only performs well for very long coded pulses.</p><p>Machine learning methods have previously been developed to automate the identification of head echoes. Initially, a method was developed that applies a modified Hough transform to identify possible head echoes, followed by a convolutional neural network (CNN) to reject false positive identifications (Y. <ref type="bibr">Li et al., 2022)</ref>. This method minimizes false positives to less than 1% overall, but is computationally expensive and complex to implement. This motivated development of another method that relies purely on a CNN to detect head echoes, and achieves high precision and sensitivity at Jicamarca Radio Observatory (Y. <ref type="bibr">Li et al., 2023)</ref>.</p><p>Despite the success of machine learning methods at Jicamarca Radio Observatory, existing methods rely on the abundance of manually labeled head echoes from pre-existing data collected at the same facility and with the same experiment configuration. 
Parameters that influence the appearance of head echoes in raw data, such as the receiver sample rate and chosen pulse code, must be identical between experiments where training data is gathered. As new facilities and experiment types are employed to observe meteors, there remains a need to train such algorithms without relying on pre-existing data. We therefore develop an algorithm to generate synthetic head echoes for any meteor radar experiment, and additively combine them with a subset of other ionospheric phenomena, radar clutter, or noise observed at a particular facility. The resulting examples are used to train a simple CNN architecture. Identifying sufficient examples of data that are not head echoes requires a few hours of manual overhead, but this is much less than the days of effort required to label thousands of head echoes for training. Our CNN architecture accepts raw radar data as input without pre-processing, and therefore is extremely computationally efficient.</p><p>In this paper, we first describe the three HPLA radar facilities and data sets used to develop this technique. We describe our CNN detection methodology and architecture, followed by our method to generate synthetic head echoes corresponding to each facility and create training data sets for each facility. We then discuss the performance metrics that result from training a CNN via the synthetic head echo method at each facility and testing it on a subset of real data from the corresponding facility. We compare the properties of head echoes detected via the CNN versus those detected via an exhaustive manual search, and conclude with a discussion of how this work will facilitate improved analysis in future radar experiments that detect head echoes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Radar Experiment and Facilities</head><p>On October 10th and 11th, 2019, a meteor radar data collect was performed concurrently for 8 hr total at three radar facilities, each with a co-located transmitter and receiver. The facilities include Resolute Bay Incoherent Scatter Radar North (RISR-N), Millstone Hill Observatory (MHO), and Jicamarca Radio Observatory (JRO), with the intent to study neutral density variation in the lower thermosphere via meteor observations at varying latitudes but similar longitudes. Each facility is unique in its operational parameters and geographic location, which leads to observational biases in the meteor population. Biases inherent to latitude include the beam orientation and its angle relative to meteoroid sources, varying geomagnetic field orientation and ionospheric conditions. Other biases can result from the radar carrier frequency, incident power <ref type="bibr">(Urbina &amp; Briczinski, 2011)</ref>, beam pattern, incident polarization, and local solar time. One must therefore consider these biases in developing a machine learning algorithm that detects head echoes.</p><p>Each facility transmits long phase-coded pulses to maximize transmitted signal power and thus increase sensitivity. Long pulses take advantage of the sparsity of meteor radar observations, since there is usually only a single head echo being observed at any given time and range <ref type="bibr">(Volz &amp; Close, 2012)</ref>. Phase-coded pulses allow Doppler shifted matched filters to be applied to the raw data in the vicinity of the head echo to minimize its range ambiguity. The autocorrelation functions of long pulse codes contain range sidelobes surrounding a central peak, which are exacerbated by any mismatch in Doppler shift of the matched filter. 
However, assuming the Doppler shift is determined correctly, and since separate head echoes rarely overlap in such a way as to interfere with one another, the central peak clearly indicates the precise range of the meteor despite the presence of range sidelobes.</p><p>The RISR-N and JRO facilities transmit a minimum-sidelobe 51-baud (MSB 51) code of duration 51 &#956;s. MHO transmits a Barker-7 code of duration 42 &#956;s, since the facility was not capable of a baud length as short as 1 &#956;s at the time of the experiment, unlike the other facilities. This code produces nearly comparable signal-to-noise ratio (SNR) performance to that of the MSB 51 code. The RISR-N facility receives signal at a sampling rate of 2 MHz, enabling range resolution of 75 m, whereas JRO and MHO receive at 1 MHz, enabling range resolution of 150 m. The JRO radar beam is directly zenith-pointing, and the RISR-N beam is almost zenith-pointing, allowing for a shorter interpulse period to reach the meteor altitude range of 80-120 km. The MHO beam points at a 45&#176; elevation directly west to avoid ground clutter that would otherwise occur at the location. Each facility transmits at interpulse periods of milliseconds to ensure head echoes are resolved across many pulses and minimize uncertainty in the measurement of range and its evolution with time. Experiment parameters and details for each radar facility are provided in Table <ref type="table">1</ref>.</p><p>The dynamics of meteoroids entering Earth's atmosphere and their resulting head echoes are dependent on the conditions of the lower thermospheric neutral atmosphere and ionosphere during the experiment. Table <ref type="table">2</ref> specifies the solar and geomagnetic indices for the dates and times of the experiment. The indices reflect a quiet geomagnetic day, as expected for the 2019 solar minimum.</p></div>
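<div xmlns="http://www.tei-c.org/ns/1.0"><p>The range resolutions quoted above follow from the receiver sample rate through the two-way relation between sample spacing and range. The short Python sketch below checks this arithmetic. The function name and the printed layout are illustrative choices and are not taken from the experiment software.</p>

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def range_resolution_m(sample_rate_hz):
    """Range spacing between consecutive receiver samples (two-way path)."""
    return C_LIGHT / (2.0 * sample_rate_hz)


# Sample rates as stated in the text for each facility.
for name, sample_rate in [("RISR-N", 2.0e6), ("JRO", 1.0e6), ("MHO", 1.0e6)]:
    print(f"{name}: {range_resolution_m(sample_rate):.1f} m")
```

<p>The computed values round to the 75 m and 150 m figures given for the 2 MHz and 1 MHz sample rates, respectively.</p></div>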
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Machine Learning Method</head><p>To parse through the many hours of radar data and identify head echoes without relying on days of manual effort, a CNN technique is developed to perform this task automatically. To keep the technique simple and reduce the necessary computational power, the CNN is trained to determine whether small segments of raw data contain a head echo. A simple CNN architecture with three layers is chosen. By simplifying the meteor trajectory dynamics and modeling the reflection that generates the head echo signature, synthetic head echoes are generated to train the CNN. To ensure the CNN appropriately differentiates between head echoes and other phenomena, examples of data with such phenomena or noise are combined with the synthetic head echoes to generate training examples. This allows the CNN to be trained using tens of thousands of training examples. The CNN technique is implemented in Python using the PyTorch library <ref type="bibr">(Paszke et al., 2019)</ref>. Since PyTorch supports CUDA functionality, the CNN training and testing calculations are performed on a GPU for accelerated performance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Head Echo Detection Approach</head><p>We present a CNN architecture that accepts raw radar data as input, with no pre-processing beyond separating the data into segments of consistent size. This enables extremely computationally inexpensive classification of each segment. A matched filter is ill-suited for pre-processing since it requires prior knowledge of the Doppler shift for each head echo. One can use a technique based on the fast Fourier transform (FFT) that algorithmically accounts for the Doppler shift to decode the raw data, known as FFT decoding, but this method is computationally expensive (Y. <ref type="bibr">Li et al., 2023)</ref>.</p><p>HPLA radar raw data effectively consists of a two-dimensional array of complex in-phase and quadrature (I&amp;Q) voltage values, in some cases with additional channels that capture dual polarization or interferometric measurements from the received signal <ref type="bibr">(Chau &amp; Woodman, 2004;</ref><ref type="bibr">Close et al., 2011)</ref>. The CNN architecture accepts the real and imaginary components of the raw data as separate image channels of the input. When the data is visualized as a range-time intensity (RTI) image, as shown in the radar image plots throughout this paper, the horizontal axis of the data represents radar pulses (i.e., time), and the vertical axis represents range gates (i.e., altitude). The decibel power, <formula notation="TeX">P_{\mathrm{dB}}</formula>, is computed via <formula notation="TeX">P_{\mathrm{dB}} = 10 \log_{10} \left( |c_{\mathrm{rx}}|^2 \right) - n_f</formula></p><p>where <formula notation="TeX">c_{\mathrm{rx}}</formula> is the raw data or decoded data via a matched filter at a particular pulse and range gate, and <formula notation="TeX">n_f</formula> is the decibel noise floor as specified in Table <ref type="table">1</ref> for this experiment.</p><p>The raw radar data arrays contain hundreds of range gates and thousands of consecutive pulses at each facility, so the arrays are broken into separate segments to be analyzed by the CNN. We use segments with size of 150 pulses by 150 range gates for MHO and JRO. 
Since RISR-N uses a sample rate twice as large as the other two facilities, we use segments of 150 pulses by 300 range gates for this facility. The segment sizes are chosen to ensure the majority of a particular head echo is generally captured within the segment, given the pulse duration and sample rates used at each facility, without being so large as to consume excessive GPU memory when training the CNN. Each segment overlaps the adjacent segments on all four sides by 30 pulses or range gates. Since some head echoes can occur for only a few pulses, the overlap on the left and right segment sides ensures that such short-duration head echoes are fully captured within at least one segment, increasing the ability of the CNN to detect them. The overlap in range gates on the bottom and top segment sides is not necessary, given the use of long coded pulses that are spread across many range gates, but is included for simplicity. If more than one head echo is contained within a segment, the segment should be classified as positive. For this work, if a trail echo or other meteor signature is contained in a segment, the segment should be classified as negative. Figure <ref type="figure">1</ref> demonstrates how radar data is split into segments and then labeled where head echoes are present, assuming perfect image classification. The data contained within a bounding box that surrounds adjacent positive classifications can then be fed into a head echo analysis routine to determine the range, velocity, and other such properties of every head echo within the region.</p></div>
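<div xmlns="http://www.tei-c.org/ns/1.0"><p>The segmentation described above can be sketched as a sliding window with a stride equal to the segment size minus the overlap. The Python sketch below is a minimal illustration, not the authors' implementation. The function names are hypothetical, and the handling of the final window (shifted back so that it ends at the array edge) is an assumption, since the text does not state how array edges are treated.</p>

```python
def segment_starts(n_total, size=150, overlap=30):
    """Start indices of windows of length `size` overlapping by `overlap`.

    Assumption: the last window is shifted back so that it ends exactly
    at the array edge; the paper does not specify the edge handling.
    """
    if size > n_total:
        raise ValueError("array is smaller than one segment")
    stride = size - overlap
    starts = list(range(0, n_total - size + 1, stride))
    if starts[-1] + size != n_total:
        starts.append(n_total - size)
    return starts


def segment_grid(n_pulses, n_gates, pulses=150, gates=150, overlap=30):
    """(pulse_start, gate_start) corner of every segment in a data array."""
    return [(p, g)
            for p in segment_starts(n_pulses, pulses, overlap)
            for g in segment_starts(n_gates, gates, overlap)]


# Example: 1,000 pulses by 450 range gates with 150-by-150 segments.
corners = segment_grid(1000, 450)
print(len(corners), corners[:3])
```

<p>For RISR-N, the same function would be called with <code>gates=300</code> to reflect the doubled sample rate.</p></div>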
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Convolutional Neural Network Architecture</head><p>Our CNN model architecture is inspired by the ResNet architecture <ref type="bibr">(He et al., 2015)</ref> and includes three convolutional layers followed by a fully connected layer, as depicted in Figure <ref type="figure">2</ref>. This architecture was chosen since it outperformed an architecture with two layers, but adding a fourth layer did not continue to improve performance. The first two convolutional layers employ a 3-by-3 convolution operation, followed by the leaky rectified linear unit (ReLU) activation function and a 2-by-2 maxpool operation. The leaky ReLU function is given by <formula notation="TeX">f(x) = \max(x, a x)</formula></p><p>where <formula notation="TeX">a = 0.1</formula> in this CNN architecture. The 2-by-2 maxpool operation splits each image channel into 2-by-2 blocks and then keeps only the largest element, which scales down the image channels, as visualized in Figure <ref type="figure">2</ref>. Although the maxpool operations scale down the image at each layer, the number of image channels increases throughout the network, since each convolution operation contains multiple kernels that operate in parallel, each producing a separate output channel (15 kernels on layer 1, 50 on layer 2, and so on).</p><p>The third and final convolutional layer replaces the 2-by-2 maxpool with a global maxpool, which keeps only the maximum element from each image channel and reduces the layer output to a one-dimensional vector with these elements. This is followed by a fully connected layer and softmax operation, which reduces the vector to a smaller output vector of length 2, where each component is the probability that the input image has a label of 0 or 1. In general, the softmax operation yields a vector with K components, where K is the number of possible labels for an image. The ith component is then determined by the equation <formula notation="TeX">\sigma_i(\vec{y}) = \frac{e^{y_i}}{\sum_{j=1}^{K} e^{y_j}}</formula></p><p>where <formula notation="TeX">\vec{y} \in \mathbb{R}^K</formula> is the output from the fully connected layer. 
In this case, since <formula notation="TeX">K = 2</formula>, the value of <formula notation="TeX">\sigma_2</formula> specifies the probability that the raw data segment contains a head echo, as determined by the CNN. If this probability is greater than 0.5, the segment is classified as positive. The CNN architecture contains a total of 142,987 coefficients, or weights, that are optimized during training.</p></div>
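<div xmlns="http://www.tei-c.org/ns/1.0"><p>The element-wise operations named in this section (leaky ReLU with a = 0.1, 2-by-2 maxpool, softmax, and the 0.5 decision threshold) are standard and can be illustrated without a deep learning library. The plain-Python sketch below is only a didactic illustration of those operations. It is not the PyTorch model used in this work, and the function names are hypothetical.</p>

```python
import math


def leaky_relu(x, a=0.1):
    """Leaky rectified linear unit: x for positive x, a*x otherwise."""
    return x if x > 0 else a * x


def maxpool_2x2(image):
    """Keep the largest element of each non-overlapping 2-by-2 block."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [[max(image[2 * r][2 * c], image[2 * r][2 * c + 1],
                 image[2 * r + 1][2 * c], image[2 * r + 1][2 * c + 1])
             for c in range(cols)]
            for r in range(rows)]


def softmax(y):
    """Softmax of a list of K real values; the outputs sum to one."""
    largest = max(y)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - largest) for v in y]
    total = sum(exps)
    return [v / total for v in exps]


def is_head_echo(y, threshold=0.5):
    """y is the length-2 output of the fully connected layer.

    The second softmax component is read as the head echo probability.
    """
    return softmax(y)[1] > threshold


print(maxpool_2x2([[1, 2, 3, 4], [5, 6, 7, 8],
                   [9, 10, 11, 12], [13, 14, 15, 16]]))
print(is_head_echo([0.2, 1.7]))
```

<p>With two labels, thresholding the second softmax component at 0.5 is equivalent to choosing the larger of the two fully connected outputs.</p></div>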
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Synthetic Head Echo Model</head><p>To generate synthetic head echoes, we assume the head echo range is exponential in time, since the meteoroid travels through an atmosphere with exponential density dependence on altitude, and the drag force experienced by the meteoroid is proportional to atmospheric density. The range is therefore given by <formula notation="TeX">r(t) = r_0 - \frac{v_0^2}{a_0} \left( 1 - e^{-a_0 t / v_0} \right)</formula></p><p>where <formula notation="TeX">r_0</formula> is the initial range and <formula notation="TeX">v_0</formula> is the initial radial velocity, which is the velocity component along the radar beam, where the value is defined positive when the velocity vector points toward the facility. The value of <formula notation="TeX">a_0</formula> is the initial radial deceleration, which is the acceleration projected along the beam direction, defined positive when this vector points away from the facility (i.e., a deceleration as the meteor descends). The initial values are taken at the time of first head echo observation.</p><p>Although these assumptions about the meteoroid dynamics neglect more complex physical phenomena such as differential ablation <ref type="bibr">(Dyrud &amp; Janches, 2008;</ref><ref type="bibr">Janches et al., 2009)</ref> and fragmentation, and the dual polarization scattering that may result from fragmentation events is not included in the synthetic model <ref type="bibr">(Close et al., 2011)</ref>, this method still generates head echoes that closely resemble head echoes observed in real data. In practice, large portions of real head echoes will resemble those in the synthetic training set, allowing the CNN to detect them despite their more complex behavior.</p><p>To generate the raw radar data corresponding to a meteoroid, we assume the received pulse is created via a single-body Doppler-shifted reflection of the transmitted pulse. For convenience, we define the variable <formula notation="TeX">\tilde{t} = t - t_i</formula>, where <formula notation="TeX">t_i = i \Delta t_p</formula> is the time at which the ith pulse starts transmitting, and <formula notation="TeX">\Delta t_p</formula> is the interpulse period (i.e., pulse resolution). 
It is assumed that the magnitude of the received signal varies solely due to transverse meteor motion through the radar beam, as opposed to variation of the meteor plasma density or shape throughout the time of observation. Therefore, we assume the signal strength over time is proportional to the main lobe of the sinc function, <formula notation="TeX">\mathrm{sinc}(s) = \sin(\pi s) / (\pi s)</formula>, where <formula notation="TeX">s = 2t / t_f - 1</formula> and <formula notation="TeX">t_f</formula> is the time of the final pulse at which the head echo is observed. More complex formulations of SNR were tested, such as including multiple local SNR maxima via a Fourier sine series throughout the duration of observation, but this approach did not produce superior results to a simple sinc function for each head echo. The peak SNR in decibels is given by the parameter <formula notation="TeX">S_{\max}</formula>. Note that this SNR value is defined as the head echo signal strength in raw data, which differs from the SNR definition used to quantify signal strength of real head echoes, which is the SNR in decoded data after a matched filter with the appropriate Doppler shift is applied.</p><p>The function <formula notation="TeX">c_{\mathrm{tx}}(\tilde{t})</formula> represents the complex voltage signal of the transmitted pulse and is nonzero when <formula notation="TeX">\tilde{t} \in [0, B(L - 1)]</formula>, where L is the pulse length in bauds and B is the baud interval. The complex raw data of the ith received pulse of the head echo is then determined by the equations</p><p>where f is the facility carrier frequency, c is the speed of light, <formula notation="TeX">n_f</formula> is the noise floor of raw data in decibels, and <formula notation="TeX">t_i^{\star}</formula> is the time when the ith transmitted pulse reflects from the head echo. The values of f and <formula notation="TeX">n_f</formula> are defined in Table <ref type="table">1</ref> for each facility. The phase parameter <formula notation="TeX">\phi_0</formula> represents the phase of the received complex signal at the time of first observation. 
The value of <formula notation="TeX">t_i^{\star}</formula> can be determined exactly by solving the implicit expression</p><p>but it is sufficient to approximate <formula notation="TeX">t_i^{\star}</formula> to simplify the implementation of synthetic data generation. The simplest such approximation is to assume the meteor does not move much within one pulse, in which case</p><p>This approximation is improved by instead assuming that the meteor does not decelerate much during the time that the radio waves travel,</p><p>and then plugging in <formula notation="TeX">t_i^{\star} - t_i</formula> via the simpler approximation of Equation <ref type="formula">9</ref>. Hence, the following equation is used to approximate <formula notation="TeX">t^{\star}</formula> for the purpose of synthetic head echo generation:</p><p>All of the parameters in Equations 5-7 are determined randomly for each synthetic head echo via a uniform probability distribution within the ranges specified in Table <ref type="table">3</ref>.</p><p>Although a more traditional CNN architecture is trained for this work, the synthetic head echo model can be adapted to train any neural network model, including more advanced architectures such as the You Only Look Once (YOLO) algorithm, which is capable of identifying objects and their bounding boxes <ref type="bibr">(Bochkovskiy et al., 2020)</ref>.</p></div>
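<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ingredients of the synthetic model can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' generator. The range law is assumed to be a constant plus a decaying exponential that matches the initial range, radial velocity, and radial deceleration at the time of first observation. The reflection time is assumed to satisfy a one-way light-travel condition, with the pulse transmitted at the pulse start time reaching the meteor at its range at the reflection time, and is solved here by fixed-point iteration instead of the closed-form approximation used in this work. All function names and the example parameter values are hypothetical.</p>

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def head_echo_range(t, r0, v0, a0):
    """Assumed range law (m): r(0) = r0, dr/dt(0) = -v0, d2r/dt2(0) = a0."""
    return r0 - (v0 ** 2 / a0) * (1.0 - math.exp(-a0 * t / v0))


def sinc_envelope(t, t_f):
    """Main lobe of sinc(s) = sin(pi s) / (pi s), with s = 2 t / t_f - 1."""
    s = 2.0 * t / t_f - 1.0
    if s == 0.0:
        return 1.0
    return math.sin(math.pi * s) / (math.pi * s)


def reflection_time(t_i, r0, v0, a0, iterations=3):
    """Solve t_star = t_i + r(t_star) / c by fixed-point iteration.

    The starting guess assumes the meteor does not move within one pulse.
    The implicit form itself is an assumption of this sketch.
    """
    t_star = t_i + head_echo_range(t_i, r0, v0, a0) / C_LIGHT
    for _ in range(iterations):
        t_star = t_i + head_echo_range(t_star, r0, v0, a0) / C_LIGHT
    return t_star


# Hypothetical meteor: 100 km range, 40 km/s closing speed, 5 km/s^2 deceleration.
r0, v0, a0 = 100.0e3, 40.0e3, 5.0e3
t_f = 0.05  # assumed observation duration in seconds
for t in (0.0, 0.025, 0.05):
    print(t, head_echo_range(t, r0, v0, a0), sinc_envelope(t, t_f))
print(reflection_time(0.01, r0, v0, a0))
```

<p>Because the meteor speed is several orders of magnitude below the speed of light, the iteration converges within a few steps, which is consistent with the observation above that a simple approximation of the reflection time is sufficient.</p></div>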
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Generation of Training Examples</head><p>A large set of head echoes can easily be generated via the synthetic model. However, one must position these head echoes in an image segment of the appropriate size and combine them with background noise or phenomena in order to generate training data examples. A set of 10,000 synthetic head echoes at JRO, and 25,000 head echoes at RISR-N and MHO, are generated with randomized parameters for each facility via the aforementioned procedure.</p><p>The bounding box of each head echo is positioned at a random floating-point coordinate within an otherwise empty raw data segment, with the constraint that no more than 25% of any side length of the bounding box around the head echo extends beyond a segment edge. The x-coordinate of the first pulse of the head echo is rounded to an integer to align with the pulse resolution. The y-coordinate of the first sample is not rounded to an integer; instead the simulated head echo samples are linearly interpolated to the exact coordinates of the range gates of the segment. Physically, this accounts for the fact that the received data need not be sampled in alignment with the individual bauds of the signal; there will be some arbitrary delay.</p><p>Once the head echo has been placed in its segment, background phenomena must be added to make the raw data realistic and ensure the neural network learns to differentiate between head echoes and other phenomena. Unfortunately, it is currently not easy to generate background phenomena synthetically, since there is a wide range of unique phenomena observed by any given facility. Although models have been constructed for nonspecular trails and the equatorial electrojet <ref type="bibr">(Dyrud et al., 2005;</ref><ref type="bibr">Hysell et al., 2002)</ref>, attempting to simulate radar data for each phenomenon would be prohibitive. 
Thus, we combine synthetic head echoes with examples of background phenomena obtained from the real raw data that do not contain head echoes, which will henceforth be referred to as real negative segments. Real negative segments are much more common and easier to identify than segments that contain head echoes.</p><p>We initially identify more than 2,000 real negative segments of raw data, which requires a few hours of manual overhead for each facility. These segments are carefully selected to ensure they are truly negative; that is, contain no part of any object that resembles a head echo. The negative set chosen from JRO includes many instances of nonspecular trail echoes and the equatorial electrojet (EEJ). The MHO set contains some clutter and potential sporadic E events, although not as many events as JRO. Unlike the other facilities, RISR-N rarely observes objects other than head echoes throughout the experiment duration, so the negative segments chosen for RISR-N are simple noise examples. To reduce manual overhead as much as possible, we employ augmentation techniques to artificially increase the number of real negative segments at each facility from approximately 2,000 to 20,000. Such techniques include taking arbitrary linear combinations of the selected negative segments, and arbitrarily translating the segments in the horizontal and vertical directions, wrapping data around the edges of the segment. Although this process results in nonphysical discontinuities, it does not bias the training process in any way, since such discontinuities are not indicative of whether head echoes exist. A negative segment is then randomly selected to be additively combined with each synthetic head echo to create the set of up to 25,000 positive training examples. For each positive training example, the corresponding negative segment without the head echo is retained as part of the training set, for a total of up to 50,000 training examples. 
The process of combining a synthetic head echo with a real negative segment is demonstrated in Figure <ref type="figure">3</ref>, along with examples of training data generated for JRO.</p></div>
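<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of the augmentation and combination steps described in this section, the following Python sketch shows wrap-around translation of a negative segment, a linear combination of two negative segments, and the additive combination of a synthetic head echo with a negative segment. It is a minimal sketch under stated assumptions, not the implementation used in this work: the function names, the representation of a segment as a nested list of real or complex samples, and the choice of mixing weights drawn uniformly from [0, 1) are all hypothetical.</p>
<eg xml:space="preserve"><![CDATA[
```python
import random


def translate_wrap(segment, shift_rows, shift_cols):
    """Circularly shift a 2-D segment, wrapping data around its edges."""
    n_rows, n_cols = len(segment), len(segment[0])
    return [
        [segment[(r - shift_rows) % n_rows][(c - shift_cols) % n_cols]
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


def linear_combination(seg_a, seg_b, weight_a, weight_b):
    """Return weight_a * seg_a + weight_b * seg_b, sample by sample."""
    return [
        [weight_a * a + weight_b * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(seg_a, seg_b)
    ]


def augment_negative(negatives, rng):
    """Create one new negative segment from two randomly chosen real ones.

    Assumption: weights are uniform in [0, 1); the text only says the
    linear combinations and translations are arbitrary.
    """
    seg_a, seg_b = rng.sample(negatives, 2)
    mixed = linear_combination(seg_a, seg_b, rng.random(), rng.random())
    n_rows, n_cols = len(mixed), len(mixed[0])
    return translate_wrap(mixed, rng.randrange(n_rows), rng.randrange(n_cols))


def make_training_pair(head_echo_segment, negative_segment):
    """Additively combine a head echo with a negative segment.

    Returns the positive example (label 1) and the retained negative
    example without the head echo (label 0).
    """
    positive = linear_combination(head_echo_segment, negative_segment, 1, 1)
    return (positive, 1), (negative_segment, 0)
```
]]></eg>
<p>Because every positive example is paired with its own head-echo-free counterpart, a training set built this way is balanced by construction, which matches the totals quoted above (up to 25,000 positive examples within up to 50,000 training examples).</p></div>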
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5.">Neural Network Training Process</head><p>A single neural network per facility is trained on segment examples containing synthetic head echoes generated for the corresponding facility. The same CNN architecture is used at each facility, but the resulting training weights vary based on the experiment parameters and training set.</p><p>An algorithm based on mini-batch stochastic gradient descent, known as the Adam algorithm <ref type="bibr">(Kingma &amp; Ba, 2017)</ref>, is used with the cross-entropy loss function to train the network. Each optimization step takes into account 100 training examples. This batch size was chosen since it is large enough to ensure quick convergence, but small enough to not consume excessive memory during training. The learning rate, which controls the step size of the optimization algorithm, starts at 0.001 and is then decreased manually between epochs to a final value of 1 × 10⁻⁵. Since the training set is large, the algorithm converges within only a few epochs, that is, full passes through the training set. Weight decay, which is similar to L2 regularization of the training weights, is employed to protect against overfitting, with a coefficient of one-tenth of the learning rate.</p></div>
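<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the optimizer settings above concrete, the following Python sketch implements a single Adam update with a weight decay term and applies it to a toy quadratic loss. It is only an illustrative sketch: the function name adam_step, the toy loss, the intermediate learning rate of 1e-4, and the choice to fold weight decay into the gradient as an L2-style penalty are assumptions, since the text specifies only the starting and final learning rates and the ratio of the weight decay coefficient to the learning rate.</p>
<eg xml:space="preserve"><![CDATA[
```python
import math


def adam_step(weights, grads, state, lr,
              beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.0):
    """Apply one Adam update and return the new weights.

    `state` holds the step count "t" and the running moment estimates
    "m" and "v" (one entry per weight); it is updated in place.
    """
    state["t"] += 1
    t = state["t"]
    new_weights = []
    for i, (w, g) in enumerate(zip(weights, grads)):
        # Assumption: weight decay enters as an L2-style gradient penalty.
        g = g + weight_decay * w
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1 ** t)   # bias-corrected mean
        v_hat = state["v"][i] / (1 - beta2 ** t)   # bias-corrected variance
        new_weights.append(w - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_weights


# Toy problem: minimize sum(w_i ** 2), whose gradient is 2 * w_i.
weights = [1.0, -1.0]
state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}
# Hypothetical schedule: only the endpoints 1e-3 and 1e-5 come from the text.
for lr in (1e-3, 1e-4, 1e-5):
    for _ in range(100):
        grads = [2.0 * w for w in weights]
        # Weight decay coefficient set to one-tenth of the learning rate.
        weights = adam_step(weights, grads, state, lr, weight_decay=0.1 * lr)
```
]]></eg>
<p>In a real training loop the gradients would come from backpropagating the cross-entropy loss over a mini-batch of 100 segments rather than from a closed-form expression.</p></div>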
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Neural Network Performance and Discussion</head><p>The CNN is tested on a set of manually labeled raw data segments taken at each facility from the concurrent radar experiment. At RISR-N and MHO, one test set is considered per facility. At JRO, we consider performance against two test sets: one before dawn, when the EEJ is not present, and one during dawn, when the EEJ is present.</p><p>In real raw data, we manually search for and identify head echoes using either FFT decoding or a bank of Doppler-shifted matched filters, since head echoes are not always readily visible in raw data images and it is desired that the CNN detect even the weakest head echoes in raw data. When using these methods, objects are occasionally present that resemble streaks but are not clearly head echoes. Segments containing such objects are considered inconclusive and excluded from calculations of performance metrics.</p></div>
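<div xmlns="http://www.tei-c.org/ns/1.0"><p>The exclusion of inconclusive segments can be sketched as follows. This hypothetical Python helper tallies the standard confusion-matrix counts while skipping any segment whose truth label is inconclusive, then reports accuracy, precision, sensitivity, and the F1 score (the harmonic mean of precision and sensitivity). The function name, the use of None to mark an inconclusive segment, and the dictionary return format are assumptions made for illustration only.</p>
<eg xml:space="preserve"><![CDATA[
```python
def classification_metrics(truth_labels, predicted_labels):
    """Compute segment-level metrics, skipping inconclusive segments.

    truth_labels: 1 (head echo), 0 (no head echo), or None (inconclusive).
    predicted_labels: 1 or 0, one per segment.
    """
    tp = fp = tn = fn = 0
    for truth, predicted in zip(truth_labels, predicted_labels):
        if truth is None:
            continue  # inconclusive segments do not count toward any metric
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1

    nan = float("nan")
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else nan
    precision = tp / (tp + fp) if (tp + fp) else nan
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom > 0 else nan
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }
```
]]></eg>
<p>Undefined ratios (for example, precision when no segment is classified positive) are returned as NaN rather than zero so that they are not mistaken for measured values.</p></div>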
<div xmlns="http://www.tei-c.org/ns/1.0"><p>For each test set, labels for each segment are predicted via the trained neural network, and then checked against the truth labels. Performance metrics are quantified for each test set, including the classification accuracy, which is the overall fraction of correct classifications; the precision p, which is the fraction of positive classifications that are correct; the sensitivity s, which is the fraction of positively labeled segments that are correctly identified; and the F1 score, which is defined as <formula notation="tex">F_1 = \frac{2ps}{p + s}</formula>.</p><p>The sensitivity metric is broken into two separate categories: sensitivity to head echoes with a post-matched filter maximum SNR of greater than or equal to 20 dB, and sensitivity below this threshold. Head echoes of post-matched filter signal strength below 10 dB are rarely observed in this experiment configuration, so the latter category represents sensitivity to the weakest detectable head echoes. For each test set, the number of test segments with each label, and performance metrics, are provided in Table <ref type="table">4</ref>.</p><p>RISR-N has the greatest classification accuracy, precision and F1 score, as expected given the infrequency of objects besides head echoes at this facility. Its reduced sensitivity as compared to MHO, although still high overall, could be a result of the larger segment size used or the more complex pulse code. Since MHO has some sporadic E and cluttering phenomena, its precision is slightly lower than that of RISR-N. JRO has the lowest precision, overall accuracy and F1 score, which results from its frequent and prominent observations of objects other than head echoes, especially nonspecular trail echoes, which fill larger regions of data and frequently overlap head echoes. The precision drops further when the EEJ is present, but it is still greater than 0.5.
This precision is still reasonable, since fewer than one false positive needs to be rejected per true head echo detected.</p><p>The CNNs demonstrate greater than 0.7 overall sensitivity to head echoes at each facility, including at JRO when the EEJ is present. Sensitivity is significantly higher for stronger head echoes, as one would expect, whereas for the weaker head echoes, it remains above 0.5 across all facilities. Therefore, one can employ this algorithm to obtain a comprehensive head echo data set, including numerous weaker head echoes, at any facility. Sensitivity to strong echoes decreases when the EEJ is present at JRO since the echoes are often obscured by the EEJ, but most of the head echoes identified via manual searching are still detected. The strong performance at each facility demonstrates that one could apply the synthetic head echo technique at any facility for a future head echo experiment and quickly identify head echoes without an intensive training process.</p><p>Prior work for head echo detection at JRO has demonstrated precision and sensitivity values above 0.9 (Y. <ref type="bibr">Li et al., 2023)</ref>. Although our CNN does not achieve this performance at JRO, it is likely that the reduced precision results from the use of purely synthetic data to train the CNN, in addition to the simplicity of the CNN architecture and lack of pre-processing. It is important to note that the reduced overall complexity enables significantly faster computational performance and ease of implementation.
Future work will investigate incorporation of varying proportions of real data into the training set, in addition to more advanced architectures and inclusion of preprocessing to increase precision and sensitivity.</p><p>To quantify how the CNN algorithm improves over traditional (i.e., non-machine-learning) techniques used to detect head echoes, the performance of the method is compared with that of using a signal threshold to search for head echoes. The signal threshold search is performed as follows: FFT decoding is used to decode each segment of raw data, and then segments with any signal above a 20 dB threshold are labeled positive. Such a method is frequently used as a first pass to search for head echoes, but is expected to perform poorly when high-SNR phenomena other than head echoes are prevalent, and is furthermore expected to ignore any head echoes below the threshold signal strength. The overall sensitivity and precision of the threshold method are included in Table <ref type="table">4</ref>. Although perfect precision is achieved at RISR-N due to the infrequency of non-head echo phenomena at this facility, the precision drops significantly at MHO and JRO, and drops further when the EEJ is present. Since many head echoes are weaker than the signal threshold, the sensitivity of the method is poor in each test set besides that of JRO with the EEJ present. This occurs because with the presence of the EEJ, many segments are positively labeled despite the technique being insensitive to the actual appearance of a head echo. Use of the CNN for head echo identification is therefore far superior to a signal threshold search, which effectively reduces to a manual search when the EEJ or other prevalent phenomena are present.</p><p>Note (Table 4). At JRO, two separate test sets are evaluated: one without the equatorial electrojet (EEJ), and one with the EEJ present after dawn. The overall sensitivity and precision are also calculated for the method of assuming any segment with signal strength greater than 20 dB contains a head echo, after applying FFT decoding to each raw data segment. This demonstrates how using the CNN for identification improves upon using a signal strength threshold, which is a more traditional method of searching for head echoes.</p><p>To demonstrate the ability of the CNN to identify head echoes and discriminate them from other observations, a class activation map (CAM) can be generated, which is an image that indicates which regions of a CNN input trigger a positive classification. The CAM is generated via intermediate output from any CNN architecture that utilizes a global pooling operation after the convolutional layers. We forward propagate a particular input example into the CNN, and consider the output from the activation function of the third layer before the global maxpool is performed, of dimension 300-by-38-by-38. This output contains 300 channels of 38-by-38 images that represent specific combinations of features in the image that the CNN has detected. Each of the 38-by-38 channels is multiplied by its corresponding scalar weight in the final fully connected layer of size 300, and the resulting arrays are summed together. The values in this array are interpolated to the coordinates of the larger input image, effectively upscaling it to the input resolution, and forming a two-dimensional image that specifies which regions of the input example most influence the CNN to consider it a positive detection. A contour map is generated for this image and then overlaid onto an RTI image generated from the raw data segment. Some CAMs for head echoes correctly identified near other phenomena at JRO and MHO are shown in Figure <ref type="figure">4</ref>.</p><p>From the CAMs, it is clear that the CNN is capable of discerning the head echo signature, even in cases where background phenomena or clutter are present.
At JRO, head echoes that are embedded within the equatorial electrojet or precede a nonspecular trail are clearly detected independently. At MHO, brief range-spread clutter is frequently detected, and the CNN detects head echoes separately from such events, even in cases where the clutter is stronger. RISR-N detects cluttering events so infrequently that there is no practical need to eliminate false positives that occur due to such events.</p><p>Within the test set corresponding to each facility, with the exception of the test set at JRO where the EEJ is present, each of the head echoes is analyzed to determine its altitude, radial velocity, and SNR. FFT decoding is used to directly solve for an initial approximation of the average Doppler velocity of each head echo, as described in Y. <ref type="bibr">Li et al. (2023)</ref>. A matched filter with the corresponding Doppler shift is then used to decode the raw data. Within the region of the head echo, the raw data value at the range gate of maximum SNR is chosen for each pulse, and the pulse-to-pulse Doppler shift is determined as described in <ref type="bibr">Hedges et al. (2022)</ref> to obtain the most accurate radial velocity throughout the duration of detection. The head echo properties for each facility are presented in the scatter plots in Figure <ref type="figure">5</ref>. Head echoes that are contained in at least one correctly labeled raw data segment are marked as identified by the CNN, whereas head echoes not correctly identified in any raw data segments are marked as not identified. Note that the fraction of head echoes correctly detected by the CNN is greater than or equal to the overall CNN sensitivity at each facility, since in this case, the sensitivity is defined as the fraction of segments containing head echoes that are correctly identified, and a head echo need only be detected in one raw data segment to be considered detected in the scatter plots. 
</p><p>In each of the scatter plots, it is further demonstrated that head echoes with higher maximum signal strength are preferentially detected, but that many weaker head echoes below the 20 dB threshold are also detected. These are head echoes that likely would not be included in an analysis via more traditional detection techniques such as SNR thresholding, at least without significant manual overhead. Therefore, the CNN technique yields a more comprehensive population that includes head echoes resulting from meteoroids that travel farther from the center of the beam, in addition to those of smaller initial mass. Where interferometric measurements are available, such lower-mass observations can be directly identified.</p><p>Fewer head echoes are present in the test sets at RISR-N and MHO than at JRO, despite the larger number of raw data segments in the RISR-N and MHO test sets, since these facilities observe head echoes more infrequently than at JRO <ref type="bibr">(Hedges et al., 2022)</ref>. There are no clear trends amongst altitude or radial velocity in the undetected head echoes at RISR-N or JRO. At MHO, the CNN tends to overlook head echoes more frequently at lower radial velocities. This occurs because the MHO beam was pointed at a 45&#176; elevation angle due west for this experiment, unlike at the other facilities with near-vertical beams. This enables observation of head echoes with near-zero or negative radial velocities; for example, a meteoroid entering due west at 45&#176; from the horizontal will traverse the radar beam perpendicular to its pointing direction, resulting in zero radial velocity. However, the CNN was trained only on head echoes with radial velocity between 11 and 73 km per second.
An attempt was made to train the CNN with synthetic head echoes between 20 and 73 km per second for MHO, but the inclusion of synthetic head echoes with near-zero radial velocity caused a significant increase in the false positive rate. This is likely due to similarity between head echoes with little radial motion and therefore little Doppler shift, and non-head echo phenomena including the range-spread clutter frequently observed at MHO, as observed by the CNN.</p><p>The radar observations at all three facilities were not coordinated to observe any known meteor showers, so it is assumed that the vast majority of observed meteors are sporadic <ref type="bibr">(Schult et al., 2018)</ref>. However, interferometric capability is necessary to conclusively confirm or deny the presence of any meteor showers.</p></div>
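<div xmlns="http://www.tei-c.org/ns/1.0"><p>The class activation map construction described earlier in this section (a weighted sum of the final convolutional channels followed by interpolation to the input resolution) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the implementation used in this work: the function names are hypothetical, arrays are represented as nested lists, and bilinear interpolation with aligned corners is assumed because the text does not specify the interpolation scheme.</p>
<eg xml:space="preserve"><![CDATA[
```python
def class_activation_map(feature_maps, fc_weights):
    """Sum the channels of `feature_maps`, each scaled by its FC weight.

    feature_maps: list of channels, each a 2-D nested list of equal shape
                  (300 channels of 38-by-38 in the text).
    fc_weights:   one scalar weight per channel.
    """
    n_rows, n_cols = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * n_cols for _ in range(n_rows)]
    for channel, weight in zip(feature_maps, fc_weights):
        for r in range(n_rows):
            for c in range(n_cols):
                cam[r][c] += weight * channel[r][c]
    return cam


def upsample_bilinear(image, out_rows, out_cols):
    """Bilinearly interpolate `image` onto an out_rows-by-out_cols grid.

    Assumption: the corner samples of input and output grids coincide.
    """
    in_rows, in_cols = len(image), len(image[0])
    out = []
    for i in range(out_rows):
        y = i * (in_rows - 1) / (out_rows - 1) if out_rows > 1 else 0.0
        r0 = int(y)
        r1 = min(r0 + 1, in_rows - 1)
        fy = y - r0
        row = []
        for j in range(out_cols):
            x = j * (in_cols - 1) / (out_cols - 1) if out_cols > 1 else 0.0
            c0 = int(x)
            c1 = min(c0 + 1, in_cols - 1)
            fx = x - c0
            top = image[r0][c0] * (1 - fx) + image[r0][c1] * fx
            bottom = image[r1][c0] * (1 - fx) + image[r1][c1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```
]]></eg>
<p>Calling upsample_bilinear on the output of class_activation_map with the dimensions of the input segment yields an array that can be contoured and drawn over the corresponding RTI image.</p></div>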
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions and Future Work</head><p>In this paper, a technique to generate synthetic head echo observations given any radar facility and experiment parameters is developed. Furthermore, it is demonstrated that a CNN can successfully identify head echoes in raw radar data without any pre-processing, which greatly reduces computational cost. The utility of the synthetic head echo technique is demonstrated via its ability to train CNNs to detect head echoes at multiple HPLA radar facilities, including RISR-N, MHO, and JRO, which performed concurrent observations in October 2019. Each CNN is tested against a selection of real data from each facility, where raw data segments are manually labeled as either positive (contains one or more head echoes) or negative (does not contain a head echo). Each CNN demonstrates overall sensitivity greater than 0.7, indicating that most head echoes present in the data are identified.</p><p>The synthetic data approach is valuable because it significantly reduces the manual overhead necessary to train a computer vision algorithm to identify head echoes. Manually labeling head echoes would require days of effort and extensive head echo data from similar experiments (i.e., similar pulse code and other experiment parameters), which is not always available. With the synthetic data approach, one must still manually identify a set of raw data segments that do not contain head echoes, ideally with a comprehensive set of ionospheric phenomena and clutter observed by a facility.
Since such segments are much more frequent than those with head echoes, and data augmentation techniques are used to allow fewer such segments to be identified, this process takes hours rather than days.</p><p>This technique will enable a more in-depth analysis of head echoes throughout the full data sets at all three facilities that performed concurrent observations. Previous analyses from <ref type="bibr">Hedges et al. (2022)</ref> only included hundreds of head echoes at each facility. A more complete analysis containing many more head echoes will elucidate any temporal variation in the head echo population throughout the duration of observations, allowing quantification of variation in the neutral density within each latitude region.</p><p>Future work will enhance the capability of the technique to generate physically accurate synthetic data. Incoming meteoroids will be simulated along three-dimensional trajectories that pass through the known antenna pattern of a facility, which produces range and signal strength modulations in head echoes that are not currently captured by the technique. Simulation of the trajectory will further enable a model of the additional receiving raw data channels available at JRO, including dual polarization in the four separate receiving quadrants of the facility. This allows interferometric measurements to be utilized to deduce the location of scattering targets observed by the radar, and thus could allow a neural network to differentiate between head echoes and phenomena such as the EEJ that may not be co-located within the beam. This will enhance the sensitivity of the CNN to head echoes observed at the same range as the EEJ. 
This approach can be extended to other facilities capable of interferometric measurements to enhance detection, including the Middle Atmosphere Alomar Radar System in Norway <ref type="bibr">(Schult et al., 2013)</ref>, the Southern Argentine Agile Meteor Radar <ref type="bibr">(Janches et al., 2014)</ref>, and the middle and upper atmosphere (MU) radar in Japan <ref type="bibr">(Kastinen &amp; Kero, 2022)</ref>. Additionally, future work will explore integration of real data into the training sets in conjunction with synthetic data, since this may improve performance while still leveraging the benefits of synthetic data.</p><p>Subsequent research will explore adaptation of the synthetic data technique to more advanced CNN architectures. In particular, incorporating pre-processing via decoding of raw data could increase the precision and sensitivity to be comparable to what was achieved in previous work. Addition of the capability to identify a bounding box surrounding each head echo, in addition to classifying whether a head echo is present, will enable complete automation of head echo analysis, including determination of the meteoroid velocity and deceleration versus time (Y. <ref type="bibr">Li et al., 2023)</ref>. Future research will also quantify sensitivity biases of CNN methods including this one, and determine how the identified head echo populations differ from those obtained manually. One must consider that manual identification may still yield a biased set of head echoes, since the weakest and shortest-duration head echoes are easily missed by the human eye. To avoid such bias, one must carefully identify an exhaustive head echo set that includes the weakest and shortest head echoes, as was done for the test sets in this work.
Therefore, continued research of machine learning identification techniques will prove critical in the ongoing study of meteor head echoes and their scientific value.</p></div>
		</body>
		</text>
</TEI>
