<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Prospects for 21 cm Galaxy Cross-correlations with HERA and the Roman High-latitude Survey</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>02/01/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10451535</idno>
					<idno type="doi">10.3847/1538-4357/acaeb0</idno>
					<title level='j'>The Astrophysical Journal</title>
<idno>0004-637X</idno>
<biblScope unit="volume">944</biblScope>
<biblScope unit="issue">1</biblScope>					

					<author>Paul La Plante</author><author>Jordan Mirocha</author><author>Adélie Gorce</author><author>Adam Lidz</author><author>Aaron Parsons</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[The cross-correlation between the 21 cm field and the galaxy distribution is a potential probe of the Epoch of Reionization (EoR). The 21 cm signal traces neutral gas in the intergalactic medium and, on large spatial scales, this should be anticorrelated with the high-redshift galaxy distribution, which partly sources and tracks the ionized gas. In the near future, interferometers such as the Hydrogen Epoch of Reionization Array (HERA) are projected to provide extremely sensitive measurements of the 21 cm power spectrum. At the same time, the Nancy Grace Roman Space Telescope (Roman) will produce the most extensive catalog to date of bright galaxies from the EoR. Using seminumeric simulations of reionization, we explore the prospects for measuring the cross-power spectrum between the 21 cm and galaxy fields during the EoR. We forecast a 12σ detection between HERA and Roman, assuming an overlapping survey area of 500 deg², redshift uncertainties of σ_z = 0.01 (as expected for the high-latitude spectroscopic survey of Lyα-emitting galaxies), and an effective Lyα emitter duty cycle of f_LAE = 0.1. Thus the HERA–Roman cross-power spectrum may be used to help verify 21 cm detections from HERA. We find that the shot-noise in the galaxy distribution is a limiting factor for detection, and so supplemental observations using Roman should prioritize deeper observations, rather than covering a wider field of view. We have made a public GitHub repository containing key parts of the calculation, which accompanies this paper: https://github.com/plaplant/21cm_gal_cross_correlation.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The Epoch of Reionization (EoR) is one of the last remaining frontiers in observational cosmology. The EoR is the time period when the first luminous sources in the universe formed and gradually photoionized neutral hydrogen in the surrounding intergalactic medium (IGM). This is thought to have occurred roughly 0.5&#8211;1 Gyr after the Big Bang. This large-scale transition of the universe has yet to be fully observed: in particular, the redshift evolution of the average ionization fraction, x i (z), and the overall topology of the reionization process remain highly uncertain. Measurements of the ionization history and its spatial fluctuations would help pin down the properties of early star-forming galaxies, as well as cosmological parameters such as the optical depth of cosmic microwave background (CMB) photons to electron scattering, &#964;. As such, measurements of the EoR have important ramifications for multiple fields in astrophysics and cosmology.</p><p>Observationally, radio interferometers are attempting to measure the 21 cm signal of neutral hydrogen <ref type="bibr">(Madau et al. 1997)</ref>. The hyperfine transition of neutral hydrogen atoms emits and absorbs radiation with a rest wavelength of 21 cm, which can imprint an excess or deficit in brightness relative to the CMB backlight. During the EoR, the 21 cm signal is expected to have significant fluctuations due to the presence of large ionized regions, because once ionized, the hydrogen in the IGM no longer emits a 21 cm signal. While many experiments seek the 21 cm power spectrum (e.g., the Low Frequency Array; <ref type="bibr">van Haarlem et al. 2013</ref>; the Murchison Widefield Array, MWA; <ref type="bibr">Bowman et al. 2013</ref>; and the Owens Valley Long Wavelength Array; <ref type="bibr">Hallinan et al. 
2015)</ref>, in this work we focus on the Hydrogen Epoch of Reionization Array (HERA; <ref type="bibr">DeBoer et al. 2017</ref>), currently under construction in the Karoo desert of South Africa. Once completed, HERA will be the most sensitive instrument to date measuring the 21 cm signal, and is expected to provide the first statistically significant detection of the 21 cm power spectrum <ref type="bibr">(DeBoer et al. 2017)</ref>.</p><p>At the same time, wide-field infrared telescopes are planning to observe a large ensemble of high-redshift galaxies directly, e.g., Euclid<ref type="foot">7</ref> and the Nancy Grace Roman Space Telescope (Roman; <ref type="bibr">Spergel et al. 2015)</ref>. In this work, we focus in particular on Roman, expected to launch in 2026, as its wide-field surveys are deeper than the nominal Euclid survey. Among other scientific goals, Roman is slated to observe 2200 deg&#178; of the sky with both imaging and spectroscopy, the so-called High Latitude Survey (HLS), with imaging and spectroscopic campaigns referred to as the HLIS and HLSS, respectively. Although this survey is designed to detect galaxies with redshift 1 &#8818; z &#8818; 3, it can also detect galaxies at redshifts of z &#8819; 5 via the Lyman-break technique <ref type="bibr">(Steidel et al. 1996)</ref>. These high-redshift galaxies include some of the sources thought to be responsible for ionizing the universe during the EoR. In principle, the galaxy field measured in such a wide-field survey and the 21 cm signal should be statistically anticorrelated on large spatial scales during the EoR, at least in the expected case that large-scale overdensities are ionized first, i.e., in an "inside-out" reionization scenario. This anticorrelation is due to the galaxy field tracing highly biased regions of cosmic structure, whereas the 21 cm signal comes predominantly from lower-density, neutral regions. 
As such, the 21 cm and galaxy fields should be anticorrelated, at least on spatial scales that are large relative to the typical size of ionized bubbles.</p><p>The cross-power spectrum of the 21 cm and galaxy fields is also interesting in principle as a means to validate measurements made of the 21 cm auto-power spectrum from HERA. Detecting the cross-power spectrum between the 21 cm signal and an independent tracer will provide important evidence that should confirm inferences made from HERA measurements alone, such as information about the size distribution of ionized bubbles and their evolution with redshift. Furthermore, at low redshift the only 21 cm fluctuation measurements made to date have been in cross-correlation with galaxy and quasar catalogs, rather than as a 21 cm auto-power spectrum. For example, the recent results from the Canadian Hydrogen Intensity Mapping Experiment (CHIME<ref type="foot">foot_0</ref>) report a detection of the cross-power spectrum between the 21 cm signal and galaxies and quasars from eBOSS <ref type="bibr">(CHIME Collaboration et al. 2022)</ref>. Additionally, H I intensity maps at z &#8764; 1 made by the Green Bank Telescope have been used in cross-correlation with galaxy surveys to measure the hydrogen abundance and bias parameters <ref type="bibr">(Chang et al. 2010;</ref> <ref type="bibr">Masui et al. 2013;</ref> <ref type="bibr">Switzer et al. 2013;</ref> <ref type="bibr">Wolz et al. 2022</ref>). As such, modeling and measuring the 21 cm&#8211;galaxy cross-power spectrum at high redshift is useful, independent of the 21 cm auto-power spectrum.</p><p>Previous studies <ref type="bibr">(Furlanetto &amp; Lidz 2007;</ref> <ref type="bibr">Wyithe &amp; Loeb 2007;</ref> <ref type="bibr">Lidz et al. 2009;</ref> <ref type="bibr">Vrbanec et al. 2020</ref>) have looked at similar prospects for cross-correlation between 21 cm experiments and wide-field galaxy surveys. 
We extend this previous work by making the first cross-spectrum forecasts for the upcoming HERA and Roman data. Importantly, we also explicitly account for the "foreground wedge" <ref type="bibr">(Datta et al. 2010;</ref> <ref type="bibr">Pober et al. 2013;</ref> <ref type="bibr">Parsons et al. 2014)</ref>. That is, previous studies accounted only for foregrounds preventing measurements of Fourier modes with long-wavelength line-of-sight components, i.e., low-k &#8741; modes. In fact, the frequency dependence of the instrumental response of an interferometer leads to mode-mixing, and this corrupts some high-k &#8741; modes as well. Nevertheless, this contamination is expected mostly to occupy a wedge-shaped region in the k &#8741; &#8211;k &#8869; plane. This further degrades the prospects for cross-power spectrum measurements, as discussed in this work. Furthermore, we include a detailed treatment of the Roman observations. Specifically, we account for the nominal magnitude and flux limits for the HLIS and HLSS, a complete handling of redshift uncertainties, and projections for the joint overlap area between HERA and Roman, now that such information is available <ref type="bibr">(Spergel et al. 2015;</ref> <ref type="bibr">Dor&#233; et al. 2018;</ref> <ref type="bibr">Wang et al. 2022)</ref>. Finally, we model how the cross-correlation signal varies as a function of ionization history and galaxy properties. Although these conclusions may be partly tied to the particular seminumeric simulation of reionization employed, these model variations nevertheless have important implications for the detectability and interpretation of the cross-spectrum signal.</p><p>We organize the rest of this paper as follows. In Section 2, we describe the seminumeric simulations used for this study. In Section 3, we present the primary results of our work. In Section 4, we discuss the feasibility of detection for upcoming 21 cm surveys. 
In Section 5, we explore ways that observations can improve on the fiducial measurement strategies. Finally, in Section 6, we provide a summary and avenues for future research. Throughout this work, we assume a &#923; cold dark matter cosmology with parameters consistent with the Planck 2018 results <ref type="bibr">(Planck Collaboration et al. 2020</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Methods</head><p>In this section, we describe the simulations used to explore the EoR and to model the cross-correlation power spectrum between the 21 cm and galaxy fields. We expect the 21 cm and galaxy fields to be well-correlated on relatively large scales (&#8819;1 h&#8315;&#185; Mpc) based on previous work <ref type="bibr">(Lidz et al. 2009</ref>). Thus we opt to simulate a large volume with moderate resolution, as this helps capture features of the field on the scales probed by upcoming observations. We begin by describing the seminumeric reionization method that we use, followed by our galaxy modeling. We also include a discussion of the various observational systematics present for both the 21 cm measurements and the galaxy field.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">21 cm Modeling</head><p>Accurately simulating the EoR is a computationally difficult problem. The complex interplay between dark matter, baryons, and photons necessitates employing N-body methods, hydrodynamics, and radiative transfer to capture the variety of physical effects self-consistently. Furthermore, the formation of luminous objects on subkiloparsec scales has implications for the ionization state of the IGM on scales of tens of megaparsecs, which requires a tremendous amount of dynamic range in the simulation. For now, even state-of-the-art simulation packages are incapable of handling such a workload. Furthermore, for the present study, we are most interested in the behavior of the IGM on relatively large scales, and so many of the small-scale details inside of individual galaxies are not relevant. Thus, we opt to use a seminumeric scheme for simulating reionization, which captures the main features in a reliable fashion while remaining relatively cheap computationally to allow for an exploration of the parameter space of various ionization histories. Specifically, we make use of the zreion seminumeric reionization code <ref type="bibr">(Battaglia et al. 2013</ref>), which has previously been applied to modeling the 21 cm signal in <ref type="bibr">La Plante et al. (2014)</ref>, <ref type="bibr">La Plante &amp; Ntampaka (2019)</ref>, and <ref type="bibr">La Plante et al. (2020)</ref>.</p><p>The central Ansatz of zreion is that the matter density field &#948; m (r) and the redshift at which a particular portion of the IGM is reionized, z re (r), are correlated on large scales. We begin by defining the matter overdensity field &#948; m (r) as:</p><p><formula notation="tex" n="1">\delta_m(\mathbf{r}) \equiv \frac{\rho_m(\mathbf{r}) - \bar{\rho}_m}{\bar{\rho}_m},</formula></p><p>where &#961;&#772; m is the mean matter density, and the equivalent "overdensity" field for the redshift of reionization &#948; z (r) is</p><p><formula notation="tex" n="2">\delta_z(\mathbf{r}) \equiv \frac{[1 + z_{\mathrm{re}}(\mathbf{r})] - [1 + \bar{z}]}{1 + \bar{z}},</formula></p><p>where z&#772; is the mean value for the z re (r) field. 
The zreion method expresses the correlation between &#948; m (r) and &#948; z (r) as a scale-dependent bias factor b zm (k), which is defined as:</p><p><formula notation="tex" n="3">b_{zm}^2(k) \equiv \frac{\langle \delta_z^* \delta_z \rangle_k}{\langle \delta_m^* \delta_m \rangle_k} = \frac{P_{zz}(k)}{P_{mm}(k)}.</formula></p><p>We parameterize this bias factor using two parameters, k 0 and &#945;, and express the bias as a function of Fourier wavenumber k:</p><p><formula notation="tex" n="4">b_{zm}(k) = \frac{b_0}{(1 + k/k_0)^{\alpha}}.</formula></p><p>We use the value of b 0 = 1/&#948; c = 0.593, where &#948; c is the critical overdensity in spherical collapse halo models. Given a cosmological density field, zreion relies only on the value of the parameters {z&#772;, &#945;, k 0 } to determine the redshift of reionization z re (x). The midpoint of reionization, defined as the time at which the universe is 50% ionized by volume, is largely determined by the mean value z&#772; (though it is not identically equal to the parameter), and the duration is controlled by k 0 and &#945;.</p><p>For the current work, we run a series of dark-matter-only simulations that contain 1024&#179; particles in a cubic volume with length L = 2 h&#8315;&#185; Gpc on a side, which corresponds to an angular extent of &#952; &#8776; 18 deg at z = 8. This is about twice the extent of the instantaneous field of view (FOV) of HERA <ref type="bibr">(DeBoer et al. 2017)</ref>. We first generate a set of initial conditions based on transfer functions obtained from CAMB <ref type="bibr">(Lewis et al. 2000)</ref>. Given these initial conditions, we use second-order Lagrangian perturbation theory (2LPT) to evolve the particle positions as a function of redshift. Although 2LPT does not capture the nonlinear evolution of particles to the same extent as N-body methods, the results are sufficiently accurate on the scales of interest <ref type="bibr">(Scoccimarro 1998)</ref>. To generate the ionization field, we use 2LPT to evolve the particles to the midpoint of reionization z&#772;. To compute the matter density field &#948; m (x), we use triangular shaped clouds to deposit the particles in the simulation onto a regular cubic lattice. 
With the density field in hand, we apply a fast Fourier transform (FFT) and apply the bias from Equation (4) to &#948; m (k) to compute &#948; z (k).</p><p>After applying an inverse FFT, we use Equation (2) to compute z re (x). To convert from this quantity at a particular redshift z 0 into an ionization field x i (x, z 0 ), we set the ionization field to 1 where z re (x) is greater than z 0 (meaning that portion of the volume was reionized at an earlier time), and to 0 otherwise.</p><p>Once we have the ionization and density fields, we can convert to the 21 cm brightness temperature by using <ref type="bibr">(Madau et al. 1997)</ref>:</p><p><formula notation="tex" n="5">\delta T_b(\mathbf{r}, z) = T_0(z)\,[1 + \delta_m(\mathbf{r})]\,[1 - x_i(\mathbf{r})],</formula></p><p>where the quantity T 0 (z) is</p><p><formula notation="tex" n="6">T_0(z) = 26\,\mathrm{mK} \left(\frac{T_S - T_\gamma}{T_S}\right) \left(\frac{\Omega_b h^2}{0.022}\right) \left[\left(\frac{0.143}{\Omega_m h^2}\right) \left(\frac{1 + z}{10}\right)\right]^{1/2}.</formula></p><p>In this equation, T S is the spin temperature of neutral hydrogen and T &#947; is the temperature of the CMB. It is common in the literature to assume the spin temperature to be coupled to the temperature of the gas T gas . This is thought to occur via X-ray photoheating, perhaps driven by X-ray binaries in early galaxies, such that the gas becomes substantially hotter than T &#947; before it is significantly ionized <ref type="bibr">(Furlanetto et al. 2006)</ref>. For contrast, it is also useful to consider a "cold reionization" scenario in which there is no such heating. In this case, we suppose that T gas cools adiabatically between when the gas thermally decouples from the CMB (near z &#8764; 200) and reionization. This makes T gas smaller than T &#947; , so the brightness temperature in Equation (<ref type="formula">5</ref>) is negative and has a larger amplitude than in our fiducial scenario. We consider this case in Section 4.1.</p></div>
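<div xmlns="http://www.tei-c.org/ns/1.0"><p>The zreion steps described in Section 2.1 can be illustrated with a minimal one-dimensional sketch. It is not the zreion code itself: the function names, the toy density values, the box length, and the values chosen for k 0 , &#945;, and z&#772; are all hypothetical and serve only as an illustration, and a direct Fourier sum stands in for the FFT on a three-dimensional lattice. The sketch multiplies the Fourier-space density field by a bias of the form b 0 /(1 + k/k 0 )^&#945;, inverts the definition of &#948; z to obtain z re , and thresholds the result at z 0 .</p>
<p><code lang="python"><![CDATA[
```python
import cmath
import math

B0 = 1.0 / 1.686  # b_0 = 1/delta_c ~ 0.593, the value quoted in the text


def bias_zm(k, k0, alpha):
    """Scale-dependent bias b_0 / (1 + k/k0)^alpha."""
    return B0 / (1.0 + k / k0) ** alpha


def dft(values, inverse=False):
    """Direct discrete Fourier transform; a stand-in for an FFT on small inputs."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for a in range(n):
        total = sum(v * cmath.exp(sign * 2j * math.pi * a * m / n)
                    for m, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def redshift_of_reionization(delta_m, box_length, zbar, k0, alpha):
    """Filter delta_m by the bias, then map delta_z to z_re."""
    n = len(delta_m)
    delta_k = dft(delta_m)
    filtered = []
    for a, amp in enumerate(delta_k):
        mode = a if a <= n // 2 else a - n          # signed mode number
        k = 2.0 * math.pi * abs(mode) / box_length  # wavenumber magnitude
        filtered.append(amp * bias_zm(k, k0, alpha))
    delta_z = [c.real for c in dft(filtered, inverse=True)]
    # Invert delta_z = ([1 + z_re] - [1 + zbar]) / (1 + zbar)
    return [zbar + (1.0 + zbar) * dz for dz in delta_z]


def ionization_field(z_re, z0):
    """1 where the cell reionized before z0 (z_re > z0), 0 otherwise."""
    return [1.0 if z > z0 else 0.0 for z in z_re]


# Hypothetical toy inputs: one overdense cell, three underdense cells.
toy_delta_m = [0.3, -0.1, -0.1, -0.1]
z_re = redshift_of_reionization(toy_delta_m, box_length=100.0,
                                zbar=8.0, k0=0.2, alpha=0.5)
x_i = ionization_field(z_re, z0=8.0)
```
]]></code></p><p>In this toy example the single overdense cell reionizes before the mean redshift and the underdense cells after it, which is the "inside-out" behavior that the method assumes.</p></div>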
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.1.">Changing the Ionization History</head><p>To explore the extent to which these results depend on the precise ionization history, we vary the parameters of our seminumeric reionization model described in Section 2.1 to produce alternate plausible histories. In addition to our "fiducial" scenario, we run a series of simulations that have a shorter reionization history with a comparable midpoint (the "short" scenario). We also run a simulation that has a slightly later midpoint of reionization (the "late" scenario),<ref type="foot">9</ref> which can improve our understanding of how the results are impacted by the midpoint of reionization occurring at redshift values that are not covered by the Roman grism (as discussed in Section 2.4.2).</p><p>Figure <ref type="figure">1</ref> shows the ionization fraction as a function of redshift for the different ionization histories described. In all cases, we use the same values of the galaxy bias parameters described below in Section 2.2. As discussed further there, we do not include any explicit connection between the ionization field and the galaxy field besides the fact that both are treated as biased tracers of the matter density field. However, given that we are primarily interested in relatively bright and biased galaxies in the HLS observations on large scales, we do not expect this simplification to be a significant source of error.</p><p>In Sections 3 and 4, we present results for all three of the ionization histories. In general, the amplitude of the cross-spectrum increases as the duration of reionization decreases. This is due to a larger amplitude of the 21 cm fluctuations on large scales for these shorter scenarios, which is a general feature of the seminumeric model used (and explored further in <ref type="bibr">La Plante et al. 2014</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Galaxy Modeling</head><p>Like reionization, galaxy formation is a rich and complicated physical process that involves the interactions between dark matter, baryons, and radiation over many decades in length scale. As a result, self-consistent simulations of galaxy formation and reionization, especially in the large volumes relevant to cross-correlations, remain exceedingly challenging <ref type="bibr">(Iliev et al. 2006;</ref> <ref type="bibr">Trac &amp; Cen 2007;</ref> <ref type="bibr">O'Shea et al. 2015;</ref> <ref type="bibr">Ocvirk et al. 2016;</ref> <ref type="bibr">Lewis et al. 2022)</ref>. Given that we are most interested in the large-scale correlation between the 21 cm and galaxy fields, we use a linear bias model to rapidly generate a galaxy abundance field rather than simulating it from first principles, though the linear bias values we adopt are themselves based on semianalytic and hydrodynamical models.</p><p>We first describe our model for star formation in high-z galaxies in Section 2.2.1 and the resulting predictions for the galaxy bias and abundance. Then, we describe our approach to Ly&#945; emission in Section 2.2.2, with an emphasis on the likelihood that galaxies detected in the Roman high-latitude imaging survey (HLIS) are also detected in the high-latitude spectroscopic survey (HLSS).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.1.">Galaxy Properties</head><p>The bias of galaxies, one of two key inputs to our model, can be readily computed from simulations using an expression analogous to Equation (3), but using the galaxy overdensity field in place of the reionization redshift field. This is precisely what has been done for the BLUETIDES simulations <ref type="bibr">(Feng et al. 2014;</ref> <ref type="bibr">Waters et al. 2016)</ref>. In this work, we will employ both the BLUETIDES predictions for the galaxy bias and an efficient semiempirical model implemented in ARES.<ref type="foot">10</ref> Both models have been designed to match the high-redshift rest-UV luminosity functions (UVLFs), and so are in good agreement with current data sets by construction. ARES allows us to explore potential extensions to the HLS, for which galaxy bias predictions from simulations are not readily available. We summarize the most pertinent aspects of this model here briefly, and refer the interested reader to <ref type="bibr">Mirocha et al. (2017)</ref> and <ref type="bibr">Mirocha et al. (2020)</ref> for more details.</p><p>Our basic approach is similar to many semiempirical models put forth in recent years. We assume that the star formation rate (SFR) in galaxies is driven by halo growth, with the SFR proportional to the halo mass accretion rate through f * , a halo mass-dependent star formation efficiency. We derive halo mass accretion rates (MARs) under the assumption that halos evolve at fixed number density (see <ref type="bibr">Furlanetto et al. 2017)</ref>, which provides a good match to MARs derived from numerical simulations <ref type="bibr">(Mirocha et al. 2021)</ref>. We assume f * is a double power law, and calibrate its free parameters by jointly fitting to high-z UVLFs and UV colors (&#946;) from <ref type="bibr">Bouwens et al. (2015)</ref> and <ref type="bibr">Bouwens et al. (2014)</ref>, respectively.</p><p>Two key assumptions remain. 
First, we adopt the BPASS version 1 single-star models <ref type="bibr">(Eldridge &amp; Stanway 2009)</ref> with a stellar metallicity of Z = 0.004.<ref type="foot">foot_4</ref> Second, whereas many models in the literature neglect dust or adopt empirical relationships between dust attenuation and UV color, we self-consistently forward model the full rest-UV spectrum of each model galaxy, with additional free parameters governing the dust production efficiency and scale length allowed to vary as well (see <ref type="bibr">Mirocha et al. 2020)</ref>. This results in slightly different relationships between, e.g., M UV and the UV extinction A UV than are predicted from the relationship between infrared excess and &#946; at lower redshifts, such as that of <ref type="bibr">Meurer et al. (1999)</ref>.</p><p>For semiempirical models like this, for which there is no simulation box, one must derive the galaxy bias as a weighted integral over the halo mass function, dn/dm, and halo bias, b h . We use the <ref type="bibr">Tinker et al. (2008)</ref> mass function throughout, and adopt their fitting formula for the halo bias as well <ref type="bibr">(Tinker et al. 2010)</ref>.</p><p>In Figure <ref type="figure">2</ref>, we compare the basic properties of galaxies in the ARES semiempirical model (solid black) to many others from the literature.<ref type="foot">12</ref> The most striking differences among this set of models occur in the stellar mass&#8211;halo mass (SMHM) relation (left), while the specific star formation rate (sSFR) of galaxies (center) and stellar mass&#8211;UV magnitude relation (right) are much more similar. At a glance, it is the semiempirical models that generally predict higher stellar masses at fixed halo mass <ref type="bibr">(Behroozi et al. 2013a</ref>, <ref type="bibr">2013b;</ref> <ref type="bibr">Tacchella et al. 2018;</ref> <ref type="bibr">Park et al. 2019;</ref> <ref type="bibr">Mirocha et al. 
2020)</ref>, as well as more not shown here (e.g., Sun &amp; Furlanetto 2016), while simulation-based estimates like BLUETIDES <ref type="bibr">(Feng et al. 2015</ref>, <ref type="bibr">2016)</ref> and, e.g., FIRE-2 <ref type="bibr">(Ma et al. 2018)</ref>, exhibit lower M * /M h ratios at fixed M h , more in line with UNIVERSEMACHINE predictions <ref type="bibr">(Behroozi et al. 2019</ref>) and other similar semiempirical models built on N-body simulations (e.g., <ref type="bibr">Moster et al. 2018</ref>). However, one can also find examples in the literature of very detailed ab initio simulations that predict higher SMHM ratios than BLUETIDES or FIRE (e.g., Renaissance, SERRA; <ref type="bibr">Xu et al. 2016;</ref> <ref type="bibr">Pallottini et al. 2022)</ref>, so the source of the differences here remains (as far as we know) unclear.</p><figure><head>Figure 1.</head><figDesc>The global ionization fraction x i as a function of redshift z. We show this quantity for the three scenarios explored in this work. As mentioned in Section 2.1.1, these different histories help show the extent to which our forecasts are affected by details specific to one particular history.</figDesc></figure><p>To our knowledge, only <ref type="bibr">Waters et al. (2016)</ref> explicitly provided predictions for the bias of Roman HLS sources. From Figure <ref type="figure">2</ref>, we expect the bias of sources in BLUETIDES to be higher than the ARES models described earlier in this section: because there is good agreement between predictions for the sSFR as a function of stellar mass among all models (middle panel), and reasonably good agreement also in predictions for the M UV &#8211;M * relation (right), we conclude that at fixed stellar mass, the models predict very similar intrinsic and dust-reddened luminosities. As a result, the SMHM relation will dominate differences in galaxy bias predictions. 
Indeed, this is the case, as we show in Figure <ref type="figure">3</ref> (top), along with predictions for the abundance of high-z galaxies (bottom). Here, we show the BLUETIDES predictions as well as ARES models with four different magnitude cuts: the nominal HLS limiting magnitude of 26.7 (solid blue), as well as scenarios 1 magnitude shallower (dotted cyan) and 1 and 2 magnitudes deeper (dashed and dashed-dotted curves, respectively). As expected from Figure <ref type="figure">2</ref>, these models differ nontrivially in their predictions: while BLUETIDES predicts b(z = 8) &#8771; 13.5, the <ref type="bibr">Mirocha et al. (2020)</ref> models predict b(z = 8) &#8771; 9, with more rapid evolution at high redshifts than a linear extrapolation of BLUETIDES would suggest. In the bottom panel, we compare predictions for the surface density of galaxies as a function of redshift. Once again, there are noticeable differences, at the &#8764;2&#8211;3&#215; level.</p><p>The differences between model predictions for the SMHM relation (and thus galaxy bias) are certainly interesting and warrant attention (see, e.g., Section 4.1 in <ref type="bibr">Tacchella et al. 2018</ref>, for additional discussion). In this work, however, we will remain agnostic about which of these models (if any) is correct, and instead use their differences to motivate a plausible range of possibilities to explore in our cross-correlation forecast. Because BLUETIDES has provided predictions for the bias of HLS sources explicitly, we will explore this as one possible scenario, and use the ARES models as a contrasting case, for which we can efficiently generate alternative scenarios for different survey parameters or galaxy properties. 
For additional comparisons of the BLUETIDES and ARES predictions, see Appendix A.</p><p>Finally, once we have the linear bias in hand, we construct the galaxy field from the matter density field &#948; m through the relation:</p><p><formula notation="tex">\delta_g(\mathbf{r}) = b_g\,\delta_m(\mathbf{r}).</formula></p><p>Note that given the relatively large values for the bias b g , as seen in Figure <ref type="figure">3</ref>, the simulated galaxy distribution does sometimes reach nonphysical values in the simulation, where &#948; g &lt; -1. Although this is a limitation of our current treatment, we do not expect these values to cause significant inaccuracies in the resulting predictions given that most of the cross-spectrum sensitivity comes from large spatial scales where a linear biasing model is a good approximation.</p></div>
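<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of the linear bias model in Section 2.2.1, the sketch below scales a toy matter overdensity field by b g and counts the cells that fall below &#948; g = -1. It also shows one common way to form a number-weighted bias from tabulated halo quantities. The function names and the toy input values are hypothetical, the value b g = 9 is only the approximate z = 8 ARES figure quoted in the text, and the number-weighted average is an assumed form rather than the exact expression used in the paper.</p>
<p><code lang="python"><![CDATA[
```python
def galaxy_overdensity(delta_m, b_g):
    """Linear bias model: delta_g = b_g * delta_m, cell by cell."""
    return [b_g * d for d in delta_m]


def unphysical_fraction(delta_g):
    """Fraction of cells with delta_g < -1 (a negative galaxy density)."""
    return sum(1 for d in delta_g if d < -1.0) / len(delta_g)


def number_weighted_bias(halo_bias, dn_dm, dm):
    """Number-weighted mean halo bias over tabulated mass bins (assumed form).

    halo_bias: halo bias b_h in each mass bin
    dn_dm:     halo mass function dn/dm in each bin
    dm:        width of each mass bin
    """
    num = sum(b * n * w for b, n, w in zip(halo_bias, dn_dm, dm))
    den = sum(n * w for n, w in zip(dn_dm, dm))
    return num / den


# Hypothetical toy matter overdensities on four cells.
toy_delta_m = [0.05, -0.02, -0.15, 0.12]
delta_g = galaxy_overdensity(toy_delta_m, b_g=9.0)
bad = unphysical_fraction(delta_g)
```
]]></code></p><p>With a bias this large, even a modest underdensity of &#948; m = -0.15 maps to &#948; g below -1, which is why such cells appear in a linearly biased field.</p></div>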
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.2.">Nebular Emission</head><p>So far, we have only considered the bias and abundance of galaxies detected in at least one band in the HLIS. However, a key question is whether or not the galaxies detected in the imaging survey will also be detected spectroscopically, since redshift uncertainties &#963; z &#8819; 0.1 (e.g., photometric redshifts alone) are expected to prevent a cross-correlation detection <ref type="bibr">(Furlanetto &amp; Lidz 2007)</ref>. Specifically, if the redshift uncertainties in the galaxy survey are too large, this prevents measuring higher k &#8741; modes in the galaxy distribution, while 21 cm surveys generally lose the low k &#8741; modes to foreground contamination. For 21 cm&#8211;galaxy cross-power spectrum measurements, it is thus crucial to obtain good spectroscopic redshifts in the galaxy survey, as this will help ensure that each survey measures some common Fourier modes. We discuss the impact of measurement uncertainties in more detail in Section 2.4.2, and here focus on whether galaxies bright enough to be detected in imaging ought to also be bright enough in their Ly&#945; emission to be detected in the spectroscopic survey.</p><p>We take a simple approach and assume photoionization equilibrium in the H II regions of galaxies. This allows us to relate the luminosity of Ly&#945; to the recombination rate (&#8764;2/3 of recombinations result in Ly&#945; photons), which is in turn related to the intrinsic ionizing photon production rate, one of the main predictions of the galaxy model. The implementation in ARES is described in more detail in Section 2.2.2 of <ref type="bibr">Sun et al. 
(2021)</ref>.</p><p>In Figure <ref type="figure">4</ref>, we show our predictions for the relationship between intrinsic Ly&#945; line luminosity and apparent UV magnitude at z = 8, with nominal sensitivity limits for the HLIS and HLSS overlaid for reference (vertical and horizontal lines, respectively). For the flux limits, we adopt the nominal sensitivity quoted in <ref type="bibr">Wang et al. (2022)</ref>, and include factors of 2&#215; and 5&#215; deeper/fainter surveys for reference as well. Objects detected in the upper-right quadrant of this plot are detected in both imaging and spectroscopy, while objects in the lower-right or upper-left quadrants are only detected in imaging or spectroscopy, respectively. We can see clearly that an object detected in imaging should also be detectable spectroscopically via its Ly&#945; emission. This remains the case regardless of whether one includes dust or not (circles versus squares), since dust here attenuates the UV continuum and Ly&#945; similarly. Setting the dust content to zero by hand will boost the overall number of galaxies, as well as the magnitude of the brightest galaxy, but not the relationship between Ly&#945; flux and m AB . Additionally, each of the galaxies in the HLSS will be observed from multiple roll angles, and so to reduce the incidence of false positives, galaxies will be required to be detected in at least three different rolls.</p><p>Figure <ref type="figure">4</ref> also tells us that galaxies that are slightly too faint to be detected in imaging may still be detectable in the HLSS. However, given that Roman uses a grism, the expectation is to only extract spectra for known sources identified in imaging. The result shown in Figure <ref type="figure">4</ref> is not entirely a surprise. 
The nominal HLS limiting magnitude and flux limits are chosen to optimize low-z science, e.g., the ability to detect 1 &#8818; z &#8818; 3 galaxies both in imaging of the rest-optical continuum (with some contamination from lines) and the rest-optical emission lines themselves, like H&#945; and [O III]. Because the continuum of star-forming galaxies is relatively flat from the UV through the optical, and recombination results in &#8764;1 Ly&#945; photon for every H&#945; photon, it makes sense that sensitivities set for low-z science goals are about right for detecting high-z galaxies in Ly&#945; and UV continuum as well.</p><p>Before moving on, note that so far we have neglected IGM transmission effects, effectively assuming that every HLS galaxy resides in a very large fully ionized bubble. In all that follows, we adopt an Ly&#945; emitter (LAE) duty cycle or fraction, f LAE , which reduces the number of spectroscopically detected galaxies but not the luminosity of any individual object. In practice, the LAEs observable by Roman will have some spatial and luminosity dependence primarily owing to absorption from neutral regions in the IGM. However, for the sake of simplicity in this initial study, we instead use an overall factor and defer a more realistic treatment to future work. <ref type="foot">13</ref></p><p>Figure <ref type="figure">5</ref> shows a comparison between the LAE galaxy luminosity function predicted from ARES and the measurements from the SILVERRUSH survey <ref type="bibr">(Konno et al. 2018)</ref>. We include our galaxy models both with and without dust reddening, and plot different factors of f LAE . As can be seen from the figure, a value of f LAE = 0.1 and including dust reddening agrees reasonably well with the available measurements at z &#8764; 5.7 and z &#8764; 6.6. Note that our overall factor f LAE is not inferred the same way as f duty , which sometimes appears in the interpretation of these measurements. 
Constraints on f duty are typically based on clustering measurements: one infers a large-scale bias, which can be related to a minimum halo mass and thus a prediction for the total number density of halos. The value for f duty is then computed as the number of LAEs divided by the number of expected halos. In our case, the number of LAEs in any L &#945; bin is reduced both by f LAE and dust reddening. As a result, we do not need extreme values of f LAE = 0.01 unless dust is completely negligible.</p><p>In the following analysis, we choose f LAE = 0.1 as our default value, but also include predictions for f LAE = 1 and f LAE = 0.01 for the sake of comparison and to understand how this value impacts our final results. For example, f LAE = 0.01 may be a more realistic value at higher redshift, when the IGM becomes more neutral and the opacity to Ly&#945; photons increases due to there being smaller ionized regions (and hence less time for photons to redshift out of the Ly&#945; transition window).</p></div>
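<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of the photoionization-equilibrium argument in this section, the following Python sketch converts an intrinsic ionizing photon production rate into an intrinsic Ly&#945; luminosity (taking 2/3 of recombinations to yield a Ly&#945; photon) and applies an overall duty-cycle factor f LAE to a galaxy number density. The function names, the optional ionizing escape fraction f_esc, and the example input values are illustrative assumptions; this is not the ARES implementation.</p>
<eg><![CDATA[
```python
import math

# Physical constants in cgs units
H_PLANCK = 6.62607015e-27        # erg s
C_LIGHT = 2.99792458e10          # cm / s
LYA_WAVELENGTH_CM = 1215.67e-8   # Lya rest wavelength

def lya_luminosity(q_ion, f_lya=2.0 / 3.0, f_esc=0.0):
    """Intrinsic Lya luminosity [erg/s] for an ionizing photon rate q_ion [1/s].

    Assumes photoionization equilibrium: every ionizing photon that does not
    escape (fraction 1 - f_esc, a hypothetical parameter) is balanced by one
    recombination, and a fraction f_lya of recombinations yields Lya.
    """
    e_lya = H_PLANCK * C_LIGHT / LYA_WAVELENGTH_CM  # photon energy in erg
    return f_lya * (1.0 - f_esc) * q_ion * e_lya

def lae_number_density(n_gal, f_lae=0.1):
    """Scale a galaxy number density by the overall LAE duty cycle f_LAE.

    Only the abundance is reduced; individual luminosities are unchanged.
    """
    return f_lae * n_gal

# Illustrative inputs (not taken from the paper)
l_lya = lya_luminosity(1e54)
n_lae = lae_number_density(1e-4, f_lae=0.1)
```
]]></eg>
<p>Under these assumptions the Ly&#945; luminosity scales linearly with the ionizing photon rate, while f LAE rescales only the number of objects, which mirrors the simplification adopted in the text.</p></div>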
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">The 21 cm Galaxy Cross-spectrum</head><p>Once we have the 21 cm brightness temperature field defined in Equation (5) and the galaxy field defined in Equation (8), we are able to compute the cross-spectrum P 21&#215; gal . For the sake of comparing the results with observations, we compute this quantity as a function of k &#8869; and k &#8741; , i.e., the Fourier modes that lie in the plane of the sky and along the line of sight, respectively. Given the relatively small angular extent of our simulations, we use the flat-sky approximation when computing power spectra. We average over all modes that contribute to a given cylindrical wavenumber bin, described by (k &#8869; , k &#8741; ):</p><p>We first construct a pair of light cones (one each for the 21 cm field and the galaxy field) in a self-consistent fashion using our 2LPT simulations described above. Once we have these light cones, we apply FFTs and then compute the cross-power spectrum as described above.</p><p>Figure <ref type="figure">6</ref> shows a visualization of the 21 cm field and the galaxy fields as generated by the seminumeric models described above in Sections 2.1 and 2.2. These fields are generated from a realization of our fiducial ionization history by averaging over the respective light cones between 7.5 &#8804; z &#8804; 8.5. Note that in practice when computing the cross-spectrum defined above in Equation (9), we compute the FFT in three dimensions and do not average prior to computing this cross-spectrum. This approach ensures that we retain the full Fourier space information and can accurately model the effect of avoiding modes that are contaminated by the 21 cm foregrounds described below in Section 2.4.1.</p></div>
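<div xmlns="http://www.tei-c.org/ns/1.0"><p>The averaging over modes in a cylindrical (k &#8869; , k &#8741; ) bin described in this section can be sketched with NumPy as follows. This is a generic flat-sky estimator written for illustration, not the code used for the paper: the function name, the Fourier normalization convention, the box dimensions, and the random test fields are all assumptions.</p>
<eg><![CDATA[
```python
import numpy as np

def cross_power_cylindrical(field_a, field_b, box_lengths, kperp_edges, kpar_edges):
    """Bin-averaged cross-power P_ab(k_perp, k_par) of two real 3D fields.

    Axes 0 and 1 lie in the plane of the sky; axis 2 is the line of sight.
    box_lengths are the comoving side lengths (units of 1/k).
    """
    shape = field_a.shape
    volume = float(np.prod(box_lengths))
    n_cells = float(np.prod(shape))
    # Assumed convention: delta(k) = (V / N) * FFT[delta(x)], P = Re[a b*] / V
    fa = np.fft.fftn(field_a) * volume / n_cells
    fb = np.fft.fftn(field_b) * volume / n_cells
    power = (fa * np.conj(fb)).real / volume

    # Wavenumbers along each axis
    kx, ky, kz = [2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
                  for n, length in zip(shape, box_lengths)]
    kperp = np.broadcast_to(
        np.sqrt(kx[:, None, None] ** 2 + ky[None, :, None] ** 2), shape)
    kpar = np.broadcast_to(np.abs(kz)[None, None, :], shape)

    # Sum the power and count the modes falling in each (k_perp, k_par) bin
    bins = [kperp_edges, kpar_edges]
    sums, _, _ = np.histogram2d(kperp.ravel(), kpar.ravel(), bins=bins,
                                weights=power.ravel())
    counts, _, _ = np.histogram2d(kperp.ravel(), kpar.ravel(), bins=bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean_power = np.where(counts > 0, sums / counts, np.nan)
    return mean_power, counts

# Toy example: a "galaxy" field that is perfectly anticorrelated with "21 cm"
rng = np.random.default_rng(0)
delta_21 = rng.standard_normal((16, 16, 8))
delta_gal = -delta_21
p_cross, n_modes = cross_power_cylindrical(
    delta_21, delta_gal, (100.0, 100.0, 50.0),
    np.linspace(0.0, 0.8, 5), np.linspace(0.0, 0.6, 4))
```
]]></eg>
<p>In this toy case every populated bin has a negative cross-power, the same sign convention as the anticorrelation expected between the 21 cm and galaxy fields. The sketch bins on the absolute value of k &#8741; and keeps both Hermitian-conjugate halves of Fourier space, which affects mode counts but not the bin averages.</p></div>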
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">Observational Effects</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.1.">21 cm Observations</head><p>In general, 21 cm observations from interferometers are subject to foreground contamination that is many orders of magnitude larger than the cosmological signal from reionization <ref type="bibr">(Morales &amp; Wyithe 2010;</ref><ref type="bibr">Pober et al. 2013)</ref>. The dominant foreground is due to Galactic synchrotron radiation, which roughly follows a power law in frequency and hence is spectrally smooth. Na&#239;vely, these bright foregrounds should be restricted to small k &#8741; modes in Fourier space. However, the chromatic response of the interferometer causes this foreground signal to scatter into high k &#8741; modes, creating a "wedge" in (k &#8869; , k &#8741; ) Fourier space <ref type="bibr">(Parsons et al. 2012</ref>).</p><p>Figure <ref type="figure">5</ref>. The LAE galaxy luminosity function predicted by ARES and measured from SILVERRUSH <ref type="bibr">(Konno et al. 2018</ref>). As can be seen, an effective LAE duty cycle of f LAE = 0.1 with dust reddening agrees reasonably well with the data at both z &#8764; 5.7 and z &#8764; 6.6. We use this value as our fiducial value for the rest of the analysis, but include predictions for f LAE = 1 and f LAE = 0.01 as additional points of comparison.</p><p>Mathematically, the slope of the wedge m(z) is implicitly a function of baseline length and cosmology, and can be expressed as <ref type="bibr">(Thyagarajan et al. 2015)</ref>:</p><p>where &#955;(z) is the wavelength of the 21 cm signal at a given redshift z, D c (z) is the comoving distance to that redshift, f 21 is the rest-frame frequency of the 21 cm signal, and H(z) is the Hubble parameter. Note that this form of the foreground wedge in Equation (10) accounts for maximal data contamination: physically, the bright foreground contamination extends down to the horizon of the interferometer beam. 
Improved calibration of interferometers may make it possible to work "inside the wedge" at points in Fourier space where the contamination from foregrounds may be less severe, but for the purposes of this work, we assume this maximal amount of contamination.</p><p>To include this effect, we explicitly track which portions of Fourier space for each redshift are subject to this foreground contamination. For the redshift values relevant to the EoR, m &#8764; 3, which leads to a significant amount of Fourier space being subject to this contamination.</p><p>In addition to the foreground contamination, we include the effects of thermal noise present in measurements from HERA. For power spectrum measurements, the primary analysis pipeline for HERA uses the so-called "delay transform" of radio interferometer visibilities to estimate the power spectrum. Under the flat-sky approximation, the thermal noise contribution to the variance of the power spectrum can be written as a function of the observational frequency &#957; and wavenumber u as <ref type="bibr">(Parsons et al. 2014</ref>):</p><p>where T sys is the system temperature of the interferometer, &#937; p = &#8747; d&#178;l A(l) is the integral of the primary beam of the antenna A(l), and</p><p>is the integral of the square of the primary beam. X(&#957;) and Y(&#957;) are factors that convert the "observed units" of the interferometer into cosmological units <ref type="bibr">(Furlanetto et al. 2006</ref>). X(&#957;) accounts for the conversion in the plane of the sky:</p><p>where D M (&#957;) is the transverse comoving distance (D M = D c for a flat universe); and Y(&#957;) accounts for the conversion along the line of sight:</p><p>Note that X(&#957;) has units of comoving megaparsecs per radian, and Y(&#957;) has units of comoving megaparsecs per hertz. 
Finally, t int is the integration time of the measurement, N pol is the number of instrumental polarizations used to estimate the power spectrum, and N bl (u) is the number of baseline pairs at a given u, where u denotes the physical separation of HERA dishes in units of observed wavelength. Each baseline of length u samples wavenumbers with transverse components, k &#8869; , according to u(&#957;) = D M (&#957;)k &#8869; /2&#960;. For HERA, we assume a system temperature of T sys = 400 K, an observation time of t int = 200 hr, and N pol = 2 for two independent linear polarizations. We assume an overall observing season of 1000 hr, which is distributed among five nonoverlapping fields. Such an approach was taken in the recent reanalysis of Phase I HERA data (The HERA Collaboration et al. 2022). We discuss combining observations from different regions of the sky further in Section 4.3 below.</p><p>The number of baselines that observe a wavenumber u depends on the configuration of the array. We show the baseline distribution for HERA in Figure <ref type="figure">7</ref>. HERA features many short baselines, so most of the baselines probe modes for which u &#8818; 300. We also show the noise floor for an observation, given by Equation (11). For the current work, we limit ourselves to considering only the projected sensitivity for HERA. Next-generation radio telescopes, such as the Square Kilometre Array (SKA), are projected to provide data with even greater sensitivity. However, as we will see below in Section 4.1, the 21 cm signal variance is dominated by cosmic variance of the modes used to make the measurements, rather than the instrumental uncertainty. As such, the additional sensitivity from the SKA may not significantly improve the measured significance of the 21 cm auto-power spectrum. Nevertheless, the SKA may offer a significant improvement over HERA by being able to use more of the observable Fourier space, parameterized by the foreground wedge m in Equation (10). The SKA may also be able to observe more sky area that overlaps with the Roman HLS, yielding more joint sky coverage. It is also worth noting that given the improved sensitivity offered by the SKA, it may be possible to do a "stacking" type analysis in real (map) space, rather than Fourier space as assumed in this manuscript. It may be worthwhile to revisit some of these types of forecasts in the future to investigate the prospects for the SKA.</p><p>Figure <ref type="figure">6</ref>. A visualization of our simulation volumes. Left: the 21 cm field generated from our simulations described in Section 2.1. This 2&#176; &#215; 2&#176; field represents about 1% of the simulated sky area from one of our simulations. Right: a self-consistently generated galaxy field described in Section 2.2. These two-dimensional slices are averaged over the redshift window 7.5 &#8804; z &#8804; 8.5, which spans the midpoint of reionization for this realization. Although we show the two-dimensional field for illustration purposes, we compute the cross-spectrum in three dimensions to avoid catastrophic cancellation of Fourier modes contaminated by 21 cm foregrounds.</p></div>
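<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the quantities in this section concrete, the following Python sketch evaluates the horizon-limit wedge slope and the unit-conversion factors X and Y for a flat &#923;CDM cosmology. Because the equations themselves are not reproduced in this text, the sketch uses the standard expressions m = D c H / [c (1 + z)], X = D c , and Y = c (1 + z) 2 / [H(z) f 21 ], which are assumed here to match the forms intended in the text. The cosmological parameters (H 0 = 67.7 km s -1 Mpc -1 , &#937; m = 0.31) and all function names are likewise illustrative assumptions.</p>
<eg><![CDATA[
```python
import math

C_KM_S = 299792.458          # speed of light in km/s
F21_HZ = 1420.405751768e6    # rest-frame 21 cm frequency in Hz

def hubble(z, h0=67.7, omega_m=0.31):
    """H(z) in km/s/Mpc for flat LCDM (assumed, illustrative parameters)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def comoving_distance(z, n_steps=2000):
    """Line-of-sight comoving distance D_c(z) in Mpc (trapezoidal rule)."""
    dz = z / n_steps
    total = 0.5 * (1.0 / hubble(0.0) + 1.0 / hubble(z))
    total += sum(1.0 / hubble(i * dz) for i in range(1, n_steps))
    return C_KM_S * total * dz

def wedge_slope(z):
    """Horizon-limit wedge slope m = k_par / k_perp = D_c H / (c (1 + z))."""
    return comoving_distance(z) * hubble(z) / (C_KM_S * (1.0 + z))

def x_factor(z):
    """X: comoving Mpc per radian (D_M = D_c in a flat universe)."""
    return comoving_distance(z)

def y_factor(z):
    """Y: comoving Mpc per Hz along the line of sight."""
    return C_KM_S * (1.0 + z) ** 2 / (hubble(z) * F21_HZ)

m_z8 = wedge_slope(8.0)
depth_6mhz_mpc = 6.0e6 * y_factor(8.0)   # comoving depth of a 6 MHz band
```
]]></eg>
<p>With these assumed parameters the slope at z = 8 comes out between 3 and 4, in line with the m &#8764; 3 quoted above, and a 6 MHz band corresponds to a comoving depth of roughly 100 Mpc.</p></div>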
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.2.">Galaxy Surveys</head><p>Upcoming galaxy surveys will provide the deepest and most comprehensive catalog of high-redshift objects to date. Nevertheless, these objects are still relatively faint and rare, and as such are only sparsely sampled. Furthermore, the redshifts of the galaxies in these samples are only approximately known. Assuming Poisson statistics and accounting for redshift uncertainties, the noise power spectrum for the galaxy field is:</p><p>where &#963; &#967; = c&#963; z /H(z) is the comoving distance uncertainty along the line of sight, given a 1&#963; redshift precision of &#963; z , and n gal is the comoving number density of galaxies. The HLS in Roman is expected to have a total survey area of 2200 deg 2 <ref type="bibr">(Spergel et al. 2015)</ref>. However, the HLS footprint is not expected to fully overlap with the regions of the sky surveyed by HERA. For the purposes of this analysis, we assume an overlapping sky region of 500 deg 2 , as is expected given the nominal HLS footprint (see Figure <ref type="figure">8</ref>). Note that the planned Euclid deep fields are comparable to the HLS in depth. Unfortunately, only one of them lies in the HERA stripe, and it covers a relatively narrow sky area. We discuss tradeoffs between depth and area further in Section 5. Also vital to the success of any cross-correlation effort are precise redshift measurements. Photometric redshift estimates are expected to yield uncertainties of &#963; z &#8764; 0.5 using the Lyman-break technique <ref type="bibr">(Bouwens et al. 2015;</ref><ref type="bibr">Finkelstein 2016)</ref>, while we expect uncertainties &#963; z &#8819; 0.1 to preclude a cross-correlation detection <ref type="bibr">(Lidz et al. 2009)</ref>. Given the high redshifts of interest, a successful HERA-Roman cross-correlation requires spectroscopically determined redshifts. 
Fortunately, the reference spectroscopic survey <ref type="bibr">(Wang et al. 2022</ref>) expects &#963; z &#8771; 0.001(1 + z), and so redshift measurements in principle should not be a limiting factor. We explore the effect of redshift uncertainty explicitly in Section 4.1 below.</p><p>Given the sparsity of strong rest-frame UV lines in star-forming galaxies, Ly&#945; is likely the only line to be detected in Roman spectroscopy at the redshifts of interest. Furthermore, grism spectroscopy will only be extracted at the locations of sources detected in imaging. As a result, we have two requirements: (i) that galaxies be detected via the Lyman-break technique, and (ii) that Ly&#945; is brighter than the sensitivity of the spectroscopic survey. Though there are potential systematic observing issues, we defer a discussion of these to Section 4.4.1.</p><p>The first requirement is explored in Figure <ref type="figure">9</ref>, where we show the spectral coverage of the Roman photometry and spectroscopy. The bottom panel shows the transmission curves for the filters that will be used to find dropouts, while in the top panel, we show the redshift ranges corresponding to dropouts in each filter. We follow <ref type="bibr">Drakos et al. (2022)</ref>, and for simplicity assign the dropout filter as the bluest filter that contains Ly&#945;. All galaxy magnitudes are reported in the filter just redward of the dropout filter, in order to avoid contamination from the break and the Ly&#945; line itself. Though the dropout technique can be used to identify galaxies at redshifts z &#8818; 4 with Roman, the grism covers only &#955; &#8712; [1, 1.93] &#956;m. As a result, only photometrically selected galaxies at z &#8819; 7.2 have a chance at an Ly&#945; detection.</p><p>The second requirement for a galaxy to be included in cross-correlation analyses (that Ly&#945; is bright enough to be detected) is more difficult to assess. 
As discussed in Section 2.2 and shown in Figure <ref type="figure">4</ref>, given the nominal HLSS line flux limits <ref type="bibr">(Wang et al. 2022)</ref>, any galaxy detected photometrically should also have detectable Ly&#945; emission assuming negligible absorption from the IGM. This is of course an optimistic assumption. In general, the occurrence rate of LAEs will be reduced by intergalactic absorption, particularly for sources residing in small bubbles. Though the intrinsic LAE fraction at high redshifts is unknown, simulations suggest that Ly&#945; escape could be nontrivial at the redshifts of interest here <ref type="bibr">(Garel et al. 2021;</ref><ref type="bibr">Smith et al. 2022)</ref>. Furthermore, the HLS preferentially picks out bright sources that very well may live in large bubbles and so be subject to little attenuation from the IGM. We explore f LAE = 0.1 as our fiducial case, but also explore contrasting scenarios in which only 1% or 100% of galaxies detected in the HLIS have detectable Ly&#945; emission in the HLSS. We will denote these cases via f LAE = 0.01 and f LAE = 1, respectively. The pessimistic scenario may apply if the LAE populations uncovered by the Subaru Hyper Suprime-Cam (e.g., <ref type="bibr">Ouchi et al. 2018)</ref> are representative of all reionization era LAEs, while the f LAE = 1 case is included as a maximally optimistic limit.</p><p>Figure <ref type="figure">7</ref>. The noise floor given by Equation (<ref type="formula">11</ref>) is inversely proportional to the number of baselines that observe a particular u mode on the sky, and is shown in orange. We plot the dimensionless noise power spectrum &#916; 2 (k) = k 3 P(k)/2&#960; 2 with a fixed value of k &#8741; = 0.1 h Mpc -1 , for 200 hr of observation (i.e., a single field).</p></div>
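<div xmlns="http://www.tei-c.org/ns/1.0"><p>The galaxy noise power spectrum described in this section can be sketched in Python as below. The relation &#963; &#967; = c&#963; z /H(z) is taken from the text; the functional form P noise = exp(k &#8741; 2 &#963; &#967; 2 )/n gal is the standard Poisson-plus-redshift-error expression and is assumed here to correspond to Equation (14), which is not reproduced in this text. The cosmological parameters, function names, and the example number density are illustrative assumptions, and distances are in Mpc rather than h -1 Mpc.</p>
<eg><![CDATA[
```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def hubble(z, h0=67.7, omega_m=0.31):
    """H(z) in km/s/Mpc for flat LCDM (assumed, illustrative parameters)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def sigma_chi(sigma_z, z):
    """Comoving line-of-sight uncertainty c * sigma_z / H(z), in Mpc."""
    return C_KM_S * sigma_z / hubble(z)

def galaxy_noise_power(k_par, n_gal, sigma_z, z):
    """Assumed noise power [Mpc^3]: shot noise 1 / n_gal, boosted
    exponentially along the line of sight by redshift errors.

    k_par is in 1/Mpc and n_gal in 1/Mpc^3.
    """
    return math.exp((k_par * sigma_chi(sigma_z, z)) ** 2) / n_gal

# Illustrative values: sigma_z = 0.01 at z = 8, n_gal = 1e-5 per Mpc^3
s_chi = sigma_chi(0.01, 8.0)
shot_only = galaxy_noise_power(0.0, 1e-5, 0.01, 8.0)
at_cutoff = galaxy_noise_power(1.0 / s_chi, 1e-5, 0.01, 8.0)
```
]]></eg>
<p>In this form the shot-noise floor is set entirely by 1/n gal , and the noise has grown by a factor of e at k &#8741; = 1/&#963; &#967; , so lowering the shot noise requires a higher galaxy number density rather than a wider area.</p></div>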
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Results</head><p>We first show results for the cross-correlation signal P 21cm&#215; gal as a function of k &#8869; and k &#8741; . This is the raw signal that we are attempting to measure. When computing the cross-spectrum, we construct a light cone of observations that spans the full redshift range 6 &#8804; z &#8804; 15. In practice, the Roman grism frequency coverage means that we are not sensitive to galaxies at z &lt; 7.2. Furthermore, the most detectable contributions to the cross-spectrum will come from relatively low redshift values (z &#8818; 9). Nevertheless, we include both higher and lower redshift ranges when constructing the light cones and then restrict ourselves to 7.5 &#8804; z &#8804; 12 for analysis. Once this has been done, we compute the Fourier transform of nonoverlapping windows along the line of sight that each cover 6 MHz of bandwidth. This is representative of the typical bandwidth used in previous HERA data analyses <ref type="bibr">(Abdurashidova et al. 2022a</ref><ref type="bibr">, 2022b)</ref>. This bandwidth also translates to comoving distances of 50 &#8818; &#967; &#8818; 100 h -1 Mpc depending on the redshift. Although these cubes are not strictly comoving, the evolution induced by the light cone is sufficiently small so as not to induce spurious signal <ref type="bibr">(La Plante et al. 2014)</ref>. To further decrease the amount of light cone evolution signal included, we apodize in the line-of-sight direction using a four-term Blackman-Harris window. We run 30 independent realizations of these simulations where we change the initial conditions but keep the cosmological and astrophysical parameters fixed, and average together the resulting power spectra computed in the described fashion. 
This averaging helps ensure that the spectra are smooth, especially on the scales relevant to upcoming observations.</p><p>Figure <ref type="figure">10</ref> shows the cross-spectrum P 21cm&#215; gal near the midpoint of reionization at z &#8764; 8.01, where the amplitude of the cross-spectrum is greatest. In general, the signal is negative, which indicates an anticorrelation. This anticorrelation is expected, as the brightness temperature &#948;T b in Equation ( <ref type="formula">5</ref>) is sourced by regions of neutral hydrogen, which tend to be low-density regions far from galaxies in an inside-out reionization scenario. We also note that the largest-amplitude signal comes from large scales (small values of k &#8869; and k &#8741; ). We also show as a solid line the value of the slope of the "horizon wedge" defined in Equation (10). For comparison, we also show values of m = 1 (dashed) and m = 0.5 (dotted) lines. These represent varying levels of foreground contamination: the horizon wedge reflects data that are maximally corrupted, whereas less-severe cuts would be possible if improved calibration methods make it possible to recover these data. It is also worth noting that there is a much larger dynamic range in k &#8869; than in k &#8741; : given the fact that we perform a Fourier transform on a slab with a relatively short axis along the line of sight, there are far fewer k &#8741; modes available for a given observation. In order to provide a more significant cumulative measurement of the signal, we combine measurements from nonoverlapping redshift windows along the line of sight. Although precise characterization of the k- and z-dependence of the signal would be ideal, for the near-term forecast most relevant to HERA, we are focused on a detection. We also ignore potential covariance between different k- and z-bins on the assumption that they are relatively free of systematic errors. 
We return to these assumptions in Section 4.4.2.</p><p>We also show as horizontal lines several values of k &#8741; corresponding to &#963; z galaxy uncertainties defined in Equation (14). We plot values where k &#8741; = 1/&#963; &#967; , above which the galaxy uncertainty increases exponentially. Given the exponential nature of the noise contributed from this uncertainty, values of k &#8741; greater than these lines face significant contamination. That is, modes with larger k &#8741; are poorly measured in the galaxy survey and do not contribute significantly to a cross-spectrum detection (see further discussion in Section 4). As mentioned above in Section 2.4.2, this implies that photometric surveys are insufficient for present purposes, since the redshift uncertainties in high-redshift Lyman-break surveys are expected to be on the order of &#963; z &#8771; 0.5.</p></div>
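<div xmlns="http://www.tei-c.org/ns/1.0"><p>A short worked example helps show why photometric redshifts are insufficient. The Python sketch below compares the cutoff k &#8741; = 1/&#963; &#967; for several values of &#963; z with the lowest nonzero k &#8741; mode available in a single 6 MHz window, both evaluated at z = 8. The cosmological parameters (h = 0.677, &#937; m = 0.31), the function names, and the use of the fundamental mode 2&#960;/L as the comparison scale are illustrative assumptions rather than part of the analysis described above.</p>
<eg><![CDATA[
```python
import math

C_KM_S = 299792.458          # speed of light in km/s
F21_HZ = 1420.405751768e6    # rest-frame 21 cm frequency in Hz
H_LITTLE = 0.677             # assumed dimensionless Hubble parameter

def hubble(z, omega_m=0.31):
    """H(z) in km/s/Mpc for flat LCDM (assumed, illustrative parameters)."""
    return 100.0 * H_LITTLE * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def kpar_cutoff(sigma_z, z):
    """k_par = 1 / sigma_chi in h/Mpc, with sigma_chi = c sigma_z / H(z)."""
    sigma_chi_mpc = C_KM_S * sigma_z / hubble(z)
    return 1.0 / (sigma_chi_mpc * H_LITTLE)

def kpar_fundamental(bandwidth_hz, z):
    """Lowest nonzero k_par (h/Mpc) of a slab spanning bandwidth_hz."""
    depth_mpc = C_KM_S * (1.0 + z) ** 2 * bandwidth_hz / (hubble(z) * F21_HZ)
    return 2.0 * math.pi / (depth_mpc * H_LITTLE)

z = 8.0
k_fund = kpar_fundamental(6.0e6, z)
# Photometric, marginal, fiducial, and grism-like (0.001 (1 + z)) precision
cutoffs = {sigma_z: kpar_cutoff(sigma_z, z)
           for sigma_z in (0.5, 0.1, 0.01, 0.001 * (1.0 + z))}
usable = {sigma_z: k_cut > k_fund for sigma_z, k_cut in cutoffs.items()}
```
]]></eg>
<p>Under these assumptions the lowest k &#8741; mode of a 6 MHz window is about 0.09 h Mpc -1 . The cutoffs for &#963; z = 0.5 and &#963; z = 0.1 both fall below it, whereas &#963; z = 0.01 places the cutoff near 0.5 h Mpc -1 , leaving several line-of-sight modes that both surveys can measure.</p></div>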
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Detectability</head><p>We now turn to forecast the expected signal-to-noise ratio (S/N) for future cross-spectrum measurements. In this analysis, we primarily concern ourselves with the statistical uncertainty of these measurements, and follow related calculations in <ref type="bibr">Furlanetto &amp; Lidz (2007)</ref> and <ref type="bibr">Lidz et al. (2009)</ref>. For 21 cm experiments, we examine the projected sensitivity for HERA. For galaxy surveys, we concern ourselves primarily with the Roman HLS and HLS-like surveys. We do not consider the effect of systematic errors for two reasons: (i) in regions of Fourier space that are not dominated by foreground contamination of the wedge, the Fourier modes should be noise dominated, and (ii) in general, cross-correlation measurements are unbiased (at the cost of higher statistical variance) if there are no sources of joint systematic error between the two measurements. We begin with a treatment of the S/N in our fiducial reionization scenarios in Section 4.1, give a more detailed description of the observational approaches of Roman and HERA in Section 4.2, explore building up cumulative S/N by combining measurements from different modes in Section 4.3, and then turn to possible systematic errors in Section 4.4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">S/N Calculations</head><p>We are interested in the uncertainty of the cross-spectrum &#963; 21,gal as a function of k &#8869; and k &#8741; , as this defines our expected sensitivity in Fourier space. In this case, the uncertainty can be expressed as <ref type="bibr">(Lidz et al. 2009</ref>):</p><p><formula notation="tex">\sigma_{21,\mathrm{gal}}^{2} = \mathrm{var}\left[P_{21,\mathrm{gal}}(k_\perp, k_\parallel)\right] = \frac{1}{2}\left[P_{21,\mathrm{gal}}^{2}(k_\perp, k_\parallel) + \sigma_{21}(k_\perp, k_\parallel)\,\sigma_{\mathrm{gal}}(k_\perp, k_\parallel)\right] \qquad (15)</formula></p><p>Similarly, the uncertainty of the 21 cm power spectrum &#963; 21 can be written as:</p><p>where T 0 is defined in Equation ( <ref type="formula">6</ref>) and P 21 noise is defined in Equation (11). Finally, the uncertainty of the galaxy power spectrum &#963; gal can be written as:</p><p><formula notation="tex">\sigma_{\mathrm{gal}}^{2} = \mathrm{var}\left[P_{\mathrm{gal}}(k_\perp, k_\parallel)\right] = \left[P_{\mathrm{gal}}(k_\perp, k_\parallel) + P_{\mathrm{gal}}^{\mathrm{noise}}(k_\parallel)\right]^{2} \qquad (17)</formula></p><p>where P gal noise is defined in Equation ( <ref type="formula">14</ref>). These expressions capture the total variance of the observed power spectra, which is due both to cosmic variance and statistical instrumental uncertainties.</p><p>Figure <ref type="figure">9</ref>. Roman spectral coverage. Bottom: we show transmission curves for five filters, starting with the z band (F087), which is not used in the nominal HLS, followed by the Y, J, H, and F184 band filters, from left to right, which will all have HLS coverage. An example spectrum of a z = 8.5 dust-free galaxy from our model is shown for reference (with nebular emission neglected for clarity). Top: we show the corresponding redshift range probed by each filter, where the "dropout" filter is the bluest filter containing the Ly&#945; line. The fact that grism coverage is restricted to &#955; &#8819; 1 &#956;m limits spectroscopic redshifts to galaxy candidates at z &#8819; 7.2.</p><p>Figure <ref type="figure">11</ref> shows the signal and noise quantities defined in Equations ( <ref type="formula">15</ref>)-( <ref type="formula">17</ref>). The solid lines correspond to the signal, and the dashed lines correspond to the noise. 
To build intuition for how these quantities evolve with Fourier modes k &#8869; and k &#8741; , we hold one mode fixed and plot the signal and noise curves as a function of the other one. We show results for f LAE = 0.1 and &#963; z = 0.01 with a redshift window centered at z = 8.01, near the midpoint of reionization. The panel on the left shows the behavior for fixed k &#8741; = 0.09 h Mpc -1 , and varies k &#8869; . Also shown as a vertical dashed line is the location of the horizon wedge given the value of k &#8741; . As denoted in the figure, the modes to the right of this line are nominally contaminated by foregrounds with a horizon-level wedge defined in Equation (10). In this plot, the effect of the uv coverage of HERA can be seen in the 21 cm noise curves: at very low values (k &#8869; &#8818; 0.01 h Mpc -1 ) and high values (k &#8869; &#8819; 0.3 h Mpc -1 ), the experimental uncertainty increases dramatically because of the finite baseline size and the lack of very long baselines, respectively. As seen in Figure <ref type="figure">7</ref>, many of HERA's baselines are relatively short, meaning there is good coverage of intermediate k &#8869; modes at the expense of large ones. On the right panel, we show the behavior for fixed k &#8869; = 0.09 h Mpc -1 and vary k &#8741; . At relatively large values of k &#8741; , the noise dominates significantly for the galaxy signal. It is also worth noting that in this panel, the foreground-contaminated modes lie to the left of the vertical dashed line.</p><p>It is worth discussing some of the large-scale behavior of the trends in this figure to understand how the overall S/N ratio is affected by various observational and experimental considerations. First, it is worth noting that for many values of k &#8869; , the 21 cm signal is cosmic-variance dominated. Accordingly, for modes that are well sampled by HERA, a significant detection of the auto-spectrum should be possible. 
Conversely, the galaxy power spectrum as measured by Roman is dominated by experimental uncertainties, especially at large values of k &#8869; and k &#8741; . For large values of k &#8869; , this is driven by a drop-off in the amplitude of the signal, whereas the low sensitivity for large values of k &#8741; is driven primarily by the exponential scaling of the redshift-space uncertainty in Equation ( <ref type="formula">14</ref>). Note that the variance decreases with the number of modes in a survey, as given by Equation (18). Thus it is possible to decrease the instrument uncertainty by using a sufficiently large volume. We return to prospects for measuring the galaxy auto-spectrum below in Section 4.3. Given that the uncertainty of the cross-spectrum is essentially the geometric mean between the individual uncertainties, the S/N of the cross-spectrum is not cosmic-variance dominated, but it also is not as noise dominated as the galaxy spectrum.</p></div>
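<div xmlns="http://www.tei-c.org/ns/1.0"><p>The way the individual uncertainties combine can be illustrated with a small Python sketch. It assumes the per-mode variance of <ref type="bibr">Lidz et al. (2009)</ref>, in which the cross-spectrum variance is half the sum of the squared cross-power and the product &#963; 21 &#963; gal , with each auto-spectrum uncertainty equal to signal plus noise power. All inputs are assumed to be in the same units (the conversion of the 21 cm noise by T 0 is omitted), and the function names and numerical values are purely illustrative.</p>
<eg><![CDATA[
```python
import math

def sigma_auto(p_signal, p_noise):
    """Per-mode auto-spectrum uncertainty: cosmic variance plus noise power."""
    return p_signal + p_noise

def sigma_cross(p_cross, p_21, p_21_noise, p_gal, p_gal_noise):
    """Assumed per-mode cross-spectrum uncertainty:
    sigma^2 = 0.5 * (P_cross^2 + sigma_21 * sigma_gal).
    """
    variance = 0.5 * (p_cross ** 2
                      + sigma_auto(p_21, p_21_noise) * sigma_auto(p_gal, p_gal_noise))
    return math.sqrt(variance)

def per_mode_snr(p_cross, p_21, p_21_noise, p_gal, p_gal_noise):
    """Signal-to-noise ratio of a single Fourier mode of the cross-spectrum."""
    return abs(p_cross) / sigma_cross(p_cross, p_21, p_21_noise, p_gal, p_gal_noise)

# Illustrative numbers in arbitrary but consistent units
p_21, p_gal = 400.0, 100.0
p_cross = -math.sqrt(p_21 * p_gal)   # perfectly anticorrelated fields

# Noise-free limit: each mode is limited by cosmic variance alone
snr_ideal = per_mode_snr(p_cross, p_21, 0.0, p_gal, 0.0)
# Low 21 cm noise, but galaxy shot noise well above the galaxy signal
snr_noisy = per_mode_snr(p_cross, p_21, 40.0, p_gal, 900.0)
```
]]></eg>
<p>In the noise-free, perfectly anticorrelated limit the per-mode S/N is exactly 1, so any detection must come from averaging many modes. Adding galaxy shot noise lowers the per-mode S/N even when the 21 cm measurement is nearly cosmic-variance limited, which is the regime described above.</p></div>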
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Observational Strategies</head><p>In addition to the per-mode uncertainty described above, another important ingredient in the ultimate sensitivity calculations is the number of Fourier modes that may be observed given our survey volume V survey . In practice, we consider cross-spectrum estimates in bins in k &#8869; and k &#8741; . A given bin will generally receive contributions from many independent Fourier modes, with the Fourier-space resolution set by the survey dimensions, and averaging over these modes will decrease the uncertainties in the bin-averaged measurements.</p><p>Consider a cylindrical shell/bin of fixed logarithmic extent in k &#8869; and k &#8741; . The number of Fourier modes in a survey of volume V survey that lie within this cylindrical bin is given by:</p><p>Given that the underlying fields are real, there is a Hermitian symmetry of the Fourier transform applied when constructing the power spectrum. As such, we restrict ourselves to estimating the variance only from the upper half-plane so as not to double-count these modes when estimating the variance.</p><p>We have also taken the absolute value of k &#8741; assuming that positive and negative values are statistically similar. <ref type="foot">14</ref> As discussed above in Section 2.4 and shown in Figure <ref type="figure">8</ref>, we expect that the overlap between Roman and HERA is about 500 deg 2 . Given that HERA is a drift-scan instrument (rather than a tracking/pointing array), we can treat a joint measurement as a series of single-field patches that we sum together incoherently. This approach is similar to the one advocated in <ref type="bibr">Liu &amp; Shaw (2020)</ref>, where the number of such patches is parameterized as N patch . The instantaneous FOV of HERA is about 10&#176; <ref type="bibr">(DeBoer et al. 2017</ref>), or roughly 100 deg 2 per field, so we take the number of such patches to be N patch = 5. 
We divide by the square root of this number when estimating the uncertainties in different wavenumber bins.</p><p>As a trade-off to this patch-based approach, we must compute our survey volume V survey self-consistently given the angular extent of one of these patches. For each redshift bin z i , we assume the FOV of HERA and use this value to compute the comoving distance &#967; &#8869; using the angular diameter distance D A (z i ). For the line-of-sight component, we compute the comoving distance &#967; &#8741; that corresponds to our 6 MHz of bandwidth. The resulting survey volume is thus</p><p>In practice the number of nonoverlapping patches will be dictated by the joint coverage of HERA and Roman. For the forecasts considered here, we assume that it is possible to construct N patch = 5 fields of observation from which we estimate our S/N. The overall uncertainty is only sensitive to the square root of this quantity, and so having fewer patches does not significantly affect the overall forecast.</p></div>
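<div xmlns="http://www.tei-c.org/ns/1.0"><p>The mode counting and patch averaging described in this section can be sketched in Python as follows. The survey volume is taken to be &#967; &#8869; 2 &#967; &#8741; , and the number of modes in a bin is taken to be k &#8869; dk &#8869; dk &#8741; V survey /(2&#960;) 2 , with dk = k d ln k for logarithmic bins. These expressions are assumptions based on standard mode counting with the half-plane and |k &#8741; | conventions described above, since the corresponding equations are not reproduced in this text. The function names, bin widths, and patch dimensions are illustrative.</p>
<eg><![CDATA[
```python
import math

def survey_volume(chi_perp, chi_par):
    """Comoving volume of one patch: transverse extent squared times depth."""
    return chi_perp ** 2 * chi_par

def n_modes_linear_bin(k_perp, dk_perp, dk_par, volume):
    """Assumed mode count in a linear bin: k_perp dk_perp dk_par V / (2 pi)^2."""
    return k_perp * dk_perp * dk_par * volume / (2.0 * math.pi) ** 2

def n_modes_log_bin(k_perp, k_par, dlnk_perp, dlnk_par, volume):
    """Same count for logarithmic bins, using dk = k * dlnk."""
    return (k_perp ** 2 * abs(k_par) * dlnk_perp * dlnk_par * volume
            / (2.0 * math.pi) ** 2)

def patch_averaged_sigma(sigma_single_patch, n_patch=5):
    """Uncertainty after incoherently combining n_patch independent fields."""
    return sigma_single_patch / math.sqrt(n_patch)

# Illustrative patch: about 1600 Mpc across and 100 Mpc deep
volume = survey_volume(1600.0, 100.0)
n_log = n_modes_log_bin(0.1, 0.1, 0.5, 0.5, volume)
n_lin = n_modes_linear_bin(0.1, 0.1 * 0.5, 0.1 * 0.5, volume)
penalty_4_vs_5 = patch_averaged_sigma(1.0, 4) / patch_averaged_sigma(1.0, 5)
```
]]></eg>
<p>Because the combined uncertainty scales as the inverse square root of N patch , dropping from five patches to four raises it by only about 12 percent, consistent with the statement that the forecast is not very sensitive to the exact number of patches.</p></div>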
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Combining Modes</head><p>As discussed above in Section 4.2, the number of times an individual mode is measured depends on the details of the survey geometry. To first order, we have approximated this counting factor using Equation (18), which gives the differential number of modes within a volume element of (k &#8869; , k &#8741; ) space. By accounting for this factor, we can understand how the expected variance of the measurement is affected by the survey volume and the amount of sky area covered by the two different experiments. We define the bin-averaged S/N &#349; as:</p><p>where P 21&#215; gal is measured empirically from our simulated volumes, and &#963; 21&#215; gal is computed from Equation (15). For the forecasts here, we choose logarithmic bins of fixed extent</p><p>1, though we have verified that the overall results are relatively insensitive to the precise choice of binning scheme.</p><p>Figure <ref type="figure">12</ref> shows the per-mode S/N ratio as a function of k &#8869; and k &#8741; of the cross-spectrum P 21&#215; gal for our fiducial reionization scenario. The uncertainty &#963; 21&#215; gal is given by Equation (15), and the signal P 21&#215; gal is shown in Figure <ref type="figure">10</ref>. For the uncertainty calculation, we use the parameters f LAE = 0.1 and &#963; z = 0.01. When the mode-sampling in Equation ( <ref type="formula">18</ref>) is applied, we see that the significance of several modes is &#349; &gt; 1. Also consistent with Figure <ref type="figure">11</ref>, the most significant modes lie in the regions of k-space that have small values of k &#8869; and k &#8741; . We also show the values of the wedge slope corresponding to the horizon (solid), m = 1 (dashed) and m = 0.5 (dotted). 
In principle there is a significant amount of sensitivity that can be obtained by increasing the number of foreground modes that can be used, though as shown below, a significant detection is possible even if restricted to modes that are expected to be free of foreground contamination.</p><p>With these individual uncertainties defined in Equation (<ref type="formula">19</ref>), we compute the cumulative S/N by adding the per-bin sensitivity &#349; for individual (k &#8869; , k &#8741; ) bins in quadrature:</p><p>In this way, we can build up cumulative sensitivity by combining measurements from different modes together. We also sum over all nonoverlapping redshift windows between 7.2 &#8818; z &#8818; 12. The lower limit on the redshift reflects the inability of the Roman grism to observe and detect LAEs below this threshold. In practice, given the low number density of galaxies expected above z &#8819; 10, the sensitivity of these measurements decreases significantly toward high redshift.</p><p>To capture how the total uncertainty varies with changes made to individual experimental parameters, we explore a multidimensional parameter space of statistical errors.</p><p>Next, we consider how the error bars on the cross-power spectrum vary with changes in both the survey and model parameters. Specifically, we explore how the S/N varies with respect to: (i) the slope of the 21 cm wedge (i.e., the degree to which foregrounds contaminate the signal of interest), (ii) the galaxy redshift uncertainties, &#963; z , (iii) the effective duty cycle of sources, f LAE , (iv) whether the spin temperature of the IGM is saturated or maximally cold, (v) the ionization history, and (vi) the galaxy bias factor predicted by ARES and BLUETIDES. 
Exploring these variations will help in formulating optimal observing strategies for upcoming cross-power spectrum measurements.</p><p>Table <ref type="table">1</ref> shows the cumulative S/N calculated using Equation (20) for our various combinations of parameters. For our default parameters, we choose a horizon wedge level of foreground contamination, f LAE = 0.1, &#963; z = 0.01,<ref type="foot">15</ref> a spin-temperature-saturated IGM, our fiducial reionization scenario, and the galaxy bias b g (z) as predicted by ARES. We show the resulting changes to the S/N as a function of varying these parameters.<note place="foot" n="14"><p>If one uses linear bins of dk &#8869; and dk &#8741; instead of logarithmic ones, the number of modes is given by <formula notation="TeX">N(k_\perp, k_\parallel) = \frac{k_\perp \, dk_\perp \, dk_\parallel \, V_{\rm survey}}{(2\pi)^2}</formula>, where we have applied similar considerations for mode counting only in the positive half-plane and k &#8741; symmetry.</p></note> Encouragingly, given these default parameters, we forecast a 12&#963; detection for the nominal HERA and Roman sensitivities. For the same combination of observational parameters, we forecast a 282&#963; detection of the 21 cm auto-power spectrum from HERA and a 14&#963; detection of the LAE galaxy auto-power spectrum from Roman. <ref type="foot">16</ref> We discuss further practical challenges in achieving this cross-power S/N in Section 4.4. We now briefly discuss some of the general trends on display in Table <ref type="table">1</ref> to contextualize some of the various observational effects at play. Not surprisingly, decreasing the slope of the foreground wedge significantly increases the cumulative S/N. If a value of m = 0.5 is assumed, the cumulative S/N increases by nearly a factor of 3, to a roughly 32&#963; detection. As shown in Figure <ref type="figure">12</ref>, there are a nontrivial number of modes that are subject to foreground contamination, which are also the ones most strongly detectable. 
By increasing the number of modes that contribute to the S/N statistic, the prospects for successful detection and characterization are improved dramatically. Furthermore, this improvement relies almost exclusively on improvements to 21 cm calibration and analysis techniques. Although the design of HERA means that reliably accessing these modes may be difficult, it is worth exploring whether there are ways to extract and use some of this information when measuring the cross-correlation spectrum.</p><p>For the redshift uncertainty &#963; z , we see that there is asymmetric behavior in how the overall S/N changes. Decreasing the uncertainty from the fiducial value to &#963; z = 0.001 offers a modest improvement to 14&#963;. However, increasing the uncertainty to &#963; z = 0.1 leads to a cumulative S/N well below 1. This can be seen in the behavior of the S/N shown in the right panel of Figure <ref type="figure">11</ref>: as k &#8741; increases, the galaxy noise power (and hence the cross-spectrum variance in the shot-noise dominated limit) increases exponentially. When the uncertainties are such that &#963; z &#8764; 0.1 (which is the expected level for photometrically determined redshift values), there are very few k &#8741; values that have an appreciable S/N value. It may be possible to access lower values of k &#8741; by increasing the bandwidth used for the 21 cm measurements, though the foreground contamination swamps the signal for small values of k &#8741; <ref type="bibr">(Abdurashidova et al. 2022a)</ref>.</p><p>The effective LAE duty cycle f LAE also has a significant effect on the projected S/N. As discussed above, changing this quantity only affects the galaxy number density calculation when determining the shot-noise term of the galaxy uncertainty P gal noise in Equation (<ref type="formula">14</ref>). Nevertheless, this particular source of uncertainty is one of the most significant. 
The left panel of Figure <ref type="figure">11</ref> suggests that this shot-noise term dominates the galaxy spectrum uncertainty at all values of k &#8869; , which in turn limits the sensitivity of the cross-spectrum measurement. Thus, increasing this quantity to f LAE = 1 significantly reduces the shot-noise uncertainty and increases the S/N by a factor of about 3. Conversely, using f LAE = 0.01 decreases the S/N by a similar factor.</p><p>Next, we look at the impact of a saturated versus a cold IGM. As mentioned above in Section 2.1, our default model assumes that the spin temperature of hydrogen has been saturated, so that T s &#8811; T &#947; . However, we also investigate the case of a "cold" IGM, in which the gas outside fully ionized regions is not heated. As a result, the 21 cm brightness temperature can be large and negative, which increases the amplitude of the 21 cm auto-power spectrum as well as the cross-spectrum of interest here. Although this assumption significantly increases the amplitude of the 21 cm auto-power spectrum, it only provides a modest increase in the cross-spectrum here (see also <ref type="bibr">Heneka &amp; Mesinger 2020)</ref>, largely due to the quadratic scaling of the fluctuation amplitude in the 21 cm auto-power spectrum versus the linear scaling here. As discussed above, the 21 cm signal is primarily cosmic-variance limited for the detectable Fourier modes in the cross-power spectrum, and so the higher-amplitude fluctuations in the cold IGM boost both the signal and the noise and do not significantly improve the total S/N. Furthermore, we investigate the effects of changing the ionization history. As can be seen in Table <ref type="table">1</ref>, both the short and the late reionization scenarios lead to slightly smaller projected S/N values compared to the fiducial history chosen here. 
For the case of the short history, the cross-power spectrum itself has a slightly larger amplitude due to a larger 21 cm signal, noted as a feature in previous applications of this seminumeric model (<ref type="bibr">La Plante et al. 2014</ref>). However, this increase in signal amplitude is offset by having fewer redshift windows over which the 21 cm and galaxy signals have an appreciable cross-spectrum amplitude. For the case of the late reionization scenario, imposing the cutoff of z &#8819; 7.2 means that spectral windows near the midpoint of reionization are thrown out. Despite the smaller amount of redshift information that can be used, there is still enough sensitivity to detect the cross-spectrum at roughly 9.5&#963;.</p><p>Finally, we also look at the effect of the linearly biased galaxy model chosen for the galaxy fields. Instead of our models generated from ARES, we use bias values and number densities inferred from the BLUETIDES simulation. As can be seen in Figure <ref type="figure">3</ref>, the galaxy properties predicted by BLUETIDES feature both a larger galaxy bias b g and a greater galaxy number density n g . The combination of these effects is a significantly larger predicted value for the cumulative S/N, with a value of 29&#963;. In Table <ref type="table">1</ref>, we also show the S/N for different values of the horizon cut. In general, the S/N for BLUETIDES is larger than the S/N for ARES by a factor of 2-3. We find that this trend holds for all of the parameter combinations we explored.</p></div>
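<div xmlns="http://www.tei-c.org/ns/1.0"><p>The quadrature combination described in this section can be illustrated with a short sketch. It assumes the cumulative S/N is the square root of the sum of the squared per-bin values &#349;, applied first over the (k &#8869; , k &#8741; ) bins of one redshift window and then over windows. The function names and the input numbers are hypothetical and are not taken from Table 1.</p>

```python
import math


def cumulative_snr(per_bin_snr):
    """Combine per-bin S/N values by summing them in quadrature."""
    return math.sqrt(sum(s * s for s in per_bin_snr))


def cumulative_snr_over_windows(windows):
    """Combine several redshift windows, each a list of per-bin S/N values."""
    return cumulative_snr([cumulative_snr(bins) for bins in windows])


if __name__ == "__main__":
    # Hypothetical per-bin values for two redshift windows (not from the paper).
    windows = [[3.0, 4.0], [12.0]]
    print(cumulative_snr_over_windows(windows))  # sqrt(9 + 16 + 144) = 13.0
```

<p>Because the terms add in quadrature, the total is dominated by the few bins with the largest individual S/N, which is why the low-k modes near the wedge boundary matter so much for the forecasts.</p></div>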
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Sources of Systematic Errors</head><p>In addition to the statistical uncertainties discussed above, there are sources of systematic error in both the galaxy and the 21 cm measurements. We begin by discussing issues related to galaxy observations and potential paths for mitigation, then turn to 21 cm measurements.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.1.">Galaxy Survey Errors</head><p>For the galaxy surveys, there is the potential for the measured redshift values from Ly&#945; lines to be systematically offset from the true galaxy redshifts (e.g., <ref type="bibr">Shapley et al. 2003;</ref><ref type="bibr">Erb et al. 2014;</ref><ref type="bibr">Shibuya et al. 2014;</ref><ref type="bibr">Mason et al. 2018;</ref><ref type="bibr">Endsley et al. 2022</ref>). As we show in the above analysis, accurately determined spectroscopic redshift values are key to yielding a statistically significant detection.</p><p>Fortunately, there are two potential solutions to systematic errors in the determination of galaxy redshift values. First, we may use data taken from other instruments as a means of cross-validating and calibrating the measurements from the HLSS. Specifically, the near-infrared spectrograph <ref type="bibr">(Birkmann et al. 2011)</ref> aboard the James Webb Space Telescope (JWST) is expected to take high-resolution spectroscopic measurements of many high-redshift galaxies. By comparing several of the higher-quality spectra from JWST with a subset of those taken by Roman, it may be possible to model and remove any systematic uncertainties from the full catalog produced by Roman. Second, when endeavoring to measure the cross-correlation signal in practice, we can treat the galaxy redshift values as uncertain quantities, or even introduce a single nuisance parameter quantifying the typical systematic redshift offset (as done in, e.g., <ref type="bibr">CHIME Collaboration et al. 2022)</ref>, and marginalize over them in a Markov Chain Monte Carlo (MCMC)-style analysis. Essentially, the cross-correlation signal is not significant when the incorrect values for the galaxy sample redshifts are used, but the MCMC technique yields a maximal signal as the accuracy is increased. 
We can use the measured redshift values to form relatively tight prior distributions for the values used in the analysis, which are made more accurate as the space is sampled.</p><p>Another source of uncertainty is the potential confusion of sources as measured by the grism, given that the instrument covers a relatively wide FOV. These sources can either be other nearby high-redshift galaxies, or interloping low-redshift galaxies or stars. Interlopers that are not high-redshift LAEs will increase the effective shot-noise of the measurement. An additional complication is the confusion of two LAE objects in the Roman survey. As discussed above in Section 2.4.2, we expect that objects observed in the HLSS will also be sufficiently bright in the HLIS so as to be well-localized. In <ref type="bibr">Satpathy et al. (2020)</ref>, the authors explicitly model and measure the expected number of overlapping galaxy spectra (see their Section 3.1.3). They find that about 7% of galaxies are expected to overlap. Assuming that the overlap of galaxies is not biased in some way, then this only results in an effective decrease in the number density of observed galaxies, which is accounted for in our f LAE parameter. As we show, even with a pessimistic value of f LAE = 0.01, we expect a significant detection of the cross-spectrum. Thus, this potential overlap should not preclude a successful measurement.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.2.">21 cm Survey Errors</head><p>On the 21 cm side, there are also considerations of systematic uncertainties. As discussed at great length throughout the rest of this paper, we know that the wedge and 21 cm foregrounds pose a significant challenge when making measurements and inferences. Although we have chosen a pessimistic value of a horizon cut for our 21 cm foregrounds, the amount of contamination may necessitate increasing this value even further. Previous analysis has shown that a "superhorizon buffer" may be required to fully remove the effect of foreground emission (e.g., <ref type="bibr">Pober et al. 2013)</ref>, which would lead to a further decrease in the amount of (k &#8869; , k &#8741; ) space that can be used for this cross-correlation. Furthermore, although it may be possible for HERA to produce data from inside the wedge, the primary mode of HERA data analysis thus far has been explicit foreground avoidance. The development and implementation of new techniques may be required to achieve levels of foreground cleaning required to improve the prospects for this measurement.</p><p>Separately from the issue of Fourier modes lost to foreground contamination, we have also ignored contributions of foreground signals to the cross-spectrum variance. Although we expect the foregrounds from the 21 cm data to be uncorrelated with the galaxy survey signal, they nevertheless contribute to the cross-spectrum variance. As explored above, the variance of the 21 cm signal contributes to the overall variance of the cross-power spectrum, so increasing the variance will reduce the forecast S/N values.</p><p>Another potential issue for 21 cm measurements is a decrease in the effective number of baselines measuring a given u mode on the sky. Figure <ref type="figure">7</ref> shows the distribution of baselines for the full HERA-350 array. 
As shown in Figure <ref type="figure">11</ref>, this full distribution is expected to yield measurements that are cosmic-variance limited, rather than instrument-noise limited. However, if the effective number of baselines is decreased, this may no longer be true, which would increase the effective noise of the measurement. Fortunately, given the highly redundant nature of HERA, this baseline distribution ensures that these low-k &#8869; modes that yield the highest S/N are very well sampled, and should be cosmic-variance dominated even with a smaller number of antennas.</p><p>Another source of systematic errors that may decrease the sensitivity of HERA measurements in practice is the presence of radio frequency interference (RFI), which affects frequency ranges detectable by the instrument. Wider frequency ranges that are relatively free of RFI mean that multiple different redshift windows can be used to increase the cumulative S/N. Additionally, wider frequency ranges mean that broader FFTs can be performed, which probe smaller values of k &#8741; . Given that most of the sensitivity for the cross-correlation statistic comes from low values of k &#8741; , having access to wide frequency ranges free of RFI is appealing. Fortunately, the observed frequency ranges for HERA are relatively wide <ref type="bibr">(Abdurashidova et al. 2022a</ref>), so it should be possible to construct several frequency windows for evaluating the cross-correlation S/N to make such a detection possible.</p></div>
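<div xmlns="http://www.tei-c.org/ns/1.0"><p>The idea of constructing several RFI-free frequency windows can be illustrated with a small sketch. It assumes that RFI is summarized as one boolean flag per frequency channel and that a usable window is any contiguous run of unflagged channels with at least a chosen minimum length. The function name clean_windows, the flag pattern, and the minimum length are hypothetical and do not describe the HERA pipeline.</p>

```python
def clean_windows(flags, min_channels):
    """Return (start, stop) channel ranges that are RFI-free.

    flags is a sequence of booleans (True means the channel is flagged);
    stop is exclusive, and runs shorter than min_channels are dropped.
    """
    windows = []
    start = None
    # A trailing True acts as a sentinel that closes the final run.
    for i, flagged in enumerate(list(flags) + [True]):
        if not flagged and start is None:
            start = i
        elif flagged and start is not None:
            if i - start >= min_channels:
                windows.append((start, i))
            start = None
    return windows


if __name__ == "__main__":
    # Hypothetical flag pattern for 11 channels (not real HERA data).
    flags = [False, False, False, True, False, False,
             True, False, False, False, False]
    print(clean_windows(flags, min_channels=3))  # [(0, 3), (7, 11)]
```

<p>Longer clean runs correspond to wider bandwidths and therefore to smaller accessible values of k &#8741; , so the minimum length trades the number of usable windows against the lowest k &#8741; each window can probe.</p></div>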
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Discussion</head><p>Based on the S/N calculations above in Section 4.1, there are several important findings that have implications for various observation strategies. We now turn our attention to the proposed HLS, and ask how to improve prospects for this cross-correlation measurement. Before discussing these prospects in detail, it is worth making some high-level remarks about the trade-off between survey depth and survey area.</p><p>As shown above in Section 4.1 and Figure <ref type="figure">11</ref>, we expect the galaxy shot-noise term 1/n gal to be significantly larger than the intrinsic galaxy power spectrum P gal on all scales, but specifically for the low-k modes that contain the most sensitivity in the cross-power spectrum. If 1/n gal &#8811; P gal on the scales of interest, then one is shot-noise limited, and it is better to attempt to make deeper observations so that the shot-noise term decreases. Conversely, once the observations are sufficiently deep such that 1/n gal &#8764; P gal , then one should use wider observations at this depth rather than deeper ones. We now explore some of the quantitative ramifications for the HERA-Roman cross-spectrum.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Increased Sky Coverage</head><p>Let us first suppose that we were able to make use of additional time on the instrument, and decided to cover additional sky area (in the HERA stripe) at the same depth. Using the noise formalism outlined above, the effect of increasing the survey area when cross-correlating with HERA is related to the number of nonoverlapping sky patches that are observed, N patch . Specifically, by assuming that the signals in these patches average together incoherently, we increase the computed S/N value by a factor of &#8730;N patch . In the above analysis, given that the projected overlapping area is expected to be 500 deg 2 and the primary beam of HERA is about 10&#176;, we have assumed that N patch = 5. If we instead assume a 20% increase in sky coverage such that N patch = 6, the S/N increases only by about 10%. Although this gain is modest and is straightforward to model, it requires a significant increase in the amount of sky area that is jointly covered by the two instruments. As such, it may not be feasible to expand the overlapping area by such a large degree.</p></div>
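<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a quick check of the scaling quoted above, assume that the S/N grows as the square root of the number of independent patches. Going from N patch = 5 to N patch = 6 then gives</p><p><formula notation="TeX">\frac{(\mathrm{S/N})_{N_{\rm patch}=6}}{(\mathrm{S/N})_{N_{\rm patch}=5}} = \sqrt{\frac{6}{5}} \approx 1.095,</formula></p><p>i.e., a gain of roughly 10% for a 20% increase in jointly covered sky area. Applied to the fiducial 12&#963; forecast, this simple scaling would give about 13&#963;.</p></div>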
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Deeper Observations</head><p>Alternatively, given additional observing time, it may instead be used to probe the existing sky coverage with greater depth. A fully self-consistent treatment of this choice is subtle, but we briefly discuss some of the high-level effects. By observing the sky for additional time, the limiting magnitude of objects in the survey increases, yielding more observed galaxies. Given that these galaxies are generally hosted in less-massive dark matter halos, the associated bias of these objects will decrease (though still remain relatively large, as shown in Figure <ref type="figure">3</ref>), meaning the amplitude of the cross-spectrum will decrease by a small amount. However, given the relatively steep slope of the UVLF for these objects, we expect a significant increase in the number of objects observed.<ref type="foot">17</ref> As discussed above in Section 4.1, the cross-spectrum S/N forecasts are limited by shot-noise in the galaxy distribution and the sample variance (i.e., cosmic variance) in the 21 cm measurements. Thus by observing more galaxies, we would be able to reduce the shot-noise contribution to the total noise budget, and increase the significance of detection.</p><p>Figure <ref type="figure">13</ref> makes this trade-off more concrete. We show the nominal S/N for our 500 deg 2 survey as reported in Table <ref type="table">1</ref> and the expected limiting magnitude of the HLIS, where the detected galaxies have m AB &lt; 26.7. For our ARES models, we are also able to self-consistently model the change in galaxy bias and number density as a function of this magnitude limit (see Figure <ref type="figure">3</ref>). We compute the projected S/N for proposed deeper observations that are 1 mag deeper (m AB &lt; 27.7) and 2 mag deeper (m AB &lt; 28.7). 
To reflect the fact that a smaller sky area is probed, we set N patch = 1 (i.e., this deeper survey only covers a single patch of 100 deg 2 ). We can see that going 1 mag deeper leads to a modest increase in projected S/N, from 12&#963; to 13&#963; for the default set of observing parameters, even at the expense of sky coverage. However, going 2 mag deeper leads to a significant increase in the projected S/N, to 29&#963;.</p><p>We note also that in principle, these deeper photometric surveys also require deeper grism coverage in the accompanying areas. As shown in Figure <ref type="figure">4</ref>, going 2 mag deeper in imaging requires roughly five times deeper spectroscopy to observe all of these sources in Ly&#945;. Other studies have considered other possible deep field options (e.g., <ref type="bibr">Drakos et al. 2022)</ref>, such as covering 10 deg 2 with relatively deep photometry and spectroscopy. In this case, we project a detection of 4.0&#963; for a limiting magnitude of m AB &lt; 27.7 (i.e., 1 mag deeper than the nominal HLIS), and a 9.0&#963; detection for m AB &lt; 28.7.</p></div>
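<div xmlns="http://www.tei-c.org/ns/1.0"><p>The numbers quoted in this section can be compared with a simple area scaling. The sketch below assumes that, at fixed depth, the S/N grows as the square root of the sky area. That is an assumption made here for illustration, extrapolated from the N patch scaling, and it may not hold exactly for fields smaller than a single HERA patch. The function name rescale_snr is hypothetical.</p>

```python
import math


def rescale_snr(snr, area_deg2, new_area_deg2):
    """Rescale an S/N forecast assuming it grows as the square root of sky area."""
    return snr * math.sqrt(new_area_deg2 / area_deg2)


if __name__ == "__main__":
    # S/N values quoted in the text for a single 100 deg^2 patch.
    for depth, snr_100 in (("m_AB 27.7", 13.0), ("m_AB 28.7", 29.0)):
        print(depth, round(rescale_snr(snr_100, 100.0, 10.0), 1))
```

<p>Under this assumption, the 100 deg 2 forecasts of 13&#963; and 29&#963; rescale to about 4.1&#963; and 9.2&#963; for a 10 deg 2 field, close to the 4.0&#963; and 9.0&#963; quoted above.</p></div>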
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>In this paper, we explore the prospects for the 21 cm galaxy cross-spectrum for upcoming measurements. <ref type="foot">18</ref> We show that such a signal is nominally detectable for projected measurements for HERA and the Roman Space Telescope, with a forecasted sensitivity of 12&#963; even under pessimistic assumptions regarding the foreground wedge, i.e., the region of Fourier space unusable owing to foreground contamination. We find that the accuracy of spectroscopically determined redshift values &#963; z does not dramatically impact the sensitivity of the measurement, though photometrically determined redshifts will not produce a statistically significant detection. In our fiducial forecast, we assume an effective duty cycle of 10% for the LAE objects detected by the HLS, which has a significant impact on the forecasted sensitivity. If instead the effective duty cycle is 1%, a significant detection is still possible, and the reduced significance can be offset by a deeper galaxy survey, less 21 cm foreground contamination, or an intrinsically larger galaxy bias consistent with the BLUETIDES predictions. We also explore the impact of varying the reionization history used in our simulations, and find that the effect is not very significant. The one caveat is for sufficiently late reionization histories (where the midpoint of reionization is z &#8818; 7), since the grism on Roman cannot detect Ly&#945; in objects at z &#8818; 7.2.</p><p>Given the forecast sensitivity for HERA, we expect the 21 cm measurements to be cosmic-variance dominated for the modes that contribute most to the cross-power spectrum. As such, there is not much additional statistical advantage to be gained from next-generation experiments such as the SKA. 
However, if the SKA allows for the recovery of the signal that is lost to foreground contamination, then the significance of detection may improve dramatically. Indeed, with the projected sensitivity levels, it may be possible to characterize the ionization history, rather than make a simple detection, by dividing the cumulative sensitivity into nonoverlapping redshift windows. Additionally, by using additional k-bins, it may be possible to constrain the size of ionized bubbles during the EoR. However, we leave a detailed investigation of this possibility for future work.</p><p>For the galaxy measurements, we find that the relatively low number of galaxies expected to be measured at high redshift has the largest impact on the overall sensitivity of the measurement. As such, increased observation time of the HLS patch may prove useful for making a more robust measurement. When weighing the potential trade-off between covering more sky area versus performing a deeper survey of the existing footprint, we conclude that a deeper survey is more beneficial. This strategy would lead to a larger number of observed galaxies, thereby decreasing the shot-noise contribution to the overall uncertainty. As this component is the most significant factor, taking steps to decrease it has the largest impact on the overall sensitivity.</p><p>We note that our modeling methods in this paper are relatively simple, and do not capture the full interplay between the 21 cm signal and high-redshift galaxies. In particular, there is no explicit link between the galaxies and the resulting ionization field. Additionally, we have modeled LAEs using a constant factor f LAE to account for the fraction of intrinsic LAEs actually observed by Roman. In reality, the observable LAEs will have some spatial and luminosity dependence, primarily owing to absorption from neutral regions in the IGM. In the future, it would be beneficial to employ a more self-consistent model. This is of course challenging and will be more computationally expensive than the approach we take here, but will be vital for unbiased inference.</p><p>Finally, when looking toward future 21 cm experiments, it may be useful to consider how projections for cross-correlation measurements such as the ones presented in this paper may provide guidance for array design and construction. When seeking to confirm or independently verify astrophysical and cosmological parameters measured from 21 cm experiments alone, cross-correlations may provide a key path forward for providing confidence in auto-power spectra.</p><p>Figure 13. We show different values of f LAE using different symbols, and use color to denote ARES vs. BLUETIDES galaxy models (the latter of which are only shown for m AB &lt; 26.7). For our assumed default parameters, 1 magnitude of deeper observations corresponds to a slight increase of the S/N, from 12&#963; to 13&#963;. Note that when considering the deeper observations, we use a sky coverage of 100 deg 2 rather than 500 deg 2 . Going 2 magnitudes deeper corresponds to a further increase to an S/N of 29&#963;. See Section 5 for more discussion.</p><p>of reionization, we expect these quantities to generally be anticorrelated: the galaxy signal comes from regions of high density, and the 21 cm signal during the bulk of reionization comes from regions of low density.</p><p>Figure <ref type="figure">15</ref> shows the spherical power spectrum &#916; 2 (k) and the cross-correlation coefficient r(k &#8869; ) for a fixed value of k &#8741; = 0.1 h Mpc -1 . We also show the 1&#963; error bars, where we have summed in quadrature the uncertainty described by Equation (15) for different (k &#8869; , k &#8741; ) modes that correspond to the same k mode. The cross-correlation signal evolves relatively slowly over this interval. On large scales, the signal peaks near the midpoint of reionization, which is consistent with the behavior of the 21 cm signal <ref type="bibr">(Lidz et al. 2008)</ref>. 
Interestingly, the error is smallest at redshifts slightly past the midpoint of reionization. This is a reflection of the fact that the expected number of observed galaxies is larger, meaning the galaxy shot-noise component is smaller. For the cross-correlation coefficient r(k &#8869; , k &#8741; ), the anticorrelation between the fields is generally quite strong on large scales, though the fields are not perfectly anticorrelated. The decrease toward small scales suggests there is no longer a strong correlation between the galaxy and 21 cm fields. This result further supports working on relatively large scales, such as those probed by HERA and Roman.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_0"><p>https://chime-experiment.ca/en</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1"><p>The Astrophysical Journal, 944:59 (20pp), 2023 February 10 La Plante et al.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_2"><p>The zreion parameters for the fiducial scenario are: z&#772; = 8, &#945; = 0.2, and k 0 = 0.9 h Mpc -1 . The short scenario parameters are: z&#772; = 8, &#945; = 0.564, and k 0 = 0.185 h Mpc -1 . The late scenario parameters are the same as the fiducial scenario, but with z&#772; = 7.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_3"><p>https://github.com/mirochaj/ares</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_4"><p>Note that our results are insensitive to this choice. Because the model is calibrated empirically, any change in Z must be met with a commensurate change in f * , which keeps the rest-UV emission roughly constant. See, e.g., Section 3.4 in <ref type="bibr">Mirocha et al. (2017)</ref> for additional discussion.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_5"><p>Note that we show two versions of the ARES model: a default approach using the mean halo MAR computed following <ref type="bibr">Furlanetto et al. (2017)</ref>, and another in which halo histories are extracted from N-body simulations <ref type="bibr">(Mirocha et al. 2021)</ref>. The latter agrees more with <ref type="bibr">Tacchella et al. (2018)</ref>, who also anchor their semiempirical model to N-body simulations. The reduced SMHM is due to the steeper growth histories of simulated halos, thus making them brighter and bluer, allowing one to reduce the overall normalization of the star formation efficiency <ref type="bibr">(Mirocha et al. 2021</ref>).</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="13" xml:id="foot_6"><p>Because the optical depth to Ly&#945; is so large close to the line center, even an LAE in an ionized bubble will generally be attenuated. For instance, if the intrinsic line were a simple Gaussian of some width, the blue half of the line will typically be lost, even in a large ionized bubble. In future work we plan to account for attenuation of LAEs due to the IGM explicitly, but for now use the overall factor f LAE to include these various physical effects.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_7"><p>&#963; z (z) = 0.001(1 + z) mentioned above <ref type="bibr">(Wang et al. 2022</ref>).</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="16" xml:id="foot_8"><p>Note that for the HERA and Roman auto-spectra, we have assumed sky coverage areas of 1000 deg 2 and 2200 deg 2 , respectively. Also note that the LAE galaxy auto-power spectrum does not include the foreground wedge excision, so the nominal S/N forecast is comparable to the cross-power spectrum.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="17" xml:id="foot_9"><p>Note that a much more significant increase in survey depth than we consider here might have quickly diminishing returns, because many newly detected faint galaxies are more likely to be in small bubbles, which impede the transmission of Ly&#945; and as a result, our ability to detect the galaxies in Ly&#945;.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="18" xml:id="foot_10"><p>Those interested in understanding more of the implementation details can look at a GitHub repository containing key parts of the calculation: https://github.com/plaplant/21cm_gal_cross_correlation.</p></note>
		</body>
		</text>
</TEI>
