<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Constraining Cosmology with Machine Learning and Galaxy Clustering: The CAMELS-SAM Suite</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>08/18/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10458996</idno>
					<idno type="doi">10.3847/1538-4357/accd52</idno>
					<title level='j'>The Astrophysical Journal</title>
<idno>0004-637X</idno>
<biblScope unit="volume">954</biblScope>
<biblScope unit="issue">1</biblScope>					

					<author>Lucia A. Perez</author><author>Shy Genel</author><author>Francisco Villaescusa-Navarro</author><author>Rachel S. Somerville</author><author>Austen Gabrielpillai</author><author>Daniel Anglés-Alcázar</author><author>Benjamin D. Wandelt</author><author>L. Y. Yung</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[As the next generation of large galaxy surveys comes online, it is becoming increasingly important to develop and understand the machine-learning tools that analyze big astronomical data. Neural networks are powerful and capable of probing deep patterns in data, but they must be trained carefully on large and representative data sets. We present a new “hump” of the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project: CAMELS-SAM, encompassing one thousand dark-matter-only simulations of $(100\,h^{-1}\,\mathrm{cMpc})^3$ with different cosmological parameters ($\Omega_{\rm m}$ and $\sigma_8$) and run through the Santa Cruz semi-analytic model for galaxy formation over a broad range of astrophysical parameters. As a proof of concept for the power of this vast suite of simulated galaxies in a large volume and broad parameter space, we probe the power of simple clustering summary statistics to marginalize over astrophysics and constrain cosmology using neural networks. We use the two-point correlation, count-in-cells, and void probability functions, and we probe nonlinear and linear scales across $0.68 < R < 27\,h^{-1}\,\mathrm{cMpc}$. We find our neural networks can marginalize over the uncertainties in astrophysics to constrain cosmology to 3%–8% error across various types of galaxy selections, while simultaneously learning about the SC-SAM astrophysical parameters. This work encompasses vital first steps toward creating algorithms able to marginalize over the uncertainties in our galaxy formation models and measure the underlying cosmology of our Universe.
CAMELS-SAM has been publicly released alongside the rest of CAMELS, and it offers great potential to many applications of machine learning in astrophysics: https://camels-sam.readthedocs.io.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Since the earliest galaxy redshift surveys, it has been known that galaxies are not distributed randomly in space, but trace out vast structures, including walls, filaments, and voids. Dark matter (DM) makes up the majority of the mass content of the Universe, and it is the dominant driver behind large-scale structure formation. The distribution of galaxies in space is heavily influenced by the clustering of dark matter halos, but it also carries signatures of how galaxy properties map to the properties of these dark matter halos <ref type="bibr">(Peebles 1980;</ref><ref type="bibr">Wechsler &amp; Tinker 2018)</ref>. Galaxy clustering is a potential key probe of cosmology, yet accurately describing the baryonic physics that drives galaxy evolution and determines this mapping between galaxy and DM halo properties remains a large, ongoing area of research. Astrophysical processes such as cooling, chemical enrichment, star formation, stellar feedback, black hole growth and feedback, and galaxy mergers interact in highly nonlinear ways. Active galactic nuclei (AGNs) expel gas far beyond the center of the host galaxy, are a crucial element of the feedback cycle in galaxies, and may even affect the distribution of dark matter itself (e.g., <ref type="bibr">McKee &amp; Ostriker 2007;</ref><ref type="bibr">Fabian 2012;</ref><ref type="bibr">Kormendy &amp; Ho 2013;</ref><ref type="bibr">Netzer 2015;</ref><ref type="bibr">Borrow et al. 2020)</ref>. Stellar feedback in the form of supernovae and radiation from massive stars drives galactic winds that are key to regulating star formation in galaxies (e.g., <ref type="bibr">Madau &amp; Dickinson 2014;</ref><ref type="bibr">Somerville &amp; Dav&#233; 2015;</ref><ref type="bibr">Angl&#233;s-Alc&#225;zar et al. 
2017)</ref>.</p><p>Modern hydrodynamic simulations including these physical processes within the Lambda cold dark matter (&#923;CDM) cosmological framework have been quite successful at reproducing many features of the large-scale distribution of galaxies. For example, <ref type="bibr">Springel et al. (2018)</ref> measured the matter power spectrum for the dark matter, gas, and stellar components in the IllustrisTNG simulations, explored the halo-galaxy connection, and found projected correlation functions that showed great consistency with diverse observations. More recently, the MillenniumTNG simulations expanded the model to enormous volumes and have refined our understanding of the galaxy-halo connection <ref type="bibr">(Barrera et al. 2022;</ref><ref type="bibr">Bose et al. 2023;</ref><ref type="bibr">Contreras et al. 2023;</ref><ref type="bibr">Hadzhiyska et al. 2023b</ref><ref type="bibr">, 2022a;</ref><ref type="bibr">Hern&#225;ndez-Aguayo et al. 2023)</ref>. However, although a broad narrative has developed for how these types of feedback tie into the formation and evolution of galaxies, there is still much uncertainty in our understanding of the details of how these physical processes operate to shape galaxy observables (e.g., <ref type="bibr">Steinhardt &amp; Speagle 2014;</ref><ref type="bibr">Somerville &amp; Dav&#233; 2015;</ref><ref type="bibr">Naab &amp; Ostriker 2017;</ref><ref type="bibr">F&#246;rster Schreiber &amp; Wuyts 2020)</ref>. One of the major open questions in astrophysics is how to disentangle the effects of cosmology and baryonic physics in order to realize the full potential of galaxies as probes of both cosmology and astrophysics.</p><p>The Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project<ref type="foot">foot_0</ref> (Villaescusa-Navarro et al. 
2021a) posits: to constrain cosmology, we should leverage tools that can marginalize over uncertainties in baryonic physics, and thereby measure the underlying cosmology. Machine learning has great promise for this goal, as algorithms can learn relationships between features without requiring explicit functional forms or likelihoods. However, machine learning requires large data sets for accurate training and robust results. The CAMELS project created a large suite of simulations to specifically explore the potential of machine learning to constrain cosmology. The initial focus of CAMELS has been developing techniques that can constrain the density of matter in the Universe, &#937; M , and the amplitude of density fluctuations in the early Universe, &#963; 8 , under the broadly supported &#923;CDM model of cosmology. The project created 4000+ cosmological simulations of (25 h -1 cMpc) 3 spanning thousands of cosmological models. Half the simulations are dark matter only, and the other half are run with the IllustrisTNG <ref type="bibr">(Weinberger et al. 2017;</ref><ref type="bibr">Pillepich et al. 2018)</ref> and SIMBA <ref type="bibr">(Dav&#233; et al. 2019</ref>) hydrodynamic models of galaxy formation, creating 2000+ simulations also spanning thousands of astrophysical models. All of CAMELS has been publicly released, as detailed in Villaescusa-Navarro et al. (2023).<ref type="foot">11</ref> Various studies based on CAMELS have explored different methods of constraining cosmology or astrophysical models with diverse types of machine learning, computational tools, and astrophysical objects and phenomena. For example, <ref type="bibr">Villaescusa-Navarro et al. (2021b)</ref> obtain constraints on &#937; M with 3%-4% errors using neural networks trained on total matter density maps, providing robust predictions irrespective of galaxy formation physics implementation. Similarly, <ref type="bibr">Nicola et al. 
(2022)</ref> used the electron density power spectrum P ee (k) to obtain cosmological constraints and to probe the strength of baryonic feedback through the mean baryon fraction (fbar). The z = 0 P ee (k) from the IllustrisTNG "hump" of CAMELS yields constraints with approximately 5%-10% error on &#937; M but no constraints on &#963; 8 . However, their results are encouragingly robust across galaxy formation models: the same neural network is also able to predict &#937; M and fbar when instead trained on simulations from the SIMBA hump. <ref type="bibr">Nicola et al. (2022)</ref> also confirm that it is possible to obtain constraints on &#937; M with approximately 4% errors using the CAMELS matter power spectra P mm (k), though no significant constraints on the baryon fraction fbar and &#963; 8 can be obtained with this method.</p><p>Though they represent a valuable resource for many science goals, the original CAMELS simulations, due to their small volumes and sensitivity to cosmic variance, are not well suited for leveraging the most readily used summary statistics for measuring cosmology from observations: galaxy clustering. For example, <ref type="bibr">Villaescusa-Navarro et al. (2021b)</ref> find that the power spectra measured from CAMELS 2D maps of the matter density field provide constraints on &#937; M with 20% error, a significant loss of information compared to using the full maps in their neural networks. The clear path to improve constraints from galaxy clustering statistics is to increase the volume of CAMELS. However, it is currently not computationally feasible to directly scale CAMELS and its full hydrodynamic simulations up to larger volumes. 
In this work, we present CAMELS-SAM, a third and larger "hump" of the CAMELS project to address the need for larger volumes of simulated galaxies, and we use it to probe the power of clustering summary statistics toward CAMELS' goals.</p><p>Semi-analytic models (SAMs) are a well-established technique for simulating galaxy properties in a cosmological context, using simplified but physically motivated recipes. SAMs are very successful at reproducing a broad range of galaxy observables, and the predictions of SAMs for many global galaxy properties have been shown to be in good agreement with the predictions of hydrodynamic simulations over a broad range of cosmic time <ref type="bibr">(Somerville &amp; Dav&#233; 2015)</ref>. Moreover, SAMs are more computationally efficient than hydrodynamic simulations by many orders of magnitude. In this work, we make use of the well-established Santa Cruz SAM (SC-SAM; <ref type="bibr">Somerville &amp; Primack 1999;</ref><ref type="bibr">Somerville et al. 2008</ref><ref type="bibr">, 2015</ref><ref type="bibr">, 2021)</ref>. A detailed, halo-by-halo comparison of the predictions of the SC-SAM with the IllustrisTNG hydro simulations has recently been carried out by <ref type="bibr">Gabrielpillai et al. (2022)</ref>. <ref type="bibr">Hadzhiyska et al. (2021a)</ref> showed that the SC-SAM produces very similar predictions to IllustrisTNG for galaxy clustering, including two-point and higher-order clustering statistics.</p><p>SAMs are set within the backbone of cosmological dark matter merger trees, which specify how halos grow over time via accretion and mergers. Like most SAMs, the SC-SAM implements treatments of cooling, partitioning of cold gas into molecular, atomic, and ionized phases, star formation, stellar feedback, chemical enrichment, and black hole growth and feedback. Each of these processes contains free parameters that represent our incomplete understanding of the physical processes. 
Traditionally, these parameters are adjusted to reproduce a set of observational calibrations for nearby galaxies. In this work, we run the SC-SAM within merger trees extracted from the suite of new DM-only simulations that we have created for the CAMELS-SAM hump. In addition, we rerun the SAM for many different values of the parameters controlling stellar and AGN feedback, in a similar spirit to the CAMELS hydro humps.</p><p>There has been much previous work describing galaxy clustering using simple mappings between galaxy and halo properties, often referred to as the Halo Occupation Distribution (HOD) framework <ref type="bibr">(Wechsler &amp; Tinker 2018)</ref>. HOD approaches often distinguish between the "one-halo" and "two-halo" terms, corresponding to small-scale clustering between galaxies in the same halo and clustering between separate halos, respectively; these studies have yielded great understanding of how particular types of galaxies inhabit halos (e.g., <ref type="bibr">Hadzhiyska et al. 2023b</ref><ref type="bibr">, 2022a)</ref>. The HOD framework is central to the bias formalism, which describes the relationship between galaxy clustering and dark matter halo clustering. Related methods to connect galaxies to halos include empirical models, which map halo and galaxy properties using observational constraints into functional forms, often parameterizing, e.g., galaxy star formation rates as a function of host halo masses, mass accretion rates, and redshifts <ref type="bibr">(Behroozi et al. 2020)</ref>. A notable subset of empirical models is the subhalo abundance matching method (SHAM; e.g., <ref type="bibr">Conroy et al. 2006;</ref><ref type="bibr">Guo et al. 2010;</ref><ref type="bibr">Contreras et al. 2021;</ref><ref type="bibr">Hearin et al. 2022)</ref>, which specifically maps the abundance of galaxies to the abundance of halos at a given redshift, and by definition accurately recreates the observed stellar mass function. 
These approaches have the advantage of simplicity and great computational efficiency, and they have been used in many studies that attempt to use galaxy clustering observations to constrain cosmological parameters such as &#937; M and &#963; 8 . For example, <ref type="bibr">Contreras et al. (2023)</ref> use a SHAM atop the new MillenniumTNG simulations to build a very accurate emulator of galaxy clustering for a large range of scales.</p><p>"Classical" analyses with the HOD framework have, for example, leveraged theoretical descriptions of clustering statistics to find the best-fitting cosmology for simulations or observations (e.g., <ref type="bibr">Repp &amp; Szapudi 2020;</ref><ref type="bibr">Sugiyama et al. 2020)</ref>, or to create emulators or forward models for these statistics (e.g., <ref type="bibr">Zhai et al. 2019b;</ref><ref type="bibr">Wibking et al. 2020;</ref><ref type="bibr">Barreira et al. 2021;</ref><ref type="bibr">Kokron et al. 2021;</ref><ref type="bibr">Mead et al. 2021)</ref>. Several studies have also used the enormous Quijote simulation suite <ref type="bibr">(Villaescusa-Navarro et al. 2020b</ref>) to determine the cosmology dependence of various types of clustering using the Fisher matrix formalism (e.g., <ref type="bibr">Uhlemann et al. 2020;</ref><ref type="bibr">Bayer et al. 2021;</ref><ref type="bibr">Hahn &amp; Villaescusa-Navarro 2021;</ref><ref type="bibr">Massara et al. 2021</ref>). However, most of these approaches focus on larger scales ($R \gtrsim 7\,h^{-1}\,\mathrm{cMpc}$ or $k_{\max} &lt; 1\,h\,\mathrm{cMpc}^{-1}$), due to the difficulty of capturing the complexity of clustering on strongly nonlinear scales.<ref type="foot">12</ref> Smaller nonlinear scales of clustering have been found to hold more cosmological information than larger scales <ref type="bibr">(Contreras et al. 2023;</ref><ref type="bibr">Lange et al. 2022</ref><ref type="bibr">, 2023)</ref>, and they are affected by the details of feedback and baryonic physics (seen acutely in the CAMELS matter power spectra in <ref type="bibr">Delgado et al. 2023)</ref>. Within the HOD framework and even for the thoroughly studied power spectrum, the chosen prescription for the nonlinear regime strongly affects measured cosmological parameters (e.g., a 5&#963; bias for a Euclid-like survey; <ref type="bibr">Safi &amp; Farhang 2021)</ref>; this motivates approaches that inherently reproduce small-scale galaxy clustering while also incorporating baryonic effects.</p><p>Another advantage of a SAM- or hydro-based approach is that it has the potential to provide direct insights into the astrophysical processes, which HOD/bias models bypass. Additionally, empirical and SHAM models often have limited predictive or interpretive power, being tied to the observations they are tuned to match. It has also been shown that the most basic and widely used HOD-type models (which assume that the galaxy-halo mapping depends only on halo mass) do not accurately describe the clustering predictions of full hydrodynamic simulations <ref type="bibr">(Hadzhiyska et al. 2021b)</ref>. Galaxies in hydrodynamic simulations such as IllustrisTNG show a phenomenon called "assembly bias," which means that their clustering depends on halo properties in addition to mass. Galaxies generated by the SC-SAM have been shown to closely reproduce the assembly bias signal seen in IllustrisTNG <ref type="bibr">(Hadzhiyska et al. 2021a</ref>). Other works have also shown indications that HOD models may need more secondary parameters to accurately recreate the complexity of observed galaxy clustering <ref type="bibr">(Hahn &amp; Villaescusa-Navarro 2021;</ref><ref type="bibr">Szewciw et al. 
2022)</ref>.</p><p>Machine learning enables the study of the galaxy-halo connection using nearly any type of galaxy property or feature, in particular those for which relationships are very hard to formulate or model (e.g., <ref type="bibr">de Santi et al. 2022;</ref><ref type="bibr">Jo et al. 2023;</ref><ref type="bibr">Shao et al. 2023c;</ref><ref type="bibr">Delgado et al. 2023;</ref><ref type="bibr">Rodrigues et al. 2023)</ref>. Machine learning also avoids some of the limitations of "classical" methods: it can notably find constraints with fewer samples or over a larger parameter space than approaches that use a Fisher formalism or a covariance matrix <ref type="bibr">(Alsing &amp; Wandelt 2019;</ref><ref type="bibr">Alsing et al. 2019;</ref><ref type="bibr">de Santi &amp; Abramo 2022)</ref>, and it can be applied to summary statistics for which theoretical descriptions and likelihoods do not exist <ref type="bibr">(Makinen et al. 2021)</ref>. Several works have leveraged machine learning to probe how galaxy clustering is influenced by cosmology and baryonic physics. <ref type="bibr">Aric&#242; et al. (2021)</ref> created a multidimensional neural network emulator of the "baryonification" of the nonlinear matter power spectrum atop the unique BACCO suite <ref type="bibr">(Angulo et al. 2021</ref>) that performs cosmological rescaling of N-body simulations. The emulator has been tuned to scales within 0.01 &lt; k &lt; 5 h cMpc -1 and 0 &lt; z &lt; 1.5, and yields 1%-2% accuracy when tested against several dozen hydrodynamic/N-body simulation pairs. Combined with the emulator of <ref type="bibr">Contreras et al. (2020)</ref> for the dark matter power spectrum from BACCO, it is expected to give predictions for the nonlinear matter power spectrum within 2%-4% accuracy. Additionally, <ref type="bibr">Xu et al. 
(2021)</ref> used machine learning to predict the HOD, real-space 3D correlation function clustering, and assembly bias for the SAM of <ref type="bibr">Guo et al. (2011)</ref>. <ref type="bibr">Ntampaka et al. (2020)</ref> developed a hybrid deep machine-learning-based technique for accurately measuring &#963; 8 and &#937; M (to within 3%-4%) from mock galaxy redshift surveys built atop AbacusCosmos <ref type="bibr">(Garrison et al. 2021</ref>) with a HOD. <ref type="bibr">Rodrigues et al. (2023)</ref> used neural networks to predict various galaxy properties (including clustering) in IllustrisTNG. Finally and notably, the SimBIG forward modeling framework <ref type="bibr">(Hahn et al. 2022</ref>, using the Quijote simulations and state-of-the-art supplemented HOD models) finds very precise constraints for &#937; M and &#963; 8 using simulation-based inference with normalizing flows and the power spectrum ($k_{\max} = 0.5\,h\,\mathrm{Mpc}^{-1}$) for the BOSS CMASS sample.</p><p>The primary deliverable of this work is the new CAMELS-SAM suite (publicly released in Villaescusa-Navarro et al. 2023, found at: <ref type="url">https://camels-sam.readthedocs.io</ref>). This work is also a proof-of-concept example of CAMELS-SAM's potential as a machine-learning data set for cosmology and astrophysics. In this work, we probe how well galaxy clustering and neural networks can together: (1) marginalize over the uncertainties in astrophysics to constrain cosmology and (2) also help us learn something about secondary effects of astrophysics on galaxy clustering. We also investigate how well neural networks constrain the astrophysics parameters that control stellar and AGN feedback with galaxy clustering. Our work encompasses vital first steps toward being able to marginalize over the uncertainties in our galaxy formation models and measure the underlying cosmology of our Universe. 
More specifically, the proof-of-concept work presented here with CAMELS-SAM makes several new contributions to the field, in particular in our use of neural networks for parameter inference:</p><p>1. We use a physics-based SAM to model galaxy formation, which innately provides predictions of galaxy clustering that agree well with hydrodynamic simulations and are more physically meaningful than HOD-based approaches. 2. With this SAM, we create an extensive suite of simulated galaxies in large dark-matter-only simulations across a very wide range of both cosmological and astrophysical parameter space, specifically aimed at providing training data for machine learning. 3. Beyond two-point clustering, our neural networks use the information on nonlinear and higher-order clustering within count-in-cells (CiC) and the less commonly used void probability function (VPF) statistics. The VPF and CiC incorporate information on nonlinear higher-order clustering across many distance scales (e.g., <ref type="bibr">Croton et al. 2004</ref>), but they have been difficult to model in physically logical ways (e.g., <ref type="bibr">Yang &amp; Saslaw 2011;</ref><ref type="bibr">Hurtado-Gil et al. 2017)</ref>. Additionally, theoretical likelihoods do not yet exist for the VPF or have only recently been probed for CiC <ref type="bibr">(Repp &amp; Szapudi 2020)</ref>, complicating "classical" approaches to cosmological constraints with them. 4. We probe smaller, more nonlinear scales than most cosmological inference works have attempted ($k_{\max} &lt; 8.5\,h\,\mathrm{cMpc}^{-1}$ for the two-point correlation function and $k_{\max} &lt; 5.85\,h\,\mathrm{cMpc}^{-1}$ for the VPF and CiC), where baryonic effects are expected to be important to galaxy clustering and constraints on cosmology are stronger. 5. Our neural networks are able to successfully marginalize over the large breadth of astrophysical models from the SAM, finding cosmological constraints with errors as low as 3%-5% with galaxy clustering. 6. 
While constraining cosmology, our neural networks are also able to learn the effects of individual astrophysical parameters in a physics-based model with galaxy clustering, a nontrivial result first seen here and made possible by the creation of CAMELS-SAM. In particular, we include astrophysical parameters in our parameter inference and leverage orthogonal information about cosmology they encode in galaxy clustering, rather than marginalizing over that information.</p><p>The layout of this paper is as follows. In Section 2, we describe the creation of the N-body simulations, the Santa Cruz SAM, and how we apply it to our simulations. In Section 3, we explain how we measure galaxy clustering and our implementation of neural networks to infer the input cosmological and astrophysical parameters. In Section 4, we explore how well our neural networks constrain the cosmological parameters &#937; M and &#963; 8 across various experiments, such as different galaxy selections, including or excluding a random downsampling to fixed number density, and using a single redshift or combining several. In Section 5, we explore how these experiments instead affect the SC-SAM feedback parameters. In Section 6, we focus particularly on how each of the clustering statistics we measure performs independently. We discuss the CAMELS-SAM suite through a "meta" lens in Section 7, comparing it to CAMELS and other simulation suites, and discussing the potential in our data release and our results. We conclude and summarize our results in Section 8. Our Appendices hold additional explanatory figures for various parts of the project.</p></div>
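<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the counts-in-cells and void probability statistics named in contribution 3 concrete, the following is a minimal sketch, not the measurement code used in this work. It assumes the standard definitions: the CiC distribution P N (R) is the fraction of randomly placed spheres of radius R that contain exactly N objects, and the VPF is the N = 0 case. The function names, the uniformly random stand-in catalog, and the sphere count are hypothetical choices made only for illustration.</p>
<p>
```python
import random

def counts_in_spheres(points, box, radius, n_spheres, rng):
    """Count points inside randomly placed spheres in a periodic cubic box."""
    r2 = radius * radius
    counts = []
    for _ in range(n_spheres):
        center = [rng.uniform(0.0, box) for _ in range(3)]
        n_inside = 0
        for p in points:
            d2 = 0.0
            for a, b in zip(p, center):
                d = abs(a - b)
                d = min(d, box - d)  # nearest periodic image along this axis
                d2 += d * d
            if r2 >= d2:
                n_inside += 1
        counts.append(n_inside)
    return counts

def cic_distribution(counts):
    """Return P_N: the fraction of spheres containing exactly N points."""
    total = len(counts)
    histogram = {}
    for n in counts:
        histogram[n] = histogram.get(n, 0) + 1
    return {n: c / total for n, c in sorted(histogram.items())}

# Hypothetical stand-in catalog: uniformly random points, NOT SC-SAM galaxies.
rng = random.Random(0)
box = 100.0  # box side, in the same length units as the radius
points = [[rng.uniform(0.0, box) for _ in range(3)] for _ in range(500)]

counts = counts_in_spheres(points, box, radius=5.0, n_spheres=400, rng=rng)
p_n = cic_distribution(counts)
vpf = p_n.get(0, 0.0)  # void probability function: P_0(R)
mean_count = sum(counts) / len(counts)
print(f"VPF(R=5) = {vpf:.3f}; mean count = {mean_count:.2f}")
```
</p>
<p>For an unclustered (Poisson) catalog such as this stand-in, the VPF should approach exp(-n V), where n is the number density and V the sphere volume; departures from that value in a real catalog carry the higher-order clustering information described above.</p></div>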
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">The New CAMELS-SAM Suite</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Specifications for the Simulations</head><p>The backbone of CAMELS-SAM consists of 1005 N-body simulations of volume (100 h -1 cMpc) 3 and N = 640 3 particles, covering the broad cosmological space of &#937; M = [0.1, 0.5] and &#963; 8 = [0.6, 1.0], and containing 100 snapshots between z = 20 and z = 0. The parameters &#937; M and &#963; 8 are central and strongly influential for large-scale structure <ref type="bibr">(Dodelson 2003)</ref>, and they are among the least well-constrained cosmological parameters. For example, galaxy clustering directly constrains the combined parameter $S_8 = \sigma_8 \sqrt{\Omega_{\rm M}/0.3} \approx 0.78 \pm 0.027$ <ref type="bibr">(Lange et al. 2023)</ref>, meaning constraints for &#937; M and &#963; 8 are muddled not just by uncertainties in the galaxy-halo connection, but also by degeneracies with each other.</p><p>We generated initial conditions with second-order Lagrangian perturbation theory starting at z = 127, and generated the linear power spectra with CAMB <ref type="bibr">(Lewis et al. 2000)</ref>. We specified a periodic box of volume (100 h -1 cMpc) 3 and with N = 640 3 dark matter particles. The N-body portion of CAMELS-SAM was run with a setup of AREPO similar to that which ran Illustris(TNG) <ref type="bibr">(Springel 2010;</ref><ref type="bibr">Vogelsberger et al. 2013;</ref><ref type="bibr">Genel et al. 2014)</ref>. The resulting particle masses range over roughly (1-5) &#215; 10 8 h -1 M &#8857; , and the gravitational softening length was fixed to 4 comoving kpc until z = 1, after which it was fixed to a maximum of 2 physical kpc. Apart from varying &#937; M and &#963; 8 for our work, we assume a standard flat &#923;CDM cosmology with &#937; b = 0.049,<ref type="foot">13</ref> h = 0.6711, n s = 0.9624, &#8721;m &#957; = 0.0 eV, and w = -1. We stored 100 snapshots between z = 20 and z = 0 in the same spacing as IllustrisTNG. 
<ref type="foot">14</ref> These 100 snapshots were run through ROCKSTAR<ref type="foot">15</ref> and CONSISTENTTREES<ref type="foot">16</ref> to identify dark matter halos and subhalos, and to obtain merger trees. Finally, each N-body simulation was run through multiple iterations of the SC-SAM, where we varied parameters for stellar and AGN feedback across a similarly broad hyperspace. Table <ref type="table">1</ref> summarizes all the parameters we vary in CAMELS-SAM, and we discuss the physical meaning of these parameters in more detail in the next subsection. The 1000+ resulting SAM catalogs each contain hundreds of thousands to millions of halos and galaxies at each of the 100 redshift snapshots. We refer readers to Sections 2.2-3.1 of <ref type="bibr">Gabrielpillai et al. (2022)</ref> for a fuller explanation of how ROCKSTAR and CONSISTENTTREES are used for the SC-SAM, as well as the full narrative of the SC-SAM astrophysics in the version used to create CAMELS-SAM.</p><note place="foot" n="13">Some readers may wonder why &#937; b was not varied in CAMELS or CAMELS-SAM, especially considering how sensitive galaxies are thought to be to the number of baryons in a halo. &#937; b is more strongly constrained by studies of the cosmic microwave background than &#937; M and &#963; 8 are. As derived in <ref type="bibr">Planck Collaboration et al. (2020)</ref>: &#937; b h 2 = 0.02233 &#177; 0.00015 (0.67%, dominated by h); &#937; M = 0.3146 &#177; 0.0074 (2.4%); &#963; 8 = 0.8101 &#177; 0.0061 (0.75%). However, the CAMELS team is currently creating another wing of L = 25 h -1 cMpc hydrodynamic simulations that vary &#937; b alongside &#937; M and &#963; 8 , to fully account for its effects.</note><note place="foot" n="14">See the complete list within "Description of Simulations" in the documentation: <ref type="url">https://camels-sam.readthedocs.io/en/main/simulations.html</ref>.</note><note place="foot" n="15"><ref type="bibr">Behroozi et al. (2013a)</ref>: <ref type="url">https://bitbucket.org/gfcstanford/rockstar</ref>.</note><note place="foot" n="16"><ref type="bibr">Behroozi et al. (2013b)</ref>: <ref type="url">https://bitbucket.org/pbehroozi/consistent-trees/src/main/</ref>.</note><p>Our implementation of the SC-SAM demands that "root" z = 0 halos have at least 100 dark matter particles, and it also prunes any parts of the tree that are less than 100 times the mass of a dark matter particle. This guarantees that all halos and the merger histories the SC-SAM uses are well resolved. Given our cosmological parameter range, these strong demands on the merger tree resolution translate to probing halos of M halo &gt; (1-5) &#215; 10 10 h -1 M &#8857; . Additionally, the CONSISTENTTREES merger trees are post-processed to exclude subhalo trees, as the SC-SAM models the evolution of subhalos internally. Central galaxies of M star &gt; 10 8 h -1 M &#8857; created by the SC-SAM have been shown to remain very similar throughout a large range of mass resolutions <ref type="bibr">(Gabrielpillai et al. 2022)</ref>. In practice, this means that key galaxy observables for SC-SAM galaxies remain converged beyond M star &gt; 10 8 h -1 M &#8857; and M halo &gt; 10 11 h -1 M &#8857; , as we discuss in Section 2.5 and show in Appendix A. Taken together, these specifications for the SC-SAM make us confident that the difference in each LH simulation's particle mass resolution does not artificially improve our cosmological constraints under our galaxy and halo selections.</p><p>Key CAMELS-SAM simulation data products, such as the halo catalogs, merger trees, and SC-SAM galaxy catalogs, have been publicly released alongside the CAMELS suites in <ref type="bibr">Villaescusa-Navarro et al. (2023)</ref>. We direct readers there for access details, and to the CAMELS-SAM documentation website for up-to-date information: <ref type="url">https://camels-sam.readthedocs.io</ref>.</p></div>
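<div xmlns="http://www.tei-c.org/ns/1.0"><p>The quoted particle masses and minimum halo masses follow from the box size, the particle number, and &#937; M . The short sketch below reproduces that arithmetic under stated assumptions: it takes the standard critical density of 2.775 &#215; 10 11 h 2 M &#8857; Mpc -3 , assigns all matter to the dark matter particles (as in a dark-matter-only run), and uses hypothetical function names that are not part of any CAMELS-SAM software.</p>
<p>
```python
# Critical density today in h^2 Msun / Mpc^3 (standard value; an assumption
# of this sketch rather than a number quoted in the text).
RHO_CRIT = 2.775e11

def particle_mass(omega_m, box_size=100.0, n_side=640):
    """Particle mass in h^-1 Msun for a dark-matter-only periodic box.

    box_size is the side length in h^-1 cMpc and n_side**3 is the number of
    particles, so the total matter mass in the box is shared equally.
    """
    total_mass = omega_m * RHO_CRIT * box_size ** 3
    return total_mass / n_side ** 3

def min_halo_mass(omega_m, n_min=100):
    """Mass of the smallest retained halo: n_min particles."""
    return n_min * particle_mass(omega_m)

for omega_m in (0.1, 0.3, 0.5):
    m_p = particle_mass(omega_m)
    m_min = min_halo_mass(omega_m)
    print(f"Omega_M = {omega_m:.1f}: m_p = {m_p:.2e}, M_min = {m_min:.2e} h^-1 Msun")
```
</p>
<p>At the edges of the sampled range, &#937; M = 0.1 and 0.5, this gives particle masses of roughly 1.1 &#215; 10 8 and 5.3 &#215; 10 8 h -1 M &#8857; , and 100-particle halo masses one hundred times larger, consistent with the ranges quoted above.</p></div>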
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">The Santa Cruz Semi-analytic Model in Context</head><p>Here, we briefly summarize the version of the Santa Cruz semi-analytic model for galaxy formation used in this work. The core SC-SAM is similar to that used in <ref type="bibr">Somerville et al. (2015)</ref>, and the small updates in the version used here are described in <ref type="bibr">Gabrielpillai et al. (2022)</ref>. We also direct readers to the descriptions in <ref type="bibr">Somerville et al. (2008)</ref> and <ref type="bibr">Porter et al. (2014)</ref>, as well as the recent applications to creating robust mock observations in <ref type="bibr">Somerville et al. (2021)</ref> and <ref type="bibr">Yung et al. (2019a)</ref>. Our pipeline to go from AREPO N-body simulations to SC-SAM galaxy catalogs is nearly identical to that described in <ref type="bibr">Gabrielpillai et al. (2022)</ref>, who probe how well the SC-SAM run upon the IllustrisTNG100-1 dark-matter-only simulation compares to the full hydrodynamic IllustrisTNG100-1 simulation.</p><p>Numerical simulations like IllustrisTNG and SIMBA explicitly solve equations of gravity, thermodynamics, (magneto-)hydrodynamics, etc., for discrete particles and fluid elements or cells representing dark matter, gas, stars, and black holes. However, they must adopt subgrid recipes for small-scale processes in galaxy formation that are neither resolved nor fully understood (e.g., stellar feedback). These simulations provide detailed information on the spatial distribution, kinematics, and composition of the baryonic component as galaxies evolve, which allows for rich and diverse science. For example, the hydrodynamic core of CAMELS was used to generate maps of neutral hydrogen in <ref type="bibr">Hassan et al. (2022)</ref> and to carry out detailed analyses of the circumgalactic medium in <ref type="bibr">Moser et al. (2022)</ref>. 
However, full hydrodynamic simulations are computationally expensive to run, often limited to either small volumes or low resolution. This makes it difficult to thoroughly explore the parameter space of the subgrid recipes (though even CAMELS is helping to address this in <ref type="bibr">Jo et al. 2023)</ref>.</p><p>SAMs instead work from a dark matter halo merger tree that represents how halos grow and merge in the context of a chosen &#923;CDM cosmology. Merger trees are either measured directly from N-body simulations, as we have done, or constructed from an extended Press-Schechter formalism (e.g., <ref type="bibr">Somerville &amp; Primack 1999;</ref><ref type="bibr">Yung et al. 2020)</ref>. Within this framework, SAMs compute how mass and metals move between different reservoirs-intergalactic medium (IGM), circumgalactic medium (CGM), interstellar medium (ISM), stars, etc.-using a set of coupled ordinary differential equations. Nearly all SAMs assume that the rate at which gas is accreted into the CGM is proportional to the growth rate of the dark matter halo. Nearly all SAMs adopt a cooling model that determines how rapidly gas in the CGM cools and accretes into the ISM, often based on the one presented in <ref type="bibr">White &amp; Frenk (1991)</ref>. The main differences between different SAMs lie in the specific equations and parameters that are adopted to describe star formation, stellar feedback, and black hole growth and AGN feedback. This approach offers flexibility; for example, the minimum resolution of an SC-SAM galaxy is not limited to the resolution of stellar mass particles. As noted previously, several of the key processes in SAMs contain adjustable parameters, similar to the parameters contained in the subgrid recipes in hydrodynamic simulations.</p><p>The SC-SAM is fairly typical in the general approach and functional forms used to describe these processes. 
For example, most SAMs specify the mass loading of stellar-driven winds using a power-law function similar to the SC-SAM's, described in the next section. The implementation of star formation in the SC-SAM is somewhat different from that in many other SAMs. The SC-SAM relies on partitioning gas into different phases (ionized, molecular, and atomic), and the adopted star formation relation is based only on the molecular gas density (rather than the total gas density). However, <ref type="bibr">Somerville et al. (2015)</ref> showed that this has a rather minor effect on most predictions of the SAM. The treatment of black hole growth and AGN feedback in the SC-SAM is also substantially different from the implementation in other SAMs, but this also turns out to have a rather minor effect on most galaxy properties. Models can also differ significantly in their treatment of satellite galaxies and orphans, but this should not strongly affect the relatively large scales considered in this work. There are many works comparing the predictions of different SAMs (e.g., <ref type="bibr">Lu et al. 2014;</ref><ref type="bibr">Somerville &amp; Dav&#233; 2015;</ref><ref type="bibr">Knebe et al. 2018)</ref>. The overall takeaway from these studies is that, in spite of the many differences in implementation and choice of physical recipes, most SAMs produce similar predictions for key quantities such as the stellar mass function, suggesting a similar stellar-to-halo mass relationship <ref type="bibr">(Somerville &amp; Dav&#233; 2015)</ref>. This implies that, to first order, these models will also make similar predictions for galaxy clustering. The predictions of SAMs for more "second-order" effects such as assembly bias have not yet been systematically studied.</p><p>Finally, the fiducial SC-SAM (calibrated to z = 0 observations) naturally predicts galaxy clustering consistent with both hydrodynamic simulations and galaxy observations across a wide span of redshifts. 
<ref type="bibr">Yung et al. (2022)</ref> robustly compared the SC-SAM predictions for future galaxy clustering observations in the process of creating forecasts for JWST. Their large light cones<ref type="foot">foot_3</ref>, populated with SC-SAM galaxies and galaxy observables computed in postprocessing, produced projected two-point correlation functions in good agreement with those measured by PRIMUS and DEEP-2 in GOODS-N for 0.2 &lt; z &lt; 1.2 galaxies, 1.25 &lt; z &lt; 4.5 galaxies in CANDELS, and 3.5 &lt; z &lt; 7 galaxies from <ref type="bibr">Harikane et al. (2016)</ref>. <ref type="bibr">Yung et al. (2023)</ref>, while making forecasts for the Roman Space Telescope, also compared their light cones' SC-SAM clustering to that from light cones made with the very different UniverseMachine <ref type="bibr">(Behroozi et al. 2020)</ref> and DREAM <ref type="bibr">(Drakos et al. 2022</ref>) models, and they found great agreement across all models for the angular correlation function of rest-UV selected galaxies at z &gt; 4. Additionally, <ref type="bibr">Hadzhiyska et al. (2021a)</ref> found that SC-SAM galaxies generated atop the IllustrisTNG dark matter merger trees yield two-point clustering statistics and galaxy assembly bias signatures that are, to first order, very similar to those of the full hydrodynamic IllustrisTNG galaxies. Given these results, one can likely trust the SC-SAM clustering predictions both in an idealized simulation space (as in our work) and in a more realistic observational space. Finally, we remind readers that the SC-SAM was not calibrated using galaxy clustering; these findings are predictions naturally resulting from the model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">The Santa Cruz Semi-analytic Model for CAMELS</head><p>To mimic the stellar feedback parameter variation in CAMELS, we focus on the SC-SAM parameters &#949; SN and &#945; rh , which control the mass outflow rate out of galaxies driven by supernovae and radiation from massive stars. Like other SAMs and numerical models, which expect winds to conserve energy and momentum in a galaxy, the SC-SAM assumes that the mass outflow rate due to stellar feedback scales with the depth of the galaxy's potential well. Specifically, the SC-SAM relates the mass outflow rate from stellar-driven winds to the SFR of the galaxy and the circular velocity of the halo:</p><formula notation="tex">\dot{m}_{\rm out} = \epsilon_{\rm SN} \left(\frac{V_0}{V_c}\right)^{\alpha_{\rm rh}} \dot{m}_{*}. \qquad (1)</formula><p>Here, &#7745; * is the SFR; V 0 is a normalization constant set to 200 km s -1 ; and V c is the maximum circular velocity of the galaxy's disk (assumed to be the maximum rotational velocity of its host dark matter halo). The parameters &#949; SN and &#945; rh are adjustable. This is quite similar in spirit to the treatment of kinetic stellar-driven winds in IllustrisTNG and SIMBA.</p><p>The SC-SAM, however, employs a somewhat different approach for implementing AGN feedback compared to the CAMELS TNG and SIMBA model variations. In the SAM, radiatively inefficient accretion onto black holes is assumed to cause heating of the hot halo gas via energetic radio jets. The rate of accretion onto the black hole from the hot halo is given by</p><formula notation="tex">\dot{m}_{\rm radio} = \kappa_{\rm radio} \left[\frac{kT}{\Lambda(T, Z_h)}\right] \left(\frac{M_{\rm BH}}{10^{8}\,M_\odot}\right). \qquad (2)</formula><p>Here, kT is the temperature of gas within the Bondi accretion radius (r A &#8801; 2GM BH /c s 2 ), and &#923;[T, Z h ] is the temperature- and metallicity-dependent cooling function <ref type="bibr">(Sutherland &amp; Dopita 1993)</ref>. This radio mode accretion heats the hot halo gas at a rate that is proportional to &#7745; radio , and it can partially or completely offset cooling and accretion into the ISM. Thus, we can control the strength of the feedback from the jet mode by varying the parameter &#954; radio . 
The AGN feedback mainly affects the most massive galaxies (e.g., Figure <ref type="figure">12</ref> in Appendix A).</p><p>In the SC-SAM, the default values for the above parameters are: &#949; SN = 1.7, &#945; rh = 3.0, and &#954; radio = 0.002. These values, and the other parameter values that go into the SAM, were selected by tuning the SAM "by hand" to reproduce a set of key observed relationships <ref type="bibr">(Somerville et al. 2008</ref><ref type="bibr">, 2015</ref><ref type="bibr">, 2021;</ref><ref type="bibr">Yung et al. 2019a)</ref>, such as the stellar mass function, cold gas fraction, mass-metallicity relation for stars, and black hole mass versus bulge mass relation <ref type="bibr">(Gallazzi et al. 2005;</ref><ref type="bibr">Kirby et al. 2011;</ref><ref type="bibr">Baldry et al. 2012;</ref><ref type="bibr">Bernardi et al. 2013;</ref><ref type="bibr">Moustakas et al. 2013;</ref><ref type="bibr">McConnell &amp; Ma 2013;</ref><ref type="bibr">Kormendy &amp; Ho 2013;</ref><ref type="bibr">Rodr&#237;guez-Puebla et al. 2017;</ref><ref type="bibr">Catinella et al. 2018;</ref><ref type="bibr">Calette et al. 2018)</ref>. We check that we reproduce these calibrations with our implementation of the SC-SAM within the CAMELS merger trees in Appendix A Figure <ref type="figure">9</ref>, finding excellent agreement with <ref type="bibr">Gabrielpillai et al. (2022)</ref>. The predictions of the SAM for these galaxy properties are quite insensitive to the resolution of the input dark matter merger trees <ref type="bibr">(Gabrielpillai et al. 2022 and Figure 9)</ref>.</p><p>In this work, as in the CAMELS hydrodynamic humps, we vary prefactors for each parameter over a fairly broad range. We assign the prefactor of &#949; SN as the multiplicative factor A SN1 , and vary it between [1/4, 4] with a default of unity. 
Similarly, the prefactor for &#954; radio is the multiplicative factor A AGN , and it varies between [1/4, 4] with a default of unity. Both A SN1 and A AGN are generated evenly in logarithmic space to compare the "order-of-magnitude" scale of effects. <ref type="foot">18</ref> The parameter &#945; rh appears in the exponent of a power law, and we therefore define an additive factor A SN2 . We vary it between [-2, 2] and generate it evenly in linear space with a default value of zero.</p><p>Though this work selects three influential parameters of broad galaxy properties in the SC-SAM to mimic what CAMELS has done, there is ongoing work in both CAMELS and CAMELS-SAM to explore the full range of astrophysical parameters in their given galaxy models. The CAMELS team has begun creating an expanded TNG wing where all 28 parameters of the IllustrisTNG model are varied over 1024 simulations using a Sobol sampling sequence <ref type="bibr">(Sobol 1967)</ref>. The CAMELS-SAM team is beginning an exploration of all 15 or so adjustable parameters in the SC-SAM atop our existing merger trees. These data sets will offer fascinating explorations of the full power of these models for galaxy formation, as well as push the boundaries of parameter inference from moderately sized data sets.</p><p>Table <ref type="table">1</ref> gives a summary of our chosen CAMELS-SAM scaling parameters and their ranges. They appear within Equations (1) and (2) in the following manner:</p><formula notation="tex">\dot{m}_{\rm out} = A_{\rm SN1}\,\epsilon_{\rm SN} \left(\frac{V_0}{V_c}\right)^{\alpha_{\rm rh} + A_{\rm SN2}} \dot{m}_{*}, \qquad (3)</formula><formula notation="tex">\dot{m}_{\rm radio} = A_{\rm AGN}\,\kappa_{\rm radio} \left[\frac{kT}{\Lambda(T, Z_h)}\right] \left(\frac{M_{\rm BH}}{10^{8}\,M_\odot}\right). \qquad (4)</formula></div>
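<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of how the three prefactors act on the fiducial values quoted above, the following sketch evaluates the wind mass loading (the ratio of outflow rate to SFR) and the scaled radio-mode efficiency. It assumes the power-law form in which A SN1 multiplies &#949; SN , A SN2 is added to &#945; rh , and A AGN multiplies &#954; radio ; the function and constant names are chosen here for illustration and are not taken from the SC-SAM code.</p><p><![CDATA[
```python
# Fiducial SC-SAM values quoted in the text.
EPS_SN = 1.7         # epsilon_SN, wind normalization
ALPHA_RH = 3.0       # alpha_rh, wind power-law slope
KAPPA_RADIO = 0.002  # kappa_radio, radio-mode efficiency
V0_KMS = 200.0       # V_0, normalization velocity in km/s


def mass_loading(v_c_kms, a_sn1=1.0, a_sn2=0.0):
    """Return mdot_out / mdot_star for a disk with circular velocity v_c_kms.

    Assumed form: eps_SN * A_SN1 * (V_0 / V_c) ** (alpha_rh + A_SN2).
    """
    return EPS_SN * a_sn1 * (V0_KMS / v_c_kms) ** (ALPHA_RH + a_sn2)


def scaled_kappa_radio(a_agn=1.0):
    """Return the radio-mode efficiency after applying the A_AGN prefactor."""
    return KAPPA_RADIO * a_agn


if __name__ == "__main__":
    # Fiducial model plus the four stellar-feedback extremes of the 1P set.
    for a_sn1, a_sn2 in [(1.0, 0.0), (0.25, 0.0), (4.0, 0.0), (1.0, -2.0), (1.0, 2.0)]:
        eta = mass_loading(100.0, a_sn1, a_sn2)
        print(f"A_SN1={a_sn1:5.2f} A_SN2={a_sn2:+.1f} mass loading at 100 km/s: {eta:.3g}")
```
]]></p><p>Because A SN2 shifts the exponent, its effect depends on V c : under this assumed form it leaves the mass loading unchanged at V c = V 0 and changes it most strongly in low-velocity systems, whereas A SN1 rescales all galaxies by the same factor.</p></div>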
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">LH, CV, and 1P Simulation Sets</head><p>Table <ref type="table">2</ref> describes all products within CAMELS-SAM as well as their significance and use. The core "Latin hypercube" (LH) set consists of 1000 simulations, each with different values of &#937; M , &#963; 8 , A SN1 , A SN2 , and A AGN . We first generated 1000 N-body simulations with AREPO over a Latin hypercube <ref type="foot">19</ref> of &#937; M = [0.1, 0.5] and &#963; 8 = [0.6, 1.0] (yielding individual dark matter particle masses of approximately (1.3-6) &#215; 10 8 h -1 M &#8857; ). The random phases of the initial conditions in the N-body simulations are allowed to vary. The parameters in the Latin hypercube are randomly generated: linearly across &#937; M = [0.1, 0.5], &#963; 8 = [0.6, 1.0], and A SN2 = [-2, +2]; and logarithmically across both A SN1 and A AGN = [0.25, 4.0]. These SC-SAM prefactor parameters are described in Section 2.3 and Table <ref type="table">1</ref>.</p><p>In addition to the full CAMELS-SAM LH suite of 1000 simulations, we also created a small set of five "cosmic variance" (CV) simulations, run with different random seeds for the initial conditions while using the same fiducial parameters of {&#937; M , &#963; 8 , A SN1 , A SN2 , A AGN } = {0.3, 0.8, 1, 0, 1}. The CV set allows us to evaluate how cosmic variance in our (100 h -1 Mpc) 3 volumes may affect clustering statistics.</p><p>Atop the first two CV N-body simulations, CV_0 and CV_1, we also created 12 min-max "one-parameter" (1P) simulations to serve a similar purpose as in the hydrodynamic CAMELS suites <ref type="bibr">(Villaescusa-Navarro et al. 2023)</ref>. The 1P SC-SAM galaxy catalogs cover the minimum and maximum prefactor parameter values, or {A SN1 } = {0.25, 4.0}, {A AGN } = {0.25, 4.0}, and {A SN2 } = {-2, 2}. These simulations are useful to investigate each parameter's effect on our clustering statistics. 
For example, in Appendix A (Figures <ref type="figure">10</ref>, <ref type="figure">11</ref>, and <ref type="figure">12</ref>), we examine how the extreme ends of these parameters affect key galaxy summary statistics, including the stellar mass function and the ratio of cold gas fraction to stellar mass in galaxies.</p><p>We additionally created five "extreme" simulations with the same resolution but eight times larger volume. These simulations have box side length L = 205 h -1 cMpc and N = 1280 3 particles. One was created with the fiducial cosmology of {&#937; M , &#963; 8 } = {0.3, 0.8}, and the other four were created at the corners of the full cosmological parameter space of &#937; M = [0.1, 0.5] and &#963; 8 = [0.6, 1.0]. We ran the SC-SAM on each of these "extreme" simulations using the fiducial parameters that best recreate z = 0 observations. This allowed us to confirm that our selected galaxy clustering statistics show the expected influence of cosmological parameter variations. <ref type="foot">20</ref></p></div>
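<div xmlns="http://www.tei-c.org/ns/1.0"><p>The sampling scheme described in this section can be sketched in a few lines. The example below is a minimal, standard-library Latin hypercube over the five LH parameters, with linear scaling for &#937; M , &#963; 8 , and A SN2 and logarithmic scaling for A SN1 and A AGN . It is an illustration under stated assumptions, not the generator used to build CAMELS-SAM; the function names and the seed are hypothetical.</p><p><![CDATA[
```python
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in the unit cube, one per stratum along each axis."""
    columns = []
    for _ in range(n_dims):
        # Draw one point inside each of the n_samples equal-width strata...
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # ...then shuffle so strata are paired randomly across dimensions.
        rng.shuffle(strata)
        columns.append(strata)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def scale_linear(u, lo, hi):
    """Map u in [0, 1) onto [lo, hi) evenly in linear space."""
    return lo + u * (hi - lo)


def scale_log(u, lo, hi):
    """Map u in [0, 1) onto [lo, hi) evenly in logarithmic space."""
    return lo * (hi / lo) ** u


def sample_parameters(n_samples=1000, seed=0):
    """Draw LH parameter sets over the ranges quoted in the text."""
    rng = random.Random(seed)
    samples = []
    for u in latin_hypercube(n_samples, 5, rng):
        samples.append({
            "Omega_M": scale_linear(u[0], 0.1, 0.5),
            "sigma_8": scale_linear(u[1], 0.6, 1.0),
            "A_SN1": scale_log(u[2], 0.25, 4.0),
            "A_SN2": scale_linear(u[3], -2.0, 2.0),
            "A_AGN": scale_log(u[4], 0.25, 4.0),
        })
    return samples


if __name__ == "__main__":
    for params in sample_parameters(n_samples=5, seed=42):
        print({name: round(value, 3) for name, value in params.items()})
```
]]></p><p>Each one-dimensional projection of the sample contains exactly one point per stratum, which is what lets 1000 simulations cover a five-dimensional space without leaving large gaps along any single parameter.</p></div>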
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.">Verifying the Simulations</head><p>As a confirmation of our N-body volumes and the ROCKSTAR products, we first examine the behavior of the halo mass functions (HMFs). Figure <ref type="figure">1</ref> shows z = 0 and z = 2 HMFs for the five CV simulations. We compare the CV set against the HMFs of the "extreme" simulations, four of which exist at the extreme corners of our cosmological parameter space and one of which is at the same fiducial cosmology as the CV set. These simulations show well-converged halo statistics for our smaller fiducial volume and illustrate the broad range of conditions probed by CAMELS-SAM. We also note that the relevant HMFs of the IllustrisTNG300-1 and -2 simulations are very consistent with the "fiducial" cosmology Ex4 simulation and all the CV simulations (as confirmed in Appendix A).</p><p>The SC-SAM has been tested using the largest and highest-resolution IllustrisTNG simulations, with box side length of L = 205 h -1 cMpc and 2500 3 particles <ref type="bibr">(Gabrielpillai et al. 2022)</ref>. We confirmed the SC-SAM would still robustly match key observational statistics for high-mass halos at our lower resolution, similar to IllustrisTNG300-2 (of L = 205 h -1 cMpc and 1250 3 particles), with all parameters held to the IllustrisTNG cosmology and the best-fit SC-SAM parameters from <ref type="bibr">Somerville et al. (2021)</ref>.</p><p>In Appendix A, we confirm that our suite setup recreates key observed summary statistics under the fiducial model. We compare the result of our CV simulations to the SAM outputs of the two highest-resolution IllustrisTNG300 volumes as well as various near-Universe observations for: the stellar mass function, the stellar mass-halo mass, the cold gas fraction versus stellar mass of disk-dominated galaxies, stellar metallicity-stellar mass, and black hole mass-bulge mass relationships. The consistency with the larger IllustrisTNG300 SAM catalogs and the overall SC-SAM agreement with z = 0 observations support the choice of volume and resolution for our CAMELS-SAM suite. In Appendix A, we also probe these relationships for the "1P" galaxy catalogs, to further understand how each of the SC-SAM parameters we vary affects astrophysical summary statistics.</p><note place="foot" n="19">See <ref type="bibr">Santner et al. (2003)</ref> and <ref type="bibr">Fang et al. (2005)</ref> for review texts about this method, which originally was developed in ancient Rome to optimize agriculture, but allows for inference with sparse coverage of a high-dimensional parameter space. There has also been recent innovation on further reducing the number of instances needed for parameter space sampling in cosmological contexts <ref type="bibr">(Rogers et al. 2019)</ref>, or leveraging the Latin hypercube for innovations in astrophysical computation <ref type="bibr">(Albers et al. 2019)</ref>.</note><note place="foot" n="20">At the time of writing, the "extreme"(-ly large) simulations are not released with the rest of CAMELS-SAM.</note><p>Table 2 (fragment): 12; Additional galaxy catalogs. Ex: "Ex"-treme cosmology and volume simulations of (205 h -1 cMpc) 3 and (1240) 3 at &#937; M = {0.1, 0.5, 0.5, 0.1, 0.3} and &#963; 8 = {0.6, 1.0, 0.6, 1.0, 0.8}; 5; Not currently shared; ROCKSTAR catalogs used in Figure <ref type="figure">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head><p>In this section, we describe our methodology for constraining cosmology and astrophysics using clustering summary statistics and neural networks. In Section 3.1, we describe the clustering statistics we test in this work, how they were measured, and how they are prepared for the neural network pipeline. In Section 3.2, we describe our neural network architecture and process.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Galaxy Clustering</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.1.">Introduction to Clustering Statistics</head><p>The spatial distribution of galaxies traces the structure of underlying dark matter and carries signatures of both the cosmology and the details of how galaxies interact with their environment and each other. There are many ways to measure the clustering of galaxies, each with unique strengths, uses, theoretical foundations, and connections to other physical concepts. In this proof-of-concept work, we use the void probability function (VPF), count-in-cells (CiC) function, and (real-space) two-point correlation function (2ptCF).</p><p>The widely used 2ptCF quantifies the probability of finding two galaxies within a certain distance from each other (compared to a random distribution; e.g., <ref type="bibr">Peebles 1980;</ref><ref type="bibr">Landy &amp; Szalay 1993)</ref>. CiC quantifies the number of galaxies within a randomly placed cell of a given size and theoretically includes all higher-order n-point clustering statistics <ref type="bibr">(Peebles 1980</ref>), but it is the most computationally costly of these three statistics. The VPF is a less commonly used clustering statistic that simply asks: how likely is a sphere of a given size to contain zero objects for a given galaxy selection criterion? The VPF is simple and efficient to calculate, is tied to all higher-order correlation functions as the zeroth moment of count-in-cells statistics, and encodes information from higher-order clustering that is not captured in the 2ptCF <ref type="bibr">(White 1979;</ref><ref type="bibr">Conroy et al. 2005;</ref><ref type="bibr">Perez et al. 2021)</ref>. <ref type="foot">21</ref> These clustering statistics are known to be powerful tools for constraining cosmology in observations and N-body simulations. CiC has been prominent and promising in several recent works constraining cosmology. <ref type="bibr">Uhlemann et al. 
(2020)</ref> note that the inclusion of CiC improved their constraints on &#937; M and &#963; 8 by factors of five and two, respectively, compared to the matter power spectrum alone, and it also broke the degeneracy between massive neutrino mass and &#963; 8 . <ref type="bibr">Salvador et al. (2019)</ref> also probed how well CiC can constrain linear and higher-order galaxy bias, finding its constraints are consistent with measurements of the bias from galaxy-galaxy clustering, galaxy-galaxy lensing, cosmic microwave background lensing, and shear-clustering measurements. <note type="caption">Figure <ref type="figure">1</ref> (caption fragment): HMFs of the (205 h -1 cMpc) 3 , N = 1280 3 "extreme" cosmology volumes, where Ex{0,1,2,3,4} have &#937; M = {0.1, 0.5, 0.5, 0.1, 0.3} and &#963; 8 = {0.6, 1.0, 0.6, 1.0, 0.8}, are shown with {purple, blue, cyan, orange, red} lines.</note> <note place="foot" n="21">We emphasize here the VPF is separate from work with cosmic voids, which are large underdense regions in the cosmic web that require large detailed sky surveys to map and catalog. The VPF is a simple statistical tool that counts empty circles/spheres, whereas cosmic voids can have complex shapes and contain rare and interesting galaxies (e.g., <ref type="bibr">Habouzit et al. 2020)</ref>. Cosmic voids are being successfully used as alternate cosmological probes <ref type="bibr">(Pisani et al. 2015;</ref><ref type="bibr">Hamaus et al. 2016;</ref><ref type="bibr">Zhang et al. 2020)</ref>.</note> Excitingly, <ref type="bibr">Repp &amp; Szapudi (2020)</ref> developed a theoretical description for the CiC distribution (enabling analyses requiring understanding of the variance and covariance). They found that CiC breaks the degeneracy between &#963; 8 and the bias parameter, and it yields an 11% error on their &#963; 8 measurement for the SDSS Main Galaxy Sample. <ref type="bibr">Dantas (2021)</ref> identified that the presence of baryons in IllustrisTNG affects the best-fitting theoretical model for CiC. Finally, <ref type="bibr">Wen et al. 
(2020)</ref> developed a new technique with the CiC PDF to probe different cosmological models with dark energy, focusing on dark matter halos in the DEUS simulations <ref type="bibr">(Reverdy et al. 2015)</ref> at 0 &lt; z &lt; 4 and scales 2-25 h -1 cMpc. Additionally, more uncommon types of the 2ptCF have recently been found to help constrain dark energy observables <ref type="bibr">(Zhai et al. 2019a</ref><ref type="bibr">, 2019b)</ref>. <ref type="bibr">Van Daalen et al. (2016)</ref> also used the projected 2ptCF, creating a clustering estimator that improved constraints on astrophysics within SAMs (specifically, the Munich SAM of <ref type="bibr">Guo et al. 2013)</ref>. <ref type="bibr">Wang et al. (2019)</ref> used the VPF and CiC in conjunction with the projected 2ptCF to help constrain galaxy assembly bias using halo occupation distributions, and <ref type="bibr">McCullagh et al. (2017)</ref> and Walsh &amp; Tinker (2019) used the VPF to refine generalized HOD fitting. Our work is among the first to use the VPF in the way the 2ptCF and CiC have been used, and to systematically compare the constraints that can be obtained from all three statistics. Part of the VPF's uncommon use can be attributed to how difficult it is to model for a nonrandom galaxy distribution even for a narrow fiducial &#923;CDM cosmology, often excluding it from "traditional" frameworks of cosmological parameter inference from galaxy clustering.</p></div>
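<div xmlns="http://www.tei-c.org/ns/1.0"><p>For reference, the link between the VPF and the hierarchy of correlation functions mentioned above can be written compactly. In the standard form associated with White (1979), with <formula notation="tex">\bar{N} = \bar{n}\,V(R)</formula> the mean number of galaxies in a sphere of radius R and <formula notation="tex">\bar{\xi}_p</formula> the volume-averaged p-point correlation function, the VPF is</p><formula notation="tex">P_0(R) = \exp\left[\sum_{p=1}^{\infty} \frac{(-\bar{N})^{p}}{p!}\,\bar{\xi}_p(R)\right], \qquad \bar{\xi}_1 \equiv 1,</formula><p>so an unclustered (Poisson) sample reduces to <formula notation="tex">P_0 = e^{-\bar{N}}</formula>, while the p = 2 term carries the 2ptCF contribution and all terms with p of three or more carry the higher-order clustering. The symbols here are generic notation introduced for this illustration rather than definitions taken from the paper.</p></div>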
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.2.">Measuring Clustering in CAMELS-SAM</head><p>We measure all clustering statistics between 1 and 40 cMpc (0.6711-26.84 h -1 cMpc) using the CORRFUNC package <ref type="bibr">(Sinha &amp; Garrison 2020)</ref>. We restrict our analysis to distances larger than 1 cMpc, due to known inaccuracies of the assignment of satellite positions in the SC-SAM relative to hydro simulations <ref type="bibr">(Hadzhiyska et al. 2021a)</ref>. This is because the SC-SAM does not use the subhalo positions within their host halo provided by the N-body simulation; rather, it treats subhalo merging and tidal destruction using a semi-analytic recipe (see Section 2.2.1 in <ref type="bibr">Somerville et al. 2021)</ref>. Moreover, it is expected that the presence of baryons will affect the orbits of satellites and the efficiency of their tidal heating and destruction, in a manner not currently modeled in the SC-SAM <ref type="bibr">(Jiang et al. 2021)</ref>. Our analysis still probes smaller nonlinear scales than most similar studies, reaching the equivalent of k max &lt; 5.85 h cMpc -1 (VPF/CiC) or 8.5 h cMpc -1 (2ptCF), and it leverages additional information from nonlinear terms of galaxy bias.</p><p>For the 2ptCF, we generate 20 distance scales evenly in logarithmic space between 10 0 and 10 1.6 cMpc, yielding 19 distance bins centered between 1.1 and 36.1 cMpc. With CORRFUNC, we measure the 2ptCF in 3D real space using the <ref type="bibr">Landy &amp; Szalay (1993)</ref> estimator,</p><formula notation="tex">\xi(R) = \frac{DD - 2\,DR + RR}{RR},</formula><p>where DD is the number of data-data pairs in the bin that encloses the distance scale R, DR represents data-random pairs, and RR is random-random pairs. For the 2ptCF, we use 100 times as many random points as there are galaxies.</p><p>For the VPF and CiC, we use CORRFUNC to perform the calculation for 25 distance scales with a maximum of 40 cMpc, yielding 25 linearly spaced radii between 1.6 and 40 cMpc. 
We measure the VPF by randomly dropping 100,000 or 500,000 spheres of each tested distance scale (for galaxy densities of 0.001 or 0.005 h 3 cMpc -3 , respectively), and counting those with no galaxies (see <ref type="bibr">Conroy et al. 2005</ref> for a discussion on the effects of the number of dropped spheres). For CiC, we also randomly drop 100,000 or 500,000 spheres of each tested distance scale and count how many have 0, 1, 2, ..., n = 500 galaxies inside them. Measuring to n = 500 captures nearly the entire CiC distribution at these densities over almost all scales. The VPF is additionally calculated and verified<ref type="foot">foot_5</ref> using the swift k-nearest neighbor method of <ref type="bibr">Banerjee &amp; Abel (2021)</ref>. Future expansions to this work would do well to include higher-order nearest neighbor statistics, shown to give additional cosmological constraints by <ref type="bibr">Banerjee &amp; Abel (2021)</ref>.</p><p>Figure <ref type="figure">2</ref> displays these clustering measurements for 20 of the LH galaxy catalogs, under two types of "selections" that vary the SC-SAM galaxy property, the threshold value, and number density downsampling (described in the next section). For the 20 simulations shown (LH_630 -649), the span of covered parameters is <ref type="bibr">[0.26, 3.84]</ref>, and therefore qualitatively representative of our parameter space. We also measured the clustering of the CV catalogs to confirm that the change in clustering caused by varying the parameters exceeded that caused by cosmic variance.</p></div>
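<div xmlns="http://www.tei-c.org/ns/1.0"><p>The sphere-dropping procedure described in this section can be sketched as follows. This is a brute-force, standard-library illustration of the idea (random sphere centers in a periodic box, a count per sphere, then the CiC histogram and its zero bin as the VPF); it is not the CORRFUNC implementation used for the measurements, and the function names, box size, and sample sizes are hypothetical choices made for the example.</p><p><![CDATA[
```python
import random


def counts_in_spheres(points, box, radius, n_spheres, rng):
    """Count points inside each of n_spheres randomly placed spheres (periodic box)."""
    r2 = radius * radius
    counts = []
    for _ in range(n_spheres):
        cx, cy, cz = (rng.uniform(0.0, box) for _ in range(3))
        n_inside = 0
        for x, y, z in points:
            # Minimum-image separation along each axis of the periodic box.
            dx = abs(x - cx)
            dx = min(dx, box - dx)
            dy = abs(y - cy)
            dy = min(dy, box - dy)
            dz = abs(z - cz)
            dz = min(dz, box - dz)
            if r2 >= dx * dx + dy * dy + dz * dz:
                n_inside += 1
        counts.append(n_inside)
    return counts


def cic_distribution(counts, n_max):
    """Return P_N for N = 0..n_max, the fraction of spheres holding exactly N points."""
    hist = [0] * (n_max + 1)
    for c in counts:
        if n_max >= c:
            hist[c] += 1
    return [h / len(counts) for h in hist]


def vpf(counts):
    """Return the void probability function: the fraction of empty spheres."""
    return sum(1 for c in counts if c == 0) / len(counts)


if __name__ == "__main__":
    rng = random.Random(0)
    box = 100.0  # illustrative box side; the units are arbitrary here
    galaxies = [tuple(rng.uniform(0.0, box) for _ in range(3)) for _ in range(1000)]
    for radius in (2.0, 5.0, 10.0):
        counts = counts_in_spheres(galaxies, box, radius, 500, rng)
        print(f"R = {radius:5.1f}  VPF = {vpf(counts):.3f}")
```
]]></p><p>The double loop costs a number of operations equal to the product of sphere and galaxy counts, which is why production codes rely on spatial gridding or tree structures, and why a fixed, modest number density keeps the CiC measurement affordable.</p></div>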
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.3.">Context: Galaxy Selections within CAMELS-SAM</head><p>Before measuring any clustering statistic, we apply a selection across CAMELS-SAM to yield a coherent sample of galaxies across all cosmologies and feedback parameters. In this initial proof-of-concept study, galaxies are selected by</p><list rend="numbered"><item>Halo mass, usually log 10 (M halo /M &#8857; ) &gt; 10-12.</item><item>Stellar mass, usually log 10 (M stellar /M &#8857; ) &gt; 9-10.</item><item>Instantaneous star formation rate (SFR), M &#8857; yr -1 .</item><item>Specific star formation rate (sSFR = SFR/M stellar ), Gyr -1 .</item></list><p>These properties are predicted directly by the SC-SAM and are relatively easy to compare directly to other simulations. Although they are not directly observable, there is extensive work in the literature to estimate stellar masses and SFR from observed samples using various methods (e.g., see reviews such as <ref type="bibr">Blanton &amp; Moustakas 2009;</ref><ref type="bibr">Madau &amp; Dickinson 2014;</ref><ref type="bibr">Somerville &amp; Dav&#233; 2015)</ref>. For example, the stellar mass of galaxies is often measured with spectral energy distribution fitting to broad- or medium-band photometry <ref type="bibr">(Walcher et al. 2011;</ref><ref type="bibr">Barro et al. 2013;</ref><ref type="bibr">Conroy 2013;</ref><ref type="bibr">Duncan et al. 2014;</ref><ref type="bibr">Mobasher et al. 2015)</ref>, or spectra (e.g., <ref type="bibr">Brinchmann et al. 2004;</ref><ref type="bibr">Brammer et al. 2011)</ref>. Star formation rate can be measured with flux in several different emission lines or bands targeting different sources or tracers of star formation, though many are sensitive to dust or contamination and all carry the uncertainties of assumed timescales (e.g., <ref type="bibr">Calzetti 2013;</ref><ref type="bibr">Ellis 2008)</ref>. 
Specific SFR gives deep insight into the process of galaxy evolution over cosmic history (e.g., Figure <ref type="figure">11</ref> in <ref type="bibr">Speagle et al. 2014)</ref>, and is sometimes measured by proxy with emission lines or with careful analysis of the stellar mass function (e.g., <ref type="bibr">Davidzon et al. 2018)</ref>. Finally, the clear bimodal nature of galaxy colors (and the separation between star-forming and quiescent galaxies) is easily seen in the sSFR versus stellar mass plane (e.g., <ref type="bibr">Muzzin et al. 2013)</ref>.</p><p>Ultimately, a natural end goal of this work would be to train a neural network capable of measuring the cosmology from observed galaxy clustering. This is an ambitious goal that requires much future work, such as identifying a galaxy sample that CAMELS-SAM is large enough to simulate, the generation of such realistic galaxies and/or their clustering over each CAMELS-SAM LH simulation, and a careful understanding of the observation's selection function and systematic errors. In Section 7.4, we expand on the future work needed to apply neural networks like those we present in the next section. For this initial proof-of-concept work, however, we apply straightforward and idealized selections on fundamental galaxy properties.</p><p>Our simplistic selections on stellar mass, SFR, and sSFR are consistent with the results of past studies and upcoming surveys. For example, several studies have implemented or </p><p>Our probed SFR and stellar mass cuts are achievable for future surveys; for example, SFR &gt; 0.1 M &#8857; yr -1 will likely be observable within z &lt; 1 for wide Roman and Euclid surveys, assuming surveys to (dust-corrected) limiting magnitudes of 27 in Roman's WFI F062 filter at 0.62 &#956;m and Euclid's VIS instrument (from the <ref type="bibr">Yung et al. (2023)</ref> Roman light cones created with the SC-SAM, and forecasts similar to those in <ref type="bibr">Yung et al. 
(2019a</ref><ref type="bibr">, 2019b)</ref>). Our M star &gt; 10 9 M &#8857; selection should similarly be widely detected with Roman within z &lt; 1, and to a similar extent with Euclid (though more easily closer to z &#8764; 1; stellar masses above 10 9 M &#8857; will likely correspond to the brightest Euclid galaxies at z &lt; 0.1). Additionally, our higher stellar mass selections complement the limiting stellar masses derived for 14,000 deg 2 across various Legacy Surveys serving the DESI project: M star &gt; 10 9.5 M &#8857; at z &#8764; 0.1, M star &gt; 10 10.5-11 M &#8857; at z &gt; 0.5 <ref type="bibr">(Zou et al. 2019</ref>). Finally, these selections are consistent with what the Vera Rubin Observatory will measure with LSST: M star &gt; 10 9.5 M &#8857; at z &lt; 0.5, M star &gt; 10 10 M &#8857; at 0.5 &lt; z &lt; 1.0 <ref type="bibr">(Riccio et al. 2021, Figure 6)</ref>.</p><note type="caption">Figure <ref type="figure">2</ref> (caption): Clustering statistics for CAMELS-SAM simulations 630 through 649 (each in a unique color). We show the VPF (left), CiC at 27.2 cMpc (center), and real-space 3D 2ptCF (right), for two different selections of z = 0 galaxies (top: stellar mass, bottom: SFR). The shaded black area in the top figure indicates the clustering across five simulations using the "fiducial" cosmological and SAM parameters but generated with different random seeds, to show the effect of cosmic variance for these volumes.</note><p>When finalizing the selection criteria for our chosen galaxy or halo property, we confirm each criterion will be met by enough CAMELS-SAM simulations to obtain robust statistics across most of the 1000 LH simulations. As we later discuss in Section 3.2.1, we split up our 1000 CAMELS-SAM LH simulations into roughly 70%/15%/15% training, validation, and testing sets, respectively. 
However, not all simulations yield a large enough sample to compute the clustering statistics with all selection criteria, and we choose to prioritize having at least 700 simulations in the training set while still keeping moderately large validation and testing sets.</p><p>For all halo or galaxy selections we present in this work, enough simulations meet the criteria to guarantee training sets of at least 700 samples and validation and testing sets of at least 80-100 each. This range is still large enough to sample the parameter space well, while allowing the flexibility to use selection criteria that pick out rarer objects, potentially revealing unique relationships between the parameters and galaxy clustering. The simulations that do not meet the selection criteria tend to have little structure formation-very small &#937; M and &#963; 8 -meaning they cluster in a specific corner of the parameter space. CAMELS and CAMELS-SAM intentionally cover such a large parameter space in order to avoid being affected by the distribution priors in the central, more realistic region of the parameter space. Therefore, missing a single corner will still yield robust constraints near the values of the cosmological parameters that are favored by observational constraints.</p><p>The final aspect of our selection is whether or not we normalize to a specific number density of objects. Number density has a strong effect on the CiC and the VPF, and some stellar mass or SFR selections yield samples with very large numbers of objects, creating a computational bottleneck. We address this in this work by randomly "downsampling" the objects that pass a given selection to either 0.001 h 3 cMpc -3 or 0.005 h 3 cMpc -3 , corresponding to 1000 or 5000 galaxies in our (100 h -1 cMpc) 3 volumes. These densities are additionally large enough to mitigate Poisson noise while producing samples small enough to calculate the clustering statistics within a reasonable computational cost. 
In Section 4.2, we examine how our constraints on cosmology and SAM astrophysics change if we do not randomly downsample to a fixed number density after applying our mass- or SFR-based selections.</p><p>Finally, as a contextual example to compare against our selections, the <ref type="bibr">Springel et al. (2018)</ref> clustering measurements of IllustrisTNG300 found thresholds for these selections at these densities: for a galaxy space density of 0.001 h 3 cMpc -3 , log 10 M stellar &gt; 10.18 h -1 M &#8857; , SFR &gt; 1.55 M &#8857; yr -1 , sSFR &gt; 1.27 h Gyr -1 . For a galaxy space density of 0.003 h 3 cMpc -3 , log 10 M stellar &gt; 10.49 h -1 M &#8857; , SFR &gt; 3.03 M &#8857; yr -1 , sSFR &gt; 4.43 h Gyr -1 . In observations, our chosen densities mimic the galaxy number densities many studies have measured or are expected to measure. Examples near 10 -3 to 10 -4 h 3 cMpc -3 include: SDSS red/blue galaxies with predicted M halo &#8764; 10 12.7 h -1 M &#8857; <ref type="bibr">(Zehavi et al. 2005, Table 4</ref>); SDSS-IV BOSS for ELGs with measured M star &#8764; 10 10.5 h -1 M &#8857; <ref type="bibr">(Raichoor et al. 2017, Figure 11)</ref>; projected Euclid galaxies within z &lt; 1.5 <ref type="bibr">(Amendola et al. 2018, Table 3</ref>); and a "dense" sample of red sequence galaxies in the Kilo Degree Survey with M stellar &#8764; 10 10.8 h -1 M &#8857; <ref type="bibr">(Vakili et al. 2023</ref>). However, we note these samples were not randomly downsampled and are often flux- or volume-limited.</p><p>In simulations and SAMs, the exact mapping of a given selection in stellar mass or SFR to the resulting galaxy number density is strongly model-dependent and can be at odds with other observational calibrations. This often leads to a different approach to fixing the number density of a galaxy sample in the literature. Instead of randomly downsampling, studies will choose objects above a threshold in mass or SFR in order to obtain a desired number density, sometimes referred to as abundance matching. 
For example, <ref type="bibr">Hadzhiyska et al. (2021b)</ref> (comparing IllustrisTNG and SC-SAM galaxies) select the most massive galaxies such that they reach their desired number density. This has a very different effect on clustering, as it is selecting a differently clustered/biased population of dark matter halos, while our random downsampling selects halos with the same clustering properties, but just sparsely samples them to reduce computational load.</p></div>
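<div xmlns="http://www.tei-c.org/ns/1.0"><p>The random downsampling described above can be illustrated with a minimal sketch. It is not the pipeline used in this work: the function name <code>downsample_to_density</code>, the representation of a catalog as a plain Python list, and the seed handling are assumptions made for illustration. Only the target densities and the box size are taken from the text.</p>
<eg><![CDATA[
```python
import random


def downsample_to_density(catalog, density, box_size=100.0, seed=0):
    """Randomly keep density * box_size**3 objects from a selected catalog.

    catalog  : sequence of objects that already passed a selection
    density  : target number density in h^3 cMpc^-3 (e.g., 0.001 or 0.005)
    box_size : box side length in h^-1 cMpc
    Returns None when the selection is too sparse to reach the target,
    mirroring simulations that do not meet a selection criterion.
    """
    n_target = round(density * box_size ** 3)
    if len(catalog) < n_target:
        return None
    rng = random.Random(seed)  # hypothetical seed handling
    return rng.sample(list(catalog), n_target)


# Toy catalog: 20,000 object IDs standing in for selected galaxies.
selected = list(range(20000))
sample = downsample_to_density(selected, 0.005)
print(len(sample))
```
]]></eg>
<p>Because the kept objects are drawn uniformly at random, the sparse sample traces the same population as the full selection, which is the distinction from threshold-based abundance matching drawn in the text.</p></div>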
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Neural Network Implementation</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Preparing CAMELS-SAM Clustering for Neural Networks</head><p>We use a 70%/15%/15% split of the suite for training/ validation/testing, meaning each network was trained on the first 700 of the CAMELS-SAM LH simulations, validated for performance on the next 150, and tested on the final 150. (See 3.1.3 for what we do if not all 1000 simulations are usable.)</p><p>The "best" model is whichever has the lowest error value when applied to the validation set, though we often found that at least 5-10 models performed quite similarly, meaning these results are not tied to a specific architecture. In the figures that follow, we show the performance of these best models on the test set of simulations.</p><p>In this fiducial case of "all" clustering, the data given to the neural networks consist of: for z = {0, 0.1, 0.5, 1.0}, the 2ptCF and the VPF between roughly 1 and 40 cMpc, and the CiC probability distribution for several distance scales (between 11 and 21 cMpc for our lower-density samples, and between 16 and 30 cMpc for the larger density samples; see Table <ref type="table">3</ref>). The clustering measurements at all four redshifts are strung into a 1D array<ref type="foot">foot_6</ref> ; all statistics are measured at exactly the same distance scales across all simulations that pass the selection, so the radius of each measurement is irrelevant for the neural network. Additionally, we randomly resample to our selected number density for each redshift<ref type="foot">foot_7</ref> . See Section 4.3 for an exploration of how our results change if we use only one redshift at a time, and Section 6 for the performance when the clustering statistics are used separately.</p><p>Neural networks perform best when trained on normalized data, where the mean is about zero and the standard deviation is one. 
We normalize the 1D clustering array element by element: for each element (corresponding to, e.g., the 2ptCF at R &#8776; 10 cMpc), we take the base-ten logarithm of its values across all 1000 simulations, subtract their mean from each, and then divide each by their standard deviation. For galaxy selections that yield values of zero for a given simulation (e.g., a particularly unclustered simulation that finds no voids at large scales), the value is set to 10 -12 before normalizing. For galaxy selections where all 1000 simulations yielded zero (rare, and often at the largest distance scales for extreme selections), we set the first simulation to a value of 2 &#215; 10 -12 to guarantee the normalization does not fail, and then we continue. The cosmological and astrophysical parameters are normalized in almost the same way (by their mean and standard deviation, but in linear space) and are returned by the neural network as 1D output arrays.</p></div>
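<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a hedged illustration of the normalization just described, the sketch below standardizes a single clustering element across simulations. The function names, the constant <code>FLOOR</code>, and the use of the population (rather than sample) standard deviation are assumptions; the text specifies only the logarithm, the substitution values, and the mean and standard deviation scaling.</p>
<eg><![CDATA[
```python
import math
import statistics

FLOOR = 1e-12  # value substituted for zero measurements before the logarithm


def normalize_feature(values):
    """Standardize one clustering element across simulations in log10 space."""
    vals = [v if v > 0 else FLOOR for v in values]
    if all(v == FLOOR for v in vals):
        # Every simulation gave zero: nudge the first one so the standard
        # deviation is nonzero and the division below does not fail.
        vals[0] = 2 * FLOOR
    logs = [math.log10(v) for v in vals]
    mean = statistics.fmean(logs)
    std = statistics.pstdev(logs)  # population std is an assumption
    return [(x - mean) / std for x in logs]


def normalize_parameter(values):
    """Standardize one cosmological or astrophysical parameter in linear space."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


z = normalize_feature([1.0, 10.0, 100.0, 0.0])
print(z)
```
]]></eg>
<p>Note that this sketch, like the procedure in the text, only guards against the all-zero case; an element that takes one identical nonzero value in every simulation would still produce a zero standard deviation.</p></div>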
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Loss Functions</head><p>In this work, we use the CAMELS-SAM simulations to do parameter inference with galaxy clustering statistics: how do the input parameter(s) relate to the statistics, and how do we measure the input parameter(s) given only the statistics? The posterior we seek to learn is p(&#952;|C), which relates the parameter space &#952; to the 1D array of clustering measurements C. In our work, we often have &#952; = {&#937; M , &#963; 8 , A SN1 , A SN2 , A AGN } for all five input parameters of each simulation.</p><p>The marginal posterior over a single parameter &#952; i (out of, e.g., all five we constrain) can be defined as <formula notation="TeX">p(\theta_i \mid C) = \int p(\theta_1, \theta_2, \ldots, \theta_n \mid C)\, d\theta_1 \cdots d\theta_{i-1}\, d\theta_{i+1} \cdots d\theta_n .</formula></p><p>The marginal posterior describes the probability that a simulation (and its array of clustering measurements) was created with a particular combination of parameters &#952;. Here with CAMELS-SAM and in many applications of CAMELS, we give the network completely flat priors or no prior knowledge of the underlying distribution of parameters (e.g., the measurement of cosmology from Planck Collaboration et al. 2016). The estimated mean of the marginal posterior for a given parameter &#952; i is <formula notation="TeX">\mu_i(C) = \int p(\theta_i \mid C)\, \theta_i\, d\theta_i .</formula></p><p>The estimated standard deviation of the marginal posterior for a given parameter &#952; i is <formula notation="TeX">\sigma_i(C) = \sqrt{\int p(\theta_i \mid C)\, \left(\theta_i - \mu_i\right)^2\, d\theta_i} .</formula></p><p>The goal of a neural network is to learn the posterior accurately enough that the mean &#956; i and standard deviation &#963; i it predicts are consistent with estimated posterior values given the input parameters. Our neural network here assumes a single-peak posterior with one mean and standard deviation. The actual marginal posterior may not have these properties; for example, there may be a degenerate combination of our parameters that yields very similar clustering measurements. 
In these cases, we can expect a large standard deviation measurement that will attempt to cover the multiple peaks in the posterior.</p><p>For a neural network to learn a posterior, it requires a loss function to measure its performance (i.e., calculate the gradients it uses to update the weights between neurons in order to eventually converge on values closest to the true ones). We perform both parameter regression with a standard mean-squared error (MSE) validation criterion, and likelihood-free inference (LFI) with the method from Jeffrey &amp; Wandelt (2020), updated for CAMELS in Villaescusa-Navarro et al. (2022a) and featured in <ref type="bibr">Jo et al. (2023)</ref>. Our parameter regression is a fast and straightforward way to measure the mean of the posterior and therefore roughly approximate the network's accuracy, while the LFI trains for longer in order to also measure the posterior's standard deviation. Finally, our loss functions are assessed over a given batch of input data, with a batch size of N batch = 64 as default, meaning that 64 random simulations in the training set are processed at a time per node worker.</p><p>The MSE loss function <ref type="foot">25</ref> we use simply measures the mean-squared error between each element of the neural network's predictions and its true value, and it is built to approximate the mean of the marginal posterior (even if non-Gaussian). For a given batch size, the MSE loss on the model's predictions is <formula notation="TeX" n="7">\mathcal{L}_{\rm MSE} = \frac{1}{N_{\rm batch}} \sum_{i} \sum_{j \in {\rm batch}} \left(\theta_{i,j} - \mu_{i,j}\right)^2 ,</formula></p><p>where &#956; i,j is the network's prediction for the mean of parameter i's posterior for simulation j, and &#952; i,j is the true input value of parameter i for simulation j. We note that this loss sums over all the cosmological and astrophysical parameters being constrained, meaning it attempts to measure the marginal posterior means for each parameter at once. 
With the LFI loss function, our goal is to instead get the neural network output to converge on both the mean and standard deviation of the marginal posterior. With slight modifications to the loss function presented in <ref type="bibr">Jeffrey &amp; Wandelt (2020)</ref>, one can define the loss function so that the neural network outputs converge over both &#956; i and &#963; i (see Villaescusa-Navarro et al. 2022a, Section 3.1.2): <formula notation="TeX">\mathcal{L}_{\rm LFI} = \sum_{i} \log\left[\sum_{j \in {\rm batch}} \left(\theta_{i,j} - \mu_{i,j}\right)^2\right] + \sum_{i} \log\left[\sum_{j \in {\rm batch}} \left(\left(\theta_{i,j} - \mu_{i,j}\right)^2 - \sigma_{i,j}^2\right)^2\right] .</formula></p><p>Notes. The spread of n refers to which "counts" we consider (e.g., cells with 0-50 galaxies only). When doing a single redshift at a time, we broadened our choices to encompass the parts of the distribution with the most divergence across our simulations. To maximize the data we include for a single redshift, we skip every other n value. Therefore, each simulation's single-redshift count-in-cells sampling yields 510 total data points for the neural network to work upon. For all redshifts combined, we sample the first 50 n values at each distance scale, because the distance scales we selected encompass most of their variety in those regions. The all-redshifts CiC sampling yields 600 total data points.</p><p>As described in Villaescusa-Navarro et al. (2022a), the effect of this loss function is that the neural network ignores noisy parameters until it has learned the well-determined parameters. This is important in circumstances where a feature is very sensitive to specific parameters and not at all to others (e.g., the CAMELS dark matter density field, which very mildly detects TNG/SIMBA astrophysical parameter variations). This LFI loss function removes the dependence on the overall scale in the scatter of a parameter with the included logarithm, therefore inverse-variance weighting the combination of gradients from the different terms compared to the MSE loss in Equation (7). 
The clustering of galaxies is dominated by that of dark matter halos, meaning the influence of the cosmological parameters will likely be much stronger than that of any of the SC-SAM astrophysical parameters. Therefore, we prioritize the use of the LFI loss method throughout this work.</p><p>We use the MSE loss only on the single-redshift exploration in Section 4.3, due to its faster training time, and LFI everywhere else for more detailed constraints. We generally find the MSE and LFI results are broadly consistent, though the LFI results are often slightly more precise, likely due to the introduction of the logarithm discussed above.</p><p>We do, however, leverage the root mean square error (rMSE) to gauge the accuracy of our neural network predictions. In Tables 7-8 reporting our results, we list two types of errors: "rMSE" and &#963;&#772;. &#963;&#772; i is the mean of the &#963; i values that the best-performing neural network under the LFI loss predicts across the test set, and reflects the actual 1&#963; error on the constraint. "rMSE" refers to the rMSE calculated on the mean &#956; i values from the relevant loss function, and roughly measures the accuracy of the predictions. Following the common definitions, the rMSE and &#963;&#772; i for the predicted parameter &#952; i across the test set of length N test are <formula notation="TeX">{\rm rMSE}_i = \sqrt{\frac{1}{N_{\rm test}} \sum_{j=1}^{N_{\rm test}} \left(\theta_{i,j} - \mu_{i,j}\right)^2}, \qquad \bar{\sigma}_i = \frac{1}{N_{\rm test}} \sum_{j=1}^{N_{\rm test}} \sigma_{i,j} .</formula></p></div>
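<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the two loss functions and the two reported error measures concrete, the sketch below evaluates them on a toy batch. It is an illustration under stated assumptions rather than the training code used in this work: the LFI form follows our reading of the moment-network loss of Villaescusa-Navarro et al. (2022a), the 1/N batch normalization of the MSE is assumed, and all function names are hypothetical. Inputs are lists of rows, one row of parameter values per simulation.</p>
<eg><![CDATA[
```python
import math


def mse_loss(theta, mu):
    """Squared error summed over parameters, averaged over the batch.

    The 1/N_batch normalization is an assumption of this sketch.
    """
    total = sum((t - m) ** 2
                for row_t, row_m in zip(theta, mu)
                for t, m in zip(row_t, row_m))
    return total / len(theta)


def lfi_loss(theta, mu, sigma):
    """Moment-network-style loss on predicted means and standard deviations."""
    loss = 0.0
    for i in range(len(theta[0])):
        sq_err = [(t[i] - m[i]) ** 2 for t, m in zip(theta, mu)]
        mean_term = sum(sq_err)
        var_term = sum((e - s[i] ** 2) ** 2 for e, s in zip(sq_err, sigma))
        # The logarithms remove the overall scale of each parameter's scatter.
        loss += math.log(mean_term) + math.log(var_term)
    return loss


def rmse(theta, mu, i):
    """Root mean square error of the predicted means for parameter i."""
    return math.sqrt(sum((t[i] - m[i]) ** 2 for t, m in zip(theta, mu)) / len(theta))


def mean_sigma(sigma, i):
    """Mean predicted standard deviation for parameter i across the set."""
    return sum(s[i] for s in sigma) / len(sigma)


# Toy batch of two simulations and two parameters (e.g., Omega_M and sigma_8).
theta = [[0.3, 0.8], [0.4, 0.9]]
mu = [[0.2, 0.8], [0.4, 0.7]]
sigma = [[0.1, 0.2], [0.1, 0.2]]
print(mse_loss(theta, mu), lfi_loss(theta, mu, sigma))
print(rmse(theta, mu, 0), mean_sigma(sigma, 0))
```
]]></eg>
<p>In this form the logarithm is undefined if a batch is predicted perfectly; a practical implementation would need to guard against that case, which this sketch does not do.</p></div>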
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">Tests for Accuracy</head><p>We generally find that the rMSE calculated on the LFI-loss-predicted means is often very close to the LFI-loss-predicted error, though often rMSE &gt; &#963;&#772;. To confirm the general behavior of our LFI loss networks is accurate and is not underpredicting the error, we implement a few simple quantitative tests.</p><p>First, we include a calculation of each individual test simulation's "Z-value" to exclude the rare outlier(s), according to the behavior of the entire test set's LFI predicted means. We begin by calculating the absolute value difference between actual and predicted parameter i values across the whole test set of length t:</p><p>Then, for each simulation j in the test set, the "Z-value" for a parameter i is</p><p>We remove simulations in the test set with Z &gt; 6 before calculating the rMSE on the LFI predicted means that are listed in future tables; visually, these are the extreme outliers clear to the eye whose predictions are often unrealistic or beyond the bounds of the parameters (e.g., see A AGN for Figure <ref type="figure">18(b)</ref>). It is very rare to find an outlier of Z &gt; 6: none of the selections we have implemented had more than a single outlier in the test set, and we have found it is nearly always the same LH simulation. We remind readers that the best-performing neural network minimizes the loss across all simulations in the given set; therefore, the network as a whole still performs well across the parameter space even with the outlier. This simple "Z-value" test helps generate rMSE assessments for the networks that are reflective of their visual performance in our figures.</p><p>Next, we examine whether our neural networks are under- or overestimating the &#963;&#772; errors, as we find the rMSE is often slightly larger. As did <ref type="bibr">Jeffrey et al. (2022)</ref> for their moment networks, we define &#967; j (&#952; i ) for a given parameter &#952; i and simulation j as <formula notation="TeX">\chi_j(\theta_i) = \frac{\theta_{i,j} - \mu_{i,j}}{\sigma_{i,j}} .</formula></p><p>The distribution of &#967;(&#952; i ) can help qualify the accuracy of the neural network. Parameters that find accurate constraints (or that follow the 1:1 slope on our figures, regardless of how closely) have &#967;(&#952; i ) roughly consistent with a Gaussian centered at zero with a variance of one. For parameters that cannot be constrained by the network, e.g., the SC-SAM parameters in halo mass selections, the distributions of &#967;(&#952; i ) are either flat within -2 &lt; &#967;(&#952; i ) &lt; 2 or peak at &#967;(&#952; i ) &#8776; &#177;1.5.</p><p>In Figure <ref type="figure">13</ref> of Appendix B, we plot the distributions of &#967;(&#952; i ) for the two types of mass-selected clustering in Figures <ref type="figure">4</ref> and <ref type="figure">5</ref> across all five parameters. We find that our good constraints, even if their &#963;&#772; trends slightly smaller than the rMSE, have &#967;(&#952; i ) consistent with neural networks that are accurate and neither over- nor underpredict the error. In tables where relevant, we list both the rMSE and &#963;&#772; calculated for various parameters, and we show only the rMSE errors in summary Figures <ref type="figure">3</ref>, <ref type="figure">4</ref>, <ref type="figure">5</ref>, <ref type="figure">6</ref>, <ref type="figure">7</ref>, and <ref type="figure">8</ref> for clarity.</p></div>
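<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two accuracy tests of this section can be sketched for a single parameter as follows. The &#967; statistic is taken to be the residual divided by the predicted standard deviation, which is what a unit Gaussian target implies. The exact Z-value definition is not reproduced in the text above, so the form used here (the absolute residual standardized by the mean and standard deviation of all absolute residuals in the test set) is an assumption, as are all function names.</p>
<eg><![CDATA[
```python
import statistics


def chi_values(theta, mu, sigma):
    """Residual over predicted error, for one parameter across the test set."""
    return [(t - m) / s for t, m, s in zip(theta, mu, sigma)]


def z_values(theta, mu):
    """ASSUMED Z-value: standardized absolute residual across the test set."""
    delta = [abs(t - m) for t, m in zip(theta, mu)]
    mean = statistics.fmean(delta)
    std = statistics.pstdev(delta)
    return [(d - mean) / std for d in delta]


def drop_outliers(theta, mu, z_max=6.0):
    """Keep (truth, prediction) pairs whose Z-value does not exceed z_max."""
    z = z_values(theta, mu)
    return [(t, m) for t, m, zj in zip(theta, mu, z) if zj <= z_max]


# Toy test set: 99 perfect predictions and one extreme outlier.
theta = [0.3] * 100
mu = [0.3] * 99 + [0.9]
kept = drop_outliers(theta, mu)
print(len(kept))

chi = chi_values([0.30, 0.35, 0.25], [0.28, 0.36, 0.29], [0.02, 0.02, 0.03])
print(statistics.fmean(chi), statistics.pstdev(chi))
```
]]></eg>
<p>For a well-calibrated network, the mean and standard deviation of the &#967; values over a large test set should approach zero and one, respectively; a standard deviation well above one would suggest underpredicted errors.</p></div>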
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.4.">Architecture</head><p>We used the OPTUNA package <ref type="bibr">(Akiba et al. 2019)</ref> to quickly train and validate 1D:1D fully connected neural networks, all while identifying the best-performing hyperparameters (e.g., number of hidden layers, neurons per layer, learning rate, etc.). When using OPTUNA to create neural networks, we limit it to the following:</p><list rend="numbered"><item n="1">Take in a 1D array of normalized clustering values for a given galaxy selection, and predict the five parameters of &#937; M , &#963; 8 , A SN1 , A SN2 , A AGN (or a single one, in the specific experiment of Section 5.3).</item><item n="2">Have no more than five layers total, each with no more than 1000 neurons.</item><item n="3">Assess 250-1000 trials (i.e., sample neural networks within the hyperparameter space), each with 500 training epochs.</item><item n="4">Use Leaky ReLU activation in each layer.</item><item n="5">Use the Adam optimizer with &#946; parameters equal to 0.5, 0.999.</item></list><p>We often find that 3-4 layers of 500-700 neurons do very well, that testing more than 250 trials was often unnecessary, and that most trials converge within 300 epochs and after approximately 2-3 GPU days. The MSE loss training and hyperparameter selection often took less than 24 GPU hours, while the LFI loss averaged 3-3.5 GPU days per trained network.</p></div>
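<div xmlns="http://www.tei-c.org/ns/1.0"><p>The architecture constraints listed above can be illustrated with a small, dependency-free sketch of a 1D:1D fully connected network. This is not the OPTUNA-driven code used in this work: plain random sampling stands in for OPTUNA's hyperparameter sampler, no training is performed, and the negative slope of the Leaky ReLU, the weight initialization, the choice to leave the output layer without an activation, and the input length of 600 are all assumptions made for illustration.</p>
<eg><![CDATA[
```python
import random


def leaky_relu(x, slope=0.01):
    """Leaky ReLU; the negative slope of 0.01 is an assumed value."""
    return x if x > 0 else slope * x


def sample_architecture(rng, max_layers=5, max_neurons=1000):
    """Draw one candidate list of layer widths inside the stated bounds."""
    n_layers = rng.randint(1, max_layers)
    return [rng.randint(1, max_neurons) for _ in range(n_layers)]


def init_network(rng, n_in, hidden, n_out):
    """Create (weights, biases) pairs for each fully connected layer."""
    sizes = [n_in] + list(hidden) + [n_out]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = (1.0 / fan_in) ** 0.5  # assumed initialization scale
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers


def forward(layers, x):
    """Map a normalized 1D clustering array to a 1D array of outputs."""
    for k, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if k < len(layers) - 1:  # output layer left linear (assumption)
            x = [leaky_relu(v) for v in x]
    return x


rng = random.Random(0)
hidden = sample_architecture(rng, max_neurons=64)  # small widths keep the demo fast
net = init_network(rng, n_in=600, hidden=hidden, n_out=5)
prediction = forward(net, [rng.gauss(0.0, 1.0) for _ in range(600)])
print(hidden, len(prediction))
```
]]></eg>
<p>In an actual search, each sampled configuration would be trained for a fixed number of epochs and scored on the validation set, with the lowest validation loss identifying the best model.</p></div>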
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Constraining Cosmology with Clustering Using CAMELS-SAM</head><p>In this section, we explore the constraints on the cosmological parameters &#937; M and &#963; 8 that we obtain using clustering statistics from the CAMELS-SAM simulation suite. In Section 4.1, we explore how the constraints respond to various galaxy selections, especially between halo and stellar mass. In Section 4.2, we test how our cosmological constraints respond to including density downsampling or not. Finally, in Section 4.3, we examine the constraints at each of the individual redshifts we probe between 0 &lt; z &lt; 1. For ease of reading, the bulk of the quantitative details, figures, and tabulated results can be found in Appendix C.</p><p>Throughout the rest of this work, our figures will often contain the results for constraints on the SC-SAM astrophysical parameters; those will be reported and discussed independently in Section 5. For the best comparison, we compare only the results of "all" clustering, which includes the VPF, 2ptCF, and CiC at z = 0, 0.1, 0.5, 1.0. The 2ptCF and VPF are measured between roughly 1 &lt; R &lt; 40 cMpc, and the CiC at select distance scales within that range (see Table <ref type="table">3</ref>). See Section 4.3 for redshift-by-redshift comparisons, and Section 6 for statistic-by-statistic comparisons. When we report errors as percentages, we calculate them against the fiducial values (i.e., &#937; M = 0.3, &#963; 8 = 0.8).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Cosmological Constraints: Comparing Galaxy Selections</head><p>Figure <ref type="figure">3</ref>. A summary of the 1&#963; rMSE errors we find on our five parameters when using "all" clustering (the 2ptCF, VPF, and CiC at z = {0.0, 0.1, 0.5, 1.0}) across our various galaxy selections. The color of each point indicates the property we selected by: halo mass (black), stellar mass (magenta), SFR (cyan), or sSFR (blue). The shape of each marker indicates what density we downsampled to after the selection: 0.001 h 3 cMpc -3 (downward-pointing triangle), 0.005 h 3 cMpc -3 (upward-pointing triangle), or no density downsampling (star). The plus sign markers in deeper colors indicate the rMSE constraints found by having a neural network focus only on the single parameter (see Appendix D); these symbols have been shifted slightly left for clarity. The gray dashed-dotted lines indicate the mean of the parameter's prior space; errors around this value indicate poor to no constraints found by the neural networks. The density plotted on the x-axis is the z = 0 value averaged across the five CV simulations after applying each selection, but before randomly downsampling. See Tables <ref type="table">4</ref>, <ref type="table">5</ref>, and 7 for complete details.</p><p>The SC-SAM properties we select upon are halo mass, stellar mass, star formation rate, and specific star formation rate. In this section, we specifically focus on cosmological constraints. In Section 5, we shift to discussing the constraints on the SC-SAM astrophysical parameters. Throughout both sections, we will briefly summarize the most relevant results of our experiments and primarily focus on discussing their meaning and significance. 
The detailed quantitative results are tabulated in Appendices C-D.</p><p>Figure <ref type="figure">3</ref> summarizes (and Table <ref type="table">4</ref> in Appendix C details) the various selections applied (the SC-SAM galaxy property, the cutoff value applied, and the downsampling density), and results for all five parameters (the rMSE error across the entire test set, and the mean 1&#963; error estimate &#963;&#772; from the LFI loss method across the test set). Errors for good fits are much smaller than the mean of the prior distribution (dashed-dotted lines), and good fits show a 1:1 relationship between the predicted and actual parameter values. Parameters with no constraints beyond the prior will appear centered around the mean of the prior distribution. Finally, we note that Figure <ref type="figure">3</ref> includes constraints that will be discussed in Section 4.2, where we do not randomly downsample to a specific number density before measuring the galaxies' clustering.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.1.">Dark Matter Halo versus Stellar Mass Selections</head><p>Figure <ref type="figure">4</ref> presents the best constraints our neural networks find on &#937; M and &#963; 8 , using a moderate halo mass selection and randomly downsampling to a relatively high density. The best-performing neural network produces &#937; M predictions accurate to 4.7% about the fiducial value &#937; M = 0.3, under both the rMSE and LFI loss regimes. The same neural network predicts &#963; 8 to 3%-4% accuracy (LFI and rMSE, respectively). These are the "best" possible constraints we refer to throughout the rest of this work.</p><p>Of the SC-SAM galaxy property selections we test, stellar mass can be expected to give the tightest cosmological constraints. The stellar and halo masses of galaxies are the most correlated of the selections we probe, meaning the tight constraints from halo mass clustering will likely also be seen somewhat in stellar mass. As with halo mass, a low-threshold mass selection and higher downsampling density yield the tightest constraints on cosmology, nearly matching what the halo mass "best-case" selection found: errors of 4.7%-6.7% for &#937; M and 3%-5% for &#963; 8 (LFI and rMSE, respectively). Figure <ref type="figure">5</ref> presents these best constraints our neural networks find on &#937; M and &#963; 8 , using stellar mass selections.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.2.">SFR and sSFR Selections</head><p>Next, we explore the constraints obtained when galaxies are selected via their star formation rate (SFR) and specific star formation rate (sSFR, or SFR divided by stellar mass). In this section, we focus on cosmological constraints when using these selections; see Section 5.1 for the constraints on the SC-SAM parameters controlling baryonic processes of stellar and AGN feedback.</p><p>Much like with stellar mass, our experiments here test both the selection threshold and the number density to which we randomly downsample. Here, we probe SC-SAM galaxies with SFR &gt; 0.2 or 1.25 M &#8857; yr -1 at a high and low density, respectively. For sSFR, we probe SC-SAM galaxies with sSFR &gt; 0.1 or 0.2 Gyr -1 with a high-density downsampling. For SFR and sSFR, errors on the cosmological parameters span 9%-15% for &#937; M , and 5%-8% for &#963; 8 (though we note the high-threshold, low-density SFR selection finds essentially no constraints). The tightest constraints on cosmology consistently come from stellar mass.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.3.">Discussion: Selections for Cosmological Constraints</head><p>The good accuracy and precision found on &#937; M (5%-7%) and &#963; 8 (3%-5%) with SC-SAM stellar mass-selected clustering are quite encouraging. First, we reach accuracy comparable to the dark-matter-only clustering (5% on &#937; M , 3%-4% on &#963; 8 ), even with the introduction of an astrophysical model. Second, it is heartening to see that, though this neural network was trained on simulations with a broad range of feedback from the SAM baryonic prescriptions and was asked to constrain all five parameters at once, it is still able to obtain robust cosmological constraints and focus on their strong influence on clustering.</p><p>Of the basic galaxy properties we select by, stellar mass tends to give slightly better constraints across the board. This is expected, as dark matter halo mass clustering should give the best constraints, and stellar mass tracks halo mass the most closely of the selections we probe. SFR provides good constraints on &#937; M , but interestingly much worse constraints on &#963; 8 , even obtaining no constraint at the strongest SFR selection at the lower density. sSFR shows good constraints on both cosmological parameters, likely benefiting from the simple combination of stellar mass and star formation.</p><p>Figure <ref type="figure">4</ref>. Our best cosmological constraints with CAMELS-SAM, using "all" N-body only halo clustering and the LFI loss function. Here, we selected halos with mass greater than 2 &#215; 10 11 M &#8857; , downsampled to a density of 0.005 h 3 cMpc -3 at z = {0.0, 0.1, 0.5, 1.0}.</p><p>Figure <ref type="figure">5</ref>. Our best cosmological constraints using a stellar-mass-selected sample and the LFI loss function. Here, we selected SAM galaxies with stellar mass greater than 10 9 M &#8857; , downsampled to a density of 0.005 h 3 cMpc -3 at z = {0.0, 0.1, 0.5, 1.0}.</p><p>We also find that more extreme selections on these properties have mixed effects, occasionally occluded by the necessity of a lower downsampling density and the introduction of more Poisson noise. For example, doubling the stellar mass threshold at the lower downsampling density has inconclusive effects on &#937; M (increasing rMSE but decreasing &#963;&#772;) and worsens &#963; 8 by approximately 0.1%. However, doubling the sSFR threshold improves cosmological constraints by 0.1-1 percentage points (&#937; M , &#963; 8 , respectively).</p><p>Finally, there is a possible trend of obtaining better constraints for simulations with parameter values that yield higher densities across the suite, notably &#937; M and A SN2 . Why would &#937; M be better constrained with the clustering of galaxies under a selection that yields higher densities, especially for stellar mass? A lower stellar mass threshold combined with our density downsampling means that the objects whose clustering we measure will tend to be lower mass, and therefore may better probe the general range of structure formation within the simulations. Additionally, poorer constraints at higher stellar mass may come from a combination of more sensitivity to Poisson noise (at the lower density), as well as a possible degeneracy with &#963; 8 that may mask the effect of &#937; M .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.4.">Information in Galaxy versus Halo Clustering</head><p>Some readers may wonder: Exactly how much information about cosmology is contained within the clustering of SAM galaxies? We can probe the relationship between halo and galaxy clustering in our sample with a simple experiment: if we train a neural network to measure cosmology with only the clustering of dark matter halos, how well can it predict cosmology when tested using galaxy clustering instead?</p><p>Figure <ref type="figure">6</ref>. Comparing how well a neural network trained on the clustering of a dark matter halo mass-selected sample can predict cosmological parameters for a stellar-mass-selected sample. This network was trained with "all" clustering at z = {0.0, 0.1, 0.5, 1.0} for dark matter halos with mass greater than 2 &#215; 10 11 M &#8857; , downsampled to a density of 0.005 h 3 cMpc -3 . Blue circles show the results of giving the network the test set for the same type of clustering with which it was trained and validated. Red stars show the results from instead evaluating the best-performing model on the test set of simulations using the clustering of SAM galaxies of stellar mass greater than 1 &#215; 10 9 M &#8857; , downsampled to the same density of 0.005 h 3 cMpc -3 .</p><p>&#8230; <ref type="table">6</ref>, and Appendix D, illustrating the effects of using clustering constraints at a single redshift vs. four redshifts combined (All).</p><p>Figure <ref type="figure">6</ref> shows the results of this experiment. We first train a neural network with "all" clustering for SAM galaxies with host dark matter halo mass greater than 2 &#215; 10 11 M &#8857; , downsampled to a density of 0.005 h 3 cMpc -3 . As expected, the network finds tight constraints if it is tested with the same type of dark-matter-only clustering. 
Notably, the LFI loss gives very precise and accurate predictions, consistent with the predictions on the means, for dark-matter-only clustering: 4.7% for &#937; M and 4% for &#963; 8 .</p><p>We then have the best-performing model try to measure the cosmology when given only the clustering of SAM galaxies of stellar mass greater than 1 &#215; 10 9 M &#8857; , downsampled to the same density of 0.005 h 3 cMpc -3 . We note that this stellar mass selection was chosen because it performed the best of those we tried, and not through any connection between the halo mass above and the stellar mass of galaxies within them (making this a more challenging problem for the neural network). We carry out exactly the same calculations, changing only the objects in the test set. The constraints are less precise but still fairly accurate: &#937; M rMSE = 0.062 (21%) and &#963; 8 rMSE = 0.059 (7.4%). The LFI errors remain very small, consistent with what the network achieved on its training set.</p><p>This exercise of training on halo mass clustering and then testing on stellar mass clustering is helpful in two ways. First, it humbly reminds us that the precision of machine-learning results is only as good as the training samples. The LFI error predictions underestimate the true error on the stellar mass clustering constraints, as the network simply assumes the data it is given works the same way that the dark-matter-only clustering did. Second, this exercise emphasizes the strength of using clustering: having networks trained only on dark matter halos (which we theoretically understand well and can simulate cheaply) can yield accurate (if imprecise) constraints when given galaxy clustering. 
The neural network is retaining some information, even when being tested on the "wrong" thing, precisely because galaxies generally follow the clustering of dark matter.</p><p><ref type="table">8</ref> and <ref type="table">9</ref>, and Appendix D). Dotted lines connect the same galaxy selection scenarios for ease of comparison. The dashed-dotted lines indicate the mean of the prior, where constraints are not informative.</p><p>However, this is not to understate the significance of how much our cosmological constraints improve when training directly with SAM galaxy clustering (as in Figure <ref type="figure">5</ref>). The clustering of dark matter halos dominates the signal of galaxy clustering, but it does not completely describe it. Without the additional information of the SC-SAM galaxy formation model, and without allowing the neural network to marginalize over the effects and uncertainties they introduce, there would be no improving the dark-matter-only constraints and no path forward in this field. Galaxies and halos have complicated relationships that are not well understood, yet even with the many variations of the SC-SAM models that we include and the broad parameter space that our suite covers, there is still more information to be learned than exists in a simplistic galaxy-halo model. Information about cosmology is lost when the clustering of galaxies is assumed to be related to dark matter in an overly simplistic way, and much of it may be gained back when including a robust galaxy formation model. This exercise therefore reinforces the importance of one of CAMELS' central tenets as a project: to teach neural networks to marginalize over the uncertain galaxy formation physics. Neural networks are able to learn the complicated ways that galaxy physics affect clustering statistics, and yield constraints within 3%&#8211;10% on cosmology from galaxy clustering. 
Finally, this exercise also strongly motivates implementing additional methods of linking galaxy properties to dark matter (e.g., various HODs, more SAMs, subhalo abundance matching) in order to more fully explore the marginalizing power of the neural networks.</p></div>
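<div xmlns="http://www.tei-c.org/ns/1.0"><p>The percentages quoted in this section follow from dividing each rMSE by the fiducial parameter value (&#937; M = 0.3 and &#963; 8 = 0.8). The following Python sketch reproduces that arithmetic. It assumes that rMSE denotes the root-mean-square error over the test set; the function names and the toy arrays are illustrative and are not taken from the CAMELS-SAM pipeline.</p><p>
```python
import numpy as np


def rmse(predicted, truth):
    """Root-mean-square error over a test set (assumed meaning of rMSE)."""
    predicted = np.asarray(predicted, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((predicted - truth) ** 2)))


def percent_of_fiducial(error, fiducial):
    """Express an absolute error as a percentage of a fiducial value."""
    return 100.0 * error / fiducial


# Values quoted in the text for the halo-trained, galaxy-tested network.
print(round(percent_of_fiducial(0.062, 0.3), 1))  # Omega_M: 20.7, quoted as 21%
print(round(percent_of_fiducial(0.059, 0.8), 1))  # sigma_8: 7.4

# Toy check of rmse on made-up predictions (not simulation output).
print(rmse([0.31, 0.29], [0.30, 0.30]))
```
</p><p>The same conversion applies to any constraint in the text that is quoted both as an absolute error and as a percentage.</p></div>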
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Cosmological Constraints: Effect of Number Density Downsampling</head><p>Next, we examine how the choice to downsample to a given density affects our cosmological constraints. CiC and VPF are sensitive to number density and encode its effect very strongly, so we chose to correct for its influence to pull out the effect our parameters have on the large-scale structure more clearly. Additionally, randomly downsampling after a selection means the same underlying clustering is maintained, and it reduces the computational load of clustering measurements for our 1000+ simulations. However, how do constraints change if we allow number density to vary, and therefore allow the neural networks to use that information for predictions?</p><p>To test constraints without our density downsampling, we applied high-threshold cuts for select SC-SAM properties and measured the clustering on all resulting galaxies. To reduce the computational load while still maintaining robust number statistics for the sparsest volumes, we chose high-value cuts for our simulations that yielded at least several hundred to a few thousand galaxies in more than 95% of the simulations. These selections yield galaxy densities mostly between 10&#8315;&#8308; and 10&#8315;&#185; h&#179; cMpc&#8315;&#179;, which corresponds to as many as fifty to one hundred thousand galaxies in a volume across the entire breadth of our parameter space. For more grounded context, the CV simulations with the fiducial parameters yield ten to twenty thousand galaxies under these selections. These selections therefore likely have minimal Poisson error in the bulk of their clustering measurements.</p><p>The cosmological constraints found for these three selections are detailed in Appendix C Table <ref type="table">5</ref> and are included in the summary Figure <ref type="figure">3</ref> for direct comparison with earlier selections. 
We discuss the effect of not downsampling on the SC-SAM astrophysical parameters in Section 5.2. Appendix C shows in detail some of the best-performing neural networks' results for this experiment.</p><p>Without the normalization to a fixed number density across all simulations and models, SFR- and stellar-mass-selected clustering yields slightly improved cosmological constraints. For the stellar mass selections, &#937; M constraints stay the same or improve slightly, while &#963; 8 constraints notably improve by a few percent (dropping from approximately 6.3% to 4%&#8211;5.5%). For SFR selection, &#937; M constraints are about as good as when downsampling is applied (under the LFI loss). However, removing the downsampling greatly improves the &#963; 8 constraints from the strong SFR selection, yielding 8.1% errors. This is much improved over the essentially unconstrained &#963; = 0.103 when taking a similarly strong SFR cut and then randomly downsampling. Though we cannot confirm just how much of the improvement comes from reduced Poisson noise, our other results suggest that some of this improvement might be attributed to the additional information contained in the predicted number density values.</p><p>The results of this experiment, at least with regard to cosmology, can be interpreted in a mixed way: constraints are often improved by including number density, especially for &#963; 8 when SFR-selected samples are used, though little information about &#937; M is lost when randomly downsampling to a fixed number density. The computational effort required to measure so many galaxies' clustering is not negligible, meaning that for an initial assessment of cosmology, randomly downsampling could be a practical choice. We revisit this assessment when analyzing the constraints on the SC-SAM astrophysical parameters in Section 5.2.</p></div>
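<div xmlns="http://www.tei-c.org/ns/1.0"><p>Random downsampling to a fixed number density amounts to keeping a fixed number of objects per simulation volume: 0.005 h&#179; cMpc&#8315;&#179; in a (100 h&#8315;&#185; cMpc)&#179; box corresponds to 5000 objects. The following is a minimal Python sketch of that step, assuming a catalog stored as a NumPy array. The function name, the mock stellar masses, and the mock positions are hypothetical and are not taken from the CAMELS-SAM pipeline.</p><p>
```python
import numpy as np


def downsample_to_density(catalog, density, box_size, rng):
    """Randomly keep round(density * volume) rows, without replacement."""
    target = int(round(density * box_size**3))
    if target >= len(catalog):
        return catalog  # already at or below the target density
    keep = rng.choice(len(catalog), size=target, replace=False)
    return catalog[keep]


rng = np.random.default_rng(42)
box_size = 100.0  # h^-1 cMpc, side length of a CAMELS-SAM volume

# Made-up catalog: stellar masses (solar masses) and positions in the box.
stellar_mass = 10 ** rng.uniform(8.0, 11.5, size=50_000)
positions = rng.uniform(0.0, box_size, size=(50_000, 3))

selected = positions[stellar_mass > 1e9]  # threshold selection comes first
sample = downsample_to_density(selected, 0.005, box_size, rng)
print(len(sample))  # 5000
```
</p><p>Skipping the call to the downsampling function leaves the number of selected objects free to vary from simulation to simulation, which is the situation tested in this section.</p></div>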
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Cosmological Constraints: Effect of Redshift Choice</head><p>Next, we briefly examine the effect of combining multiple redshifts when training our neural networks. <ref type="foot">26</ref> With real galaxy observations, it would be difficult to make identical galaxy selections at multiple redshifts, especially those as observationally different as z = 0 versus z = 1. Therefore, we create single-redshift samples at z = {0.0, 0.1, 0.5, 1.0} to examine the constraints at single redshifts for a few types of selections.</p><p>At each redshift, we randomly downsample to our selected number density, meaning each redshift measures the clustering of a different randomly selected sample of objects. We also include more radii in the CiC distribution, as using one redshift at a time allows for more spatial scales to be included (see Table <ref type="table">3</ref> for full details; in essence, we sample more distance scales for more n count-in-cells measurements). We use the same VPF and 2ptCF as in previous sections.</p><p>Summary Figure <ref type="figure">7</ref> presents the calculated rMSE errors across the test set for all selections probed; the detailed results are tabulated in Appendix D Table <ref type="table">6</ref>. We discuss the SC-SAM parameter constraints when data from a single redshift at a time are used in 5.2.1. Additionally, Appendix D Figures <ref type="figure">16</ref> and <ref type="figure">17</ref> focus on results from experiments using halo mass and stellar mass selections at high and low density, respectively.</p><p>As expected, combining the information from multiple redshifts improves the constraints. Networks trained on clustering from a single redshift find slightly worse constraints, with fractional errors increasing by at least 5% on &#937; M and 1%-5% on &#963; 8 . 
We note that one or two extreme outlier test simulations are more common in the neural networks trained on single redshifts (see the visible red star outliers in several of the SC-SAM parameters in Figures <ref type="figure">18(c)</ref> and <ref type="figure">(b)</ref>). Constraints on &#937; M and &#963; 8 worsen by at least a few percent within the two types of mass selection we probe for individual redshifts. The individual redshifts have comparable rMSE errors across all parameters. There may be a degradation in &#937; M constraints with decreasing redshift, but the pattern is not conclusive.</p><p>We remind readers that, for the single-redshift experiment, the VPF and 2ptCF were kept identical but the CiC distributions were expanded to use the longer arrays the neural network can easily handle (Section 3.2.1 and Table <ref type="table">3</ref>). Interestingly, we note here that including clustering at more redshifts improves the constraints much more than including more of the CiC information at a single redshift. As Table <ref type="table">6</ref> and summary Figure <ref type="figure">7</ref> show, the constraints from "all" clustering at four redshifts always improve for &#963; 8 and nearly always improve for &#937; M , often dropping several percentage points. However, this may be due to the neural network independently learning the growth factor (a factor that determines the growth of density perturbations as a function of cosmology, especially &#937; M ). We do not find strong evidence for a particular redshift outperforming another.</p><p>Though we have not run experiments for each iteration and combination of clustering selections, our initial results lend credence to prioritizing getting samples of the same type of galaxies at different redshifts rather than measuring more detailed clustering statistics for a single sample, or measuring clustering across a broader range of scales. 
For example, a galaxy sample for which this is likely feasible in the near future is H&#945; and [O III] emission-line galaxies confirmed with photometric redshifts, either from large-scale structure surveys like DES <ref type="bibr">(Abbott et al. 2018)</ref>, or narrowband surveys like HiZELS <ref type="bibr">(Khostovan et al. 2018)</ref> and LAGER <ref type="bibr">(Khostovan et al. 2020)</ref>.</p><p>Finally, we note that each CAMELS-SAM simulation has halo and galaxy information at 100 snapshots between z = 20 and z = 0, and we have only probed z &#8804; 1 in this initial work. We also re-emphasize the caveat that we did not use exactly the same CiC for the single-redshift training as we did with all four redshifts (see Table <ref type="table">3</ref>): we include more distributions at a slightly larger range, in an attempt to leverage as much information as we could give the simple 1D:1D neural network at a single redshift.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Constraining SC-SAM Parameters for Stellar and AGN Feedback</head><p>Next, we explore in detail the constraints our neural networks obtain on the SC-SAM parameters controlling baryonic processes related to stellar and AGN feedback. All selections we tested are included in Figures <ref type="figure">3</ref> and <ref type="figure">7</ref>, and are detailed in Appendices C and D, Tables <ref type="table">4</ref>, <ref type="table">5</ref>, and 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Astrophysical Constraints: Comparing Galaxy Selections</head><p>First, we compare how the galaxy property used for selection affects how well our neural networks are able to constrain astrophysical parameters from galaxy clustering. See upcoming Section 5.2 for a discussion about how constraints on the SC-SAM parameters change with no downselection and at individual redshifts, and Section 6.2 for how individual clustering statistics perform for these parameters.</p><p>As introduced in Section 4.1, we select SC-SAM galaxies based on stellar mass, star formation rate, and specific star formation rate. Here, we examine the constraints that our experiments find on the SC-SAM astrophysical parameters A SN1, A SN2, and A AGN. We remind readers that all neural networks have been asked to constrain all five parameters at the same time (with the exception of the "focused" experiment in the upcoming Section 5.3), and that our LFI loss function (described in Section 3.2.2) has been used specifically for its strength in pulling out the influence of weaker parameters. The results for constraints on the SC-SAM astrophysical parameters are summarized in Figure <ref type="figure">3</ref> and detailed in Appendix C Table <ref type="table">4</ref>, and select neural network constraint examples can be seen in Appendix C. We note that poor fits on the SC-SAM parameters correspond to rMSE and &#963; errors near unity.</p><p>Of the core selections with "all" clustering with density downsampling, we find that each selection yields good constraints in different circumstances. Stellar mass and sSFR do well at constraining the parameter A SN2 (0.5 &lt; &#963; &lt; 0.8), and find moderate (0.6 &lt; &#963; &lt; 0.8, stellar mass) or very poor (&#963; = 1, sSFR) constraints on A SN1. Both are outperformed by SFR selections, especially at high-density downsamplings. All selections in this category find poor to no constraints on A AGN. 
As we will explore later, removing the density downsampling vastly improves constraints on the SC-SAM parameters.</p><p>The high-density SFR selection does well at constraining both SC-SAM A SN parameters, which is not surprising given that those parameters control the normalization and slope of the mass outflow rate driven by stellar feedback, regulating star formation in galaxies. Specific SFR, as the simple combination of stellar mass and star formation, unfortunately does not show "the best of both worlds" and does not improve on the results from stellar mass or SFR selections separately (and as explored in Section 4.1.2, it also does not stand out for cosmological constraints).</p><p>The poor to nonexistent constraints on A AGN may be due to the fact that it only affects the properties of galaxies that are much more massive than our selection limit (see Figure <ref type="figure">12</ref> in Appendix A). In Section 5.3, we attempt to improve these astrophysical constraints by having the neural networks learn one parameter at a time.</p><p>Finally, we note some trends in the constraints obtained with different galaxy selections. As seen in Figure <ref type="figure">3</ref>, the parameter A SN2 (like &#937; M) is better constrained with lower stellar mass and lower SFR selections (higher number density), and the dependence on mass or SFR is fairly strong. A SN1 appears to show the inverse behavior, with better constraints from higher-mass selections that yield fewer galaxies (but less of a clear trend with SFR selection). Marginally better constraints on A AGN may also be obtained with higher-mass (lower-density) selections, likely because this parameter has the greatest effect on the highest-mass, and therefore most star-forming, galaxies (e.g., Appendix A Figure <ref type="figure">12</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Astrophysical Constraints: Effect of Number Density and Redshift</head><p>Next, we report the effects of using clustering at one redshift and not downsampling to a single number density on the constraints on the astrophysical parameters A SN1 , A SN2 , and A AGN .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.1.">One Redshift versus Multiple</head><p>We find no strong trend with redshift in the quality of the constraints on the astrophysics parameters (Appendix D, Table <ref type="table">6</ref> and Figure <ref type="figure">18</ref>), much like what was found with the cosmological parameters in Section 4.3. A SN2 constraints are noninformative (i.e., they are close to the mean of the prior) at the individual redshifts, and they are only somewhat constrained with all four redshifts combined. The constraints on A AGN remain very poor at individual redshifts. A SN1 is still decently constrained across redshifts. Interestingly, "all" four redshifts combined find slightly worse rMSE constraints on A SN1 than the individual redshifts, though we note the LFI loss prediction finds &#963; = 0.621, slightly better than three of the four constraints. For the SC-SAM supernova parameters, there therefore may be no loss of constraining power when focusing on a single redshift.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.2.">Effect of Density Downsampling</head><p>Throughout our process of narrowing down what galaxy selections we would undertake, we found that varying the SC-SAM parameters for stellar feedback created large variations in the number of galaxies that pass a given stellar mass or star formation rate cut. This is not surprising, especially with, e.g., the strong influence the A SN parameters have on the stellar mass functions, as seen in Appendix A. However, how much do they affect the clustering of galaxies?</p><p>We revisit summary Figure <ref type="figure">3</ref> and Appendix C Figure <ref type="figure">15</ref> and Table <ref type="table">5</ref> to assess the constraints found on the SC-SAM parameters A SN1, A SN2, and A AGN. Without downselecting to fixed number density, we find significantly improved constraints on the SC-SAM parameters for both SFR- and stellar-mass-selected samples, often decreasing &#963; by a tenth or more. Both types of selections get imprecise but not inaccurate constraints on A SN2. Stellar-mass-selected samples produce better constraints on A SN1, while stellar-mass- and SFR-selected samples show the first signs of finding any constraint on the elusive A AGN parameter.</p><p>Appendix A uses the 1P catalogs to explore how the SC-SAM parameters affect key galaxy relationships at either very low or high values. We find that the SN parameters show strong influence on relationships involving stellar mass, especially the stellar mass function (SMF) and stellar mass&#8211;halo mass function. The AGN parameter has the strongest influence at high halo masses. Therefore, it is not surprising that including the additional information on galaxy number density would improve the SC-SAM parameter constraints.</p><p>The VPF and CiC are quite sensitive to number density (the VPF especially, essentially constraining it alone). 
Including the number density as additional information for a neural network to leverage gives it a lot of knowledge about the SMF and therefore allows it to more easily learn the effect of the SC-SAM parameters. This is perhaps why A AGN is finally being constrained at all: the VPF and CiC sense whatever small influence it may have on galaxies' stellar mass and sSFR, and the neural network has more information to learn its relationships.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.">Astrophysical Constraints: Focused Neural Networks</head><p>We have explored how various galaxy selections and choices around clustering statistics affect constraints on cosmological parameters and SC-SAM astrophysics parameters when all are constrained at once. However, how much improvement can we find on our constraints for the SC-SAM parameters around stellar and AGN feedback by focusing on them one at a time?</p><p>These experiments function just like those already described, but the neural networks are asked to predict only one parameter at a time, given some array of galaxy clustering. We first confirm that this method works by isolating the cosmological parameters; our constraints do not improve when a neural network focuses on &#937; M or &#963; 8 alone given dark-matter-only or stellar-mass-selected galaxy clustering. This likely indicates that the full five-parameter predictions are accurately pulling out their influence. See Appendix D Table <ref type="table">7</ref> and Figure <ref type="figure">22</ref> for detailed comparisons.</p><p>Next, from our various experiments, we determine which galaxy and clustering selections may give the most useful constraints on the SC-SAM parameters. For the A SN parameters, we select a strong stellar mass selection with no density downsampling. For A AGN, we select a strong star formation rate selection with no downsampling. For both, we use "all" clustering and all four redshifts.</p><p>Our best constraints for the SC-SAM parameters when training networks to focus on them alone are detailed in Figure <ref type="figure">22</ref> and Appendix D Table <ref type="table">7</ref>. They are also included in summary Figure <ref type="figure">3</ref> for easier comparison against the neural networks that constrained all parameters at once (slightly darker plus signs). The rMSE constraints on A SN1 and A SN2 are only slightly improved by individualized training. 
Similarly, the rMSE constraints on A AGN are slightly improved with the focused approach, which we partially attribute to this parameter being the most subtle of the SC-SAM parameters with weak influence (see Appendix A's Figure <ref type="figure">12</ref>). The LFI-predicted errors &#963; when constraining all five parameters at once give constraints very similar to or slightly better than those of the focused parameter rMSEs.</p><p>Though the focused neural networks did not yield the desired outcome of stronger constraints on the SC-SAM astrophysical parameters, this exercise communicates the strength of our neural network implementation. The "Focused NN" results we summarize in Figure <ref type="figure">3</ref> are all either nearly identical to or slightly worse than the &#963; that our LFI loss predicts when predicting all five parameters at the same time. In theory, focusing the entire breadth of a neural network to predict only one parameter will lead to the tightest constraints the architecture is able to find. This result therefore indicates that the LFI loss is behaving exactly as advertised: it can indeed learn noisy and less sensitive parameters even in the presence of strongly influential parameters.</p><p>As we describe in Section 3.2.2, the LFI loss function removes from the scatter of a parameter the dependence of the overall space, which in practice in CAMELS and CAMELS-SAM means it is able to learn parameters with relatively subtle effects, such as A AGN, while also learning &#937; M and &#963; 8. Finally, we also ran focused neural networks for the cosmological parameters and found their constraints showed very little difference from those obtained when constraining all five parameters, additionally confirming the LFI loss method has optimized the constraints on the strongest parameters.</p></div>
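<div xmlns="http://www.tei-c.org/ns/1.0"><p>Section 3.2.2 is not reproduced here, so the following Python sketch only illustrates the kind of loss discussed above. It assumes the moment-network form that is common in the CAMELS literature, in which the network outputs a mean and an error &#963; for every parameter, and the errors are summed over the batch and passed through a logarithm separately for each parameter. The exact expression used in this work may differ; the function name and the toy numbers are illustrative.</p><p>
```python
import numpy as np


def moment_loss(theta, mu, sigma):
    """Assumed moment-network loss for arrays of shape (batch, n_params)."""
    sq_err = (theta - mu) ** 2
    # First term: pushes mu toward the posterior mean of each parameter.
    mean_term = np.log(np.sum(sq_err, axis=0))
    # Second term: pushes sigma**2 toward the squared error of that mean.
    var_term = np.log(np.sum((sq_err - sigma**2) ** 2, axis=0))
    return float(np.sum(mean_term + var_term))


# Toy batch of two simulations and two parameters (made-up numbers).
theta = np.array([[0.30, 0.80], [0.40, 0.70]])  # true parameter values
mu = np.array([[0.31, 0.82], [0.38, 0.71]])     # predicted means
abs_err = np.abs(theta - mu)

loss_good = moment_loss(theta, mu, 1.1 * abs_err)  # errors close to the truth
loss_bad = moment_loss(theta, mu, 3.0 * abs_err)   # errors overestimated
print(loss_bad > loss_good)  # True
```
</p><p>Under this assumed form, each parameter enters through its own logarithm, so rescaling the errors of one parameter shifts the loss by a constant and does not change the gradients of the others. This is one plausible reading of why a weakly constrained parameter such as A AGN can be learned alongside &#937; M and &#963; 8.</p></div>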
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Comparing Constraints from Different Statistics: 2ptCF versus VPF versus CiC</head><p>Throughout this work, we have obtained constraints using the combined results of the 2ptCF, CiC, and VPF. This leads to natural questions: how much is each statistic contributing? Is one better than the others for specific constraints? What statistics are worth investing computational time into?</p><p>In this section, we compare the constraining power of each of the clustering statistics that we have used. We train neural networks keeping all choices but the clustering statistics the same. We test several galaxy selections, making sure to keep the density, the radii tested, and redshifts selected (z = 0.0, 0.1, 0.5, 1.0) the same for each clustering statistic. We compare each clustering statistic's constraints against the combination of "all" clustering to determine which statistic may be dominating a given constraint for a given selection.</p><p>Summary Figure <ref type="figure">8</ref> shows the 1&#963; error predictions for &#937; M and &#963; 8 (top half), and A SN1 and A SN2 (bottom half). Appendix D Tables <ref type="table">8</ref> and <ref type="table">9</ref> detail all test results.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.">Clustering Statistic Experiment Setup</head><p>We test the constraints from individual clustering statistics across several halo and galaxy selections: two thresholds of halo mass, three thresholds of stellar mass, and one of SFR, all randomly downsampled to a number density of either 0.001 h&#179; cMpc&#8315;&#179; or 0.005 h&#179; cMpc&#8315;&#179;. The exact details of how we measure and prepare the 2ptCF, CiC, and VPF for our neural networks are explained in Sections 3.1.2&#8211;3.2.1. In this experiment, when giving a neural network the individual clustering statistics, we pass exactly the same values that go into "all" clustering, allowing for fair comparison.</p></div>
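<div xmlns="http://www.tei-c.org/ns/1.0"><p>One way to guarantee that a single-statistic input contains exactly the values that enter the "all" clustering input is to build both from the same per-redshift measurements. The following Python sketch illustrates this under stated assumptions: the array lengths, the dictionary layout, and the function name are hypothetical (the real lengths are set by Table 3), and random numbers stand in for measured statistics.</p><p>
```python
import numpy as np

# Hypothetical number of values per statistic at each redshift.
N_2PTCF, N_CIC, N_VPF = 20, 150, 30
REDSHIFTS = (0.0, 0.1, 0.5, 1.0)


def build_input(stats_by_z, use=("2ptcf", "cic", "vpf")):
    """Concatenate the chosen statistics over all redshifts into a 1D array."""
    parts = [stats_by_z[z][name] for z in REDSHIFTS for name in use]
    return np.concatenate(parts)


rng = np.random.default_rng(1)
# Stand-in measurements for one simulation (random numbers, not real data).
stats_by_z = {
    z: {
        "2ptcf": rng.normal(size=N_2PTCF),
        "cic": rng.random(N_CIC),
        "vpf": rng.random(N_VPF),
    }
    for z in REDSHIFTS
}

all_input = build_input(stats_by_z)
vpf_only = build_input(stats_by_z, use=("vpf",))
print(all_input.size, vpf_only.size)  # 800 120
```
</p><p>Because both arrays are drawn from the same dictionary, any difference in the resulting constraints can be attributed to the choice of statistic and not to a difference in the measurements themselves.</p></div>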
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.">Results: Constraints by Statistic</head><p>We summarize how the 2ptCF, CiC, and VPF constrain &#937; M and &#963; 8 in the upper half of summary Figure <ref type="figure">8</ref>; detailed results can be found in Appendix D Table <ref type="table">8</ref>. The lower half of Figure <ref type="figure">8</ref> (and Appendix D Table <ref type="table">9</ref>) focuses instead on the SC-SAM parameters controlling stellar feedback in Equation (1). We note that the AGN feedback parameter A AGN from Equation (2) is generally difficult to constrain, such that each clustering statistic alone (or even all combined) cannot constrain it beyond the mean of the prior under the selections we probe.</p><p>Across the many selections we attempt, patterns emerge in how cosmological constraints are affected by the choice of clustering statistic. First, as expected, the combination of multiple clustering statistics nearly always improves the constraints found, especially for the galaxy selections. The clustering statistics often find similar constraints on &#963; 8 . The 2ptCF tends to do best at &#963; 8 for lower-density downsampling selections, which is not unexpected: the 2ptCF, via its connection to the power spectrum, is a close measurement of the density fluctuations of the Universe. Interestingly, the VPF often yields better constraints on &#937; M than the 2ptCF and CiC; our thoughts on why are discussed in 6.3.</p><p>Unlike the cosmological parameters, there is notable improvement in constraining A SN1 and A SN2 when all clustering statistics are combined. There is some evidence across the galaxy selections that the VPF may drive the bulk of the constraints on A SN1 and A SN2 . 
CiC and 2ptCF mostly perform comparably to each other, and the VPF tends to also perform similarly for A SN2 across the bulk of our selections.</p><p>Examination of the neural network results in Appendix D Figure <ref type="figure">21</ref> helps give important context to some of the constraints we find. That specific neural network predicted low-value A SN1 slightly more accurately than the high A SN1 values, while maintaining a still generally flat dependence across the whole parameter space; this leads to deceptively lower errors given the performance. For a visual example of this phenomenon, compare this parameter's constraints in Figure <ref type="figure">21</ref>(c) against the others. Table <ref type="table">9</ref> indicates that the VPF yields the best constraints on the A SN1 and A SN2 feedback parameters (and therefore likely dominates the "all" clustering neural network results), but this is not an evident pattern in the figures themselves and could be a statistical anomaly.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3.">Discussion: Comparing Clustering Statistics</head><p>The clustering statistics we probe often find similar constraints on &#963; 8, with some evidence that the 2ptCF tends to do best at &#963; 8 for lower-density downsampling selections. As the Fourier transform of the power spectrum, which describes the amplitude of density fluctuations across distance scales, the 2ptCF is indeed expected to constrain &#963; 8 well, despite higher Poisson noise at the lower density. We also find that the VPF often results in better constraints on &#937; M than the 2ptCF and CiC. The VPF also appears to drive the bulk of the weak constraints on the SAM SN parameters. Why might this be?</p><p>First, let us consider our approach to CiC. CiC in its entirety contains all information from all orders of correlations <ref type="bibr">(Uhlemann et al. 2020)</ref>, and it might therefore be expected to give the most robust constraints (e.g., <ref type="bibr">Samushia et al. 2021</ref>). However, due to the limited amount of data the neural network can take in and still promptly converge on a solution, we cannot use the full range of our CiC measurements. The input 1D array of clustering should have fewer than 1000 values, so that our fully connected neural networks do not spiral into an unwieldy size with each new layer of neurons. For example, a fully connected network of four layers with 1000 neurons each, with an input of 10,000 elements and an output of five elements, would have 10,000 &#215; 1000&#8308; &#215; 5 internal relationships to consider when optimizing the network. This can be feasible with powerful GPUs and patience, but the amount of time it takes to train grows very quickly. 
In our work, even sampling to cells with fewer than 50 galaxies (n = 50 in the terminology of Table <ref type="table">3</ref>) for 10 distance scales between 1 and 40 cMpc yields 500 data points for a single redshift.</p><p>The default of our current setup uses four redshifts in the range 0 &#8804; z &#8804; 1, and in this scenario we instead sample CiC at three distance scales where n = 0&#8211;50, taking care to choose distance scales that show the differences between the simulations, as well as aiming to sample evenly across approximately 5&#8211;40 cMpc. Additionally, taking distance scales that are far apart helps reduce the correlation between measurements (e.g., see <ref type="bibr">Gangolli et al. 2021</ref> for an examination of correlations within VPF measurements, and Uhlemann et al. 2020 for CiC). Table <ref type="table">3</ref> fully describes how we sampled a small part of the full CiC distribution due to computational limitations, both for our default four-redshift networks as well as when focusing on one redshift at a time in Sections 4.3 and 5.2. Though we have sampled a small part of the full CiC distribution to reduce computational load, we still find the CiC inputs offer competitive constraints and sometimes outperform the 2ptCF.</p><p>We note here the relevant study of <ref type="bibr">Wang et al. (2019)</ref> that reached similar conclusions within a different framework. Using the "decorated" HOD <ref type="bibr">(Hearin et al. 2016)</ref> that includes galaxy assembly bias, they used a complete Fisher matrix analysis to probe how robustly various clustering statistics constrain assembly bias. They compared the projected 2ptCF (which projects the 3D 2ptCF into the dimensions of galaxy observations in R.A., decl., and degrees), the VPF, the galaxy-galaxy lensing signal, and variations on CiC such as count-in-cylinders and count-in-annuli and probability distributions of them. 
Their work strongly motivates including CiC statistics alongside the popular projected 2ptCF and lensing signal for efficient constraints on galaxy assembly bias. Specifically, the VPF was good at constraining the number of central galaxies (not surprisingly, since it is a binary statistic that finds only empty test spheres and will be less sensitive to satellites), while the varied CiC refined the number of centrals and satellites well.</p><p>Our handling of CiC may give an explanation for why the VPF performs well. The VPF is the zeroth moment of counts-in-cells, i.e., the n = 0 measurement. Because it yields a single value at each distance scale, we are able to include many more distance scales when preparing our data for a neural network. Therefore, the VPF serves as a sampling of the CiC and all the higher-order moments at many distance scales <ref type="bibr">(White 1979)</ref>. Future work may therefore benefit from including the VPF and similar statistics (e.g., the k-NN statistics proposed by Banerjee &amp; Abel 2021) alongside the popular 2ptCF.</p></div>
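<div xmlns="http://www.tei-c.org/ns/1.0"><p>The relationship between CiC and the VPF described above can be sketched in a few lines of Python. The example below drops random test spheres into a periodic box of unclustered mock points, builds the CiC distribution for n = 0 to 49 at 10 radii (500 values, as in the count quoted above), and reads off the VPF as the n = 0 column (10 values). The function names, the radii, the number of test spheres, and the brute-force pair counting are all illustrative assumptions and do not reproduce the measurement code used in this work.</p><p>
```python
import numpy as np


def counts_in_spheres(points, centers, radius, box):
    """Number of points inside each test sphere, with periodic boundaries."""
    diff = centers[:, None, :] - points[None, :, :]
    diff -= box * np.round(diff / box)  # minimum-image convention
    dist2 = np.sum(diff**2, axis=-1)
    return np.sum(radius**2 >= dist2, axis=1)


def cic_distribution(counts, n_max):
    """P_n for n = 0 .. n_max - 1: fraction of spheres with exactly n points."""
    hist = np.bincount(counts, minlength=n_max)[:n_max]
    return hist / counts.size


rng = np.random.default_rng(0)
box = 100.0  # h^-1 cMpc
points = rng.uniform(0.0, box, size=(1000, 3))   # 0.001 h^3 cMpc^-3, unclustered
centers = rng.uniform(0.0, box, size=(1000, 3))  # random test-sphere centers
radii = np.linspace(2.0, 20.0, 10)               # hypothetical distance scales

cic = np.array(
    [cic_distribution(counts_in_spheres(points, centers, r, box), 50) for r in radii]
)
vpf = cic[:, 0]  # the void probability function is the n = 0 entry at each radius
print(cic.size, vpf.size)  # 500 10
```
</p><p>The factor of 50 between the two array sizes illustrates why, for a fixed input length, the VPF can be sampled at many more distance scales than the full CiC distribution.</p></div>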
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">CAMELS-SAM Discussion and Implications</head><p>This work is a proof of concept, both for the power of the CAMELS-SAM simulation suite and the use of SAM-generated galaxy catalogs, galaxy clustering, and neural networks to constrain cosmology and galaxy formation. In this section, we discuss implications of our analyses.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.1.">Comparison with Original Hydrodynamic CAMELS</head><p>Some readers may wonder if the analysis carried out in this work can be done to any extent with the original CAMELS suites. Important to the conception of CAMELS-SAM was the difficulty of applying galaxy selections across the entire CAMELS hydrodynamic suite. For example, if one tries to select any objects with nonzero stellar mass, the SIMBA "hump" will often easily find several thousand galaxies in the (25 h -1 cMpc) 3 volumes, while 10 or more percent of the IllustrisTNG "hump" would struggle to get more than a hundred galaxies. <ref type="foot">27</ref> We found there is no basic galaxy property selection that yields enough objects for acceptable Poisson noise and a large enough training set across both the SIMBA and TNG humps, even if not downsampling to a single density. <ref type="foot">28</ref> We are therefore unable to robustly explore how well galaxy clustering marginalizes over multiple hydrodynamic astrophysics models to measure cosmology.</p><p>One of the most stringent CAMELS selections we are able to make across both hydrodynamic "humps" is a halo mass cut of M halo &gt; 2 &#215; 10 10 h -1 M &#8857; . We make this cut and then randomly downsample to a number density of 0.064 h 3 cMpc -3 (1000 galaxies in each volume). We apply this to both the IllustrisTNG-DM and SIMBA-DM humps, and update our clustering to account for the smaller volume (and therefore allowed distance scales). <ref type="foot">29</ref> Using the clustering of these CAMELS-DM halos, we find rMSE constraints of 0.031 for &#937; M and 0.081 for &#963; 8 (each about 10% of the fiducial values &#937; M = 0.3 and &#963; 8 = 0.8).</p><p>Cosmic variance in CAMELS, i.e., the variance due to creating cosmological volumes of just (25 h -1 cMpc) 3 for each combination of parameters, results in noisy clustering statistics and makes the neural network predictions less accurate. 
Our choice to create much larger volumes at lower mass resolution improved the predictive power of our neural networks, both in terms of decreasing the effect of cosmic variance and expanding the galaxy selection and scale of clustering measurements we are able to carry out. This exercise further motivates the creation of larger CAMELS hydrodynamic wings, as well as the development of other techniques such as "next-generation" SAMs, which are specifically designed to emulate the results of specific hydrodynamic simulations.</p></div>
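<div xmlns="http://www.tei-c.org/ns/1.0"><p>The downsampling arithmetic quoted in this subsection can be checked with a short sketch: a number density of 0.064 h^3 cMpc^-3 in a (25 h^-1 cMpc)^3 box corresponds to 1000 objects. The helper names and the stand-in catalog of integer indices are hypothetical and serve only to illustrate random downsampling to a fixed density.</p>

```python
import random


def target_count(number_density, box_side):
    """Number of objects that gives `number_density` in a cubic box.

    number_density is in h^3 cMpc^-3 and box_side is in h^-1 cMpc.
    """
    return round(number_density * box_side ** 3)


def downsample(catalog, number_density, box_side, seed=0):
    """Randomly keep just enough objects to reach the target number density."""
    n_keep = target_count(number_density, box_side)
    if n_keep > len(catalog):
        raise ValueError("catalog is too sparse for the requested density")
    return random.Random(seed).sample(catalog, n_keep)


# 0.064 h^3 cMpc^-3 in a (25 h^-1 cMpc)^3 box corresponds to 1000 objects.
print(target_count(0.064, 25.0))
# A stand-in catalog of 3000 object indices, thinned to that density.
kept = downsample(list(range(3000)), 0.064, 25.0)
print(len(kept))
```

<p>The same helper applied to a (100 h^-1 cMpc)^3 CAMELS-SAM volume at 0.005 h^3 cMpc^-3 gives 5000 objects, which illustrates how much larger a sample the bigger boxes allow at a fixed density.</p></div>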
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.2.">The Influence of Astrophysics on Cosmological Parameter Inference</head><p>Our work with the clustering in CAMELS-SAM reaches two key conclusions: first, that we are able to successfully marginalize over astrophysics to constrain cosmology to a precision nearly as good as with dark-matter-only clustering; and second, that we are able to learn something about the astrophysical parameters at the same time. Some readers may wonder just how much the presence of astrophysical variations in CAMELS-SAM affects our constraints.</p><p>First, we can explore how much information is lost to astrophysics in our CAMELS-SAM clustering analysis. We compare the results presented earlier in this work (where neural networks work with the clustering of galaxy catalogs generated using parameters across a 5D Latin hypercube) with the results for a neural network trained on the clustering of galaxy catalogs whose generating parameters cover only a 2D cosmological space. Appendix E details this experiment.</p><p>We find that the cosmological constraints stay the same or improve when the networks are trained on the clustering of galaxy catalogs run with only the fiducial SC-SAM prescriptions. The rMSE errors slightly worsen for both &#937; M and &#963; 8 , but the LFI loss errors remain the same for &#937; M and slightly improve for &#963; 8 . We conservatively believe that some of the improvement in the &#963; 8 constraints is due to the SC-SAM reinforcing or encoding information from the merger trees. However, this experiment shows that including the astrophysical parameters in the inference does not lead to lost information on cosmology from galaxy clustering.</p><p>In a secondary experiment, we test the constraints when instead using the clustering of all galaxies above a high stellar mass threshold (and therefore introducing variations in number density for the neural networks to use). 
This scenario finds remarkable improvements in both cosmological parameters, though especially in &#937; M (not surprisingly, as increasing the total mass density will increase the total number of halos formed). We attribute this to the neural networks not having to work around degenerate effects on the number density that the SC-SAM parameters cause. However, the two experiments taken together emphasize the importance of including variations in astrophysics in cosmological parameter inference: the effects of astrophysics are many and not well understood, and not marginalizing over them might lead to spuriously optimistic constraints.</p><p>One might also wonder: How robust is the cosmological inference to the number of SC-SAM parameters varied? This work only begins to probe this question, comparing constraints with zero versus three SC-SAM parameters varied. As shown earlier in this section and in Appendix E, cosmological constraints remain quite robust under three SC-SAM parameters, with the LFI loss maintaining the precision and accuracy found within our dark-matter-only clustering tests.</p><p>We believe this is due to the properties of the LFI loss function our networks use (see Section 3.2.2) and that we have a large enough Latin hypercube that the networks can learn the behavior of all five CAMELS-SAM parameters. Projects in progress within CAMELS-SAM and CAMELS will test this completely with the SC-SAM and the IllustrisTNG models, by varying all available parameters in an expanded Latin hypercube or Sobol sequence <ref type="bibr">(Sobol 1967)</ref>, respectively.</p></div>
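<div xmlns="http://www.tei-c.org/ns/1.0"><p>For readers unfamiliar with Latin hypercube sampling, the following minimal sketch shows the idea in two dimensions, using the cosmological ranges quoted in this work (Omega_M in [0.1, 0.5] and sigma_8 in [0.6, 1.0]). The function name is hypothetical, the astrophysical dimensions are omitted, and this is not the generator used to build the CAMELS-SAM parameter set.</p>

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points with exactly one point per stratum in each dimension."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of n_samples equal-width strata of [0, 1).
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so the strata are paired at random across dimensions.
        rng.shuffle(unit)
        columns.append([low + u * (high - low) for u in unit])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Cosmological ranges quoted in this work: Omega_M in [0.1, 0.5], sigma_8 in [0.6, 1.0].
samples = latin_hypercube(8, [(0.1, 0.5), (0.6, 1.0)])
for omega_m, sigma_8 in samples:
    print(f"Omega_M = {omega_m:.3f}, sigma_8 = {sigma_8:.3f}")
```

<p>Because every one-dimensional stratum is occupied exactly once, a modest number of simulations still covers the full range of each parameter, which is the property that lets a network learn the behavior of every varied parameter across the prior.</p></div>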
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.3.">CAMELS-SAM among Other Simulations</head><p>Next, we discuss CAMELS-SAM in context with other large simulation suites, as well as some future work and possible experiments that CAMELS-SAM enables.</p><p>The N-body portion of CAMELS-SAM straddles a unique point between the limitations of computing power, data storage capacity, and useful scientific application, especially for machine learning. This part of CAMELS-SAM comprises over 1000 unique simulations of a moderately large volume, with moderate mass resolution (sufficient to robustly resolve galaxy properties relevant to upcoming observational samples), with 100 snapshots saved throughout 20 &gt; z &gt; 0, as well as sampling of a very broad cosmological parameter space in &#937; M and &#963; 8 . Individually, other simulation suites may be comparable or superior in each of these aspects, but CAMELS-SAM combines them in a way that fills a unique role.</p><p>There exist N-body simulation suites that are much larger and/or higher resolution, e.g., BACCO <ref type="bibr">(Angulo et al. 2021</ref><ref type="bibr">), Aemulus (DeRose et al. 2019)</ref>, ABACUS Cosmo/Summit <ref type="bibr">(Garrison et al. 2021;</ref><ref type="bibr">Maksimova et al. 2021)</ref>, Uchuu <ref type="bibr">(Ishiyama et al. 2021)</ref>, and Dark Quest <ref type="bibr">(Nishimichi et al. 2019</ref>), but they may not be as well suited for training neural networks or for running semi-analytic models. For example, comparable simulation suites often contain significantly lower numbers of realizations (e.g., from several dozen to several hundred), which risks providing a small training set. <ref type="foot">30</ref> Next, many of these suites cover a narrower cosmological parameter space, which incurs the risk of neural network results being too tightly focused around the priors <ref type="bibr">(Villaescusa-Navarro et al. 
2022b</ref>; see, e.g., Figure <ref type="figure">6</ref> of <ref type="bibr">Ntampaka et al. 2020</ref> with ABACUS Cosmo, who had to restrict their cosmological space in &#963; 8 , due to biasing at the edges). Additionally, many of these suites solved the volume-resolution-data storage balance by saving a small number of snapshots (often 10-50, with as many as 65 or as few as 5). Though ideal for their specific science goals, this limits the possibility of running SAMs to generate galaxies, as SAMs require densely sampled merger tree histories. Finally, we also point out N-body simulation suites that do not vary cosmological parameters, but which are well-suited to study other key features of large-scale structure and cosmology: Indra from <ref type="bibr">Falck et al. (2021)</ref>, comprising several hundred large volumes and many snapshots for excellent statistics; and UNIT from <ref type="bibr">Chuang et al. (2019)</ref>, consisting of several hundred large volumes at excellent resolution for nonlinear statistics.</p><p>Most comparable to CAMELS-SAM (beyond the hydrodynamic CAMELS "humps") is the Quijote project <ref type="bibr">(Villaescusa-Navarro et al. 2020b)</ref>. The Quijote suite is unique in that it covers an even broader cosmological parameter space than all of CAMELS: 7000 unique models over six cosmological parameters, including massive neutrinos. Quijote also has much larger volumes of 1 (h -1 Gpc) 3 (though at lower resolution), and an astounding 44,000 simulations. However, its five stored snapshots and lower mass resolution make it unsuitable for applications using merger trees, and therefore preclude it from being the backbone for CAMELS-SAM. 
Though it simulates only dark matter, the vast Quijote suite may be able to answer questions this work has not or cannot; for example, probing larger-scale clustering, using more clustering statistics, implementing more sophisticated statistical tools like the Fisher matrix, expanding the cosmological models probed, etc.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.4.">CAMELS-SAM Data Release and Possibilities</head><p>Like the previous CAMELS "humps," CAMELS-SAM was created to serve as a data set to train machine-learning tools to measure and analyze cosmology and uncertain aspects of astrophysics models. With the addition of the Santa Cruz SAM-generated galaxies atop large N-body-only volumes, CAMELS-SAM offers a unique data set for machine learning. As part of the CAMELS Public Data Release in <ref type="bibr">Villaescusa-Navarro et al. (2023)</ref>,<ref type="foot">foot_14</ref> we release the following:</p><p>1. The halo catalogs from ROCKSTAR. 2. The merger trees generated from CONSISTENTTREES. 3. The galaxy catalogs from the Santa Cruz SAM. 4. Documentation at <ref type="url">https://camels-sam.readthedocs.io/</ref>.</p><p>Those wishing to directly analyze the raw simulation snapshots of CAMELS-SAM should reach out to the CAMELS team. The raw data (full N-body snapshots across redshifts) have been stored on tape and may be retrieved upon request.</p><p>An intuitive next step with this work is to leverage other types of neural networks and use all available information, rather than just summary statistics. <ref type="bibr">Ntampaka et al. (2020)</ref> incorporated convolutional neural networks (cNNs; <ref type="bibr">Fukushima &amp; Miyake 1982;</ref><ref type="bibr">LeCun et al. 1999)</ref> for cosmological inference on maps of HOD-simulated galaxies, finding few-percent constraints on &#937; M and &#963; 8 . cNNs can extract image features, like edges, shapes, and textures, from 2D or 3D images, and they have been shown to remarkably improve cosmological constraints compared to standard statistical approaches. For example, <ref type="bibr">Ravanbakhsh et al. 
(2017)</ref> fed their cNN the full dark matter distribution and compared this to the classic method of maximum likelihood fitting, finding significantly better constraints with the cNN even with much smaller volumes. Later, <ref type="bibr">Pan et al. (2020)</ref> further tested cNN constraints and their robustness and biases when using 3D dark matter distributions to measure &#937; M and &#963; 8 . With its relatively large volumes and SAM-generated galaxies, CAMELS-SAM offers a great opportunity to train cNNs to constrain cosmology from galaxy maps.</p><p>The CAMELS collaboration has recently found great success in cosmological parameter inference with graph neural networks (gNNs). gNNs operate on data structured into nodes (the objects themselves) and their edges (how they interact), and they are able to leverage global and local relationships in the data <ref type="bibr">(Battaglia et al. 2018;</ref><ref type="bibr">Xu et al. 2018;</ref><ref type="bibr">Zhou et al. 2018;</ref><ref type="bibr">Hamilton 2020;</ref><ref type="bibr">Naidoo et al. 2020)</ref>. The first in CAMELS to use gNNs for cosmological inference were Villanueva-Domingo &amp; Villaescusa-Navarro (2022), who trained gNNs to perform likelihood-free inference of &#937; M at the galaxy-field level. With only z = 0 real-space galaxy positions, their gNNs were able to accurately predict the power spectrum of CAMELS galaxies and moderately constrain &#937; M ; by including galaxy properties, the gNNs find significantly tighter cosmological constraints (though unfortunately only when training and testing on the same hydrodynamic suite, leaving improved robustness to astrophysics as a goal for the team). Next, <ref type="bibr">Shao et al. 
(2023a)</ref> were able to robustly constrain &#937; M and &#963; 8 across several types of N-body codes and the CAMELS TNG and SIMBA hydrodynamic suites using information about dark matter halos (indicating perhaps some fundamental relation, as they probe and identify in <ref type="bibr">Shao 2022</ref><ref type="bibr">, Shao 2023b)</ref>. <ref type="bibr">de Santi et al. (2023)</ref> are training gNNs to predict &#937; M given galaxy properties, finding great robustness to the various CAMELS astrophysical models. With its larger volume and separate model for galaxy physics, CAMELS-SAM promises to yield great results using the full halo or galaxy distributions with gNNs. Beyond CAMELS but relevant to CAMELS-SAM, gNNs have also been used to accurately map the complete hyperplane and dispersion of galaxy properties onto merger trees, offering a path to more precisely and efficiently emulate the output of SAMs <ref type="bibr">(Jespersen et al. 2022)</ref>. Additionally, <ref type="bibr">Makinen et al. (2022)</ref> utilized information-maximizing neural networks <ref type="bibr">(Charnock et al. 2018</ref>) atop a gNN architecture of DM halo catalogs to constrain cosmology far beyond what two-point galaxy clustering could, even in a noisy survey.</p><p>Finally, what of the possibilities of doing the work of this project with observed galaxy clustering? If we hope to use neural networks to constrain the values of &#937; M and &#963; 8 for our Universe, as well as astrophysical feedback in the context of the SC-SAM, how must this analysis progress? A neural network's predictions are only as good as the data on which it is trained, meaning an analysis like the one we have presented will only succeed on observations if the training data accurately reflect them. Therefore, expanding this project to observed galaxy clustering requires first robustly generating realistic galaxies from the properties the SC-SAM produces. 
This would entail selecting a galaxy sample that CAMELS-SAM is large and resolved enough to simulate (e.g., perhaps the 2D clustering of emission line galaxies in a small but well-studied field like COSMOS; <ref type="bibr">Khostovan et al. 2018)</ref>, as well as a detailed understanding of the selection function and observational systematics for the sample (such as those that <ref type="bibr">Hahn et al. 2022</ref><ref type="bibr">, 2023</ref> include in their forward model for BOSS CMASS galaxies). Additionally, expanding to other SAMs (and their unique parameterization of galaxy physics) could improve the neural networks' ability to marginalize over many forms of astrophysical prescriptions and better constrain &#937; M and &#963; 8 for similar galaxy selections. A wide variety of future work is still required within CAMELS-SAM, as well as for the cosmological inference field as a whole, before directly working with observed galaxies.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">Conclusion</head><p>In this work, we present CAMELS-SAM, a new and bigger "hump" of the CAMELS project, along with an initial proof-of-concept analysis. CAMELS-SAM is composed of more than 1000 unique N-body-only simulations of volume (100 h -1 Mpc) 3 and N = 640 3 particles each. The N-body simulations were generated across a broad cosmological parameter space of &#937; M = [0.1,0.5] and &#963; 8 = [0.6,1.0], with 100 stored snapshots from z = 20 to z = 0. Each of these N-body simulations has associated ROCKSTAR halo catalogs and CONSISTENTTREES merger trees. Finally, each N-body simulation was run through a unique iteration of the Santa Cruz SAM for galaxy formation, covering a broad range of the parameters controlling feedback from massive stars, supernovae, and AGN radio jets.</p><p>These halo catalogs, merger trees, and galaxy catalogs have been publicly released for the community to use for a variety of science applications: <ref type="url">https://camels-sam.readthedocs.io</ref> <ref type="bibr">(Villaescusa-Navarro et al. 2023)</ref>. As a proof of concept for the capabilities of this simulation suite, we used galaxy clustering statistics to constrain cosmology, marginalize over astrophysics, and probe astrophysical feedback with simple neural networks.</p><p>We give a brief summary of our proof-of-concept work with CAMELS-SAM:</p><p>1. We measure and analyze the void probability, count-in-cells, and real-space 3D two-point correlation functions of halos and SC-SAM galaxies. We compare clustering across selections by halo mass, stellar mass, star formation rate, and specific star formation rate. 2. We leverage simple 1D:1D neural networks and a likelihood-free inference method to measure the mean and standard deviation of each parameter's marginal posterior. We leverage CAMELS-SAM's large suite size for a split of 700/150/150 training/validation/testing simulation sets. 3. 
We explore how the accuracy and precision of our parameter inference varies with different halo/galaxy selections and density downsamplings, and compare the choice of combining versus keeping separate the redshift and clustering statistics used. Section 4 focuses on the cosmological parameters {&#937; M , &#963; 8 }, while Section 5 focuses on the SC-SAM feedback parameters. 4. Our neural networks are indeed able to marginalize over astrophysics to accurately constrain cosmology. The tightest constraints on cosmology we find are with the clustering by dark matter halo mass. Our cosmological constraints for {&#937; M , &#963; 8 } find fractional errors of {4.7%, 3%} about their fiducial values of {0.3, 0.8}, respectively. With the clustering of SC-SAM galaxies by stellar mass, we predict {&#937; M , &#963; 8 } with errors of 4.7%-6.7%, 3%-5%. 5. We find that our other selections based on stellar mass yield predictions for {&#937; M , &#963; 8 } with errors of 8.6%-12.3%, 5.4%-8%. Selecting on instantaneous star formation rate often yields predictions for {&#937; M , &#963; 8 } with errors of 10%-15.6%, 6.4%-12.8%. Selecting by specific star formation rate often yields predictions for {&#937; M , &#963; 8 } with errors of 8.3%-9.3%, 4.6%-6.9%. 6. Our neural networks are also able to constrain the astrophysical parameters alongside the cosmological parameters. We are able to learn some information about each of the SN and AGN feedback parameters of the SC-SAM, reaching constraints as good as 30%. We tend to find tighter constraints on both cosmological parameters and astrophysics parameters when we do not randomly downsample to a fixed number density. However, some galaxy property selections perform comparably well after downsampling. 7. We find better constraints when we combine clustering statistics from several redshifts. When using one redshift at a time, we do not find evidence for strong redshift dependence on the quality of the constraints. 8. 
In Section 6, we compare the constraints that each clustering statistic finds independently. When comparing each of the clustering statistics (VPF, CiC, and 2ptCF) for various mass-threshold clustering samples, we find that all the statistics find similar constraints for &#963; 8 . 9. We find that CiC and VPF often slightly outperform the 2ptCF alone for &#937; M . We also find the VPF often drives the bulk of the constraints found on the SC-SAM parameters.</p><p>Finally, these are key implications of our work and CAMELS-SAM:</p><p>1. This is the first work to leverage machine learning and adjacent tools to not only constrain parameters of a SAM (e.g., Van Daalen et al. 2016), but to simultaneously constrain cosmology and improve the information a neural network is able to learn from galaxy clustering. 2. Our work with CAMELS-SAM includes smaller nonlinear scales than most have probed for cosmology (reaching R &gt; 1.1-1.6 cMpc, or k max &lt; 5.85-8.15 h Mpc -1 ). 3. Our work contributes to the growing literature that advocates the inclusion of CiC, VPF, and related statistics alongside the 2ptCF for constraining cosmology (e.g., Wang et al. 2019; Uhlemann et al. 2020; Samushia <ref type="bibr">et al. 2021</ref>). 4. By implementing a robust model for galaxies and their complex astrophysics, our neural networks are able to learn the unique ways that the SC-SAM affects galaxy bias, and they provide significantly improved constraints on cosmology as compared to those that would be found if we were to simply assume galaxies follow the clustering of dark matter halos. 5. The halo products from CAMELS-SAM inhabit a unique position in simulation volume and resolution, excellent snapshot and redshift coverage, number of simulations, and vast cosmological parameter space, ideal for several applications across machine learning, galaxy modeling, and dark matter analysis.</p><p>[Figure captions: variations of the A SN1 and A SN2 parameters, shown in cyan and magenta. See Figure 9 for explanation of plotted quantities.]</p><p>while A SN2 affects the slope at all but the highest masses. Both A SN parameters affect the low-mass half of the cold gas fraction versus stellar mass relationship in similar ways: lower values create a sharper drop and lower valleys.</p><p>A AGN , on the other hand, has much more subtle effects. It has nearly no effect on the stellar metallicity versus stellar mass relationship, and it shows only very mild effects on the SMF and black hole versus bulge mass relationship at the highest stellar masses. Unlike the A SN parameters, its effect on the relationship of cold gas fraction versus stellar mass is on the higher-mass end. It does, however, show strong effects on the stellar mass-halo mass relationship on the right/higher-mass half of the "mountain," suppressing the stellar mass values as its effect is strengthened.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Appendix B Tests for Neural Network Accuracy, Continued</head><p>As discussed in Section 3.2.3, part of our verification of our neural network's accuracy under the likelihood-free inference involves examining the distribution of &#967; j (&#952; i ) for the test set. This confirmation of the LFI is nearly identical to that in Figure <ref type="figure">3</ref> of <ref type="bibr">Jeffrey et al. (2022)</ref>. Those authors developed the moment networks framework at the heart of our LFI loss in <ref type="bibr">Jeffrey &amp; Wandelt (2020)</ref>, and they apply it in the context of detecting the primordial B-mode of the cosmic microwave background.</p><p>Even with a model with more than 10 5 parameters, the moment networks gave good results with accurately estimated errors. We apply the same test to probe how underestimated or inaccurate our errors may be.</p><p>Figure <ref type="figure">13</ref> shows the distribution of &#967; j (for each j simulation in the test set) for two of our clustering selections' neural network predictions of each parameter &#952; i . We show each of the top five best-performing neural networks as a unique color. As discussed in Section 3.2, the top few best-performing neural networks are all quite similar in overall behavior, so this highlights the general behavior of the LFI predictions for the given selections. If the LFI loss and moment network are working as they should to compute the mean and variances of the posteriors, the &#967;(&#952; i ) distributions will have a mean of zero and variance of one. A pattern emerges among the &#967;(&#952; i ) distributions we create: parameters that get decent constraints (i.e., about the 1:1 relationship and whose error bars look visually consistent) appear as rough Gaussian distributions with a mean of zero and variance of one. 
Unconstrained parameters appear flat within a similar range or peak at one of the extreme ends, though they visually still appear to have a mean of zero and variance of one, indicating the moment network still performed as prescribed.</p><p>Figure <ref type="figure">13</ref>. The &#967; i distribution across the test set for the top five best-performing networks (each in a unique color). The dotted gray curve is a Gaussian whose center is at zero (black vertical line) and has a variance of one.</p><p>Figure <ref type="figure">13</ref>(a) shows the &#967; j distribution for the marginal posteriors given "all" clustering of dark matter halos with mass greater than 2 &#215; 10 11 M &#8857; and randomly downsampled to 0.005 h 3 cMpc -3 . The cosmological constraints tend to be very good, so their &#967; j distributions are consistent with a Gaussian distribution centered at zero and with variance of one. The constraints on the SC-SAM astrophysical parameters are nonexistent (just around the mean of the prior and with large errors), which show up as &#967; j distributions that are flat along the range or that peak at the Gaussian tails. Figure <ref type="figure">13(b)</ref> shows the &#967; j distribution for the marginal posteriors given "all" clustering of SC-SAM galaxies with stellar mass greater than 1 &#215; 10 9 M &#8857; and randomly downsampled to 0.005 h 3 cMpc -3 . This selection gives good constraints on the cosmological parameters and A SN2 . Even though the constraints are not as good on A SN2 (i.e., they are spread around the 1:1 relationship with large error bars), the errors are not underestimated.</p><p>approximately 5% about the fiducial value &#937; M = 0.3. Our LFI method also measures an average 1&#963; standard deviation error of s = 0.014 on &#937; M . For &#963; 8 , the rMSE error of the mean values across the test set is 0.032 (4% for &#963; 8 = 0.8), while s is 0.024 (3%). 
We also probe a much higher halo mass selection of 1.2 &#215; 10 12 M &#8857; with a lower-density downsampling of 0.001 h 3 cMpc -3 , and find comparable though slightly worse constraints (likely due to Poisson noise introduced from the lower density). We take the tightest constraints from our dark matter selections as the "best" throughout this work.</p><p>Stellar Mass Clustering: In Figure <ref type="figure">5</ref>, we showed the constraints on &#937; M and &#963; 8 from "all" clustering at z = {0.0, 0.1, 0.5, 1.0} of galaxies with stellar mass greater than 1 &#215; 10 9 M &#8857; , randomly downsampled to 0.005 h 3 cMpc -3 . Other stellar mass selections we tested include stellar mass greater than 1 &#215; 10 10 M &#8857; to a density of 0.001 h 3 cMpc -3 , as well as stellar mass greater than 2 &#215; 10 10 M &#8857; to a density of 0.001 h 3 cMpc -3 , shown in Figure <ref type="figure">14(a)</ref>. We found the lower threshold and higher downsampling density selection yielded the tightest constraints on cosmology. The best-performing neural network, using the clustering of galaxies with stellar mass greater than 1 &#215; 10 9 M &#8857; , produces &#937; M predictions accurate to rMSE = 0.02, or approximately 7% about the fiducial value &#937; M = 0.3. The LFI loss measures an average 1&#963; standard deviation error of s = 0.014 on &#937; M , the same value as it found with dark-matter-only clustering. For &#963; 8 , the rMSE error of the mean values across the test set is 0.038 (approximately 5% for &#963; 8 = 0.8), while s = 0.021 (3%).</p><p>Notes. Constraints for cosmological and astrophysical parameters across different SC-SAM galaxy selections with "all" galaxy clustering statistics at z = {0.0, 0.1, 0.5, 1.0}, when downsampling to a single number density (in h 3 cMpc -3 ).</p><p>Figure <ref type="figure">15</ref>. Exploring how downsampling to a fixed density affects constraints on cosmology and the SC-SAM feedback parameters, for "all" clustering of galaxies selected by (a) stellar mass (magenta) or (b) star formation rate (cyan). Detailed quantitative comparisons can be found in Table <ref type="table">5</ref>. Outliers (red stars, often out of range) are simulations in the test set whose "Z-values" are greater than 6 (Section 3.2.3).</p><p>SFR and sSFR: Under the set of choices we have adopted so far, with "all" clustering statistics at z = {0.0, 0.1, 0.5, 1.0} and random downsampling to a fixed number density, we probe SC-SAM galaxies with SFR &gt; 0.2 M &#8857; yr -1 randomly sampled to 0.005 h 3 cMpc -3 , and also SFR &gt; 1.25 M &#8857; yr -1 randomly sampled to 0.001 h 3 cMpc -3 . For sSFR, we probe SC-SAM galaxies with sSFR &gt; 0.1 Gyr -1 and sSFR &gt; 0.2 Gyr -1 , both randomly sampled to 0.005 h 3 cMpc -3 . Figures <ref type="figure">14(b</ref>) and (c) show two of the neural networks trained on SFR- and sSFR-selected galaxy samples.</p><p>Density Downsampling: Figure <ref type="figure">15</ref> highlights some of the best-performing neural network results when using clustering with no downsampling of galaxies. The only repeated selection from Section 4.1 is M stellar &gt; 2 &#215; 10 10 M &#8857; (Figure <ref type="figure">15a</ref>). We also include M stellar &gt; 7 &#215; 10 9 M &#8857; as a small test of threshold sensitivity, as well as SFR &gt; 1 M &#8857; yr -1 (Figure <ref type="figure">15b</ref>). Table <ref type="table">5</ref> details our constraints on our cosmological and astrophysical parameters under different SC-SAM galaxy selections, when using "all" clustering statistics at z = {0.0, 0.1, 0.5, 1.0}, but when not downsampling to a single number density. For context, we include the range of galaxy number densities (in h 3 cMpc -3 ) that these selections yield across the LH suite. Comparing redshifts. Throughout this work, our default input to our neural networks has been clustering statistics from z = {0.0, 0.1, 0.5, 1.0} combined together. 
This, for example, mimics possible future experiments leveraging similarly selected galaxy populations at different redshifts. However, in Sections 4.3 and 5.2, we considered how our constraints might change if made only at a single redshift. Table <ref type="table">6</ref> details our rMSE constraints on our cosmological and stellar feedback parameters with clustering at each redshift individually.</p><p>We describe our clustering methodology in Section 3.1, and specifically highlight Table <ref type="table">3</ref> for the slight adjustments we made to the clustering we give to the single-redshift neural network. We also note that a fairer comparison would have followed the example of the clustering statistic tests in Section 6, where the only adjustment we made was splitting the data up by statistic. However, we expand the CiC distributions given to the neural network at each redshift to answer the question: how good might our constraints be if we focus in on a single redshift and give a neural network as much data as it can handle?</p><p>We share two representative examples of these neural networks in this appendix, both downsampled to a density of 0.005 h 3 cMpc -3 : cosmology-only constraints for a high-density halo mass selection of 2 &#215; 10 11 M &#8857; (Figures <ref type="figure">16</ref> and <ref type="figure">17</ref>), and constraints on all five parameters at once for a low-density stellar mass selection of 1 &#215; 10 9 M &#8857; (Figure <ref type="figure">18</ref>).</p><p>Additional information. Focused NNs. As explored in Section 5.3, we tested how our constraints fared if the neural network was told to focus on a single parameter at a time. Table <ref type="table">7</ref> compares the best constraints found in all previous experiments for each parameter with the constraints when focusing the neural network on each parameter one at a time under the same selection and clustering values. 
When focusing on one parameter at a time, we find that the rMSE constraints either stay the same or slightly improve. We also find that the LFI errors from having all parameters trained at once occasionally match or even exceed those from focusing on one at a time. This indicates our LFI loss is performing as expected and finding all available information for each parameter.</p><p>Comparing clustering statistics. In Section 6, we compare the constraints that each independent clustering statistic is able to find for our five parameters. Though we tested several selections, we include two representative examples in this appendix: cosmology-only constraints on &#937; M and &#963; 8 for a halo mass selection of 2 &#215; 10 11 M &#8857; downsampled to a density of 0.005 h 3 cMpc -3 (Figures <ref type="figure">19</ref> and <ref type="figure">20</ref>), and constraints on all five parameters at once for a stellar mass selection of 1 &#215; 10 9 M &#8857; (Figure <ref type="figure">21</ref>). Outliers, marked with red stars in Figure <ref type="figure">21</ref>, are simulations in the test set whose "Z-values" (Equation ( <ref type="formula">12</ref>)) are greater than 6, and they are excluded when calculating the rMSE (see Section 3.2.3 for more details). Table <ref type="table">8</ref> details the constraints on our cosmological parameters when comparing clustering statistics, and Table <ref type="table">9</ref> details the corresponding constraints for the SC-SAM supernova parameters. Finally, we remind readers that the true distributions of the A SN1 and A AGN parameters in the test set merely appear slightly skewed toward lower values, owing to their original generation in logarithmic space. This effect would disappear if plotted on a logarithmic scale, though we choose to keep all scales linear for consistency with plots in <ref type="bibr">Villaescusa-Navarro et al. 
(2021a)</ref>.</p><p>For the cosmological parameters, the focused neural networks use "all" clustering of SAM galaxies with halo mass greater than 2 &#215; 10 11 M &#8857; across four redshifts (downsampled to 0.005 h 3 cMpc -3 ), as well as SAM galaxies with stellar mass greater than 2 &#215; 10 10 M &#8857; across four redshifts (no downsampling). For the A SN parameters, we select "all" clustering of SAM galaxies with stellar mass greater than 2 &#215; 10 10 M &#8857; across four redshifts (no downsampling). For A AGN , we select the "all" clustering of all SAM galaxies with star formation rate greater than 1 M &#8857; yr -1 across four redshifts (also no downsampling). Figure <ref type="figure">22</ref> shows the focused NN results for the SC-SAM parameters.</p><p>Notes. Constraints for cosmological and astrophysical parameters from different SAM galaxy selections with "all" galaxy clustering statistics at z = {0.0, 0.1, 0.5, 1.0}, when not correcting to a fixed number density. The density ranges give a rough idea of the number of galaxies passing each selection across the LH suite, and they are in units of h 3 cMpc -3 .</p><p><ref type="table">3</ref>). Masses are log 10 M &#8857; , and densities for each selection are indicated with superscript symbols: * for 0.001 and &#8224; for 0.005 h 3 cMpc -3 . "All" here indicates the four combined redshifts.</p><p>Figure <ref type="figure">16</ref>. Examining redshift dependence on constraints for the cosmological parameter &#937; M , using the clustering of halos with masses greater than 2 &#215; 10 11 M &#8857; downsampled to a density of 0.005 h 3 cMpc -3 . See Table <ref type="table">3</ref> for the exact distance scales and CiC values used in the "all" clustering measured for this training.</p><p>Figure <ref type="figure">18</ref>. Examining redshift dependence on cosmological constraints for the clustering statistics of SAM galaxies with stellar mass greater than 2 &#215; 10 10 M &#8857; , downsampled to a density of 0.001 h 3 cMpc -3 . 
See Table <ref type="table">3</ref> for the exact distance scales and CiC values used in the "all" clustering measured for this training.</p><p>Figure <ref type="figure">19</ref>. Which clustering statistic is best at constraining cosmology through &#937; M ? We compare the constraints on &#937; M found by different clustering statistics based on the clustering of dark matter halos with mass greater than 2 &#215; 10 11 M &#8857; , randomly sampled to a density of 0.005 h 3 cMpc -3 . We combine the clustering at z = {0.0, 0.1, 0.5, 1.0}. The 2ptCF (a) is measured between 1.1 &lt; R &lt; 36.1 cMpc; CiC (b) is measured at R = 16.0, 22.4, 28.8 cMpc; and the VPF (c) is measured between 1.6 &lt; R &lt; 40 cMpc. We combine "all" these statistics (d) for the best constraints. Detailed quantitative comparisons for the cosmological parameters can be found in Table <ref type="table">8</ref>, and in Table <ref type="table">9</ref> for the stellar feedback parameters.</p><p>Notes. The neural networks are given "all" clustering across 0 &lt; z &lt; 1 for the following selections on galaxy properties. We list the rMSE of the LFI posterior means when predicting the parameters alone or alongside the other four. We use the respective clustering of SAM galaxies with stellar masses greater than 1 &#215; 10 9 M &#8857; , downsampled to a density of 0.005 h 3 cMpc -3 . Detailed quantitative comparisons can be found in Tables <ref type="table">8</ref> and <ref type="table">9</ref>.</p><p>Notes. We compare the cosmology (&#937; M and &#963; 8 ) constraints from the best-performing neural networks across clustering statistics in Figures <ref type="figure">19</ref>, <ref type="figure">20</ref>, and <ref type="figure">21</ref>. For z = {0.0, 0.1, 0.5, 1.0}, we use either the 2ptCF, CiC, VPF, or "all" combined. The densities for each selection are indicated with superscript symbols: * means a density of 0.001 h 3 cMpc -3 , while &#8224; means 0.005 h 3 cMpc -3 . 
We note that, for these parameters, an rMSE on the LFI means, or a mean standard deviation s, of around 0.1 indicates imprecise and inaccurate constraints, with error bars that span half the parameter space and predictions that are flat and around the mean of the prior.</p><p>Notes. We compare the SC-SAM supernova parameter constraints from the best-performing neural networks across clustering statistics in Figure <ref type="figure">21</ref>. For z = {0.0, 0.1, 0.5, 1.0}, we use either the 2ptCF, CiC, VPF, or "all" combined. The densities for each selection are indicated with superscript symbols: * means a density of 0.001 h 3 cMpc -3 , while &#8224; means 0.005 h 3 cMpc -3 . We note that, for these parameters, rMSE errors for the LFI means of around 1.0 indicate imprecise and inaccurate constraints, with error bars that span half the parameter space and predictions that are flat and around the mean of the prior. Parameters with rMSE less than 0.8 tend to show a rough 1:1 relationship but with considerable 1&#963; errors. We explore constraints on the A AGN parameter further in Section 5.3.</p><p>details, for the complete five-parameter LH catalogs, the constraints we find are 0.021/0.025 (LFI/rMSE) for &#937; M and 0.03/0.036 for &#963; 8 . The remarkable improvement in this scenario likely comes from the networks not having to work around the degenerate effects of the five parameters upon the number density of stellar-mass-selected catalogs. However, this experiment should remind us of the importance of including multiple variations and prescriptions for astrophysics in cosmological parameter inference: the effects of astrophysics are many and poorly understood, and not accounting for them will lead to overly optimistic and possibly incorrect constraints.</p></div>
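<div xmlns="http://www.tei-c.org/ns/1.0"><p>The outlier handling described in the captions and notes above (test-set simulations with "Z-values" greater than 6 are excluded before the rMSE is computed) can be illustrated with a minimal Python sketch. This is not the authors' code. Because Equation (12) is not reproduced in this appendix, the Z-value is assumed here to be the absolute difference between the true value and the posterior mean, divided by the posterior standard deviation. The function names and the toy arrays are hypothetical.</p>
<p><![CDATA[
```python
import numpy as np

def z_values(truth, post_mean, post_std):
    """Assumed Z-value: |truth - posterior mean| / posterior std (hypothetical form)."""
    return np.abs(truth - post_mean) / post_std

def rmse_without_outliers(truth, post_mean, post_std, z_max=6.0):
    """Return the rMSE over retained simulations and the boolean outlier mask."""
    outlier = z_values(truth, post_mean, post_std) > z_max
    keep = ~outlier
    rmse = np.sqrt(np.mean((truth[keep] - post_mean[keep]) ** 2))
    return rmse, outlier

# Toy test set of four simulations; the last one is badly mispredicted
# with an overconfident error bar, so it is flagged as an outlier.
truth = np.array([0.20, 0.30, 0.40, 0.50])
post_mean = np.array([0.22, 0.28, 0.40, 0.10])
post_std = np.array([0.02, 0.02, 0.02, 0.02])

rmse, outlier = rmse_without_outliers(truth, post_mean, post_std)
print(f"rMSE without outliers: {rmse:.4f}, outliers flagged: {int(outlier.sum())}")
```
]]></p>
<p>In this toy case only the fourth simulation is dropped, so the rMSE reflects the three well-behaved predictions. Reporting the number of flagged simulations alongside the rMSE keeps the exclusion visible.</p></div>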
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ORCID iDs</head><p>Lucia A. Perez https://orcid.org/0000-0002-8449-1956 Shy Genel https://orcid.org/0000-0002-3185-1540 Francisco Villaescusa-Navarro https://orcid.org/0000-0002-4816-0455 Rachel S. Somerville https://orcid.org/0000-0002-6748-6821 Austen Gabrielpillai https://orcid.org/0000-0003-4295-3793 Daniel Angl&#233;s-Alc&#225;zar https://orcid.org/0000-0001-5769-4945 Benjamin D. Wandelt <ref type="url">https://orcid.org/0000-0002-5854-8269</ref> L. Y. Aaron Yung https://orcid.org/0000-0003-3466-035X</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_0"><p>https://www.camel-simulations.org/</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_1"><p>https://camels.readthedocs.io/</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_2"><p>The Astrophysical Journal, 954:11 (41pp), 2023 September 1 Perez et al.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="17" xml:id="foot_3"><p>It is worth noting that the updated version of the L-GALAXIES SAM has been run atop the enormous MillenniumTNG light cones by <ref type="bibr">Barrera et al. (2022)</ref>, who show the great accuracy of the produced two-point clustering of their galaxies in their improved SAM infrastructure.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="18" xml:id="foot_4"><p>When we explore our neural networks' constraints for these parameters, A SN1 and A AGN will therefore appear biased toward smaller values on the plots' linear [1/4,4] x-axes.</p></note>
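			<div xmlns="http://www.tei-c.org/ns/1.0"><p>The apparent skew mentioned in this footnote can be reproduced with a short Python sketch. It assumes, purely for illustration, that a parameter is drawn uniformly in log space over [1/4, 4] with independent random draws; the suite itself uses a Latin hypercube, which this sketch does not attempt to reproduce. All variable names are hypothetical.</p>
<p><![CDATA[
```python
import numpy as np

rng = np.random.default_rng(0)
lo, hi = 0.25, 4.0  # linear axis range quoted in the footnote

# Assumption: the parameter is uniform in log space over [lo, hi].
samples = np.exp(rng.uniform(np.log(lo), np.log(hi), size=100_000))

# Half of the draws fall below 1, the midpoint in log space.
frac_below_one = np.mean(samples < 1.0)

# The midpoint of the linear axis is 2.125; most draws fall below it,
# which is why the distribution looks skewed on a linear axis.
linear_midpoint = 0.5 * (lo + hi)
frac_below_mid = np.mean(samples < linear_midpoint)

print(f"fraction below 1: {frac_below_one:.3f}")
print(f"fraction below {linear_midpoint}: {frac_below_mid:.3f}")
```
]]></p>
<p>Under this assumption the expected fraction below the linear midpoint is ln(8.5)/ln(16), roughly 0.77, so about three quarters of the true values sit in the left half of a linear axis even though the sampling is even in log space.</p></div>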
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="22" xml:id="foot_5"><p>During this work, we identified a feature of how CORRFUNC generated the "random test spheres" for the VPF and CiC that led to inaccurate VPF measurements for small-to-medium-sized galaxy samples. This has now been corrected fully, allowing the powerful CORRFUNC to be applied to even more samples than it was originally created for.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="23" xml:id="foot_6"><p>Though explained later in the relevant sections, when we are not looking at "all" clustering statistics at all four redshifts, the neural network is instead given one of the following as a 1D array: all statistics at a single redshift (Section 4.3), or an individual clustering statistic at the four redshifts (Section 6).</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="24" xml:id="foot_7"><p>There is no way to fully know how a neural network learns what it does. However, our choice to resample at each redshift, and the additional noise introduced, may help prevent the networks from focusing upon the growth factor <ref type="bibr">(Hamilton 2001)</ref> if they, e.g., try taking ratios between clustering statistics. In future work, shuffling the order of different redshifts' clustering when training may help further counteract this.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="25" xml:id="foot_8"><p>A default option in PyTorch <ref type="bibr">(Paszke et al. 2019)</ref>.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="26" xml:id="foot_9"><p>We note that, e.g., <ref type="bibr">Nicola et al. (2022)</ref> and <ref type="bibr">Uhlemann et al. (2020)</ref> have also included a few redshifts at a time in their analyses.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="27" xml:id="foot_10"><p>As explored in <ref type="bibr">Perez et al. (2021)</ref>, when, where, and how the VPF clustering should be measured depends on galaxy density and total covered volume. This logic can be comfortably extended to CiC, and it confirms much of the common-sense logic of the 2ptCF in the literature.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="28" xml:id="foot_11"><p>This is worrisome, since the accuracy of "deep learning" algorithms like our neural networks roughly scales with the size of the training set (e.g., <ref type="bibr">Hestness et al. 2017</ref>).</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="29" xml:id="foot_12"><p>The VPF is measured at 10 radii between R = 0.8 and 8 h -1 cMpc; the 2ptCF uses 19 bins whose edges are evenly log-spaced between 0.68 and 9.3 h -1 cMpc; and we include CiC to n = 50 at R = {3.2, 5.6, 7.2} cMpc. We combine the two "humps" for training under the MSE loss function criterion, yielding a total combined suite of approximately 1400/300/300 training/validation/testing simulations.</p></note>
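			<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a small illustration of the 2ptCF binning quoted in this footnote, the following Python sketch builds 19 bins whose edges are evenly spaced in log R between 0.68 and 9.3 h -1 cMpc. It uses NumPy's geomspace; the variable names, and the use of geometric bin centres, are choices of this sketch rather than anything stated in the text.</p>
<p><![CDATA[
```python
import numpy as np

n_bins = 19
r_min, r_max = 0.68, 9.3  # in h^-1 cMpc, as quoted in the footnote

# 19 bins need 20 edges; geomspace spaces them evenly in log R.
edges = np.geomspace(r_min, r_max, n_bins + 1)

# Every bin has the same width in log10(R).
log_widths = np.diff(np.log10(edges))

# Geometric bin centres (an assumption of this sketch).
centers = np.sqrt(edges[:-1] * edges[1:])

print(f"{len(centers)} bins, each {log_widths[0]:.4f} dex wide")
```
]]></p>
<p>The resulting edges could then be passed to whichever pair-counting code is in use. Only the edge spacing is taken from the footnote.</p></div>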
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="30" xml:id="foot_13"><p>It is worth noting, however, that these suites were not created for machine-learning training, and they excel at creating, e.g., accurate and robust emulators of various phenomena. We acknowledge our somewhat unfair comparison.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="31" xml:id="foot_14"><p>https://camels.readthedocs.io/, camels.simulations@gmail.com.</p></note>
		</body>
		</text>
</TEI>
