<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Cosmology with Multiple Galaxies</title></titleStmt>
			<publicationStmt>
				<publisher>IOP Publishing</publisher>
				<date>07/01/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10538578</idno>
					<idno type="doi">10.3847/1538-4357/ad4969</idno>
					<title level='j'>The Astrophysical Journal</title>
<idno type="issn">0004-637X</idno>
<biblScope unit="volume">969</biblScope>
<biblScope unit="issue">2</biblScope>					

					<author>Chaitanya Chawak</author><author>Francisco Villaescusa-Navarro</author><author>Nicolás Echeverri-Rojas</author><author>Yueying Ni</author><author>ChangHoon Hahn</author><author>Daniel Anglés-Alcázar</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<title>Abstract</title> <p>Recent works have discovered a relatively tight correlation between Ω<sub>m</sub> and the properties of individual simulated galaxies. Because of this, it has been shown that constraints on Ω<sub>m</sub> can be placed using the properties of individual galaxies while accounting for uncertainties in astrophysical processes such as feedback from supernovae and active galactic nuclei. In this work, we quantify whether using the properties of multiple galaxies simultaneously can tighten those constraints. For this, we train neural networks to perform likelihood-free inference on the value of two cosmological parameters (Ω<sub>m</sub> and <italic>σ</italic><sub>8</sub>) and four astrophysical parameters using the properties of several galaxies from thousands of hydrodynamic simulations of the CAMELS project. We find that using properties of more than one galaxy increases the precision of the Ω<sub>m</sub> inference. Furthermore, using multiple galaxies enables the inference of other parameters that were poorly constrained with one single galaxy. We show that the same subset of galaxy properties is responsible for the constraints on Ω<sub>m</sub> from one and multiple galaxies. Finally, we quantify the robustness of the model and find that, without identifying the model's range of validity, the model does not perform well when tested on galaxies from other galaxy formation models.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Some of the most fundamental questions we can ask in cosmology are: What are the components that make up the Universe? How much does each component contribute? We now know that the Universe should be made up of at least three main components: (1) baryons, representing all the substances and materials we know, (2) dark matter, some fundamental particle that interacts with baryons mostly (perhaps uniquely) through gravity, and (3) dark energy, a mysterious substance (perhaps a property of the vacuum) responsible for the recent acceleration of the Universe. From cosmological data, we believe these three components represent roughly 5%, 25%, and 70% of the current energy content of the Universe, respectively.</p><p>Parameters such as &#937; b and &#937; m represent the fraction of the Universe's energy content in terms of baryons and baryons plus dark matter, respectively. Determining them is important to learn about the nature and properties of dark matter and also to learn about the growth rate of the Universe <ref type="bibr">(Huterer 2023)</ref>. There are many different methods to infer these parameters, from studying the properties of the cosmic microwave background anisotropies to the spatial distribution of galaxies. Recently, <ref type="bibr">Villaescusa-Navarro et al. (2022)</ref> claimed that a tight relation between &#937; m and the properties of individual galaxies is present in galaxies from state-of-the-art hydrodynamic simulations. The relationship is present even when varying the value of astrophysical parameters controlling the efficiency of supernovae and active galactic nuclei (AGNs) feedback. <ref type="bibr">Echeverri et al. (2023)</ref> reached the same conclusion when using galaxies generated with a different hydrodynamic and subgrid physics model. <ref type="bibr">Villaescusa-Navarro et al. 
(2022)</ref> discussed that such a relation might be due to the existence of a low-dimensional manifold where galaxy properties reside. In this view, changing &#937; m modifies the location of the galaxies in that manifold differently than changing the efficiency of astrophysical processes. For instance, increasing the value of &#937; m while keeping &#937; b fixed will increase the overall dark matter content of the Universe. That excess will enhance the dark matter content of galaxies, affecting their density, star formation rate, metallicity, etc. On the other hand, feedback can also affect some of these properties, but it is unlikely that it will significantly affect the dark matter content of most galaxies. <ref type="bibr">Villaescusa-Navarro et al. (2022)</ref> argued that knowing the location of one point in the manifold is enough to characterize it, and therefore, with one single galaxy, it is possible to infer the value of &#937; m . We note that <ref type="bibr">Villaescusa-Navarro et al. (2022)</ref> and <ref type="bibr">Echeverri et al. (2023)</ref> showed that &#937; m can be inferred with a &#8764;10% precision based on the properties of a single galaxy, perhaps indicating that the manifold should have some intrinsic width associated with it. However, by using multiple galaxies, it should also be possible to infer the value of cosmological and astrophysical parameters by characterizing the impact on galaxy statistics like the stellar mass function. Recently, <ref type="bibr">Busillo et al. (2023)</ref> have shown that galaxy scaling relations are sensitive to both cosmology and astrophysics and derived constraints on those from real data (see also <ref type="bibr">Jo et al. 2023</ref> for the impact on the star formation rate history and the stellar mass function). In this work, we thus ask ourselves how well we can infer cosmological parameters if we only have a few galaxies. 
Due to computational constraints, we limit our analysis to at most 10 galaxies. The training time in the case of two galaxies is approximately 40 hr when using a single NVIDIA A100 GPU, and it increases further with the number of galaxies. We note that using properties from galaxies directly (instead of summary statistics) enables our models to search through all potential summary statistics and cross-correlations among galaxy properties.</p><p>In this paper, we show that using more than one galaxy increases the precision of the models trained to infer &#937; m and, at the same time, allows the models to infer other parameters that were unconstrained when using a single galaxy. To carry out our analysis, we made use of thousands of state-of-the-art hydrodynamic simulations from the Cosmology and Astrophysics with Machine Learning Simulations (CAMELS) project<ref type="foot">foot_0</ref> <ref type="bibr">(Villaescusa-Navarro et al. 2021a</ref><ref type="bibr">, 2022;</ref><ref type="bibr">Ni et al. 2023)</ref>. To check that our results are not specific to galaxies generated by a particular code, we perform our analysis using simulations run with three codes that employ different subgrid physics models: (1) AREPO+IllustrisTNG, (2) GIZMO+SIMBA, and (3) MP-Gadget+Astrid.</p><p>This paper is organized as follows. We present the data we use and the machine-learning algorithms we employ in Section 2. In Section 3, we present the main results of our analysis. Finally, we summarize the takeaways and conclude in Section 4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Methods</head><p>In this section, we first describe the data we use for this work. We then explain the machine-learning algorithms we employ to analyze the data and outline the metrics we utilize to quantify the accuracy and precision of our models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Data</head><p>In this paper, we train neural networks to infer the value of cosmological and astrophysical parameters using the internal properties of simulated galaxies. These galaxies come from state-of-the-art hydrodynamic simulations of the CAMELS project <ref type="bibr">(Villaescusa-Navarro et al. 2021a)</ref>.</p><p>All simulations follow the nonlinear evolution of 256&#179; dark matter particles plus 256&#179; initial fluid elements from z = 127 down to z = 0 in a cubic periodic volume of (25 h&#8315;&#185; Mpc)&#179;. All simulations share the value of these cosmological parameters: &#937; b = 0.049, h = 0.6711, n s = 0.9624, w = &#8722;1, &#937; K = 0, and &#8721;m &#957; = 0 eV.</p><p>The simulations have been run with three different codes and, therefore, can be classified into three different suites:</p><p>1. IllustrisTNG. The simulations in this suite have been run with the AREPO code <ref type="bibr">(Springel 2010;</ref><ref type="bibr">Weinberger et al. 2019)</ref>, and they employ the IllustrisTNG subgrid physics model <ref type="bibr">(Pillepich et al. 2018;</ref><ref type="bibr">Nelson et al. 2019</ref>).</p><p>2. SIMBA. The simulations in this suite have been run with the GIZMO code <ref type="bibr">(Hopkins 2015)</ref>, and they employ the SIMBA subgrid physics model <ref type="bibr">(Dav&#233; et al. 2019)</ref>.</p><p>3. Astrid. The simulations in this suite have been run with the MP-Gadget code <ref type="bibr">(Feng et al. 2018)</ref>, and they employ a slightly modified version of the Astrid subgrid physics model <ref type="bibr">(Bird et al. 2022;</ref><ref type="bibr">Ni et al. 2022)</ref>.</p><p>Each suite contains 1000 simulations (from the Latin hypercube set of CAMELS). Each of those simulations has a different value of &#937; m , &#963; 8 , and four astrophysical parameters that control the efficiency of supernova and AGN feedback: A SN1 , A SN2 , A AGN1 , and A AGN2 . 
We refer the reader to <ref type="bibr">Villaescusa-Navarro et al. (2021a)</ref> and <ref type="bibr">Ni et al. (2023)</ref> for further details on the specifics of the astrophysical parameters. We emphasize that the astrophysical parameters have different meanings in each suite due to the different subgrid implementations, and they represent variations relative to the corresponding fiducial models of IllustrisTNG, SIMBA, and Astrid. Table <ref type="table">1</ref> briefly describes the astrophysical and cosmological parameters involved in this study.</p><p>The values of these six parameters are arranged in a Latin hypercube with boundaries defined by &#937; m &#8712; [0.1, 0.5], &#963; 8 &#8712; [0.6, 1.0], A SN1 &#8712; [0.25, 4.0], A SN2 &#8712; [0.5, 2.0], A AGN1 &#8712; [0.25, 4.0], and A AGN2 &#8712; [0.5, 2.0].</p><p>We note that in the case of Astrid, the A AGN2 parameter ranges from 0.25 to 4.0. We also emphasize that all simulations have different values of the initial random seed. In this work, we focus our attention on the z = 0 snapshots of these simulations.</p></div>
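As an illustration, the Latin-hypercube arrangement described above can be sketched with `scipy.stats.qmc`: the snippet below draws 1000 points in the six-dimensional parameter space and rescales them to the parameter ranges of Table 1 (this is a sketch of the sampling scheme, not the actual CAMELS setup code; the IllustrisTNG/SIMBA range is used for A AGN2, whereas Astrid extends it to [0.25, 4.0]).

```python
import numpy as np
from scipy.stats import qmc

# Parameter order: Om, sigma8, ASN1, ASN2, AAGN1, AAGN2
lower = [0.1, 0.6, 0.25, 0.5, 0.25, 0.5]
upper = [0.5, 1.0, 4.00, 2.0, 4.00, 2.0]

sampler = qmc.LatinHypercube(d=6, seed=0)
unit = sampler.random(n=1000)           # 1000 simulations per suite
params = qmc.scale(unit, lower, upper)  # shape (1000, 6), one row per simulation
```

Each of the six coordinates is stratified into 1000 bins, so every parameter is sampled evenly across its full range.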
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Galaxy Properties</head><p>Halos and subhalos are identified in the simulations using the SUBFIND algorithm <ref type="bibr">(Springel et al. 2001;</ref><ref type="bibr">Dolag et al. 2009)</ref>. In this work, we define a galaxy as a subhalo with a stellar mass larger than zero. We follow <ref type="bibr">Echeverri et al. (2023)</ref> and only consider galaxies with stellar masses above 5 &#215; 10&#8312; h&#8315;&#185; M&#8857; to avoid working with small, likely spurious objects. SUBFIND computes many properties for each galaxy, but in this work, we focus our attention on the following 14:</p><p>1. M g : the subhalo gas mass content, including the circumgalactic medium's contribution. 2. M BH : the black hole (BH) mass of the galaxy. 3. M * : the stellar mass of the galaxy. 4. M t : the total mass of the subhalo hosting the galaxy. 5. V max : the maximum circular velocity of the subhalo hosting the galaxy. 6. &#963; v : the velocity dispersion of all particles in the galaxy's subhalo. 7. Z g : the mass-weighted gas metallicity of the galaxy. 8. Z * : the mass-weighted stellar metallicity of the galaxy. 9. SFR: the galaxy star formation rate. 10. J: the modulus of the galaxy's subhalo spin vector. 11. V: the modulus of the galaxy's subhalo peculiar velocity. 12. R * : the radius containing half the galaxy's stellar mass. 13. R t : the radius containing half of the total mass of the galaxy's subhalo. 14. R max : the radius at which the circular velocity reaches V max .</p><p>For IllustrisTNG simulations, we also consider the following three properties:</p><p>1. U: the galaxy absolute magnitude in the U band. 2. K: the galaxy absolute magnitude in the K band. 3. g: the galaxy absolute magnitude in the g band.</p><p>We note that the above three magnitudes are not present in simulations of the SIMBA and Astrid suites because SUBFIND needs some particular properties not stored in those simulations to estimate the magnitudes. We refer the reader to <ref type="bibr">Villaescusa-Navarro et al. (2022)</ref> for further details about these properties.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Input Data</head><p>The input to our models is a 1D vector containing the properties of n galaxies, where n &#8712; [1, 10]. For instance, if we use galaxies from the Astrid suite and set n = 5, the input vector will contain 5 &#215; 14 = 70 values. We remind the reader that for IllustrisTNG galaxies, we take 17 properties for each galaxy, while for SIMBA and Astrid, only 14 are available. Once the simulation suite and the value of n are chosen, we construct 1500 1D arrays with the properties of n unique galaxies (i.e., we enforce that the same galaxy cannot appear twice in the same set; it can, however, appear again in a different set). The 1500 1D arrays are constructed from the same simulation. We take 1500 arrays because several convergence tests showed that increasing the number of 1D arrays during training does not yield noticeable improvements in our results.</p></div>
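The construction of the input arrays can be sketched as follows (a minimal NumPy illustration with our own naming; `galaxy_props` stands in for the per-simulation property table, which is not how the CAMELS data are actually packaged):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_input_sets(galaxy_props, n_gal, n_sets=1500):
    """Build n_sets 1D input vectors, each concatenating the properties
    of n_gal distinct galaxies from ONE simulation.

    galaxy_props: array of shape (n_galaxies, n_props) for a single simulation.
    A galaxy cannot appear twice within a set, but may reappear across sets.
    """
    n_galaxies, n_props = galaxy_props.shape
    sets = np.empty((n_sets, n_gal * n_props))
    for i in range(n_sets):
        # sample WITHOUT replacement -> distinct galaxies within a set
        idx = rng.choice(n_galaxies, size=n_gal, replace=False)
        sets[i] = galaxy_props[idx].ravel()
    return sets

# e.g. an Astrid-like simulation with 200 galaxies and 14 properties, n = 5:
X = make_input_sets(rng.normal(size=(200, 14)), n_gal=5)
# X.shape == (1500, 70), matching the 5 x 14 = 70 values quoted in the text
```

Sets are drawn independently per simulation, so galaxies from different simulations (and hence different parameter values) are never mixed in one vector.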
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">Machine-learning Techniques</head><p>In this work, we train neural networks to perform likelihood-free inference on the value of two cosmological (&#937; m and &#963; 8 ) and four astrophysical (A SN1 , A SN2 , A AGN1 , A AGN2 ) parameters.</p><p>Our models take as input a 1D vector containing the properties of n galaxies and return 2N params numbers, where N params is the number of parameters considered (e.g., N params = 1 if only inferring one parameter). For each parameter i, our models output its marginal posterior mean (&#956; i ) and standard deviation (&#963; i ). This is achieved by minimizing the loss function L = &#8721; i [log(&#8721; j (&#952; i,j &#8722; &#956; i,j )&#178;) + log(&#8721; j ((&#952; i,j &#8722; &#956; i,j )&#178; &#8722; &#963; i,j &#178;)&#178;)], where i runs over the parameters, j runs over the samples of a batch, and &#952; i,j is the true value of parameter i for sample j.</p><p>This loss function guarantees that &#956; i and &#963; i represent the mean and standard deviation of the marginal posterior of parameter i (Jeffrey &amp; Wandelt 2020; Villaescusa-Navarro et al. 2021b).</p><p>Our models use several blocks, each containing a fully connected layer, a LeakyReLU nonlinear activation function, and a dropout layer. After the last block, a fully connected layer predicts the network's output. We write our model in PyTorch.<ref type="foot">9</ref> The number of blocks, the number of neurons in the fully connected layers, the learning rate, the weight decay, and the dropout rate are considered hyperparameters.</p><p>The value of the hyperparameters is tuned using Optuna<ref type="foot">foot_4</ref> (Akiba et al. 2019), which searches the hyperparameter space for the values that minimize the validation loss. We use at least 100 trials for each optimization. 
We emphasize that we run Optuna for each different configuration; for instance, when changing the simulation suite or the number of galaxies, we retrain using Optuna to find the best hyperparameters for that case.</p><p>To train the models, we first split the simulations into training (850), validation (100), and testing (50) sets. We then construct the input 1D arrays by combining the properties of galaxies from the same simulation. We note that it is important to (1) avoid mixing galaxies from different simulations when combining galaxy properties into the input arrays since different simulations sample different parameter values and (2) avoid having galaxies from the same simulation in different sets (e.g., training and testing) since there could be leakage of information if galaxies from the same simulation are somehow correlated.</p></div>
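The moment-network loss of Section 2.4 (Jeffrey &amp; Wandelt 2020) translates into a short PyTorch sketch; the `softplus` used to keep the predicted standard deviations positive is our assumption, not a detail given in the text:

```python
import torch

def moment_network_loss(mu, sigma, theta):
    """Loss whose minimum makes mu_i / sigma_i the mean and standard
    deviation of the marginal posterior of parameter i.

    mu, sigma, theta: tensors of shape (batch, n_params).
    """
    sq_err = (theta - mu) ** 2
    term1 = torch.log(sq_err.sum(dim=0))                        # posterior-mean term
    term2 = torch.log(((sq_err - sigma ** 2) ** 2).sum(dim=0))  # posterior-std term
    return (term1 + term2).sum()                                 # sum over parameters

def split_output(out, n_params):
    """Split the network's 2 * n_params outputs into mu and sigma."""
    mu = out[:, :n_params]
    # softplus keeps sigma > 0 (a common choice; an assumption here)
    sigma = torch.nn.functional.softplus(out[:, n_params:])
    return mu, sigma
```

A training step would compute `mu, sigma = split_output(net(x), n_params)` and backpropagate through `moment_network_loss(mu, sigma, theta)`.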
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.">Performance Metrics</head><p>In this work, we use four metrics to quantify the accuracy and precision of our models. To use these metrics, we need to consider that for a given input 1D vector i, &#952; i represents the value of the considered parameter, while &#956; i and &#963; i represent the posterior mean and standard deviation predicted by the network for that parameter. The four statistics we consider are:</p><p>1. Root mean squared error: RMSE = [(1/N) &#8721; i (&#952; i &#8722; &#956; i )&#178;]&#185;&#8725;&#178;, where the sum runs over all N 1D arrays in the considered test set. Smaller values of the RMSE indicate the model is more accurate.</p><p>2. Mean relative error, &#949;: &#949; = (1/N) &#8721; i (&#963; i /&#956; i ). The mean relative error tells us about the model's precision, with lower values representing, in general, more precise models. The mean relative error does not know anything about the true values; thus, one can have a very precise but not accurate model.</p><p>3. Coefficient of determination, R&#178;: R&#178; = 1 &#8722; &#8721; i (&#952; i &#8722; &#956; i )&#178; / &#8721; i (&#952; i &#8722; &#952;&#773;)&#178;, where &#952;&#773; = (1/N) &#8721; i &#952; i . The R&#178; quantifies the model's accuracy, with values close to 1 being accurate and values close to 0 being poor.</p><p>4. Reduced chi-squared, &#967;&#178;: &#967;&#178; = (1/N) &#8721; i [(&#952; i &#8722; &#956; i )/&#963; i ]&#178;. We use this statistic to quantify the accuracy of the model error bars (posterior standard deviations). Values close to 1 indicate the size of the errors is appropriate, while values below/above 1 indicate the errors are over-/underpredicted.</p></div>
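For concreteness, the four statistics of Section 2.5 translate directly into NumPy (a sketch with our own array names: `theta` holds true values, `mu` and `sigma` the predicted posterior means and standard deviations):

```python
import numpy as np

def performance_metrics(theta, mu, sigma):
    """Compute the four test-set statistics for one parameter.

    theta, mu, sigma: 1D arrays over the N test 1D-vectors.
    Returns (rmse, eps, r2, chi2).
    """
    rmse = np.sqrt(np.mean((theta - mu) ** 2))            # accuracy
    eps = np.mean(sigma / mu)                             # precision (ignores truth)
    r2 = 1.0 - np.sum((theta - mu) ** 2) / np.sum((theta - theta.mean()) ** 2)
    chi2 = np.mean(((theta - mu) / sigma) ** 2)           # error-bar calibration
    return rmse, eps, r2, chi2
```

Note that a perfect predictor gives RMSE = 0 and R&#178; = 1 regardless of `sigma`, while &#967;&#178; &#8776; 1 only if the quoted error bars match the actual scatter.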
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Results</head><p>We now present the results of our analysis. We first show the results when training the models using the properties of two galaxies, and then we show the results when considering multiple galaxies. We note that <ref type="bibr">Villaescusa-Navarro et al. (2022)</ref> and <ref type="bibr">Echeverri et al. (2023)</ref> showed that &#937; m can be inferred with a &#8764;10% precision based on the properties of a single galaxy, and the models could not constrain the values of other cosmological and astrophysical parameters within a meaningful margin of error.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Two Galaxies</head><p>We train models using 1D arrays that contain the properties of two galaxies. We then test those models on 1D arrays that contain the properties of two galaxies from the test set. We show the results in Figures 1 (IllustrisTNG and SIMBA) and 2 (Astrid). We find that all models can constrain the value of &#937; m accurately, with {RMSE, &#949;, R&#178;} equal to {0.022, 0.077, 0.966} (IllustrisTNG), {0.023, 0.090, 0.956} (SIMBA), and {0.028, 0.094, 0.919} (Astrid). We note that these numbers are better than the ones obtained for a single galaxy; for instance, these metrics are {0.0365, 0.11, 0.842} when considering one single galaxy from Astrid <ref type="bibr">(Echeverri et al. 2023)</ref>.</p><p>On the other hand, &#963; 8 remains mostly unconstrained with two galaxies, irrespective of the simulation suite employed, in line with our findings for one galaxy <ref type="bibr">(Villaescusa-Navarro et al. 2022;</ref><ref type="bibr">Echeverri et al. 2023)</ref>. We reach similar conclusions for the AGN parameters of the IllustrisTNG and SIMBA simulations. For Astrid, A AGN1 remains unconstrained, while A AGN2 can be inferred with a &#8764;16% precision; a significant improvement over the &#8764;24% obtained using one single galaxy <ref type="bibr">(Echeverri et al. 2023)</ref>. Finally, all models can infer the supernova feedback parameters with different precisions. We note that in the case of the supernova parameters, we have discarded a very small fraction of galaxies (0.41% for IllustrisTNG and 0.071% for SIMBA) since they have unreasonably small posterior widths; their &#967;&#178; was therefore very large and significantly affected the reported mean values.</p><p>These results show that better constraints on the value of the parameters can be achieved by using two galaxies instead of one.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Multiple Galaxies</head><p>Similar to the case of two galaxies, we also carried out the analysis for up to 10 galaxies considered simultaneously. We trained neural networks to perform likelihood-free inference to estimate the values of the cosmological (&#937; m and &#963; 8 ) and astrophysical (A SN1 , A SN2 , A AGN1 , and A AGN2 ) parameters using data from N galaxies (N goes from 1 to 10) from the IllustrisTNG, SIMBA, and Astrid suites. Once trained, the model is tested using the galaxies from the test set for each case. Figure <ref type="figure">3</ref> shows how the prediction RMSE and R&#178; of the cosmological and astrophysical parameters change as we increase the number of galaxies considered simultaneously. In this case, the quantity reported is the mean value over all galaxies in the test set. In Figure <ref type="figure">7</ref> in the Appendix, we show the results of training and testing using 10 galaxies (this figure is the equivalent of Figures <ref type="figure">1</ref> and <ref type="figure">2</ref>).</p><p>We find that, as we consider more galaxies simultaneously, the predicted values of the cosmological parameters &#937; m and &#963; 8 become increasingly more accurate. In the case of the astrophysical parameters (A SN1 , A SN2 , A AGN1 , and A AGN2 ), their predicted values can either improve or remain the same. The trend in the astrophysical parameters is not the same for all the suites because of the difference in the physical meaning of these parameters in each suite. For instance, the prediction of A AGN2 significantly improves when increasing from one galaxy to more than seven for the Astrid model, but remains poorly constrained regardless of the number of galaxies in the IllustrisTNG and SIMBA models. As we discuss below, this may be related to the fact that with several galaxies, one can create a proxy for astrophysical quantities that are sensitive to feedback.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Only &#937; m</head><p>It is evident from the results discussed up to now that the model does an excellent job of predicting the value of &#937; m . We therefore proceed to train the neural network to predict the posterior mean and standard deviation for only &#937; m instead of all six cosmological and astrophysical parameters. We do this so that the models can focus entirely on minimizing the loss for this parameter, avoiding situations where degeneracies with other parameters can yield suboptimal results for the parameter of interest.</p><p>In this case, these models do a slightly better job at inferring the value of &#937; m compared to the results obtained when trained to predict all six cosmological and astrophysical parameters.</p><p>From Figure <ref type="figure">4</ref>, we see that the neural network becomes increasingly more precise at inferring the value of &#937; m as we increase the number of galaxies considered simultaneously. For the SIMBA and IllustrisTNG suites, the RMSE improves by about 55%, and in Astrid's case, it improves by 37% as we go from 1 galaxy to 10 galaxies. The right panel of Figure <ref type="figure">4</ref> shows the results when considering the R&#178; statistics instead. <ref type="bibr">Villaescusa-Navarro et al. (2022)</ref> carried out a feature importance study that showed that standard feature ranking methods (like computing saliency maps, using SHAP values, or using the inbuilt "feature importance" from scikit-learn) did not yield the important features that the model used to make inferences. This is due to strong internal correlations between galaxy properties, which make it very difficult for the model to pinpoint the top properties. For that reason, Villaescusa-Navarro et al. (2022) trained a series of gradient-boosted trees models where one feature was discarded at a time. That way, the features could be ranked according to importance, and the results were sensible. 
<ref type="bibr">Echeverri et al. (2023)</ref> used the same procedure to rank the properties of the Astrid galaxies. <ref type="bibr">Villaescusa-Navarro et al. (2022)</ref> and <ref type="bibr">Echeverri et al. (2023)</ref> found that the five most important properties, in order of importance, for each of the suites are:</p></div>
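The leave-one-feature-out ranking procedure described above can be sketched with scikit-learn (an illustrative reconstruction; the gradient-boosted trees settings and data handling of the original works may differ):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

def leave_one_out_ranking(X, y, feature_names, X_val, y_val):
    """Rank features by how much the validation error grows when each one
    is removed: a larger error increase means a more important feature."""
    scores = {}
    for j, name in enumerate(feature_names):
        keep = [k for k in range(X.shape[1]) if k != j]
        model = GradientBoostingRegressor(random_state=0)
        model.fit(X[:, keep], y)
        scores[name] = mean_squared_error(y_val, model.predict(X_val[:, keep]))
    # features whose removal hurts the most come first
    return sorted(scores, key=scores.get, reverse=True)
```

Unlike saliency maps or SHAP values, this procedure forces the model to do without each property entirely, which is why it copes better with strongly correlated features.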
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Most Important Features</head><p>With this information in hand, we now ask ourselves whether the constraints we obtain for &#937; m are mostly due to those variables, or whether, when considering multiple galaxies, there may be information coming from other features.</p><p>Figure <ref type="figure">1</ref>. We train neural networks to infer the value of the cosmological (&#937; m and &#963; 8 ) and astrophysical (A SN1 , A SN2 , A AGN1 , and A AGN2 ) parameters from the internal properties of two random galaxies (without using galaxy positions). Next, from each simulation of the test set, we randomly select two galaxies and test the model on them. We show the results as points with error bars representing the posterior mean and the standard deviation (without making assumptions about the shape of the posterior). As can be seen, the models can precisely infer the value of &#937; m for both IllustrisTNG and SIMBA galaxies and, in some cases, the supernova feedback parameters. The values of &#963; 8 and the AGN parameters are poorly predicted in all cases.</p><p>To answer this question, we train models that only use the above galaxy properties but consider multiple galaxies. We show the results in Figure <ref type="figure">5</ref> for the case of 10 galaxies. As we can see, when predicting only &#937; m , the model performs only &#8764;6% worse for SIMBA, and &#8764;21% worse for Astrid and IllustrisTNG, compared to training with all properties. This is even after removing 12 galaxy properties in the case of IllustrisTNG (9 in the case of SIMBA and Astrid). We thus conclude that most of the information is contained in the most important variables for individual galaxies. We emphasize that this does not mean that the model uses information from individual galaxies and somehow stacks the results. 
Even using this subset of variables, one can construct noisy estimates of statistics, like the stellar mass function, that are expected to be affected by cosmology <ref type="bibr">(Jo et al. 2023)</ref>. Therefore, the source of information may arise from both individual galaxies and collective properties.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5.">Robustness</head><p>One of the most important aspects to consider when working with numerical simulations is the robustness of the results. In other words, how well does the model behave when trained on galaxies from one galaxy formation model and tested on galaxies from another? This aspect has been investigated before in <ref type="bibr">Villaescusa-Navarro et al. (2022)</ref>, where it was found that even with a single galaxy, inference of &#937; m from galaxy properties was not robust. This claim was later revisited by <ref type="bibr">Echeverri et al. (2023)</ref>, who found that the lack of robustness was at least partially due to the presence of a small fraction of outliers. <ref type="bibr">Echeverri et al. (2023)</ref> also found that removing these outliers during testing makes the model robust across simulations. In our case, as we train the models using more than one galaxy, the chances of including the outliers increase, thus making the model more precise but less robust.</p><p>In order to verify the robustness of our model, we have considered the case where we train models that use 10 random galaxies produced with a given code and test them using 10 galaxies from another galaxy formation model. We show the results in Figure <ref type="figure">6</ref>. As can be seen, the results are not robust: training models on galaxies from one simulation suite does not yield accurate results when testing on galaxies from another suite. To some extent, this is expected, since constraints with multiple galaxies are tighter than with a single one and we have not removed, a priori, outliers from the cross-distributions. It is interesting to note that the worst case happens when training on Astrid and testing on IllustrisTNG. While we do not have a clear explanation for this, it may be related to the model focusing on aspects that differ between these two simulations. 
We leave it to future work to explore strategies designed to increase the robustness of the results following the findings of <ref type="bibr">Echeverri et al. (2023)</ref>; for instance, identifying outliers and removing them from the test set to improve the reliability of the predictions, or using a generative model to compute the likelihood of the galaxies directly.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Summary and Discussion</head><p>Previous works by <ref type="bibr">Villaescusa-Navarro et al. (2022)</ref> and <ref type="bibr">Echeverri et al. (2023)</ref> have pointed out the existence of a tight relation between the properties of individual simulated galaxies and &#937; m . The authors interpreted these results as a consequence of the existence of a manifold containing galaxy properties. Under that interpretation, properties of the manifold may change in distinct manners when varying different parameters, allowing the inference of &#937; m from the properties of a single galaxy.</p><p>Figure <ref type="figure">3</ref>. We train neural networks to infer the posterior mean and posterior standard deviation of all six parameters as a function of the number of galaxies. The top panels show the results for the RMSE, while the bottom panels display the results for the R&#178; statistics. In all cases, we show the average results, i.e., for a given simulation, we take 1500 different combinations and report the mean values. In general, the more galaxies we consider, the tighter the constraints on the parameters. However, there are some cases where constraints saturate, and adding more galaxies does not yield tighter constraints.</p><p>In this work, we have studied whether using the properties of several galaxies can help better constrain the value of the cosmological and astrophysical parameters. To investigate this, we have trained neural networks to perform likelihood-free inference on the values of the cosmological (&#937; m and &#963; 8 ) and astrophysical (A SN1 , A SN2 , A AGN1 , and A AGN2 ) parameters by using the internal properties of multiple galaxies. We have made use of the properties of galaxies at redshift z = 0 from the IllustrisTNG, SIMBA, and Astrid simulation suites of the CAMELS project <ref type="bibr">(Villaescusa-Navarro et al. 2021a;</ref><ref type="bibr">Ni et al. 2023)</ref>. 
We emphasize that our model only uses information from the galaxy properties, not their positions; in other words, the constraints do not incorporate any information from clustering.</p><p>We find that the precision of the predictions improves as we increase the number of galaxies. In the case of IllustrisTNG, SIMBA, and Astrid, the RMSE of &#937; m improves by factors of 2.5, 2.5, and 1.6, respectively. In the case of Astrid, we observe a plateau in the constraints when going beyond &#8764;5 galaxies. For IllustrisTNG and SIMBA, the trend indicates that better constraints could be achieved using more than 10 galaxies. When considering the R 2 statistic, we find that the results tend to saturate when using more than &#8764;5 galaxies. We have also trained models to infer the value of &#937; m alone, i.e., without predicting the values of the other parameters. In this case, we find slightly more precise results than when training the models to predict all parameters. The results, shown in Figure <ref type="figure">4</ref>, do not change our conclusions.</p><p>For &#963; 8 , we find a steady improvement in the precision of the predictions (both RMSE and R 2 ) as we increase the number of galaxies. We emphasize that our models cannot determine the value of &#963; 8 with a single galaxy. The origin of these constraints may therefore arise not from the properties of individual galaxies but from statistics that can be constructed when using multiple galaxies. For instance, the stellar mass function may be sensitive to the value of &#963; 8 , and a noisy version of it can be constructed when considering multiple galaxies. We thus speculate that the origin of this information may not be related to the manifold hosting the galaxy properties. We note that <ref type="bibr">Busillo et al. 
(2023)</ref> have obtained cosmological and astrophysical constraints from the properties of a relatively small number of local, star-forming galaxies.</p><p>Figure 4. We train the models to infer the value of &#937; m alone. The left and right panels show the mean values of the RMSE and R 2 as a function of the number of considered galaxies. We find that our models perform slightly better when trained to infer &#937; m alone instead of inferring all six parameters simultaneously. As can be seen, results improve when considering more galaxies, but in the case of Astrid, constraints tend to saturate when using more than &#8764;5 galaxies.</p><p>Figure <ref type="figure">5</ref>. We train models to infer the value of &#937; m using the properties of 10 galaxies. However, instead of using all galaxy properties, we make use of the five most important properties found when using a single galaxy. The panels show the average results for SIMBA (left), IllustrisTNG (middle), and Astrid (right). We find that constraints using just five galaxy properties are similar to those obtained using all galaxy properties. This may indicate that the models still extract information from the manifold containing the galaxy properties and that information from noisy global quantities (e.g., the stellar mass function) is subdominant.</p><p>For the supernova feedback parameters, we also find a consistent improvement in the constraints as we increase the number of galaxies for all suites, with the exception of A SN2 for Astrid, where constraints seem to saturate beyond &#8764;5 galaxies. We believe the explanation may be related to the previous argument, i.e., with multiple galaxies, one can construct noisy estimates of global statistical properties that may be sensitive to these parameters, such as the stellar mass function or the stellar metallicity relation. 
However, unlike &#963; 8 , even with a single galaxy we find some constraining power on the value of these parameters, so it may simply be that the results exploit that information to better determine the shape of the manifold. Both factors likely come into play in this setup.</p><p>For the AGN parameters, we find no constraining power on A AGN1 for the galaxies in IllustrisTNG and Astrid, and a modest improvement of &#8764;20% for SIMBA. On the other hand, for A AGN2 , the constraints for IllustrisTNG and SIMBA do not improve up to 10 galaxies, while for Astrid there is a significant &#8764;2.5&#215; improvement in the RMSE value. We note that AGN feedback is expected to have a larger effect on massive galaxies, so the fact that we choose galaxies randomly (making it more likely to choose small galaxies) may explain this behavior.</p><p>Our results indicate that the models may still be exploiting the information contained in the most important variables used when inferring &#937; m with a single galaxy. In this case, the improvement may be due to a better determination of the galaxy manifold (stacking results for individual galaxies) but also to the impact of cosmology and astrophysics on quantities such as the stellar mass function, noisy versions of which can be constructed from a set of galaxies. It is, however, interesting that galaxy properties that are unimportant for constraints from individual galaxies also do not seem to contribute when using catalogs.</p><p>Figure 6. Robustness test. We have trained models to infer the value of &#937; m using the properties of 10 galaxies from SIMBA (top row), IllustrisTNG (middle row), and Astrid (bottom row). We then test the models on properties of 10 galaxies from SIMBA (left column), IllustrisTNG (middle column), and Astrid (right column). As can be seen, the models are not robust, and they fail when tested on galaxies from models different from the ones used for training.</p><p>As expected, our models become more precise but less accurate as we increase the number of galaxies. The reason is that the models are not robust even when considering a single galaxy <ref type="bibr">(Villaescusa-Navarro et al. 2022)</ref>. However, we note that <ref type="bibr">Echeverri et al. (2023)</ref> found that the models fail, on average, due to the presence of outliers. We thus leave it to future work to tackle the robustness of the models for one and multiple galaxies.</p><p>Finally, it is important to compare the results of this work with those of <ref type="bibr">Hahn et al. (2023)</ref>, which are based on the same core idea of exploiting the impact of cosmology on galaxy properties. That work provides the first constraints on &#937; m and &#963; 8 obtained from the photometry alone of thousands of NASA-Sloan Atlas<ref type="foot">foot_5</ref> (NSA) galaxies. The NSA is a catalog of images and parameters of local galaxies from surveys in the ultraviolet, optical, and near-infrared bands; it provides photometry of z &lt; 0.05 galaxies observed by the Sloan Digital Sky Survey. In that work, it is found that adding more galaxies improves the constraints, whereas here we find that constraints tend to saturate when considering multiple galaxies. However, in <ref type="bibr">Hahn et al. (2023)</ref>, the information is not extracted from noiseless galaxy properties but from noisy and dust-attenuated photometry. In that case, even at the level of a single galaxy, the constraints are poorer than the ones reported here. This is because some information is lost when using photometry instead of galaxy properties. 
Thus, it is not surprising that stacking thousands of galaxies yields better constraints when using photometry than those obtained from a few galaxies whose properties are known without errors.</p><p>We conclude that better constraints on the values of the cosmological and astrophysical parameters can be obtained by using the properties of multiple galaxies instead of one. In this case, a combination of better knowledge of the underlying manifold hosting the data and the possibility of constructing noisy estimates of global quantities is behind the performance of our models. It would be interesting to investigate whether some particular combinations of galaxies yield tighter constraints and, therefore, maximize the information content. That selection should also account for the robustness of the model. We leave all this for future work. </p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_0"><p>https://www.camel-simulations.org/</p></note>
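The train-on-one-suite, test-on-another robustness check described above can be illustrated with a toy sketch. This is not the paper's neural-network pipeline: a linear least-squares fit stands in for the networks, and the two synthetic "suites" (with different underlying property-parameter relations) stand in for SIMBA, IllustrisTNG, and Astrid; all weights and sizes are illustrative assumptions.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted parameter values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 is perfect, below 0 is worse than the mean."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

rng = np.random.default_rng(42)

# Mock "suite A": galaxy properties X linearly related to a parameter y
X_a = rng.normal(size=(500, 5))
w_a = np.array([0.5, -0.2, 0.1, 0.0, 0.3])
y_a = X_a @ w_a + 0.05 * rng.normal(size=500)

# Mock "suite B": same property space, but a different underlying relation,
# mimicking a different galaxy-formation model
X_b = rng.normal(size=(500, 5))
w_b = np.array([0.1, 0.4, -0.3, 0.2, 0.0])
y_b = X_b @ w_b + 0.05 * rng.normal(size=500)

# "Train" on suite A (least squares stands in for the neural network)
coef, *_ = np.linalg.lstsq(X_a, y_a, rcond=None)

# Evaluate in-suite vs. cross-suite
rmse_in, r2_in = rmse(y_a, X_a @ coef), r2(y_a, X_a @ coef)
rmse_cross, r2_cross = rmse(y_b, X_b @ coef), r2(y_b, X_b @ coef)
# rmse_cross is far larger than rmse_in, and r2_cross is negative:
# a model trained on one "suite" fails on the other
```

The sketch reproduces the qualitative behavior of Figure 6: in-suite predictions are precise, while cross-suite predictions degrade sharply because the learned property-parameter relation does not transfer.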
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1"><p>The Astrophysical Journal, 969:105 (13pp), 2024 July 10 Chawak et al.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_3"><p>https://pytorch.org/</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_4"><p>http://optuna.org/</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_5"><p>http://www.nsatlas.org/</p></note>
		</body>
		</text>
</TEI>
