<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Rapid construction of joint pulsar timing array data sets: the Lite method</title></titleStmt>
			<publicationStmt>
				<publisher>Monthly Notices of the Royal Astronomical Society</publisher>
				<date>09/09/2025</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10656395</idno>
					<idno type="doi">10.1093/mnras/staf1420</idno>
					<title level='j'>Monthly Notices of the Royal Astronomical Society</title>
<idno>0035-8711</idno>
<biblScope unit="volume">542</biblScope>
<biblScope unit="issue">4</biblScope>					

					<author>B Larsen</author><author>C_M F Mingarelli</author><author>P T Baker</author><author>J S Hazboun</author><author>S Chen</author><author>L Schult</author><author>S R Taylor</author><author>J Simon</author><author>J Antoniadis</author><author>J Baier</author><author>R N Caballero</author><author>A Chalumeau</author><author>Z Chen</author><author>I Cognard</author><author>D Deb</author><author>V DiMarco</author><author>T Dolch</author><author>I O Eya</author><author>E C Ferrara</author><author>K A Gersbach</author><author>D C Good</author><author>H Hu</author><author>A Kapur</author><author>S Kala</author><author>M Kramer</author><author>M T Lam</author><author>W G Lamb</author><author>T_J W Lazio</author><author>K Liu</author><author>Y Liu</author><author>M McLaughlin</author><author>D J Nice</author><author>B_B P Perera</author><author>A Petiteau</author><author>S M Ransom</author><author>D J Reardon</author><author>C J Russell</author><author>G M Shaifullah</author><author>L Speri</author><author>A Srivastava</author><author>G Theureau</author><author>J Wang</author><author>J Wang</author><author>L Zhang</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<title>ABSTRACT</title> <p>The International Pulsar Timing Array (IPTA)’s second data release (IPTA DR2) combines decades of observations of 65 millisecond pulsars from 7 radio telescopes. IPTA data sets should be the most sensitive data sets to nanohertz gravitational waves (GWs), but take years to assemble, often excluding valuable recent data. To address this, we introduce the IPTA ‘Lite’ analysis, where a Figure of Merit is used to select an optimal PTA data set to analyse for each pulsar, enabling immediate access to new data and preliminary results prior to full combination. We test the capabilities of the Lite analysis using IPTA DR2, finding that ‘DR2 Lite’ can be used to detect the common red noise process with an amplitude of $A = 4.8^{+1.8}_{-1.8} \times 10^{-15}$ at $\gamma = 13/3$. This amplitude is slightly large in comparison to the combined analysis, and likely biased high as DR2 Lite is more sensitive to systematic errors from individual pulsars than the full data set. Furthermore, although there is no strong evidence for Hellings-Downs correlations in IPTA DR2, we still find the full data set is better at resolving Hellings-Downs correlations than DR2 Lite. Alongside the Lite analysis, we also find that analysing a subset of pulsars from IPTA DR2, available at a hypothetical ‘early’ stage of combination (EDR2), yields results as competitive as those from the full data set. Looking ahead, the Lite method will enable rapid synthesis of the latest PTA data, offering preliminary GW constraints before the superior full data set combinations are available.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Table <ref type="table">1</ref>. Names and short descriptions of the three data sets we analyse in this work. Section 2 provides further details on each of the data sets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Name</head><p>Short description</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Full DR2</head><p>The fully-combined IPTA DR2 from <ref type="bibr">Perera et al. ( 2019 )</ref>, with T obs &gt; 3 yr filter from <ref type="bibr">Antoniadis et al. ( 2022 )</ref> for 53 pulsars in total.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>DR2 Lite</head><p>The Lite data set presented in this work, with a total of 53 pulsars selected using the FoM from single-PTA data subsets of Full DR2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>EDR2</head><p>An 'early' subset of Full DR2 which includes fully-combined data for only 22 pulsars corresponding to the highest FoM in DR2 Lite.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>than the individual constituent data sets comprising IPTA DR2. This provided a robust confirmation of the CRN first found by the regional PTA collaborations <ref type="bibr">(Arzoumanian et al. 2020b ;</ref><ref type="bibr">Chen et al. 2021b ;</ref><ref type="bibr">Goncharov et al. 2021b</ref> ). While it did not produce the first published measurement of a CRN process, we point out that IPTA DR2 could have presented the earliest opportunity to detect the CRN, as regional PTAs needed to collect approximately 2 more years of data than was used for IPTA DR2 in order to sufficiently resolve the signal. The delay in the IPTA DR2 analysis largely resulted from the resource-intensive process of data combination, which requires meticulous handling of different kinds of data, alignment of timing models, fitting for instrumental offsets, and iterative noise modelling across data sets <ref type="bibr">(Verbiest et al. 2016 ;</ref><ref type="bibr">Perera et al. 2019 )</ref>. This effort typically takes several years to complete, meaning that by the time a combined data set is released, portions of the underlying data are already outdated. To illustrate the time-scales, IPTA DR2 contains the NANOGrav 9-yr data set <ref type="bibr">(Arzoumanian et al. 2016</ref> ), but IPTA DR2 was not published until 3 yr later <ref type="bibr">(Perera et al. 2019 )</ref>. By the time the IPTA DR2 GWB search was carried out <ref type="bibr">(Antoniadis et al. 2022 )</ref>, the NANOGrav 12.5-yr data set had already been used to detect the CRN <ref type="bibr">(Arzoumanian et al. 2020a</ref> ). 
Thus, the CRN could theoretically have been measured 3 yr in advance of <ref type="bibr">Arzoumanian et al. ( 2020a )</ref> had IPTA DR2 been constructed immediately. Given the long time-scales required for new GW signals to emerge in PTA data sets, this motivates the need to either improve the speed of data combination or explore alternative methods for analysing joint-PTA data sets, supplementing the eventual results of a fully-combined data set.</p><p>To address this, we introduce a novel, resource-efficient approach we call the 'Lite' method. Instead of immediately performing full data combination, we evaluate pulsar data sets which have already been produced by individual PTAs, and select the most informative data set for each pulsar using a Figure of Merit (FoM), which quantifies sensitivity to a GW signal based on the data set properties using the theoretical scaling laws for the signal-to-noise ratio (S/N). For example, the GWB S/N from <ref type="bibr">Siemens et al. ( 2013 )</ref> suggests a FoM which increases as total observing time increases, as observation cadence increases, and as RMS white noise residual decreases. By selecting each pulsar's data based on the FoM, the Lite data set achieves the maximum theoretical sensitivity to GW signals possible among all available data prior to performing data combination. As such, a Lite data set analysis may provide an early look into what may result from a fully-combined analysis. A Lite data set will also be less computationally intensive to analyse than its fully-combined counterpart due to the reduced data volume.</p><p>Data combination is a slow, intensive process, with combined data sets built up one pulsar at a time. It is tempting then to also consider the result of analysing an early or intermediate combined data set, which includes just the first set of pulsars which have had their data combined. 
The FoM suggests which pulsars to combine first: combining pulsars in order, starting from highest FoM to lowest, will maximize the GWB sensitivity of any intermediate combined data set. This practice already has precedent within PTA analyses <ref type="bibr">(Babak et al. 2016 ;</ref><ref type="bibr">Speri et al. 2023 )</ref>. For example, the creation of EPTA DR2 with 25 pulsars (EPTA Collaboration 2023a ) was preceded by a version of EPTA DR2 using 6 pulsars <ref type="bibr">(Chen et al. 2021c )</ref>, which were originally selected based on their expected S/N for continuous GWs <ref type="bibr">(Babak et al. 2016 )</ref>. The 25 pulsars used for the EPTA DR2 GWB search (EPTA Collaboration 2023a , c ) were then selected to optimize the theoretical S/N of the Hellings-Downs curve, following the method from <ref type="bibr">Speri et al. ( 2023 )</ref>.</p><p>Here, we use IPTA DR2 as a test case to assess the benefits of performing GW searches using a Lite data set, an early-combined data set, and a fully-combined data set, reflecting the stages in which future combined data may be analysed. Table <ref type="table">1</ref> provides short names and descriptions of each of these data sets for ease of reference. Specifically, we test how the detection statistics and upper limits for a GWB evolve as more data are combined. This analysis framework thus quantifies the benefits and drawbacks of a rapid, on-the-fly Lite analysis, as well as the superior sensitivity offered by the full data combination.</p><p>The information from this analysis will also be valuable for the interpretation of present-day data sets. <ref type="bibr">Agazie et al. ( 2024 )</ref> performed comparisons and joint-analyses of the NANOGrav 15 yr data set <ref type="bibr">(Agazie et al. 2023a , b )</ref>, EPTA + InPTA DR2 (EPTA Collaboration 2023a , c ), and PPTA DR3 <ref type="bibr">(Reardon et al. 2023a ;</ref><ref type="bibr">Zic et al. 2023a</ref> ). 
The factorized likelihood cross-PTA analyses in <ref type="bibr">Agazie et al. ( 2024 )</ref> are similar in spirit to the Lite analysis method we present here, and the results suggest that IPTA DR3 will yield the most decisive detection to date of the Hellings-Downs curve, which is the definitive signature of an isotropic GWB imprinted in the cross-correlations between pulsar timing residuals <ref type="bibr">(Hellings &amp; Downs 1983 )</ref>. Our Lite analysis of IPTA DR2 will therefore be useful to calibrate expectations for IPTA DR3.</p><p>Our paper is laid out as follows: In Section 2, we detail IPTA DR2, the FoM, and our method of creating DR2 Lite from IPTA DR2 as a starting point. In Section 3, we describe the PTA likelihood, models, and parameters used in our Bayesian analysis of each data set. In Section 4, we assess how the statistics for both a CRN and a Hellings-Downs cross-correlated GWB evolve throughout each stage of data combination, as well as the impact of data combination on single pulsar noise characterization and ensemble noise properties. In Section 5, we discuss our results and future directions for the Lite method.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">IPTA DR2 AND THE LITE DATA SET</head><p>IPTA DR2 is the most recent IPTA-combined data set, fully detailed in <ref type="bibr">Perera et al. ( 2019 )</ref>. Steps required to create the combined data set include standardization of TOA flags (metadata), fitting for instrumental offsets, implementation of comprehensive timing and noise models, and simultaneous/iterative fits to the pulsar timing and noise model parameters <ref type="bibr">(Verbiest et al. 2016 ;</ref><ref type="bibr">Perera et al. 2019 )</ref>. IPTA DR2 includes TOAs from the NANOGrav 9-yr data set <ref type="bibr">(Arzoumanian et al. 2016 )</ref>, EPTA DR1 <ref type="bibr">(Desvignes et al. 2016 )</ref>, and PPTA DR1 <ref type="bibr">(Manchester et al. 2013 ;</ref><ref type="bibr">Reardon et al. 2016 )</ref>, as well as legacy NANOGrav timing data for PSRs J1713 + 0747, J1857 + 0943, and J1939 + 2134 <ref type="bibr">(Kaspi, Taylor &amp; Ryba 1994 ;</ref><ref type="bibr">Zhu et al. 2015 )</ref> and extended PPTA data for PSRs J0437-4715, J1744-1134, J1713 + 0747, and J1909-3744 <ref type="bibr">(Shannon et al. 2015 )</ref>. In total, IPTA DR2 includes 65 millisecond pulsars, with data sets spanning 0.5-30 yr, measured across 7 different telescopes. IPTA DR2 also features two versions, designated VersionA and VersionB, which were each created using different noise models. Throughout this work we use only VersionB. A few pulsars in IPTA DR2 are known by their Besselian names in NANOGrav data sets, though in this work we use their Julian names: J1857 + 0943 (B1855 + 09), J1939 + 2134 (B1937 + 21), and J1955 + 2908 (B1953 + 29). <ref type="bibr">Antoniadis et al. ( 2022 )</ref> carried out a GWB search on a subset of 53 pulsars from IPTA DR2 with &gt; 3 yr of data. 
Pulsars with shorter timespans do not resolve the GWB at low frequencies, and their timing models may not yet be converged <ref type="bibr">(Andrews, Lam &amp; Dolch 2020 )</ref>. This data set, designated here as Full DR2, is our benchmark against which to compare the results of the Lite analysis. The search from <ref type="bibr">Antoniadis et al. ( 2022 )</ref> yielded a strong detection of a CRN process (the autocorrelated component of a GWB). Detection statistics for Hellings-Downs correlations (the cross-correlated component of a GWB) were also computed, but the values were considered insufficient for a detection. IPTA DR2's very long observation timespan of 30.2 yr results from the inclusion of legacy data no longer used in some more recent PTA data releases (EPTA Collaboration 2023a ; <ref type="bibr">Agazie et al. 2023b ;</ref><ref type="bibr">Zic et al. 2023b</ref> ). These legacy data include TOAs observed at only a single radio frequency, which are suboptimal for accurately characterizing DM variations <ref type="bibr">(Shannon &amp; Cordes 2017 ;</ref><ref type="bibr">Lam et al. 2018a ;</ref><ref type="bibr">Sosa Fiscella et al. 2024</ref> ). It has since been shown empirically with EPTA DR2 (EPTA Collaboration 2023b , c ) and simulations <ref type="bibr">(Ferranti et al. 2025</ref> ) that including these types of single-frequency data can reduce the sensitivity of the PTA to cross-correlations between pulsar pairs, which must be used to resolve the Hellings-Downs curve. For simplicity and consistency with <ref type="bibr">Antoniadis et al. ( 2022 )</ref>, we include these TOAs in all versions of our analysis, but highlight that their presence should be considered during the interpretation of our results. We reserve an analysis assessing the impacts of legacy data in IPTA data sets for future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Selecting pre-combined pulsar data using a Figure of Merit</head><p>We next detail the methods of creating an IPTA Lite data set composed of pre-combined data from individual PTA data sets. In the presence of multiple data sets for a given pulsar, we select whichever data set maximizes the FoM. The FoM encodes the theoretical sensitivity of a pulsar to a particular GW signal, based solely on the properties of the pulsar's TOAs. For this work, we define our FoM according to the scaling laws for a GWB with spectral index &#947; = 13 / 3 in the intermediate regime where the lowest frequencies of the GWB have risen above the white noise level <ref type="bibr">(Siemens et al. 2013 )</ref>,</p><p>where T obs is the pulsar's total observation timespan, &#963; TOA is the average (harmonic mean) TOA error, and &#916;t is the average (geometric mean) time between observations. Intuitively, equation ( 1 ) rewards data sets with high data quantity, i.e. long timespans T obs and high data cadence c = 1 / &#916;t, as well as data sets with high data quality, i.e. smaller errors &#963; TOA . The factor of 3 / 13 results from the predicted spectral index of the emerging GWB. The intermediate regime GWB S/N is also proportional to the number of pulsars, but this is not included in the FoM because the FoM is used to select which data to use for a single pulsar. Different Lite data sets can also be curated for different nHz GW searches, as the theoretical S/N for each type of GW signal will follow a different scaling law. We additionally present the FoM for continuous GWs and for GW bursts with memory in Appendix A, but we do not explore these further in this work.</p><p>To create a Lite data set, first decide which GW signal to search for and select the FoM. Next, iterate through all pulsars of interest. 
Pulsars timed by a single PTA require no extra work to include them in the Lite data set, aside from ensuring terrestrial clock references and Solar system ephemeris versions are consistent and up to date among all pulsars. If a pulsar is timed by multiple PTAs, compute the FoM from each PTA's data for that pulsar, then take whichever FoM is largest and add the corresponding PTA's data to the Lite data set. The Lite data set is therefore the bespoke composition of uncombined pulsar data sets across different PTAs, which may be used from there to perform a joint GW search. The FoM-based selection approach has the advantage that it is purely based on the statistical properties of the data set itself and is agnostic to which PTA timed it. Lite data sets may be created immediately from the latest PTA data releases.</p></div>
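<div xmlns="http://www.tei-c.org/ns/1.0"><p>The selection procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code from the IPTA repository. It assumes that the FoM takes the form T_obs^(1/2) (sigma_TOA sqrt(Delta t))^(-3/13), which is our reading of the intermediate-regime scaling of Siemens et al. (2013) and of the description of equation (1); the function names, the input layout, and the toy pulsar and PTA names are all hypothetical.</p>

```python
import math


def harmonic_mean(values):
    """Harmonic mean, used here for the average TOA error."""
    return len(values) / sum(1.0 / v for v in values)


def geometric_mean(values):
    """Geometric mean, used here for the average time between observations."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def figure_of_merit(toas_yr, toa_errs_us):
    """ASSUMED form of the GWB FoM: T_obs^(1/2) * (sigma * sqrt(dt))^(-3/13).

    Units are arbitrary as long as they are the same for every data set
    being compared (here: years and microseconds).
    """
    times = sorted(toas_yr)
    t_obs = times[-1] - times[0]
    # Ignore zero-length gaps (simultaneous TOAs) so the logarithm is defined.
    gaps = [b - a for a, b in zip(times, times[1:]) if b > a]
    sigma = harmonic_mean(toa_errs_us)
    dt = geometric_mean(gaps)
    return math.sqrt(t_obs) * (sigma * math.sqrt(dt)) ** (-3.0 / 13.0)


def build_lite(data):
    """data: {pulsar: {pta: (toas_yr, toa_errs_us)}} -> {pulsar: (pta, fom)}."""
    lite = {}
    for psr, per_pta in data.items():
        foms = {pta: figure_of_merit(t, e) for pta, (t, e) in per_pta.items()}
        best = max(foms, key=foms.get)  # keep the PTA with the largest FoM
        lite[psr] = (best, foms[best])
    return lite


# Toy input: one pulsar timed by two hypothetical PTAs, one by a single PTA.
toy = {
    "J0000+0000": {
        "PTA_A": ([0.0, 1.0, 2.0, 3.0, 4.0], [1.0] * 5),
        "PTA_B": ([0.0, 0.5, 1.0, 1.5, 2.0], [1.0] * 5),
    },
    "J1111+1111": {"PTA_A": ([0.0, 2.0, 4.0], [0.5, 0.5, 0.5])},
}
lite = build_lite(toy)
```

<p>In this toy case the longer PTA_A data set is selected for the first pulsar even though PTA_B has twice the cadence, because under the assumed scaling the timespan enters with a larger exponent than the cadence.</p></div>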
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Creating intermediate data sets from IPTA DR2</head><p>We next detail the specifics behind the curation of our IPTA DR2-based data subsets, which together with Full DR2 are summarized in Table <ref type="table">1</ref> . IPTA DR2 is the most recent fully-combined IPTA data set and is therefore an excellent data set to test the performance of the Lite analysis. Here, we choose to create DR2 Lite starting from the IPTA DR2 release. This choice ensures the timing models are identical and thus any difference in GW sensitivity between DR2 Lite and Full DR2 results purely from the difference in data volume. In principle, one should also refit the timing models after reducing the data volume, or at least check to ensure the remaining timing residuals are within the regime of the linear timing model used during GW analyses. We empirically found the latter assumption to hold for the Lite version of IPTA DR2. Additionally, a maximally sensitive 'early' version of a fully-combined data set will start with a subset of pulsars that maximize the FoM. We use IPTA DR2 to create this hypothetical 'early' data set, which we call EDR2, drawing inspiration from the Gaia data releases (Gaia Collaboration 2021 ).</p><p>To create DR2 Lite starting from IPTA DR2, we first isolate each PTA's TOAs in each pulsar and compute the FoM from equation ( 1 ). Each pulsar in DR2 Lite then keeps only the PTA data with the largest FoM. Fig. <ref type="figure">1</ref> visualizes the results of this process by comparing the FoM computed for each PTA and each pulsar. DR2 Lite in total uses EPTA data for 33 pulsars, NANOGrav data for 8 pulsars, and PPTA data for 12 pulsars. Fig. <ref type="figure">1</ref> also shows each pulsar's FoM computed from the fully-combined data set; in nearly all cases the combined data result in a higher FoM, as expected from equation ( 1 ). 
The FoM is slightly lower only for PSRs J1012 + 5307 and J1909-3744 using the combined data. Following equation ( 1 ), this results if the newly combined TOAs have much larger errors on average than the TOAs included in the Lite data set and the timespan and data cadence do not appreciably increase in contrast.</p><p>We also use the FoM distribution in Fig. <ref type="figure">1</ref> to select which pulsars to include in EDR2. Specifically, we rank the pulsars in order of highest to lowest FoM, as computed from DR2 Lite. This ranking represents an optimal order for combining the data to maximize the sensitivity of the early-combined data set to a GWB. For EDR2, we choose the 22 highest ranked pulsars, represented by all pulsars to the right of the vertical line in Fig. <ref type="figure">1</ref> . We choose to use a 22 pulsar cut-off for this analysis for a number of reasons, though the exact number is ultimately arbitrary. Namely, this number should be large enough to avoid bias in GWB statistics due to a finite number of pulsars <ref type="bibr">(Johnson et al. 2022</ref> ), but still represents less than half of the total pulsars intended for the full analysis. This also cleanly cuts off the data set at PSR J2317 + 1439, which sees a large boost in the FoM post-combination based on Fig. <ref type="figure">1</ref> . Finally, this was also selected as a rough match for the number of pulsars with data combined for the upcoming IPTA DR3 at the time of performing this analysis <ref type="bibr">(Good &amp; International Pulsar Timing Array Team 2023 )</ref>.</p><p>Fig. <ref type="figure">2</ref> further visualizes each data set by displaying the observation times and radio frequencies of each TOA. Coloured markers represent the single-PTA data used in DR2 Lite, while coloured + black markers represent all data used in Full DR2. EDR2 pulsars are highlighted in gold. Several pulsars (e.g. 
PSR J0437-4715) are timed only by a single PTA; therefore, their Lite and combined data sets are identical. Other pulsars' combined data (e.g. PSR J1713 + 0747) present a clear advantage in total radio band coverage and observation cadence, especially in the latter half of the data set. For reproducibility, code for creating Lite data sets from IPTA DR2 can be found in the public IPTA GitHub repository. 1</p></div>
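<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ranking used to define an early-combined subset such as EDR2 amounts to sorting the pulsars by their Lite FoM and keeping the top entries. The sketch below illustrates this under the assumption that the FoM values are already available in a dictionary; the function names and the toy FoM values are hypothetical and do not correspond to values from Fig. 1.</p>

```python
def select_early_subset(lite_fom, n_early):
    """Rank pulsars from highest to lowest Lite FoM and keep the first n_early."""
    ranked = sorted(lite_fom, key=lite_fom.get, reverse=True)
    return ranked[:n_early]


def count_by_pta(lite_choice):
    """Count how many pulsars in a Lite data set come from each PTA."""
    counts = {}
    for pta in lite_choice.values():
        counts[pta] = counts.get(pta, 0) + 1
    return counts


# Toy values only: three hypothetical pulsars and their Lite FoM.
toy_fom = {"PSR_A": 1.0, "PSR_B": 3.0, "PSR_C": 2.0}
early = select_early_subset(toy_fom, 2)

toy_choice = {"PSR_A": "EPTA", "PSR_B": "PPTA", "PSR_C": "EPTA"}
per_pta = count_by_pta(toy_choice)
```

<p>For DR2 Lite the corresponding cut-off is n_early = 22, and the per-PTA counts quoted in the text (33 EPTA, 8 NANOGrav, and 12 PPTA pulsars) sum to the 53 pulsars of the data set.</p></div>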
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">ANALYSIS METHODS</head><p>We closely follow the methods and conventions in <ref type="bibr">Antoniadis et al. ( 2022 )</ref> for the Bayesian analysis of IPTA DR2 and its corresponding Lite data set. This will be a review for readers familiar with the prior work, although in Section 3.2 we also update the pulsar noise models over those used in <ref type="bibr">Antoniadis et al. ( 2022 )</ref>. We use a multivariate Gaussian likelihood to represent our timing residual vector &#948;t under the full signal model M , expressed compactly as </p><p>where &#960; (&#951;| M ) is our prior probability over &#951; and Z(&#948;t | M ) is the model evidence (or marginal likelihood). Equation ( <ref type="formula">3</ref>) is evaluated numerically using Markov chain Monte Carlo (MCMC) or nested sampling. Given two different models M 1 and M 0 , the ratio of model evidences is the Bayes factor,</p><formula>\mathcal{B}^{M_1}_{M_0} = \frac{Z(\delta t \mid M_1)}{Z(\delta t \mid M_0)}, \qquad (4)</formula><p>interpreted as the probability ratio for the data &#948;t under M 1 versus M 0 , or equivalently an odds ratio for M 1 versus M 0 given the data &#948;t (assuming equal prior odds for both models). B M 1 M 0 may be used as a detection statistic for a signal represented by M 1 if M 0 is the signal's null hypothesis. For nested models, equation ( <ref type="formula">4</ref>) is easily approximated using the Savage-Dickey density ratio <ref type="bibr">(Dickey 1971 )</ref>.</p><p>We construct the likelihood and priors using enterprise <ref type="bibr">(Ellis et al. 2020</ref> ) and enterprise extensions <ref type="bibr">(Taylor et al. 2021 )</ref>. We perform parameter estimation using PTMCMCSampler (MCMC with parallel tempering; <ref type="bibr">Ellis &amp; van Haasteren 2017 )</ref> as well as nautilus (nested sampling; Lange 2023 ). We next describe the models used to construct the likelihood and their parameters. </p></div>
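<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the Savage-Dickey density ratio for nested models, the sketch below estimates a Bayes factor from posterior samples of a log 10 amplitude with a uniform prior. It assumes that the null hypothesis is well represented by the lower edge of the amplitude prior, so that the Bayes factor is the prior density divided by the posterior density there, with the posterior density estimated from the fraction of samples falling in a narrow bin. The function name, the bin width, and the toy samples are hypothetical choices, not those of the analysis pipeline.</p>

```python
def savage_dickey_bf(samples, prior_min, prior_max, bin_width):
    """Savage-Dickey estimate of B(M1 over M0) for a uniform prior.

    B = prior density / posterior density, both evaluated at the null value,
    which is ASSUMED here to be the lower prior edge prior_min.
    Returns None when no samples fall in the bin, in which case only a
    lower limit on the Bayes factor can be quoted.
    """
    prior_density = 1.0 / (prior_max - prior_min)
    n_in_bin = sum(1 for s in samples if prior_min + bin_width > s >= prior_min)
    if n_in_bin == 0:
        return None
    posterior_density = n_in_bin / (len(samples) * bin_width)
    return prior_density / posterior_density


# Toy posterior identical to the prior U(-20, -11): the data are uninformative.
flat_samples = [-20.0 + 9.0 * (i + 0.5) / 900 for i in range(900)]
bf_flat = savage_dickey_bf(flat_samples, -20.0, -11.0, 0.5)

# Toy posterior concentrated away from the null: no samples near the edge.
bf_peaked = savage_dickey_bf([-14.0] * 100, -20.0, -11.0, 0.5)
```

<p>With an uninformative posterior the estimate is close to unity, while a posterior with no support near the null only bounds the Bayes factor from below, which is one reason a nested sampler may be preferred when the evidence is strong.</p></div>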
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Timing model</head><p>We start with the best-fitting timing model for each pulsar from <ref type="bibr">Perera et al. ( 2019 )</ref>. The timing model accounts for deterministic delays to a given pulsar's TOAs from effects such as pulsar spindown, astrometry, binary orbits, dispersion, frequency-dependent pulse profile evolution, and instrumental offsets. However, the presence of time-correlated noise will introduce perturbations to the best-fitting values of the timing model parameters. As such, each pulsar's timing model is varied using an approximate linearized timing model design matrix M &#8712; T , with elements defined</p><p>where t i is the ith TOA, &#946; j is the j th timing model parameter, and &#946; 0 ,j is the best-fit value of the j th timing model parameter <ref type="bibr">(van Haasteren &amp; Levin 2013 ;</ref><ref type="bibr">Taylor 2021</ref> ). The timing model coefficients, i.e. the offsets &#946; &#8722; &#946; 0 &#8712; W from the best-fit values, are then assigned improper uniform priors, which are implemented numerically as Gaussian priors with (near-)infinite variance, and then marginalized over when computing the likelihood following <ref type="bibr">Johnson et al. ( 2024 )</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Noise models</head><p>Pulsar noise models account at minimum for white noise, low-frequency red noise, and low-frequency chromatic noise. Here, we update the pulsar noise models for IPTA DR2 from those used in <ref type="bibr">Antoniadis et al. ( 2022 )</ref> to reflect recent advances in pulsar noise modelling. In particular, <ref type="bibr">Falxa et al. ( 2023 )</ref> recently performed a search for continuous GWs from individual SMBHBs in IPTA DR2 and found that the detailed noise models from <ref type="bibr">Chalumeau et al. ( 2022 )</ref>, which account for higher frequency sources of noise, were required to mitigate a spurious detection of a continuous GW. An optimal treatment of pulsar noise would necessitate the creation of fully customized pulsar noise models tailored to IPTA DR2 (e.g. <ref type="bibr">Lentati et al. 2016</ref> ), but this is beyond the scope of this work. Instead, we use effective pulsar noise models informed by published analyses from individual PTA data sets <ref type="bibr">(Goncharov et al. 2021a ;</ref><ref type="bibr">Chalumeau et al. 2022 ;</ref><ref type="bibr">EPTA Collaboration 2023b ;</ref><ref type="bibr">Reardon et al. 2023b ;</ref><ref type="bibr">Agazie et al. 2023c ;</ref><ref type="bibr">Larsen et al. 2024 )</ref>. We always use log 10 -uniform priors on the amplitude parameters of each noise process. These have been shown to be equivalent to spike and slab priors which enable noise model averaging (van Haasteren 2025 ). For each version of the data set, we perform one round of noise analysis in each pulsar for parameter estimation and model validation prior to full-PTA analyses, using the same noise models for each data set. 
Furthermore, it is also possible to improve these priors by using hierarchical modelling to create a 'population prior', which represents the ensemble noise properties of millisecond pulsars <ref type="bibr">(van Haasteren 2024 ;</ref><ref type="bibr">Goncharov &amp; Sardana 2025 )</ref>. In particular, hierarchical priors have been shown <ref type="bibr">(Goncharov et al. 2024 ;</ref><ref type="bibr">van Haasteren 2024 )</ref> to reduce bias in GWB parameter estimation under a scenario where pulsars with similar intrinsic red noise properties become misattributed to the autocorrelations of the GWB (see <ref type="bibr">Goncharov et al. 2022 ;</ref><ref type="bibr">Zic et al. 2022 for discussions)</ref>. While this scenario could be relevant for the analysis of IPTA DR2, we do not consider a GWB analysis using hierarchical priors, as the ensemble noise properties obtained from two data sets (i.e. DR2 Lite and Full DR2) will not be equivalent, and comparing results obtained under two different data sets with different priors is not straightforward (best left for future work). None the less, it is useful to compare the ensemble noise properties obtained under different data sets with hierarchical modelling. These results are isolated to Section 4.4 , while the remainder of this work uses the standard uninformative priors.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1">Achromatic red noise</head><p>Each pulsar's red noise is modelled as a rank-reduced Gaussian process using a N TOA &#215; 2 N f sine-cosine Fourier design matrix F &#8712; T with elements</p><p>where we use a linearly spaced frequency basis f , and N f is the number of frequencies used <ref type="bibr">(Lentati et al. 2013</ref> ). We place a power-law prior on the variance of the Fourier coefficients a &#8712; W at each frequency, parametrized in terms of the power spectral density</p><p>with uniform priors on the log 10 spectral amplitude at f = 1 / yr log 10 A RN &#8764; U( -20 , -11) and spectral index &#947; RN &#8764; U(0 , 7), while the Fourier coefficients a are marginalized over <ref type="bibr">(van Haasteren &amp; Levin 2013 )</ref>. Red noise processes may be reconstructed in the time domain as &#948;t RN = Fa by repeated draws from the posterior distribution over a <ref type="bibr">(Meyers et al. 2023 )</ref>. During a full-PTA analysis, the frequency basis for intrinsic pulsar red noise is defined to be equivalent to the CRN basis, f = ( 1 /T DR2 , . . . , 30 /T DR2 ) , where T DR2 &#8773; 30 yr is the timespan of IPTA DR2, and the frequencies are spaced in integer steps of 1 /T DR2 . During single pulsar noise analyses, the frequency basis is tailored to the pulsar's timespan, T obs , such that f = ( 1 /T obs , . . . , 30 /T DR2 ) . This is chosen because any noise below 1 /T obs in a given pulsar will be degenerate with pulsar spindown parameters. Meanwhile, the truncation frequency 30 /T DR2 is chosen to make sure each pulsar's white noise properties (which could depend on the cutoff if the spectrum is shallow) are consistent across both phases of the analysis. This red noise model is kept the same across all pulsars. However, PSR J1012 + 5307 also exhibits red noise up to very high Fourier modes <ref type="bibr">(Chalumeau et al. 2022 ;</ref><ref type="bibr">Falxa et al. 2023 ;</ref><ref type="bibr">EPTA Collaboration 2023b )</ref>. As such, we add an additional high-frequency power-law red noise process for PSR J1012 + 5307 with f = ( 1 /T obs , . . . , 150 /T DR2 ) during all stages of analysis.</p></div>
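<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Fourier-basis red noise model can be illustrated with a short, dependency-free Python sketch. It builds an N_TOA by 2 N_f sine-cosine design matrix on a linearly spaced frequency grid and evaluates a power-law power spectral density. Two conventions are assumed here rather than taken from the text: the columns alternate sine and cosine at each frequency, and the power law is normalized as P(f) = A^2 / (12 pi^2) (f / f_yr)^(-gamma) f_yr^(-3), with the prior variance of each coefficient taken as P(f) / T. The function names are hypothetical; the analysis itself uses enterprise.</p>

```python
import math

YR_S = 365.25 * 86400.0  # seconds in one Julian year


def fourier_design_matrix(toas_s, n_freqs, t_span_s):
    """Return (F, freqs) with F of shape N_TOA x 2*n_freqs.

    Frequencies are k / t_span_s for k = 1..n_freqs; columns alternate
    sine and cosine (an ASSUMED ordering).
    """
    freqs = [(k + 1) / t_span_s for k in range(n_freqs)]
    matrix = []
    for t in toas_s:
        row = []
        for f in freqs:
            row.append(math.sin(2.0 * math.pi * f * t))
            row.append(math.cos(2.0 * math.pi * f * t))
        matrix.append(row)
    return matrix, freqs


def powerlaw_psd(f_hz, log10_amp, gamma):
    """ASSUMED normalization: A^2 / (12 pi^2) * (f / f_yr)^(-gamma) / f_yr^3, in s^3."""
    f_yr = 1.0 / YR_S
    amp = 10.0 ** log10_amp
    return amp ** 2 / (12.0 * math.pi ** 2) * (f_hz / f_yr) ** (-gamma) * f_yr ** (-3)


def coefficient_variances(freqs, t_span_s, log10_amp, gamma):
    """Prior variance of each sine and cosine coefficient: P(f) * (1 / T)."""
    variances = []
    for f in freqs:
        phi = powerlaw_psd(f, log10_amp, gamma) / t_span_s
        variances.extend([phi, phi])
    return variances


# Toy usage: three TOAs, 30 frequencies spaced by 1 / (30 yr).
t_span = 30.0 * YR_S
F, freqs = fourier_design_matrix([0.0, 1.0e7, 2.0e7], 30, t_span)
phi = coefficient_variances(freqs, t_span, -15.0, 13.0 / 3.0)
```

<p>For a single-pulsar basis running from 1 / T obs up to 30 / T DR2, the same functions would be called with the pulsar's own timespan and a number of frequencies chosen so that the highest frequency does not exceed 30 / T DR2.</p></div>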
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2">Chromatic noise</head><p>Any time-correlated noise process that depends on the observing radio frequency, &#957;, is chromatic. The primary type of chromatic noise is DM noise, varying as &#948;t DM &#8733; &#957; -2 . Similarly to achromatic red noise, we use a Fourier-basis Gaussian process with a power-law prior to model DM noise using hyperparameters log 10 A DM &#8764; U( -20 , -11) and &#947; DM &#8764; U(0 , 7), with an additional scaling ( &#957;/1400 MHz ) -2 applied to the Fourier design matrix (equation ( <ref type="formula">6</ref>)). Following <ref type="bibr">Falxa et al. ( 2023 )</ref>, we allow the power-law frequencies to extend to higher frequencies than achromatic red noise, here using f = ( 1 /T obs , . . . , 150 /T DR2 ) . An additional fit for linear and quadratic variations in DM( t) is included in all timing models by default.</p><p>The solar wind also induces annual quasi-periodic DM variations which we model separately from the Fourier-basis DM Gaussian process. Assuming a spherically-symmetric, r -2 density profile surrounding the Sun, the DM induced by the solar wind is</p><p>where &#952; ( t) is the angle between the Earth-Sun and Earth-pulsar lines of sight, and n Earth ( t) is the time-dependent solar wind electron density measured at 1 AU from the Sun <ref type="bibr">(You et al. 2007 ;</ref><ref type="bibr">Hazboun et al. 2022 ;</ref><ref type="bibr">Ni&#355;u et al. 2024 )</ref>. The mean, time-independent component of the electron density is included as a timing model parameter for every pulsar and marginalized over. We additionally fit for time-dependent density perturbations n Earth &#8712; W along each pulsar's line-of-sight as a Gaussian process using the model from Ni&#355;u et al. 
( <ref type="formula">2024</ref>), with N b equal to the number of pulsar-Sun conjunctions in the pulsar's data set, and a separate variance parameter sampled for each pulsar using the prior log 10 &#963; n Earth &#8764; U( -4 , 2) electrons cm -3 . Since &#952; ( t) is bounded by the ecliptic latitude ( ELAT ) of each pulsar, many pulsars with large ELAT will be less sensitive to the solar wind, though there may be exceptions depending on radio-frequency coverage and TOA precision <ref type="bibr">(Susarla et al. 2024</ref> ). We only include the time-dependent model in pulsars for which ELAT &lt; 35 &#8226; and the model is favoured with Savage-Dickey Bayes factor B SW ( t) 0 &gt; 1 using IPTA DR2.</p><p>Pulsars may also experience non-dispersive chromatic noise due to effects such as interstellar scattering or pulse profile variability. Scattering results from pulse propagation through an inhomogeneous refractive medium <ref type="bibr">(Cordes &amp; Rickett 1998 ;</ref><ref type="bibr">Hemberger &amp; Stinebring 2008 )</ref>. A simple model for time-delays introduced by scattering is that &#948;t &#8733; &#957; -&#967; with &#967; = 4 . 4. However, this makes several assumptions, including that the refractive medium is described by Kolmogorov turbulence, the medium is isolated to a thin screen, the pulse is Gaussian, and the pulse broadening function is exponential <ref type="bibr">(Geiger et al. 2025 )</ref>. Violations of these assumptions can and do result in alternative values for &#967; in millisecond pulsars <ref type="bibr">(Turner et al. 2021 )</ref>, especially once transforming from estimates of the scattering delay to the timing residual <ref type="bibr">(Geiger et al. 2025</ref> ).</p><p>Here we account for some of this excess chromatic noise using the same Fourier-basis Gaussian process model as DM noise, except the radio-frequency scaling of the Fourier basis follows ( &#957;/1400 MHz ) -&#967; , with &#967; as a fit parameter. 
We incorporate the uncertainty on &#967; in our priors by using a truncated normal distribution, &#967; &#8764; N (4 , 0 . 5) &#215; U(2 . 5 , 10), where the lower-bound at &#967; = 2 . 5 prevents degeneracy with DM noise. We include this model for PSRs J0437-4715, J0613-0200, J1600-3053, J1643-1224, J1713 + 0747, J1903 + 0327, J1939 + 2134 based on the likely influence of scattering variations in these pulsars' timing residuals from prior <ref type="table">MNRAS 542</ref>, <ref type="table">3028-3048 (2025)</ref> work <ref type="bibr">(Alam et al. 2021 ;</ref><ref type="bibr">Srivastava et al. 2023 ;</ref><ref type="bibr">Reardon et al. 2023b ;</ref><ref type="bibr">Agazie et al. 2023d</ref> ). We also model a chromatic event in PSR J1713 + 0747 using the following deterministic signal <ref type="bibr">(Lam et al. 2018b )</ref>,</p><p>with uniform priors log 10 <ref type="bibr">charov et al. 2021a ;</ref><ref type="bibr">Antoniadis et al. 2022</ref> ). To improve computational efficiency, all chromatic parameters &#967;, &#967; d are varied during single pulsar noise analyses, but held fixed to their maximum a posteriori (MAP) values during subsequent full-PTA analyses.</p></div>
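<div xmlns="http://www.tei-c.org/ns/1.0"><p>The radio-frequency scaling of the Fourier basis described in this section can be illustrated with a minimal sketch. This is not the analysis code used in this work: the function name, argument names, and the sine/cosine column ordering are assumptions made for illustration. Setting the chromatic index to 0 gives an achromatic basis, 2 gives the DM basis, and other values give the higher-order chromatic basis.</p>
```python
import numpy as np

def chromatic_fourier_basis(toas, radio_freqs_mhz, n_freqs, t_span,
                            chi=2.0, ref_freq_mhz=1400.0):
    """Fourier design matrix with rows scaled by (nu / ref_freq)^(-chi).

    Hypothetical helper for illustration only.
    toas and t_span are in seconds, radio_freqs_mhz in MHz.
    """
    toas = np.asarray(toas, dtype=float)
    nu = np.asarray(radio_freqs_mhz, dtype=float)
    # Frequency grid f = (1/T, ..., n_freqs/T)
    freqs = np.arange(1, n_freqs + 1) / t_span
    phase = 2.0 * np.pi * np.outer(toas, freqs)
    basis = np.empty((toas.size, 2 * n_freqs))
    basis[:, 0::2] = np.sin(phase)  # assumed column ordering: sin, cos, sin, ...
    basis[:, 1::2] = np.cos(phase)
    # One chromatic weight per TOA, applied to every column of that row
    scale = (nu / ref_freq_mhz) ** (-chi)
    return basis * scale[:, None], freqs
```
<p>With chi = 2, a TOA observed at 700 MHz receives four times the weight of the same TOA observed at 1400 MHz, which is the sense in which low-frequency observations dominate the DM constraint.</p></div>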
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3">White noise</head><p>Many white noise parameters are included in IPTA-combined data sets to account for different systematic errors which may be unique to particular observing systems. We apply the same prescription as <ref type="bibr">Antoniadis et al. ( 2022 )</ref> for fitting white noise. Two parameter types are diagonal in the N matrix: EFAC, which applies a net scaling to the estimated TOA uncertainties, and EQUAD, which adds an additional net uncertainty in quadrature. We also apply ECORR parameters to NANOGrav TOAs, which are intended to model pulse jitter in sub-banded TOAs measured during the same observation epoch using uniform blocks along the diagonal band of the N matrix. Separate white noise parameters are applied to TOAs from different receiver and backend combinations in each pulsar, where these combinations are specified by each TOA's -group flag. All white noise parameters are varied during single pulsar noise analyses, and then held fixed to their MAP values during subsequent full-PTA analyses.</p></div>
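<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of how the three white noise parameter types enter the N matrix, the sketch below builds a dense covariance for a single receiver and backend combination. The function and argument names are hypothetical, and the ordering convention for EFAC and EQUAD (here EFAC scales only the TOA uncertainty) is an assumption; other conventions apply EFAC to the quadrature sum instead.</p>
```python
import numpy as np

def white_noise_covariance(toa_errs, epoch_ids, efac=1.0, equad=0.0, ecorr=0.0):
    """Dense white noise covariance for one receiver/backend group.

    Hypothetical helper for illustration only; all quantities in seconds.
    """
    sigma = np.asarray(toa_errs, dtype=float)
    epochs = np.asarray(epoch_ids)
    # Diagonal part: EFAC rescales the TOA errors, EQUAD adds in quadrature.
    # Assumed convention: sigma^2 -> EFAC^2 * sigma^2 + EQUAD^2.
    cov = np.diag(efac**2 * sigma**2 + equad**2)
    # ECORR: fully correlated noise among TOAs sharing an observing epoch,
    # which adds a uniform block of ECORR^2 for each epoch.
    same_epoch = epochs[:, None] == epochs[None, :]
    cov += ecorr**2 * same_epoch
    return cov
```
<p>In practice a separate set of parameters would be built for each group flag, and the block structure would be exploited rather than forming a dense matrix.</p></div>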
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Common signals</head><p>In full-PTA analyses, we search for an additional common red noise (CRN) process on top of all components in each pulsar's noise model. The CRN is modelled with the same power-law spectral density as the individual red noise models, equation ( <ref type="formula">7</ref>), using new parameters A CRN and &#947; CRN which are fit for simultaneously in all pulsars at once. In a single pulsar analysis, the achromatic red noise includes contributions from both common and intrinsic pulsar noise. Switching from the single pulsar to full-PTA analysis decouples the total achromatic red noise into the separate intrinsic and common channels. The CRN also uses the same frequency basis f for all pulsars. To be consistent with <ref type="bibr">Antoniadis et al. ( 2022 )</ref>, we use N freqs = 13 components for a frequency grid f = ( 1 /T DR2 , . . . , 13 /T DR2 ) for each analysis.</p><p>To model cross-correlations in a full-PTA analysis we define the cross-power spectral density,</p><p>where ab is the overlap reduction function (ORF) encoding the geometric cross-correlation between pulsars a and b as a function of their sky-separation angle, and S( f ) is the power spectral density in each pulsar, given by the form of equation ( <ref type="formula">7</ref>) if assuming a power law spectrum. One can specify different signals by the form of the ORF: ab = &#948; ab represents a purely autocorrelated CRN process (uncorrelated between pulsars), whereas ab given by the Hellings-Downs curve is the signature of an isotropic GWB under general relativity. Alternative ORFs given by monopolar and dipolar forms in pulsar sky separation angle would result from errors in terrestrial time standards <ref type="bibr">(Hobbs et al. 2012</ref> ) and conversion to the Solar system barycenter <ref type="bibr">(Champion et al. 2010 )</ref>, respectively.</p></div>
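<div xmlns="http://www.tei-c.org/ns/1.0"><p>The three overlap reduction functions discussed above can be written down compactly. The sketch below uses the standard closed form of the Hellings-Downs curve, normalised so that the autocorrelation of a pulsar with itself is 1; the function names are illustrative and not taken from any particular package.</p>
```python
import numpy as np

def hellings_downs(xi):
    """Hellings-Downs correlation for two distinct pulsars separated by xi (radians)."""
    x = (1.0 - np.cos(np.asarray(xi, dtype=float))) / 2.0
    # x * log(x) tends to 0 as x tends to 0; guard the logarithm at zero separation.
    xlogx = np.where(x > 0.0, x * np.log(np.where(x > 0.0, x, 1.0)), 0.0)
    return 1.5 * xlogx - 0.25 * x + 0.5

def monopole(xi):
    """Monopolar ORF, e.g. from errors in terrestrial time standards."""
    return np.ones_like(np.asarray(xi, dtype=float))

def dipole(xi):
    """Dipolar ORF, e.g. from errors in the Solar system barycentre."""
    return np.cos(np.asarray(xi, dtype=float))

def orf_matrix(unit_vectors, orf=hellings_downs):
    """ORF matrix for pulsars given as unit position vectors, shape (n_psr, 3)."""
    pos = np.asarray(unit_vectors, dtype=float)
    cos_xi = np.clip(pos @ pos.T, -1.0, 1.0)
    gamma = orf(np.arccos(cos_xi))
    np.fill_diagonal(gamma, 1.0)  # autocorrelation normalised to 1
    return gamma
```
<p>Replacing the matrix returned by orf_matrix with the identity corresponds to the purely autocorrelated CRN model.</p></div>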
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">IPTA DR2 LITE VERSUS COMBINED ANALYSIS RESULTS</head><p>Here we present the results of our common signal search and analysis of DR2 Lite (Table <ref type="table">1</ref>, row 2), in comparison with the same analysis of Full DR2, which was originally carried out in <ref type="bibr">Antoniadis et al. 2022</ref> (Table <ref type="table">1</ref>, row 1). We also perform an analysis on the fully-combined data set with only 22 pulsars, designated here as EDR2, that could reflect an intermediate stage of the data combination process (Table <ref type="table">1</ref>, row 3).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Common red noise</head><p>We first compare our inferences on the CRN parameters using each data set to see how much information we can learn about the common signal using the Lite method, and how much our inferences improve using the combined data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.1">Full-PTA parameter estimation</head><p>First we perform a simultaneous analysis of all pulsars in each data set to estimate the CRN parameters assuming a fixed spectral index &#947; CRN = 13 / 3. Fig. <ref type="figure">3</ref> compares the posterior PDFs for log 10 A CRN , 13 / 3 from each data set. We also perform the same comparison using a varied &#947; CRN model, for which we show the CRN posteriors in Fig. <ref type="figure">4</ref> . The median and 95 per cent credible intervals on CRN parameters from both models are reported in Table <ref type="table">2</ref> . We further report in Table <ref type="table">2</ref> the upper limit A 95 per cent CRN , 13 / 3 for each data set, estimated as the 95 per cent one-sided Bayesian credible interval after replacing the log 10 -uniform priors on A CRN , 13 / 3 with uniform priors.</p><p>We find that all three data sets, including DR2 Lite, are able to detect a CRN; however, there are differences in spectral characterization. In particular, the amplitude measured using DR2 Lite is systematically higher than the amplitude measured from the combined data sets, with a median amplitude A CRN , 13 / 3 measured 23 per cent larger using DR2 Lite than it is using Full DR2. This implies that DR2 Lite is allowing excess noise intrinsic to the pulsars to leak into the common channel <ref type="bibr">(Zic et al. 2022 )</ref>. The upside then is that performing data combination apparently helps to mitigate this effect. We explore this discrepancy further in Section 4.1.4 .</p><p>Additionally, Fig. <ref type="figure">4</ref> shows that the CRN parameter constraints become more precise as more data are added. To quantify the improvement in precision, we estimate the area A of the 2D region in log 10 A CRN , &#947; CRN enclosed within 95 per cent of the posterior from each data set. We find the ratio of areas with respect to Full DR2 to be A Lite / A Full = 2 . 
25 and A EDR2 / A Full = 1 . 23, i.e. Full DR2 is 2.25 times more precise at spectral characterization than DR2 Lite, and 1.23 times more precise than EDR2.</p><p>Fig. <ref type="figure">5</ref> further compares each data set using a more generic 'freespectral' model for the CRN, where we drop the assumption of a power-law spectrum and sample the timing residual power at each discrete frequency f i as independent parameters with prior log 10 . i &#8764; U( -10 , -4) in units of seconds <ref type="bibr">(Lentati et al. 2013 )</ref>. The top panel shows the posteriors on each log 10 . i from each data set, which plotted versus frequency represent the amplitude spectrum of the CRN. Here, it is valid to compare amplitude spectra from the different data sets as they share the same baseline T DR2 , otherwise the amplitude spectral density would be the relevant quantity to compare. Overplotted lines depict median power law spectra from each data set, for comparison. The bottom panel of Fig. <ref type="figure">5</ref> shows the log 10 Bayes factors, log 10 B CRN CRN -. i , for the free spectral CRN model versus the same model without the inclusion of common power in the frequency bin at frequency f i , as measured using the Savage-Dickey density ratio. These quantify the detection significance of the CRN at each frequency bin, where log 10 B CRN CRN -. i &gt; 0 indicates the data do favour the inclusion of additional common power at frequency f i .</p><p>EDR2's amplitude spectrum is comparable to Full DR2, with small deviations in the posteriors at higher frequencies. Both spectra are consistent with a power law form and display strongest detections of power in the 2 -3 nHz range. Meanwhile, the posterior power in DR2 Lite's amplitude spectrum exceeds that from the combined data sets at several frequencies, which is consistent with the higher powerlaw amplitude measured using DR2 Lite. 
We focus on the 1 and 4 nHz frequencies (corresponding to periods of &#8764; 30 and &#8764; 7 . 5 yr, respectively) where power is detected more significantly using DR2 Lite than the combined data sets. Relatively few pulsars in IPTA DR2 have sufficiently long timespans to contribute meaningfully to constraining the posterior at 1 nHz. The Fig. <ref type="figure">5</ref> cumulative histogram of pulsars over their observation timespan 1 /T obs illustrates this effect: Only 3 pulsars (PSRs J1713 + 0747, J1857 + 0943, J1939 + 2134) have T obs &gt; 21 yr, 2 of which (PSRs J1857 + 0943 and J1939 + 2134) have very large data gaps spanning &#8764; 10 yr in DR2 Lite, as shown in Fig. <ref type="figure">2</ref> . Unconstrained noise in even 1 of these pulsars using DR2 Lite could feasibly cause changes in timing residual power at 1 nHz. As for the 4 nHz posterior, it visibly deviates from the median power law fit obtained from Full DR2 and EDR2, likely producing the large amplitude measured in the power law analysis. While 41 pulsars have sufficient timespan to resolve this frequency, details in the noise characterization of a minority of especially sensitive pulsars may still produce such features in the common spectrum <ref type="bibr">(Hazboun et al. 2020 ;</ref><ref type="bibr">Larsen et al. 2024 )</ref>.</p></div>
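<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Savage-Dickey density ratio used for the per-frequency Bayes factors can be approximated directly from posterior samples of a log-uniform amplitude parameter. The sketch below is one simple histogram-based estimator, not necessarily the one used in this work: the function name and the choice of bin width are assumptions, and the lower limit returned when the tail is empty (based on a single hypothetical sample) is an illustrative convention.</p>
```python
import numpy as np

def savage_dickey_log10_bf(log10_amp_samples, prior_min, prior_max, bin_width=0.5):
    """Estimate log10 of the Bayes factor for a signal versus no signal.

    Hypothetical helper for illustration only. Assumes a uniform prior on
    log10 amplitude over [prior_min, prior_max], and treats the lowest
    bin of the prior range as a stand-in for zero amplitude.
    Returns (log10_bf, is_lower_limit).
    """
    samples = np.asarray(log10_amp_samples, dtype=float)
    prior_density = 1.0 / (prior_max - prior_min)
    # Number of posterior samples falling in the lowest-amplitude bin
    n_low = np.count_nonzero(prior_min + bin_width > samples)
    if n_low == 0:
        # No samples in the tail: only a lower limit is available,
        # taken here as the value implied by a single sample.
        return np.log10(prior_density * bin_width * samples.size), True
    post_density = n_low / (samples.size * bin_width)
    return np.log10(prior_density / post_density), False
```
<p>A posterior that simply returns the prior gives a value near 0, i.e. no preference for additional power in that frequency bin.</p></div>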
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.2">Factorized likelihood analysis</head><p>We gain additional information about the significance of an autocorrelated common process using the factorized likelihood <ref type="bibr">(Taylor et al. 2022 )</ref>. Using the factorized likelihood, the likelihood and common noise posteriors are estimated as a product of these statistics as measured from analyses of individual pulsars. The factorized likelihood not only enables rapid PTA analyses but also allows the analysis of different Lite data sets at no additional computational cost, as demonstrated in <ref type="bibr">Agazie et al. ( 2024 )</ref>. Using the factorized likelihood, we assign the CRN model a fixed spectral index &#947; CRN = 13 / 3 in each pulsar, while the intrinsic red noise model's spectral index is allowed to vary (otherwise, the intrinsic red noise and CRN signals are completely degenerate). Since the CRN may always be assigned to the intrinsic RN model during each single-pulsar analysis, the CRN amplitude in each pulsar is always finite at A CRN , 13 / 3 &lt; 10 -17 , therefore the Savage-Dickey density ratio can always be used to estimate the CRN Bayes factor from the product. <ref type="formula">10</ref>3 ) or greater, signifying strong preference for a CRN, in agreement with the full-PTA analysis. However, the results show progressively more support for a CRN as the data combination process continues, with the odds of a CRN increasing by a factor of &#8764; 10 3 . 4 going from DR2 Lite to EDR2, and again by a factor of &#8764; 10 2 . 7 going from EDR2 to Full DR2. This result is unsurprising as we add more data into the analysis, but none the less underscores the effectiveness of data combination for improving our detections of common signals.</p>
<table><head>Table 2. CRN parameters and statistics as a function of data set. Using the fixed &#947; CRN = 13 / 3 model, we report median and 95 per cent credible intervals on A CRN , 13 / 3 (using log 10 -uniform priors), 95 per cent upper limits A 95 per cent CRN , 13 / 3 (using uniform priors), and Bayes factors on the CRN B CRN , 13 / 3 0 estimated using the Savage-Dickey density ratio from a factorized likelihood analysis. Using the varied &#947; CRN model, we report median and 95 per cent credible intervals on A CRN and &#947; CRN . As data are progressively added from Lite to full, CRN detection statistics become more significant and parameter estimates become more precise, while the upper limit decreases with combined data.</head>
<row role="label"><cell>IPTA DR2 subset</cell><cell>A CRN , 13 / 3 (fixed &#947; CRN = 13 / 3)</cell><cell>A 95 per cent CRN , 13 / 3 (fixed &#947; CRN = 13 / 3)</cell><cell>log 10 B CRN , 13 / 3 0 (fixed &#947; CRN = 13 / 3)</cell><cell>A CRN (varied &#947; CRN )</cell><cell>&#947; CRN (varied &#947; CRN )</cell></row>
<row><cell>DR2 Lite</cell><cell>$4.8^{+1.8}_{-1.8} \times 10^{-15}$</cell><cell>$6.4 \times 10^{-15}$</cell><cell>$3.0 \pm 0.2$</cell><cell>$10.0^{+15.6}_{-6.5} \times 10^{-15}$</cell><cell>$3.6^{+0.9}$ &#8230;</cell></row>
<row><cell>&#8230;</cell><cell>&#8230;</cell><cell>$4.5 \times 10^{-15}$</cell><cell>$6.4 \pm 0.2$</cell><cell>$6.9^{+7.3}_{-3.9} \times 10^{-15}$</cell><cell>3. &#8230;</cell></row>
</table></div>
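<div xmlns="http://www.tei-c.org/ns/1.0"><p>The way single-pulsar posteriors combine under the factorized likelihood can be sketched with histograms on a shared amplitude grid. The following is a simplified illustration under stated assumptions, not the implementation used here: it assumes each pulsar supplies posterior samples of log 10 A CRN , 13 / 3 drawn under the same uniform prior in log amplitude, it uses plain histograms rather than a smoother density estimate, and the floor applied to empty bins is an arbitrary choice. All names are hypothetical.</p>
```python
import numpy as np

def factorized_posterior(per_pulsar_samples, prior_min, prior_max, n_bins=30):
    """Combine single-pulsar posteriors on a common log10-amplitude grid.

    Hypothetical helper for illustration only.
    Returns (bin_centres, combined_posterior_density, log10_bayes_factor).
    """
    edges = np.linspace(prior_min, prior_max, n_bins + 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]
    log_prior = -np.log(prior_max - prior_min)
    # p(A | all data) is proportional to p(A) * prod_i [ p(A | data_i) / p(A) ]
    log_post = log_prior * np.ones(n_bins)
    for samples in per_pulsar_samples:
        dens, _ = np.histogram(samples, bins=edges, density=True)
        # Floor empty bins to avoid log(0); the floor value is arbitrary.
        log_post += np.log(np.maximum(dens, 1e-12)) - log_prior
    log_post -= log_post.max()
    post = np.exp(log_post)
    post /= post.sum() * width
    # Savage-Dickey ratio evaluated in the lowest-amplitude bin
    log10_bf = np.log10(np.exp(log_prior) / post[0])
    return centres, post, log10_bf
```
<p>Because each pulsar enters only through its own histogram, dropping a pulsar or swapping in a different data set for it requires no new sampling, only a new product.</p></div>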
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.3">Dropout analysis</head><p>The factorized likelihood can be also used to  several pulsars' DFs &gt; 1. Using DR2 Lite, fewer DFs are found to be larger than 1, which is unsurprising as the Bayes factor for the CRN is lower. PSR J1909-3744's value of DF = 6 + 24 -6 &#215; 10 -4 using DR2 Lite is extremely small (even within bootstrapping errors), as such this pulsar seems to be in major tension with the CRN process when using DR2 Lite. This is highly unusual, but may be explained if PSR J1909 -3744 has a much lower upper limit on the CRN than the remaining pulsars, as discussed in Section 4.1.4 . Using Full DR2, PSR J1909-3744's DF is DF = 0 . 49 + 0 . 91 -0 . 08 , indicating PSR J1909-3744 is now more in line with the CRN and with the other combined pulsars. We follow up further with PSR J1909-3744 in Section 4.3 . PSR J1713 + 0747 is also an interesting case, switching from disfavouring to supporting the CRN, going from DF = 0 . 18 + 0 . 73 -0 . 14 using DR2 Lite to DF = 4 . 93 + 13 . 34 -0 . 72 using Full DR2. Overall, five pulsars experience significant (i.e. &gt; 2 &#963; ) boosts to their DF after data combination, while only PSR J2317 + 1439's DF is significantly higher using DR2 Lite.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.4">Assessing the large common amplitude in DR2 Lite</head><p>While the dropout analysis in Section 4.1.3 shows that the detection significance of the CRN may be skewed by individual outlier pulsars using DR2 Lite, it still leaves open the question why the CRN amplitude is systematically larger when using DR2 Lite as opposed to Full DR2. To investigate further, we apply a modified version of the dropout analysis which compares how the discrepancy in the measured amplitude depends on which pulsars are dropped. To quantify this discrepancy, we compute a distribution over the shift or difference between the values for A CRN , 13 / 3 from each data set, A = A Lite -A Full , the mean of which is defined</p><p>For brevity we have omitted the previously used subscripts so that A + A CRN , 13 / 3 , and line 12 makes explicit that we compute this difference simply by drawing Monte Carlo pairs of samples from the original factorized likelihood posteriors from each data set. This quantity is related to the tension metric from <ref type="bibr">Agazie et al. ( 2024 )</ref>.</p><p>When including all pulsars, the median of this difference comes out to A = 0 . 8 + 1 . 2 -1 . 0 &#215; 10 -15 , where the quantiles enclose 68 per cent of the distribution about the median. The value A = 0 is only just within the 68 per cent credible region, indicating a marginal tension on the level of 0 . 81 &#963; .</p><p>To test the dependence of this discrepancy on individual pulsar data sets, we recompute the CRN amplitude at &#947; CRN = 13 / 3 via the factorized likelihood method (equation 5 from <ref type="bibr">Taylor et al. 2022</ref> ) but with pulsar j removed from the data set, i.e.</p><p>where i indexes each pulsar's data set &#948;t i , and the -j subscripts are a shorthand to denote pulsar j has been dropped. 
We compute p -j (log 10 A |{&#948;t } -j ) for every pulsar j using both DR2 Lite and Full DR2, at which points combining with equation ( <ref type="formula">12</ref>) yields a new measure of the discrepancy, A -j , with the j th pulsar removed from both data sets. The bottom panel of Fig. <ref type="figure">8</ref> shows our estimates of A -j for each pulsar j individually removed by themselves. Shaded regions show the corresponding estimates A with no pulsars removed. To better understand what is causing any measured differences in A -j , we also show the estimates on A -j using each data set in the top panel of Fig. <ref type="figure">8</ref> (and corresponding shaded regions for A with no pulsars removed). If A -j &gt; A , then pulsar j is driving down the amplitude estimate, whereas if A -j &lt; A , then pulsar j is adding power to the common amplitude. There are several pulsars, when excluded from the analysis, that reduce the discrepancy in A . The most impactful pulsars on this end are PSRs J2145 -0750, J1744 -1134, and J0437 -4715; their individual removal reduces the discrepancy to a 0 . 26 &#963; , 0 . 32 &#963; , and 0 . 30 &#963; level, respectively. Interestingly, each pulsar skews the distributions A -j in a different way. PSR J2145 -0750 displays very loud red noise with a steep power law index &#947; &gt; 4 in IPTA DR2 <ref type="bibr">(Caballero et al. 2016 ;</ref><ref type="bibr">Perera et al. 2019 )</ref>; in Full DR2, this is decoupled from the CRN, whereas in DR2 Lite, the red noise is not sufficiently resolved from DM variations (see Table <ref type="table">4</ref> ; Section 4.3 ) nor is it resolved from the CRN, causing the red noise to pollute the CRN process. PSR J1744 -1134, among the best timers in the data set, affects the discrepancy simply by reducing A CRN , 13 / 3 using Full DR2, whereas there is less constraint on its noise using DR2 Lite (Table <ref type="table">4</ref> ). 
Finally, PSR J0437 -4715, which only has TOAs from the PPTA in IPTA DR2, strongly drives up the CRN amplitude for both data sets. Notably, PSR J0437 -4715 has the highest FoM behind PSR J1939 + 2134 and therefore has a disproportionately strong effect on the analysis, on top of very challenging noise properties to model due to its brightness, which may well contribute additional systematic error in this analysis (see e.g. <ref type="bibr">Lentati et al. 2016 ;</ref><ref type="bibr">Goncharov et al. 2021a ;</ref><ref type="bibr">Reardon et al. 2023b</ref><ref type="bibr">Reardon et al. , 2024 ) )</ref>. In summary, we consider the discrepancy in the CRN amplitude between DR2 Lite and Full DR2 to be statistically insignificant as it is not robust to outliers among the individual pulsars.</p><p>Finally, there are a few pulsars on the opposite end where the discrepancy widens with their removal. This is unsurprising if an equal number of pulsars also narrow the discrepancy. However, this analysis sheds insight into the nature of PSR J1909 -3744's extremely low DF from Section 4.1.3 . We see on the far right-hand side of Fig. <ref type="figure">8</ref> that removing PSR J1909 -3744 from the analysis results in a much larger CRN amplitude of A -j = 6 . 1 + 1 . 4 -1 . 5 &#215; 10 -15 (with errors enclosing 95 per cent quantiles). This shows PSR J1909 -3744's DF is so low because it is uniquely suppressing a higher-amplitude mode of the CRN posterior, which is more likely to be dominated by intrinsic pulsar noise. Meanwhile, the Full DR2 amplitude A -j when dropping PSR J1909 -3744 is shifted by much less using Full DR2, reflective of the fact that several more pulsars contribute constraints on the CRN measurement after performing data combination.</p></div>
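<div xmlns="http://www.tei-c.org/ns/1.0"><p>The distribution of the amplitude shift between DR2 Lite and Full DR2 described in this section is built from Monte Carlo pairs of posterior samples. A minimal sketch of that step is given below, assuming the two sets of amplitude samples are independent; the function name, the number of pairs, and the fixed random seed are illustrative choices rather than details taken from the analysis.</p>
```python
import numpy as np

def amplitude_shift(amp_lite, amp_full, n_pairs=100000, seed=0):
    """Distribution of A_Lite - A_Full from independently drawn sample pairs.

    Hypothetical helper for illustration only.
    Returns (delta_samples, median, (lower, upper)), where lower and upper
    are the 16th and 84th percentiles, enclosing 68 per cent about the median.
    """
    rng = np.random.default_rng(seed)
    a_lite = rng.choice(np.asarray(amp_lite, dtype=float), size=n_pairs)
    a_full = rng.choice(np.asarray(amp_full, dtype=float), size=n_pairs)
    delta = a_lite - a_full
    lower, median, upper = np.percentile(delta, [16.0, 50.0, 84.0])
    return delta, median, (lower, upper)
```
<p>Repeating this with samples from the posteriors that omit pulsar j from both data sets gives the corresponding leave-one-out shift.</p></div>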
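<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Cross-correlations</head><p>In order for the data to signify evidence of a GWB, they ought to favour a cross-correlated common signal following the Hellings-Downs curve, as opposed to a purely autocorrelated common signal. We assume that data combination will ultimately improve our ability to resolve a cross-correlated GWB over the use of uncombined data. IPTA DR2 has limited value to test this assumption, since the statistics for cross-correlations in IPTA DR2 obtained first by Antoniadis et al. ( <ref type="formula">2022</ref>) are well below the thresholds required for GW detection <ref type="bibr">(Allen et al. 2023</ref> ). Thus, the following comparisons of cross-correlation statistics obtained from the three data sets are merely exploratory, but still presented for completeness.</p>
<table><head>Table 4. &#8230; for Gaussian perturbations to the solar wind electron density along the Earth-pulsar line of sight. These Bayes factors were computed as the Savage-Dickey density ratio using each signal's relevant amplitude parameter; where there were a lack of samples in the tail of each parameter's posterior, we place a lower bound of B &gt; 10^3. Bold values mark cases where B &gt; 10, indicating strong evidence for the noise process. The total number of bolded entries are tallied at the bottom of each column. Entries with asterisks (*) indicate cases of severe covariance within the noise model parameter space, such that either, but not both, marked signals are favoured by the data (i.e. the data cannot distinguish the chromaticity of the process). The daggered (&#8224;) entries for PSR J1012 + 5307 indicate that the high-frequency achromatic red noise process was used to compute the Bayes factor, rather than the 30 frequency red noise process.</head>
<row role="label"><cell>Pulsar</cell><cell>DR2 Lite: red noise</cell><cell>DR2 Lite: DM noise</cell><cell>DR2 Lite: chromatic noise</cell><cell>DR2 Lite: solar wind</cell><cell>Full DR2: red noise</cell><cell>Full DR2: DM noise</cell><cell>Full DR2: chromatic noise</cell><cell>Full DR2: solar wind</cell></row>
<row><cell>&#8230;</cell><cell>&#8230;</cell><cell>&#8230;</cell><cell>&#8230;</cell><cell>&#8230;</cell><cell>&#8230;.8</cell><cell>0.7</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J1600-3053</cell><cell>0.8</cell><cell>1.2*</cell><cell>1.6*</cell><cell>1.7</cell><cell>15.7</cell><cell>3.0</cell><cell>&gt; 10^3</cell><cell>292.7</cell></row>
<row><cell>J1640 + 2224</cell><cell>0.7</cell><cell>0.7</cell><cell>-</cell><cell>-</cell><cell>1.1</cell><cell>&gt; 10^3</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J1643-1224</cell><cell>0.7</cell><cell>2.9*</cell><cell>1.0*</cell><cell>-</cell><cell>&gt; 10^3</cell><cell>&gt; 10^3</cell><cell>&gt; 10^3</cell><cell>-</cell></row>
<row><cell>J1713 + 0747</cell><cell>67.6</cell><cell>1.0</cell><cell>0.9</cell><cell>-</cell><cell>&gt; 10^3</cell><cell>&gt; 10^3</cell><cell>&gt; 10^3</cell><cell>-</cell></row>
<row><cell>J1730-2304</cell><cell>0.8</cell><cell>1.1</cell><cell>-</cell><cell>-</cell><cell>0.8</cell><cell>0.7</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J1738 + 0333</cell><cell>0.9</cell><cell>0.8</cell><cell>-</cell><cell>-</cell><cell>1.2*</cell><cell>1.8*</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J1744-1134</cell><cell>1.5*</cell><cell>1.1*</cell><cell>-</cell><cell>-</cell><cell>&gt; 10^3</cell><cell>50.0</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J1853 + 1303</cell><cell>0.8</cell><cell>0.8</cell><cell>-</cell><cell>-</cell><cell>1.4</cell><cell>1.1</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J1857 + 0943</cell><cell>9.6</cell><cell>&gt; 10^3</cell><cell>-</cell><cell>-</cell><cell>1087.1</cell><cell>&gt; 10^3</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J1909-3744</cell><cell>0.5</cell><cell>&gt; 10^3</cell><cell>-</cell><cell>-</cell><cell>&gt; 10^3</cell><cell>&gt; 10^3</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J1910 + 1256</cell><cell>0.8</cell><cell>30.9</cell><cell>-</cell><cell>-</cell><cell>0.8</cell><cell>13.7</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J1918-0642</cell><cell>1.2*</cell><cell>1.9*</cell><cell>-</cell><cell>4.1</cell><cell>2.3</cell><cell>&gt; 10^3</cell><cell>-</cell><cell>4.0</cell></row>
<row><cell>J1939 + 2134</cell><cell>&gt; 10^3</cell><cell>&gt; 10^3</cell><cell>&gt; 10^3</cell><cell>-</cell><cell>&gt; 10^3</cell><cell>&gt; 10^3</cell><cell>&gt; 10^3</cell><cell>-</cell></row>
<row><cell>J1955 + 2908</cell><cell>0.8</cell><cell>24.6</cell><cell>-</cell><cell>-</cell><cell>0.9</cell><cell>169.4</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J2010-1323</cell><cell>1.5</cell><cell>1.0</cell><cell>-</cell><cell>-</cell><cell>1.4</cell><cell>&gt; 10^3</cell><cell>-</cell><cell>-</cell></row>
<row><cell>J2124-3358</cell><cell>0.9</cell><cell>0.8</cell><cell>-</cell><cell>3.7</cell><cell>0.9</cell><cell>0.9</cell><cell>-</cell><cell>1.0</cell></row>
<row><cell>J2145-0750</cell><cell>2.2*</cell><cell>3.5*</cell><cell>-</cell><cell>0.8</cell><cell>&gt; 10^3</cell><cell>158.0</cell><cell>-</cell><cell>7.0</cell></row>
<row><cell>J2317 + 1439</cell><cell>0.7</cell><cell>0.7</cell><cell>-</cell><cell>-</cell><cell>0.6</cell><cell>&gt; 10^3</cell><cell>-</cell><cell>-</cell></row>
<row><cell>Total B &gt; 10</cell><cell>2</cell><cell>5</cell><cell>1</cell><cell>0</cell><cell>12</cell><cell>16</cell><cell>5</cell><cell>2</cell></row>
</table></div>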
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.1">Optimal statistic</head><p>We first assess the significance of Hellings-Downs cross-correlations with the optimal statistic (OS; <ref type="bibr">Anholm et al. 2009 ;</ref><ref type="bibr">Chamberlin et al. 2015 )</ref> using the defiant tool (<ref type="url">https://github.com/GersbachKa/defiant/tree/main</ref>). We specifically use the multiple component OS to measure all three ORFs (Monopole, Dipole, Hellings-Downs) simultaneously, which reduces any bias incurred while measuring one ORF due to the presence of another <ref type="bibr">(Sardesai et al. 2023 )</ref>. To account for CRN and intrinsic pulsar noise parameter uncertainties, we apply noise marginalization to obtain a distribution of the OS (i.e. the NMOS; <ref type="bibr">Vigeland et al. 2018</ref> ) over 3000 draws from our Bayesian posteriors. For each data set, we then follow <ref type="bibr">Vallisneri et al. ( 2023 )</ref> to obtain a Bayesian S/N (S/N Bayes ), which can be interpreted as a probability-weighted mean over the noise-marginalized S/N distribution. For our purposes we assume the GWB S/N null distribution is Gaussian (noting an accurate measure of GW significance mandates use of a GX2 null distribution; see Hazboun et al. 2023 ). Our analysis also neglects to account for covariance between pulsar pairs, which naturally arises in pulsar pairs with pulsars in common <ref type="bibr">(Allen &amp; Romano 2023 ;</ref><ref type="bibr">Johnson et al. 2024</ref> ). However, this assumption is justified here as we are in the weak S/N regime of the GWB cross-correlations <ref type="bibr">(Sardesai et al. 2023 )</ref>.</p><p>We report the S/N Bayes values we obtain for the Hellings-Downs ORF in the right-most column of Table <ref type="table">3</ref> . Using DR2 Lite, we are completely unable to resolve HD correlations from noise, as given by S/N Bayes = 0. The combined data sets yield higher, but still relatively low values of S/N Bayes = 0 . 
79 from Full DR2 and S/N Bayes = 0 . 92 from EDR2, with EDR2 yielding slightly higher significance despite including fewer pulsars than Full DR2. These results appear to be consistent with expectations from <ref type="bibr">Agazie et al. ( 2025 )</ref>, which found that the 30 noisiest pulsars can be dropped from the NANOGrav 15-yr data set before the Hellings-Downs S/N experiences any appreciable drop (here EDR2 is effectively Full DR2 with the 31 noisiest pulsars dropped). In this case, a slight increase in S/N with EDR2 may either imply excess noise from a set of the 31 remaining pulsars, or simply be a statistical fluctuation given the low values of S/N Bayes .</p><p>Fig. <ref type="figure">9</ref> further shows the full distributions of the squared amplitude $\hat{A}^2$ of each correlation signature, computed using the multiple component NMOS from each data set with uncertainty sampling <ref type="bibr">(Gersbach et al. 2025 )</ref>. The Hellings-Downs GWB amplitudes are consistent with each other and the reported values of S/N Bayes , with the combined data sets reducing the uncertainty from DR2 Lite. The presence of monopolar or dipolar correlations indicates additional systematic correlated noise (see Section 3.3 ). Most notable is that DR2 Lite produces a larger monopole than Full DR2, suggesting the presence of additional unmodelled noise in the uncombined data, which is then mitigated via data combination. The monopole amplitude is further reduced in EDR2 from Full DR2, implying a set of the 31 remaining pulsars may be partially responsible for the noise contributing to this monopole. The DR2 Lite and Full DR2 monopoles both correspond to S / N &#8764; 2 <ref type="bibr">(Antoniadis et al. 2022</ref> ). This S/N level is not particularly significant, as it was shown in <ref type="bibr">Agazie et al. 
( 2023a )</ref> that monopolar cross-correlations with higher corresponding S/N may arise frequently in simulations containing solely HD-correlated GWs and intrinsic pulsar noise. As such, it is possible these monopole measurements are statistical fluctuations, as opposed to the result of a real systematic such as a clock error. We leave a deeper analysis to understand the emergence and nature of monopolar cross-correlations in PTA data sets for future work. The dipole amplitude is highly consistent with zero in each case, which makes sense as the Solar system ephemeris version (DE436) used in this analysis is fixed, independently of how many TOAs are combined. However, the constraints on the amplitude of the dipolar correlations improve as data are combined.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.2">Bayes factors for Hellings-Downs correlations over common noise</head><p>We report the Bayes factors we measure using each data set in the middle column of Table <ref type="table">3</ref> . To check consistency, we report the Bayes factors and sampling uncertainties measured using likelihood reweighting <ref type="bibr">(Hourihane et al. 2023</ref> ) and using product-space sampling, also known as the HyperModel in enterprise_extensions <ref type="bibr">(Hee et al. 2016 ;</ref><ref type="bibr">Johnson et al. 2024 )</ref>. We estimate the uncertainties on the Bayes factors using the effective sample size for reweighting <ref type="bibr">(Hourihane et al. 2023 )</ref>, and using bootstrapping methods for the HyperModel. The two methods agree on the Bayes factor estimates within sampling uncertainties in all cases except for DR2 Lite. However, it is likely that the uncertainty using the HyperModel is underestimated, as the model switch parameter usually has a large autocorrelation length. In comparison to Full DR2 ( B &#8764; 1 . 4), the cross-correlated model is less favoured using DR2 Lite ( B &#8764; 0 . 6) and slightly more favoured using EDR2 ( B &#8764; 2 . 8; Table <ref type="table">3</ref> ). Although it is unexpected and interesting that EDR2 should return the highest Bayes factor, this appears to be consistent with the OS analysis, especially given we are not considering the presence of a monopole. Overall, even though the Bayes factors are all O(1), these results are a promising sign regarding the potential for combined data sets to improve measurements of cross-correlations over Lite data sets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Single pulsar noise</head><p>We next compare how the characterization of pulsar noise changes depending on whether we use Full DR2 or DR2 Lite. Out of the 53 pulsars in IPTA DR2, 24 have timing data from 2 or more PTAs, while the remaining pulsars' data are the same in Full DR2 and DR2 Lite. Therefore, we focus on how the noise properties, primarily red noise and chromatic noise, compare across these 24 pulsars. We do not assess the changes to the white noise or timing model parameters, but we acknowledge these parameters also play a large role in the total characterization of the pulsar.</p><p>First, we examine how many new noise processes are detected using Full DR2 versus DR2 Lite. Table <ref type="table">4</ref> shows the Bayes factors measured for each noise process and pulsar using DR2 Lite and Full DR2. These are estimated using the Savage-Dickey density ratio applied to the amplitude parameter of each noise process, e.g. B PLRN 0 in PSR J0030 + 0451 is the Bayes factor for a model with DM noise and red noise versus a model with DM noise only. We compare the Bayes factors for red noise, DM noise, higher-order chromatic noise, and time-dependent perturbations to the solar wind density. Bold parameters indicate the noise process is measured with B &gt; 10, and where there are no posterior samples in the tail, we place a lower limit of B &gt; 1000. Dashes indicate the process was not included in that pulsar's noise model. 
While these Bayes factors all depend on our choice for how to construct the prior, we are applying the same model and priors to each data set; therefore, the prior bias becomes less important for the purpose of performing a comparison.</p><p>Using the threshold $B &gt; 10$ to indicate a probable detection of the noise process, we find from Table <ref type="table">4</ref> that, using DR2 Lite, achromatic red noise is detected in 2/24 pulsars, DM noise in 5/24, higher-order chromatic noise in 1/24, and solar wind density variations in 0/24. Meanwhile, using Full DR2, achromatic red noise is detected in 12/24 pulsars, DM noise in 16/24, and higher-order chromatic noise in 5/24, indicating 10 new detections of red noise, 11 new detections of DM noise, and 4 new detections of chromatic noise in the combined data. Solar wind density variations are newly detected in 2/17 pulsars. Overall, 16/24 pulsars show a newly detected noise process that was not detected using DR2 Lite.</p><p>Additionally, we find some of these noise processes are not detected in the Lite data set specifically because of source confusion. Bayes factors marked with asterisks in Table <ref type="table">4</ref> signify that there is strong parameter covariance between two or more processes in the model, such that the removal of one process from the model increases the detection significance of the other, and vice versa. In other words, one or more processes are favoured by the data, but the data cannot distinguish which one it is <ref type="bibr">(Lentati et al. 2016;</ref> <ref type="bibr">EPTA Collaboration 2023b;</ref> <ref type="bibr">Ferranti et al. 2025</ref> each discuss this effect in more depth). We observed this behaviour in 8 pulsars using DR2 Lite. For PSRs J1600-3053 and J1643-1224 the source confusion is between DM and chromatic noise. For PSRs J1022+1001, J1024-0719, J1744-1134, J1918-0642, and J2145-0750, the source confusion is between DM and achromatic red noise. 
For PSR J0613-0200, the data prefer either to include only DM noise, or to include both achromatic and higher-order chromatic noise. This behaviour is detailed more clearly for PSR J0613-0200 in Fig. <ref type="figure">10</ref>. In all of these cases, using Full DR2 results in a strong measurement of one or more of these noise processes, resolving the source confusion. Additionally, for PSR J1738+0333 there is no evidence of achromatic red noise or DM noise in DR2 Lite, but using Full DR2 the pulsar enters the source confusion regime between the two signals.</p><p>We further find, from comparing the chromatic, DM, and achromatic red noise parameters from the single-pulsar noise analyses of all 24 pulsars, that the changes in the noise parameters going from DR2 Lite to Full DR2 fall under three general categories:</p><p>(i) Consistency and improved constraints: In this case, the noise parameter posteriors measured using Full DR2 are more constrained than the posteriors measured using DR2 Lite, but both sets of posteriors are consistent with one another. This demonstrates the expected effect that adding more data results in more constrained posteriors. The majority of pulsars (15 out of 24) best fit into this category: PSRs J0030+0451, J0613-0200, J1012+5307, J1022+1001, J1024-0719, J1455-3330, J1600-3053, J1640+2224, J1738+0333, J1744-1134, J1857+0943, J1918-0642, J2010-1323, J2145-0750, J2317+1439.</p><p>(ii) Inconsistency and improved constraints: In this case, using the combined data changes the noise parameter posteriors such that there is noticeable tension between DR2 Lite and Full DR2 in the posteriors for one or more parameters. This could arise if we have model misspecification, or if additional unmitigated chromatic noise enters into the data combination. Otherwise, there may be more complicated interactions between the different model components than expected. 
Three out of twenty-four pulsars most cleanly fit into this category: PSRs J1643-1224, J1713+0747, and J1909-3744.</p><p>(iii) Consistency and similar constraints: In this case, the posterior distributions are very similar and consistent with one another. This indicates the full data combination is not significantly improving noise characterization at the single pulsar level. Six out of twenty-four pulsars best fit into this category: PSRs J1730-2304, J1853+1303, J1910+1256, J1939+2134, J1955+2908, and J2124-3358.</p><p>These categories summarize the effects of data combination on noise parameter characterization that we observe here, though some pulsars straddle the line between categories.</p><p>We next follow up with closer examinations of three pulsars, PSRs J0613-0200, J1909-3744, and J1910+1256, each of which serves as an illustrative case for one category. Figs 10-12 compare their noise parameter distributions obtained using DR2 Lite (orange) and using Full DR2 (blue). We also show the CRN parameters from Full DR2 overplotted on the total achromatic red noise parameters in each figure.</p></div>
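<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Savage-Dickey estimate behind the Bayes factors in Table 4 can be illustrated with a short sketch. This is not the code used for the analysis; it is a minimal pure-Python example that assumes a uniform prior on the log-amplitude between hypothetical bounds prior_lo and prior_hi, and treats the lowest tail_width dex of that prior as the region where the process is effectively absent. The function name, the bounds, and the tail width are all assumptions introduced for illustration. The Bayes factor is the ratio of prior to posterior probability mass in that tail.</p>
<p><![CDATA[
```python
def savage_dickey_bf(log10_amp_samples, prior_lo, prior_hi, tail_width=1.0):
    """Savage-Dickey Bayes factor for a noise process from posterior samples.

    Assumes a uniform prior on log10(A) over [prior_lo, prior_hi] and uses the
    lowest `tail_width` dex of the prior as the "no signal" region.
    Returns (bayes_factor, is_lower_limit).
    """
    n = len(log10_amp_samples)
    if n == 0:
        raise ValueError("need at least one posterior sample")
    # Prior probability mass in the tail (uniform prior in log10 A).
    prior_mass = tail_width / (prior_hi - prior_lo)
    # Number of posterior samples that fall in the same tail.
    n_tail = sum(1 for x in log10_amp_samples if x <= prior_lo + tail_width)
    if n_tail == 0:
        # No samples in the tail: report the limit implied by one
        # hypothetical tail sample instead of dividing by zero.
        return prior_mass * n, True
    return prior_mass * n / n_tail, False
```
]]></p>
<p>With samples spread evenly over the prior the estimate is close to 1 (no preference), while a posterior that never visits the tail only yields a lower limit whose value is set by the number of samples. The text above quotes a fixed limit of $B &gt; 1000$ in that situation; the sample-count limit used in this sketch is a simplification.</p></div>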
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.1">PSR J0613-0200 | Consistency and improved constraints</head><p>Fig. <ref type="figure">10</ref> shows that the Lite data set for PSR J0613-0200 is not sufficient to distinguish the different sources of noise from each other. The 1D posteriors over the DM and red noise amplitudes each have long tails, with posterior support near $\log_{10} A_{\rm RN} = -20$ and $\log_{10} A_{\rm DM} = -20$, indicating the signals are not detected. However, the 2D posterior over both parameters shows a deficiency of samples where both $\log_{10} A_{\rm RN} = -20$ and $\log_{10} A_{\rm DM} = -20$; i.e. the data do favour inclusion of at least one signal with high significance, but cannot distinguish which. PSR J0613-0200 also includes higher-order chromatic noise in its model, and this same effect also occurs between $\log_{10} A_{\rm DM}$ and $\log_{10} A_{\rm Chr}$. This means that in this case, a side effect of including the higher-order chromatic noise is to slightly increase the detection significance for achromatic red noise, rather than to decrease it. This source confusion effect does not occur directly between $\log_{10} A_{\rm Chr}$ and $\log_{10} A_{\rm RN}$, as there are still samples in the 2D region where $\log_{10} A_{\rm RN} = -20$ and $\log_{10} A_{\rm DM} = -20$ (instead the L-shape of the 2D 95 per cent credible interval in Fig. <ref type="figure">10</ref> is just a product of their 1D posteriors). In contrast, using Full DR2 allows much more precise measurements of each of these processes, and they are better distinguished from one another. These improvements likely result from the improved data cadence and radio frequency bandwidth achieved by data combination for this pulsar.</p><p>PSR J0613-0200 also shows the strongest preference for the CRN out of the pulsars in EDR2 based on the DFs from Section 4.1.3, as well as in the original IPTA DR2 analysis <ref type="bibr">(Antoniadis et al. 2022)</ref>. This is demonstrated in Fig. 
<ref type="figure">10</ref> by the excellent overlap between the achromatic red noise parameters from the single-pulsar analysis and the CRN parameters from the full-PTA analysis using Full DR2. Meanwhile, the red noise parameters measured using DR2 Lite are much less constrained, indicating weaker evidence for a detection of red noise as well as a much higher upper limit on red noise. In total, the improved characterization of pulsar noise resulting from the data combination translates directly here to an improved measurement of the CRN. We can infer a similar story is at play for several of the other pulsars with improved constraints, particularly those pulsars in Table <ref type="table">4</ref> with covariances recorded between noise parameters, as well as those that show improvements to their DFs in Fig. <ref type="figure">7</ref>. For other pulsars, such as PSR J2317+1439, the data combination improves their characterization of DM noise but not achromatic red noise.</p></div>
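<div xmlns="http://www.tei-c.org/ns/1.0"><p>The source confusion pattern described for PSR J0613-0200, where each 1D amplitude posterior has a low-amplitude tail but the joint posterior is depleted where both amplitudes are low, can be checked numerically. The sketch below is an illustrative heuristic, not the procedure used to place the asterisks in Table 4. The function names, the amplitude threshold, and the two cut-off values (min_marginal and max_ratio) are assumptions chosen for the example.</p>
<p><![CDATA[
```python
def tail_fractions(log10_a1, log10_a2, threshold):
    """Fractions of paired posterior samples with each amplitude, and both
    amplitudes together, below `threshold`."""
    n = len(log10_a1)
    if n == 0 or n != len(log10_a2):
        raise ValueError("need two non-empty sample lists of equal length")
    low1 = sum(1 for a in log10_a1 if a < threshold) / n
    low2 = sum(1 for b in log10_a2 if b < threshold) / n
    both = sum(1 for a, b in zip(log10_a1, log10_a2)
               if a < threshold and b < threshold) / n
    return low1, low2, both


def looks_confused(log10_a1, log10_a2, threshold,
                   min_marginal=0.05, max_ratio=0.1):
    """Heuristic flag for source confusion between two noise processes.

    True when both marginals have support in the low-amplitude tail, yet the
    joint tail holds far fewer samples than independence would predict.
    """
    low1, low2, both = tail_fractions(log10_a1, log10_a2, threshold)
    if low1 < min_marginal or low2 < min_marginal:
        return False  # at least one process is individually well detected
    return both < max_ratio * low1 * low2
```
]]></p>
<p>A flag of True corresponds to the case where the data favour at least one of the two processes but cannot say which. If the two amplitudes were independent, the joint tail fraction would be close to the product of the two marginal fractions and the flag would stay False.</p></div>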
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.2">PSR J1909-3744 | Inconsistency &amp; improved constraints</head><p>Unlike the case for PSR J0613-0200, Fig. <ref type="figure">11</ref> shows that the achromatic red noise distributions obtained using DR2 Lite and Full DR2 are in tension with each other. Using DR2 Lite, no red noise is detected in PSR J1909-3744; however, the red noise detected using Full DR2 lies above the expected region using DR2 Lite (i.e. the 68 per cent 2D credible regions do not overlap). This makes it appear that a red noise process has emerged in the combined data where there was none previously in DR2 Lite. There is no difference in observation timespan between DR2 Lite and Full DR2 for this pulsar, so non-stationary noise is unlikely to cause this. Furthermore, Fig. <ref type="figure">11</ref> shows both data sets detect DM variations with similar characteristics in PSR J1909-3744, which makes the possibility of new chromatic noise entering into the combined data set seem unlikely. However, the DM noise parameters are measured more precisely using the fully-combined data. Improving the precision on DM variations as a result of improved cadence and radio frequency coverage (shown in Fig. <ref type="figure">2</ref>) may have the effect of helping to uncover the red noise present in the fully-combined data. This effect only makes sense in tandem with the $\log_{10}$-uniform priors used on $A_{\rm RN}$, which heavily downweight the presence of noise in the data set. Another possibility is that some level of model misspecification is at play, resulting in inconsistent noise properties across the two data sets.</p><p>PSR J1909-3744 is the pulsar in the most tension with the CRN using DR2 Lite, with the lowest DF out of all pulsars (Fig. <ref type="figure">7</ref>). The CRN parameters from Full DR2 overplotted in Fig. 
<ref type="figure">11</ref> help explain this: they are well above the 95 per cent Bayesian credible interval of the total achromatic red noise measured using the Lite data set under the $\log_{10}$-uniform prior. Meanwhile, using EDR2, PSR J1909-3744 is more agnostic to the CRN measurement, indicated in Fig. <ref type="figure">11</ref> by the modest overlap between the achromatic red noise using Full DR2 and the CRN. These effects are also evident in the corner plots for PSRs J1713+0747 and J1744-1134 (not shown): the achromatic red noise measured using DR2 Lite lies below the level of the red noise measured using Full DR2.</p></div>
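<div xmlns="http://www.tei-c.org/ns/1.0"><p>The statement that the CRN lies above the 95 per cent credible interval of a pulsar's red noise can be made concrete with a one-sided upper limit computed from posterior samples. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the comparison uses a single reference amplitude in place of the full CRN posterior, and it works on the 1D amplitude posterior, which is cruder than the 2D credible regions shown in the figures.</p>
<p><![CDATA[
```python
import math


def upper_limit(samples, level=0.95):
    """Empirical one-sided upper limit: the smallest sample value with at
    least a fraction `level` of the samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    index = max(math.ceil(level * len(ordered)) - 1, 0)
    return ordered[index]


def exceeds_upper_limit(log10_amp_samples, reference_log10_amp, level=0.95):
    """True if a reference amplitude (for example a CRN estimate) lies above
    the single-pulsar upper limit on the red noise amplitude."""
    return reference_log10_amp > upper_limit(log10_amp_samples, level)
```
]]></p>
<p>A pulsar for which this check returns True is in tension with the reference amplitude at the chosen level. Because the limit depends on the prior, the result should be read together with the $\log_{10}$-uniform prior used for the amplitude.</p></div>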
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.3">PSR J1910+1256 | Consistency &amp; similar constraints</head><p>PSR J1910+1256 is the last unique case we cover here. Unlike the previous cases, Fig. <ref type="figure">12</ref> shows the posteriors measured for PSR J1910+1256 using DR2 Lite are similar to the posteriors measured using Full DR2, with nearly identical red noise and very similar DM noise. Thus, there apparently exist cases where data combination does not improve noise characterization at the single-pulsar level. The largest factor playing into this may simply be that not enough new data are added into the combination. Indeed, PSR J1910+1256 has one of the lowest FoMs out of the multi-PTA pulsars (Fig. <ref type="figure">1</ref>), and also has a DF &#8764; 1 (Fig. <ref type="figure">7</ref>), indicating its combined data set is not yet advanced enough to measure the CRN. While a future combination may yield the emergence of a red noise process in this and similar pulsars, it is also important to consider that pulsar sensitivity may eventually saturate, and therefore reduce the effect of combined data. For example, even with increasing telescope sensitivity, some pulsars will eventually become jitter-noise limited. However, it does not seem we have reached this regime for the majority of pulsars <ref type="bibr">(Lam et al. 2019)</ref>. Other pulsars, such as PSR J1939+2134, may be dominated by very strong intrinsic red noise processes, explaining why PSR J1939+2134's noise is well-measured in both DR2 Lite and Full DR2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Ensemble noise properties</head><p>For our final analysis, we use the framework of hierarchical Bayesian modelling <ref type="bibr">(Loredo &amp; Hendry 2019;</ref> <ref type="bibr">Thrane &amp; Talbot 2019;</ref> <ref type="bibr">van Haasteren 2024)</ref> to estimate the ensemble noise properties of intrinsic red noise and DM variations from each data set (DR2 Lite, EDR2, and Full DR2), following closely the procedures described in <ref type="bibr">Goncharov &amp; Sardana (2025)</ref>. The ensemble properties are encoded in the distribution $p(\theta | M_\Lambda)$, where we define $\theta$ as our noise parameters, $\theta = \{\log_{10} A_{\rm RN}, \gamma_{\rm RN}, \log_{10} A_{\rm DM}, \gamma_{\rm DM}\}$. $M_\Lambda$ designates the hierarchical model with hyperparameters $\Lambda$, which are learned from the data $\delta t$, while $M$ designates the original model using uninformative priors. For the current analysis, the ensemble noise distribution $p(\theta | M_\Lambda)$ is neither a prior nor a posterior on pulsar noise parameters. Rather, it is a posterior predictive distribution for noise parameters, as it is informed by the population of millisecond pulsars in the current data set. For an independent analysis, $p(\theta | M_\Lambda)$ could be used as a population-informed prior distribution on the noise parameters.</p><p>We estimate the ensemble properties following <ref type="bibr">Goncharov &amp; Sardana (2025)</ref> by first inferring the hyperparameters $\Lambda$ from single pulsar analyses using the marginalized likelihood obtained from importance sampling <ref type="bibr">(Thrane &amp; Talbot 2019)</ref>: $$\mathcal{L}(\delta t | \Lambda, M_\Lambda) = \prod_{i=1}^{N_{\rm psr}} \int \mathcal{L}(\delta t_i | \theta_i)\, \pi(\theta_i | \Lambda, M_\Lambda)\, \mathrm{d}\theta_i \approx \prod_{i=1}^{N_{\rm psr}} \frac{Z(\delta t_i | M)}{n_i} \sum_{k=1}^{n_i} \frac{\pi(\theta_i^k | \Lambda, M_\Lambda)}{\pi(\theta_i^k | M)}, \qquad (15)$$</p><p>where $N_{\rm psr}$ is the number of pulsars, $n_i$ is the number of posterior samples obtained from the noise analysis of single pulsar $i$, and $Z(\delta t_i | M)$ is the evidence from the noise analysis of single pulsar $i$ (the evidences may be treated as an unknown normalization for the purpose of parameter estimation). 
In equation (<ref type="formula">15</ref>), each pulsar's integral is estimated by iterating over the $n_i$ posterior samples (i.e. $\theta_i^k$ is the $k$th posterior sample for the $i$th pulsar). As suggested by <ref type="bibr">Goncharov &amp; Sardana (2025)</ref>, we attempt to separate intrinsic red noise contributions from the GWB by drawing red and DM noise samples from the single pulsar posteriors obtained from our factorized likelihood analyses, each of which contained an additional red noise term in each pulsar at fixed $\gamma = 13/3$, providing a channel to separate the GWB from intrinsic noise. After defining a suitable hyperprior $\pi(\Lambda | M_\Lambda)$, we can obtain samples over the posterior $P(\Lambda | \delta t, M_\Lambda)$ from stochastic sampling, and numerically marginalize over $\Lambda$ to obtain the ensemble noise distribution (Goncharov &amp; Sardana 2025) $$\begin{aligned} p(\theta | M_\Lambda) &amp;= \int \pi(\theta | \Lambda, M_\Lambda)\, P(\Lambda | \delta t, M_\Lambda)\, \mathrm{d}\Lambda \\ &amp;\approx \frac{1}{n_p} \sum_{k=1}^{n_p} \pi(\theta | \Lambda_k, M_\Lambda), \end{aligned} \qquad (16)$$</p><p>where the second line suggests we concatenate samples from the prior $\pi(\theta | \Lambda_k, M_\Lambda)$ over $n_p$ samples drawn from the posterior $P(\Lambda | \delta t, M_\Lambda)$. Equation (<ref type="formula">16</ref>) shows how $p(\theta | M_\Lambda)$ implicitly depends on the data $\delta t$, and as such it should be interpreted as a posterior predictive distribution for pulsar noise parameters rather than a true prior on $\theta$.</p><p>We are left with some freedom to define the hyperparameters $\Lambda$, the functional form of $\pi(\theta | \Lambda, M_\Lambda)$, and the hyperprior $\pi(\Lambda | M_\Lambda)$. We choose to keep the uniform priors on $\theta$ and infer the minimum and maximum of the prior range as our hyperparameters $\Lambda$, such that $$\pi(\theta_j | \Lambda, M_\Lambda) = \mathcal{U}(\Lambda_{\min,j}, \Lambda_{\max,j})$$</p><p>for each of our four noise parameters denoted by $j$. This choice is based on <ref type="bibr">Goncharov &amp; Sardana (2025)</ref>, who found using EPTA DR2 that this uniform distribution model was preferred with a higher Bayes factor than alternative models using a normal distribution or a mixture of normal and uniform distributions. 
We enforce the constraint $\Lambda_{\min,j} &lt; \Lambda_{\max,j}$ by defining our hyperpriors to draw from the distributions</p><p>where $u_j$ and $l_j$ are the original upper and lower bounds on the red/DM noise parameters (Section 3.2). We use NUMPYRO <ref type="bibr">(Bingham et al. 2019;</ref> <ref type="bibr">Phan, Pradhan &amp; Jankowiak 2019)</ref> to define our hierarchical Bayesian model and JAXNS <ref type="bibr">(Albert 2020</ref>) to infer the posterior distribution $P(\Lambda | \delta t, M_\Lambda)$ using nested sampling. Fig. <ref type="figure">13</ref> shows the resulting ensemble distributions of the parameters of our two noise processes, $p(\log_{10} A_{\rm RN}, \gamma_{\rm RN} | M_\Lambda)$ and $p(\log_{10} A_{\rm DM}, \gamma_{\rm DM} | M_\Lambda)$, after numerical marginalization over the hyperparameters $\Lambda$. The left-hand side shows that DR2 Lite does not produce strong constraints on the intrinsic red noise properties of the ensemble, whereas EDR2 constrains the ensemble noise properties as well as Full DR2 does. This is consistent in particular with the single pulsar results in Table 4, which shows only 2 pulsars out of a subset of 24 have strongly detected red noise using DR2 Lite (the list increases to 5 out of 53 once including PSRs J0437-4715, J0621+1002, and J1824-2452A), while Full DR2 is capable of constraining red noise in the majority of pulsars. Meanwhile, the right-hand side of Fig. <ref type="figure">13</ref> shows that DR2 Lite can be used to place similar levels of constraints on the ensemble properties of DM variations as Full DR2, despite detecting DM noise in fewer pulsars than Full DR2 (but still detecting more DM noise processes than red noise processes; Table <ref type="table">4</ref>). The distributions are centred near $\gamma_{\rm DM} \sim 2$, but with uncertainties consistent with the value $\gamma_{\rm DM} = 8/3$ expected for DM variations from Kolmogorov turbulence <ref type="bibr">(Keith et al. 2013)</ref>. 
EDR2, meanwhile, displays a tighter distribution of DM noise properties, centred near lower amplitude and spectral index. This is most likely because EDR2 contains only 22 pulsars, and neither PSR J1721-2457 nor PSR J1903+0327, two high-DM pulsars with the highest DM noise amplitudes in IPTA DR2, is included among the 22.</p><p>correlation statistics and spectral characterization results as Full DR2, demonstrating that using the FoM or a similar statistic to inform the order in which to combine pulsars may indeed yield a valuable combined data set at an intermediate stage of data combination, before all pulsars have been combined.</p><p>Alongside the above results, we found the amplitude distribution of the CRN is shifted to higher values using DR2 Lite than using EDR2 or Full DR2, suggesting unmitigated intrinsic noise is present in some of the DR2 Lite pulsars and leaking into the common channel. This conclusion is reaffirmed by various 'dropout' analyses that show the CRN amplitude using DR2 Lite is strongly dependent on individual pulsars. The multiple component OS analysis also finds a larger amplitude of monopolar correlations in DR2 Lite. This monopole could be partially responsible for, or connected to, the larger CRN amplitude, or it may just be a statistical fluctuation. By using combined data, the amplitude distributions of both the CRN and the monopole shift to lower values, which shows that data combination is capable of mitigating these systematic errors with no required knowledge of their source.</p><p>While these results are encouraging, we have not fully stress-tested the Lite method or forecast its potential for future data sets; this would require analyses of numerous detailed simulations of the data combination process, which is beyond the scope of this work. Another caveat is that we do not omit or otherwise account for legacy or single-frequency data in IPTA DR2 in any version of the analysis. 
As shown by EPTA Collaboration (2023c) and <ref type="bibr">Ferranti et al. (2025)</ref>, including legacy data in the analysis helps to constrain the spectrum but is less useful for measuring the Hellings-Downs curve.</p><p>Finally, the effects of data combination are also noticeable on the single pulsar level. After examining 24 pulsars in DR2 Lite which have multi-PTA data in Full DR2, we detect at least one new intrinsic noise process using Full DR2 that we could not detect using DR2 Lite in 16 out of 24 pulsars. These improvements in noise characterization appear to correlate with improvements in the effective radio frequency bandwidth and data cadence that result from data combination. Meanwhile, DR2 Lite resulted in similar constraints on pulsar noise for only 6 out of 24 pulsars. In most of these cases, it appears the data combination did not improve band coverage at low frequencies enough to improve constraints on DM variations, and/or the Lite data set already contained the majority of the TOAs in Full DR2, which likely dominated the statistics. These conditions are unlikely to hold true for many pulsars in the next IPTA data set, as extensive data at low radio frequencies from LOFAR <ref type="bibr">(Stappers et al. 2011)</ref>, NenuFAR <ref type="bibr">(Zarka et al. 2012</ref>), the GMRT <ref type="bibr">(Joshi et al. 2018)</ref>, and CHIME (CHIME/Pulsar Collaboration 2021) are now becoming available for combination. We should therefore expect that the next data combinations will further improve pulsar noise characterization, and by extension, sensitivity to GWs. The reduced number of pulsars with significant detections of noise in DR2 Lite also translates to less informative ensemble distributions of pulsar red noise properties.</p></div>
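<div xmlns="http://www.tei-c.org/ns/1.0"><p>The hierarchical step of Section 4.4, in which single-pulsar posterior samples are reweighted under a population model and the resulting hyperposterior is turned into an ensemble distribution, can be sketched in a few lines. This is not the NUMPYRO and JAXNS implementation used for the analysis. It is a simplified pure-Python illustration for a single noise parameter, assuming a uniform original prior on [orig_lo, orig_hi] and a uniform population model on [lam_min, lam_max]. All function and argument names are hypothetical, and the single-pulsar evidences are dropped as constants.</p>
<p><![CDATA[
```python
import math
import random


def log_hyper_likelihood(samples_per_pulsar, lam_min, lam_max, orig_lo, orig_hi):
    """Log of the importance-sampled hierarchical likelihood, one parameter.

    Each pulsar contributes the mean over its posterior samples of
    pi(theta | Lambda) / pi(theta | M), with both priors uniform.
    """
    if not (orig_lo <= lam_min < lam_max <= orig_hi):
        return -math.inf  # hyperparameters violate the ordering or the range
    orig_density = 1.0 / (orig_hi - orig_lo)
    new_density = 1.0 / (lam_max - lam_min)
    total = 0.0
    for samples in samples_per_pulsar:
        inside = sum(1 for t in samples if lam_min <= t <= lam_max)
        mean_weight = (inside / len(samples)) * new_density / orig_density
        if mean_weight == 0.0:
            return -math.inf  # population model excludes this pulsar entirely
        total += math.log(mean_weight)
    return total


def ensemble_draws(hyper_samples, rng, draws_per_sample=1):
    """Concatenate draws from pi(theta | Lambda_k) over hyperposterior
    samples Lambda_k = (lam_min, lam_max), giving posterior predictive draws."""
    return [rng.uniform(lo, hi)
            for lo, hi in hyper_samples
            for _ in range(draws_per_sample)]
```
]]></p>
<p>A sampler would evaluate log_hyper_likelihood together with a hyperprior to obtain samples of the population bounds, and ensemble_draws would then build the ensemble distribution from those samples. Narrow bounds are rewarded only while they still contain posterior samples from every pulsar, which is how the population model learns the range of noise parameters supported by the array.</p></div>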
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3">Future directions</head><p>Looking to the future, the Lite method has a clear role as an intermediate step in the analysis of PTA data sets. This was demonstrated recently in <ref type="bibr">Agazie et al. (2024)</ref>, where several pseudo-IPTA data sets were composed from the most recent EPTA, NANOGrav, and PPTA data sets using a factorized likelihood approach. While these data sets were not created using a FoM, they are similar in spirit to the Lite method we present here. <ref type="bibr">Agazie et al. (2024)</ref> found that adding pulsars to each PTA's data set using this method consistently results in a higher GWB S/N and more precise constraints on $A_{\rm CRN}$ than what one obtains using each PTA's individual data release. This further affirms the capability of the Lite method to improve measurements of GW signals. Extrapolating the results of our study comparing IPTA DR2 with its Lite version, we expect the upcoming IPTA data set, IPTA DR3, will further improve GWB spectral characterization and GW detection prospects <ref type="bibr">(Good &amp; International Pulsar Timing Array Team 2023)</ref>. Furthermore, our analysis of EDR2 reinforces that we can capture a large amount of the information in a fully-combined data set using an intermediate version with fewer pulsars, as expected based on <ref type="bibr">Speri et al. (2023)</ref>. This suggests the creation and analysis of an IPTA 'EDR3' as a way to start reaping the rewards of a fully-combined IPTA DR3 at an earlier time. 
Though we do not explore this here, the creation of even more optimal joint-PTA data sets should also be possible with a hybrid approach, using combined data for the most sensitive pulsars and uncombined data for the remaining pulsars.</p><p>The Lite analysis may have further unquantified benefits when curating data sets for other nHz GW searches, such as searches for continuous GWs from individual SMBHBs, as it may be used to rapidly improve the sky coverage of the PTA. Of the regional PTAs, three are in the Northern hemisphere (CPTA, EPTA, and NANOGrav), one is quasi-equatorial (InPTA), and two are in the Southern hemisphere (MPTA and PPTA). The maximum sky coverage achieved by any PTA is approximately 75 per cent due to telescope elevation limits, while for transit telescopes such as CHIME, the sky coverage is considerably less. This sky coverage is perhaps only a secondary concern for the initial studies of the isotropic stochastic GWB, but it is a major concern for efforts to detect and study individual GW sources, as well as GWB anisotropy. These gains in sky coverage may be achieved immediately with the creation of Lite data sets.</p><p>In conclusion, the Lite method does not replace the need for full data combination, but serves as a powerful exploratory tool for evaluating new data sets. By providing early indicators of GWB sensitivity, the Lite method may help motivate the creation of fully-combined data sets, while the FoM may guide the order in which to combine the pulsars' data. As future IPTA data releases incorporate larger and more complex data sets, the Lite method will become increasingly useful for nHz GW searches as one balances the trade-off between computational efficiency and sensitivity. The Lite method may also complement other future avenues of analysing joint-PTA data sets, such as the Fourier-space combination of posteriors from single-PTA data sets <ref type="bibr">(Valtolina &amp; van Haasteren 2024;</ref> <ref type="bibr">Laal et al. 
2025 )</ref>, or the fully extended PTA analysis from <ref type="bibr">Agazie et al. ( 2024 )</ref> using the OS.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ACKNOWLEDGEMENTS</head><p>This paper is the result of the work of many people and uses data from over three decades of pulsar timing observations. We thank Boris Goncharov for reviewing the manuscript and providing useful suggestions, such as the hierarchical Bayesian analysis, which improved the quality of this work. BL additionally thanks members of the IPTA Gravitational Wave Analysis working group, co-chaired by Nihan Pol, Aur&#233;lien Chalumeau, and Paul Baker, for constructive comments and discussions, and Rutger van Haasteren for providing useful insights on hierarchical modelling. We acknowledge support received from National Science Foundation </p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>MNRAS 542, 3028-3048 (2025)</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="14" xml:id="foot_2"><p>Observatoire Radioastronomique de Nan&#231;ay, Observatoire de Paris, Universit&#233; PSL, Univ Orl&#233;ans, CNRS, F-18330 Nan&#231;ay, France</p></note>
		</body>
		</text>
</TEI>
