<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>New method for reducing parton distribution function uncertainties in the high-mass Drell-Yan spectrum</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>03/01/2019</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10093590</idno>
					<idno type="doi">10.1103/PhysRevD.99.054004</idno>
					<title level='j'>Physical Review D</title>
<idno>2470-0010</idno>
<biblScope unit="volume">99</biblScope>
<biblScope unit="issue">5</biblScope>					

					<author>C. G. Willis</author><author>R. Brock</author><author>D. Hayden</author><author>T.-J. Hou</author><author>J. Isaacson</author><author>C. Schmidt</author><author>C.-P. Yuan</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Uncertainties in the parametrization of Parton Distribution Functions (PDFs) are becoming a serious limiting systematic uncertainty in Large Hadron Collider (LHC) searches for Beyond the Standard Model physics. This is especially true for measurements at high scales induced by quark and anti-quark collisions, where Drell-Yan continuum backgrounds are dominant. Tools have recently become available which enable exploration of PDF fitting strategies and emulate the effects of new data in a future global fit. ePump is such a tool, and it is shown that judicious selection of measurable kinematical quantities can reduce the assigned systematic PDF uncertainties by significant factors. This will be made possible by the huge statistical precision of future LHC Standard Model datasets.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Beyond the Standard Model (BSM) physics at the Large Hadron Collider (LHC) would be found as deviations from Standard Model (SM) expectations, possibly in rates, but more typically in the kinematic distributions of final state objects or their combinations of jets, leptons, and missing energy. Therefore the importance of accurately and precisely modeling SM physics cannot be overstated. While the electroweak properties of the SM are very precisely known, precision knowledge of Parton Distribution Functions (PDFs) is becoming a limiting factor for many BSM searches. This limitation comes from the theoretical uncertainties becoming so large at high mass that a clear deviation from the SM becomes hard to distinguish, and even upon discovery of new physics the characterization of this signal among different theoretical models would be blurred.</p><p>As PDFs are not analytically calculable in the framework of perturbative Quantum Chromodynamics (QCD), their shapes must be modeled by globally fitting measured distributions from many combinations of varied experimental data. Most of these data come from legacy experiments, such as Deep Inelastic Scattering (DIS) experiments, various fixed target hadron experiments, and the Fermilab Tevatron. LHC experimental results are beginning to be used in global PDF fits, and in the coming decades new knowledge of PDFs will come from measurements at ATLAS <ref type="bibr">[1]</ref>, CMS <ref type="bibr">[2]</ref>, and LHCb <ref type="bibr">[3]</ref>. We suggest that new strategies are worth exploring and we present one here.</p><p>Constraining PDFs and their uncertainties is now an intense research program. 
The systematic uncertainty in the PDF models arises from 1) the experimental uncertainties of the input data used in a global fit, 2) any theoretical assumptions made by the fitting groups, and/or 3) the chosen parameterizations characterizing the functional forms of the PDFs themselves. All of the global PDF fitting groups (CTEQ-TEA <ref type="bibr">[4]</ref>, MMHT <ref type="bibr">[5]</ref>, and NNPDF <ref type="bibr">[6]</ref>) characterize their fits with Hessian error matrices or Monte Carlo replicas so that experiments can legitimately include PDF uncertainties as a component of the theoretical error for any measurement or limit.</p><p>In this paper we explore the PDF uncertainties as they apply to the BSM search for a resonant Z&#8242; gauge boson in the dilepton invariant mass spectrum. The dominant and irreducible background process to this search is the Drell-Yan (DY) process. Both ATLAS <ref type="bibr">[7]</ref> and CMS <ref type="bibr">[8]</ref> have recently completed their searches for new high-mass phenomena from the first &#8730;s = 13 TeV data-taking runs at the LHC. Both set comparable lower bounds on the mass of a hypothetical new vector boson and both publish extensive lists of their systematic uncertainties, including uncertainties attributed to our limited knowledge of PDF fitting.</p><p>To date, only 5% of the planned LHC data are in hand and yet these PDF uncertainties might already have limited future mass reaches for such searches. Not only are resonant Z&#8242; boson searches "at risk" but also W&#8242; boson searches and especially non-resonant searches (such as contact interactions), which are very sensitive to sloped shape changes in the background. Furthermore, as we enter the new high integrated luminosity era of the LHC, experimental uncertainties will naturally be continually reduced, meaning that searches with even more complicated final states will eventually start to become limited predominantly by theoretical uncertainties. 
Therefore, it is critical that we improve our understanding of PDFs and their associated uncertainties.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Our Strategy</head><p>Experiments utilize PDF fits which are global and agnostic respecting a basic principle of the parton model: PDF sets and uncertainties originate from all data and are applicable to all scattering. But knowledge of the PDFs is not uniform nor are all reactions similarly dependent on them. For example, DY production is less sensitive to knowledge of the gluon PDF than many BSM searches. Instead, precision predictions of DY processes depend significantly on knowledge of both the valence and sea quark densities which largely come from deep inelastic scattering and DY experiments. And to that end, hadron collider DY experimental inputs have been a part of PDF global fitting for years. For example, the CT14NNLO <ref type="bibr">[4]</ref> fits utilized inputs from the W and Z boson charge asymmetry measurements from the Tevatron: <ref type="bibr">[9,</ref><ref type="bibr">10]</ref> and <ref type="bibr">[11]</ref> from CDF and <ref type="bibr">[12,</ref><ref type="bibr">13]</ref> results from D&#216;.</p><p>And for the first time, in CT14NNLO the CTEQ-TEA group included LHC data from W/Z cross sections and the charged lepton asymmetry measurement from ATLAS <ref type="bibr">[14]</ref>, the charged lepton asymmetry in the electron <ref type="bibr">[15]</ref> and muon decay channels <ref type="bibr">[16]</ref> from CMS, and the W/Z lepton rapidity distributions and charged lepton asymmetry from LHCb <ref type="bibr">[17]</ref>. But we will show that modern PDF global fits are not as potent for quark densities as are necessary for future precision measurements.</p><p>The only remedy to this problem is the addition of qualitatively new experimental inputs to global fitting, but the LHC is currently the only PDF "game in town." We propose a way to judiciously use LHC DY data itself as inputs to global fitting. 
The strategy would be to add Z boson peak and DY continuum data to global fitting from a well-measured, low-to-moderate invariant mass control region (M &lt; 1 TeV). The resulting "boutique" PDF sets could be used in an unbiased way to constrain the theoretical uncertainties in a kinematic search region relevant to modern BSM particle hunts, which is now in the M &gt; 5 TeV region.</p><p>We further show that the DY kinematics can be exploited to enhance the impact of LHC DY data, namely by emphasizing well-understood up-quark densities and de-emphasizing the always limited sea-quark densities. This would require inputs which are differential in nature and not just asymmetry results near the Z boson peak.</p><p>The machinery of PDF global fitting groups is very complex and for physicists outside of the PDF groups, testing new PDF analysis strategies can be cumbersome. This will change with the recent development of tools like ePump <ref type="bibr">[18]</ref> (the Error PDF Updating Method Package, see Appendix A and <ref type="bibr">[18]</ref> for details), which makes it possible to explore the effects of new kinematic inputs to a global fit without requiring a full global analysis. ePump is not a substitute for full global fitting, but can be used as a tool to probe the effects of new data.</p><p>In essence one can consider ePump to be a simulation of global fitting in an approximation described in Appendix A. Pseudo-data can be added to an existing global fit in order to explore how that data might affect the central value and, importantly, the uncertainties in the resulting candidate PDFs. All of the sum rules, QCD evolution, and uncertainties inherent in the "parent" global fit to which test data are added are preserved. 
While other PDF profiling tools exist such as xFitter <ref type="bibr">[19]</ref>, in this paper we choose to use ePump which has been thoroughly tested <ref type="bibr">[18]</ref> against the CT14NNLO <ref type="bibr">[4]</ref> global fits.</p><p>The work in this paper is the first published use of ePump. We demonstrate that new insight into kinematics of the DY process has emerged, and that considerable reduction in the quark and anti-quark PDF uncertainties is possible with new data inputs to PDF global fitting.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Our Goals</head><p>Our goals in this paper are limited. We simply ask the optimistic questions: can qualitatively new data, when combined with the current inputs of CT14HERA2, reduce future PDF uncertainties and, if so, by how much? And would any reduction improve the overall precision of high-mass DY backgrounds relevant to future Z&#8242; searches? We exploit the unprecedented statistical power of future LHC running and use DY kinematically motivated differential distributions to suggest that sensitivities to partons of special interest in DY production can be enhanced.</p><p>Our ansatz is to treat BSM DY searches as consisting of a control region, from which we envision mining DY data for global fitting, and a signal region to which those new global fits are extrapolated. Of course, as in any control-signal region analysis, the assumption is that the control region contains only SM physics. We specifically explore the possibility that LHC DY data in a safe control region might be useful to further constrain PDFs appropriate to high-mass BSM searches for which the continuum DY is the dominant background. Having determined that this is worth consideration, our ultimate proposal is that the LHC experiments and the PDF fitting teams work together to explore inclusion of LHC DY data into global fitting when prepared in a particularly useful way.</p><p>We chose to do our work using the most recent CTEQ PDF global fit, namely CT14HERA2. This includes the most recent HERA1 and HERA2 data and utilizes an updated parametrization from the previous CT14NNLO sets. Since the limits in the recent ATLAS publication <ref type="bibr">[7]</ref> were set using the CT14NNLO sets, we do make a brief comparison to show that the basic PDFs are very similar.</p><p>Our goals are limited to asking and answering our two questions above. 
To that basic end, what we do not do here are the following:</p><p>&#8226; An important part of the theoretical uncertainties includes exploration of the parameterization assumed and potentially additional parameterization choices. While exploring functional choices would be an interesting exercise when attempting to extrapolate into a new kinematical regime, we do not do that here.</p><p>&#8226; We do not attempt to optimize theoretical uncertainties associated with any other theoretical considerations like the strong coupling constant, electroweak couplings, or higher order electroweak and QCD effects.</p><p>&#8226; We also make no effort to optimize or explore the full set of possible experimental uncertainties.</p><p>The paper is structured as follows. First, in Sec. II the current experimental results are briefly reviewed with an emphasis on the systematic uncertainties. Next, we review</p><p>forum recommendations <ref type="bibr">[29]</ref>. "PDF variation" is the result from the full error matrix for the nominal PDF set.</p><p>predictions as excursions from the nominal choice and its full error matrix.</p><p>The two experiments report different assignments for PDF uncertainties. For example, ATLAS assigns large uncertainties derived from a detailed treatment. For the di-electron channel the reported overall uncertainty is 26.3%, which comes from the combined PDF (variation plus choice) uncertainties of 20.8%, other non-PDF theory uncertainties of 10%, and total experimental uncertainties of 12.8%. Di-muon uncertainties are not as large, but for both measurements the PDF uncertainties compete unfavorably with the experimental uncertainties. 
CMS reports smaller PDF uncertainties and comparable experimental uncertainties.</p><p>Experimental systematic uncertainties will likely be reduced with more data, but the PDF uncertainties at this point are largely irreducible in the absence of new data of a qualitatively different sort (new DIS experiments?) or new ideas. We propose new ideas to address this using LHC data itself.</p><p>FIG. 2 (caption fragment): <ref type="bibr">[30]</ref> of the DY process initiated by a quark-antiquark pair as observed at the LHC.</p></div>
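<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ATLAS di-electron numbers quoted above can be cross-checked with a short sketch. It assumes that the three components are independent and were combined in quadrature, which the text does not state explicitly; the function and variable names are illustrative only. Under that assumption the result agrees with the quoted 26.3% to within the rounding of the inputs.</p>

```python
import math


def combine_in_quadrature(components):
    """Combine independent relative uncertainties (in percent) in quadrature."""
    return math.sqrt(sum(c ** 2 for c in components))


# Components quoted in the text for the ATLAS di-electron channel (percent).
# Treating them as independent is an assumption of this sketch.
atlas_ee = {"pdf": 20.8, "other_theory": 10.0, "experimental": 12.8}

total = combine_in_quadrature(atlas_ee.values())
pdf_share = atlas_ee["pdf"] ** 2 / total ** 2  # fraction of the total variance

print(f"combined uncertainty: {total:.1f}%")
print(f"PDF share of the variance: {pdf_share:.0%}")
```

<p>Because the combination is quadratic, the largest component dominates: in this reading the PDF term alone accounts for more than half of the total variance, which is why reducing it has the most leverage.</p></div>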
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. THE DRELL-YAN PROCESS</head><p>The general Drell-Yan process <ref type="bibr">[31]</ref> of pp &#8594; &#8467;&#8314;&#8467;&#8315; + X at leading order originates from an s-channel exchange of an electroweak boson</p><p>Here, X denotes any additional final-state particles (radiated partons, the underlying event, multi-parton interactions, etc.). At next-to-leading order, the real corrections introduce three t-channel processes, listed in order of decreasing cross section at LHC energies,</p><p>The leading order process is depicted in Fig. <ref type="figure">2</ref>.</p><p>In each case, the vector boson decays into a pair of same-flavor, oppositely-charged leptons. For simplicity, our discussion will center on the leading order process, but all of our results are based on Next-to-Leading Order (NLO) plus Next-to-Leading Log (NLL) calculations using the NLO-NLL ResBos <ref type="bibr">[32]</ref><ref type="bibr">[33]</ref><ref type="bibr">[34]</ref> package.</p><p>The DY triple-differential cross section can be represented as a function of the dilepton invariant mass m, the dilepton rapidity y, and the cosine of the lepton polar angle in the Collins-Soper rest frame, cos &#952;*. This was measured by ATLAS <ref type="bibr">[35]</ref> using data from the &#8730;s = 8 TeV LHC running for 46 &lt; m &lt; 150 GeV.</p><p>For the LO s-channel process, the DY triple-differential cross section can be written as</p><p>Here &#8730;s is the centre of mass energy of the LHC, and P&#8321; and P&#8322; are the 4-momenta of protons 1 and 2. In the standard fashion, x&#8321; and x&#8322; are the incoming parton momentum fractions such that p&#8321; = x&#8321;P&#8321; and p&#8322; = x&#8322;P&#8322;. We take our notation from <ref type="bibr">[35]</ref>.</p><p>The functions <formula notation="TeX">f_{q/P_1}(x_1, Q^2)</formula> and <formula notation="TeX">f_{\bar{q}/P_2}(x_2, Q^2)</formula> are the PDFs for quark flavors q and q&#772;, respectively. 
The term (q &#8596; q&#772;) accounts for the fact that either proton can carry a sea quark, as the LHC is a proton-proton collider.</p><p>Finally, the quantity <formula notation="TeX">P_q</formula> accounts for the parton-level dynamics in terms of important electroweak parameters, and exhibits dependencies on both dilepton mass and cos &#952;*. Each factor in this formula matters in a high-mass extrapolation, and each is discussed in detail in Appendix B 1.</p><p>The energy scale of the collision is set by the transferred four-momentum squared Q&#178;, which can be identified with the square of the dilepton invariant mass m&#178;. Well-known kinematic definitions include</p><p>and,</p><p>which parametrizes the dilepton rapidity in terms of the x fractions of the initial-state partons at LO. From these, the variables are related, also at LO, by</p><p>Eq. ( <ref type="formula">8</ref>) provides the first hint to the source of the large PDF uncertainty in high-mass DY production. The &#8730;s = 13 TeV LHC is now probing extremely large values of m, beyond a few TeV. As such, a central dilepton event with an invariant mass of m = 3 TeV and rapidity of y = 0 requires x fractions beyond x &#8776; 0.2. This is beginning to probe regions of sea and even valence quark momentum fractions which are not well constrained by mostly DIS inputs. Figure <ref type="figure">3</ref> shows quark, anti-quark and gluon momentum fractions from the CT14HERA2 PDF set evaluated at two scales Q.</p><p>FIG. 3. The CT14HERA2 PDFs of the CTEQ collaboration. Depicted are gluon, quark, and antiquark PDFs as a function of x, evaluated at a scale of Q = 2 GeV (a) and Q = 100 GeV (b) <ref type="bibr">[4]</ref>.</p></div>
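<div xmlns="http://www.tei-c.org/ns/1.0"><p>The numerical example in the text (m = 3 TeV, y = 0 at 13 TeV) can be reproduced with a minimal sketch. It assumes the standard leading-order relations, in which the product of the two momentum fractions times s equals the squared dilepton mass and each fraction equals (m/&#8730;s) times exp(+y) or exp(-y); the displayed equations did not survive extraction here, so these forms are stated as an assumption, and the function name is hypothetical.</p>

```python
import math


def parton_x_fractions(mass_tev, rapidity, sqrt_s_tev=13.0):
    """Leading-order parton momentum fractions (x1, x2).

    Assumes x1,2 = (m / sqrt(s)) * exp(+/- y), the standard LO relation.
    """
    ratio = mass_tev / sqrt_s_tev
    return ratio * math.exp(rapidity), ratio * math.exp(-rapidity)


# Central event from the text: m = 3 TeV, y = 0.
x1, x2 = parton_x_fractions(3.0, 0.0)
print(f"central event: x1 = x2 = {x1:.2f}")

# Moving away from y = 0 pushes one parton to still higher x.
for y in (0.5, 1.0):
    xa, xb = parton_x_fractions(3.0, y)
    print(f"y = {y}: x1 = {xa:.2f}, x2 = {xb:.2f}")
```

<p>For the central event both fractions come out just above 0.2, consistent with the statement in the text; at nonzero rapidity one of the two partons is pushed further into the poorly constrained high-x region.</p></div>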
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Behavior of PDFs at high parton x</head><p>The reason for this inherent high-x uncertainty in the quark and anti-quark PDFs is due to the need to extrapolate experimental data, especially for quark and anti-quark fitting, as seen in Fig. 4. The only data which directly probe quark and anti-quark PDFs for x 0.2 come from legacy deep-inelastic scattering experiments and HERA measurements. PDFs relevant for current and future LHC DY production scales of interest require an extrapolation of almost three orders of magnitude in mass and this proves difficult to do precisely with the current world data.</p><p>role in the initial-state quark-antiquark annihilation that results in the DY process. This significant lack of precision is the source of the large systematic uncertainties required in</p><p>CT14HERA2 PDF. The ratio band is the quoted CT14HERA2 <ref type="bibr">[37]</ref> PDF uncertainties of about 18% at m = 4 TeV consistent with that quoted in the ATLAS result of 19% at m ee = 4 TeV.</p><p>As DY data inputs are the only way to constrain high-x PDFs, a strategy is explored here that turns this lack of sensitivity into an opportunity. The DY continuum is well-measured and reliably SM physics. If PDF global fits were to include LHC DY data well below any search region, but high enough in invariant mass to better constrain the fits, this uncertainty could be reduced. 
Moreover, the amount of LHC data that will become available in the coming years will be staggering, so we decided to explore DY kinematics further in hopes of discovering sensitivities that would help to enhance the potential of high-x PDF fits.</p><p>We will show that there are DY observables, such as cos &#952;*, that could in principle be incorporated in PDF global fitting, and the use of ePump tells us approximately how much reduction in PDF uncertainty is possible, as well as how much smaller the PDF systematic uncertainty might become in the DY differential mass spectrum. Due to the importance of cos &#952;*, and the role it plays in our fitting strategy, a brief review is given in the next section.</p><p>high-mass DY production, the PDF uncertainties begin to diverge. As noted in the introduction, the differences between these two recent fits are minimal, justifying our choice of CT14HERA2 in this analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. The Collins-Soper Polar Angle</head><p>We have found particular power in cos &#952;* in Eq. ( <ref type="formula">5</ref>). This angle is defined in the Collins-Soper (CS) <ref type="bibr">[20]</ref> rest frame of the lepton pair, with the polar and azimuthal angles defined relative to the two proton directions. The z axis is defined in the Z boson rest frame so that it bisects the angle formed by the momentum of one of the incoming protons and the negative of the momentum of the other incoming proton. The y axis is constructed to be normal to the plane of the two proton momenta, and the x axis is chosen in order to create a right-handed Cartesian coordinate system.</p><p>FIG. 6. The dilepton invariant mass spectrum, generated with the ResBos MC generator and the CT14HERA2 PDF set.</p><p>The cosine of the polar angle &#952;* defines the direction of the outgoing lepton (&#8467;&#8315;) relative to &#7825; in the CS frame and can be calculated directly from lab frame lepton quantities with</p><p>The sign of the z axis is defined on an event-by-event basis as the sign of the lepton pair momentum with respect to the z axis in the laboratory frame. Here, <formula notation="TeX">P_T</formula> and <formula notation="TeX">P_z</formula> are the transverse and longitudinal momentum of the dilepton system, respectively, and,</p><p>where the lepton (anti-lepton) energy and longitudinal momentum are <formula notation="TeX">E_1</formula> and <formula notation="TeX">p_{z,1}</formula> (<formula notation="TeX">E_2</formula> and <formula notation="TeX">p_{z,2}</formula>), respectively. This definition requires the electric charge identification of each lepton.</p><p>Our strategy was to explore the DY cross section with the goal of finding global PDF fitting inputs tailored specifically to DY physics. To that end we used the ResBos MC (and the MadGraph generator <ref type="bibr">[38]</ref> as a check), configured with the CT14HERA2 PDF set to study several kinematic distributions.</p><p>All simulation samples are produced in bins of true dilepton invariant mass in the range m = 40 GeV to m = 1 TeV at &#8730;s = 13 TeV. 
In order to roughly correspond to ATLAS <ref type="bibr">[7]</ref> and CMS <ref type="bibr">[8]</ref> acceptances, the lepton pseudo-rapidities were restricted. Central-central (CC) allows access to a wider range in x, cf. Eq. ( <ref type="formula">11</ref>).</p><p>We found particular practical significance in focusing on the polar angle. Figure <ref type="figure">7</ref> shows several cos &#952;* distributions of Eq. ( <ref type="formula">9</ref>) in discrete slices of dilepton invariant mass. Each mass slice is further decomposed into sub-processes that consist distinctly of up-type or down-type initial-state quarks. The up-type sub-processes include initial states of u&#363;, ug, and &#363;g, where u is the up quark or charm quark and g is the gluon. A similar definition applies to the d-type (down, strange, bottom) sub-processes, with u replaced by d. This is in accordance with the four DY reactions in Eqs. ( <ref type="formula">1</ref>) and ( <ref type="formula">4</ref>).</p><p>The distributions in Figs. 7(a), 7(b), and 7(c) are essentially the regions covered by an ATLAS measurement of the triple-differential cross section during the 8 TeV running <ref type="bibr">[35]</ref>. These are familiar as they show part of the source of the oft-measured Forward-Backward Asymmetry in both pp&#772; and pp on-resonance Z boson analyses <ref type="bibr">[39]</ref>.</p><p>Intriguingly, the relative up-type and down-type sub-processes are highly dependent on both mass and polar angle &#952;*. This is especially true above the Z boson mass peak, in which the forward region (cos &#952;* &gt; 0) shows an increasing degree of separation between the rates associated with the up-type and down-type DY sub-processes. Indeed, in this region the contribution to the total cross section is due almost entirely to the up-type sub-process, which exceeds the down-type by almost a factor of four. 
At high mass and high polar angle, the LHC DY process proceeds almost entirely through the u&#363; sub-process, effectively making the LHC a u&#363; collider.</p><p>Why is this the case? Appendix B explains this conclusion as a fortuitous conspiracy of electroweak couplings and parton luminosities which collectively favor up quarks and antiquarks over their down-like counterparts. Notice that we've not really learned anything new since DY kinematics is an old subject. But high-mass behavior in regions only statistically available at the LHC is revealing and the question is whether cos &#952; * behavior as a function of mass should be an important discrimination as an input to global PDF fitting. This is where ePump comes in.</p></div>
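<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Collins-Soper angle described in this section can be computed from lab-frame lepton four-momenta as in the sketch below. Since the displayed formula did not survive extraction, the sketch assumes the form commonly used in LHC analyses: the sign of the dilepton longitudinal momentum times 2(p1+ p2- - p1- p2+) divided by m times the square root of (m&#178; + PT&#178;), with p&#177; = (E &#177; pz)/&#8730;2, where index 1 is the lepton and index 2 the anti-lepton. The function name and the (E, px, py, pz) tuple layout are illustrative choices.</p>

```python
import math


def cos_theta_cs(lepton, antilepton):
    """Cosine of the Collins-Soper polar angle from lab-frame four-momenta.

    Each argument is a tuple (E, px, py, pz). The negatively charged lepton
    must be passed first, so charge identification is required.
    """
    e1, px1, py1, pz1 = lepton
    e2, px2, py2, pz2 = antilepton

    # Dilepton system: longitudinal and transverse momentum, invariant mass.
    p_z = pz1 + pz2
    p_t_sq = (px1 + px2) ** 2 + (py1 + py2) ** 2
    mass_sq = (e1 + e2) ** 2 - p_t_sq - p_z ** 2
    mass = math.sqrt(max(mass_sq, 0.0))

    # Light-cone components p+- = (E +- pz) / sqrt(2) of each lepton.
    p1_plus, p1_minus = (e1 + pz1) / math.sqrt(2), (e1 - pz1) / math.sqrt(2)
    p2_plus, p2_minus = (e2 + pz2) / math.sqrt(2), (e2 - pz2) / math.sqrt(2)

    # Event-by-event sign of the z axis (taken as +1 when P_z is exactly 0).
    sign = math.copysign(1.0, p_z)
    numerator = 2.0 * (p1_plus * p2_minus - p1_minus * p2_plus)
    return sign * numerator / (mass * math.sqrt(mass_sq + p_t_sq))


# Back-to-back massless leptons, dilepton at rest: cos(theta*) = pz / E.
at_rest = cos_theta_cs((50.0, 40.0, 0.0, 30.0), (50.0, -40.0, 0.0, -30.0))
# The same event boosted along +z with beta = 0.6.
boosted = cos_theta_cs((85.0, 40.0, 0.0, 75.0), (40.0, -40.0, 0.0, 0.0))
print(at_rest, boosted)
```

<p>The two example events differ only by a longitudinal boost, so they should return the same value; this is a quick way to check any implementation, because the light-cone products are invariant under such boosts.</p></div>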
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. A PROPOSED STRATEGY TO PDF ERROR REDUCTION FOR DY</head><p>We attempt to shed light on two questions:</p><p>1. If cos &#952;* data were incorporated in global fitting, how significant might the reduction in PDF uncertainties be?</p><p>2. Would those decreased errors be a significant reduction in the overall theoretical uncertainties in future BSM, high-mass DY searches?</p><p>In order to answer Question 1, ePump was used, which can update an existing PDF set with new experimental data (or pseudo-data) in order to produce an improved best-fit and Hessian error PDFs. The ePump workflow can be seen in Fig. <ref type="figure">8</ref>.</p><p>For this analysis, "pseudo-data" are used to mimic a possible future LHC dataset for PDF fitting. As any dataset has finite statistics, the resulting uncertainties in the new PDFs will reflect whatever statistical precision is modeled in the pseudo-data. The effects of new PDFs and uncertainties can then be used to re-evaluate the PDF systematic uncertainty on the high-mass dilepton event yield.</p><p>Furthermore, we imagine a Signal Region (SR) as m &gt; 1 TeV and a Control Region (CR) as 0.04 &lt; m &lt; 1 TeV. Since new physics should lie above the current limits of m &#8764; 3 TeV (as in Sec. II), it would be "fair" to use low-mass DY data to constrain the high-mass DY spectrum.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. PDF Update Strategy</head><p>ePump requires standard inputs to emulate the global fit (the templates in Fig. <ref type="figure">8</ref>). We describe our strategy here. The analysis was performed at "truth level," such that the acceptance and efficiency effects associated with the reconstruction and identification of prompt, high-<formula notation="TeX">p_T</formula> leptons in an LHC detector are neglected. However, leptons are well measured at the LHC, so this is an acceptable first look at this technique. 
Additional dilepton backgrounds were neglected, but are well understood by the LHC experiments as can be seen in Fig. <ref type="figure">1</ref>.</p><p>These backgrounds include tt&#772; production, Wt single-top production, WW, WZ, and ZZ diboson production, and W+jets &amp; multi-jet production in the electron channel.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. ePump Template Construction</head><p>Naively, one might imagine only using m in the CR to predict the improvement in the SR, but our awareness of the significant differential quark sensitivities to cos &#952;* (and moderate sensitivity to y) plus the knowledge that future LHC running will provide enormous continuum DY datasets led us to explore dividing pseudo-data into many bins of dilepton mass m, as well as y and cos &#952;*.</p><p>The fiducial region considered for our analysis is designed explicitly to probe the PDFs at high x, and is defined by</p><p>DY samples were generated using the ResBos MC generator with the CT14HERA2 PDF set for the &#8730;s = 13 TeV LHC. Events were further required to pass a loose event selection in order to construct the finalized Data templates. Dilepton events with an invariant mass of m &gt; 80 GeV were required to satisfy <formula notation="TeX">p_T</formula> &gt; 30 GeV, while low-mass events in the interval of 40 &lt; m &lt; 80 GeV must satisfy <formula notation="TeX">p_T</formula> &gt; 15 GeV. In addition, events must consist of leptons which are distributed as central-central or central-forward.</p><p>Events passing these selections were binned in ePump template histograms, which parametrize the triple-differential cross section of Eq. ( <ref type="formula">5</ref>), according to</p><p>where i, j, and k correspond to the bin indices of each distribution of interest. Note that in a realistic measurement, the numerator of Eq. ( <ref type="formula">12</ref>) would be replaced by <formula notation="TeX">N^{ijk}_{\mathrm{data}} - N^{ijk}_{\mathrm{bkg}}</formula>, where the background component arises from the standard dilepton background processes.</p><p>The total number of pseudo-data events is given by <formula notation="TeX">N^{ijk}_{\mathrm{pseudo\text{-}data}}</formula>, the integrated luminosity of the pseudo-dataset is <formula notation="TeX">L_{\mathrm{int}}</formula>, and <formula notation="TeX">(\Delta m)_i</formula>, <formula notation="TeX">(2\Delta|y|)_j</formula>, and <formula notation="TeX">(\Delta\cos\theta^*)_k</formula> are the corresponding bin widths. The factor of two in the denominator accounts for the modulus in the rapidity bin width. The bins used to parametrize Eq. 
( <ref type="formula">12</ref>) are</p><p>Events were generated as if they came from a future integrated luminosity, and so uncertainties in the ePump results are scattered according to the statistics of such a hypothetical LHC input dataset. For each bin the DY cross section estimate <formula notation="TeX">\sigma^{ijk}_{\mathrm{Drell\text{-}Yan}}</formula> was scaled by a characteristic integrated luminosity <formula notation="TeX">L_{\mathrm{int}}</formula> to arrive at a definite DY event yield <formula notation="TeX">N^{ijk}_{\mathrm{Drell\text{-}Yan}}</formula>. The resulting yield was assumed to be the mean of a Poisson distribution, which was then used to throw a random number according to Poisson statistics, thereby populating the bin with <formula notation="TeX">N^{ijk}_{\mathrm{pseudo\text{-}data}}</formula> pseudo-data events. Note that the pseudo-data were treated as those of one "experiment," but in practice ATLAS, CMS, and LHCb would all be sources of fitting input data. For illustration we chose two future LHC scenarios for integrated luminosities:</p><p><formula notation="TeX">L_{\mathrm{int}}</formula> = 300 fb&#8315;&#185;, approximating the data set for one experiment following Run-3 of the LHC, and <formula notation="TeX">L_{\mathrm{int}}</formula> = 3000 fb&#8315;&#185;, approximating that of the final dataset for one experiment of the High Luminosity (HL) LHC.</p><p>i.e., integrated over the y and cos &#952;* dimensions. The dashed curve labeled "rapidity" adds the cumulative effect of binned (&#8710;|y|) and (&#8710;m) to ePump. Finally, the solid curve labeled "angle" adds the cumulative effect of binned (&#8710; cos &#952;*), (&#8710;|y|), and (&#8710;m) to ePump.</p></div>
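<div xmlns="http://www.tei-c.org/ns/1.0"><p>The pseudo-data procedure described above (scale the bin cross section by the integrated luminosity, draw a Poisson count, then normalize by luminosity and bin widths) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the normalization is read off the textual description of Eq. (12), the helper names and units (fb and inverse fb) are hypothetical, and the Gaussian approximation for well-populated bins is a convenience of this sketch so that only the Python standard library is needed.</p>

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw one Poisson-distributed count with the given mean."""
    if mean > 50.0:
        # Gaussian approximation, adequate for well-populated bins.
        return max(0, round(rng.gauss(mean, math.sqrt(mean))))
    # Knuth's multiplication method for small means.
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def make_pseudo_data(sigma_fb, lumi_inv_fb, widths, rng):
    """Return (N_pseudo, binned cross-section density) for one (i, j, k) bin.

    widths = (delta_m, delta_abs_y, delta_cos_theta) for that bin.
    """
    expected_yield = sigma_fb * lumi_inv_fb  # mean of the Poisson distribution
    n_pseudo = poisson_draw(expected_yield, rng)
    delta_m, delta_abs_y, delta_cos = widths
    # Factor 2: the rapidity bin is in |y|, as the text explains.
    volume = delta_m * 2.0 * delta_abs_y * delta_cos
    return n_pseudo, n_pseudo / (lumi_inv_fb * volume)


rng = random.Random(2019)
# Hypothetical bin: 0.02 fb cross section, 10 GeV x 0.2 x 0.2 in (m, |y|, cos).
for lumi in (300.0, 3000.0):
    n, density = make_pseudo_data(0.02, lumi, (10.0, 0.2, 0.2), rng)
    print(f"L = {lumi:.0f}/fb: N = {n}, density = {density:.4f} fb/GeV")
```

<p>With ten times the luminosity the expected count in the same bin grows tenfold, so its relative statistical scatter shrinks by roughly the square root of ten; this is the sense in which the two luminosity scenarios model different statistical precision.</p></div>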
<div xmlns="http://www.tei-c.org/ns/1.0"><head>V. PDF UPDATE RESULTS</head><p>We can answer Question 1 by re-evaluating the effect of the 3000 fb&#8315;&#185; DY pseudo-dataset on the CT14HERA2 PDFs, as well as Question 2 by assessing the reduction of the PDF systematic uncertainty in the high-mass dilepton spectrum.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Impact on CT14HERA2 PDFs</head><p>Question 1 asked whether explicit inclusion of cos &#952;* data might have a useful effect in reducing the uncertainties in the parton fits. The answer can be seen in the four plots in Figs. 9 and 10. In order to see the effect of each of the quantities in the ePump-simulated refitting, there are four sets of results in each plot. Figure <ref type="figure">9</ref> shows the impact of the ePump update with the 3000 fb&#8315;&#185; scenario on the &#363;(x) and d&#772;(x) sea distributions, and Fig. <ref type="figure">10</ref> the impact on the <formula notation="TeX">u_v(x)</formula> and <formula notation="TeX">d_v(x)</formula> valence distributions.</p><p>The sea distributions show a considerable reduction in uncertainty at high x. For example, in both the &#363;(x) and d&#772;(x) distributions, the PDF uncertainty is reduced from its pre-update value of approximately 70% to 20% at x = 0.5. The improvement in the valence distributions at x 0.5 is less dramatic, but substantial improvement is observed in the ranges of x 0.5.</p><p>TABLE <ref type="table">III</ref>. Impact of 3000 fb&#8315;&#185; update on the CT14HERA2 <formula notation="TeX">u_v(x)</formula> and <formula notation="TeX">d_v(x)</formula> valence and &#363;(x) and d&#772;(x) sea distributions for several values of x using the standard triple-differential templates at Q = 3 TeV. To be compared with the "Angle" curves of Figs. 9 and 10.</p><p>The post-update <formula notation="TeX">u_v(x)</formula> distribution remains better constrained than <formula notation="TeX">d_v(x)</formula> at high x, where the uncertainty measures 2.6% as compared to 11% at x = 0.5, respectively. Table <ref type="table">III</ref>  Table <ref type="table">IV</ref> is the corresponding comparison for the 300 fb&#8315;&#185; scenario. 
The answer to Question 1 is that a global PDF fit that includes DY LHC data below 1 TeV in mass, binned in rapidity and cos &#952;*, would dramatically improve the precision of our knowledge of the up and down PDFs. During the LHC era, DY measurements of this kind are likely the only way to reduce the uncertainties on the PDFs at high x; no other input data are capable of achieving this improvement.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Impact on the High-Mass Drell-Yan Spectrum</head><p>With an updated set of PDFs, we can answer Question 2: what is the effect of the new PDFs on the systematic uncertainty of the high-mass DY cross section? Rather than the enormous extrapolation required of current-day PDFs, the extrapolation from our Control Region to our Signal Region is modest and impactful. To make contact primarily with the ATLAS dilepton analysis <ref type="bibr">[7]</ref>, the invariant mass distribution assessed here uses only leptons that originate in the central-central final state.</p><p>The results are presented in Fig. <ref type="figure">13</ref>, which shows the impact of the 3000 fb&#8315;&#185; pseudo-dataset on the high-mass PDF systematic uncertainty. The PDF uncertainty is evaluated at several characteristic values of dilepton mass, which are listed in Table <ref type="table">V</ref>. At m_&#8467;&#8467; = 5 TeV, the PDF systematic uncertainty is reduced from 31% to 8.9%, a reduction of roughly a factor of 3.5. Similarly, at m_&#8467;&#8467; = 3 TeV, the uncertainty is reduced from 15% to 3.7%, roughly a factor of 4. In each case, a substantial improvement is obtained compared to the [...]</p></div>
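<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a quick cross-check of the reduction factors quoted above, the ratios can be recomputed directly from the percentages given in the text. This is only an illustrative sketch: the helper name reduction_factor is hypothetical, and the inputs are the rounded values quoted for the 3000 fb&#8315;&#185; scenario, not the unrounded entries of Table V.</p>

```python
def reduction_factor(before_percent, after_percent):
    """Ratio of the pre-update to the post-update relative PDF uncertainty."""
    return before_percent / after_percent


# Rounded values quoted in the text for the 3000 fb^-1 scenario (in percent),
# keyed by dilepton mass in TeV.
quoted = {5.0: (31.0, 8.9), 3.0: (15.0, 3.7)}

for mass_tev, (before, after) in quoted.items():
    factor = reduction_factor(before, after)
    print(f"m = {mass_tev} TeV: {before}% -> {after}%, factor {factor:.2f}")
```

<p>The two ratios come out as 3.48 and 4.05, consistent with the "roughly a factor of 3.5" and "roughly a factor of 4" stated above.</p></div>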
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. OUTLOOK</head><p>The impact of a future DY cross section measurement on the CT14HERA2 PDF uncertainty was assessed using the ePump package for the &#8730;s = 13 TeV LHC with 300 fb&#8315;&#185; and 3000 fb&#8315;&#185; of DY pseudo-data. The fiducial region considered for the PDF update was based on three variables: the dilepton mass (m_&#8467;&#8467;), the dilepton rapidity (y_&#8467;&#8467;), and the cosine of the polar angle in the CS frame (cos &#952;*). This region was divided into 1296 histogram bins and used to construct ePump pseudo-data and signal templates, which were designed to probe the PDFs in the extreme kinematic regions of (x, Q&#178;) accessible only at the LHC.</p><p>The CT14HERA2 PDF set was used for the update, but similar effects would be expected for other PDF sets. The results showed a significant reduction in the uncertainties associated with all parton flavors, especially the &#363;(x) and d&#772;(x) sea distributions at high x. Likewise, these reduced PDF uncertainties, when propagated to the dilepton invariant mass spectrum, lead to a significantly improved description at high mass.</p><p>[...] dominant theoretical uncertainties in the electron channel of the dilepton analysis. As the PDF uncertainty will be reduced well below the current experimental uncertainty, attention will shift to the reduction of others, such as the "PDF choice" uncertainty, improving the discovery potential of future iterations of the dilepton analysis.</p><p>The inclusion of cos &#952;* information in future PDF global fits is absolutely crucial, as it supplements the more standard double-differential measurements in invariant mass and rapidity; when used in conjunction, as was done here, the reduction in uncertainty can be dramatic.</p><p>For these reasons, DY cross section measurements could be vital to the success of future searches and measurements at the LHC. 
Not only will the PDF uncertainty that affects the high-mass dilepton analysis be reduced, improving the discovery potential for many nonresonant new physics models, but the inclusion of new and robust data in modern PDF global fits might also bring the uncertainty estimates of the various global fitting groups into better agreement.</p><p>Such an opportunity might reduce the "PDF choice" uncertainty once all PDF groups include triple-differential DY data as discussed here. The goal would be to reach a stage at which the PDFs are no longer the largest source of uncertainty. Table VI compares these uncertainties explicitly; the uncertainty on the QCD background estimate is not included in calculating the post-update PDF uncertainty, which will be reduced well below the current experimental uncertainty.</p><p>Therefore, for the reasons outlined in this paper, experiments at the LHC and global fitting groups should seriously consider including precision measurements of the DY triple-differential cross section over a large invariant mass region in order to further constrain the PDF uncertainties in future PDF global fits.</p><p>If the contribution from a new experiment, &#967;&#178;_{Nexp+1}, is added to the global analysis, the exact solution of the problem would require finding the new minimum of Eq. (A1), as well as diagonalizing the new Hessian matrix. Since this requires the full data sets from all experiments in the global analysis, as well as the theory calculations for every data point evaluated at many parameter values, it is an onerous and time-consuming task even for the global analysis teams that specialize in this endeavor. This is where a tool such as ePump is advantageous. 
ePump works by using the fact that the original &#967;&#178;_global is well approximated by the known quadratic function and the fact that the theory predictions for the new observables, T_{Nexp+1,i}(z), can be approximated using the original Hessian error PDFs. Under these approximations, the minimization and Hessian diagonalization can be performed algebraically <ref type="bibr">[18,</ref><ref type="bibr">40]</ref>, with the numerical computations taking seconds rather than hours or days.</p><p>Figure <ref type="figure">8</ref> illustrates the use of ePump. To perform the PDF update, ePump requires two sets of inputs: data templates and theory templates. The data templates consist of the new experimental data values and their statistical and systematic uncertainties, including correlations, exactly as would be included in a standard global analysis. In our present study these are the event counts of the new pseudo-data, along with their associated statistical uncertainties. The theory templates consist of the corresponding theory predictions for the same observables, evaluated using the central PDF and each of the Hessian eigenvector PDFs. Note that any number of new data sets can be included in the update by ePump, with any number of data points per new data set.</p><p>The output of ePump is an updated central PDF and updated Hessian eigenvector PDFs, which approximate the result that would be obtained from a full global re-analysis that includes the new data. As an additional benefit, ePump can also directly output the updated predictions and uncertainties for any other observables of interest (such as the cross section in the signal region), without the need to recalculate them using the updated PDFs. For more details about the use of ePump, see Ref. <ref type="bibr">[18]</ref>. 
The code for ePump and more specific details of its usage can be obtained at the website <ref type="url">http://hep.pa.msu.edu/epump/</ref>.</p><p>In the present study, we have used ePump to assess the reduction of PDF uncertainties from various kinematic selection choices on the Drell-Yan data. It should be noted that the more the included new data deviate from the prediction (based on CT14HERA2), the less reliable the result of the ePump analysis will be. This is due to the nature of the ePump method, which assumes a quadratic dependence of &#967;&#178; and a linear dependence of the observables as the PDFs vary. Since the pseudo-data (generated by MMHT14) and the theory predictions (from CT14HERA2) do not differ much, we expect the results of the ePump analysis in our study to hold to a very good approximation. However, should future data deviate significantly from the CT14HERA2 theory predictions, a full global analysis, probably with an extended nonperturbative parametrization form, must be carried out.</p><p>The asymmetry coefficients C^0_q and C^1_q of Eq. (B1) include the electroweak couplings of the initial-state quarks and final-state leptons, and describe the m_&#8467;&#8467; spectrum as</p><p><formula notation="TeX">C^1_q(m_{\ell\ell}) = 4 Q_\ell Q_q a_\ell a_q \chi_1(m_{\ell\ell}) + 8 a_\ell v_\ell a_q v_q \chi_2(m_{\ell\ell}).</formula> (B2)</p></div>
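<div xmlns="http://www.tei-c.org/ns/1.0"><p>The algebraic update described in the ePump discussion above (a quadratic &#967;&#178; in the PDF parameters z and a linear response of the observables along the Hessian eigenvector directions) can be illustrated with a minimal sketch. This is not the ePump code or its interface: the function names (invert, hessian_update, updated_uncertainty) and the argument tolerance2 are hypothetical, the new data are assumed to have uncorrelated uncertainties, the linear response is estimated from symmetric differences of the plus and minus eigenvector predictions, and a single fixed tolerance is assumed for all eigenvectors.</p>

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment with the identity: [matrix | I].
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def hessian_update(data, sigma, central, plus, minus, tolerance2):
    """Best-fit shift z and its covariance after adding new data.

    data, sigma, central: per-bin measurement, uncertainty, central prediction.
    plus, minus: per-eigenvector lists of per-bin predictions.
    tolerance2: assumed fixed Delta chi^2 that defines the original error PDFs.
    """
    n_eig = len(plus)
    # Linear response of each bin to a unit step along each eigenvector.
    grad = [[0.5 * (p - m) for p, m in zip(plus[k], minus[k])]
            for k in range(n_eig)]
    weights = [1.0 / (s * s * tolerance2) for s in sigma]
    resid = [d - c for d, c in zip(data, central)]
    # chi2(z) / T^2 = |z|^2 + sum_i w_i (resid_i - sum_k grad[k][i] z_k)^2
    hess = [[float(j == k)
             + sum(w * gj * gk for w, gj, gk in zip(weights, grad[j], grad[k]))
             for k in range(n_eig)] for j in range(n_eig)]
    rhs = [sum(w * g * r for w, g, r in zip(weights, grad[k], resid))
           for k in range(n_eig)]
    cov_z = invert(hess)  # equals the identity before the update
    z_best = [sum(cov_z[j][k] * rhs[k] for k in range(n_eig))
              for j in range(n_eig)]
    return z_best, cov_z


def updated_uncertainty(obs_plus, obs_minus, cov_z):
    """Post-update uncertainty of any observable from its eigenvector values."""
    g = [0.5 * (p - m) for p, m in zip(obs_plus, obs_minus)]
    n = len(g)
    variance = sum(g[j] * cov_z[j][k] * g[k] for j in range(n) for k in range(n))
    return variance ** 0.5
```

<p>As a sanity check of the sketch, take one eigenvector whose plus and minus members move a single bin to 11 and 9 around a central prediction of 10, a measurement of 11 with uncertainty 1, and tolerance2 = 1. The update then returns z = 0.5 and shrinks the uncertainty on that bin from 1 to 1/&#8730;2, which is the familiar weighted average of two equally precise determinations. The last function mirrors the remark above that updated uncertainties for other observables can be obtained without recomputing them from the updated PDFs.</p></div>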
		</body>
		</text>
</TEI>
