<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Keck Integral-field Spectroscopy of M87 Reveals an Intrinsically Triaxial Galaxy and a Revised Black Hole Mass</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>03/01/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10412070</idno>
					<idno type="doi">10.3847/2041-8213/acbbcf</idno>
					<title level='j'>The Astrophysical Journal Letters</title>
<idno>2041-8205</idno>
<biblScope unit="volume">945</biblScope>
<biblScope unit="issue">2</biblScope>					

					<author>Emily R. Liepold</author><author>Chung-Pei Ma</author><author>Jonelle L. Walsh</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Abstract                          The three-dimensional intrinsic shape of a galaxy and the mass of the central supermassive black hole provide key insight into the galaxy’s growth history over cosmic time. Standard assumptions of a spherical or axisymmetric shape can be simplistic and can bias the black hole mass inferred from the motions of stars within a galaxy. Here, we present spatially resolved stellar kinematics of M87 over a two-dimensional 250″ × 300″ contiguous field covering a radial range of 50 pc–12 kpc from integral-field spectroscopic observations at the Keck II Telescope. From about 5 kpc and outward, we detect a prominent 25 km s              −1              rotational pattern, in which the kinematic axis (connecting the maximal receding and approaching velocities) is 40° misaligned with the photometric major axis of M87. The rotational amplitude and misalignment angle both decrease in the inner 5 kpc. Such misaligned and twisted velocity fields are a hallmark of triaxiality, indicating that M87 is not an axisymmetrically shaped galaxy. Triaxial Schwarzschild orbit modeling with more than 4000 observational constraints enabled us to determine simultaneously the shape and mass parameters. The models incorporate a radially declining profile for the stellar mass-to-light ratio suggested by stellar population studies. We find that M87 is strongly triaxial, with ratios of              p              = 0.845 for the middle-to-long principal axes and              q              = 0.722 for the short-to-long principal axes, and determine the black hole mass to be                                                                                                  (                                                            5.37                                                              −                      0.25                                                              +                      0.37                                                        ±                  0.22                  )                  ×                                                            10                                                              9                                                                                                  M                                                              ⊙                                                                                                  , where the second error indicates the systematic uncertainty associated with the distance to M87.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>1. Introduction Some of the earliest dynamical evidence for the presence of a supermassive black hole (SMBH) came from M87 <ref type="bibr">(Sargent et al. 1978)</ref>. A bright asymmetric ring of radio emission around the M87 SMBH was imaged in 2019 (Event Horizon Telescope <ref type="bibr">Collaboration et al. 2019a</ref>). The black hole mass (M BH ) inferred from the ring features is consistent with the value determined from stellar dynamics based on axisymmetric orbit modeling <ref type="bibr">(Gebhardt et al. 2011</ref>), but it is nearly twice the mass inferred from dynamics of a gas disk around the hole <ref type="bibr">(Walsh et al. 2013)</ref>.</p><p>M87 is classified as an elliptical galaxy based on the twodimensional shape of the stellar light projected on the sky. However, its three-dimensional intrinsic shape has never been determined. A galaxy's intrinsic shape is a fundamental property that encodes the galaxy's past merger history and provides information about the mass ratios of the progenitor galaxies, the merger orbital parameters, gas fractions, and the fraction of stars formed ex situ. Whether a galaxy is intrinsically spherical, axisymmetric, or triaxial also impacts dynamical determinations of its SMBH mass and stellar mass, as well as any mass reconstructions based on the method of gravitational lensing.</p><p>Thus far, almost all information about galaxy intrinsic shapes has been inferred statistically by inverting distributions of observed galaxy properties <ref type="bibr">(Franx et al. 1991;</ref><ref type="bibr">Weijmans et al. 2014;</ref><ref type="bibr">Foster et al. 2017;</ref><ref type="bibr">Ene et al. 2018;</ref><ref type="bibr">Li et al. 2018</ref>). Here, we use the Keck Cosmic Web Imager (KCWI; <ref type="bibr">Morrissey et al. 2018</ref>) on the 10 m Keck II telescope to obtain a spatially resolved two-dimensional map of the stellar kinematics of M87 over a 250&#8243; &#215; 300&#8243; field of view. The resulting kinematics span a radial range of &#8764;0 6-150&#8243;, corresponding to a physical range of 50 pc-12 kpc at a distance of 16.8 &#177; 0.7 Mpc to M87, the value adopted in Event Horizon Telescope <ref type="bibr">Collaboration et al. (2019b)</ref> and in this work (an angular size of 1&#8243; corresponds to a physical length of 81.1 &#177; 3.3 pc). We perform triaxial Schwarzschild orbit modeling using the detailed stellar kinematic measurements as constraints to determine M87&#700;s shape and mass parameters. Our models include a radially declining profile for the stellar mass-to-light ratio (M * /L) inferred from stellar population measurements <ref type="bibr">(Sarzi et al. 2018)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Keck Observations of M87</head><p>We observed M87 with Keck KCWI in 2020 <ref type="bibr">May, 2021</ref><ref type="bibr">May, 2022</ref><ref type="bibr">March, and 2022</ref> April. With the large slicer and BL grating of the integral-field unit (IFU), we obtained spectra between 3500 and 5600 &#197; at 39 pointings, which provide contiguous two-dimensional spatial coverage of the nucleus and the outer parts of M87 (Figure <ref type="figure">1</ref>). The data span about 20 kpc (250&#8243;) across the photometric major axis (-25&#176;east of north) and about 24 kpc (300&#8243;) across the photometric minor axis (-115&#176;east of north).</p><p>We coadd spectra from individual KCWI spaxels to reach high signal-to-noise ratios (S/Ns), forming 461 spatial bins. Within each spatial aperture, we measure the line-of-sight stellar velocity distributions (LOSVDs) from the shapes of the absorption lines. Further details about the observations, data reduction procedures, spectral fitting processes, and stellar kinematic determination are provided in Appendices A and B.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Stellar Kinematic Maps</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Misalignment between Kinematic and Photometric Axes</head><p>The KCWI map for the line-of-sight velocity V (left panel of Figure <ref type="figure">1</ref>) shows a prominent rotational pattern at large radii, in which the northeast side of the galaxy is blueshifted and the southwest side is redshifted. The kinematic axis that connects the maximal receding and approaching velocities, however, is not aligned with the photometric major axis, as it would be for an axisymmetric rotating galaxy.</p><p>To quantify the amplitude and axis of rotation, we model the velocity field as a cosine function, with</p><p>, where R is the projected radius from the galaxy's center and &#920; is the azimuthal angle on the sky. The model parameters V 1 (R) and &#920; 0 (R) are the the amplitude of rotation and the position angle (PA) of the kinematic axis at radius R, respectively. With increasing radius, the velocity curve shows a systematic shift in phase and an increase in rotational amplitude (Figure <ref type="figure">2</ref>). Within a radius of 3 kpc, the PA of the kinematic axis changes rapidly clockwise with radius (lower right panel of Figure <ref type="figure">2</ref>), representing the kinematically distinct core mapped out by the Multi Unit Spectroscopic Explorer (MUSE) on the Very Large Telescope <ref type="bibr">(Emsellem et al. 2014)</ref>. Beyond 3 kpc, where the MUSE data end (at about 35&#8243;), we find that the PA of the kinematic axis continues to change clockwise and crosses the PA of the photometric minor axis, plateauing at -165&#176;between 6 and 12 kpc. Hence, there is a 40&#176;misalignment between the stellar kinematic axis and photometric major axis in M87.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Stellar Velocity Dispersion</head><p>The KCWI map (right panel of Figure <ref type="figure">1</ref>) and radial profile (Figure <ref type="figure">3</ref>) of the stellar velocity dispersion &#963; exhibit several features. Toward the center of M87, &#963; increases rapidly from 250 km s -1 at a radius of 2 kpc to 370 km s -1 at 100 pc from the nucleus. This is a clear signature of the gravitational influence of the central black hole on the motions of the stars in its vicinity. The velocity dispersion stays at about 250 km s -1 between 2 and 5 kpc and then shows a gentle 10% decline between 5 kpc and the outermost reach of our data at 12 kpc. The stellar &#963; at the edge of our field connects smoothly to the latest determinations of the velocity dispersions of discrete dynamical tracers (lower panel of Figure <ref type="figure">3</ref>) such as red globular clusters and planetary nebulae in the outer parts of M87 <ref type="bibr">(Zhang et al. 2015;</ref><ref type="bibr">Longobardi et al. 2018</ref>). Beyond about 10 kpc, subpopulations of planetary nebulae have been reported to have distinct kinematics <ref type="bibr">(Longobardi et al. 2018</ref>): &#963; of "intracluster" planetary nebulae rises to 800 km s -1 at 100 kpc, whereas those in the galaxy halo component have a relatively flat &#963; profile out to 100 kpc, similar to that of the red population of globular clusters <ref type="bibr">(C&#244;t&#233; et al. 2001;</ref><ref type="bibr">Strader et al. 2011;</ref><ref type="bibr">Zhang et al. 2015)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Determination of Mass and Shape Parameters from</head><p>Triaxial Schwarzschild Modeling We use the full LOSVDs from Keck KCWI, along with photometric observations from the Hubble Space Telescope (HST) and ground-based telescopes <ref type="bibr">(Kormendy et al. 2009)</ref>, to measure M87&#700;s mass distribution and intrinsic shape. We perform triaxial Schwarzschild orbit modeling with the TriOS code <ref type="bibr">(Quenneville et al. 2021</ref><ref type="bibr">(Quenneville et al. , 2022) )</ref> based on an earlier code (van den <ref type="bibr">Bosch et al. 2008)</ref>, and use more than 4000 observational constraints to simultaneously determine six parameters: M BH , M * /L, dark matter content, and the threedimensional intrinsic shape. As described below, we implement a new capability in the code to model spatial variations in M * /L and use a radially declining M * /L profile that closely approximates the variation inferred from stellar population and dynamics studies of M87 <ref type="bibr">(Oldham &amp; Auger 2018;</ref><ref type="bibr">Sarzi et al. 2018)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Galaxy Model and Orbit Sampling</head><p>Each galaxy model has three mass components: a central SMBH, stars, and a dark matter halo. The three-dimensional stellar density in the TriOS code is represented as a sum of multiple Gaussian functions of differing widths and axial ratios. To determine these functions, we first fit a two-dimensional Multi-Gaussian Expansion (MGE; Cappellari 2002) to the surface brightness distribution of M87 (see Appendix C). Each MGE component is allowed an independent flattening parameter ( q in Table <ref type="table">2</ref>) to model any radially changing ellipticity observed on the sky.</p><p>For a given set of three angles, &#952;, f, and &#968;, that relate the intrinsic and projected coordinate systems of a galaxy <ref type="bibr">(Binney 1985)</ref>, we deproject each MGE component, multiply by a radially varying M * /L (see below), and add the deprojected Gaussians to obtain the three-dimensional stellar density. Each deprojected MGE component can have its own axis ratios p, q, and u, where p = b/a is the intrinsic middle-tolong axis ratio, q = c/a is the intrinsic short-to-long axis ratio, and u is the apparent-to-intrinsic long axis ratio. When the bestfit p, q, and u are quoted below, each value is luminosity averaged over the MGE components. Further details of the relations between the apparent and intrinsic shape parameters and the deprojections can be found in Section 2 of <ref type="bibr">Quenneville et al. (2022)</ref>.</p><p>The M * /L we use to obtain the stellar density varies radially, following a logistic curve given by </p><p>. (Upper right) The amplitude of rotation, V 1 (R), increases with radius and reaches 25 km s -1 around 6 kpc. (Lower right) The phase of the velocity function, &#920; 0 , measures the orientation of the kinematic axis and varies significantly with radius. It plateaus to -165&#176;beyond 6 kpc, indicating a 40&#176;misalignment between the kinematic axis and the photometric major axis (red dashed curves; <ref type="bibr">Kormendy et al. 2009)</ref> </p><p>where &#948; is the ratio of the inner and outer M * /L, and R 0 and k parameterize the location and sharpness of the transition. We choose &#948; = 2.5, R 0 = 10&#8243;, and k = 2, which together well approximate (Figure <ref type="figure">D1</ref>) the spatial M * /L profile of M87 determined from <ref type="bibr">Sarzi et al. (2018)</ref>. We leave the overall normalization-the outer M * /L-as a free parameter. A similar form as Equation (1) was used in an axisymmetric Jeans dynamical study of M87 globular cluster and stellar kinematics data <ref type="bibr">(Oldham &amp; Auger 2018)</ref>. We implement this spatial variation in our models by choosing distinct M * /L ratios for each component of the MGE such that the profile is reproduced.</p><p>The dark matter halo is described by a generalized Navarro-Frenk-White density profile <ref type="bibr">(Navarro et al. 1996</ref>)</p><p>where &#961; 0 is the density scale factor and r s is the scale radius. This form of the dark matter halo is used by <ref type="bibr">Li et al. (2020)</ref> when fitting axisymmetric Jeans models to M87 globular cluster and stellar kinematics data. They determine that r 15.7 s 2.0 2.3 kpc for the cored &#947; = 0 model but find no significant preference for &#947; = 0 over &#947; = 1. <ref type="bibr">Oldham &amp; Auger (2016)</ref>, on the other hand, find a strong preference for flat cores with &#947; &#61576; 0.13. We have tested models with &#947; = 0, 0.5, and 1, and we find that the models with a &#947; = 0 halo are a better description of the data, with the goodness of fit (&#967; 2 ) lowered by at least 100. We therefore adopt the flat core, &#947; = 0 dark matter halo. Since the KCWI stellar kinematics extend to a projected radius of 12 kpc, we expect r s and &#961; 0 to be quite degenerate; we choose to fix r s = 15 kpc and keep &#961; 0 as a free parameter in the models.</p><p>For each galaxy model, we compute the trajectories of a library of around 500,000 stellar orbits that sample 120 values of energy, 54 and 27 values of the second integral of motion for the loop and box orbit libraries, and 27 values of the third integral of motion over logarithmically spaced radii from 0 01 to 316&#8243;. The loop and box orbits are integrated for 2000 and 200 dynamical times, respectively. We project the stellar orbits onto the sky and compute the LOSVDs, accounting for the KCWI point-spread function (PSF) and spatial binning. Using a non-negative least-squares optimization, we determine the orbital weights such that the linear superposition of orbits reproduces the luminous mass (to an accuracy of 1%) and the observed kinematics in each spatial bin. As described below, the procedure is repeated for a large suite of galaxy models in order to determine the best combination of the galaxy model parameters.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Parameter Search</head><p>The best-fit model parameters and uncertainties are determined as follows. We use an iterative grid-free Latin hypercube scheme to select sampling points in the six-dimensional model parameter space <ref type="bibr">(Liepold et al. 2020;</ref><ref type="bibr">Pilawa et al. 2022;</ref><ref type="bibr">Quenneville et al. 2022)</ref>. In each iteration, the TriOS code is run to assess the &#967; 2 of each of the sampled galaxy models. The &#967; 2 of a model is determined by comparing the data and uncertainties for the lowest eight kinematic moments in each of the 461 spatial bins to the model predictions. An additional set of constraints is imposed on kinematic moments h 9 to h 12 , in which the value of each moment is required to be zero with error bars comparable to the errors in h 3 to h 8 . As shown in <ref type="bibr">Liepold et al. (2020)</ref>, these additional constraints help eliminate spurious behavior in the LOSVDs predicted by the models.</p><p>The goodness-of-fit landscape is then approximated using Gaussian process regression (GPR; <ref type="bibr">Rasmussen &amp; Williams 2006;</ref><ref type="bibr">Pedregosa et al. 2011</ref>) with a Mat&#233;rn covariance kernel. To map the high-likelihood region in finer detail, we run the TriOS code again for a subsequent set of models selected by uniformly sampling a zoom-in volume that lies within the 3&#963; confidence level for six parameters in the previous regression surface. A more accurate GPR surface is then obtained from all the available models. After multiple iterations, we again use GPR to construct a smooth likelihood surface from all available models (nearly 20,000 in total). Finally, we use the dynamic nested sampler dynesty (Speagle 2020) to sample from this surface to produce Bayesian posteriors, assuming a uniform prior for all parameters.</p><p>Following <ref type="bibr">Quenneville et al. (2022)</ref>, we search over a different set of shape parameters, T, T maj , and T min , instead of angles &#952;, f, and &#968;. Such a parameterization maps the deprojectable volume in the viewing-angle space into a unit cube in the shape-parameter space, allowing for simpler and more efficient searches. The definitions of (T T T , , maj min ) and the relationships with (&#952;, f, &#968;) are given in Section 3 of <ref type="bibr">Quenneville et al. (2022)</ref>.</p><p>The final posterior distributions yield clear constraints on all six model parameters: M BH , outer M * /L, dark matter density &#961; 0 , T, T maj , and T min (Figure <ref type="figure">4</ref>). Instead of the halo density parameter &#961; 0 , we describe the dark matter halo in terms of the ratio of dark matter to total matter enclosed within 10 kpc, f 10 . The posterior distributions for the more intuitive (luminosityaveraged) axis ratios p, q, and u are also shown. The best-fit parameters are summarized in Table <ref type="table">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Black Hole Mass and Stellar Mass-to-light Ratio</head><p>The mass of the M87 black hole has been determined with two other independent methods <ref type="bibr">(Walsh et al. 2013</ref>; Event Horizon Telescope Collaboration et al. 2019a) in addition to the stellar-dynamical method used here. Compared to our value ( ) M M 5.37 0.22 10 BH 0.25 0.37 9</p><p>, the value M BH = (6.5 &#177; 0.2 &#177; 0.7) &#215; 10 9 M e inferred from the crescent diameter by the Event Horizon Telescope (EHT) team (Event Horizon Telescope Collaboration et al. 2019b) is 21% higher, but the difference is within 1.5&#963; of their uncertainties. A recent reanalysis of EHT observations <ref type="bibr">(Broderick et al. 2022)</ref> revised the black hole mass to M BH = (7.13 &#177; 0.39) &#215; 10 9 M e , which is 33% above our value, but <ref type="bibr">Tiede et al. (2022)</ref> warned of the false-positive tendency of the method used in the reanalysis and found that significant systematic uncertainties were not taken into account.</p><p>The ionized gas-dynamical determination of ( ) M M 3.45 10 BH 0.26 0.85 9</p><p>(after scaling to our adopted distance of 16.8 Mpc) is 36% below our value <ref type="bibr">(Walsh et al. 2013;</ref><ref type="bibr">Event Horizon Telescope Collaboration et al. 2019b)</ref>.</p><p>Before this work, the most recent mass measurement of the M87 black hole that also used orbit-based stellar dynamics obtained <ref type="bibr">(Gebhardt et al. 2011</ref>; Event Horizon Telescope Collaboration et al. 2019b) ( ) M M 6.14 10 BH 0.62 1.07 9</p><p>(after scaling to our adopted distance of 16.8 Mpc), which is 14% above our value. Despite the apparent consistency, there are many differences between the two measurements. In this work, the stellar spectra are obtained in a homogeneous manner from the latest IFU at the Keck Telescope over a contiguous 250&#8243; &#215; 300&#8243; field and have S/N of around 100 per &#197; for the outermost bins and above 200 per &#197; for central bins. The observed stellar velocity dispersions used to constrain the orbit models in this work are about 20% lower than those of <ref type="bibr">Gebhardt et al. (2011)</ref> and <ref type="bibr">Murphy et al. (2011)</ref> beyond 1 kpc (Figure <ref type="figure">3</ref>; top panel), but this work is in broad agreement with other recent measurements <ref type="bibr">(Emsellem et al. 2014;</ref><ref type="bibr">Sarzi et al. 2018;</ref><ref type="bibr">Forbes et al. 2020)</ref>. The orbit modeling in this work allows for triaxiality, and the M BH is obtained from a full sixdimensional model parameter search with posteriors measured using a Bayesian framework. Furthermore, Gebhardt et al.</p><p>(2011) adopts a spatially constant V-band M * /L of 9.7 M e /L e (scaled to our distance of 16.8 Mpc). However, a recent detailed stellar population analysis of M87 reported a negative radial gradient due to a changing stellar initial mass function <ref type="bibr">(Sarzi et al. 2018)</ref>. When incorporating the shape of this M * /L gradient into our stellar-dynamical models, we find the V-band M * /L declines from 8.65 M e /L e at the center to an outer value of 3.46 M e /L e .</p><p>Using either M * (&lt; r SOI ) = M BH or M * (&lt; r SOI ) = 2M BH as the definition of a black hole's gravitational sphere of influence (SOI), we find the SOI radius of the M87 SMBH to be r SOI = 4 4 (0.36 kpc) or 6 1 (0.50 kpc).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Dark Matter Mass</head><p>At the outer reach of our data, at a radius of 10 kpc, we find the enclosed dark matter mass to be M DM (&lt; 10 kpc) = (3.88 &#177; 0.12) &#215; 10 11 M e , constituting about 67% of the total mass of the galaxy ( f 10 in Table <ref type="table">1</ref>). A similar dark matter fraction (73% at 14.2 kpc) is obtained from Jeans modeling of the kinematics of globular clusters <ref type="bibr">(Li et al. 2020)</ref>. A lower dark matter fraction (about 30% at 11 kpc) is estimated from axisymmetric orbit-based modeling of the kinematics from stars and globular clusters <ref type="bibr">(Murphy et al. 2011)</ref>. This lower fraction arises mainly from their high estimate of M * /L, as discussed in the previous paragraph.</p><p>Our inferred total mass of M87 within 10 kpc is M tot (&lt; 10 kpc) = (5.77 &#177; 0.12) &#215; 10 11 M e . Dynamical modeling of globular clusters under the assumption of spherical symmetry yields very similar values at the same radius <ref type="bibr">(Romanowsky &amp; Kochanek 2001;</ref><ref type="bibr">Wu &amp; Tremaine 2006)</ref> but with large modeling uncertainties <ref type="bibr">(Wu &amp; Tremaine 2006)</ref>. Estimates from axisymmetric orbit models find a 15% lower value <ref type="bibr">(Murphy et al. 2011)</ref>. Jeans modeling studies <ref type="bibr">(Oldham &amp; Auger 2016;</ref><ref type="bibr">Li et al. 2020</ref>) incorporating a radially declining M * /L find a total mass enclosed within 10 kpc to be in the range of (3-7.5) &#215; 10 11 M e .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5.">M87's Intrinsic Shape</head><p>Our orbit modeling results show that M87 is strongly triaxial, where the lengths of the short and middle principal axes are 72% and 85% of the length of the long axis, corresponding to q and p, respectively. A triaxiality parameter often used to quantity the ratios of a galaxy's principal axes is</p><p>. This parameter ranges between T = 0 for an oblate axisymmetric shape (p = 1 or a = b) and T = 1 for a prolate axisymmetric shape (p = q or b = c), with values between 0 and 1 indicating a triaxial shape. Our inferred value for M87 is T = 0.65 &#177; 0.02, strongly excluding the possibility that M87 is an axisymmetric galaxy.</p><p>The shape parameters p, q, and u in Table <ref type="table">1</ref> are related to a set of angles &#952;, f, and &#968; that uniquely specify the orientation of M87&#700;s intrinsic axes with respect to its projected axes on the sky (van den <ref type="bibr">Bosch et al. 2008;</ref><ref type="bibr">Quenneville et al. 2022)</ref>. The angles &#952; and f specify the direction of the line of sight from M87 to the observer; they are the usual polar angles in M87&#700;s intrinsic coordinate system. The inclination angle &#952; = 0&#176;c orresponds to a face-on view of M87 along its intrinsic short axis, and &#952; = 90&#176;corresponds to an edge-on view with the short axis in the sky plane. The azimuthal angle f = 0&#176;places the intrinsic middle axis in the sky plane, and f = 90&#176;places the intrinsic long axis in the sky plane. Once the line of sight is described by &#952; and f, the third angle &#968; specifies the remaining degree of freedom for the rotation about the line of sight. Our best-fit angles for M87 are (&#952;, f, &#968;) = (48&#176;.9, 37&#176;. 5, -61&#176;.3). Thus, we are viewing M87 from a direction that is roughly equidistant from all three principal axes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.6.">Angular Momentum Vector and Origin of Kinematic</head><p>Misalignment To gain physical insight into the origin of the observed misalignment between the kinematic axis and photometric major axis of M87 on the sky (Figure <ref type="figure">2</ref>; lower right), we examine the direction of the total angular momentum vector, L, of the stars predicted by our best-fit orbit model and how it would be projected on the sky. To do this, we sum the individual contributions to the angular momentum from the superposition of stellar orbits and compute the total L. Among the three major orbital types computed in the TriOS code, the box orbits supported by a triaxial gravitational potential, by construction, have zero angular momentum, whereas the shortaxis and long-axis tube orbits have net L along the intrinsic short axis and long axis, respectively <ref type="bibr">(Schwarzschild 1979;</ref><ref type="bibr">van den Bosch et al. 2008;</ref><ref type="bibr">Quenneville et al. 2022)</ref>. The direction of the total L is therefore determined by the relative contributions from the two types of tube orbits <ref type="bibr">(Franx et al. 1991)</ref>. Posterior distributions of six parameters from triaxial Schwarzschild orbit modeling of M87: black hole mass M BH , outer stellar mass-to-light ratio M * /L, dark matter fraction enclosed within 10 kpc f 10 , and shape parameters T, T maj , and T min . The posterior distributions of the luminosity-averaged axis ratios u, p, and q are shown in the upper right. The three levels of purple shading bound the 1&#963;, 2&#963;, and 3&#963; regions (68%, 95%, and 99.7% confidence levels, respectively) of the parameters. The vertical lines in each one-dimensional distribution indicate the median and the corresponding three confidence levels.</p><p>The rotational velocity of M87 reaches sufficiently high amplitudes beyond about 5 kpc (Figures <ref type="figure">1</ref> and <ref type="figure">2</ref>) for us to robustly determine the direction of L. We find it to point approximately 60&#176;off of the intrinsic short axis. Using the bestfit viewing angles to project L on the sky, we find it to lie at a PA of approximately -60&#176;. Because the projected L is orthogonal to the kinematic axis of the projected velocity field, this simple calculation indicates that the PA of the kinematic axis predicted by the model is around -150&#176;, very similar to the observed kinematic axis. The observed kinematic misalignment of M87 on the sky is therefore a result of both projection effects of a triaxial galaxy and a physical offset between the total angular momentum vector and the intrinsic short axis of the galaxy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions</head><p>With 4000 constraints from Keck KCWI and our latest triaxial orbit modeling code and procedure for sampling highdimensional parameter spaces even with computationally intensive models, we are able to relax the common assumption of axisymmetry and present the most comprehensive stellardynamical study of the M87 galaxy and its central black hole. This work is one of only a small number of studies that have produced constraints on all three intrinsic shape parameters for individual galaxies <ref type="bibr">(Jin et al. 2020;</ref><ref type="bibr">Santucci et al. 2022)</ref>. Even fewer galaxies have been observed with sufficient angular resolution, field of view, spectral coverage, and S/N for a simultaneous determination of the intrinsic shape, supermassive black hole mass, and galaxy mass (van den Bosch &amp; Tim de Zeeuw 2010; <ref type="bibr">Walsh et al. 2012;</ref><ref type="bibr">den Brok et al. 2021;</ref><ref type="bibr">Pilawa et al. 2022;</ref><ref type="bibr">Quenneville et al. 2022)</ref>. As demonstrated in this work, further advancements have only been made possible by the installations of wide-field and highly sensitive IFUs on large ground-based telescopes.</p><p>Moving forward, it is crucial to apply triaxial stellardynamical orbit models to larger samples of galaxies, thereby advancing this method from a rarity to a standard technique. This is especially pertinent for massive elliptical galaxies such as M87 because the majority of them-when a rotational pattern can be detected in the stellar velocity field-show some degree of misalignment between the kinematic and photometric major axes, extending to the half-light radius and beyond <ref type="bibr">(Ene et al. 2018</ref><ref type="bibr">(Ene et al. , 2020;;</ref><ref type="bibr">Krajnovi&#263; et al. 2018)</ref>. Such an offset indicates triaxiality <ref type="bibr">(Binney 1985;</ref><ref type="bibr">Franx et al. 1991)</ref>; an axisymmetric galaxy would, by symmetry, produce only aligned kinematic and photometric major axes.</p><p>When direct comparisons between axisymmetric and triaxial modeling were made on the same galaxy, the black hole mass from axisymmetric models has ranged from about 50% (van den Bosch &amp; Tim de Zeeuw 2010) to 170% <ref type="bibr">(Pilawa et al. 2022)</ref> of the mass when triaxiality was allowed; and in two galaxies, the black hole mass did not change appreciably (van den Bosch &amp; Tim de Zeeuw 2010; <ref type="bibr">Liepold et al. 2020;</ref><ref type="bibr">Quenneville et al. 2022)</ref>. Overall, triaxial models were able to match the observed stellar kinematics significantly better than axisymmetric models <ref type="bibr">(Pilawa et al. 2022;</ref><ref type="bibr">Quenneville et al. 2022)</ref>.</p><p>More secure black hole masses could result in significant changes to the local black hole census and the shapes of the scaling relations between black holes and host galaxies, thereby impacting our understanding of black hole fueling and feedback physics, as well as binary black hole merger physics used to forecast and eventually interpret gravitational wave signals for Pulsar Timing Arrays (Taylor 2021) and space-based detectors (Amaro-Seoane et al. 2022). In terms of black hole imaging studies, because the photon ring diameter ranges from about 9.6 to 10.4 gravitational radii depending on the black hole spin (Event Horizon Telescope Collaboration et al. 2019b), future analyses combining direct imaging with stellar kinematic measurements such as that presented this paper have the potential to significantly improve the prospects for measuring black hole spins.</p><p>Table 1 Mass and Shape Properties of M87 M87 Property (units) Inferred Value Black hole mass M BH (10 9 M e ) 5.37 0.22 0.25 0.37 Outer M * /L (V-band; M e /L e ) 3.46 0.15 0.06 0.04 Inner M * /L (V-band; M e /L e ) 8.65 0.38 0.15 0.10 Dark matter fraction at 10 kpc f 10 0.67 &#177; 0.02 Total mass within 10 kpc (10 11 M e ) 5.77 &#177; 0.12 Shape parameter T 0.65 &#177; 0.02 Shape parameter T maj 0.46 0.02 0.03 Shape parameter T min 0.61 &#177; 0.02 Average middle-to-long axis ratio p 0.845 &#177; 0.004 Average short-to-long axis ratio q 0.722 &#177; 0.007 Average apparent-to-intrinsic long axis ratio u 0.935 &#177; 0.004 Line-of-sight direction &#952;, f (&#176;) 48.9 1.0 1.1 , 37.5 1.3 1.4 Rotation about line of sight &#968; (&#176;) 61.3 1.7 1.4 Note. The search over galaxy parameters in the triaxial orbit modeling in this paper is performed over M BH , outer M * /L, halo scale density &#961; 0 , and the shape parameters T, T maj , and T min . All other parameters in the table are computed from the posteriors of those six parameters. For the two primary mass parameters M BH and M * /L, the second set of errors denotes systematic uncertainties (68% confidence levels) due to the uncertainty in the distance to M87: 16.8 &#177; 0.7 Mpc (Event Horizon Telescope Collaboration et al. 2019b).</p><p>(Cappellari 2017), vorbin <ref type="bibr">(Cappellari &amp; Copin 2003)</ref>, MGE <ref type="bibr">(Cappellari 2002)</ref>, KCWI Data Reduction Pipeline <ref type="bibr">(Morrissey et al. 2018)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Appendix A Keck KCWI Data Reduction and Analysis</head><p>We observed M87 using the integral-field spectrograph KCWI on Keck. We used the BL grating centered on 4600 &#197; and the Kblue filter to obtain the widest wavelength coverage, reducing possible template mismatch during the subsequent extraction of the stellar kinematics. The integration time per exposure varied from 300 s for the central pointings to 1500 s for the outermost pointings with low surface brightness. We periodically acquired offset sky exposures in between the onsource galaxy exposures, each roughly half the integration time of the adjacent galaxy exposures. Only data taken in good observing conditions are used in this analysis; the on-source and sky exposure times total 13 hr and 2.8 hr, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A.1. Data Reduction</head><p>The KCWI Data Extraction and Reduction Pipeline <ref type="bibr">(Morrissey et al. 2018</ref>) is actively maintained on a publicly accessible GitHub repository. We use the IDL version of the pipeline with its default settings to reduce each frame. The main steps include overscan and bias removal, cosmic ray rejection, dark and scattered light subtraction, solving for the geometric distortion and wavelength solution, flat-fielding, correction for vignetting and the illumination pattern, sky subtraction, and the generation of data cubes using the spatial and spectral mappings determined previously. The pipeline then corrects for differential atmospheric refraction and applies a flux calibration using a standard star.</p><p>In addition to the default pipeline, we perform custom steps to improve the quality of the processed data. Some cosmic rays are improperly removed by the KCWI pipeline, leaving sharp features at certain wavelengths in a small number of spaxels in our data cubes. We therefore scan through each wavelength slice of the cubes, mask the impacted pixels, and perform an interpolation to replace their values with those of neighboring pixels. Furthermore, beyond about 100&#8243;, the KCWI spectra are sky-dominated and subtle mis-subtraction of the sky can result in significant reduction of the S/N of the galaxy spectra. The sky subtraction stage of the KCWI pipeline uses b-spline interpolation to build a "noise-free" model of the sky in each pixel that is subtracted from the corresponding object exposure. We find that this routine does not capture highly space-or timevariant sky features, so we further remove residual sky features using the combination of a principal component analysis (PCA) and the penalized pixel-fitting (pPXF; Cappellari 2017) method, as described in Appendices A.4 and B.</p><p>In the final step, we merge the on-source M87 data cubes. Roughly half of the pointings were taken with the long axis of KCWI aligned with a PA of -25&#176;and half were oriented perpendicular to this with a PA of -115&#176;. We construct a pair of data cubes, one for each of the two orientations, using the nifcube and gemcube IRAF tasks that are part of Gemini's data reduction software. We input the fully calibrated KCWI data cubes (the "_icubes.fits" files) and map the cubes onto a shared grid with a spacing of 0 3 &#215; 1 4 &#215; 1 &#197;. This choice of spaxel size matches the native scale of the individual KCWI data cubes for our observational setup.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A.2. Line-spread Function</head><p>We find that our selected spectrograph configuration produces a line-spread function (LSF) that is distinctly non-Gaussian (Figure <ref type="figure">A1</ref>). The LSF is instead described well by the convolution of a Gaussian function and a top-hat function of the form</p><p>where &#928;(x) = 1 if |x| 1/2 and 0 otherwise, &#916; is the full width of the top-hat component, and &#963; is the standard deviation of the Gaussian. To measure the widths of the Gaussian and top-hat components of the LSF, we simultaneously fit 31 lines of an FeAr arc lamp spectrum between 4500 and 5000 &#197; and determine &#916; = 5.105 &#197; and &#963; = 0.627 &#197;. Repeating this procedure on different spectral or spatial regions yields comparable best-fit parameters.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A.3. Point-spread Function</head><p>During the first night of observations, we took KCWI data of the inner region of M87 and the atmospheric seeing was estimated to be 0 63 by the differential image motion monitor at the nearby Canada-France-Hawaii Telescope weather station. This estimate is consistent with the broadening of point sources measured from exposures taken with the guider camera. During the other four nights, we observed the outer regions of M87 and measured similar seeings. While running stellar-dynamical models, described in Section 4, we use a PSF that is a Gaussian with a full width at half maximum (FWHM) of 0 63 (&#963; = 0 28).</p><p>Figure <ref type="figure">A1</ref>. Line-spread function of Keck KCWI with BL grating. We find the LSF of KCWI BL grating to be approximated well by a Gaussian function convolved with a top-hat function, as shown in Equation (A1). To measure the shape of the LSF, we simultaneously fit 31 lines of an FeAr lamp spectrum as described in Appendix A.2. Here, we plot a superposition of the nine most prominent of those lines. Black points mark the flux in the lamp spectrum around each line after normalizing for each line's amplitude. Our best-fit LSF model (green) has a top-hat function of width &#916; = 5.105 &#197; convolved with a Gaussian function of &#963; = 0.627 &#197;. A single Gaussian function, as is typically assumed, would provide a very poor fit to the KCWI LSF (red).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A.4. PCA Decomposition of Sky Features</head><p>As part of the process to remove residual sky features seen in the reduced M87 data cubes, we perform a PCA decomposition of the sky spectra. For each sky cube, we apply a conservative spatial masking of possible sources in the field and coadd the unmasked pixels to obtain a high-S/N sky spectrum. A weighted expectation-maximization PCA <ref type="bibr">(Bailey 2012</ref>) is then applied to each of the sky spectra between 3800 and 5650 &#197;. Since the amplitudes of the 4861 &#197; H&#946;, 5200 &#197; [N I], and the 5577 &#197; [O I] lines are highly variable and are not captured well with a PCA decomposition (van Dokkum et al. 2019), we mask these features. The first PCA component is effectively the mean sky spectrum. The second and fourth components capture slight variations in the shape of the continuum and the Ca H and K features. The third and fifth components capture variations in the numerous OH lines. While we obtain measurements of the first ten components, the fifth component and beyond are consistent with noise. A similar routine was previously applied to KCWI observations <ref type="bibr">(van Dokkum et al. 2019)</ref>, and this method is similar in spirit to the Zurich Atmospheric Purge (ZAP; <ref type="bibr">Soto et al. 2016</ref>) used for MUSE observations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A.5. Spectral and Spatial Masking</head><p>We mask nine spectral features, which together span a total of 274 &#197; (Figure <ref type="figure">A2</ref>). The masked features include emission lines that are prominent at the nucleus, as well as the 4861 &#197; H&#946;, 5200 &#197; [N I], and 5577 &#197; [O I] lines that are masked in the PCA decomposition. The Mg I b region (5184-5234 &#197;) is also masked because it is contaminated by the 5200 &#197; [N I] skyline and is coincident with Fe emission features at M87&#700;s redshift.</p><p>We also apply a spatial mask to exclude potentially contaminant spaxels. This is done by collapsing the data cubes spectrally, flagging regions of spaxels with substantially higher surface brightness than their surroundings, and then masking the brightest spaxels in those regions. This process removes the spaxels that are contaminated by the prominent jet, the central &#8764;0 85 that is affected by the active galactic nucleus (AGN), and numerous bright globular clusters.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A.6. Spatial Binning</head><p>We use the vorbin package <ref type="bibr">(Cappellari &amp; Copin 2003)</ref> to construct spatial bins and obtain coadded KCWI stellar spectra with uniformly high S/N. By default, vorbin calculates the S/N of each coadded spectrum based on values of the signal and the noise of the individual spaxel spectrum given by the user, adding the signals linearly and the noise in quadrature. Instead of this default setting, we modify the sn_func() routine in vorbin&#700;s voronoi_2d_binning to nonanalytically recompute the S/N from the M87 datacube while binning. This approach improves the uniformity of the resultant S/N across the bins, as it naturally incorporates spatial correlations in the signal and noise between spaxels. We estimate the S/N by first smoothing the spectrum with a Gaussian kernel with FWHM = 4 &#197;, comparable to the LSF. The noise is taken to be the root-mean-square (rms) difference between the raw and smoothed spectra, while the signal is taken to be the median flux of the raw spectrum. We apply the spectral masks described above before smoothing to avoid contamination from sharp features in the spectra.</p><p>This procedure results in 461 spatial bins and a coadded spectrum for each of the bins. The S/N per &#197; ranges from about 200 in the central regions to about 100 in the outer regions. Figure <ref type="figure">A2</ref> shows a series of representative KCWI spectra (black curves) for ten of the 461 spatial bins located at projected radii of 1&#8243;-130&#8243;.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Appendix B Stellar Kinematic Determination</head><p>We measure the stellar LOSVD for each of the 461 binned spectra using pPXF <ref type="bibr">(Cappellari 2017)</ref>. With pPXF, we convolve a linear combination of template stars with an LOSVD, parameterized by V, &#963;, and high-order Gauss-Hermite moments h 3 -h 8 that account for asymmetric and symmetric deviations from a Gaussian velocity distribution (van der Marel &amp; Franx 1993). The S/N ratios of our data enable the measurement of high-order Gauss-Hermite moments. We find that truncating the series at h 4 (or h 6 ) results in elevated values for h 4 (or h 6 ), but when fitting to h 8 or h 12 , the values of h 4 and h 6 converge and the highest extracted moments become consistent with 0, as seen in past work <ref type="bibr">(Liepold et al. 2020;</ref><ref type="bibr">Pilawa et al. 2022)</ref>. In addition, we find it important to constrain the kinematic moments beyond h 4 in dynamical modeling. When those moments are not constrained in orbit models, the models are prone to producing LOSVDs with unphysical features due to large values in the high-order moments, potentially biasing the preferred model parameters <ref type="bibr">(Liepold et al. 2020;</ref><ref type="bibr">Quenneville et al. 2021</ref>). For stellar templates, we use the MILES library <ref type="bibr">(S&#225;nchez-Bl&#225;zquez et al. 2006;</ref><ref type="bibr">Falc&#243;n-Barroso et al. 2011</ref>) but select 485 spectra out of the full 985 templates that have well-identified spectral types and luminosity classifications. These stellar templates have a higher spectral resolution than our observations and are degraded to match the KCWI (non-Gaussian) LSF before fitting with pPXF.</p><p>During the kinematic fit, we use an additive polynomial of degree one and a multiplicative polynomial of degree 15 to model the stellar continuum. We also supply the PCA components that describe the sky background to pPXF. This procedure results in a weighted combination of the PCA components, which is included as an additional additive term to match the residual sky features in the M87 spectra that remained after the KCWI pipeline's default sky subtraction. Ultimately, we use the first ten PCA components, but we find that the extracted Gauss-Hermite moments are unchanged as long as at least the first five PCA components are included in the fit.</p><p>Because we excluded three highly variable sky lines during the PCA decomposition process, we also mask those spectral regions when running pPXF, as well as emission lines associated with M87 and the Mg I b region, as described previously. In contrast to the other masked regions, we find that the extracted Gauss-Hermite moments depend strongly on the endpoints of Mg I b mask and only stabilize once the entire 5184-5234 &#197; region is excluded from the fit.</p><p>Altogether, we fit the Gauss-Hermite moments, polynomial coefficients, template weights, and sky weights simultaneously. The stellar templates broadened by the best-fit LOSVD provide excellent fits to each of the observed spectra, as illustrated by the red curves for the ten representative spectra shown in Figure <ref type="figure">A2</ref>.</p><p>The measurement uncertainties on the LOSVDs are determined as follows. After an initial fit to each binned spectrum, we perturb the spectrum at a given wavelength by drawing a random number from a Gaussian distribution centered on the spectrum and with a dispersion equal to the rms of the pPXF residuals from the preliminary fit at that wavelength. We perform 1000 such perturbed fits with the pPXF bias parameter set to 0 and determine the mean and standard deviation of each moment over those 1000 realizations, which we adopt as the kinematic value and its 1&#963; uncertainty. For bins in the central 100&#8243; &#215; 100&#8243; region, the mean error on V is 2.6 km s -1 and that on &#963; is 3.0 km s -1 . The mean errors on h 3 through h 8 are similar, spanning from 0.009 to 0.016. The typical errors in the outer bins are slightly larger, with mean errors on V, &#963;, and h 3 through h 8 of 2.6 km s -1 , 3.4 km s -1 , and 0.012-0.022, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Appendix C Surface Brightness of M87</head><p>Besides the stellar kinematics, another constraint used in the dynamical models is the galaxy's luminosity density. We use a previously published V-band light profile, along with measurements of the ellipticity and PA of the isophotes <ref type="bibr">(Kormendy et al. 2009</ref>). The profile extends from 0 017 to 2400&#8243; and comes from a combination of ground-based data and highresolution HST images, which have been deconvolved to remove the effects of the PSF as well as the AGN.</p><p>We fit the sum of multiple two-dimensional Gaussians to the composite surface photometry. These MGE <ref type="bibr">(Cappellari 2002)</ref> approximations are commonly used because they are able to match the surface brightnesses of galaxies while also enabling analytical deprojections to obtain intrinsic luminosity densities. Our best-fit MGE reproduces the surface brightness between 0 1 and 500&#8243; within 10%. This MGE has 11 Gaussian components that share the same center and PA of -25&#176;. While the value of the PA in <ref type="bibr">Kormendy et al. (2009)</ref> varies within 50&#8243;, the isophotes between 1&#8243; and 50&#8243; are very round with ellipticity &#242; &#61576; 0.08; using a constant PA in our MGE therefore does not affect the quality of the fit. The MGE parameters are given in Table <ref type="table">2</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Appendix D Orbit Modeling</head><p>Radial profiles of the M * /L ratio and stellar kinematics used in the orbit modeling in this work are shown in Figures <ref type="figure">D1</ref> and <ref type="figure">D2</ref>, respectively. Notes. For each of the 11 two-dimensional Gaussian components, the first column lists the central surface brightness density, the middle column lists the dispersion of the Gaussian, and the last column lists the axis ratio, where primed variables denote projected quantities. We obtain the MGE by fitting to the V-band light profile in <ref type="bibr">Kormendy et al. (2009)</ref>. To impose an M * /L gradient in the dynamical models, the I k values are adjusted to reproduce the profile in Figure <ref type="figure">D1</ref>. and inner M * /L of 8.65 0.15 0.10 M e /L e .</p><p>ORCID iDs Emily R. Liepold https:/ /orcid.org/0000-0002-7703-7077 Chung-Pei Ma https:/ /orcid.org/0000-0002-4430-102X Jonelle L. Walsh https:/ /orcid.org/0000-0002-1881-5908 The observed Keck KCWI moments (gray) are matched well by the moments predicted by the best-fit model (red) given by Table <ref type="table">1</ref>. The triaxial orbit models produce point-symmetric LOSVDs, so we have point-symmetrized the kinematic moments before fitting.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>The Astrophysical Journal Letters, 945:L35 (11pp), 2023 March 10 Liepold, Ma, &amp; Walsh</p></note>
		</body>
		</text>
</TEI>
