<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Evidence for &lt;math display='inline'&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mrow&gt;&lt;mi&gt;B&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo stretchy='false'&gt;→&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;/msup&gt;&lt;mi&gt;ν&lt;/mi&gt;&lt;mover accent='true'&gt;&lt;mi&gt;ν&lt;/mi&gt;&lt;mo stretchy='false'&gt;¯&lt;/mo&gt;&lt;/mover&gt;&lt;/mrow&gt;&lt;/math&gt; decays</title></titleStmt>
			<publicationStmt>
				<publisher>https://doi.org/10.1103/PhysRevD.109.112006</publisher>
				<date>06/01/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10527115</idno>
					<idno type="doi">10.1103/PhysRevD.109.112006</idno>
					<title level='j'>Physical Review D</title>
<idno>2470-0010</idno>
<biblScope unit="volume">109</biblScope>
<biblScope unit="issue">11</biblScope>					

					<author>I Adachi</author><author>K Adamczyk</author><author>L Aggarwal</author><author>H Ahmed</author><author>H Aihara</author><author>N Akopov</author><author>A Aloisio</author><author>N Anh_Ky</author><author>D M Asner</author><author>H Atmacan</author><author>T Aushev</author><author>V Aushev</author><author>M Aversano</author><author>V Babu</author><author>H Bae</author><author>S Bahinipati</author><author>P Bambade</author><author>Sw Banerjee</author><author>S Bansal</author><author>M Barrett</author><author>J Baudot</author><author>M Bauer</author><author>A Baur</author><author>A Beaubien</author><author>F Becherer</author><author>J Becker</author><author>P K Behera</author><author>J V Bennett</author><author>F U Bernlochner</author><author>V Bertacchi</author><author>M Bertemes</author><author>E Bertholet</author><author>M Bessner</author><author>S Bettarini</author><author>B Bhuyan</author><author>F Bianchi</author><author>T Bilka</author><author>D Biswas</author><author>A Bobrov</author><author>D Bodrov</author><author>A Bolz</author><author>J Borah</author><author>A Bozek</author><author>M Bračko</author><author>P Branchini</author><author>R A Briere</author><author>T E Browder</author><author>A Budano</author><author>S Bussino</author><author>M Campajola</author><author>L Cao</author><author>G Casarosa</author><author>C Cecchi</author><author>J Cerasoli</author><author>M-C Chang</author><author>P Chang</author><author>R Cheaib</author><author>P Cheema</author><author>V Chekelian</author><author>C Chen</author><author>B G Cheon</author><author>K Chilikin</author><author>K Chirapatpimol</author><author>H-E Cho</author><author>K Cho</author><author>S-J Cho</author><author>S-K Choi</author><author>S Choudhury</author><author>J Cochran</author><author>L Corona</author><author>L M Cremaldi</author><author>S Cunliffe</author><author>S Das</author><author>F Dattola</author><author>E De_La_Cruz-Burelo</author><author>S A De_La_Motte</author><author>G 
De_Nardo</author><author>M De_Nuccio</author><author>G De_Pietro</author><author>R de_Sangro</author><author>M Destefanis</author><author>S Dey</author><author>A De_Yta-Hernandez</author><author>R Dhamija</author><author>A Di_Canto</author><author>F Di_Capua</author><author>J Dingfelder</author><author>Z Doležal</author><author>I Domínguez_Jiménez</author><author>T V Dong</author><author>M Dorigo</author><author>K Dort</author><author>D Dossett</author><author>S Dreyer</author><author>S Dubey</author><author>G Dujany</author><author>P Ecker</author><author>M Eliachevitch</author><author>D Epifanov</author><author>Y Fan</author><author>P Feichtinger</author><author>T Ferber</author><author>D Ferlewicz</author><author>T Fillinger</author><author>C Finck</author><author>G Finocchiaro</author><author>A Fodor</author><author>F Forti</author><author>B G Fulsom</author><author>A Gabrielli</author><author>E Ganiev</author><author>M Garcia-Hernandez</author><author>R Garg</author><author>A Garmash</author><author>G Gaudino</author><author>V Gaur</author><author>A Gaz</author><author>A Gellrich</author><author>G Ghevondyan</author><author>D Ghosh</author><author>H Ghumaryan</author><author>G Giakoustidis</author><author>R Giordano</author><author>A Giri</author><author>A Glazov</author><author>B Gobbo</author><author>R Godang</author><author>O Gogota</author><author>P Goldenzweig</author><author>P Grace</author><author>W Gradl</author><author>T Grammatico</author><author>S Granderath</author><author>E Graziani</author><author>D Greenwald</author><author>Z Gruberová</author><author>T Gu</author><author>Y Guan</author><author>K Gudkova</author><author>S Halder</author><author>Y Han</author><author>T Hara</author><author>K Hayasaka</author><author>H Hayashii</author><author>S Hazra</author><author>C Hearty</author><author>M T Hedges</author><author>A Heidelbach</author><author>I Heredia_de_la_Cruz</author><author>M Hernández_Villanueva</author><author>A 
Hershenhorn</author><author>T Higuchi</author><author>E C Hill</author><author>M Hoek</author><author>M Hohmann</author><author>P Horak</author><author>C-L Hsu</author><author>T Humair</author><author>T Iijima</author><author>K Inami</author><author>G Inguglia</author><author>N Ipsita</author><author>A Ishikawa</author><author>S Ito</author><author>R Itoh</author><author>M Iwasaki</author><author>P Jackson</author><author>W W Jacobs</author><author>D E Jaffe</author><author>E-J Jang</author><author>Q P Ji</author><author>S Jia</author><author>Y Jin</author><author>A Johnson</author><author>K K Joo</author><author>H Junkerkalefeld</author><author>H Kakuno</author><author>M Kaleta</author><author>D Kalita</author><author>A B Kaliyar</author><author>J Kandra</author><author>K H Kang</author><author>S Kang</author><author>G Karyan</author><author>T Kawasaki</author><author>F Keil</author><author>C Ketter</author><author>C Kiesling</author><author>C-H Kim</author><author>D Y Kim</author><author>K-H Kim</author><author>Y-K Kim</author><author>H Kindo</author><author>K Kinoshita</author><author>P Kodyš</author><author>T Koga</author><author>S Kohani</author><author>K Kojima</author><author>T Konno</author><author>A Korobov</author><author>S Korpar</author><author>E Kovalenko</author><author>R Kowalewski</author><author>T_M G Kraetzschmar</author><author>P Križan</author><author>P Krokovny</author><author>Y Kulii</author><author>T Kuhr</author><author>J Kumar</author><author>M Kumar</author><author>R Kumar</author><author>K Kumara</author><author>T Kunigo</author><author>A Kuzmin</author><author>Y-J Kwon</author><author>S Lacaprara</author><author>Y-T Lai</author><author>T Lam</author><author>J S Lange</author><author>M Laurenza</author><author>K Lautenbach</author><author>R Leboucher</author><author>F R Le_Diberder</author><author>P Leitl</author><author>D Levit</author><author>P M Lewis</author><author>C Li</author><author>L K Li</author><author>J Libby</author><author>Q 
Y Liu</author><author>Z Q Liu</author><author>D Liventsev</author><author>S Longo</author><author>A Lozar</author><author>T Lueck</author><author>T Luo</author><author>C Lyu</author><author>Y Ma</author><author>M Maggiora</author><author>S P Maharana</author><author>R Maiti</author><author>G Mancinelli</author><author>R Manfredi</author><author>E Manoni</author><author>A C Manthei</author><author>M Mantovano</author><author>D Marcantonio</author><author>S Marcello</author><author>C Marinas</author><author>L Martel</author><author>C Martellini</author><author>A Martini</author><author>T Martinov</author><author>L Massaccesi</author><author>M Masuda</author><author>T Matsuda</author><author>K Matsuoka</author><author>D Matvienko</author><author>S K Maurya</author><author>J A McKenna</author><author>R Mehta</author><author>F Meier</author><author>M Merola</author><author>F Metzner</author><author>M Milesi</author><author>C Miller</author><author>M Mirra</author><author>K Miyabayashi</author><author>H Miyake</author><author>R Mizuk</author><author>G B Mohanty</author><author>N Molina-Gonzalez</author><author>S Mondal</author><author>S Moneta</author><author>H-G Moser</author><author>M Mrvar</author><author>R Mussa</author><author>I Nakamura</author><author>K R Nakamura</author><author>M Nakao</author><author>H Nakazawa</author><author>Y Nakazawa</author><author>A Narimani_Charan</author><author>M Naruki</author><author>Z Natkaniec</author><author>A Natochii</author><author>L Nayak</author><author>M Nayak</author><author>G Nazaryan</author><author>C Niebuhr</author><author>N K Nisar</author><author>S Nishida</author><author>S Ogawa</author><author>Y Onishchuk</author><author>H Ono</author><author>Y Onuki</author><author>P Oskin</author><author>F Otani</author><author>P Pakhlov</author><author>G Pakhlova</author><author>A Paladino</author><author>A Panta</author><author>E Paoloni</author><author>S Pardi</author><author>K Parham</author><author>H Park</author><author>S-H 
Park</author><author>B Paschen</author><author>A Passeri</author><author>S Patra</author><author>S Paul</author><author>T K Pedlar</author><author>R Peschke</author><author>R Pestotnik</author><author>F Pham</author><author>M Piccolo</author><author>L E Piilonen</author><author>P_L M Podesta-Lerma</author><author>T Podobnik</author><author>S Pokharel</author><author>L Polat</author><author>C Praz</author><author>S Prell</author><author>E Prencipe</author><author>M T Prim</author><author>M V Purohit</author><author>H Purwar</author><author>N Rad</author><author>P Rados</author><author>G Raeuber</author><author>S Raiz</author><author>N Rauls</author><author>M Reif</author><author>S Reiter</author><author>M Remnev</author><author>I Ripp-Baudot</author><author>G Rizzo</author><author>L B Rizzuto</author><author>S H Robertson</author><author>M Roehrken</author><author>J M Roney</author><author>A Rostomyan</author><author>N Rout</author><author>G Russo</author><author>Y Sakai</author><author>D A Sanders</author><author>S Sandilya</author><author>A Sangal</author><author>L Santelj</author><author>Y Sato</author><author>V Savinov</author><author>B Scavino</author><author>C Schmitt</author><author>M Schnepf</author><author>C Schwanda</author><author>A J Schwartz</author><author>Y Seino</author><author>A Selce</author><author>K Senyo</author><author>J Serrano</author><author>M E Sevior</author><author>C Sfienti</author><author>W Shan</author><author>C Sharma</author><author>X D Shi</author><author>T Shillington</author><author>T Shimasaki</author><author>J-G Shiu</author><author>D Shtol</author><author>A Sibidanov</author><author>F Simon</author><author>J B Singh</author><author>J Skorupa</author><author>R J Sobie</author><author>M Sobotzik</author><author>A Soffer</author><author>A Sokolov</author><author>E Solovieva</author><author>S Spataro</author><author>B Spruck</author><author>M Starič</author><author>P Stavroulakis</author><author>S Stefkova</author><author>Z S 
Stottler</author><author>R Stroili</author><author>J Strube</author><author>M Sumihama</author><author>K Sumisawa</author><author>W Sutcliffe</author><author>H Svidras</author><author>M Takahashi</author><author>M Takizawa</author><author>U Tamponi</author><author>S Tanaka</author><author>K Tanida</author><author>F Tenchini</author><author>A Thaller</author><author>O Tittel</author><author>R Tiwary</author><author>D Tonelli</author><author>E Torassa</author><author>N Toutounji</author><author>K Trabelsi</author><author>I Tsaklidis</author><author>M Uchida</author><author>I Ueda</author><author>Y Uematsu</author><author>T Uglov</author><author>K Unger</author><author>Y Unno</author><author>K Uno</author><author>S Uno</author><author>P Urquijo</author><author>Y Ushiroda</author><author>S E Vahsen</author><author>R van_Tonder</author><author>G S Varner</author><author>K E Varvell</author><author>M Veronesi</author><author>A Vinokurova</author><author>V S Vismaya</author><author>L Vitale</author><author>R Volpe</author><author>B Wach</author><author>M Wakai</author><author>H M Wakeling</author><author>S Wallner</author><author>E Wang</author><author>M-Z Wang</author><author>X L Wang</author><author>Z Wang</author><author>A Warburton</author><author>M Watanabe</author><author>S Watanuki</author><author>M Welsch</author><author>C Wessel</author><author>E Won</author><author>X P Xu</author><author>B D Yabsley</author><author>S Yamada</author><author>W Yan</author><author>S B Yang</author><author>J Yelton</author><author>J H Yin</author><author>Y M Yook</author><author>K Yoshihara</author><author>C Z Yuan</author><author>Y Yusa</author><author>L Zani</author><author>V Zhilich</author><author>J S Zhou</author><author>Q D Zhou</author><author>X Y Zhou</author><author>V I Zhukova</author><author>Belle_II_Collaboration</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<p>We search for the rare decay<math display='inline'><msup><mi>B</mi><mo>+</mo></msup><mo stretchy='false'>→</mo><msup><mi>K</mi><mo>+</mo></msup><mi>ν</mi><mover accent='true'><mi>ν</mi><mo stretchy='false'>¯</mo></mover></math>in a<math display='inline'><mn>362</mn><mtext></mtext><mtext></mtext><msup><mi>fb</mi><mrow><mo>−</mo><mn>1</mn></mrow></msup></math>sample of electron-positron collisions at the<math display='inline'><mrow><mi mathvariant='normal'>ϒ</mi><mo stretchy='false'>(</mo><mn>4</mn><mi>S</mi><mo stretchy='false'>)</mo></mrow></math>resonance collected with the Belle II detector at the SuperKEKB collider. We use the inclusive properties of the accompanying<math display='inline'><mi>B</mi></math>meson in<math display='inline'><mrow><mi mathvariant='normal'>ϒ</mi><mo stretchy='false'>(</mo><mn>4</mn><mi>S</mi><mo stretchy='false'>)</mo><mo stretchy='false'>→</mo><mrow><mi>B</mi><mover accent='true'><mrow><mi>B</mi></mrow><mrow><mo stretchy='false'>¯</mo></mrow></mover></mrow></mrow></math>events to suppress background from other decays of the signal<math display='inline'><mi>B</mi></math>candidate and light-quark pair production. We validate the measurement with an auxiliary analysis based on a conventional hadronic reconstruction of the accompanying<math display='inline'><mi>B</mi></math>meson. For background suppression, we exploit distinct signal features using machine learning methods tuned with simulated data. The signal-reconstruction efficiency and background suppression are validated through various control channels. The branching fraction is extracted in a maximum likelihood fit. 
Our inclusive and hadronic analyses yield consistent results for the<math display='inline'><msup><mi>B</mi><mo>+</mo></msup><mo stretchy='false'>→</mo><msup><mi>K</mi><mo>+</mo></msup><mi>ν</mi><mover accent='true'><mi>ν</mi><mo stretchy='false'>¯</mo></mover></math>branching fraction of<math display='inline'><mrow><mo stretchy='false'>[</mo><mn>2.7</mn><mo>±</mo><mn>0.5</mn><mo stretchy='false'>(</mo><mrow><mi>stat</mi></mrow><mo stretchy='false'>)</mo><mo>±</mo><mn>0.5</mn><mo stretchy='false'>(</mo><mrow><mi>syst</mi></mrow><mo stretchy='false'>)</mo><mo stretchy='false'>]</mo></mrow><mo>×</mo><msup><mn>10</mn><mrow><mo>−</mo><mn>5</mn></mrow></msup></math>and<math display='inline'><mrow><mo stretchy='false'>[</mo><msubsup><mn>1.1</mn><mrow><mo>−</mo><mn>0.8</mn></mrow><mrow><mo>+</mo><mn>0.9</mn></mrow></msubsup><mo stretchy='false'>(</mo><mrow><mi>stat</mi></mrow><msubsup><mo stretchy='false'>)</mo><mrow><mo>−</mo><mn>0.5</mn></mrow><mrow><mo>+</mo><mn>0.8</mn></mrow></msubsup><mo stretchy='false'>(</mo><mrow><mi>syst</mi></mrow><mo stretchy='false'>)</mo><mo stretchy='false'>]</mo></mrow><mo>×</mo><msup><mn>10</mn><mrow><mo>−</mo><mn>5</mn></mrow></msup></math>, respectively. Combining the results, we determine the branching fraction of the decay<math display='inline'><msup><mi>B</mi><mo>+</mo></msup><mo stretchy='false'>→</mo><msup><mi>K</mi><mo>+</mo></msup><mi>ν</mi><mover accent='true'><mi>ν</mi><mo stretchy='false'>¯</mo></mover></math>to be<math display='inline'><mrow><mo stretchy='false'>[</mo><mn>2.3</mn><mo>±</mo><mn>0.5</mn><mo stretchy='false'>(</mo><mrow><mi>stat</mi></mrow><msubsup><mo stretchy='false'>)</mo><mrow><mo>−</mo><mn>0.4</mn></mrow><mrow><mo>+</mo><mn>0.5</mn></mrow></msubsup><mo stretchy='false'>(</mo><mrow><mi>syst</mi></mrow><mo stretchy='false'>)</mo><mo stretchy='false'>]</mo></mrow><mo>×</mo><msup><mn>10</mn><mrow><mo>−</mo><mn>5</mn></mrow></msup></math>, providing the first evidence for this decay at 3.5 standard deviations. 
The combined result is 2.7 standard deviations above the standard model expectation.</p> <sec><supplementary-material><permissions><copyright-statement>Published by the American Physical Society</copyright-statement><copyright-year>2024</copyright-year></permissions></supplementary-material></sec>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Flavor-changing neutral-current transitions, such as b &#8594; s&#957;&#957;&#772; and b &#8594; sll, where l represents a charged lepton, are suppressed in the standard model (SM) of particle physics because of the Glashow-Iliopoulos-Maiani mechanism <ref type="bibr">[1]</ref>. These transitions can only occur at higher orders in SM perturbation theory through weak-interaction amplitudes that involve the exchange of at least two gauge bosons. Rate predictions for b &#8594; sll have significant theoretical uncertainties from the breakdown of factorization due to photon exchange <ref type="bibr">[2]</ref>. This process does not contribute to b &#8594; s&#957;&#957;&#772;, so the corresponding rate predictions are relatively precise.</p><p>The b &#8594; s&#957;&#957;&#772; transition provides the leading amplitudes for the B⁺ &#8594; K⁺&#957;&#957;&#772; decay in the SM, as shown in Fig. <ref type="figure">1</ref>. The SM branching fraction of the B⁺ &#8594; K⁺&#957;&#957;&#772; decay <ref type="bibr">[3]</ref> is predicted in Ref. <ref type="bibr">[4]</ref> to be</p><p>including a contribution of (0.61 &#177; 0.06) &#215; 10⁻⁶ from the long-distance double-charged-current</p><p>decay rate can be significantly modified in models that predict non-SM particles, such as leptoquarks <ref type="bibr">[5]</ref>. In addition, the B⁺ meson could decay into a kaon and an undetectable particle, such as an axion <ref type="bibr">[6]</ref> or a dark-sector mediator <ref type="bibr">[7]</ref>. In all analyses reported to date <ref type="bibr">[8]</ref><ref type="bibr">[9]</ref><ref type="bibr">[10]</ref><ref type="bibr">[11]</ref><ref type="bibr">[12]</ref><ref type="bibr">[13]</ref>, no evidence for a signal has been found, and the current experimental upper limit on the branching fraction is 1.6 &#215; 10⁻⁵ at the 90% confidence level <ref type="bibr">[14]</ref>. The study of the B⁺ &#8594; K⁺&#957;&#957;&#772; decay is experimentally challenging as the final state contains two neutrinos that are not reconstructed. This prevents the full reconstruction of the kinematic properties of the decay, hindering the differentiation of signal distributions from background.</p><p>FIG. <ref type="figure">1</ref>. Lowest-order quark-level diagrams for the B⁺ &#8594; K⁺&#957;&#957;&#772; decay in the SM are either of the penguin (a), or box type (b): examples are shown. The long-distance double-charged-current diagram (c) arising at tree level in the SM also contributes to the</p><p>Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI. Funded by SCOAP³.</p><p>In this study the signal B meson is produced in the e⁺e⁻ &#8594; &#978;(4S) &#8594; B⁺B⁻ process. The at-threshold production of BB&#772; pairs helps to mitigate the limitations due to the unconstrained kinematics, as the partner B meson can be used to infer the presence and properties of the signal B. An inclusive tagging analysis method (ITA), exploiting inclusive properties from the B meson pair-produced along with the signal B, is applied to the entire Belle II data set currently available, superseding the results of Ref. <ref type="bibr">[13]</ref>, where this method was first used. In addition, an auxiliary analysis using the well-established hadronic tagging analysis method (HTA) <ref type="bibr">[9,</ref><ref type="bibr">10]</ref> is presented; this involves explicit reconstruction of the partner B meson through a hadronic decay. The HTA method offers an important consistency check of the newer inclusive tagging method and helps validate the ITA results. 
In addition, the small size of the overlap between the HTA and ITA samples allows for a straightforward combination of the results, achieving a 10% increase in precision over the ITA result alone.</p><p>The ITA commences with the reconstruction of charged and neutral particles, followed by the selection of a single signal kaon candidate in events with one or more kaons. Subsequently, relevant quantities are computed using the kaon candidate, along with the remaining particles in the event, to discriminate between signal and background processes. These quantities are used in boosted decision trees (BDTs) <ref type="bibr">[15,</ref><ref type="bibr">16]</ref> that are optimized and trained using simulated data. A signal region is then defined, and a binned profile-likelihood sample-composition fit is carried out on data. This fit uses simulated samples to provide predictions to determine the branching fraction of the B⁺ &#8594; K⁺&#957;&#957;&#772; decay along with the rates of background processes. The fit incorporates systematic uncertainties arising from detector and physics-modeling imperfections as nuisance parameters. To validate the modeling of signal and background processes in simulation, several control channels are employed. The method is further validated through a closure-test measurement of the branching fraction of the B⁺ &#8594; &#960;⁺K⁰ decay.</p><p>The HTA follows a similar method, but begins with the reconstruction of the partner B meson, and then proceeds to the definition of the signal candidate.</p><p>Except for the tagging method, the two analyses are similar in terms of particle reconstruction, event selection, usage of control samples, fit strategy, and treatment of common systematic uncertainties. In what follows, common approaches and details of the ITA are given first, followed by the HTA-specific details.</p><p>The paper is organized as follows. The data and simulated samples are presented in Sec. 
II, followed by the Belle II detector description in Sec. III. The initial event selection and reconstruction of the decays are described in Sec. IV. Corrections introduced to the simulated samples are discussed in Sec. V. Section VI details background suppression and final event selection using machine learning methods. Section VII defines the signal region used to extract the B⁺ &#8594; K⁺&#957;&#957;&#772; decay branching fraction. The following two sections, Sec. VIII and Sec. IX, are dedicated to the validation of the modeling of the signal-selection efficiency and background contributions, respectively. Section X documents the statistical approach used to extract the signal, and Sec. XI describes the systematic uncertainties. The results are discussed in Sec. XII, and consistency checks used for validation are presented in Sec. XIII. The combination of the ITA and HTA results is discussed in Sec. XIV. A discussion of the results is presented in Sec. XV. Section XVI concludes the paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. DATA AND SIMULATED SAMPLES</head><p>This search uses data from e⁺e⁻ collisions produced between the years 2019 and 2022 by the SuperKEKB collider <ref type="bibr">[17]</ref>. The on-resonance data, with an integrated luminosity of 362 fb⁻¹ <ref type="bibr">[18]</ref>, are recorded at a center-of-mass (c.m.) energy of &#8730;s = 10.58 GeV, which corresponds to the mass of the &#978;(4S) resonance, and contain N_BB&#772; = (387 &#177; 6) &#215; 10⁶ BB&#772; pairs <ref type="bibr">[19]</ref>. An additional 42 fb⁻¹ off-resonance sample, collected at an energy 60 MeV below the mass of the &#978;(4S) resonance, is used to study background from continuum: e⁺e⁻ &#8594; &#964;⁺&#964;⁻ events and e⁺e⁻ &#8594; qq&#772; events, where q indicates a u, d, s, or c quark.</p><p>Simulated samples are exploited for training multivariate classifiers, estimating signal-selection efficiencies, identifying backgrounds, and defining components of the fits to data. Various event generators are used. The production and decays of charged and neutral B mesons use PYTHIA8 <ref type="bibr">[20]</ref> and EVTGEN <ref type="bibr">[21]</ref>. The KKMC generator <ref type="bibr">[22]</ref> is used to generate the qq&#772; pairs followed by PYTHIA8 to simulate their hadronization and EVTGEN to model the decays of the resulting hadrons. Similarly, KKMC and TAUOLA <ref type="bibr">[23]</ref> are employed to simulate production of e⁺e⁻ &#8594; &#964;⁺&#964;⁻ events and decays of &#964; leptons, respectively. Final-state QED radiation is simulated using PHOTOS <ref type="bibr">[24]</ref>. 
For all samples, the Belle II analysis software <ref type="bibr">[25,</ref><ref type="bibr">26]</ref>, interfaced with GEANT4 <ref type="bibr">[27]</ref>, is used to simulate the detector response and perform event reconstruction.</p><p>The simulated B⁺ &#8594; K⁺&#957;&#957;&#772; signal decays are weighted according to the SM form-factor calculations from Ref. <ref type="bibr">[4]</ref>. Similar weighting is applied to B &#8594; K*(892)&#957;&#957;&#772; background decays [in the following, K*(892) mesons are indicated with</p><p>decays are simulated separately, normalized using the branching fraction from Ref. <ref type="bibr">[4]</ref>, and added to the B⁺B⁻ background.</p><p>The simulation of several other background processes receives additional corrections. Nonresonant three-body B⁺ &#8594; K⁺nn&#772; decays are simulated assuming the threshold-enhancement effect present in the isospin-partner decay <ref type="bibr">[29]</ref> and assuming equal probabilities for the</p><p>contribution and nonresonant p-wave contribution with parameters taken from the isospin-related decay B⁰ &#8594; K⁰_S K⁺K⁻, as measured in Ref. <ref type="bibr">[29]</ref>. The PHOKHARA event generator <ref type="bibr">[30]</ref> is used to simulate e⁺e⁻ &#8594; &#981;(&#8594; K⁰_S K⁰_L)&#947; events, which are used for additional studies.</p><p>The simulated continuum samples are normalized based on the known cross sections and integrated luminosity. Both the simulated BB&#772; background and signal samples are scaled using N_BB&#772;, where the number of B⁺B⁻ pairs is calculated as f⁺⁻ N_BB&#772;, and the number of B⁰B&#772;⁰ pairs is calculated as <ref type="bibr">[31]</ref>.</p></div>
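<div xmlns="http://www.tei-c.org/ns/1.0"><p>The sample normalization described in this section can be illustrated numerically. The following Python snippet is a minimal sketch and not the collaboration's code: the function names, the placeholder value used for f⁺⁻, the placeholder cross section, and the generated-event count are all assumptions introduced here for illustration. Only the number of B-meson pairs (387 &#215; 10⁶) and the 362 fb⁻¹ integrated luminosity are taken from the text.</p>
<p><code lang="python" xml:space="preserve">
```python
N_BB = 387e6          # number of B-meson pairs, from the text
LUMI_FB = 362.0       # on-resonance integrated luminosity in fb^-1, from the text
NB_TO_FB = 1.0e6      # unit conversion: 1 nb = 10^6 fb


def charged_pairs(n_bb, f_pm):
    """Number of charged B pairs, computed as f+- times N_BB."""
    return f_pm * n_bb


def continuum_weight(sigma_nb, lumi_fb, n_generated):
    """Per-event weight that scales a simulated continuum sample to data."""
    expected = sigma_nb * NB_TO_FB * lumi_fb
    return expected / n_generated


# Placeholder inputs (hypothetical, for illustration only; the value of
# f+- is NOT the one from Ref. [31], and the cross section is not measured).
F_PM_PLACEHOLDER = 0.5
print(charged_pairs(N_BB, F_PM_PLACEHOLDER))
print(continuum_weight(1.0, LUMI_FB, 724.0e6))
```
</code></p>
<p>Under these assumptions, a simulated sample larger than the expected yield receives a per-event weight below one, so the statistical precision of the simulation exceeds that of the data it models.</p></div>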
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. DETECTOR</head><p>A comprehensive description of the Belle II detector is given in Ref. <ref type="bibr">[32]</ref>. The detector consists of several subdetectors arranged in a cylindrical structure around the beam pipe. The innermost subsystem consists of a silicon pixel detector surrounded by a double-sided silicon strip detector, referred to as the silicon vertex detector, and a central drift chamber (CDC). The second layer of the pixel detector covers only one-sixth of the azimuthal angle for the data used in this work. The silicon detectors allow for precise determination of particle-decay vertices while the CDC determines charged-particle momenta and electric charge. A time-of-propagation counter and an aerogel ring-imaging Cherenkov counter cover the barrel and forward endcap regions of the detector, respectively: these subdetectors are important for charged-particle identification (PID). An electromagnetic calorimeter (ECL), used to reconstruct photons and distinguish electrons from other charged particles, occupies the remaining volume inside a superconducting solenoid. This provides a uniform 1.5 T magnetic field, parallel to the detector's principal axis. A dedicated system to identify K⁰_L mesons and muons is installed in the flux return of the solenoid. The z axis of the laboratory frame is collinear with the symmetry axis of the solenoid and almost aligned with the electron-beam direction. The polar angle, as well as the longitudinal and transverse directions, are defined with respect to the z axis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. EVENT SELECTION</head><p>The online-event-selection systems (triggers) for this analysis are based either on the number of charged-particle trajectories (tracks) in the CDC or on the energy deposits in the ECL, and have an efficiency close to 100% for signal decays. In the offline analysis, the reconstruction of charged particles follows the algorithm outlined in Ref. <ref type="bibr">[33]</ref>. For the ITA, to ensure that efficiency is high and well-measured, and to suppress beam-related background, charged particles are required to have a transverse momentum p_T &gt; 0.1 GeV/c and to be within the CDC acceptance (17&#176; &lt; &#952; &lt; 150&#176;). All charged particles except those used to form K⁰_S candidates are required to have minimum longitudinal and transverse distances (impact parameters) from the average interaction point of |d_z| &lt; 3.0 cm and d_r &lt; 0.5 cm, respectively. The K⁰_S candidates are formed by combining pairs of oppositely charged particles in a vertex fit. These candidates are required to have a dipion reconstructed mass between 0.495 and 0.500 GeV/c², vertex p-value greater than 0.001, flight time greater than 0.007 ns (corresponding to about 2 mm displacement from the primary vertex), and cosine of the angle between momentum and flight direction greater than 0.98. Photons are identified as energy deposits exceeding 0.1 GeV detected in the ECL regions within the CDC acceptance, and not matched to tracks. The minimum energy requirement suppresses the beam-related background and energy deposits from charged hadrons that fail the matching to tracks. Each of the charged particles and photons is required to have an energy of less than 5.5 GeV to reject misreconstructed particles and cosmic muons. 
The kaon candidates are selected using particle-identification likelihoods based on information coming primarily from the PID detectors, complemented with information from the silicon strip detector, CDC, and the K⁰_L and muon identification system. To ensure reliable PID, at least 20 deposited-charge measurements are required in the CDC. The chosen PID requirement has 68% efficiency for signal kaons, while the probability to identify a pion as a kaon is 1.2%. Candidates are also required to have at least one deposit in the pixel detector: this improves the impact parameter resolution, and helps to reject background events.</p><p>Events are required to contain no more than ten tracks to suppress background (e.g., high-multiplicity continuum production) with only a 0.5% loss of signal-selection efficiency. Low-track-multiplicity background events, such as those originating from two-photon-collision processes, are suppressed by demanding at least four tracks in the event. This reduces signal-reconstruction efficiency by 7.6%. The total energy from all reconstructed particles in the event must exceed 4 GeV. The polar angle of the missing momentum, computed in the c.m. frame as the complement to the total momentum of all reconstructed particles, must be between 17&#176; and 160&#176;. This range is chosen to remove low-multiplicity events and to ensure that the missing momentum points toward the active detector volume.</p><p>To select the signal kaon in an event, the mass squared of the neutrino pair is computed as</p><p><formula notation="TeX">q^{2}_{\mathrm{rec}} = \frac{s}{4c^{4}} + M_{K}^{2} - \frac{\sqrt{s}\,E^{*}_{K}}{c^{4}},</formula></p><p>assuming the signal B meson to be at rest in the e⁺e⁻ c.m. frame. Here M_K is the known mass of K⁺ mesons and E*_K is the reconstructed energy of the kaon in the c.m. system. Uncertainties in the kinematic properties of the colliding beams have negligible impact on the q²_rec reconstruction. The candidate having the lowest q²_rec is retained for further analysis. 
Studies on simulated signal events show that prior to applying the $q^2_{\rm rec}$ requirement the fraction of events with multiple candidates is 39%. The average number of candidates in such events is 2.2. The lowest-$q^2_{\rm rec}$ candidate is the signal kaon in 96% of cases. Checks using a random selection of the signal candidate, if several candidates are found, indicate no bias in the procedure. The remaining charged particles are fit to a common vertex and are attributed, together with the photons and $K^0_S$ candidates, to the rest of the event (ROE). For the signal events, these charged particles and $K^0_S$ candidates correspond to the decay products of the second $B$ meson.</p><p>The HTA commences with the full reconstruction of a $B$ meson ($B_{\rm tag}$) in one of 36 hadronic $B$ decay modes through the full event interpretation (FEI) <ref type="bibr">[34]</ref>. The FEI is an algorithm based on a hierarchical multivariate approach in which final-state particles are constructed using the tracks and energy deposits in the ECL, and combined into intermediate particles until the final $B_{\rm tag}$ candidates are formed. The algorithm calculates, for each decay chain, the probability of it correctly describing the true process using gradient-boosted decision trees. Only $B_{\rm tag}$ mesons with a probability exceeding 0.001 are retained. In addition, the beam-constrained mass</p><p>MeV are required, where $E^*_B$ and $p^*_B$ are the energy and the magnitude of the three-momentum of the $B_{\rm tag}$ in the c.m. frame, respectively. Signal candidates peak at the known $B^+$ mass in $M_{\rm bc}$ and at zero in $\Delta E$, while continuum events are distributed more uniformly. The FEI algorithm imposes conditions on the charged particles and energy deposits in the ECL similar to those used by the ITA. The algorithm requires at least three tracks and three energy deposits in the ECL, including those that are associated with the tracks. 
Furthermore, events with more than 12 tracks having $d_r &lt; 2$ cm and $|d_z| &lt; 2$ cm are rejected. Such events would have greater multiplicity than the maximum of the reconstructed $B_{\rm tag}$ final states plus the signal-kaon track.</p><p>The signal-kaon candidate track is required to have at least 20 measurements in the CDC and impact parameters $d_r &lt; 0.5$ cm and $|d_z| &lt; 4$ cm, and to satisfy the PID criteria for a kaon. The $B_{\rm tag}$ and signal kaon are required to have opposite charges. The same restrictions on missing momentum are applied as in the ITA. Moreover, the number of tracks with $d_r &lt; 2$ cm, $|d_z| &lt; 4$ cm, and at least 20 measurements in the CDC that are associated with neither the $B_{\rm tag}$ nor the signal kaon is required to be zero.</p><p>The remaining reconstructed objects in the HTA include tracks that meet neither the CDC nor the impact-parameter requirements (extra tracks) and energy deposits in the ECL that are associated with neither the $B_{\rm tag}$ nor the signal kaon (extra photon candidates). Only the energy deposits in the ECL that exceed a $\theta$-dependent energy threshold ranging from 60 MeV to 150 MeV, have a distance from the nearest track extrapolation larger than 50 cm, and are reconstructed within the CDC acceptance are considered. The sum of the energies of these deposits, denoted as $E_{\rm extra}$, and the multiplicity of the extra tracks, denoted as $n_{\rm tracks\,extra}$, are used in the subsequent steps of the analysis. Events are rejected if a $K^0_S$-meson, $\pi^0$-meson, or $\Lambda$-baryon candidate is reconstructed from the extra tracks and photons.</p></div>
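<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ITA signal-kaon choice described in this section can be illustrated with a minimal sketch. It assumes natural units ($c = 1$), takes the neutrino-pair mass squared as $s/4 + M_K^2 - \sqrt{s}E^*_K$ (which follows from four-momentum conservation for a $B$ meson at rest carrying energy $\sqrt{s}/2$), and uses an assumed c.m. energy of 10.58 GeV. The function names and the candidate energies are hypothetical and are not taken from the analysis software.</p>
<eg>
```python
SQRT_S = 10.58      # GeV; assumed c.m. energy near the Upsilon(4S) resonance
M_KAON = 0.493677   # GeV/c^2; known charged-kaon mass


def q2_rec(e_kaon_cm, sqrt_s=SQRT_S, m_kaon=M_KAON):
    """Neutrino-pair mass squared (GeV^2/c^4) for a signal B at rest in the c.m. frame."""
    return sqrt_s ** 2 / 4.0 + m_kaon ** 2 - sqrt_s * e_kaon_cm


def pick_signal_kaon(candidate_energies):
    """Return (index, q2_rec) of the kaon candidate with the lowest q2_rec."""
    q2_values = [q2_rec(energy) for energy in candidate_energies]
    best = min(range(len(q2_values)), key=q2_values.__getitem__)
    return best, q2_values[best]


# Hypothetical c.m. energies (GeV) of three kaon candidates in one event
energies = [1.2, 2.4, 0.8]
index, q2 = pick_signal_kaon(energies)
print(index, round(q2, 3))
```
</eg>
<p>Because $q^2_{\rm rec}$ decreases linearly with the kaon energy, selecting the lowest value is equivalent to selecting the most energetic kaon candidate in the c.m. frame.</p></div>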
<div xmlns="http://www.tei-c.org/ns/1.0"><head>V. CORRECTIONS TO SIMULATED DATA</head><p>The simulation of the detector response is tested using control samples from data, and correction factors are introduced with corresponding systematic uncertainties. Correction factors are applied as weights to the selected events when appropriate, particularly when the corrections impact the efficiency of the signal-kaon selection. In other cases, when the corrections affect the kinematic properties of the particles, these corrections are applied prior to the event selection and computation of related variables. The correction procedure is carried out for the nominal analysis as well as when computing systematic variations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Reconstruction of charged particles</head><p>The efficiency for reconstructing charged particles is studied using $e^+e^- \to \tau^+\tau^-$ events, where one $\tau$ lepton decays into a single charged particle while the other decays into three charged particles <ref type="bibr">[35]</ref>. Simulation agrees well with the data. A systematic uncertainty of 0.3% is introduced for each charged particle to account for uncertainties in the detection efficiency and in the knowledge of the detector geometrical acceptance.</p><p>The reconstruction of kinematic properties of charged particles is validated by comparing the measured pole masses of known resonances with simulation. The simulation reproduces the data with an accuracy better than 0.1%, and any residual differences have a negligible impact on the analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Identification of charged particles</head><p>About 10% of background arises from incorrect particle identification of the signal-kaon candidate. The main contribution is from misidentified pions, while misidentified muons, electrons, and protons have a smaller impact. The efficiency of kaon identification and the misidentification ("fake") rate for pions misidentified as kaons are determined using $D^{*+} \to D^0(\to K^-\pi^+)\pi^+$ decays reconstructed in continuum data and simulation. The small mass difference between $D^{*+}$ and $D^0$ mesons enables the isolation of a pure signal. The charge of the low-momentum pion from the flavor-conserving $D^{*+}$ decay allows the precise identification of the products of the Cabibbo-favored $D^0$ decay, providing abundant and low-background $K^-$ and $\pi^+$ samples.</p><p>Correction factors and their uncertainties are applied to the simulation as functions of the particle's charge, momentum, and polar angle. The correction factors for the pion-to-kaon fake rates are close to a factor of 2, indicating that the simulation underestimates the rate at which pions are misidentified as kaons. The uncertainties associated with these corrections, which are around 1% for efficiencies and 10% for fake rates, are treated as systematic uncertainties. Correction factors for the lepton-to-kaon fake rates are also applied, although their impact is negligible.</p><p>The correction factors for kaon identification efficiency and the pion-to-kaon fake rate are further validated for the signal region of the ITA, using $B^+ \to h^+\bar{D}^0$ decays, where $h^+$ is a $\pi^+$ or a $K^+$, following the procedure outlined below. The decays are reconstructed using the pion mass hypothesis and the nominal kaon identification for the $h^+$ candidate. For $B^+ \to \pi^+\bar{D}^0$ decays, the distribution of the $\Delta E$ variable peaks at zero. Since the $B^+$ is produced almost at rest in the c.m. 
frame, the kaon momentum in the two-body $B^+ \to K^+\bar{D}^0$ decay is expected to be equal to 2.3 GeV/$c$ with a small spread. Combined with the pion mass, this leads to an energy deficit compared to the correct kaon hypothesis, resulting in a shift in $\Delta E$ of $-0.049$ GeV. This characteristic of the $\Delta E$ distribution is used to distinguish between $B^+ \to K^+\bar{D}^0$ and $B^+ \to \pi^+\bar{D}^0$ decays without relying on PID information. The kaon and pion candidates from the $\bar{D}^0$ decay are differentiated by kaon identification: the particle with the higher kaon-identification value is assumed to be a kaon. Only $\bar{D}^0$ candidates with an invariant mass within 3 standard deviations of the known $D^0$ mass <ref type="bibr">[14]</ref> are kept. In addition, the selections $M_{\rm bc} &gt; 5.27$ GeV/$c^2$ and $|\Delta E| &lt; 0.1$ GeV are applied. If several candidates pass the selection, a random one is chosen.</p><p>For the selected $B^+ \to h^+\bar{D}^0$ decays, the information on the $\Delta E$ variable is kept while the tracks from the $\bar{D}^0$ decay are removed, and each event is reconstructed again as a $B^+ \to K^+\nu\bar{\nu}$ event. The same procedure is repeated for both data and simulation. The selected events show a $q^2_{\rm rec}$ distribution peaking between 3 GeV$^2$/$c^4$ and 5 GeV$^2$/$c^4$, corresponding to the $D^0$ mass squared. The events have a $B^+ \to K^+\nu\bar{\nu}$ signal-like signature, and for this $q^2$ range are reconstructed with high efficiency. The distribution of $q^2$ is included in the Supplemental Material <ref type="bibr">[36]</ref>. The distribution of the $\Delta E$ variable for the signal region of the ITA is shown in Fig. <ref type="figure">2</ref>. Two prominent peaks corresponding to $B^+ \to \pi^+\bar{D}^0$ and $B^+ \to K^+\bar{D}^0$ decays are observed. The yields of the two components are extracted in a fit using Gaussian shapes. The double ratio of the $B^+ \to K^+\bar{D}^0$ and $B^+ \to \pi^+\bar{D}^0$ decay rates in data to simulation is $1.03 \pm 0.09$, showing consistency with unity within the statistical uncertainty.</p></div>
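<div xmlns="http://www.tei-c.org/ns/1.0"><p>The size of the $\Delta E$ shift quoted above can be cross-checked with a few lines of arithmetic. The sketch below assigns the pion mass to a kaon of momentum 2.3 GeV/$c$ and compares the resulting energy with the true kaon energy. The mass values are the known charged-pion and charged-kaon masses, the rounded momentum is taken from the text, and the function names are hypothetical.</p>
<eg>
```python
import math

M_PION = 0.13957  # GeV/c^2, charged-pion mass
M_KAON = 0.49368  # GeV/c^2, charged-kaon mass


def energy(momentum, mass):
    """Relativistic energy in GeV for a momentum in GeV/c and a mass in GeV/c^2."""
    return math.sqrt(momentum * momentum + mass * mass)


def delta_e_shift(p_kaon):
    """Energy under the pion hypothesis minus the true kaon energy (GeV)."""
    return energy(p_kaon, M_PION) - energy(p_kaon, M_KAON)


shift = delta_e_shift(2.3)
print(round(shift, 3))
```
</eg>
<p>With the rounded momentum the shift evaluates to roughly $-0.048$ GeV, within about 1 MeV of the quoted value; the residual difference is of the size expected from rounding the kaon momentum.</p></div>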
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Reconstruction of neutral particles</head><p>The photon detection efficiency at Belle II, and the calibration of the photon energy reconstruction, are based on studies of $e^+e^- \to \mu^+\mu^-\gamma$ events. The efficiency is the fraction of events where a photon is reconstructed with momentum consistent with the expectation from recoil against the $\mu^+\mu^-$ system <ref type="bibr">[37]</ref>. The resulting uncertainty is negligible for this analysis.</p><p>The uncertainty on the photon energy is 0.5%. The effect on signal yield in the fit is estimated by applying this uncertainty to energy deposits from photon candidates matched to simulated photons.</p><p>FIG. 2. Distribution of $\Delta E$ in data (dots with error bars) obtained for $B^+ \to h^+\bar{D}^0$ candidates, computed assuming a pion mass hypothesis for $h^+$. The blue solid line represents the fit result to the data, modeled as a sum of two Gaussian shapes corresponding to $B^+ \to \pi^+\bar{D}^0$ and $B^+ \to K^+\bar{D}^0$ events, with the daughters from the $\bar{D}^0$ decays removed, and chosen to be in the signal region of the ITA.</p><p>The simulated sample shows that photon candidates have 30% contamination from beam-related background, energy deposits from charged hadrons that are reconstructed away from the particle trajectory, and from neutral hadrons. These deposits are not matched to simulated photons ("unmatched"). The bias in the reconstructed energy for these sources ("hadronic energy correction") is studied using the summed energy of the photon candidates in the ROE ("summed neutral energy", studied in $B^+ \to K^+ J/\psi$ events, whose reconstruction is described in Sec. VIII). In the simulation, the energy of reconstructed photon candidates is treated differently based on their matching to the generated photons. The energy for matched candidates is not corrected. For unmatched candidates, a multiplicative hadronic energy correction is inferred empirically using data. 
In the simulation the correction is varied within a $\pm 20\%$ range around unity. For the ITA, an improvement is found when the hadronic energy is varied down by 10%. The corresponding correction with 100% uncertainty (relative) is introduced. An illustration of the hadronic energy correction for the ITA is included in the Supplemental Material <ref type="bibr">[36]</ref>.</p><p>Figure <ref type="figure">3</ref> shows the comparison of distributions of summed neutral energy for events in which a $B^+ \to K^+ J/\psi$ decay is reconstructed, for collision data and for the corresponding uncorrected and corrected simulation. The correction corresponds to a variation of the hadronic energy by $-10\%$. Better data-simulation agreement is achieved by the corrected simulation.</p><p>The correction is validated using various control samples dominated by background, such as off-resonance data and data at early selection steps. An improvement is observed in the description of several variables related to neutral-particle energy deposits, such as the number of photon candidates. The latter is sensitive to the hadronic energy since the hadronic-energy deposits peak at low energy and are affected strongly by the minimal energy requirement of 0.1 GeV.</p><p>For the HTA, a different extra-photon selection is adopted. In the HTA sample, the energy spectrum of extra photon candidates exhibits good data-simulation agreement, but observed discrepancies in the multiplicity ($n_{\gamma\,{\rm extra}}$) propagate to the $E_{\rm extra}$ distribution. To correct this, a control sample is used where the signal kaon and the $B_{\rm tag}$ have the same charge. A weight is computed for the $n_{\gamma\,{\rm extra}}$ distribution as $w(n_{\gamma\,{\rm extra}}) = N_{\rm data}(n_{\gamma\,{\rm extra}})/N_{\rm simulation}(n_{\gamma\,{\rm extra}})$, where $N_{\rm data}(n_{\gamma\,{\rm extra}})$ and $N_{\rm simulation}(n_{\gamma\,{\rm extra}})$ correspond to the event yields with $n_{\gamma\,{\rm extra}}$ candidates in data and simulation, respectively. 
Subsequently, simulated events in which the signal kaon and $B_{\rm tag}$ have opposite charges are weighted according to their $n_{\gamma\,{\rm extra}}$ value. This method is validated using an independent pion-enriched control sample where the signal track is identified as a pion instead of a kaon. The pion-enriched sample is further divided into two samples based on whether the signal candidate and $B_{\rm tag}$ have the same or opposite charge. Corrections are derived from the sample where the signal candidate and $B_{\rm tag}$ have the same charge and are then applied to the opposite-charge sample. The effect of the correction in the pion-enriched sample at the event-selection stage is shown in Fig. <ref type="figure">4</ref>.</p><p>Although an improvement is observed after applying the correction, residual data-simulation discrepancies remain. To account for these, a systematic uncertainty is assigned corresponding to 100% of the residual difference in the data-to-simulation ratio observed in the opposite-charge pion-enriched control sample after the correction.</p><p>Given the prominence of background contributions containing $K^0_L$ mesons, a dedicated study is performed to check their modeling. This study focuses on the ECL response only, as the analysis does not use $K^0_L$ candidates from the dedicated identification system to avoid additional systematic uncertainties due to their modeling. Radiative-return production $e^+e^- \to \gamma\phi(\to K^0_S K^0_L)$ is used for this purpose for $K^0_L$ with energy above 1.6 GeV. The events are selected by demanding a photon candidate with energy $E^*_\gamma &gt; 4.7$ GeV in the c.m. frame, a well-reconstructed $K^0_S$ candidate, and no extra tracks. The $K^0_L$ four-momentum is inferred based on the photon and $K^0_S$ four-momenta, where the photon energy is computed based on the two-body $e^+e^- \to \gamma\phi$ process. The typical momentum resolution of an inferred $K^0_L$ is better than 1%. 
An energy deposit in the ECL reconstructed at a radius $R$ is matched to the trajectory of the $K^0_L$ extrapolated to the same $R$ if the distance between them is less than 15 cm. The efficiency for finding a matched energy deposit is studied both in data and simulation, and is tested separately in the ITA and HTA. The ITA selection for the ECL deposits is looser than the HTA selection; therefore a higher efficiency is found. Figure <ref type="figure">5</ref> shows the ITA $K^0_L$ efficiency as a function of momentum; the simulation overestimates the efficiency by 17%. This is taken into account by performing a $-17\%$ (relative) efficiency correction in the ITA sample, for all $K^0_L$, including those below 1.6 GeV. A $\pm 8.5\%$ systematic uncertainty (i.e., half of the correction) is assigned. The distribution of the energy deposits in the ECL is shown in the Supplemental Material <ref type="bibr">[36]</ref>.</p><p>FIG. 3. Distribution of summed neutral energy in data and in uncorrected and corrected simulation for events in which a $B^+ \to K^+ J/\psi$ decay is reconstructed. The correction corresponds to a variation of the hadronic energy by $-10\%$. The simulation is normalized to the number of events in data. The ratio shown in the lower panel refers to data over corrected simulation.</p><p>While the radiative-return production of $\phi$ mesons does not encompass $K^0_L$ with energies below 1.6 GeV, approximately half of the $K^0_L$ mesons in the main background processes populate this lower-energy range. As a consistency check, a 100% inefficiency is incorporated in the ITA for this kinematic region in the simulation. Specifically, all energy deposits in the ECL that fall within a 15 cm radius of the extrapolated $K^0_L$ trajectory are removed for simulated $K^0_L$ with energies smaller than 1.6 GeV. The impact of this additional requirement on the analysis is found to be covered by the hadronic-energy systematic uncertainty, discussed above.</p><p>The $K^0_L$ reconstruction efficiency is smaller for the HTA. 
Since the effect on $E_{\rm extra}$ is already addressed by the correction and systematic uncertainty derived from the extra-photon-multiplicity spectrum, no direct correction to the $K^0_L$ efficiency is applied. Instead, a systematic uncertainty is assigned, wherein the yields of $B$ final states with a $K^0_L$ are varied by 17%.</p></div>
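<div xmlns="http://www.tei-c.org/ns/1.0"><p>The extra-photon-multiplicity reweighting used for the HTA can be sketched as follows. The sketch assumes that the weight is the plain ratio of data to simulation yields in each multiplicity bin of the same-charge control sample, and that multiplicities absent from the control sample receive unit weight; both the fallback and all yields below are illustrative assumptions, and the function names are hypothetical.</p>
<eg>
```python
from collections import Counter


def multiplicity_weights(data_counts, sim_counts):
    """Per-multiplicity weight N_data(n) / N_simulation(n) from a control sample."""
    weights = {}
    for n, n_sim in sim_counts.items():
        if n_sim > 0:
            weights[n] = data_counts.get(n, 0) / n_sim
    return weights


def apply_weights(sim_multiplicities, weights, default=1.0):
    """Weight each simulated event by the factor for its extra-photon multiplicity."""
    return [weights.get(n, default) for n in sim_multiplicities]


# Hypothetical same-charge control-sample yields per extra-photon multiplicity
data_same_charge = Counter({0: 120, 1: 200, 2: 150})
sim_same_charge = Counter({0: 100, 1: 200, 2: 200})
weights = multiplicity_weights(data_same_charge, sim_same_charge)

# Hypothetical opposite-charge simulated events, listed by their multiplicity
event_weights = apply_weights([0, 2, 1, 5], weights)
print(weights, event_weights)
```
</eg>
<p>Deriving the weights in the same-charge sample and applying them to the opposite-charge sample keeps the correction independent of the events that enter the signal extraction.</p></div>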
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. BACKGROUND SUPPRESSION</head><p>Simulated signal and background events are used to train BDTs that suppress the background. Several inputs are considered, including general event-shape variables described in Ref. <ref type="bibr">[38]</ref>, as well as variables characterizing the kaon candidate and the kinematic properties of the ROE. Moreover, vertices of two and three charged particles, with one of the tracks being the kaon-candidate track, are reconstructed to identify kaons from $D^0$ and $D^+$ meson decays; variables describing the fit quality and kinematic properties of the resulting candidates are considered as possible BDT inputs. Variables are excluded if either their contribution to the classification's separation power is negligible, or they are poorly described by the simulation.</p><p>FIG. 5. Efficiency of reconstructing an energy deposit in the ECL matched to the $K^0_L$ direction, as a function of the $K^0_L$ energy, for $e^+e^- \to \gamma\phi$ data and simulation. The energy deposits are selected following the ITA criteria.</p><p>The ITA uses two consecutive BDTs. A first binary classifier, BDT$_1$, is designed as a first-level filter after event selection. It is trained on $10^6$ simulated events of each of the seven considered background categories (decays of charged $B$ mesons, decays of neutral $B$ mesons, and the five continuum categories: $e^+e^- \to q\bar{q}$ with $q = u, d, s, c$ quarks and $e^+e^- \to \tau^+\tau^-$), weighted to a common equivalent luminosity such that the sum of weights is balanced to the $10^6$ simulated signal events. The classifier uses 12 input variables. The most discriminating variable is the difference between the ROE energy in the c.m. frame and $\sqrt{s}/2$ ($\Delta E_{\rm ROE}$), which tends to be negative for signal events due to neutrinos, whereas it is positive for the background with additional reconstructed particles. 
Significant discrimination comes from variables sensitive to the momentum imbalance of the signal events due to neutrinos, as well as those that correlate the missing momentum with the signal-kaon momentum. Examples of such variables are the reduced first-order Fox-Wolfram moment <ref type="bibr">[39]</ref> and the modified Fox-Wolfram moments <ref type="bibr">[40]</ref>.</p><p>The second classifier, BDT$_2$, is used for the final event selection. It is trained on events with BDT$_1 &gt; 0.9$, which corresponds to a signal (background) selection efficiency of 34% (1.5%), using 35 input variables. A simulated background sample of 200 fb$^{-1}$ equivalent luminosity, corresponding to $4.2 \times 10^6$ events, and a sample of $1.7 \times 10^6$ signal events are used. Tests with larger samples used for BDT$_2$ training show no additional improvements in BDT$_2$ performance. For BDT$_2$, the most discriminating variable is the cosine of the angle between the momentum of the signal-kaon candidate and the thrust axis of the ROE computed in the c.m. frame, which has a uniform distribution for the signal and a peaking shape for the jetlike continuum background. The thrust axis is defined as the unit vector $\hat{t}$ that maximizes the thrust value $\sum_i |\hat{t} \cdot \vec{p}^{\,*}_i| / \sum_i |\vec{p}^{\,*}_i|$, where $\vec{p}^{\,*}_i$ is the momentum of the $i$th final-state particle in the $e^+e^-$ c.m. frame <ref type="bibr">[41,</ref><ref type="bibr">42]</ref>. Also important are variables identifying kaons from $D^0$ and $D^+$ meson decays, and the modified Fox-Wolfram moments. The BDT$_1$ and BDT$_2$ parameters are optimized based on a grid search in the parameter space and are described in Appendix A 1. 
Training of the BDT$_1$ and BDT$_2$ classifiers is based on simulated samples that are statistically independent of those used in the sample-composition fit.</p><p>For the HTA, the remaining background is suppressed using a multivariate classifier, BDTh, which uses 12 input variables combining information about the event shape, the signal-kaon candidate, the $B_{\rm tag}$ meson, and any extra tracks and extra photons. Simulated background samples of about $2 \times 10^5$ $B\bar{B}$ events and $3 \times 10^5$ continuum events, which correspond to equivalent luminosities of 3 ab$^{-1}$ and 1 ab$^{-1}$, respectively, are used together with a signal sample of $5 \times 10^5$ events. The BDTh parameters are optimized through a grid search in the parameter space. Given the limited size of the simulated sample, it is beneficial to use information from the whole sample both to train the BDTh and to estimate the remaining background in the signal region. The simulated sample is thus split into two subsamples that are used to train two separate BDTh classifiers. Good agreement between the two outputs is observed. The data sample is then randomly divided into two halves and each BDTh is applied to one half. In the simulated background sample, the BDTh applied to each event is the one that was not trained on that event. Details regarding the input variables and BDTh parameters are reported in Appendix A 2.</p><p>The BDTh input variable providing the highest discriminating power is $E_{\rm extra}$. For correctly reconstructed signal events, no extra ECL deposits are expected, which results in an $E_{\rm extra}$ distribution peaking at zero; backgrounds leave deposits with energies up to 1 GeV. The second most discriminating variable is the sum of the missing energy and the magnitude of the missing momentum ($E^*_{\rm miss} + cp^*_{\rm miss}$), where the missing four-vector is defined as the difference between the beam four-vector and the sum of the signal-kaon and $B_{\rm tag}$ four-vectors in the c.m. frame. 
For correctly reconstructed signal events, $E^*_{\rm miss} + cp^*_{\rm miss}$ is defined by the neutrino kinematic properties, and its distribution peaks around 5 GeV, while for background events the random loss of particles mimicking the neutrinos results in a broader distribution.</p></div>
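<div xmlns="http://www.tei-c.org/ns/1.0"><p>The thrust definition used for the most discriminating BDT$_2$ input can be made concrete with a small sketch. It relies on the identity that the maximum over unit vectors of $\sum_i |\hat{t} \cdot \vec{p}_i|$ equals the maximum over sign assignments $s_i = \pm 1$ of $|\sum_i s_i \vec{p}_i|$, and scans all sign assignments by brute force. This exhaustive scan is an illustrative choice suitable only for a handful of particles, not the algorithm used in the analysis software; the function names are hypothetical.</p>
<eg>
```python
import itertools
import math


def thrust(momenta):
    """Return (thrust value, thrust axis) for a short, non-empty list of 3-momenta.

    Scans all sign assignments s_i and keeps the largest |sum_i s_i p_i|;
    the cost grows exponentially with the number of particles.
    """
    if not momenta:
        raise ValueError("at least one momentum is required")
    total = sum(math.sqrt(px * px + py * py + pz * pz) for px, py, pz in momenta)
    best_norm, best_axis = 0.0, (0.0, 0.0, 1.0)
    # The first sign is fixed to +1: flipping every sign gives the same axis.
    for signs in itertools.product((1.0, -1.0), repeat=len(momenta) - 1):
        full = (1.0,) + signs
        vec = [sum(s * p[k] for s, p in zip(full, momenta)) for k in range(3)]
        norm = math.sqrt(sum(v * v for v in vec))
        if norm > best_norm:
            best_norm, best_axis = norm, tuple(v / norm for v in vec)
    return best_norm / total, best_axis


def cos_to_axis(momentum, axis):
    """Cosine of the angle between a momentum vector and a unit axis."""
    norm = math.sqrt(sum(c * c for c in momentum))
    return sum(a * b for a, b in zip(momentum, axis)) / norm


# Hypothetical jetlike rest of the event and a kaon candidate (GeV/c)
roe = [(1.0, 0.1, 0.0), (-0.9, 0.0, 0.1), (0.6, -0.1, 0.0)]
value, axis = thrust(roe)
print(round(value, 3), round(abs(cos_to_axis((0.0, 2.0, 0.0), axis)), 3))
```
</eg>
<p>A back-to-back pair gives a thrust value of 1, while two perpendicular momenta of equal magnitude give $\sqrt{2}/2$, illustrating how the quantity separates jetlike from more isotropic topologies.</p></div>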
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VII. SIGNAL REGION DEFINITION</head><p>Using the simulated signal sample, the BDT$_2$ variable is mapped to the complement of the integrated signal-selection efficiency, $\eta({\rm BDT}_2) = 1 - \int_{{\rm BDT}_2}^{1} \xi(b)\,db$, where $\xi(b)$ is the total signal-selection efficiency density for the BDT$_2$ value $b$. In this way the distribution of $\eta({\rm BDT}_2)$ for simulated signal events is uniform; a similar mapping is used to define $\eta({\rm BDTh})$, based on the efficiency of the selection on BDTh.</p><p>For the ITA, the signal region (SR) is defined to be BDT$_1 &gt; 0.9$ and $\eta({\rm BDT}_2) &gt; 0.92$, as this criterion maximizes the expected signal significance, based on studies in simulation. The SR is further divided into $4 \times 3$ intervals (bins) in the $\eta({\rm BDT}_2) \times q^2_{\rm rec}$ space. The bin boundaries are $[0.92, 0.94, 0.96, 0.98, 1.00]$ in $\eta({\rm BDT}_2)$ and $[-1.0, 4.0, 8.0, 25.0]$ GeV$^2$/$c^4$ in $q^2_{\rm rec}$. The bin $\eta({\rm BDT}_2) &gt; 0.98$ provides the main information on the signal while the bin $\eta({\rm BDT}_2) &lt; 0.94$ helps to constrain background contributions. The bin boundaries in $q^2_{\rm rec}$ are chosen to follow those of theoretical predictions <ref type="bibr">[2]</ref> while ensuring a sufficient number of expected signal events in each bin. The expected yields of the SM signal and the backgrounds in the SR are 160 and 16793 events, respectively. More detailed information about the expected background composition for charged and neutral $B$ decays is shown in the Supplemental Material <ref type="bibr">[36]</ref>. For the highest-purity $\eta({\rm BDT}_2) &gt; 0.98$ region, the expected SM signal yield is reduced to 40 events with a background yield of 977 events. These signal and background yields include corrections to the simulation discussed in the following sections; they correspond to the sample entering the statistical analysis to extract the signal described in Sec. 
X.</p><p>For the HTA, the SR is defined to be $\eta({\rm BDTh}) &gt; 0.4$ and is divided into six bins with bin boundaries at [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]. In events containing multiple $B_{\rm tag}$-$K^+$ candidates, the candidate formed by the $B_{\rm tag}$ with the highest FEI probability is selected. The expected yields of the SM signal and the background in the SR are 8 and 211 events, respectively. For the highest-purity $\eta({\rm BDTh}) &gt; 0.7$ region, the expected SM signal yield is reduced to 4 events with a background yield of 33 events. The expected background and signal distributions in the signal search region are shown in the Supplemental Material <ref type="bibr">[36]</ref>.</p><p>The signal-selection efficiency in the SR is shown in Fig. <ref type="figure">6</ref>. Much higher efficiency is observed for the ITA; however, the ITA efficiency has a significantly stronger $q^2$ dependence compared to the efficiency for the HTA. The analysis relies on modeling of this variation by simulation, which is checked using a control channel, as discussed in the next section.</p></div>
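<div xmlns="http://www.tei-c.org/ns/1.0"><p>The mapping from a classifier output to the complement of the integrated signal efficiency can be sketched with an empirical distribution of simulated signal scores. The sketch assumes that $\eta$ is one minus the efficiency of requiring a score above the given value, with an optional overall factor for the efficiency of the preceding selections; the function name, that factor, and the toy scores are hypothetical.</p>
<eg>
```python
import bisect


def make_eta_map(signal_scores, total_efficiency=1.0):
    """Build eta(x) = 1 - efficiency(score > x) from simulated signal scores.

    total_efficiency is an assumed efficiency of all selections applied
    before the classifier requirement, so eta spans 1 - total_efficiency to 1.
    """
    ordered = sorted(signal_scores)
    n_total = len(ordered)

    def eta(score):
        n_above = n_total - bisect.bisect_right(ordered, score)
        return 1.0 - total_efficiency * n_above / n_total

    return eta


# Toy signal sample with scores spread evenly between 0 and 1
eta = make_eta_map([i / 100 for i in range(100)])
in_signal_region = eta(0.95) > 0.92
print(round(eta(0.915), 2), in_signal_region)
```
</eg>
<p>By construction the mapped variable is uniformly distributed for the signal sample used to build it, so a requirement such as $\eta &gt; 0.92$ directly fixes the retained signal fraction.</p></div>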
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VIII. SIGNAL SELECTION EFFICIENCY VALIDATION</head><p>The decay $B^+ \to K^+ J/\psi$ with $J/\psi \to \mu^+\mu^-$ is used to validate the BDT performance on signal-like events between data and simulation, exploiting its large branching fraction and distinctive experimental signature. These events are selected in data and $B^+ \to K^+ J/\psi$ simulation by requiring the presence of two oppositely charged muons with an invariant mass within 50 MeV/$c^2$ of the known $J/\psi$ mass <ref type="bibr">[14]</ref>. To suppress background events, the variable $|\Delta E|$ is required to be less than 100 MeV, and the beam-energy-constrained mass $M_{\rm bc}$ is required to exceed 5.27 GeV/$c^2$. These criteria result in 7214 events being selected in the data sample with an expected background contamination of 2%. Each event is then reconsidered as a $B^+ \to K^+\nu\bar{\nu}$ event by ignoring the muons from the $J/\psi$ decay and replacing the kaon candidate with the signal kaon candidate from a simulated $B^+ \to K^+\nu\bar{\nu}$ event, to reflect the three-body topology of the signal signature. The kinematic properties of the signal kaon are then adjusted such that the $B^+$ four-momentum and decay vertex in the simulated $B^+ \to K^+\nu\bar{\nu}$ decay match the four-momentum and decay vertex of the corresponding $B^+ \to K^+ J/\psi$ decay. This substitution is performed for the reconstructed track, ECL energy deposits, and PID likelihood values associated with the simulated kaon such that the test samples have a format identical to the data and can be analyzed by the same reconstruction software. This signal-embedding method is performed for both data and $B^+ \to K^+ J/\psi$ simulation.</p><p>The results obtained by analyzing selected events are summarized for the ITA in Fig. <ref type="figure">7</ref>, where the distributions of the output values of both BDTs are shown.</p><p>FIG. 7. Distribution of the classifier output BDT$_1$ (main figure) and BDT$_2$ for BDT$_1 &gt; 0.9$ (inset). 
The distributions are shown before and after the muon removal and replacement of the kaon momentum of selected $B^+ \to K^+ J/\psi$ events in simulation and data. As a reference, the classifier outputs directly obtained from simulated $B^+ \to K^+\nu\bar{\nu}$ signal events are overlaid. The simulation histograms are scaled to the total number of $B^+ \to K^+ J/\psi$ events selected in the data.</p><p>Good agreement between simulation and data is observed for the selected events before and after the signal embedding. Distributions with a logarithmic $y$-axis are presented in the Supplemental Material <ref type="bibr">[36]</ref>. The ratio of the selection efficiencies for the SR in data and simulation is $1.00 \pm 0.03$; i.e., agreement is observed.</p><p>For the HTA, the signal embedding is used to check both the FEI and the combined FEI plus BDTh signal reconstruction efficiency. The ratios of data and simulation efficiencies at the two levels of the selection are found to be $0.68 \pm 0.06$ and $0.60 \pm 0.10$, respectively. The first ratio agrees with an independent FEI calibration derived from $B \to X\ell\nu$ FEI-tagged events <ref type="bibr">[43]</ref> and is therefore used as a correction for signal efficiencies and $B\bar{B}$ normalization. From the relative uncertainty on the efficiency ratio computed after the $\eta({\rm BDTh})$ selection, a 16% systematic uncertainty on the signal-selection efficiency is derived. For the HTA, the resulting distributions of this study are shown in the Supplemental Material <ref type="bibr">[36]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IX. BACKGROUND STUDIES</head><p>The main background sources for the analysis arise from decays that involve an energetic kaon (or a misidentified pion), missing energy, or particles that leave no or small signatures in the ECL, such as $K^0_L$ mesons. These processes occur in both continuum and $B$-meson decays. Dedicated studies, using a variety of control samples, are performed in order to validate the background description in simulated events. Where needed, correction factors are derived with corresponding systematic uncertainties. In the following subsections the modeling of backgrounds from continuum (Sec. IX A) and $B\bar{B}$ events (Sec. IX B) is discussed. In Sec. IX C the overall background normalization is checked after all corrections are applied.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Continuum background</head><p>Continuum represents 40% and 30% of the background in the entire signal region of the ITA and HTA, respectively. This contribution drops to 17% in the highest-sensitivity region $\eta({\rm BDT}_2) &gt; 0.98$ of the ITA, and to 15% in the highest-sensitivity region $\eta({\rm BDTh}) &gt; 0.7$ of the HTA. The background modeling is validated using the off-resonance data and shows moderate disagreements in the shape of some of the input features of the various classifiers (locally up to 20%). The modeling of continuum-background simulation is thus improved following Ref. <ref type="bibr">[44]</ref>. A binary classifier, BDT$_c$, is trained to separate the off-resonance data and off-resonance simulation. For the ITA, the BDT$_c$ input variables consist of all BDT$_2$ input variables, $q^2_{\rm rec}$, and the output of BDT$_2$. The BDT$_c$ classifier is trained with events that satisfy BDT$_1 &gt; 0.9$ and $\eta({\rm BDT}_2) &gt; 0.75$ in the off-resonance data and a 50 fb$^{-1}$ sample of off-resonance simulation. As a check, BDT$_c$ is trained using a 200 fb$^{-1}$ simulated sample of continuum events produced at a c.m. energy corresponding to the $\Upsilon(4S)$ resonance, yielding a similar performance. For the HTA, the BDT$_c$ exploits all BDTh input variables and is trained with the off-resonance data and a 1 ab$^{-1}$ simulated sample of continuum events produced at a c.m. energy corresponding to the $\Upsilon(4S)$ resonance. If $p$, taking values between 0.0 and 1.0, denotes the BDT$_c$ classifier output for a given continuum event, the ratio $p/(1 - p)$ approximates the likelihood ratio $\mathcal{L}({\rm data})/\mathcal{L}({\rm simulation})$, where $\mathcal{L}({\rm data})$ [$\mathcal{L}({\rm simulation})$] is the likelihood of the continuum event being from data (simulation); this ratio is used as a weight <ref type="bibr">[44]</ref>. 
The weights range between 0.5 and 2.0 with a standard deviation of 0.3 for the ITA. This weight, for the ITA, is applied to the simulated continuum events after the final selection; for the HTA, it is applied before the BDTh training. Comparison of simulated continuum events with off-resonance data shows that the application of this weight improves the modeling of the input variables.</p><p>Figure <ref type="figure">8</ref> shows a comparison of the q 2 rec distribution in data and corrected simulation for the ITA off-resonance sample. While the shapes of the distributions are similar, there is a normalization excess of the data over the simulation of &#240;40 AE 5&#222;%, which is included as a systematic uncertainty (see Sec. XI). A possible source of the discrepancy is a mismodeling of kaon fragmentation in the PYTHIA8 version used in Belle II. Illustration of improvement of the ITA distributions with BDT c -based reweighting is shown in the Supplemental Material <ref type="bibr">[36]</ref>.</p><p>For the HTA, the relative normalization between offresonance data and continuum simulation is 0.82 AE 0.01 FIG. <ref type="figure">8</ref>. Distribution of q 2 rec for the off-resonance data (points with error bars) and continuum background simulation (filled histograms) in the SR for the ITA. The simulation is normalized to the number of events in the data. The distribution of the difference between data and simulation divided by the combined uncertainty (pull) is shown in the bottom panel. before the BDTh selection. This factor accounts for mismodeling effects on the FEI performance for continuum events and is used to scale the expected continuum contamination. The relative normalization in the BDTh signal region is consistent with unity with 50% uncertainty, which is included as a systematic uncertainty (see Sec. XI).</p></div>
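<div xmlns="http://www.tei-c.org/ns/1.0"><p>The likelihood-ratio weight described in this subsection can be sketched in a few lines. This is a minimal illustration, not the analysis code: the function name likelihood_ratio_weight, the clipping parameter eps, and the example classifier outputs are all assumptions introduced here.</p>

```python
def likelihood_ratio_weight(p, eps=1e-6):
    """Return p / (1 - p), an estimate of L(data) / L(simulation).

    p is the output of a data-versus-simulation classifier for one
    simulated event. eps is an assumption of this sketch: it keeps the
    ratio finite when p approaches 0 or 1.
    """
    p = min(max(p, eps), 1.0 - eps)
    return p / (1.0 - p)


# Hypothetical classifier outputs for three simulated continuum events.
outputs = [1.0 / 3.0, 0.5, 2.0 / 3.0]
weights = [likelihood_ratio_weight(p) for p in outputs]
print(weights)
```

<p>An output of 0.5 means the classifier cannot distinguish data from simulation, so the event keeps unit weight. The example outputs 1/3 and 2/3 map to weights of about 0.5 and 2.0, the range quoted for the ITA.</p></div>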
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. B background</head><p>The backgrounds originating from $B^0$ and $B^+$ decays are dominant in the highest-sensitivity regions of the analysis. The composition of the B backgrounds is similar for both the ITA and HTA samples. It is also similar for $B^+$ and $B^0$ decays; however, the contribution from $B^+$ decays has a larger impact for both analyses.</p><p>In the ITA sample, the main background process consists of semileptonic B decays to charm, where the signal-candidate kaons originate from charmed-meson decays. This process accounts for approximately 47% of the total B background in the SR. The other major background processes are hadronic B decays involving charmed mesons and other hadronic B decays, contributing about 38% and 14% of the total B background in the SR, respectively. The remaining sources of background are $B^+ \to \tau^+\nu_\tau$ decays and $B \to K^*\nu\bar{\nu}$ decays.</p><p>In the HTA sample, semileptonic B decays represent the majority of the B background events, accounting for approximately 62% of the total background. The second most abundant contribution comes from hadronic B decays with final states including a charmed meson accompanied by multiple pions, representing about 20% of the total background. The remaining contributions are from other hadronic modes.</p><p>The lower-particle-multiplicity events involving the direct decay of a B meson into a D meson contribute more than those containing $D^*$ resonances. The decays involving higher excitations of D mesons ($D^{**}$ modes), which are less well known, correspond to approximately 4% of the total B background for the ITA and 6% for the HTA, and are modeled according to their PYTHIA8 <ref type="bibr">[20]</ref> simulation. In the following, the modeling of the main background categories and of specific background decays requiring special treatment is presented.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Modeling of D-meson decays involving a $K^+$ meson</head><p>The dominant background contributions in which the signal-candidate $K^+$ originates from $D^0$ and $D^+$ decays are suppressed using several variables that exploit characteristic features of these decays, such as displaced decay vertex and invariant-mass information, as discussed in Sec. VI. The modeling of this background is checked by comparing the distributions of these variables in data and simulation at various selection stages, and good agreement is observed. An example is presented in Fig. <ref type="figure">9</ref>, which shows the invariant-mass distribution of the signal-kaon candidate paired with a charged particle from the ROE after the $\mathrm{BDT}_1$ selection. The distinctive shape in data, including the peak from the two-body $D^0 \to K^-\pi^+$ decay, is well reproduced by the simulation.</p><p>Uncertainties related to the knowledge of the semileptonic B-decay branching fractions are included explicitly, as discussed in Sec. XI. Uncertainties due to the decay form factors are studied using the eFFORT computer program <ref type="bibr">[45]</ref> and found to be negligible.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Modeling of D-meson decays involving a $K^0_L$ meson</head><p>Backgrounds from prompt production of $K^+$ mesons in B decays are important in the highest-sensitivity region. The branching fractions of</p><p>decays are relevant due to the sizable and poorly known fraction of D-meson decays involving $K^0_L$ mesons. The branching fraction of decays that involve $B \to D \to K^0_L$ transitions is studied using independent control samples based on alternative particle-identification requirements. A pion-enriched control sample is used to determine corrections, while samples with the signal track identified as an electron or a muon are used to validate them.</p><p>The pion-enriched sample presents an overall excess of the data over expectations for both ITA and HTA. For the ITA, the excess is studied as a function of $q^2_{\mathrm{rec}}$ and found to appear at the $D^0$ threshold and above (see Fig. <ref type="figure">10</ref> left). If attributed to D-meson decays involving $K^0_L$, the excess is consistent with a $(30 \pm 2)\%$ increase in rate compared to the expectation from simulation. This is determined in a three-parameter fit to the binned $q^2_{\mathrm{rec}}$ distribution for $\eta(\mathrm{BDT}_2) &gt; 0.92$, where the fit parameters are the fractions of summed continuum, summed charged and neutral B-meson decays with D-meson decays involving $K^0_L$, and summed charged and neutral B-meson decays without D-meson decays involving $K^0_L$ mesons (see Appendix B for more details). In background simulation the branching fraction for $D^0 \to (K^0/\bar{K}^0)X$ is 40% and for $D^+ \to (K^0/\bar{K}^0)X$ is 58%. When these branching fractions are scaled by 1.30, the resulting branching fraction of 52% for $D^0$ is compatible with the known value of $(47 \pm 4)\%$ <ref type="bibr">[14]</ref>; the value for $D^+$, 75%, is above the known value of $(61 \pm 5)\%$ <ref type="bibr">[14]</ref>. The distribution of $q^2_{\mathrm{rec}}$ in simulation for B-meson decays with subsequent $D^+ \to K^0_L X$ and $D^0 \to K^0_L X$ decays in the pion-enriched ITA control sample is shown in the Supplemental Material <ref type="bibr">[36]</ref>.</p><p>An excess at $q^2_{\mathrm{rec}}$ above the charm-production threshold is also evident in the samples in which the signal track is identified to be a muon or an electron. It is covered by $(35 \pm 1)\%$ and $(38 \pm 1)\%$ increases in the rate of charm decays involving $K^0_L$ in the respective samples. Consequently, a correction of $+30\%$ is applied to the branching fraction of events containing $D \to K^0_L X$ in the simulated background sample, in both ITA and HTA. The correction is based on the excess size determined for the pion-enriched sample, as the rate of pion-to-kaon misidentification is significantly larger than that of lepton-to-kaon misidentification. Due to the discrepancy in the correction factors between the different samples, a systematic uncertainty of 10% is assigned; i.e., the correction is $(+30 \pm 10)\%$.</p><p>Figure <ref type="figure">11</ref> shows the $\eta(\mathrm{BDT}_2)$ distribution for the pion-enriched sample, after all corrections are applied, including the scaling of the branching fraction of D-meson decays involving $K^0_L$ mesons. The resulting expectations are consistent with the data. The $q^2_{\mathrm{rec}}$ distribution for the sample is also discussed in Sec. XIII. The $q^2_{\mathrm{rec}}$ distribution for the sample in which the signal track is identified as a lepton is shown in Fig. <ref type="figure">12</ref>.</p><p>FIG. 11. Distribution of $\eta(\mathrm{BDT}_2)$ in data (points with error bars) and simulation (filled histograms) divided into three groups (B-meson decays with and without subsequent $D \to K^0_L X$ decays, and the sum of the five continuum categories), for the pion-enriched ITA control sample. All the corrections are applied, including the one for the contribution involving D mesons decaying to $K^0_L$. The pull distribution is shown in the bottom panel.</p></div>
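<div xmlns="http://www.tei-c.org/ns/1.0"><p>The effect of the $(+30 \pm 10)\%$ correction on the simulated branching fractions can be checked with simple arithmetic. The sketch below uses only the percentages quoted in this subsection; the function name scaled_range is an illustrative assumption, and the 10% uncertainty is treated as an absolute shift of the correction, as stated in Sec. XI.</p>

```python
def scaled_range(bf_percent, correction=0.30, uncertainty=0.10):
    """Scale a branching fraction by (1 + correction), and also by the
    correction shifted down and up by its absolute uncertainty.

    Returns (low, central, high) in the same units as bf_percent.
    """
    low = bf_percent * (1.0 + correction - uncertainty)
    central = bf_percent * (1.0 + correction)
    high = bf_percent * (1.0 + correction + uncertainty)
    return low, central, high


# Simulated branching fractions (percent) for D -> (K0 / anti-K0) X.
simulated = {"D0": 40.0, "D+": 58.0}
for meson, bf in simulated.items():
    low, central, high = scaled_range(bf)
    print(f"{meson}: {central:.1f}% (range {low:.1f}% to {high:.1f}%)")
```

<p>For the $D^0$ this gives 52% with a range of 48% to 56%, which overlaps the known $(47 \pm 4)\%$. For the $D^+$ it gives about 75% with a range of roughly 70% to 81%, which lies above the known $(61 \pm 5)\%$.</p></div>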
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Modeling of B</head><p>final state, since these neutral particles can mimic the signal signature. The contributions from</p><p>n, and $B \to K^*\nu\bar{\nu}$ decays are estimated separately, as described in Sec. II. The modeling of the</p><p>Details of the reconstruction are given in Appendix C. The sPlot method <ref type="bibr">[46]</ref> is used to determine the invariant-mass distributions for the $K^0_S K^0_S$ and</p><p>The result for the <ref type="figure">13</ref>. Data and simulation show good shape and normalization agreement, validating the</p><p>butions, as described in Sec. II. This model is validated by reconstructing the isospin-related decay</p><p>this decay proceeds via scalar resonances and a nonresonant s-wave amplitude. The</p><p>S decays in data are used to model the latter two contributions only, as this decay lacks a p-wave component due to Bose-Einstein statistics of the $K^0_S K^0_S$ pair. Figure <ref type="figure">14</ref> shows a comparison between the observed</p><p>obtained in data and corrected for efficiency and the ratio of the $B^+$ and $B^0$ lifetimes, (2) simulated $B^0 \to K^0_S\phi$ contributions, and (3) simulated</p><p>Satisfactory agreement is observed both in shape and normalization.</p><p>The $B^+ \to K^+ n\bar{n}$ background constitutes 0.4% of the total B background in the signal region and 1.0% in the most sensitive region for the ITA. This contribution is significant because of the threshold enhancement used in the model: these contributions would be only 0.2% and 0.3% if the decay proceeded according to phase space.</p><p>Contaminations from $B^+ \to K^{*+}\nu\bar{\nu}$ and $B^0 \to K^{*0}\nu\bar{\nu}$ decays are also included in the background model according to the SM prediction <ref type="bibr">[4]</ref>. Their expected yield is approximately 5 times smaller than the expected signal yield in the entire SR and 10 times smaller in the most sensitive region.</p><p>The simulated distribution is normalized to the number of $B\bar{B}$ events. The pull distribution is shown in the bottom panel.</p><p>The long-distance contribution of the $B^+ \to \tau^+(\to K^+\bar{\nu})\nu$ decay is included as part of the background model (see Sec. II). Compared to the signal, which by construction has a selection efficiency of 8.0% in the SR for the ITA, this background has a higher selection efficiency of 9.7%. This higher efficiency is due to a $q^2$ distribution that peaks at a lower value than the signal. However, due to the small branching fraction, the expected yield is approximately 6 times smaller than the expected signal yield in the most sensitive region.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Validation of background estimation</head><p>The modeling of the ITA BDT distributions of background events is tested using events outside the SR with $\eta(\mathrm{BDT}_2)$ in the interval 0.75 to 0.90.</p><p>For the HTA, the background normalization and the BDTh input and output distributions are checked in two control samples: one in which the $B_{\mathrm{tag}}$ and the signal kaon have the same charge, and another in which the PID requirement on the signal-side kaon is reversed.</p><p>In both analyses, the distributions obtained in data and simulation agree. The normalization of the background contributions also agrees with the expectation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>X. SIGNAL YIELD DETERMINATION</head><p>The signal yields are estimated via binned maximum-likelihood fits to data event counts in the bins of the SRs defined in Sec. VII. The ITA fit is a simultaneous fit to on- and off-resonance data samples; the HTA fit is to on-resonance data only. Templates are used to approximate the distributions, in the relevant observables, of each class of events. The likelihood function is constructed as a product of Poisson probability-density functions that combine the information from the SR bins. The systematic uncertainties are included in the likelihood as nuisance parameters, which are approximated as additive or multiplicative modifiers of the relevant yields and constrained to the available auxiliary information using Gaussian likelihoods. The parameter of interest is $\mu$, the signal branching fraction relative to its SM expectation (signal strength). The SM expectation for the signal branching fraction used as a reference is $4.97 \times 10^{-6}$, based on Ref. <ref type="bibr">[4]</ref> and excluding the contribution from $\tau$ decays. The statistical analysis is performed with the PYHF computer program <ref type="bibr">[47,</ref><ref type="bibr">48]</ref>, and the results are checked using a dedicated SGHF computer program <ref type="bibr">[13]</ref>, which is also used for fits to control samples.</p></div>
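<div xmlns="http://www.tei-c.org/ns/1.0"><p>The structure of such a likelihood can be illustrated with a toy model: a product of Poisson terms over bins, with one multiplicative nuisance parameter on the background yield constrained by a unit Gaussian. This is a minimal sketch under stated assumptions, not the PYHF or SGHF implementation. The templates, observed counts, the 50% relative uncertainty, and the grid scan that stands in for a numerical minimizer are all hypothetical choices made for illustration.</p>

```python
import math


def nll(mu, theta, observed, signal, background, rel_unc):
    """Negative log-likelihood of a binned counting model.

    Expected yield per bin: mu * s_i + b_i * (1 + rel_unc * theta).
    theta is a nuisance parameter with a unit-Gaussian constraint.
    Terms that do not depend on mu or theta are dropped.
    """
    total = 0.5 * theta ** 2  # Gaussian constraint term
    for n, s, b in zip(observed, signal, background):
        expected = mu * s + b * (1.0 + rel_unc * theta)
        total += expected - n * math.log(expected)  # Poisson term
    return total


# Toy inputs (hypothetical, not the yields of the analysis).
signal = [2.0, 5.0, 8.0]          # signal template for mu = 1
background = [100.0, 50.0, 20.0]  # background template
observed = [104, 60, 36]          # equals the mu = 2, theta = 0 expectation

# Coarse grid scan over mu and theta; a real fit would use a minimizer.
grid_mu = [0.1 * i for i in range(0, 51)]
grid_theta = [0.1 * j for j in range(-15, 16)]
best = min(
    ((m, t) for m in grid_mu for t in grid_theta),
    key=lambda pt: nll(pt[0], pt[1], observed, signal, background, 0.5),
)
print(best)
```

<p>Because the toy observation is built from $\mu = 2$ with the nuisance parameter at its nominal value, the scan returns that point. In a realistic fit the nuisance parameters are pulled away from zero when the data prefer a different background yield, at the cost of the Gaussian penalty.</p></div>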
<div xmlns="http://www.tei-c.org/ns/1.0"><head>XI. SYSTEMATIC UNCERTAINTIES</head><p>A number of possible sources of systematic uncertainty are considered and summarized in Table <ref type="table">I</ref> for the ITA and Table <ref type="table">II</ref> for the HTA.</p><p>For the ITA, the yields of the seven individual background categories are allowed to vary independently in the fit. In each case, a Gaussian constraint is added to the fit, centered at the expectation based on (corrected) simulation and with standard deviation corresponding to 50% of the central value. The 50% value is motivated by a global normalization difference between the off-resonance data and continuum simulation, as mentioned in Sec. IX A. For the charged-B-background yield, which has the largest correlation with the signal strength $\mu$, the postfit uncertainty is reduced to about half the assigned prefit uncertainty. The data also significantly constrain the $c\bar{c}$-background yield, reducing the postfit uncertainty to approximately half of the prefit uncertainty.</p><p>The remaining systematic uncertainties may also influence the shape of the templates. Each source is described by several nuisance parameters. Several sources are used to cover background-modeling uncertainties. The branching fractions of decay modes contributing about 80% of $B^+$ decays and 70% of $B^0$ decays in the SR are allowed to vary according to their known uncertainties <ref type="bibr">[14]</ref>. These variations are then propagated to the SR bins, and their effects, along with correlations, are incorporated into a covariance matrix. This matrix is subsequently factorized into a canonical form using eigendecomposition and represented using six nuisance parameters. The uncertainty on the branching fraction of the</p><p>decays. The uncertainty on the branching fraction of the</p><p>L decay is estimated to be 30%. This accounts for possible isospin-breaking effects (20%) and uncertainties in the p-wave nonresonant contribution (20%). The uncertainties on the branching fractions of $B \to D^{**}$ decays, which are poorly known, are assigned to be 50%. Uncertainties in the modeling of baryonic decays involving neutrons are covered by the 100% uncertainty on the $B^+ \to K^+ n\bar{n}$ branching fraction. The fraction of D-meson decays involving $K^0_L$ mesons is corrected by 30% with a 10% absolute uncertainty, motivated by the differences in the scaling factors determined using different samples, as discussed in Sec. IX B 2. All of these uncertainties are propagated as correlated shape uncertainties.</p><p>Global normalization uncertainties on the luminosity measurement (1% assumed) and the number of $B\bar{B}$ pairs (1.5%) are treated with one nuisance parameter each. In addition, a 5% uncertainty is introduced on the difference in normalization between on- and off-resonance data samples.</p><p>The following five sources represent uncertainties in detector modeling; they are discussed in detail in Sec. V. The sources are track-finding efficiency, kaon-identification efficiency, modeling of energy for photons and hadrons, and $K^0_L$ reconstruction efficiency. The final three sources account for signal-modeling uncertainties. These are signal form factors, which are based on Ref. <ref type="bibr">[4]</ref>, and global signal-selection efficiency uncertainties as determined in Sec. VIII.</p><p>The systematic uncertainty due to the limited size of simulated samples is taken into account by one nuisance parameter per bin per category (156 parameters).</p><p>To account for all the systematic sources described above, a total of 193 nuisance parameters, along with the signal strength $\mu$, are varied in the fit.</p><p>The largest impact on the uncertainty of the signal strength $\mu$ arises from the knowledge of the normalization of the background from charged B decays. Other important sources are the simulated-sample size, branching fraction for</p><p>, branching fraction for $B \to D^{**}$ decays, reconstructed energy of hadrons, branching fractions of the leading B decays, and $K^0_L$ reconstruction efficiency. These sources of uncertainty allow for substantial changes in the $B\bar{B}$ shape. The shape variations are larger than the data-simulation residuals in $\eta(\mathrm{BDT}_2)$ in the pion-enriched sample (Fig. <ref type="figure">11</ref>). This suggests that uncertainties in the $B\bar{B}$ shape are adequately covered by the existing systematic contributions.</p><p>The summary of systematic uncertainties for the HTA is provided in Table <ref type="table">II</ref>. Three background components are considered in the HTA: $B\bar{B}$, accounting for both charged and neutral B decays; $c\bar{c}$; and light-quark continuum ($u\bar{u}$, $d\bar{d}$, $s\bar{s}$). The contribution from $\tau$-pair decays is negligible. The primary contribution to the systematic uncertainty arises from the determination of the normalization of the $B\bar{B}$ background. This determination is based on the comparison of data-to-simulation normalization in the pion-enriched control sample, which shows agreement within the 30% statistical uncertainty. The other important sources are the uncertainty associated with the bin-by-bin correction of the extra-photon-candidate multiplicity, and the uncertainty due to the limited size of the simulated sample. The uncertainty on continuum normalization (50%), determined using off-resonance data, is the fourth most important contribution. The limited size of the HTA sample prevents the substantial reduction of postfit uncertainties seen in the ITA, compared to prefit values, for the background normalization. The other sources of systematic uncertainty are the same in both analyses, except for those related to photon and hadronic-energy corrections, not applied in the HTA, and the p-wave contribution from</p><p>For both analyses, nuisance-parameter results are investigated in detail. No significant shift is observed for the background yields from charged and neutral B-meson decays. For the ITA, the shifts in the continuum background yields are consistent with the difference observed in the normalization of the continuum simulation with respect to the off-resonance data.</p><p>FIG. <ref type="figure">14</ref>. Distribution of the invariant mass of the</p><p>in background-subtracted data (points with error bars) and the sum of the simulated</p><p>in data (blue-filled histogram) and the simulated p-wave nonresonant component (red-filled histogram). The distribution obtained using</p><p>in data is corrected for efficiency and the ratio of the $B^+$ and $B^0$ lifetimes. The simulated distributions are normalized to the number of $B\bar{B}$ events. The pull distribution is shown in the bottom panel.</p>
<table><head>TABLE I. Sources of systematic uncertainty in the ITA, corresponding correction factors (if any), their treatment in the fit, their size, and their impact on the uncertainty of the signal strength $\mu$. The uncertainty type can be "Global", corresponding to a global normalization factor common to all SR bins, or "Shape", corresponding to a bin-dependent uncertainty. Each source is described by one or more nuisance parameters (see the text for more details). The impact on the signal strength uncertainty $\sigma_\mu$ is estimated by excluding the source from the minimization and subtracting in quadrature the resulting uncertainty from the uncertainty of the nominal fit.</head>
<row role="label"><cell>Source</cell><cell>Correction</cell><cell>Uncertainty type, parameters</cell><cell>Uncertainty size</cell><cell>Impact on $\sigma_\mu$</cell></row>
<row><cell>Normalization of $B\bar{B}$ background</cell><cell/><cell>Global, 2</cell><cell>50%</cell><cell>0.90</cell></row>
<row><cell>Normalization of continuum background</cell><cell/><cell>Global, 5</cell><cell>50%</cell><cell>0.10</cell></row>
<row><cell>Leading B-decay branching fractions</cell><cell/><cell>Shape, 6</cell><cell>O(1%)</cell><cell>0.22</cell></row>
<row><cell>Branching fraction for …</cell><cell>O(100%)</cell><cell>Shape, 1</cell><cell>100%</cell><cell>0.20</cell></row>
<row><cell>Branching fraction for $D \to K^0_L X$</cell><cell>+30%</cell><cell>Shape, 1</cell><cell>10%</cell><cell>0.14</cell></row>
<row><cell>Continuum-background modeling, $\mathrm{BDT}_c$</cell><cell>Multivariate O(10%)</cell><cell>Shape, 1</cell><cell>100% of correction</cell><cell>0.01</cell></row>
<row><cell>Integrated luminosity</cell><cell/><cell>Global, 1</cell><cell>1%</cell><cell>&lt;0.01</cell></row>
<row><cell>Number of $B\bar{B}$</cell><cell/><cell>Global, 1</cell><cell>1.5%</cell><cell>0.02</cell></row>
<row><cell>Off-resonance sample normalization</cell><cell/><cell>Global, 1</cell><cell>5%</cell><cell>0.05</cell></row>
<row><cell>Track-finding efficiency</cell><cell/><cell>Shape, 1</cell><cell>0.3%</cell><cell>0.20</cell></row>
<row><cell>Signal-kaon PID</cell><cell>$p$, $\theta$ dependent O(10-100%)</cell><cell>Shape, 7</cell><cell>O(1%)</cell><cell>0.07</cell></row>
<row><cell>Photon energy</cell><cell/><cell>Shape, 1</cell><cell>0.5%</cell><cell>0.08</cell></row>
<row><cell>Hadronic energy</cell><cell>-10%</cell><cell>Shape, 1</cell><cell>10%</cell><cell>0.37</cell></row>
<row><cell>$K^0_L$ efficiency in ECL</cell><cell>-17%</cell><cell>Shape, 1</cell><cell>8.5%</cell><cell>0.22</cell></row>
<row><cell>Signal SM form-factors</cell><cell>$q^2$ dependent O(1%)</cell><cell>Shape, 3</cell><cell>O(1%)</cell><cell>0.02</cell></row>
<row><cell>Global signal efficiency</cell><cell/><cell>Global, 1</cell><cell>3%</cell><cell>0.03</cell></row>
<row><cell>Simulated-sample size</cell><cell/><cell>Shape, 156</cell><cell>O(1%)</cell><cell>0.52</cell></row>
</table>
<table><head>TABLE II. Sources of systematic uncertainty in the HTA (see caption of Table I for details).</head>
<row role="label"><cell>Source</cell><cell>Correction</cell><cell>Uncertainty type, parameters</cell><cell>Uncertainty size</cell><cell>Impact on $\sigma_\mu$</cell></row>
<row><cell>Normalization of $B\bar{B}$ background</cell><cell/><cell>Global, 1</cell><cell>30%</cell><cell>0.91</cell></row>
<row><cell>Normalization of continuum background</cell><cell/><cell>Global, 2</cell><cell>50%</cell><cell>0.58</cell></row>
<row><cell>Leading B-decay branching fractions</cell><cell/><cell>Shape, 3</cell><cell>…</cell><cell>…</cell></row>
<row><cell>…</cell><cell>O(100%)</cell><cell>Shape, 1</cell><cell>100%</cell><cell>0.05</cell></row>
<row><cell>Branching fraction for $D \to K^0_L X$</cell><cell>+30%</cell><cell>Shape, 1</cell><cell>10%</cell><cell>0.03</cell></row>
<row><cell>Continuum-background modeling, $\mathrm{BDT}_c$</cell><cell>Multivariate O(10%)</cell><cell>Shape, 1</cell><cell>100% of correction</cell><cell>0.29</cell></row>
<row><cell>Number of $B\bar{B}$</cell><cell/><cell>Global, 1</cell><cell>1.5%</cell><cell>0.07</cell></row>
<row><cell>Track-finding efficiency</cell><cell/><cell>Global, 1</cell><cell>0.3%</cell><cell>0.01</cell></row>
<row><cell>Signal-kaon PID</cell><cell>$p$, $\theta$ dependent O(10-100%)</cell><cell>Shape, 3</cell><cell>O(1%)</cell><cell>&lt;0.01</cell></row>
<row><cell>Extra-photon multiplicity</cell><cell>$n_{\gamma\,\mathrm{extra}}$ dependent O(20%)</cell><cell>Shape, 1</cell><cell>O(20%)</cell><cell>0.61</cell></row>
<row><cell>$K^0_L$ efficiency</cell><cell/><cell>Shape, 1</cell><cell>17%</cell><cell>0.31</cell></row>
<row><cell>Signal SM form-factors</cell><cell>$q^2$ dependent O(1%)</cell><cell>Shape, 3</cell><cell>O(1%)</cell><cell>0.06</cell></row>
<row><cell>Signal efficiency</cell><cell/><cell>Shape, 6</cell><cell>16%</cell><cell>0.42</cell></row>
<row><cell>Simulated-sample size</cell><cell/><cell>Shape, 18</cell><cell>O(1%)</cell><cell>0.60</cell></row>
</table></div>
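<div xmlns="http://www.tei-c.org/ns/1.0"><p>The impact of a single source on the signal-strength uncertainty, as defined in the caption of Table I, is a quadrature difference between two fit uncertainties. A minimal sketch follows; the function name impact_on_sigma_mu and the two input uncertainties are hypothetical and chosen only to make the arithmetic easy to follow.</p>

```python
import math


def impact_on_sigma_mu(sigma_nominal, sigma_without_source):
    """Quadrature difference between the uncertainty of the nominal fit
    and the uncertainty of a fit with one systematic source excluded.

    A negative difference (possible through numerical noise) is
    clipped to zero, which is a choice made for this sketch.
    """
    diff = sigma_nominal ** 2 - sigma_without_source ** 2
    return math.sqrt(max(diff, 0.0))


# Hypothetical uncertainties, for illustration only.
example = impact_on_sigma_mu(1.3, 1.2)
print(round(example, 3))
```

<p>With these inputs the impact is 0.5. Because the sources are correlated through the fit, impacts computed this way do not in general add in quadrature to the total systematic uncertainty.</p></div>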
<div xmlns="http://www.tei-c.org/ns/1.0"><head>XII. RESULTS</head><p>The data in the off-resonance sample and in the SR of the ITA are shown in Fig. <ref type="figure">15</ref>, with fit results overlaid. Good visual agreement between data and fit is observed in both samples. An excess over background is observed in the SR, consistent with the presence of $B^+ \to K^+\nu\bar{\nu}$ signal. The observed signal purity, in terms of the fraction of signal events, is 5% in the SR and 19% in the three bins with $\eta(\mathrm{BDT}_2) &gt; 0.98$. The signal strength is determined to be $\mu = 5.4 \pm 1.0(\mathrm{stat}) \pm 1.1(\mathrm{syst}) = 5.4 \pm 1.5$, where the statistical uncertainty is estimated using simplified simulated experiments based on Poisson statistics. The total uncertainty is obtained using a profile likelihood ratio, fitting the model with fixed values of $\mu$ around the best-fit value while keeping the other fit parameters free; see Fig. <ref type="figure">16</ref>. The systematic uncertainty is calculated by subtracting the statistical uncertainty in quadrature from the total uncertainty. An additional 8% theoretical uncertainty, arising from the knowledge of the branching fraction, is not included. Compatibility between the data and fit result is assessed using simplified experiments, and a p-value of 47% is found. (The test is based on the fraction of simplified experiments in which the negative profile log-likelihood ratio of the nominal to the "saturated" model, where the predictions are set to the observations, exceeds the value observed in data.) Figures <ref type="figure">17</ref> and <ref type="figure">18</ref> present distributions of several variables for the events within the signal region. Distributions of $q^2_{\mathrm{rec}}$ with more differential background information are included in the Supplemental Material <ref type="bibr">[36]</ref>. Each simulated event is weighted using the ratio of postfit-to-prefit yields for the corresponding SR bin and event category. Good overall agreement is observed. However, certain discrepancies are evident in the $q^2_{\mathrm{rec}}$ distribution, with a deficit of data relative to predictions for $q^2_{\mathrm{rec}} &lt; 3~\mathrm{GeV}^2/c^4$ and an excess for $3~\mathrm{GeV}^2/c^4 &lt; q^2_{\mathrm{rec}} &lt; 5~\mathrm{GeV}^2/c^4$. A comparison of data and fit results for the HTA is shown in Fig. <ref type="figure">19</ref>. The compatibility between the data and fit results is determined to be 61%. The HTA observes a signal strength of $\mu = 2.2^{+1.8}_{-1.7}(\mathrm{stat})^{+1.6}_{-1.1}(\mathrm{syst}) = 2.2^{+2.4}_{-2.0}$, lower than the ITA result. In the whole SR, a signal purity of 7% is measured, which increases to 20% in the three bins with $\eta(\mathrm{BDTh}) &gt; 0.7$, with the main background contribution from $B\bar{B}$ decays. Figure <ref type="figure">20</ref> shows distributions of several variables for the events within the signal region. Good agreement is observed. Limit setting for the HTA is included in the Supplemental Material <ref type="bibr">[36]</ref>.</p><p>If interpreted in terms of signal, the results correspond to a branching fraction of the $B^+ \to K^+\nu\bar{\nu}$ decay of $[2.7 \pm 0.5(\mathrm{stat}) \pm 0.5(\mathrm{syst})] \times 10^{-5}$ for the ITA and $[1.1^{+0.9}_{-0.8}(\mathrm{stat})^{+0.8}_{-0.5}(\mathrm{syst})] \times 10^{-5}$ for the HTA. As mentioned in Sec. X, the measured branching fraction does not include the contribution from the long-distance double-charged-</p><p>The significance of the observation is determined by evaluating the profile likelihood $L$ for several $\mu$ values. The square root of the difference between the $-2\log L$ values at $\mu = 0$ and at the minimum is used to estimate the significance of the observed excess with respect to the background-only hypothesis, which yields 3.5 standard deviations for the ITA. For the HTA, the observed signal is consistent with the background-only hypothesis at 1.1 standard deviations. Similarly, the square root of the difference between the $-2\log L$ values at $\mu = 1$ and at the minimum is used to estimate the significance of the observed signal with respect to the SM expectation. For the ITA, it is found to be 2.9 standard deviations, indicating a potential deviation from the SM. For the HTA, the result is in agreement with the SM at 0.6 standard deviations.</p><p>Events from the SR of the HTA represent only 2% of the corresponding events in the ITA; their removal does not alter the ITA result significantly. The ITA sample with overlapping events removed is used for the compatibility checks. The ITA and HTA measurements agree, with a difference in signal strength of 1.2 standard deviations.</p></div>
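<div xmlns="http://www.tei-c.org/ns/1.0"><p>The relation between the quoted signal strengths, branching fractions, and significances can be cross-checked with a short calculation. The sketch below converts $\mu$ into a branching fraction using the SM reference value of Sec. X, and approximates the significance by assuming that $-2\log L$ is parabolic below the best-fit value. That parabolic form is an assumption of this sketch; the analysis itself scans the full profile likelihood. The function names are illustrative.</p>

```python
BF_SM = 4.97e-6  # SM reference branching fraction used in the fit (Sec. X)


def branching_fraction(mu):
    """Convert a signal strength into a branching fraction."""
    return mu * BF_SM


def approx_significance(mu_hat, sigma_low, mu_test):
    """Parabolic approximation of sqrt(delta(-2 log L)) at mu_test.

    Assumes -2 log L is quadratic below the best-fit value, with
    sigma_low the downward total uncertainty on mu.
    """
    return abs(mu_hat - mu_test) / sigma_low


results = {"ITA": (5.4, 1.5), "HTA": (2.2, 2.0)}  # (mu_hat, sigma_low)
for name, (mu_hat, sigma_low) in results.items():
    bf = branching_fraction(mu_hat)
    z0 = approx_significance(mu_hat, sigma_low, 0.0)
    z1 = approx_significance(mu_hat, sigma_low, 1.0)
    print(f"{name}: BF = {bf:.2e}, Z(mu=0) = {z0:.1f}, Z(mu=1) = {z1:.1f}")
```

<p>The approximation gives about 3.6 and 2.9 standard deviations for the ITA and 1.1 and 0.6 for the HTA, close to the values obtained from the likelihood scans. The small difference for the ITA at $\mu = 0$ reflects the non-parabolic shape of the actual likelihood.</p></div>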
<div xmlns="http://www.tei-c.org/ns/1.0"><head>XIII. CONSISTENCY CHECKS</head><p>Several checks are performed to test the validity of the analysis.</p><p>Simulation and data events are divided into approximately same-size statistically independent samples (split samples) according to various criteria: data-taking period; FIG. <ref type="bibr">16</ref>. Twice the negative profile log-likelihood ratio as a function of the signal strength &#956; for the ITA, HTA, and the combined result. The value for each scan point is determined by fitting the data, where all parameters but &#956; are varied. FIG. <ref type="figure">15</ref>. Observed yields and fit results in bins of the &#951;&#240;BDT 2 &#222; &#215; q 2 rec space obtained by the ITA simultaneous fit to the off-and onresonance data, corresponding to an integrated luminosity of 42 and 362 fb -1 , respectively. The yields are shown individually for the B &#254; &#8594; K &#254; &#957;&#957; signal, neutral and charged B-meson decays and the sum of the five continuum categories. The yields are obtained in bins of the &#951;&#240;BDT 2 &#222; &#215; q 2 rec space. The pull distributions are shown in the bottom panel. FIG. 18. Distributions of &#951;&#240;BDT 2 &#222;, q 2 rec , beam-constrained mass of the ROE M bc;ROE , &#916;E ROE , Fox-Wolfram R 2 , and modified Fox-Wolfram H so m;2 in data (points with error bars) and simulation (filled histograms) shown individually for the B &#254; &#8594; K &#254; &#957;&#957; signal, neutral and charged B-meson decays, and the sum of the five continuum categories in the ITA. Events in the most signal-rich region, with &#951;&#240;BDT 2 &#222; &gt; 0.98, are shown. Simulated samples are normalized according to the fit yields in the ITA. 
The pull distributions are shown in the bottom panels.</p><p>missing-momentum direction; momentum of the rest-ofevent particles; number of photons, charged particles, and lepton candidates in the event; kaon direction; kaon charge; and total charge of the reconstructed particles in the event. Fits are performed for each split sample, and the results are presented in Fig. <ref type="figure">21</ref>.</p><p>Good compatibility is observed between the split samples for the HTA. A tension at 2.4 standard deviations is observed for the total charge split sample in the ITA. Several studies are conducted to investigate this tension, but they did not reveal any significant systematic effects. The total &#967; 2 value per degrees of freedom for all tests in the ITA is 12.5=9.</p><p>An important test involves the subdivision based on the number of leptons in the ITA. Since there are no leptons on the signal side, this test compares events in which a (semi) leptonic B decay occurs in the ROE with those in which a hadronic B decay occurs. The separation is confirmed by inspecting simulated events. Excellent agreement is observed between the results in the two split samples. This demonstrates the robustness of the ITA procedure with respect to a particular signature in the ROE.</p><p>For each common split sample, a comparison is also performed between the ITA and the HTA, showing compatibility between 1 and 2 standard deviations.</p><p>An ITA fit fixing the normalization of the B background to the expectation and the normalization of the continuum to the yield observed in off-resonance data yields a reduction of the uncertainty on &#956; by 25% with a downward change in &#956; that is consistent with zero at 1.5 standard deviations. Performing a fit where the 50% constraints on the normalizations of all background sources are released leads to a minimal change of &#956; by 0.1, with the uncertainty on &#956; increasing by only 5%. 
Another fit in which the leading systematic uncertainties are fixed also gives a consistent result. A fit to the 12 bins of &#978;(4S) data only, i.e., excluding the off-resonance data, changes &#956; by less than 0.1, while the uncertainty increases by 2%. Similarly, a fit restricted to the 18 bins with &#951;(BDT₂) &gt; 0.94 yields a change in &#956; of less than 0.1, while the uncertainty increases by 3%. Additional fits are conducted to study the stability of the result with respect to q²_rec. In these fits, the B background normalization is fixed to its expected value due to increased uncertainties, and the normalization of the continuum is set based on the yield observed in off-resonance data. The fits are separately performed for the low (q²_rec &lt; 4 GeV²/c⁴) and high (q²_rec &gt; 4 GeV²/c⁴) SR bins. The results from these fits are consistent within 1.4 standard deviations.</p><p>The ITA method is further validated by performing a branching fraction measurement of the B⁺ &#8594; &#960;⁺K⁰ decay. This decay is reconstructed by measuring the recoil of the &#960;⁺, while the K⁰ is not directly detected. In this case, the</p><p>, with comparable selection efficiency and purity. The known branching fraction, measured using K⁰_S in the final state, is (2.34 ± 0.08) &#215; 10⁻⁵ <ref type="bibr">[14]</ref>. 
With respect to the nominal B⁺ &#8594; K⁺&#957;&#957;&#772; analysis, the following modifications are implemented for this validation: (i) positive pion identification is used instead of kaon identification; (ii) a bin boundary of the SR in q²_rec is changed from 4 GeV²/c⁴ to 2 GeV²/c⁴ to increase sensitivity; (iii) the fit model uses only three sources of background (continuum, neutral B decays, charged B decays excluding B⁺ &#8594; &#960;⁺K⁰), and the signal B⁺ &#8594; &#960;⁺K⁰ decays; (iv) systematic uncertainties are restricted to those originating from limited sizes of the simulated samples and global normalization uncertainties; (v) the fit is restricted to the data sample collected at the &#978;(4S) resonance.</p><p>Based on the simulation, 80% of K⁰ within the SR are K⁰_L while the remaining 20% are K⁰_S. The B⁺ &#8594; &#960;⁺K⁰ SR corresponds to a signal-selection efficiency of 4.4% with 0.9% purity, which can be compared to the corresponding values of 8% and 0.9% for B⁺ &#8594; K⁺&#957;&#957;&#772;. However, the yield is almost 3 times higher, providing a sensitive test of the SR modeling. The fit quality is good, with a p-value of 83%. The branching fraction of the B⁺ &#8594; &#960;⁺K⁰ decay is found to be (2.5 ± 0.5) &#215; 10⁻⁵, consistent with the known value. The distribution of q²_rec with the background and signal components normalized using the fit result is shown in Fig. <ref type="figure">22</ref>. The distribution of q²_rec for events with &#951;(BDT₂) &gt; 0.98 is shown in Supplemental Material <ref type="bibr">[36]</ref>.</p></div>
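<div xmlns="http://www.tei-c.org/ns/1.0"><p>The statement that the measured B⁺ &#8594; &#960;⁺K⁰ branching fraction is consistent with the known value can be illustrated by expressing the difference in units of its combined uncertainty. The following is a minimal sketch, assuming Gaussian and uncorrelated uncertainties; the function name pull is illustrative and is not part of the analysis software.</p>
<p>
```python
import math


def pull(value_a, sigma_a, value_b, sigma_b):
    """Difference of two measurements in units of its combined uncertainty.

    Assumption of this sketch: both uncertainties are Gaussian and
    uncorrelated, so they add in quadrature.
    """
    return (value_a - value_b) / math.hypot(sigma_a, sigma_b)


# Values quoted in the text, in units of 1e-5.
measured, measured_err = 2.5, 0.5   # ITA validation fit
known, known_err = 2.34, 0.08       # known value, Ref. [14]

n_sigma = pull(measured, measured_err, known, known_err)
print(f"difference = {n_sigma:.2f} standard deviations")
```
</p><p>Under these assumptions the two values differ by well under one standard deviation, in line with the consistency stated above.</p></div>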
<div xmlns="http://www.tei-c.org/ns/1.0"><head>XIV. COMBINATION</head><p>The consistency of the two analyses and the small size of the overlap between the HTA and ITA samples allow the results to be combined, which achieves a 10% increase in precision over the ITA result alone. This is done through a profile likelihood fit, incorporating correlations between common systematic uncertainties. In order to eliminate statistical correlation, common data events are excluded from the ITA dataset prior to the combination. Nuisance parameters corresponding to the number of B B&#772; events, signal form factors, and branching fractions for processes</p><p>, and other leading B-meson decays are treated as fully correlated. To capture full correlations for the systematic uncertainties related to the branching fractions of leading B-meson decays between the ITA and HTA, an eigendecomposition of the shared covariance matrix between ITA and HTA is performed, and the result is represented using ten nuisance parameters.</p><p>Conversely, other sources are considered uncorrelated due to their analysis-specific nature, distinct evaluation methods, or minor impact, such as PID uncertainties.</p>
<p>FIG. <ref type="figure">19</ref>. Observed yields and fit results in bins of &#951;(BDTh) as obtained by the HTA fit, corresponding to an integrated luminosity of 362 fb⁻¹. The yields are shown for the B⁺ &#8594; K⁺&#957;&#957;&#772; signal and the three background categories (B B&#772; decays, c c&#772; continuum, and light-quark continuum). The pull distribution is shown in the bottom panel.</p>
<p>In order to ensure robustness, various scenarios are studied, including variations in which sources, such as global background normalization, are assumed to be fully correlated between the two analyses. 
These tests yield no substantial deviation from the default combination.</p><p>The combined result for the signal strength is &#956; = 4.6 ± 1.0(stat) ± 0.9(syst) = 4.6 ± 1.3, corresponding to a branching fraction of the B⁺ &#8594; K⁺&#957;&#957;&#772; decay of [2.3 ± 0.5(stat) +0.5/-0.4(syst)] &#215; 10⁻⁵ = (2.3 ± 0.7) &#215; 10⁻⁵. The significance with respect to the background-only hypothesis is found to be 3.5 standard deviations. The combined result is 2.7 standard deviations above the SM expectation.</p></div>
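<div xmlns="http://www.tei-c.org/ns/1.0"><p>The numbers in the combined result can be related by two short pieces of arithmetic. This is a reader's sketch and not the fit itself: it assumes that the statistical and systematic uncertainties add in quadrature, and that &#956; multiplies an SM reference branching fraction of 4.97 &#215; 10⁻⁶, a normalization that is not stated in this section and is taken here as an assumption.</p>
<p>
```latex
\sigma_\mu = \sqrt{\sigma_{\rm stat}^2 + \sigma_{\rm syst}^2}
           = \sqrt{1.0^2 + 0.9^2} \approx 1.3,
\qquad
\mathcal{B}(B^+ \to K^+ \nu\bar{\nu}) = \mu \, \mathcal{B}_{\rm SM}
           \approx 4.6 \times 4.97 \times 10^{-6} \approx 2.3 \times 10^{-5}.
```
</p></div>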
<div xmlns="http://www.tei-c.org/ns/1.0"><head>XV. DISCUSSION</head><p>The measured branching fraction is compared with previous measurements in Fig. <ref type="figure">23</ref>. The comparison is performed using branching fractions from prior measurements to assess both compatibility and relative accuracy. For BABAR, the branching fractions are taken as given in Refs. <ref type="bibr">[10,</ref><ref type="bibr">11]</ref>. Since Belle did not report branching fractions in Refs. <ref type="bibr">[9,</ref><ref type="bibr">12]</ref>, they are computed for this comparison based on the quoted observed number of events and efficiency, taking into account statistical and systematic uncertainties. Note that BABAR uses f⁺⁻ = 0.5, which differs from the value adopted here. However, due to the large statistical uncertainties, minor differences in the correction factors have a small impact on the comparison of the results.</p><p>The ITA result is in agreement with the previous measurements obtained using hadronic and inclusive tagging methods. There are tensions of 2.3 and 1.8 standard deviations with the results obtained using semileptonic tagging by the BABAR <ref type="bibr">[11]</ref> and Belle <ref type="bibr">[12]</ref> Collaborations, respectively. The HTA result is in agreement with all measurements. The precision of the ITA measurement is comparable with the previous best results, despite being obtained with a smaller data sample. The precision of the HTA result exceeds that achieved by previous analyses using hadronic tagging. The combined Belle II result has comparable accuracy to the best single measurement, reported by Belle using semileptonic tags.</p><p>A simplified weighted average of the five independent measurements, obtained using symmetrized uncertainties (see Fig. 
<ref type="figure">23</ref>), yields a branching fraction of (1.3 ± 0.4) &#215; 10⁻⁵, with a &#967;² per degree of freedom of 5.6/5, corresponding to a p-value of 35%.</p><p>The analysis was initially performed in a manner designed to reduce experimenter's bias. The full analysis procedure was developed and finalized before determining the branching fraction from data. However, several checks and corrections were applied after the result was obtained. The original measurement was initially limited to the ITA and optimized through simulation using a partial data set of 189 fb⁻¹. In spring 2022, a fit to the data revealed a significant deviation from the expectations of the SM. To validate the findings, the ITA was repeated using a larger data sample while maintaining the selection criteria employed in the original measurement. As an additional consistency check, the HTA was introduced. The new analyses underwent rigorous consistency checks before the signal strength was once again unveiled in spring 2023. The ITA and HTA results were found to be in agreement, confirming the results of the original 2022 analysis. Further comprehensive checks were conducted in PID sidebands, leading to changes in background modeling and an increase in systematic uncertainties.</p>
<p>FIG. 22. Distribution of q²_rec for ITA events in the pion-enriched sample and populating the &#951;(BDT₂) &gt; 0.92 bins. The yields of simulated background and signal components are normalized based on the fit results to determine the branching fraction of the B⁺ &#8594; &#960;⁺K⁰ decay. The pull distribution is shown in the bottom panel.</p>
<p>The postunveiling changes in the ITA are corrections to the K⁰_L reconstruction efficiency in the ECL and its uncertainty, motivated by the observed excess in the pion-enriched sample (Sec. V C); correction to the rate of D-meson decays involving K⁰_L and its uncertainty (Sec. 
IX B 2); and corrections to the B⁺ &#8594; K⁺K⁰K&#772;⁰ decay modeling and corresponding uncertainty (Sec. IX B 3). In addition, the treatment of the reconstructed hadronic energy in the ECL was adjusted. Instead of keeping the scale at the nominal value, it is now adjusted to the preferred value while keeping the 100% uncertainty (Sec. V C). These modifications lead to a shift of the signal strength &#956; in the ITA of about -0.5. A mistake was found in the treatment of the B⁺ &#8594; &#964;⁺(&#8594; K⁺&#957;&#772;)&#957; background, which was accidentally removed from the simulation. The mistake was corrected, yielding a -0.15 change in &#956;. Given updates of the input variables, a new training of the BDT₂ was performed that led to an additional -0.5 change in &#956; with a statistical uncertainty of 0.6 estimated using simulated experiments. The modifications lead to an increase of the total uncertainty by 10%, driven by the uncertainty on the</p><p>The HTA is based on a standard FEI data selection that is widely used within Belle II. During the review of another Belle II analysis <ref type="bibr">[49]</ref>, it was concluded that it is necessary to remove selection criteria on the total energy in the ECL, which is poorly modeled in simulation. The selection was removed and the BDTh was then retrained on the newly selected samples. This modification changed the signal strength by -2.6. Additional HTA changes include systematic uncertainty due to the K⁰_L reconstruction efficiency in the ECL (Sec. V C); correction to the rate of D-meson decays involving K⁰_L and its uncertainty (Sec. IX B 2); corrections to the B⁺ &#8594; K⁺K⁰_L K⁰_L decay modeling and corresponding uncertainty (Sec. IX B 3). Dedicated studies were performed targeting the E_extra variable that is correlated with the total energy in the ECL, as described in Sec. V C, resulting in a data-driven correction and additional systematic uncertainty. 
These changes shifted the signal strength by -1.1, with a statistical uncertainty of 1.2 estimated using simulated experiments, which accounts for both data and simulated samples. The previously underestimated contributions from B⁺ &#8594; K⁺K⁰_L K⁰_L and D &#8594; K⁰_L X background shift the signal strength by -0.6. Taking this reduction and the estimate of the statistical uncertainty into account, the significance of the change in &#956; is 1.9 standard deviations. The total uncertainty for the HTA is reduced by about 20%. The increase in the systematic uncertainty, also observed in ITA, is compensated by an increase in the data-sample size due to changes in the FEI selection.</p></div>
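<div xmlns="http://www.tei-c.org/ns/1.0"><p>The p-value quoted in this section for the simplified weighted average follows from the stated &#967;² of 5.6 for 5 degrees of freedom. The sketch below evaluates the upper-tail probability of a &#967;² distribution with the standard closed-form series for integer degrees of freedom, so that no external library is needed; the function name chi2_p_value is illustrative and is not taken from the analysis.</p>
<p>
```python
import math


def chi2_p_value(chi2, ndf):
    """Upper-tail probability of a chi-square distribution with integer ndf."""
    half = chi2 / 2.0
    if ndf % 2 == 0:
        # Even ndf: exp(-x/2) * sum over j from 0 to ndf/2 - 1 of (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, ndf // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd ndf: erfc(sqrt(x/2))
    #          + sqrt(2x/pi) * exp(-x/2) * sum over j of x^j / (1*3*...*(2j+1))
    term, total = 1.0, 0.0
    for j in range((ndf - 1) // 2):
        if j > 0:
            term *= chi2 / (2 * j + 1)
        total += term
    prefactor = math.sqrt(2.0 * chi2 / math.pi) * math.exp(-half)
    return math.erfc(math.sqrt(half)) + prefactor * total


# Values quoted in the text for the simplified weighted average.
print(f"p-value = {chi2_p_value(5.6, 5):.2f}")
```
</p><p>The same function can be applied to the other &#967;² values quoted in the paper, provided the number of degrees of freedom is an integer.</p></div>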
<div xmlns="http://www.tei-c.org/ns/1.0"><head>XVI. SUMMARY</head><p>FIG. <ref type="figure">23</ref>. Branching-fraction values measured by Belle II, measured by previous experiments <ref type="bibr">[9]</ref><ref type="bibr">[10]</ref><ref type="bibr">[11]</ref><ref type="bibr">[12]</ref><ref type="bibr">[13]</ref>, and predicted by the SM <ref type="bibr">[4]</ref>. The Belle analyses reported upper limits; the values shown here are computed based on the quoted observed number of events, efficiency, and f⁺⁻ = 0.516. The BABAR results are taken directly from the publications, and they use f⁺⁻ = 0.5. The weighted average is computed assuming symmetrized and uncorrelated uncertainties, excluding the superseded measurement of Belle II (63 fb⁻¹, inclusive) <ref type="bibr">[13]</ref> and the uncombined results of Belle II shown as open data points.</p>
<p>In summary, a search for the rare decay B⁺ &#8594; K⁺&#957;&#957;&#772; is reported using an inclusive tagging approach with data collected by the Belle II detector at the &#978;(4S) resonance, corresponding to an integrated luminosity of 362 fb⁻¹. The search is validated by a well-established approach using hadronic B tagging. The background processes are suppressed by exploiting distinct kinematic properties of the B⁺ &#8594; K⁺&#957;&#957;&#772; decays in a multivariate classifier that is optimized using simulated data. The quality of the simulation is validated using several control channels. A sample-composition fit is used to extract the branching fraction of the B⁺ &#8594; K⁺&#957;&#957;&#772; decay. The branching fraction obtained using the inclusive tagging is (2.7 ± 0.7) &#215; 10⁻⁵. This measurement has a significance of 3.5 standard deviations with respect to the background-only hypothesis and shows a 2.9 standard deviation departure from the standard model expectation. 
The branching fraction obtained using the hadronic tagging is (1.1 +1.2/-1.0) &#215; 10⁻⁵ and is consistent with the inclusive result at 1.2 standard deviations. A combination of the inclusive and hadronic tagging results yields (2.3 ± 0.7) &#215; 10⁻⁵ for the B⁺ &#8594; K⁺&#957;&#957;&#772; decay branching fraction, providing the first evidence of the decay with a significance of 3.5 standard deviations. The combined result shows a departure of 2.7 standard deviations from the standard model expectation.</p><p>Table <ref type="table">III</ref> presents the parameters that are used to train the classifiers BDT₁ and BDT₂ of the ITA. Furthermore, all input variables are listed below. Unless otherwise specified, all variables are measured in the laboratory frame. Each variable is used in BDT₁, BDT₂, or in both BDTs as specified in parentheses. The variable selection is done by iteratively removing variables from the training and checking the impact of their removal on the binary classification performance, measured with the area under the receiver operating characteristic curve <ref type="bibr">[50]</ref>.</p><p>For a given track, the point of closest approach (POCA) is defined as the point on the track that minimizes the distance to a line d passing through the average interaction point (IP) and parallel to the z axis, defined as the symmetry axis of the solenoid. The transverse impact parameter d_r is defined as this minimal distance and the longitudinal impact parameter d_z is defined as the z coordinate of the POCA with respect to the average interaction point <ref type="bibr">[33]</ref>.</p><p>Variables related to the kaon candidate are as follows:</p><p>(i) radial distance between the POCA of the K⁺ candidate track and the IP (BDT₂), (ii) cosine of the angle between the momentum line of the signal-kaon candidate and the z axis (BDT₂). 
Variables related to the kaon candidate do not include q²_rec, because the data are binned in this variable and in BDT₂ in the last stage of the analysis.</p><p>Variables related to the tracks and energy deposits of the rest of the event (ROE) are as follows:</p><p>(i) two variables corresponding to the x and z components of the vector from the average interaction point to the ROE vertex (BDT₂), (ii) p-value of the ROE vertex fit (BDT₂), (iii) variance of the transverse momentum of the ROE tracks (BDT₂), (iv) polar angle of the ROE momentum (BDT₁, BDT₂), (v) magnitude of the ROE momentum (BDT₁, BDT₂), (vi) a modified Fox-Wolfram moment of the "oo" type (see Ref. <ref type="bibr">[38]</ref>), i.e., ROE-ROE, calculated in the c.m. frame (BDT₁, BDT₂), (vii) difference between the ROE energy in the c.m. frame and the energy of one beam in the c.m. frame (√s/2) (BDT₁, BDT₂).</p><p>Variables related to the entire event are as follows:</p><p>(i) number of e± and &#956;± candidates (BDT₂), (ii) number of photon candidates, number of charged particle candidates (BDT₂), (iii) square of charged particles in the event (BDT₂), (iv) cosine of the polar angle of the thrust axis in the c.m. frame (BDT₁, BDT₂), (v) harmonic moments with respect to the thrust axis in the c.m. frame <ref type="bibr">[39]</ref> (BDT₁, BDT₂), (vi) modified Fox-Wolfram moments calculated in the c.m. frame <ref type="bibr">[40]</ref> (BDT₁, BDT₂), (vii) polar angle of the missing three-momentum in the c.m. frame (BDT₂), (viii) square of the missing invariant mass (BDT₂), (ix) event sphericity in the c.m. frame <ref type="bibr">[38]</ref> (BDT₂), (x) normalized Fox-Wolfram moments in the c.m. frame <ref type="bibr">[39]</ref> (BDT₁, BDT₂), (xi) cosine of the angle between the momentum of the signal-kaon and the ROE thrust axis in the c.m. 
frame (BDT₁, BDT₂), (xii) radial and longitudinal distance between the POCA of the K⁺ candidate track and the tag vertex (BDT₂).</p><p>Variables related to the D⁰/D⁺ suppression are built as follows. D⁰ candidates are obtained by fitting the kaon candidate track and each track of opposite charge in the ROE to a common vertex; D⁺ candidates are obtained by fitting the kaon candidate track and two ROE tracks of appropriate charges. In both cases, we choose the candidate having the best vertex fit quality. The related variables are as follows:</p><p>(i) radial distance between the chosen D⁺ candidate vertex and the IP (BDT₂), (ii) &#967;² of the chosen D⁰ candidate vertex fit and the best D⁺ candidate vertex fit (BDT₂), (iii) mass of the chosen D⁰ candidate (BDT₂), (iv) median p-value of the vertex fits of the D⁰ candidates (BDT₂).</p></div>
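<div xmlns="http://www.tei-c.org/ns/1.0"><p>Several of the event-shape inputs listed above are Fox-Wolfram moments. As an illustration of how such a quantity separates jet-like continuum events from more isotropic B-meson events, the sketch below computes the second normalized moment, commonly written R₂ = H₂/H₀, from a list of three-momenta. It is a textbook-style sketch and not the Belle II software implementation; it assumes that the momenta are already given in the c.m. frame and are nonzero, and the function name fox_wolfram_r2 is hypothetical.</p>
<p>
```python
import math


def fox_wolfram_r2(momenta):
    """Ratio H2/H0 of Fox-Wolfram moments for a list of (px, py, pz) momenta.

    H_l sums |p_i| |p_j| P_l(cos theta_ij) over all particle pairs,
    including i == j. The common normalization cancels in the ratio.
    Assumes every momentum has nonzero magnitude.
    """
    h0 = 0.0
    h2 = 0.0
    for a in momenta:
        mag_a = math.sqrt(sum(c * c for c in a))
        for b in momenta:
            mag_b = math.sqrt(sum(c * c for c in b))
            cos_theta = sum(x * y for x, y in zip(a, b)) / (mag_a * mag_b)
            weight = mag_a * mag_b
            h0 += weight                                        # Legendre P_0 = 1
            h2 += weight * 0.5 * (3.0 * cos_theta ** 2 - 1.0)   # Legendre P_2
    return h2 / h0


# Two back-to-back particles (jet-like) and six particles along the axes (isotropic).
jet_like = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
isotropic = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
             (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
print(fox_wolfram_r2(jet_like), fox_wolfram_r2(isotropic))
```
</p><p>In this toy example the back-to-back configuration gives a value of one and the symmetric six-particle configuration gives zero, which is the qualitative behavior exploited for continuum suppression.</p></div>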
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Hadronic tag analysis</head><p>BDTh parameters, reported in Table <ref type="table">IV</ref>, are optimized based on a grid search in the parameter space. The following 12 variables are used as input:</p><p>(i) sum of extra-photon energy deposits in ECL, (ii) number of extra tracks, (iii) sum of the missing energy and absolute missing three-momentum vector, (iv) azimuthal angle between the signal kaon and the missing-momentum vector, (v) cosine of the angle between the momentum direction of the signal-kaon candidate and the thrust axis of the particles comprising the B_tag, the extra tracks, and the extra photons, (vi) modified Fox-Wolfram moments H^so_22, H^so_02, H^oo_0, (vii) invariant mass of the four-momentum difference between the two colliding beams and the signal kaon, (viii) signal probability for the B_tag returned by the FEI algorithm, (ix) p-value of the vertex fit of the signal kaon and one or two tracks in the event to reject fake kaons coming from D⁰ or D⁺ decays.</p>
<table><head>TABLE III. Parameter values of the ITA classifier model [15].</head>
<row role="label"><cell>Parameter</cell><cell>Value</cell></row>
<row><cell>Number of trees</cell><cell>2000</cell></row>
<row><cell>Tree depth</cell><cell>2 (BDT₁), 3 (BDT₂)</cell></row>
<row><cell>Shrinkage</cell><cell>0.2</cell></row>
<row><cell>Sampling rate</cell><cell>0.5</cell></row>
<row><cell>Number of bins</cell><cell>256</cell></row>
</table></div>
		</body>
		</text>
</TEI>
