Search for: All records

Creators/Authors contains: "Chen, S.-J."


  1. (submitted - in Review for IEEE ICASSP-2024) (Ed.)
    The Fearless Steps Apollo (FS-APOLLO) resource is a collection of over 150,000 hours of audio, associated meta-data, and a supplemental technological toolkit intended to benefit the (i) speech processing technology, (ii) communication science, team-based psychology, and history, and (iii) education/STEM and preservation/archival communities. The FS-APOLLO initiative, which started in 2014, has since resulted in the preservation of over 75,000 hours of NASA Apollo Missions audio. Systems created for this audio collection have led to the emergence of several new Speech and Language Technologies (SLT). This paper provides an overview of the latest advancements in the FS-APOLLO effort and explores upcoming strategies in big-data deployment, outreach, and novel avenues of K-12 and STEM education facilitated through this resource.
  2. INTRODUCTION: CRSS-UTDallas initiated and oversaw the efforts to recover Apollo mission communications by re-engineering the NASA SoundScriber playback system and digitizing 30-channel analog audio tapes, including the entire Apollo-11, Apollo-13, and Gemini-8 missions, during 2011-17 [1,6]. This vast data resource was made publicly available along with supplemental speech and language technology meta-data based on CRSS pipeline diarization transcripts and conversational speaker time-stamps for the Apollo team at the NASA Mission Control Center [2,4]. Renewed efforts in 2021 resulted in the digitization of more than 50,000 additional hours of audio from the Apollo 7, 8, 9, 10, and 12 missions and the remaining Apollo-13 tapes. Cumulative digitization efforts have enabled the development of the largest publicly available speech data resource with unprompted, real conversations recorded in naturalistic environments. Deployment of this massive corpus has inspired multiple collaborative initiatives, such as the web resources ExploreApollo (https://app.exploreapollo.org) and LanguageARC (https://languagearc.com/projects/21) [3]. ExploreApollo.org serves as the visualization and playback tool, and LanguageARC as the crowd-sourced subject content tagging resource; both were developed by undergraduate and graduate students and are intended as an educational resource for K-12 students and STEM/Apollo enthusiasts. Significant algorithmic advancements include deep learning models that improve automatic transcript generation quality and can extract high-level knowledge such as ID labels of the topics being spoken about across different mission stages. Efficient transcript generation and topic extraction tools for this naturalistic audio have wide applications, including content archival and retrieval, speaker indexing, education, and group dynamics and team cohesion analysis. Some of these applications have been deployed in our online portals to provide a more immersive experience for students and researchers.
    Continued worldwide outreach in the form of the Fearless Steps Challenges has proven successful, most recently with Phase-4 of the Challenge series. This challenge has motivated research in low-level tasks such as speaker diarization and high-level tasks such as topic identification.
    IMPACT: Distribution and visualization of the Apollo audio corpus through the above-mentioned online portals and the Fearless Steps Challenges have produced significant impact, both as a STEM education resource for K-12 students and as an SLT development resource with real-world applications for research organizations globally. The speech technologies developed by CRSS-UTDallas using the Fearless Steps Apollo corpus have improved previous benchmarks on multiple tasks [1, 5]. The continued initiative will extend the current digitization efforts to include over 150,000 hours of audio recorded during all Apollo missions.
    ILLUSTRATION: We will demonstrate the ExploreApollo and LanguageARC online portals with newly digitized audio playback, in addition to improved SLT baseline systems and the results from ASR and Topic Identification systems, which will include research performed on the conversational corpus. Performance analysis visualizations will also be illustrated. We will also display results from the past challenges and their state-of-the-art system improvements.
  3. This report presents a comprehensive collection of searches for new physics performed by the ATLAS Collaboration during the Run 2 period of data taking at the Large Hadron Collider, from 2015 to 2018, corresponding to about 140 fb$$^{-1}$$ of $$\sqrt{s}=13$$ TeV proton-proton collision data. These searches cover a variety of beyond-the-Standard-Model topics such as dark matter candidates, new vector bosons, hidden-sector particles, leptoquarks, or vector-like quarks, among others. Searches for supersymmetric particles or extended Higgs sectors are explicitly excluded, as these are the subject of separate reports by the Collaboration. For each topic, the most relevant searches are described, focusing on their importance and sensitivity and, when appropriate, highlighting the experimental techniques employed. In addition to the description of each analysis, complementary searches are compared, and the overall sensitivity of the ATLAS experiment to each type of new physics is discussed. Summary plots and statistical combinations of multiple searches are included whenever possible.
    Free, publicly-accessible full text available April 22, 2026
  4. The ATLAS experiment has developed extensive software and distributed computing systems for Run 3 of the LHC. These systems are described in detail, including software infrastructure and workflows, distributed data and workload management, database infrastructure, and validation. The use of these systems to prepare the data for physics analysis and assess its quality is described, along with the software tools used for data analysis itself. An outlook for the development of these projects towards Run 4 is also provided.
    Free, publicly-accessible full text available March 6, 2026
  5. A search is performed for dark matter particles produced in association with a resonantly produced pair of b-quarks with $$30 < m_{bb} < 150$$ GeV using 140 fb$$^{-1}$$ of proton-proton collisions at a center-of-mass energy of 13 TeV recorded by the ATLAS detector at the LHC. This signature is expected in extensions of the Standard Model predicting the production of dark matter particles, in particular those containing a dark Higgs boson $$s$$ that decays into $$b\bar{b}$$. The highly boosted $$s \to b\bar{b}$$ topology is reconstructed using jet reclustering and a new identification algorithm. This search places stringent constraints across regions of the dark Higgs model parameter space that satisfy the observed relic density, excluding dark Higgs bosons with masses between 30 and 150 GeV in benchmark scenarios with $$Z'$$ mediator masses up to 4.8 TeV at 95% confidence level.
    Free, publicly-accessible full text available March 1, 2026
  6. This paper presents a search for exotic decays of the Higgs boson into a pair of new pseudoscalar particles, $$H \to aa$$, where one pseudoscalar decays into a b-quark pair and the other decays into a τ-lepton pair, in the mass range $$12 \le m_a \le 60$$ GeV. The analysis uses pp collision data at $$\sqrt{s} = 13$$ TeV collected with the ATLAS detector at the LHC, corresponding to an integrated luminosity of 140 fb$$^{-1}$$. No significant excess above the Standard Model (SM) prediction is observed. Assuming the SM Higgs boson production cross section, the search sets upper limits at 95% confidence level on the branching ratio of Higgs boson decays, BR($$H \to aa \to bb\tau\tau$$), between 2.2% and 3.9% depending on the pseudoscalar mass.
    Free, publicly-accessible full text available September 1, 2025
  7. Free, publicly-accessible full text available August 20, 2025
  8. A search for the nonresonant production of Higgs boson pairs in the $$HH \to b\bar{b}\tau^{+}\tau^{-}$$ channel is performed using 140 fb$$^{-1}$$ of proton-proton collisions at a center-of-mass energy of 13 TeV recorded by the ATLAS detector at the CERN Large Hadron Collider. The analysis strategy is optimized to probe anomalous values of the Higgs boson self-coupling modifier $$\kappa_{\lambda}$$ and of the quartic $$HHVV$$ ($$V = W, Z$$) coupling modifier $$\kappa_{2V}$$. No significant excess above the expected background from Standard Model processes is observed. An observed (expected) upper limit $$\mu_{HH} < 5.9$$ (3.3) is set at 95% confidence level on the Higgs boson pair production cross section normalized to its Standard Model prediction. The coupling modifiers are constrained to an observed (expected) 95% confidence interval of $$-3.1 < \kappa_{\lambda} < 9.0$$ ($$-2.5 < \kappa_{\lambda} < 9.3$$) and $$-0.5 < \kappa_{2V} < 2.7$$ ($$-0.2 < \kappa_{2V} < 2.4$$), assuming all other Higgs boson couplings are fixed to the Standard Model prediction. The results are also interpreted in the context of effective field theories via constraints on anomalous Higgs boson couplings and Higgs boson pair production cross sections assuming different kinematic benchmark scenarios. © 2024 CERN, for the ATLAS Collaboration
    Free, publicly-accessible full text available August 1, 2025
  9. A search for leptoquark pair production decaying into $$te^{-}\bar{t}e^{+}$$ or $$t\mu^{-}\bar{t}\mu^{+}$$ in final states with multiple leptons is presented. The search is based on a dataset of pp collisions at $$\sqrt{s}=13~\text{TeV}$$ recorded with the ATLAS detector during Run 2 of the Large Hadron Collider, corresponding to an integrated luminosity of 139 fb$$^{-1}$$. Four signal regions, with the requirement of at least three light leptons (electron or muon) and at least two jets out of which at least one jet is identified as coming from a b-hadron, are considered based on the number of leptons of a given flavour. The main background processes are estimated using dedicated control regions in a simultaneous fit with the signal regions to data. No excess above the Standard Model background prediction is observed and 95% confidence level limits on the production cross section times branching ratio are derived as a function of the leptoquark mass. Under the assumption of exclusive decays into $$te^{-}$$ ($$t\mu^{-}$$), the corresponding lower limit on the scalar mixed-generation leptoquark mass $$m_{\textrm{LQ}_{\textrm{mix}}^{\textrm{d}}}$$ is at 1.58 (1.59) TeV and on the vector leptoquark mass $$m_{{\tilde{U}}_1}$$ at 1.67 (1.67) TeV in the minimal coupling scenario and at 1.95 (1.95) TeV in the Yang–Mills scenario.
    Free, publicly-accessible full text available August 1, 2025
  10. Several processes studied by the ATLAS experiment at the Large Hadron Collider produce low-momentum b-flavored hadrons in the final state. This paper describes the calibration of a dedicated tagging algorithm that identifies b-flavored hadrons outside of hadronic jets by reconstructing the soft secondary vertices originating from their decays. The calibration is based on a proton-proton collision dataset at a center-of-mass energy of 13 TeV corresponding to an integrated luminosity of 140 fb$$^{-1}$$. Scale factors used to correct the algorithm's performance in simulated events are extracted for the b-tagging efficiency and the mistag rate of the algorithm using a data sample enriched in $$t\bar{t}$$ events. Several orthogonal measurement regions are defined, binned as a function of the multiplicities of soft secondary vertices and jets containing a b-flavored hadron in the event. The mistag rate scale factors are estimated separately for events with low and high average numbers of interactions per bunch crossing. The results, which are derived from events with low missing transverse momentum, are successfully validated in a phase space characterized by high missing transverse momentum and therefore are applicable to new physics searches carried out in either phase space regime.
    Free, publicly-accessible full text available August 1, 2025
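
    Illustrative note on item 10: a scale factor of the kind mentioned in that abstract is commonly defined as the ratio of a tagging efficiency measured in data to the same efficiency in simulation. The sketch below shows only that ratio with a simple binomial uncertainty. It is not the paper's method (the abstract describes an extraction in several measurement regions of a sample enriched in top-quark pair events), and the function name and all counts are hypothetical values chosen for illustration.

```python
import math


def scale_factor(pass_data, total_data, pass_sim, total_sim):
    """Return (SF, uncertainty) with SF = efficiency_data / efficiency_sim.

    Hypothetical helper for illustration; inputs are raw event counts.
    """
    eff_data = pass_data / total_data
    eff_sim = pass_sim / total_sim

    # Binomial standard error on each efficiency.
    err_data = math.sqrt(eff_data * (1.0 - eff_data) / total_data)
    err_sim = math.sqrt(eff_sim * (1.0 - eff_sim) / total_sim)

    sf = eff_data / eff_sim
    # Relative uncertainties of a ratio add in quadrature
    # (assumes the data and simulation samples are independent).
    rel = math.hypot(err_data / eff_data, err_sim / eff_sim)
    return sf, sf * rel


# Made-up counts: 450 of 1000 data events tagged, 500 of 1000 simulated events tagged.
sf, err = scale_factor(450, 1000, 500, 1000)
print(f"SF = {sf:.3f} +/- {err:.3f}")
```

    A scale factor below one, as in this made-up example, would mean the simulation overestimates the efficiency, so simulated events would be weighted down accordingly.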