Search for: All records

Creators/Authors contains: "Chen, S. J."


  1. (Submitted; in review for IEEE ICASSP-2024) (Ed.)
    The Fearless Steps Apollo (FS-APOLLO) resource is a collection of over 150,000 hours of audio, associated metadata, and a supplemental technological toolkit intended to benefit the (i) speech processing technology; (ii) communication science, team-based psychology, and history; and (iii) education/STEM and preservation/archival communities. The FS-APOLLO initiative, which started in 2014, has since resulted in the preservation of over 75,000 hours of NASA Apollo mission audio. Systems created for this audio collection have led to the emergence of several new Speech and Language Technologies (SLT). This paper provides an overview of the latest advancements in the FS-APOLLO effort and explores upcoming strategies in big-data deployment, outreach, and novel avenues of K-12 and STEM education facilitated through this resource.
    Free, publicly-accessible full text available April 16, 2025
  2. INTRODUCTION: CRSS-UTDallas initiated and oversaw the efforts to recover Apollo mission communications by re-engineering the NASA SoundScriber playback system and digitizing 30-channel analog audio tapes, covering the entire Apollo-11, Apollo-13, and Gemini-8 missions, during 2011-17 [1,6]. This vast data resource was made publicly available along with supplemental speech and language technology metadata based on CRSS pipeline diarization transcripts and conversational speaker time-stamps for the Apollo team at the NASA Mission Control Center [2,4]. In 2021, renewed efforts resulted in the digitization of more than 50,000 additional hours of audio from the Apollo 7, 8, 9, 10, and 12 missions and the remaining Apollo-13 tapes. Cumulative digitization efforts have enabled the development of the largest publicly available speech data resource with unprompted, real conversations recorded in naturalistic environments. Deployment of this massive corpus has inspired multiple collaborative initiatives, such as the Web resources ExploreApollo (https://app.exploreapollo.org) and LanguageARC (https://languagearc.com/projects/21) [3]. ExploreApollo.org serves as the visualization and playback tool, and LanguageARC as the crowd-sourced subject content tagging resource developed by undergraduate and graduate students, intended as an educational resource for K-12 students and STEM/Apollo enthusiasts. Significant algorithmic advancements have included advanced deep learning models that improve the quality of automatic transcript generation and even extract high-level knowledge, such as ID labels of the topics being spoken about across different mission stages. Efficient transcript generation and topic extraction tools for this naturalistic audio have wide applications, including content archival and retrieval, speaker indexing, education, and group dynamics and team cohesion analysis. Some of these applications have been deployed in our online portals to provide a more immersive experience for students and researchers.
Continued worldwide outreach in the form of the Fearless Steps Challenges has proven successful, most recently with Phase-4 of the Challenge series. This challenge has motivated research in low-level tasks such as speaker diarization and high-level tasks such as topic identification. IMPACT: Distribution and visualization of the Apollo audio corpus through the above-mentioned online portals and the Fearless Steps Challenges have produced significant impact, both as a STEM education resource for K-12 students and as an SLT development resource with real-world applications for research organizations globally. The speech technologies developed by CRSS-UTDallas using the Fearless Steps Apollo corpus have improved previous benchmarks on multiple tasks [1, 5]. The continued initiative will extend the current digitization efforts to include over 150,000 hours of audio recorded during all Apollo missions. ILLUSTRATION: We will demonstrate the WebExploreApollo and LanguageARC online portals with newly digitized audio playback, in addition to improved SLT baseline systems and the results from ASR and Topic Identification systems, which will include research performed on the conversational corpus. Performance analysis visualizations will also be illustrated. We will also display results from the past challenges and their state-of-the-art system improvements.
  3. Abstract A study of the charge conjugation and parity ($CP$) properties of the interaction between the Higgs boson and $\tau$-leptons is presented. The study is based on a measurement of $CP$-sensitive angular observables defined by the visible decay products of $\tau$-leptons produced in Higgs boson decays. The analysis uses 139 fb$^{-1}$ of proton–proton collision data recorded at a centre-of-mass energy of $\sqrt{s} = 13$ TeV with the ATLAS detector at the Large Hadron Collider. Contributions from $CP$-violating interactions between the Higgs boson and $\tau$-leptons are described by a single mixing angle parameter $\phi_{\tau}$ in the generalised Yukawa interaction. Without constraining the $H \rightarrow \tau\tau$ signal strength to its expected value under the Standard Model hypothesis, the mixing angle $\phi_{\tau}$ is measured to be $9^{\circ} \pm 16^{\circ}$, with an expected value of $0^{\circ} \pm 28^{\circ}$ at the 68% confidence level. The pure $CP$-odd hypothesis is disfavoured at a level of 3.4 standard deviations. The results are compatible with the predictions for the Higgs boson in the Standard Model.
    Free, publicly-accessible full text available July 1, 2024
  4. Abstract A search for heavy Higgs bosons produced in association with a vector boson and decaying into a pair of vector bosons is performed in final states with two leptons (electrons or muons) of the same electric charge, missing transverse momentum and jets. A data sample of proton–proton collisions at a centre-of-mass energy of 13 TeV recorded with the ATLAS detector at the Large Hadron Collider between 2015 and 2018 is used. The data correspond to a total integrated luminosity of 139 fb$^{-1}$. The observed data are in agreement with Standard Model background expectations. The results are interpreted using higher-dimensional operators in an effective field theory. Upper limits on the production cross-section are calculated at 95% confidence level as a function of the heavy Higgs boson’s mass and coupling strengths to vector bosons. Limits are set in the Higgs boson mass range from 300 to 1500 GeV, and depend on the assumed couplings. The highest excluded mass for a heavy Higgs boson with the coupling combinations explored is 900 GeV. Limits on coupling strengths are also provided.
    Free, publicly-accessible full text available July 1, 2024
  5. Abstract A search for Higgs boson pair production in events with two $b$-jets and two $\tau$-leptons is presented, using a proton–proton collision dataset with an integrated luminosity of 139 fb$^{-1}$ collected at $\sqrt{s} = 13$ TeV by the ATLAS experiment at the LHC. Higgs boson pairs produced non-resonantly or in the decay of a narrow scalar resonance in the mass range from 251 to 1600 GeV are targeted. Events in which at least one $\tau$-lepton decays hadronically are considered, and multivariate discriminants are used to reject the backgrounds. No significant excess of events above the expected background is observed in the non-resonant search. The largest excess in the resonant search is observed at a resonance mass of 1 TeV, with a local (global) significance of $3.1\sigma$ ($2.0\sigma$). Observed (expected) 95% confidence-level upper limits are set on the non-resonant Higgs boson pair-production cross-section at 4.7 (3.9) times the Standard Model prediction, assuming Standard Model kinematics, and on the resonant Higgs boson pair-production cross-section at between 21 and 900 fb (12 and 840 fb), depending on the mass of the narrow scalar resonance.
    Free, publicly-accessible full text available July 1, 2024
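As a reading aid for the significances quoted in entry 5, the following minimal sketch converts a significance in standard deviations into a one-sided Gaussian tail probability. The one-sided convention and the function name `one_sided_p_value` are assumptions made for illustration; the abstract does not state how the significances were computed.

```python
import math


def one_sided_p_value(z):
    """One-sided Gaussian tail probability for a significance of z sigma."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Local and global significances quoted in the abstract.
local_p = one_sided_p_value(3.1)
global_p = one_sided_p_value(2.0)

print(f"local  3.1 sigma -> p = {local_p:.2e}")
print(f"global 2.0 sigma -> p = {global_p:.2e}")
```

The global value is larger because it accounts for the chance of a fluctuation appearing anywhere in the scanned mass range, not only at 1 TeV.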
  6. Abstract A combination of measurements of the inclusive top-quark pair production cross-section performed by ATLAS and CMS in proton–proton collisions at centre-of-mass energies of 7 and 8 TeV at the LHC is presented. The cross-sections are obtained using top-quark pair decays with an opposite-charge electron–muon pair in the final state and with data corresponding to an integrated luminosity of about 5 fb$^{-1}$ at $\sqrt{s} = 7$ TeV and about 20 fb$^{-1}$ at $\sqrt{s} = 8$ TeV for each experiment. The combined cross-sections are determined to be $178.5 \pm 4.7$ pb at $\sqrt{s} = 7$ TeV and $243.3_{-5.9}^{+6.0}$ pb at $\sqrt{s} = 8$ TeV with a correlation of 0.41, using a reference top-quark mass value of 172.5 GeV. The ratio of the combined cross-sections is determined to be $R_{8/7} = 1.363 \pm 0.032$. The combined measured cross-sections and their ratio agree well with theory calculations using several parton distribution function (PDF) sets. The values of the top-quark pole mass (with the strong coupling fixed at 0.118) and the strong coupling (with the top-quark pole mass fixed at 172.5 GeV) are extracted from the combined results by fitting a next-to-next-to-leading-order plus next-to-next-to-leading-log QCD prediction to the measurements. Using a version of the NNPDF3.1 PDF set containing no top-quark measurements, the results obtained are $m_t^{\textrm{pole}} = 173.4_{-2.0}^{+1.8}$ GeV and $\alpha_{\textrm{s}}(m_Z) = 0.1170_{-0.0018}^{+0.0021}$.
    Free, publicly-accessible full text available July 1, 2024
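The ratio quoted in entry 6 can be cross-checked from the two combined cross-sections. The sketch below recomputes the central value and, as an assumption of this example only, applies first-order error propagation with the quoted correlation of 0.41 and a symmetrised 8 TeV uncertainty. This simplified estimate is not the procedure used in the combination, so it need not reproduce the quoted uncertainty of 0.032.

```python
import math

# Combined cross-sections in pb, as quoted in the abstract.
sigma_7, err_7 = 178.5, 4.7
sigma_8, err_8_up, err_8_down = 243.3, 6.0, 5.9
correlation = 0.41

# Central value of the 8 TeV / 7 TeV ratio.
ratio = sigma_8 / sigma_7

# Assumption: symmetrise the 8 TeV uncertainty for a simple estimate.
err_8 = 0.5 * (err_8_up + err_8_down)

# First-order propagation for a ratio of two correlated quantities.
rel_7 = err_7 / sigma_7
rel_8 = err_8 / sigma_8
rel_var = rel_7**2 + rel_8**2 - 2.0 * correlation * rel_7 * rel_8
ratio_err = ratio * math.sqrt(rel_var)

print(f"R(8/7) = {ratio:.3f}, simplified uncertainty estimate = {ratio_err:.3f}")
```

The central value agrees with the quoted 1.363; any difference in the uncertainty reflects the simplifications stated above.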
  7. Abstract The ATLAS experiment at the Large Hadron Collider has a broad physics programme ranging from precision measurements to direct searches for new particles and new interactions, requiring ever larger and ever more accurate datasets of simulated Monte Carlo events. Detector simulation with Geant4 is accurate but requires significant CPU resources. Over the past decade, ATLAS has developed and utilized tools that replace the most CPU-intensive component of the simulation—the calorimeter shower simulation—with faster simulation methods. Here, AtlFast3, the next generation of high-accuracy fast simulation in ATLAS, is introduced. AtlFast3 combines parameterized approaches with machine-learning techniques and is deployed to meet current and future computing challenges, and simulation needs of the ATLAS experiment. With highly accurate performance and significantly improved modelling of substructure within jets, AtlFast3 can simulate large numbers of events for a wide range of physics processes. 
  8. Abstract The accurate simulation of additional interactions at the ATLAS experiment for the analysis of proton–proton collisions delivered by the Large Hadron Collider presents a significant challenge to the computing resources. During the LHC Run 2 (2015–2018), there were up to 70 inelastic interactions per bunch crossing, which need to be accounted for in Monte Carlo (MC) production. In this document, a new method to account for these additional interactions in the simulation chain is described. Instead of sampling the inelastic interactions and adding their energy deposits to a hard-scatter interaction one-by-one, the inelastic interactions are presampled, independent of the hard scatter, and stored as combined events. Consequently, for each hard-scatter interaction, only one such presampled event needs to be added as part of the simulation chain. For the Run 2 simulation chain, with an average of 35 interactions per bunch crossing, this new method provides a substantial reduction in MC production CPU needs of around 20%, while reproducing the properties of the reconstructed quantities relevant for physics analyses with good accuracy. 
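The presampling idea in entry 8 can be sketched with a toy model. Everything below is hypothetical and only illustrates the bookkeeping: the eight-cell "calorimeter", the exponential energy deposits, the library size of 100, and all function names are assumptions of this example, not part of the ATLAS software.

```python
import math
import random

N_CELLS = 8  # toy detector with 8 readout cells (illustrative assumption)


def poisson(rng, mean):
    """Draw a Poisson-distributed integer using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def min_bias_event(rng):
    """Toy energy deposits of one inelastic interaction, one value per cell."""
    return [rng.expovariate(1.0) for _ in range(N_CELLS)]


def add_deposits(first, second):
    """Cell-by-cell sum of two sets of energy deposits."""
    return [a + b for a, b in zip(first, second)]


def presample_pileup(rng, mean_interactions):
    """Combine a Poisson number of inelastic interactions into one stored event."""
    combined = [0.0] * N_CELLS
    for _ in range(poisson(rng, mean_interactions)):
        combined = add_deposits(combined, min_bias_event(rng))
    return combined


rng = random.Random(0)

# Step 1 (done once, independent of any hard scatter): build the library.
library = [presample_pileup(rng, 35) for _ in range(100)]

# Step 2 (per hard-scatter event): overlay a single presampled event.
hard_scatters = [
    [10.0 if cell == i % N_CELLS else 0.0 for cell in range(N_CELLS)]
    for i in range(5)
]
simulated = [add_deposits(hs, rng.choice(library)) for hs in hard_scatters]
```

In this toy, the saving comes from step 2 needing one addition per hard-scatter event instead of roughly 35, while the cost of step 1 is paid once and shared across all samples that reuse the library.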
  9. Abstract A direct search for Higgs bosons produced via vector-boson fusion and subsequently decaying into invisible particles is reported. The analysis uses 139 fb$^{-1}$ of $pp$ collision data at a centre-of-mass energy of $\sqrt{s} = 13$ TeV recorded by the ATLAS detector at the LHC. The observed numbers of events are found to be in agreement with the background expectation from Standard Model processes. For a scalar Higgs boson with a mass of 125 GeV and a Standard Model production cross section, an observed upper limit of 0.145 is placed on the branching fraction of its decay into invisible particles at 95% confidence level, with an expected limit of 0.103. These results are interpreted in the context of models where the Higgs boson acts as a portal to dark matter, and limits are set on the scattering cross section of weakly interacting massive particles and nucleons. Invisible decays of additional scalar bosons with masses from 50 GeV to 2 TeV are also studied, and the derived upper limits on the cross section times branching fraction decrease with increasing mass from 1.0 pb for a scalar boson mass of 50 GeV to 0.1 pb at a mass of 2 TeV.