Title: Dynamic off‐resonance correction for spiral real‐time MRI of speech
Purpose

To improve the depiction and tracking of vocal tract articulators in spiral real‐time MRI (RT‐MRI) of speech production by estimating and correcting for dynamic changes in off‐resonance.

Methods

The proposed method computes a dynamic field map from the phase of single‐TE dynamic images after a coil phase compensation where complex coil sensitivity maps are estimated from the single‐TE dynamic scan itself. This method is tested using simulations and in vivo data. The depiction of air–tissue boundaries is evaluated quantitatively using a sharpness metric and visual inspection.
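The phase-to-frequency step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the receiver (coil) phase is already available as an array, whereas the paper estimates complex coil sensitivity maps from the dynamic scan itself. The function name, the sign convention (phase = 2π·f·TE), and the 1 ms echo time are assumptions chosen for the example.

```python
import numpy as np

def field_map_from_phase(image, coil_phase, te_s):
    """Estimate off-resonance (Hz) from one single-TE complex image.

    image      : complex 2D array (one dynamic frame)
    coil_phase : real 2D array, receiver phase in radians (assumed known here)
    te_s       : echo time in seconds (hypothetical value in the demo below)
    """
    # Remove the receiver phase so the residual phase reflects off-resonance.
    residual = np.angle(image * np.exp(-1j * coil_phase))
    # Assumed convention: phase = 2*pi*f*TE. No unwrapping is done,
    # so the estimate is only unambiguous while |f| < 1 / (2 * TE).
    return residual / (2.0 * np.pi * te_s)

# Synthetic check with made-up numbers.
te = 1e-3
f_true = np.array([[0.0, 50.0], [-100.0, 200.0]])   # Hz
coil = np.array([[0.3, -0.2], [1.0, 0.5]])          # radians
img = np.exp(1j * (2.0 * np.pi * f_true * te + coil))
f_est = field_map_from_phase(img, coil, te)
```

With a single echo, any phase that is not removed by the coil compensation is attributed to off-resonance, which is why the compensation step matters.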

Results

Simulations demonstrate that the proposed method provides robust off‐resonance correction for spiral readout durations up to 5 ms at 1.5 T. In vivo experiments during human speech production demonstrate that image sharpness is improved in a majority of data sets at air–tissue boundaries including the upper lip, hard palate, soft palate, and tongue boundaries, whereas the lower lip shows little improvement in edge sharpness after correction.

Conclusion

Dynamic off‐resonance correction is feasible from single‐TE spiral RT‐MRI data, and provides a practical performance improvement in articulator sharpness when applied to speech production imaging.

 
NSF-PAR ID:
10066093
Author(s) / Creator(s):
 ;  ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Magnetic Resonance in Medicine
Volume:
81
Issue:
1
ISSN:
0740-3194
Page Range / eLocation ID:
p. 234-246
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Purpose

    To develop and evaluate a fast and effective method for deblurring spiral real‐time MRI (RT‐MRI) using convolutional neural networks.

    Methods

    We demonstrate a 3‐layer residual convolutional neural network to correct image domain off‐resonance artifacts in speech production spiral RT‐MRI without knowledge of field maps. The architecture is motivated by traditional deblurring approaches. Spatially varying off‐resonance blur is synthetically generated by using discrete object approximation and field maps with data augmentation from a large database of 2D human speech production RT‐MRI. The effects of off‐resonance range, shift‐invariance of blur, and readout duration on deblurring performance are investigated. The proposed method is validated using synthetic and real data with longer readouts, quantitatively using image quality metrics and qualitatively via visual inspection, and with a comparison to conventional deblurring methods.

    Results

    Deblurring performance was found to be superior to a current autocalibrated method for in vivo data and only slightly worse than an ideal reconstruction with perfect knowledge of the field map for synthetic test data. Convolutional neural network deblurring made it possible to visualize articulator boundaries with readouts up to 8 ms at 1.5 T, which is 3‐fold longer than the current standard practice. The computation time was 12.3 ± 2.2 ms per frame, enabling low‐latency processing for RT‐MRI applications.

    Conclusion

    Convolutional neural network deblurring is a practical, efficient, and field map‐free approach for the deblurring of spiral RT‐MRI. In the context of speech production imaging, this can enable a 1.7‐fold improvement in scan efficiency and the use of spiral readouts at higher field strengths such as 3 T.

     
  2. Purpose

    To provide 3D real‐time MRI of speech production with improved spatio‐temporal sharpness using randomized, variable‐density, stack‐of‐spiral sampling combined with a 3D spatio‐temporally constrained reconstruction.

    Methods

    We evaluated five candidate (k,t) sampling strategies using a previously proposed gradient‐echo stack‐of‐spiral sequence and a 3D constrained reconstruction with spatial and temporal penalties. Regularization parameters were chosen by expert readers based on qualitative assessment. We experimentally determined the effect of spiral angle increment and kz temporal order. The strategy yielding highest image quality was chosen as the proposed method. We evaluated the proposed and original 3D real‐time MRI methods in 2 healthy subjects performing speech production tasks that invoke rapid movements of articulators seen in multiple planes, using interleaved 2D real‐time MRI as the reference. We quantitatively evaluated tongue boundary sharpness in three locations at two speech rates.

    Results

    The proposed data‐sampling scheme uses a golden‐angle spiral increment in the kx‐ky plane and variable‐density, randomized encoding along kz. It provided a statistically significant improvement in tongue boundary sharpness score (P < .001) in the blade, body, and root of the tongue during normal and 1.5‐times speeded speech. Qualitative improvements were substantial during natural speech tasks of alternating high and low tongue postures during vowels. The proposed method was also able to capture complex tongue shapes during fast alveolar consonant segments. Furthermore, the proposed scheme allows flexible retrospective selection of temporal resolution.
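A sampling order of this kind can be sketched as below. This is an illustrative sketch only: the abstract does not give the exact golden‐angle value, the kz density function, or the random number scheme, so the 137.51° increment, the Gaussian kz weighting, its `width` parameter, and the function names are all assumptions.

```python
import numpy as np

# Assumed golden-angle increment: 360 * (1 - 1/phi), about 137.51 degrees.
GOLDEN_ANGLE_DEG = 360.0 * (1.0 - 2.0 / (1.0 + np.sqrt(5.0)))

def spiral_angles(n_arms):
    """Rotation angle (degrees) of each successive spiral arm in the kx-ky plane."""
    return (np.arange(n_arms) * GOLDEN_ANGLE_DEG) % 360.0

def random_kz_order(n_arms, n_kz, rng, width=0.35):
    """Draw a kz partition index per arm, favouring the centre of kz.

    The Gaussian weighting and `width` are hypothetical choices standing in
    for the unspecified variable-density profile.
    """
    kz = np.arange(n_kz) - (n_kz - 1) / 2.0
    weights = np.exp(-0.5 * (kz / (width * n_kz)) ** 2)
    weights /= weights.sum()
    return rng.choice(n_kz, size=n_arms, p=weights)

rng = np.random.default_rng(0)
angles = spiral_angles(8)
kz_indices = random_kz_order(8, n_kz=12, rng=rng)
```

Because consecutive golden‐angle arms never repeat an orientation, any contiguous window of arms covers the kx‐ky plane fairly evenly, which is what makes retrospective selection of temporal resolution possible.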

    Conclusion

    We have demonstrated improved 3D real‐time MRI of speech production using randomized, variable‐density, stack‐of‐spiral sampling with a 3D spatio‐temporally constrained reconstruction.

     
  3. Abstract

    Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is, however, limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. The imaging of the rapidly moving articulators and dynamic airway shaping during speech demands high spatio-temporal resolution and robust reconstruction methods. Further, while reconstructed images have been published, to date there is no open dataset providing raw multi-coil RT-MRI data from an optimized speech production experimental setup. Such datasets could enable new and improved methods for dynamic image reconstruction, artifact correction, feature extraction, and direct extraction of linguistically-relevant biomarkers. The present dataset offers a unique corpus of 2D sagittal-view RT-MRI videos along with synchronized audio for 75 participants performing linguistically motivated speech tasks, alongside the corresponding public domain raw RT-MRI data. The dataset also includes 3D volumetric vocal tract MRI during sustained speech sounds and high-resolution static anatomical T2-weighted upper airway MRI for each participant.

     
  4. Purpose

    To demonstrate a tagging method compatible with RT‐MRI for the study of speech production.

    Methods

    Tagging is applied as a brief interruption to a continuous real‐time spiral acquisition. Tagging can be initiated manually by the operator, cued to the speech stimulus, or be automatically applied with a fixed frequency. We use a standard 2D 1‐3‐3‐1 binomial SPAtial Modulation of Magnetization (SPAMM) sequence with 1 cm spacing in both in‐plane directions. Tag persistence in tongue muscle is simulated and validated in vivo. The ability to capture internal tongue deformations is tested during speech production of American English diphthongs in native speakers.
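The two numbers that define a 1‐3‐3‐1 binomial SPAMM preparation with 1 cm spacing can be worked out as below. This is a sketch under stated assumptions: the 90° total tagging flip angle is a hypothetical value not given in the abstract, and the function names are illustrative. The relation used for the tag spacing is the standard one, spacing = 1 / (γ/2π × gradient area between adjacent sub-pulses).

```python
from math import comb

GAMMA_BAR_HZ_PER_T = 42.577e6  # proton gyromagnetic ratio divided by 2*pi

def binomial_flip_angles(order, total_flip_deg):
    """Split a total flip angle across sub-pulses in binomial ratio (1-3-3-1 for order 3)."""
    weights = [comb(order, k) for k in range(order + 1)]
    return [total_flip_deg * w / sum(weights) for w in weights]

def tag_gradient_area(spacing_m):
    """Gradient area (T*s/m) between sub-pulses giving one 2*pi phase cycle per tag period."""
    return 1.0 / (GAMMA_BAR_HZ_PER_T * spacing_m)

flips = binomial_flip_angles(3, 90.0)   # 90 degrees total is an assumed value
area = tag_gradient_area(0.01)          # 1 cm tag spacing, as in the abstract
```

Under these assumptions the sub-pulse flip angles are 11.25°, 33.75°, 33.75°, and 11.25°, and the required gradient area is roughly 2.35 mT·ms/m per tagging direction.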

    Results

    We achieved an imaging window of 650‐800 ms at 1.5T, with imaging signal to noise ratio ≥ 17 and tag contrast to noise ratio ≥ 5 in human tongue, providing 36 frames/s temporal resolution and 2 mm in‐plane spatial resolution with real‐time interactive acquisition and view‐sharing reconstruction. The proposed method was able to capture tongue motion patterns and their relative timing with adequate spatiotemporal resolution during the production of American English diphthongs and consonants.

    Conclusion

    Intermittent tagging during real‐time MRI of speech production is able to reveal the internal deformations of the tongue. This capability will allow new investigations of valuable spatiotemporal information on the biomechanics of the lingual subsystems during speech without reliance on binning speech utterance repetition.

     
  5. Purpose

    To mitigate a common artifact in spiral real‐time MRI, caused by aliasing of signal outside the desired FOV. This artifact frequently occurs in midsagittal speech real‐time MRI.

    Methods

    Simulations were performed to determine the likely origin of the artifact. Two methods to mitigate the artifact are proposed. The first approach, denoted as “large FOV” (LF), keeps an FOV that is large enough to include the artifact signal source during reconstruction. The second approach, denoted as “estimation‐subtraction” (ES), estimates the artifact signal source before subtracting a synthetic signal representing that source in multicoil k‐space raw data. Twenty‐five midsagittal speech‐production real‐time MRI data sets were used to evaluate both of the proposed methods. Reconstructions without and with corrections were evaluated by two expert readers using a 5‐level Likert scale assessing artifact severity. Reconstruction time was also compared.

    Results

    The origin of the artifact was found to be a combination of gradient nonlinearity and imperfect anti‐aliasing in spiral sampling. The LF and ES methods were both able to substantially reduce the artifact, with an averaged qualitative score improvement of 1.25 and 1.35 Likert levels for LF correction and ES correction, respectively. Average reconstruction times without correction, with LF correction, and with ES correction were 160.69 ± 1.56, 526.43 ± 5.17, and 171.47 ± 1.71 ms/frame, respectively.

    Conclusion

    Both proposed methods were able to reduce the spiral aliasing artifacts, with the ES‐reduction method being more effective and more time efficient.

     