
Title: Real-Time Camera Localization during Robot-Assisted Telecystoscopy for Bladder Cancer Surveillance
Telecystoscopy can lower the barrier to accessing critical urologic diagnostics for patients around the world. A major challenge for robotic control of flexible cystoscopes and intuitive teleoperation is estimating the pose of the scope tip. We propose a novel real-time camera localization method that uses video recordings from a prior cystoscopy and a 3D bladder reconstruction to estimate the cystoscope pose within the bladder during follow-up telecystoscopy. We map prior video frames into a low-dimensional space as a dictionary, so that a new image can be mapped likewise to efficiently retrieve its nearest neighbor among the dictionary images. The cystoscope pose is then estimated from the correspondence among the new image, its nearest dictionary image, and the prior model from 3D reconstruction. We demonstrate the performance of our method using bladder phantoms of varying fidelity and a servo-controlled cystoscope that simulates the use case of bladder surveillance through telecystoscopy. The servo-controlled cystoscope, with 3 degrees of freedom (angulation, roll, and insertion axes), was developed for collecting cystoscope videos from bladder phantoms. Cystoscope videos were acquired in a 2.5D bladder phantom (a bladder-shaped cross-section plus height) with a panorama of a urothelium attached to the inner surface. Scans of the 2.5D phantom were performed in separate arc trajectories, each generated by actuating the angulation axis with a fixed roll and insertion length. We further included variation in moving speed and imaging distance and the presence of bladder tumors. Cystoscope videos were also acquired in a water-filled 3D silicone bladder phantom with hand-painted vasculature. Scans of the 3D phantom were performed in separate circular trajectories, each generated by actuating the roll axis with a fixed angulation and insertion length.
These videos were used to create 3D reconstructions, dictionary sets, and test data sets for evaluating the computational efficiency and accuracy of our proposed method in comparison with a method based on global Scale-Invariant Feature Transform (SIFT) features, named SIFT-only. Our method can retrieve the nearest dictionary image for 94–100% of test frames in under 55 ms per image, whereas the SIFT-only method finds the matching image for only 56–100% of test frames in 6000–40,000 ms per image, depending on the size of the dictionary set and the richness of SIFT features in the images. Our method, with a speed of around 20 Hz for the retrieval stage, is a promising tool for real-time image-based scope localization in robotic cystoscopy when prior cystoscopy images are available.
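The retrieval stage described above (mapping prior frames into a low-dimensional dictionary and finding a new frame's nearest neighbor in that space) can be sketched in a few lines. This is a minimal illustration assuming a plain PCA embedding and exhaustive Euclidean nearest-neighbor search; the paper's actual embedding, feature extraction, and search structure may differ.

```python
import numpy as np

def build_dictionary(frames, n_components=8):
    """Map flattened prior-cystoscopy frames into a low-dimensional
    space via PCA (SVD) and keep the projections as the dictionary."""
    X = frames.reshape(len(frames), -1).astype(np.float64)
    mean = X.mean(axis=0)
    # Right singular vectors give the principal axes of the frame set.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:n_components]
    codes = (X - mean) @ basis.T          # low-dimensional dictionary
    return mean, basis, codes

def retrieve_nearest(query, mean, basis, codes):
    """Project a new frame with the same mapping and return the index
    of its nearest dictionary image (Euclidean distance)."""
    q = (query.ravel().astype(np.float64) - mean) @ basis.T
    return int(np.argmin(np.linalg.norm(codes - q, axis=1)))

# Toy usage: 50 synthetic 'frames' of 16x16 pixels.
rng = np.random.default_rng(0)
frames = rng.random((50, 16, 16))
mean, basis, codes = build_dictionary(frames)
idx = retrieve_nearest(frames[17], mean, basis, codes)
```

A practical implementation would index `codes` with an approximate nearest-neighbor structure (e.g., a k-d tree) rather than a brute-force scan to stay within a tight per-frame time budget.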
Journal Name: Journal of Medical Robotics Research
Sponsoring Org: National Science Foundation
More Like this
  1. Simultaneous visualization of the teeth and periodontium is of significant clinical interest for image-based monitoring of periodontal health. We recently reported the application of a dual-modality photoacoustic-ultrasound (PA-US) imaging system for resolving periodontal anatomy and periodontal pocket depths in humans. This work utilized a linear array transducer attached to a stepper motor to generate 3D images via maximum intensity projection. This prior work also used a medical head immobilizer to reduce artifacts during volume rendering caused by motion from the subject (e.g., breathing, minor head movements). However, this solution does not completely eliminate motion artifacts while also complicating the imaging procedure and causing patient discomfort. To address this issue, we report the implementation of an image registration technique to correctly align B-mode PA-US images and generate artifact-free 2D cross-sections. Application of the deshaking technique to PA phantoms revealed 80% similarity to the ground truth when shaking was intentionally applied during stepper motor scans. Images from handheld sweeps could also be deshaken using an LED PA-US scanner. In ex vivo porcine mandibles, pigmentation of the enamel was well-estimated within 0.1 mm error. The pocket depth measured in a healthy human subject was also in good agreement with our prior study. This report demonstrates that a modality-independent registration technique can be applied to clinically relevant PA-US scans of the periodontium to reduce operator burden of skill and subject discomfort while showing potential for handheld clinical periodontal imaging.
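The modality-independent registration idea above can be illustrated with a translation-only phase-correlation sketch. This is a generic alignment technique, not the authors' full deshaking pipeline, and it assumes purely circular integer shifts between frames.

```python
import numpy as np

def phase_correlation_shift(ref, mov):
    """Estimate the integer (row, col) translation that aligns `mov`
    to `ref` via the Fourier shift theorem (phase correlation)."""
    cross = np.fft.fft2(ref) * np.conj(np.fft.fft2(mov))
    cross /= np.abs(cross) + 1e-12        # keep only the phase term
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap shifts larger than half the image back to negative values.
    return tuple(p - s if p > s // 2 else p for p, s in zip(peak, corr.shape))

# Toy usage: shift a random image by (3, -5) and recover the offset
# needed to undo that motion.
rng = np.random.default_rng(1)
img = rng.random((64, 64))
shifted = np.roll(img, (3, -5), axis=(0, 1))
dy, dx = phase_correlation_shift(img, shifted)
```

Applying the recovered `(dy, dx)` to the moving frame restores alignment; repeating this frame-to-reference step over a sweep is one simple way to build an artifact-free stack.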

  2. This study introduces a technique for simultaneous multislice (SMS) cardiac magnetic resonance fingerprinting (cMRF), which improves the slice coverage when quantifying myocardial T1, T2, and M0. The single‐slice cMRF pulse sequence was modified to use multiband (MB) RF pulses for SMS imaging. Different RF phase schedules were used to excite each slice, similar to POMP or CAIPIRINHA, which imparts tissues with a distinguishable and slice‐specific magnetization evolution over time. Because of the high net acceleration factor (R = 48 in plane combined with the slice acceleration), images were first reconstructed with a low rank technique before matching data to a dictionary of signal timecourses generated by a Bloch equation simulation. The proposed method was tested in simulations with a numerical relaxation phantom. Phantom and in vivo cardiac scans of 10 healthy volunteers were also performed at 3 T. With single‐slice acquisitions, the mean relaxation times obtained using the low rank cMRF reconstruction agree with reference values. The low rank method improves the precision in T1 and T2 for both single‐slice and SMS cMRF, and it enables the acquisition of maps with fewer artifacts when using SMS cMRF at higher MB factors. With this technique, in vivo cardiac maps were acquired from three slices simultaneously during a breathhold lasting 16 heartbeats. SMS cMRF improves the efficiency and slice coverage of myocardial T1 and T2 mapping compared with both single‐slice cMRF and conventional cardiac mapping sequences. Thus, this technique is a first step toward whole‐heart simultaneous T1 and T2 quantification with cMRF.
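The dictionary-matching step common to MR fingerprinting methods like the one above can be sketched as a normalized inner-product search. The mono-exponential "timecourses" below are a toy stand-in for the Bloch-simulated dictionary, and the low rank reconstruction step is omitted entirely.

```python
import numpy as np

def mrf_match(signal, dictionary, t1_values, t2_values):
    """Match a measured timecourse to the dictionary entry with the
    largest normalized inner product and return its (T1, T2)."""
    d = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
    s = signal / np.linalg.norm(signal)
    best = int(np.argmax(np.abs(d @ s)))
    return t1_values[best], t2_values[best]

# Toy dictionary: simple relaxation-shaped 'timecourses' over 20 time
# points for a small grid of (T1, T2) pairs in milliseconds.
t = np.arange(1, 21, dtype=float)
pairs = [(t1, t2) for t1 in (300, 800, 1400) for t2 in (40, 80, 120)]
dic = np.array([np.exp(-t / t2) * (1 - np.exp(-t / t1)) for t1, t2 in pairs])
t1s = np.array([p[0] for p in pairs])
t2s = np.array([p[1] for p in pairs])
t1, t2 = mrf_match(dic[4], dic, t1s, t2s)
```

Real cMRF dictionaries are generated by Bloch simulation of the actual pulse sequence, so entries are not simple exponentials, but the matching operation itself is this same inner-product maximization.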

  3. Purpose

    This work aims to develop an approach for simultaneous water–fat separation and myocardial T1 and T2 quantification based on the cardiac MR fingerprinting (cMRF) framework with rosette trajectories at 3 T and 1.5 T.


    Methods

    Two 15‐heartbeat cMRF sequences with different rosette trajectories designed for water–fat separation at 3 T and 1.5 T were implemented. Water T1 and T2 maps, a water image, and a fat image were generated with B0 inhomogeneity correction using a B0 map derived from the cMRF data themselves. The proposed water–fat separation rosette cMRF approach was validated in the International Society for Magnetic Resonance in Medicine/National Institute of Standards and Technology MRI system phantom and in water/oil phantoms. It was also applied for myocardial tissue mapping of healthy subjects at both 3 T and 1.5 T.


    Results

    Water T1 and T2 values measured using rosette cMRF in the International Society for Magnetic Resonance in Medicine/National Institute of Standards and Technology phantom agreed well with the reference values. In the water/oil phantom, oil was well suppressed in the water images and vice versa. Rosette cMRF yielded comparable T1 values but 2–3 ms higher T2 values in the myocardium of healthy subjects than the original spiral cMRF method. Epicardial fat deposition was also clearly shown in the fat images.


    Conclusions

    Rosette cMRF provides fat images along with myocardial T1 and T2 maps with significant fat suppression. This technique may improve visualization of the anatomical structure of the heart by separating water and fat and could provide value in diagnosing cardiac diseases associated with fibrofatty infiltration or epicardial fat accumulation. It also paves the way toward comprehensive myocardial tissue characterization in a single scan.
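For intuition, the core water–fat separation identity can be shown with a classical two-point Dixon sketch. This is a textbook simplification (ideal in-phase and opposed-phase echoes, no B0 inhomogeneity), named plainly as a stand-in: it is not the rosette cMRF reconstruction itself, which estimates and corrects B0 from the fingerprinting data.

```python
def dixon_separate(in_phase, opposed_phase):
    """Classical two-point Dixon: the in-phase echo measures W + F and
    the opposed-phase echo measures W - F, so water and fat follow by
    a sum and a difference."""
    water = (in_phase + opposed_phase) / 2
    fat = (in_phase - opposed_phase) / 2
    return water, fat

# Toy pixel with water fraction 0.7 and fat fraction 0.3.
W, F = 0.7, 0.3
water, fat = dixon_separate(W + F, W - F)
```

Off-resonance (B0) errors break the in-phase/opposed-phase assumption, which is why methods like the one above must fold a B0 estimate into the separation.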

  4. Purpose

    To enable rapid imaging with a scan time–efficient 3D cones trajectory with a deep‐learning off‐resonance artifact correction technique.


    Methods

    A residual convolutional neural network to correct off‐resonance artifacts (Off‐ResNet) was trained with a prospective study of pediatric MRA exams. Each exam acquired a short readout scan (1.18 ± 0.38 ms) and a long readout scan (3.35 ± 0.74 ms) at 3 T. Short readout scans, with longer scan times but negligible off‐resonance blurring, were used as reference images and augmented with additional off‐resonance for supervised training examples. Long readout scans, with greater off‐resonance artifacts but shorter scan time, were corrected by autofocus and Off‐ResNet and compared with short readout scans by normalized RMS error, structural similarity index, and peak SNR. Scans were also compared by scoring on 8 anatomical features by two radiologists, using analysis of variance with a post hoc Tukey's test and two one‐sided t‐tests. Reader agreement was determined with intraclass correlation.


    Results

    The total scan time for long readout scans was on average 59.3% shorter than for short readout scans. Images from Off‐ResNet had superior normalized RMS error, structural similarity index, and peak SNR compared with uncorrected images across ±1 kHz off‐resonance (P < .01). The proposed method had superior normalized RMS error over −677 Hz to +1 kHz and superior structural similarity index and peak SNR over ±1 kHz compared with autofocus (P < .01). Radiologic scoring demonstrated that long readout scans corrected with Off‐ResNet were noninferior to short readout scans (P < .05).


    Conclusion

    The proposed method can correct off‐resonance artifacts from rapid long‐readout 3D cones scans to a noninferior image quality compared with diagnostically standard short readout scans.
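Two of the scalar image-quality metrics used in comparisons like the one above (normalized RMS error and peak SNR) are easy to state precisely. The sketch below assumes one common normalization convention for nRMSE (the reference image's dynamic range); conventions vary between papers, so the exact definition should be checked against the study in question.

```python
import numpy as np

def nrmse(ref, test):
    """RMS error normalized by the reference image's dynamic range."""
    rmse = np.sqrt(np.mean((ref - test) ** 2))
    return rmse / (ref.max() - ref.min())

def psnr(ref, test, peak=1.0):
    """Peak SNR in dB for images scaled to [0, peak]."""
    mse = np.mean((ref - test) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

# Toy usage: a reference image with a small constant offset added.
rng = np.random.default_rng(2)
ref = rng.random((32, 32))
degraded = ref + 0.01
err = nrmse(ref, degraded)
quality = psnr(ref, degraded)
```

A constant error of 0.01 on a unit-range image gives a PSNR of 40 dB, which is a useful mental anchor when reading reported values.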

  5. SUMMARY An efficient method for tracking a target using a single Pan-Tilt-Zoom (PTZ) camera is proposed. The proposed Scale-Invariant Optical Flow (SIOF) method estimates the motion of the target and rotates the camera accordingly to keep the target at the center of the image. SIOF also estimates the scale of the target and adjusts the focal length to change the Field of View (FoV) so that the target appears at the same size in all captured frames. SIOF is a feature-based tracking method: feature points are extracted and tracked using Optical Flow (OF) and the Scale-Invariant Feature Transform (SIFT), then combined in groups to achieve robust tracking. The feature points in these groups are used within a twist model to recover the 3D free motion of the target. The merits of the proposed method are (i) an efficient scale-invariant tracking method that keeps the target in the camera's FoV at a constant apparent size, and (ii) tracking with prediction and correction to speed up the PTZ control and achieve smooth camera motion. Experiments on online video streams validated the efficiency of the proposed SIOF compared with OF, SIFT, and other tracking methods: SIOF has around 36% less average tracking error and around 70% less tracking overshoot than OF.
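The camera-control step (turning a pixel offset and a scale estimate into pan, tilt, and zoom commands) can be sketched under a small-angle pinhole approximation. The field-of-view values and the linear pixel-to-angle mapping below are illustrative assumptions, not SIOF's actual control law, which additionally uses prediction and correction.

```python
def ptz_correction(cx, cy, width, height, hfov_deg, vfov_deg, scale):
    """Pan/tilt angles (degrees) that re-center a target detected at
    pixel (cx, cy), plus the focal-length multiplier that keeps its
    apparent size fixed. Assumes angle varies linearly with pixel
    offset (valid for small offsets in a pinhole model)."""
    pan = (cx - width / 2) / width * hfov_deg
    tilt = (cy - height / 2) / height * vfov_deg
    # If the target's apparent scale grew by `scale`, shorten the
    # focal length by 1/scale to restore its original size.
    zoom = 1.0 / scale
    return pan, tilt, zoom

# Toy usage: target drifted to (480, 270) in a 640x360 frame with an
# assumed 60x34 degree FoV, and grew 25% larger.
pan, tilt, zoom = ptz_correction(480, 270, 640, 360, 60.0, 34.0, 1.25)
```

In a closed loop these commands would be sent each frame, so residual error from the linear approximation shrinks as the target approaches the image center.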