Title: A Reactive Autonomous Camera System for the RAVEN II Surgical Robot
The endoscopic camera of a surgical robot provides surgeons with a magnified 3D view of the surgical field, but repositioning it increases mental workload and operation time. Poor camera placement contributes to safety-critical events when surgical tools move out of the view of the camera. This paper presents a proof of concept of an autonomous camera system for the Raven II surgical robot that aims to reduce surgeon workload and improve safety by providing an optimal view of the workspace showing all objects of interest. This system uses transfer learning to localize and classify objects of interest within the view of a stereoscopic camera. The positions and centroid of the objects are estimated, and a set of control rules determines the movement of the camera towards a more desired view. Our perception module had an overall accuracy of 61.21% for identifying objects of interest and was able to localize both graspers and multiple blocks in the environment. Comparison of the commands proposed by our system with the desired commands from a survey of 13 participants indicates that the autonomous camera system proposes appropriate movements for the tilt and pan of the camera.
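The control step described in the abstract can be pictured with a minimal sketch: detections are reduced to normalized image coordinates, and a simple dead-band rule turns the offset between the objects' centroid and the image center into pan/tilt suggestions. The function names, threshold, and coordinate convention below are illustrative assumptions, not the paper's actual rules.

```python
import numpy as np

# Hypothetical illustration of centroid-based camera control rules.
# Each detection is (label, x_center, y_center) in normalized image
# coordinates [0, 1]; the dead-band threshold is a placeholder value.
DEADBAND = 0.1  # how far the centroid may drift from the image center

def propose_camera_command(detections, frame_center=(0.5, 0.5)):
    """Return pan/tilt suggestions that re-center the objects of interest."""
    if not detections:
        return {"pan": "hold", "tilt": "hold"}

    # Centroid of all detected objects of interest (graspers, blocks, ...).
    points = np.array([(x, y) for _, x, y in detections])
    cx, cy = points.mean(axis=0)

    dx = cx - frame_center[0]
    dy = cy - frame_center[1]

    pan = "hold" if abs(dx) < DEADBAND else ("right" if dx > 0 else "left")
    tilt = "hold" if abs(dy) < DEADBAND else ("down" if dy > 0 else "up")
    return {"pan": pan, "tilt": tilt}

# Example: two graspers and a block sitting slightly right of center.
print(propose_camera_command([("grasper", 0.7, 0.5),
                              ("grasper", 0.8, 0.55),
                              ("block", 0.75, 0.45)]))
```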
Award ID(s):
1829004
NSF-PAR ID:
10257276
Author(s) / Creator(s):
Date Published:
Journal Name:
2020 International Symposium on Medical Robotics (ISMR)
Page Range / eLocation ID:
195 to 201
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    The recent development of Robot-Assisted Minimally Invasive Surgery (RAMIS) has made complex Minimally Invasive Surgery (MIS) tasks easier to perform and improved clinical outcomes. Compared to direct master-slave manipulation, semi-autonomous control of the surgical robot can enhance the efficiency of the operation, particularly for repetitive tasks. However, operating in a highly dynamic in-vivo environment is complex, and supervisory control functions should be included to ensure flexibility and safety during the autonomous control phase. This paper presents a haptic rendering interface that enables supervised semi-autonomous control of a surgical robot. Bayesian optimization is used to tune user-specific parameters during the surgical training process. User studies were conducted on a customized simulator for validation. Detailed comparisons are made between operation with and without the supervised semi-autonomous control mode in terms of the number of clutching events, task completion time, master robot end-effector trajectory, and average control speed of the slave robot. The effectiveness of the Bayesian optimization is also evaluated, demonstrating that the optimized parameters can significantly improve users' performance. Results indicate that the proposed control method can reduce the operator's workload and enhance operation efficiency.
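    As a rough illustration of how Bayesian optimization might tune user-specific parameters of such an interface, the sketch below uses scikit-optimize's gp_minimize to minimize the completion time of a training trial. The two parameters, their ranges, the trial budget, and the stub trial function are assumptions made for illustration, not the paper's actual setup.

```python
import numpy as np
from skopt import gp_minimize  # scikit-optimize, a common Bayesian optimization library

def run_training_trial(guidance_gain, handover_threshold):
    """Placeholder for a real simulator trial: returns a synthetic
    completion time (s) so the sketch runs end to end."""
    rng = np.random.default_rng(0)
    return (guidance_gain - 2.0) ** 2 + (handover_threshold - 0.6) ** 2 + rng.normal(0, 0.05)

def objective(params):
    guidance_gain, handover_threshold = params
    return run_training_trial(guidance_gain, handover_threshold)

result = gp_minimize(
    objective,
    dimensions=[(0.1, 5.0),   # guidance gain search range (assumed)
                (0.0, 1.0)],  # hand-over threshold search range (assumed)
    n_calls=20,               # number of trials per user (assumed)
    random_state=0,
)
print("best user-specific parameters:", result.x)
```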
  2. Autonomous mobile robots (AMRs) have been widely utilized in industry to execute various on-board computer-vision applications, including autonomous guidance, security patrol, object detection, and face recognition. Most of the applications executed by an AMR involve the analysis of camera images through trained machine learning models. Many research studies on machine learning focus either on performance, without considering energy efficiency, or on techniques such as pruning and compression to make the model more energy-efficient. However, most previous work does not study the root causes of energy inefficiency for the execution of those applications on AMRs. The computing stack on an AMR accounts for 33% of the total energy consumption and can thus highly impact the battery life of the robot. Because recharging an AMR may disrupt the application execution, it is important to efficiently utilize the available energy for maximized battery life. In this paper, we first analyze the breakdown of power dissipation for the execution of computer-vision applications on AMRs and discover three main root causes of energy inefficiency: uncoordinated access to sensor data, performance-oriented model inference execution, and uncoordinated execution of concurrent jobs. In order to fix these three inefficiencies, we propose E2M, an energy-efficient middleware software stack for autonomous mobile robots. First, E2M regulates the access of different processes to sensor data, e.g., camera frames, so that the amount of data actually captured by concurrently executing jobs can be minimized. Second, based on a predefined per-process performance metric (e.g., safety, accuracy) and desired target, E2M manipulates the process execution period to find the best energy-performance trade-off. Third, E2M coordinates the execution of the concurrent processes to maximize the total contiguous sleep time of the computing hardware for maximized energy savings. We have implemented a prototype of E2M on a real-world AMR. Our experimental results show that, compared to several baselines, E2M leads to 24% energy savings for the computing platform, which translates into an extra 11.5% of battery time and 14 extra minutes of robot runtime, with a performance degradation lower than 7.9% for safety and 1.84% for accuracy.
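    One of the three mechanisms, coordinated periodic execution, can be sketched as follows: each job's period is stretched as far as its (assumed) performance target allows and rounded to a common tick so that wake-ups coincide and sleep stays contiguous. Job names, periods, and the tick size are invented for illustration; E2M's real policy is more involved.

```python
import math

# Hypothetical sketch of coordinating job periods so the CPU can sleep in
# contiguous blocks. Values are illustrative, not from the paper.
jobs = {
    # name: maximum tolerable period (ms) before its performance target is violated
    "obstacle_detection": 120,
    "face_recognition": 450,
    "localization": 200,
}

def coordinated_schedule(jobs, base_tick_ms=50):
    """Round each period down to a multiple of a base tick so wake-ups align."""
    schedule = {}
    for name, max_period in jobs.items():
        period = max(base_tick_ms, (max_period // base_tick_ms) * base_tick_ms)
        schedule[name] = period
    return schedule

sched = coordinated_schedule(jobs)
hyperperiod = math.lcm(*sched.values())
print(sched, "hyperperiod:", hyperperiod, "ms")
```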
  3.
    This work presents the ideation and preliminary results of using contextual information, together with information about the objects present in the scene, to query the social navigation rules applicable to the sensed context. Prior work in Socially-Aware Navigation (SAN) shows its importance in human-robot interaction, as it improves interaction quality and the safety and comfort of the interacting partner. In this work, we are interested in the automatic detection of social rules in SAN, and we present the three major components of our method, namely: a Convolutional Neural Network-based context classifier that can autonomously perceive contextual information using camera input; a YOLO-based object detector to localize objects within a scene; and a knowledge base that relates social rules to these concepts and can be queried using both the context and the objects detected in the scene. Our preliminary results suggest that our approach can observe an ongoing interaction, given an image input, and use that information to query the social navigation rules required in that particular context.
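    A toy version of the final querying step might look like the following: the context classifier's label and the detector's object list index into a small dictionary of rules. Both the rule text and the context/object labels are invented placeholders; the paper's knowledge base encodes richer relationships.

```python
# Hypothetical sketch of querying a small rule base with the classifier's
# context label and the detector's object list.
SOCIAL_RULES = {
    ("hospital_corridor", "wheelchair"): "yield right of way and keep extra distance",
    ("hospital_corridor", None): "keep to the right and move slowly",
    ("art_gallery", "person"): "do not pass between a person and the artwork",
    ("queue", "person"): "join the end of the line; do not cut through",
}

def query_rules(context_label, detected_objects):
    """Return the social navigation rules applicable to this scene."""
    rules = []
    for obj in detected_objects or [None]:
        rule = SOCIAL_RULES.get((context_label, obj))
        if rule:
            rules.append(rule)
    # Fall back to the context-only rule if no object-specific rule matched.
    if not rules and (context_label, None) in SOCIAL_RULES:
        rules.append(SOCIAL_RULES[(context_label, None)])
    return rules

print(query_rules("hospital_corridor", ["wheelchair", "person"]))
```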
  4. Vision serves as an essential sensory input for insects but consumes substantial energy resources. The cost of supporting sensitive photoreceptors has led many insects to develop high visual acuity in only small retinal regions and to evolve the ability to move their visual systems independently of their bodies through head motion. By understanding the trade-offs made by insect vision systems in nature, we can design better vision systems for insect-scale robotics in a way that balances energy, computation, and mass. Here, we report a fully wireless, power-autonomous, mechanically steerable vision system that imitates head motion in a form factor small enough to mount on the back of a live beetle or a similarly sized terrestrial robot. Our electronics and actuator weigh 248 milligrams and can steer the camera over 60° based on commands from a smartphone. The camera streams "first person" 160 pixels-by-120 pixels monochrome video at 1 to 5 frames per second (fps) to a Bluetooth radio from up to 120 meters away. We mounted this vision system on two species of freely walking live beetles, demonstrating that triggering image capture using an onboard accelerometer achieves operational times of up to 6 hours with a 10-milliamp-hour battery. We also built a small, terrestrial robot (1.6 centimeters by 2 centimeters) that can move at up to 3.5 centimeters per second, support vision, and operate for 63 to 260 minutes. Our results demonstrate that steerable vision can enable object tracking and wide-angle views for 26 to 84 times lower energy than moving the whole robot.
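    The accelerometer-triggered capture mentioned above can be illustrated with a tiny gating rule: frames are only captured while the measured acceleration deviates from 1 g, i.e., while the host is moving. The threshold and sample readings below are made-up values, not the paper's.

```python
# Hypothetical sketch of accelerometer-gated image capture: frames are
# captured and streamed only while the host insect/robot is moving.
MOTION_THRESHOLD_G = 0.05   # deviation from 1 g that counts as "moving" (assumed)

def should_capture(accel_sample_g):
    """accel_sample_g: (ax, ay, az) in units of g."""
    ax, ay, az = accel_sample_g
    magnitude = (ax * ax + ay * ay + az * az) ** 0.5
    return abs(magnitude - 1.0) > MOTION_THRESHOLD_G

# At rest the magnitude stays near 1 g, so no frames are sent.
print(should_capture((0.0, 0.01, 1.0)))   # False: idle, radio can sleep
print(should_capture((0.5, 0.2, 1.1)))    # True: motion detected, capture a frame
```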
    A fundamental challenge in retinal surgery is safely navigating a surgical tool to a desired goal position on the retinal surface while avoiding damage to surrounding tissues, a procedure that typically requires tens-of-microns accuracy. In practice, the surgeon relies on depth-estimation skills to localize the tool-tip with respect to the retina and perform the tool-navigation task, which can be prone to human error. To alleviate such uncertainty, prior work has introduced ways to assist the surgeon by estimating the tool-tip distance to the retina and providing haptic or auditory feedback. However, automating the tool-navigation task itself remains unsolved and largely unexplored. Such a capability, if reliably automated, could serve as a building block to streamline complex procedures and reduce the chance of tissue damage. Towards this end, we propose to automate the tool-navigation task by mimicking the perception-action feedback loop of an expert surgeon. Specifically, a deep network is trained on recorded expert trajectories to visually servo the tool toward various locations on the retina, with the goal specified by the user. The proposed autonomous navigation system is evaluated in simulation and in real-life experiments using a silicone eye phantom. We show that the network can reliably navigate a surgical tool to various desired locations within 137 µm accuracy in phantom experiments and 94 µm in simulation, and generalizes well to unseen situations such as the presence of auxiliary surgical tools, variable eye backgrounds, and brightness conditions.
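    To make the imitation-learning idea concrete, here is a minimal behavior-cloning sketch in PyTorch: a small network maps an image and a user-specified goal to the next tool-tip motion and is fit to expert motions with an MSE loss. The architecture, dimensions, and dummy batch are illustrative assumptions; they are not the network or data described in the paper.

```python
import torch
import torch.nn as nn

class NavigationPolicy(nn.Module):
    """Hypothetical policy: image + 2-D goal -> 3-D tool-tip motion."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(            # image -> feature vector
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(               # features + goal -> motion command
            nn.Linear(32 + 2, 64), nn.ReLU(),
            nn.Linear(64, 3),
        )

    def forward(self, image, goal_xy):
        features = self.encoder(image)
        return self.head(torch.cat([features, goal_xy], dim=1))

policy = NavigationPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# One training step on a dummy batch standing in for expert demonstrations.
images = torch.randn(8, 3, 128, 128)       # phantom-eye camera frames (synthetic)
goals = torch.rand(8, 2)                    # goal positions on the retina (normalized)
expert_motion = torch.randn(8, 3)           # expert tool-tip displacements (synthetic)

optimizer.zero_grad()
pred = policy(images, goals)
loss = loss_fn(pred, expert_motion)
loss.backward()
optimizer.step()
print("behavior-cloning loss:", loss.item())
```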