-
Background: Clinical care in modern intensive care units (ICUs) combines multidisciplinary expertise and a complex array of technologies. These technologies have clearly advanced the ability of clinicians to do more for patients, yet so much equipment also presents the possibility of cognitive overload.
Purpose: The aim of this study was to investigate clinicians’ experiences with and perceptions of technology in ICUs.
Methodology/Approach: We analyzed qualitative data from 30 interviews with ICU clinicians and frontline managers within four ICUs.
Results: Our interviews identified three main challenges associated with technology in the ICU: (a) too many technologies and too much data; (b) inconsistent and inaccurate technologies; and (c) not enough integration among technologies, alignment with clinical workflows, and support for clinician identities. To address these challenges, interviewees highlighted mitigation strategies that address both social and technical systems and aim to achieve joint optimization.
Conclusion: When new technologies are added to the ICU, they have the potential both to improve and to disrupt patient care. Clinicians’ perspectives are crucial to implementing ICU technologies successfully; understanding those perspectives can help limit the disruptive effects of new technologies, so clinicians can focus their time and attention on providing care to patients.
Practice Implications: As technology and data play an increasingly important role in ICU care, everyone involved in the design, development, approval, implementation, and use of technology should work together to apply a sociotechnical systems approach that reduces possible negative effects on clinical care for critically ill patients.
-
Aim: Video review programs in hospitals play a crucial role in optimizing operating room workflows. In scenarios where split seconds can change the outcome of a surgery, the potential of such programs to improve safety and efficiency is profound. Leveraging this potential, however, requires systematic and automated analysis of human actions; existing approaches are predominantly manual, making them labor-intensive, inconsistent, and difficult to scale. Here, we present an AI-based approach to systematically analyze the behavior and actions of individuals in operating room (OR) videos.
Methods: We designed a novel framework for human mesh recovery from long-duration surgical videos by integrating existing human detection, tracking, and mesh recovery models. We then trained an action recognition model to predict surgical actions from the predicted temporal mesh sequences. To train and evaluate our approach, we annotated an in-house dataset of 864 five-second clips from simulated surgical videos with their corresponding actions.
Results: Our best model achieves an F1 score of 0.81 and an area under the precision-recall curve (AUPRC) of 0.85, demonstrating that human mesh sequences can be successfully used to recover surgical actions from operating room videos. Model ablation studies suggest that action recognition performance is enhanced by composing human mesh representations with lower-arm, pelvic, and cranial joints.
Conclusion: Our work presents promising opportunities for OR video review programs to study human behavior in a systematic, scalable manner.
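As a concrete illustration of the second stage described above, the following is a minimal, hypothetical sketch of an action-recognition model that classifies a five-second clip from a temporal sequence of recovered mesh joints. The joint count, layer sizes, frame rate, and NUM_ACTIONS label set are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: classify a clip's surgical action from a
# temporal sequence of recovered human mesh joints. All shapes and
# hyperparameters below are assumptions for illustration only.
import torch
import torch.nn as nn

NUM_JOINTS = 24    # e.g., SMPL-style body joints (assumption)
NUM_ACTIONS = 10   # size of the surgical action label set (assumption)

class MeshActionClassifier(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        # Encode each frame's flattened 3D joints, then model time with a GRU.
        self.frame_encoder = nn.Linear(NUM_JOINTS * 3, hidden)
        self.temporal = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, NUM_ACTIONS)

    def forward(self, joints):  # joints: (batch, frames, NUM_JOINTS, 3)
        b, t, j, c = joints.shape
        x = torch.relu(self.frame_encoder(joints.reshape(b, t, j * c)))
        _, h = self.temporal(x)          # h: (1, batch, hidden)
        return self.head(h.squeeze(0))   # per-clip action logits

# Example: a batch of two 5-second clips sampled at 25 fps (125 frames).
model = MeshActionClassifier()
clips = torch.randn(2, 125, NUM_JOINTS, 3)
logits = model(clips)
print(logits.shape)  # torch.Size([2, 10])
```

Operating on joint sequences rather than raw pixels keeps the classifier lightweight and lets it focus on body pose and motion, which is consistent with the ablation finding that specific joint groups drive recognition performance.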
-
Video-language models (VLMs), large models pre-trained on numerous but noisy video-text pairs from the internet, have revolutionized activity recognition through their remarkable generalization and open-vocabulary capabilities. While complex human activities are often hierarchical and compositional, most existing tasks for evaluating VLMs focus only on high-level video understanding, making it difficult to accurately assess and interpret the ability of VLMs to understand complex and fine-grained human activities. Inspired by the recently proposed MOMA framework, we define activity graphs as a single universal representation of human activities that encompasses video understanding at the activity, sub-activity, and atomic action levels. We redefine activity parsing as the overarching task of activity graph generation, which requires understanding human activities across all three levels. To facilitate the evaluation of models on activity parsing, we introduce MOMA-LRG (Multi-Object Multi-Actor Language-Refined Graphs), a large dataset of complex human activities with activity graph annotations that can be readily transformed into natural language sentences. Lastly, we present a model-agnostic and lightweight approach to adapting and evaluating VLMs by incorporating structured knowledge from activity graphs into VLMs, addressing the individual limitations of language and graphical models. We demonstrate strong performance on activity parsing and few-shot video classification, and our framework is intended to foster future research in the joint modeling of videos, graphs, and language.
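To make the three-level representation concrete, below is a hypothetical sketch of an activity graph (activity, sub-activities, atomic actions) with a naive conversion to natural-language sentences, mirroring the property of MOMA-LRG annotations described above. The schema, field names, and example content are assumptions for illustration, not the actual MOMA-LRG format.

```python
# Hypothetical three-level activity graph and a naive flattening of it
# into natural-language sentences. Schema and fields are assumptions,
# not the MOMA-LRG annotation format.
from dataclasses import dataclass, field

@dataclass
class AtomicAction:
    actor: str
    predicate: str
    target: str  # an object or another actor

@dataclass
class SubActivity:
    label: str
    actions: list = field(default_factory=list)

@dataclass
class ActivityGraph:
    activity: str
    sub_activities: list = field(default_factory=list)

    def to_sentences(self):
        # Flatten the graph into simple subject-verb-object sentences.
        for sub in self.sub_activities:
            for a in sub.actions:
                yield f"During '{sub.label}', the {a.actor} {a.predicate} the {a.target}."

graph = ActivityGraph(
    activity="making coffee",
    sub_activities=[
        SubActivity("grinding beans", [AtomicAction("barista", "pours", "beans")]),
        SubActivity("brewing", [AtomicAction("barista", "presses", "button")]),
    ],
)
print("\n".join(graph.to_sentences()))
```

Because every node and edge carries a textual label, a graph like this can be serialized into sentences for language-side supervision, which is the property that makes activity graphs a convenient bridge between video, graph, and language models.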