Background: Lung volume reduction surgery (LVRS) and medical therapy are 2 available treatment options for severe emphysema, a chronic lung disease. However, there are currently limited guidelines on the timing of LVRS for patients with different characteristics. Objective: The objective of this study is to assess the timing of receiving LVRS in terms of patient outcomes, taking into consideration a patient’s characteristics. Methods: A finite-horizon Markov decision process model for patients with severe emphysema was developed to determine the short-term (5 y) and long-term timing of emphysema treatment. Maximizing the expected life expectancy, expected quality-adjusted life-years, and total expected cost of each treatment option were applied as the objective functions of the model. To estimate parameters in the model, the data provided by the National Emphysema Treatment Trial were used. Results: The results indicate that the treatment timing strategy for patients with upper-lobe–predominant emphysema is to receive LVRS regardless of their specific characteristics. However, for patients with non–upper-lobe–predominant emphysema, the optimal strategy depends on the age, maximum workload level, and forced expiratory volume in 1 second level. Conclusion: This study demonstrates the utilization of clinical trial data to gain insights into the timing of surgical treatment for patients with emphysema, considering patient age, observable health condition, and location of emphysema. Highlights: Both short-term and long-term Markov decision process models were developed to assess the timing of receiving lung volume reduction surgery in patients with severe emphysema. How clinical trial data can be used to estimate the parameters and obtain short-term results from the Markov decision process model is demonstrated.
The results provide insights into the timing of receiving lung volume reduction surgery as a function of a patient’s characteristics, including age, emphysema location, maximum workload, and forced expiratory volume in 1 second level.
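The finite-horizon Markov decision process idea mentioned in the abstract can be illustrated with a minimal backward-induction sketch. This is not the study's model: the three health states, the two actions, the transition probabilities, and the one-life-year-per-period reward are all hypothetical values chosen for illustration and are not taken from the National Emphysema Treatment Trial.

```python
def backward_induction(states, actions, P, R, horizon):
    """Solve a finite-horizon MDP by backward induction.

    P[a][s][s2] is the probability of moving from s to s2 under action a,
    R[a][s] is the immediate reward for taking action a in state s.
    Returns the value function at time 0 and one decision rule per period.
    """
    V = {s: 0.0 for s in states}  # terminal value after the last period
    policy = []
    for _ in range(horizon):
        new_V, rule = {}, {}
        for s in states:
            best_a, best_q = None, float("-inf")
            for a in actions:
                q = R[a][s] + sum(P[a][s][s2] * V[s2] for s2 in states)
                if q > best_q:  # ties keep the earlier action
                    best_a, best_q = a, q
            new_V[s], rule[s] = best_q, best_a
        V = new_V
        policy.insert(0, rule)  # rules are built from the last period backward
    return V, policy


# Hypothetical toy problem: all numbers below are made up for illustration.
states = ["sick", "improved", "dead"]
actions = ["wait", "treat"]
P = {
    "wait": {
        "sick": {"sick": 0.8, "improved": 0.0, "dead": 0.2},
        "improved": {"sick": 0.0, "improved": 0.9, "dead": 0.1},
        "dead": {"sick": 0.0, "improved": 0.0, "dead": 1.0},
    },
    "treat": {
        "sick": {"sick": 0.0, "improved": 0.7, "dead": 0.3},
        "improved": {"sick": 0.0, "improved": 0.9, "dead": 0.1},
        "dead": {"sick": 0.0, "improved": 0.0, "dead": 1.0},
    },
}
# One unit of reward (for example, one life-year) per period spent alive.
R = {a: {"sick": 1.0, "improved": 1.0, "dead": 0.0} for a in actions}

values, policy = backward_induction(states, actions, P, R, horizon=5)
```

In this toy setup the rule for the "sick" state changes with the time remaining: treating is preferred at the first period, while waiting is preferred near the end of the horizon, when too little time remains to benefit from the riskier action.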
Estimation of optimal treatment regimes with electronic medical record data using the residual life value estimator
Summary: Clinicians and patients must make treatment decisions at a series of key decision points throughout disease progression. A dynamic treatment regime is a set of sequential decision rules that return treatment decisions based on accumulating patient information, like that commonly found in electronic medical record (EMR) data. When applied to a patient population, an optimal treatment regime leads to the most favorable outcome on average. Identifying optimal treatment regimes that maximize residual life is especially desirable for patients with life-threatening diseases such as sepsis, a complex medical condition that involves severe infections with organ dysfunction. We introduce the residual life value estimator (ReLiVE), an estimator for the expected value of cumulative restricted residual life under a fixed treatment regime. Building on ReLiVE, we present a method for estimating an optimal treatment regime that maximizes expected cumulative restricted residual life. Our proposed method, ReLiVE-Q, conducts estimation via the backward induction algorithm Q-learning. We illustrate the utility of ReLiVE-Q in simulation studies, and we apply ReLiVE-Q to estimate an optimal treatment regime for septic patients in the intensive care unit using EMR data from the Multiparameter Intelligent Monitoring Intensive Care database. Ultimately, we demonstrate that ReLiVE-Q leverages accumulating patient information to estimate personalized treatment regimes that optimize a clinically meaningful function of residual life.
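To make the backward-induction flavor of Q-learning concrete, here is a minimal tabular sketch. It is not ReLiVE-Q: it ignores censoring and the restricted residual life estimator, replaces regression models with simple cell averages, and runs on a tiny hypothetical data set whose state labels, actions, and rewards are invented for illustration.

```python
from collections import defaultdict


def q_learning_backward(trajectories, n_stages):
    """Tabular Q-learning by backward induction over a fixed number of stages.

    Each trajectory is a list of (state, action, reward) tuples, one per stage.
    The last stage is fitted first; earlier stages use the pseudo-outcome
    reward + max over actions of the next-stage Q-function.
    """
    q_tables = [None] * n_stages
    for k in reversed(range(n_stages)):
        sums, counts = defaultdict(float), defaultdict(int)
        for traj in trajectories:
            state, action, reward = traj[k]
            target = reward
            if k + 1 < n_stages:
                next_state = traj[k + 1][0]
                # Best achievable next-stage value among observed actions.
                target += max(q for (s, _), q in q_tables[k + 1].items()
                              if s == next_state)
            sums[(state, action)] += target
            counts[(state, action)] += 1
        q_tables[k] = {key: sums[key] / counts[key] for key in sums}
    return q_tables


def greedy_rule(q_table):
    """Return the action with the largest estimated Q-value for each state."""
    best = {}
    for (state, action), q in q_table.items():
        if state not in best or q > best[state][1]:
            best[state] = (action, q)
    return {state: action for state, (action, _) in best.items()}


# Hypothetical two-stage data: (state, action, reward), values are made up.
trajectories = [
    [("high", 1, 1.0), ("low", 0, 2.0)],
    [("high", 0, 1.0), ("high", 1, 1.0)],
    [("high", 1, 1.0), ("low", 1, 1.0)],
    [("high", 0, 0.5), ("high", 0, 0.5)],
]
q_tables = q_learning_backward(trajectories, n_stages=2)
rules = [greedy_rule(q) for q in q_tables]
```

The resulting `rules` list holds one decision rule per stage, which together form an estimated treatment regime for this toy data.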
- Award ID(s): 2113637
- PAR ID: 10524396
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Biostatistics
- ISSN: 1465-4644
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
-
A treatment regime is a sequence of decision rules, one per decision point, that maps accumulated patient information to a recommended intervention. An optimal treatment regime maximises expected cumulative utility if applied to select interventions in a population of interest. As a treatment regime seeks to improve the quality of healthcare by individualising treatment, it can be viewed as an approach to formalising precision medicine. Increased interest and investment in precision medicine has led to a surge of methodological research focusing on estimation and evaluation of optimal treatment regimes from observational and/or randomised studies. These methods are becoming commonplace in biomedical research, although guidance about how to choose among existing methods in practice has been somewhat limited. The purpose of this review is to describe some of the most commonly used methods for estimation of an optimal treatment regime, and to compare these estimators in a series of simulation experiments and applications to real data. The results of these simulations along with the theoretical/methodological properties of these estimators are used to form recommendations for applied researchers.
-
Mechanical ventilation (MV) is a critical life-support intervention in intensive care units (ICUs). However, optimal ventilator settings are challenging to determine because of the complexity of balancing patient-specific physiological needs with the risks of adverse outcomes that impact morbidity, mortality, and healthcare costs. This study introduces ConformalDQN, a novel distribution-free conformal deep Q-learning approach for optimizing mechanical ventilation in intensive care units. By integrating conformal prediction with deep reinforcement learning, our method provides reliable uncertainty quantification, addressing the challenges of Q-value overestimation and out-of-distribution actions in offline settings. We trained and evaluated our model using ICU patient records from the MIMIC-IV database. ConformalDQN extends the Double DQN architecture with a conformal predictor and employs a composite loss function that balances Q-learning with well-calibrated probability estimation. This enables uncertainty-aware action selection, allowing the model to avoid potentially harmful actions in unfamiliar states and handle distribution shifts by being more conservative in out-of-distribution scenarios. Evaluation against baseline models, including physician policies, policy constraint methods, and behavior cloning, demonstrates that ConformalDQN consistently makes recommendations within clinically safe and relevant ranges, outperforming other methods by increasing the 90-day survival rate. Notably, our approach provides an interpretable measure of confidence in its decisions, which is crucial for clinical adoption and potential human-in-the-loop implementations.
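As a rough illustration of how conformal prediction can restrict action selection, the sketch below uses generic split conformal prediction. It is not the ConformalDQN architecture or loss: the nonconformity score (one minus a predicted probability), the calibration scores, and the candidate action probabilities are all hypothetical.

```python
import math


def conformal_threshold(calibration_scores, alpha):
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest
    calibration nonconformity score, or infinity if that rank exceeds n."""
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")
    return sorted(calibration_scores)[rank - 1]


def conformal_action_set(action_probs, threshold):
    """Keep the actions whose nonconformity score 1 - p is within the threshold."""
    return [a for a, p in action_probs.items() if 1.0 - p <= threshold]


# Hypothetical calibration scores and candidate action probabilities.
calibration_scores = [0.05, 0.1, 0.2, 0.3, 0.4, 0.6, 0.9]
threshold = conformal_threshold(calibration_scores, alpha=0.25)

action_probs = {"a": 0.7, "b": 0.25, "c": 0.05}
safe_actions = conformal_action_set(action_probs, threshold)
```

A value-based agent could then choose its highest-valued action among `safe_actions` only, which is one simple way to be more conservative in unfamiliar states.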
-
The Covid-19 pandemic has brought the rapid expansion of virtual services and automated patient care. While there is a growing body of research on how organizations can leverage algorithm-enabled systems to make patient decisions, attention to the synergistic combination of organizational resources surrounding the use of these systems in providing virtual patient care has been limited. More importantly, the enablement of new avenues for value creation has been overlooked. This presentation reports how health practitioners within virtual contexts successfully use algorithm-enabled patient care systems, based on interviews with health professionals working in a Virtual Intensive Care Unit.
-
Laboratory testing is an integral tool in the management of patient care in hospitals, particularly in intensive care units (ICUs). There exists an inherent trade-off in the selection and timing of lab tests between the expected utility of a given test at a specific time for clinical decision-making and the associated cost or risk it poses to the patient. In this work, we introduce a framework that learns policies for ordering lab tests that optimize for this trade-off. Our approach uses batch off-policy reinforcement learning with a composite reward function based on clinical imperatives, applied to data that include examples of clinicians ordering labs for patients. To this end, we develop and extend principles of Pareto optimality to improve the selection of actions based on multiple reward function components while respecting typical procedural considerations and prioritization of clinical goals in the ICU. Our experiments show that we can estimate a policy that reduces the frequency of lab tests and optimizes timing to minimize information redundancy. We also find that the estimated policies typically suggest ordering lab tests well ahead of critical onsets, such as mechanical ventilation or dialysis, that depend on the lab results. We evaluate our approach by quantifying how these policies may initiate earlier onset of treatment.
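The Pareto-optimality idea for multi-component rewards can be sketched as a filter that discards dominated actions. This is a generic illustration, not the framework described in the abstract: the action names and the two value components (information gain and negative cost) are hypothetical.

```python
def dominates(u, v):
    """True if vector u is at least as good as v everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(u, v))
            and any(x > y for x, y in zip(u, v)))


def pareto_actions(q_vectors):
    """Return the actions whose value vectors are not dominated by any other action."""
    return [a for a, u in q_vectors.items()
            if not any(dominates(v, u) for b, v in q_vectors.items() if b != a)]


# Hypothetical per-action values: (information gain, negative cost).
q_vectors = {
    "order_now": (0.9, -0.5),
    "wait": (0.4, 0.0),
    "order_twice": (0.9, -1.0),  # same gain as order_now at higher cost
}
front = pareto_actions(q_vectors)
```

Here `order_twice` is dropped because `order_now` is no worse on either component and strictly better on cost; choosing among the remaining actions would still require some prioritization of the reward components.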