Abstract The prolonged COVID-19 pandemic has tied up significant medical resources, and its management poses a challenge for public health care decision making. Accurate predictions of hospitalizations are crucial for decision makers to allocate medical resources in an informed way. This paper proposes a method named County Augmented Transformer (CAT) to generate accurate four-week-ahead predictions of COVID-19 related hospitalizations for every state in the United States. Inspired by modern deep learning techniques, our method is based on a self-attention model (known as the transformer model) that is widely used in natural language processing. Our transformer-based model can capture both short-term and long-term dependencies within the time series while remaining computationally efficient. The model is a data-driven approach that uses publicly available information, including COVID-19 confirmed case, death, and hospitalization counts as well as household median income data. Our numerical experiments demonstrate the strength and usability of our model as a potential tool for assisting medical resource allocation.
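As a rough illustration of the modeling idea (not the authors' code), the sketch below wires weekly per-state features through a small transformer encoder and reads a four-week-ahead hospitalization forecast off the final time step. All module names, feature choices, and hyperparameters are hypothetical assumptions.

```python
# Illustrative sketch only: a minimal self-attention forecaster in the spirit
# of the paper's transformer approach. Names and hyperparameters are ours.
import torch
import torch.nn as nn

class HospitalizationTransformer(nn.Module):
    def __init__(self, n_features=4, d_model=64, n_heads=4, n_layers=2, horizon=4):
        super().__init__()
        # Project per-week features (e.g., cases, deaths, hospitalizations,
        # median income) into the model dimension.
        self.input_proj = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Predict the next `horizon` weeks of hospitalizations from the last
        # encoded time step.
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x):            # x: (batch, weeks, n_features)
        h = self.encoder(self.input_proj(x))
        return self.head(h[:, -1])   # (batch, horizon)

model = HospitalizationTransformer()
weekly_history = torch.randn(8, 20, 4)   # 8 states, 20 weeks, 4 features
print(model(weekly_history).shape)       # torch.Size([8, 4])
```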
Reinforcement Learning Methods in Public Health
Reinforcement learning (RL) is the subfield of machine learning focused on optimal sequential decision making under uncertainty. An optimal RL strategy maximizes cumulative utility by experimenting only if and when the information generated by experimentation is likely to outweigh the associated short-term costs. RL represents a holistic approach to decision making that evaluates the impact of every action (i.e., data collection, allocation of resources, and treatment assignment) in terms of short-term and long-term utility to stakeholders. Thus, RL is an ideal framework for a number of complex decision problems that arise in public health, including resource allocation in a pandemic, monitoring or testing, and adaptive sampling for hidden populations. Nevertheless, although RL has been applied successfully in a wide range of domains, including precision medicine, it has not been widely adopted in public health. The purposes of this review are to introduce key ideas in RL and to identify challenges and opportunities associated with the application of RL in public health.
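To make the "short-term cost versus long-term utility" trade-off concrete, here is a minimal tabular Q-learning sketch on an invented three-state resource-allocation MDP. The dynamics and rewards are toy assumptions of ours, not drawn from the review.

```python
# Toy Q-learning: severity states 0..2, actions {hold, allocate}.
# Allocating costs now but reduces severity, illustrating how the Q-update
# trades an immediate cost against discounted long-term value.
import numpy as np

n_states, n_actions = 3, 2
rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1   # learning rate, discount, exploration

def step(s, a):
    reward = -s - (0.5 if a == 1 else 0.0)   # severity hurts; allocation costs
    s_next = max(s - 1, 0) if a == 1 else min(s + (rng.random() < 0.5),
                                              n_states - 1)
    return s_next, reward

s = 2
for t in range(5000):
    a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
    s_next, r = step(s, a)
    # Q-update: short-term reward plus discounted long-term value.
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print(Q)   # learned action values per severity level
```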
- Award ID(s): 2103672
- PAR ID: 10321951
- Publisher / Repository: ScienceDirect
- Date Published:
- Journal Name: Clinical Therapeutics
- Volume: 44
- Issue: 1
- ISSN: 0149-2918
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Identifying critical decisions is one of the most challenging decision-making problems in real-world applications. In this work, we propose a novel Reinforcement Learning (RL) based Long-Short Term Rewards (LSTR) framework for identifying critical decisions. RL is a machine learning area concerned with inducing effective decision-making policies that maximize cumulative "reward." Many RL algorithms find the optimal policy by estimating the optimal Q-values, which specify the maximum cumulative reward the agent can receive. In our LSTR framework, the "long term" rewards are the Q-values and the "short term" rewards are given by the reward function. Experiments on a synthetic GridWorld game and real-world Intelligent Tutoring System datasets show that the proposed LSTR framework indeed identifies the critical decisions in the sequences. Furthermore, our results show that carrying out the critical decisions alone is as effective as a fully executed policy.
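One plausible way to operationalize "critical decisions" from learned Q-values is to flag states where the gap between the best and worst action values is large, since the choice of action matters most there. The gap heuristic below is our own stand-in for illustration, not necessarily the LSTR criterion.

```python
# Flag states whose best-vs-worst Q-value gap is unusually large:
# at those states, the decision has a big impact on cumulative reward.
import numpy as np

Q = np.array([[5.0, 4.9],    # state 0: actions nearly tied -> not critical
              [9.0, 1.0],    # state 1: huge gap -> critical
              [3.0, 2.0]])   # state 2: moderate gap

gaps = Q.max(axis=1) - Q.min(axis=1)
threshold = gaps.mean() + gaps.std()     # simple outlier cutoff
critical_states = np.where(gaps > threshold)[0]
print(critical_states)                   # [1]
```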
- Offline reinforcement learning (RL) is a promising approach for training intelligent medical agents to learn treatment policies and assist decision making in many healthcare applications, such as scheduling clinical visits and assigning dosages for patients with chronic conditions. In this paper, we investigate the potential usefulness of Decision Transformer (Chen et al., 2021), a new offline RL paradigm, in medical domains where decision making in continuous time is desired. As Decision Transformer only handles discrete-time (or turn-based) sequential decision making scenarios, we generalize it to Continuous-Time Decision Transformer, which not only considers the past clinical measurements and treatments but also the timings of previous visits, and learns to suggest the timings of future visits as well as the treatment plan at each visit. Extensive experiments on synthetic datasets and simulators motivated by real-world medical applications demonstrate that Continuous-Time Decision Transformer is able to outperform competitors and has clinical utility in terms of improving patients' health and prolonging their survival by learning high-performance policies from logged data generated using policies of different levels of quality.
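The token layout this generalization implies can be sketched as follows: each visit contributes (return-to-go, state, action) tokens, and an embedding of the elapsed time since the previous visit is added to all three, standing in for the discrete position embeddings of the original Decision Transformer. Dimensions and module names below are our assumptions, not the paper's code.

```python
# Sketch of a continuous-time input embedding for a Decision-Transformer-style
# model: every visit's tokens carry the time gap since the previous visit.
import torch
import torch.nn as nn

class CTDTEmbedding(nn.Module):
    def __init__(self, state_dim=8, action_dim=3, d_model=64):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)        # return-to-go
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(action_dim, d_model)
        self.embed_dt = nn.Linear(1, d_model)         # time since last visit

    def forward(self, rtg, states, actions, dts):
        # The time-gap embedding plays the role that discrete position
        # embeddings play in the original Decision Transformer.
        t = self.embed_dt(dts)
        tokens = torch.stack([self.embed_rtg(rtg) + t,
                              self.embed_state(states) + t,
                              self.embed_action(actions) + t], dim=2)
        b, n, k, d = tokens.shape
        return tokens.reshape(b, n * k, d)    # interleaved token sequence

emb = CTDTEmbedding()
b, n = 2, 5                                   # 2 patients, 5 visits each
out = emb(torch.randn(b, n, 1), torch.randn(b, n, 8),
          torch.randn(b, n, 3), torch.rand(b, n, 1))
print(out.shape)                              # torch.Size([2, 15, 64])
```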
- Deep reinforcement learning (RL) has recently been successfully applied to networking contexts including routing, flow scheduling, congestion control, packet classification, cloud resource management, and video streaming. Deep-RL-driven systems automate decision making, and have been shown to outperform state-of-the-art handcrafted systems in important domains. However, the (typical) non-explainability of decisions induced by the deep learning machinery employed by these systems renders reasoning about crucial system properties, including correctness and security, extremely difficult. We show that despite the obscurity of decision making in these contexts, verifying that deep-RL-driven systems adhere to desired, designer-specified behavior is achievable. To this end, we initiate the study of formal verification of deep RL and present Verily, a system for verifying deep-RL-based systems that leverages recent advances in verification of deep neural networks. We employ Verily to verify recently-introduced deep-RL-driven systems for adaptive video streaming, cloud resource management, and Internet congestion control. Our results expose scenarios in which deep-RL-driven decision making yields undesirable behavior. We discuss guidelines for building deep-RL-driven systems that are both safer and easier to verify.
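The property-checking idea can be illustrated naively: probe a policy network on a grid of inputs and test a designer-specified rule on its outputs. Real verifiers such as the one underlying Verily reason symbolically over entire input regions rather than sampling; the exhaustive grid probe and the tiny policy below are purely our own construction.

```python
# Naive property probe (NOT formal verification): check that a toy video
# streaming policy never picks the highest bitrate when the buffer is
# nearly empty, over a discretized grid of inputs.
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(2, 16)), np.zeros(16)   # toy 2-layer policy net
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)    # 3 bitrates: low/med/high

def policy(buffer_level, throughput):
    h = np.maximum(0, np.array([buffer_level, throughput]) @ W1 + b1)
    return int((h @ W2 + b2).argmax())            # chosen bitrate index

# Property: with a nearly empty buffer, never choose the highest bitrate (2).
violations = [(b, t)
              for b in np.linspace(0.0, 0.1, 11)
              for t in np.linspace(0.0, 1.0, 101)
              if policy(b, t) == 2]
print(f"{len(violations)} violating inputs found on the grid")
```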
- Restless multi-armed bandits (RMAB) have been widely used to model sequential decision making problems with constraints. The decision maker (DM) aims to maximize the expected total reward over an infinite horizon under an "instantaneous activation constraint" that at most B arms can be activated at any decision epoch, where the state of each arm evolves stochastically according to a Markov decision process (MDP). However, this basic model fails to provide any fairness guarantee among arms. In this paper, we introduce RMAB-F, a new RMAB model with "long-term fairness constraints", where the objective now is to maximize the long-term reward while a minimum long-term activation fraction for each arm must be satisfied. For the online RMAB-F setting (i.e., the underlying MDPs associated with each arm are unknown to the DM), we develop a novel reinforcement learning (RL) algorithm named Fair-UCRL. We prove that Fair-UCRL ensures probabilistic sublinear bounds on both the reward regret and the fairness violation regret. Compared with off-the-shelf RL methods, our Fair-UCRL is much more computationally efficient since it contains a novel exploitation step that leverages a low-complexity index policy for making decisions. Experimental results further demonstrate the effectiveness of our Fair-UCRL.
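The constraint structure can be sketched with a simple deficit-first activation rule: each epoch, activate the B arms furthest behind their fairness floor when any arm is behind, and otherwise the B arms with the highest indices. This rule is an illustrative stand-in of ours, not the Fair-UCRL algorithm itself, and the index values are placeholders for learned Whittle-style indices.

```python
# Budgeted activation with a long-term fairness floor: at most B arms per
# epoch; any arm whose activation fraction drops below the floor is rescued
# first, otherwise spend the budget on the highest-index arms.
import numpy as np

n_arms, B, floor = 5, 2, 0.15
rng = np.random.default_rng(2)
indices = rng.random(n_arms)         # placeholder for learned indices
activations = np.zeros(n_arms)
T = 1000

for t in range(T):
    frac = activations / max(t, 1)   # empirical activation fraction so far
    deficit = floor - frac
    chosen = (np.argsort(-deficit)[:B] if deficit.max() > 0
              else np.argsort(-indices)[:B])
    activations[chosen] += 1

print(activations / T)   # every arm's fraction ends up near or above 0.15
```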