We consider the following scenario: A radiotherapy clinic has a limited number of proton therapy slots available each day to treat cancer patients of a given tumor site. The clinic's goal is to minimize the expected number of complications in the cohort of all patients of that tumor site treated at the clinic, and thereby maximize the benefit of its limited proton resources.

To address this problem, we extend the normal tissue complication probability (NTCP) model–based approach to proton therapy patient selection to the case in which an institution's proton resources are limited. We assume that, on each day, a newly diagnosed patient is scheduled for treatment at the clinic with some probability, and that the patient's benefit from protons over photons is drawn from a probability distribution. When a new patient is scheduled for treatment, a decision between protons and photons must be made, and a patient may wait only a limited amount of time for a proton slot to become available. The goal is to determine the thresholds for selecting a patient for proton therapy that optimally balance two competing aims: using all available slots, and not blocking slots with patients who benefit little. This problem can be formulated as a Markov decision process (MDP), and the optimal thresholds can be determined via a value‐policy iteration method.
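To make the formulation concrete, the following is a minimal sketch of a much simplified version of such an MDP, solved with plain value iteration rather than the value‐policy iteration method named above. All modeling choices here are illustrative assumptions and are not taken from the study: the state is only the number of occupied proton slots, each occupied slot frees up overnight with a fixed probability `q_finish`, patients cannot wait, benefits come from a small discrete distribution, and future benefit is discounted by `gamma`. The function name `optimal_thresholds` and all parameter names are hypothetical.

```python
from math import comb


def optimal_thresholds(n_slots, p_arrival, q_finish, benefits, probs,
                       gamma=0.99, tol=1e-10, max_iter=100_000):
    """Value iteration for a toy proton-slot admission model (assumed, simplified).

    Returns one threshold per state n = 0 .. n_slots - 1 (occupied slots):
    a patient is assigned to protons if their benefit is at least thresholds[n].
    """
    # pmf[m][k]: probability that k of m occupied slots free up overnight
    # (binomial, i.e. geometric treatment durations; an assumption of this sketch).
    pmf = [[comb(m, k) * q_finish**k * (1 - q_finish)**(m - k)
            for k in range(m + 1)]
           for m in range(n_slots + 1)]

    def continuation(values):
        # Discounted expected value of ending the day with m occupied slots.
        return [gamma * sum(pmf[m][k] * values[m - k] for k in range(m + 1))
                for m in range(n_slots + 1)]

    values = [0.0] * (n_slots + 1)
    for _ in range(max_iter):
        cont = continuation(values)
        new_values = []
        for n in range(n_slots + 1):
            if n == n_slots:
                # All slots occupied: any arriving patient receives photons.
                new_values.append(cont[n])
                continue
            # If a patient arrives, take the better of rejecting and accepting.
            with_patient = sum(pr * max(cont[n], benefit + cont[n + 1])
                               for benefit, pr in zip(benefits, probs))
            new_values.append((1 - p_arrival) * cont[n]
                              + p_arrival * with_patient)
        change = max(abs(old - new) for old, new in zip(values, new_values))
        values = new_values
        if change < tol:
            break

    cont = continuation(values)
    # Accepting is optimal when benefit >= expected future value lost by
    # occupying one more slot.
    return [cont[n] - cont[n + 1] for n in range(n_slots)]


# Hypothetical example inputs, chosen only for illustration.
thresholds = optimal_thresholds(
    n_slots=3, p_arrival=0.3, q_finish=0.05,
    benefits=[0.02, 0.05, 0.10, 0.20], probs=[0.4, 0.3, 0.2, 0.1])
```

In this sketch, `thresholds[n]` is the expected future benefit given up by occupying one more slot when `n` slots are already in use, so the threshold is a function of the current utilization. Extending the state to track remaining treatment days per slot and waiting patients would bring it closer to the scenario described above.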

The optimal thresholds depend on the number of available proton slots, the average number of patients under treatment, and the distribution of the patients' benefit values. In addition, the optimal thresholds depend on the current utilization of the facility. For example, if one proton slot is available and a second will free up shortly, the optimal threshold is lower than in a situation where all but one slot remain blocked for longer.

MDP methodology can be used to extend current NTCP model–based patient selection methods to the situation in which, on any given day, the number of proton slots is limited. The optimal threshold then depends on the current utilization of the proton facility. Although the optimal policy yields only a small nominal benefit over a constant threshold, it is more robust against variations in patient load.