Title: Optimal Fidelity Selection for Human-in-the-loop Queues using Semi-Markov Decision Processes
We study optimal fidelity selection for a human operator servicing a queue of homogeneous tasks. The service time distribution of the human operator depends on her cognitive dynamics and the level of fidelity selected for servicing the task. Cognitive dynamics of the operator evolve as a Markov chain in which the cognitive state increases (decreases) with high probability whenever she is busy (resting). Tasks arrive according to a Poisson process, and each task waiting in the queue loses its value at a fixed rate. We address the trade-off between high-quality service of the current task and the consequent loss in value of future tasks using a Semi-Markov Decision Process (SMDP) framework. We numerically determine an optimal policy and establish its structural properties.
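To make the SMDP machinery concrete, here is a minimal value-iteration sketch over a (cognitive state, queue length) grid with two fidelity levels. Every parameter and the transition model below are hypothetical stand-ins, not values from the paper; resting dynamics are omitted for brevity.

```python
import numpy as np

# Minimal SMDP value-iteration sketch for fidelity selection (illustrative
# only: all parameters and the transition model are hypothetical stand-ins).
C, Q = 5, 10                  # cognitive levels 0..C-1, queue lengths 0..Q-1
beta = 0.1                    # continuous-time discount rate
tau = np.array([1.0, 3.0])    # assumed mean service time: low/high fidelity
rew = np.array([1.0, 2.5])    # assumed task reward: low/high fidelity
hold = 0.2                    # assumed value-loss rate per waiting task

states = [(c, q) for c in range(C) for q in range(Q)]
idx = {s: i for i, s in enumerate(states)}

def transitions(s, a):
    """Hypothetical dynamics: cognitive state tends to rise while busy, and
    one arrival may occur during the service interval (resting omitted)."""
    c, q = s
    out = []
    for dc, pc in [(+1, 0.7), (-1, 0.3)]:        # cognitive drift while busy
        for dq, pq in [(+1, 0.5), (0, 0.5)]:     # possible arrival in service
            c2 = min(max(c + dc, 0), C - 1)
            q2 = min(max(q - 1 + dq, 0), Q - 1)  # serviced task departs
            out.append((pc * pq, (c2, q2)))
    return out

def backup(s, a, V):
    """One SMDP Bellman backup with sojourn-time discounting exp(-beta*tau)."""
    r = rew[a] - hold * s[1] * tau[a]            # reward minus waiting loss
    ev = sum(p * V[idx[s2]] for p, s2 in transitions(s, a))
    return r + np.exp(-beta * tau[a]) * ev

V = np.zeros(len(states))
for _ in range(1000):                            # value iteration to fixed point
    Vn = np.array([max(backup(s, a, V) for a in (0, 1)) for s in states])
    if np.max(np.abs(Vn - V)) < 1e-8:
        break
    V = Vn
policy = {s: int(np.argmax([backup(s, a, V) for a in (0, 1)])) for s in states}
```

Structural properties of the optimal policy (e.g., threshold behavior in queue length or cognitive state) can then be inspected directly from the `policy` dictionary.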
Award ID(s):
1734272
PAR ID:
10309137
Author(s) / Creator(s):
Date Published:
Journal Name:
2019 American Control Conference (ACC)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider a team of heterogeneous agents that is collectively responsible for servicing and subsequently reviewing a stream of homogeneous tasks. Each agent (autonomous system or human operator) has an associated mean service time and mean review time for servicing and reviewing the tasks, respectively, which are based on their expertise and skill sets. The team objective is to collaboratively maximize the number of "serviced and reviewed" tasks. To this end, we formulate a Common-Pool Resource (CPR) game and design utility functions to incentivize collaboration among team members. We show the existence and uniqueness of the Pure Nash Equilibrium (PNE) for the CPR game. Additionally, we characterize the structure of the PNE and study the effect of heterogeneity among the agents at the PNE. We show that the formulated CPR game is a best-response potential game for which both sequential best-response dynamics and simultaneous best-reply dynamics converge to the Nash equilibrium. Finally, we numerically illustrate the price of anarchy for the PNE (see the best-response dynamics sketch after this list).
  2. We study optimal pricing in a single-server queue when the customers' valuation of service depends on their waiting time. In particular, we consider a very general model in which customer valuations are random and are sampled from a distribution that depends on the queue length. The goal of the service provider is to set dynamic state-dependent prices in order to maximize its revenue while also managing congestion. We model the problem as a Markov decision process and present structural results on the optimal policy. We also present an algorithm to find an approximate optimal policy. We further present a myopic policy that is easy to evaluate and present bounds on its performance. We finally illustrate the quality of our approximate solution and the myopic solution using numerical simulations (see the pricing value-iteration sketch after this list).
  3. Cloud computing (CC) often necessitates dynamic adjustments due to its inherently fluid nature. In this paper, we introduce a novel dynamic task scheduling model that incorporates reward and holding-cost considerations, leveraging the Continuous-Time Markov Decision Process (CTMDP) framework in heterogeneous CC systems. The primary goal of this model is to maximize the overall system reward for the Cloud Service Provider. By solving the Bellman optimality equation using the value-iteration method, we can derive an optimal scheduling policy for the dynamic task scheduling model. Additionally, to enhance its practicality in real-world scenarios, we incorporate a model-free reinforcement learning algorithm to obtain the optimal policy for our proposed model without requiring explicit knowledge of the system environment (see the Q-learning sketch after this list). Simulation results demonstrate that our proposed model outperforms two common static scheduling methods.
  4. Using the context of human-supervised object collection tasks, we explore policies for a robot to seek assistance from a human supervisor and avoid loss of human trust in the robot. We consider a human-robot interaction scenario in which a mobile manipulator chooses to collect objects either autonomously or through human assistance, while the human supervisor monitors the robot’s operation, assists when asked, or intervenes if the human perceives that the robot may not accomplish its goal. We design an optimal assistance-seeking policy for the robot using a Partially Observable Markov Decision Process (POMDP) setting in which human trust is a hidden state and the objective is to maximize collaborative performance (see the belief-update sketch after this list). We conduct two sets of human-robot interaction experiments. The data from the first set of experiments are used to estimate the POMDP parameters, which are then used to compute the optimal assistance-seeking policy deployed in the second experiment. For most participants, the estimated POMDP reveals that humans are more likely to intervene when their trust is low and the robot is performing a high-complexity task, and that the robot asking for assistance in high-complexity tasks can increase human trust in the robot. Our experimental results show that the proposed trust-aware policy yields superior performance compared with an optimal trust-agnostic policy.
  5. Autonomous systems that can assist humans with increasingly complex tasks are becoming ubiquitous. Moreover, it has been established that a human’s decision to rely on such systems is a function of both their trust in the system and their own self-confidence as it relates to executing the task of interest. Given that both under- and over-reliance on automation can pose significant risks to humans, there is motivation for developing autonomous systems that could appropriately calibrate a human’s trust or self-confidence to achieve proper reliance behavior. In this article, a computational model of coupled human trust and self-confidence dynamics is proposed. The dynamics are modeled as a partially observable Markov decision process without a reward function (POMDP/R) that leverages behavioral and self-report data as observations for the estimation of these cognitive states (see the filtering sketch after this list). The model is trained and validated using data collected from 340 participants. Analysis of the transition probabilities shows that the proposed model captures the probabilistic relationship between trust, self-confidence, and reliance for all discrete combinations of high and low trust and self-confidence. The use of the proposed model to design an optimal policy to facilitate trust and self-confidence calibration is a goal of future work.
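For item 1, the following is a minimal sketch of sequential best-response dynamics in a CPR-style game. The payoff function, skill values, and grid search are illustrative assumptions, not the utility functions designed in that paper.

```python
import numpy as np

# Sequential best-response dynamics for a CPR-style game (sketch; the payoff
# below is a generic congestible-pool utility, not the paper's design).
n = 4
rng = np.random.default_rng(0)
skill = rng.uniform(0.5, 1.5, n)          # hypothetical heterogeneous skills
grid = np.linspace(0.0, 1.0, 201)         # candidate effort levels per agent

def utility(i, xi, x):
    total = x.sum() - x[i] + xi           # aggregate effort in the common pool
    output = total * max(1.5 - 0.05 * total, 0.0)     # congestible pool output
    share = xi / total if total > 0 else 0.0
    return skill[i] * share * output - 0.5 * xi ** 2  # payoff minus effort cost

x = np.full(n, 0.5)
for sweep in range(200):                  # repeated best-response sweeps
    x_prev = x.copy()
    for i in range(n):
        x[i] = grid[np.argmax([utility(i, g, x) for g in grid])]
    if np.max(np.abs(x - x_prev)) < 1e-6: # fixed point = pure Nash equilibrium
        break
print("equilibrium efforts:", np.round(x, 3))
```

In a best-response potential game, such sweeps are guaranteed to converge; the sketch simply stops at the first numerical fixed point.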
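For item 2, a sketch of value iteration for state-dependent pricing in a uniformized single-server queue. The queue-dependent valuation law, rates, and price grid are assumed for illustration.

```python
import numpy as np

# Value iteration for state-dependent pricing in a uniformized single-server
# queue (sketch; the queue-dependent valuation law and rates are assumed).
Qmax = 20                            # queue-length truncation
lam, mu, gamma = 0.8, 1.0, 0.95      # arrival rate, service rate, discount
prices = np.linspace(0.0, 5.0, 26)   # candidate posted prices

def join_prob(q, p):
    """Assumed valuation tail: willingness to pay decays with queue length."""
    return np.exp(-p * (1.0 + 0.2 * q) / 5.0)

V = np.zeros(Qmax + 1)
for _ in range(2000):
    Vn = np.empty_like(V)
    for q in range(Qmax + 1):
        best = -np.inf
        for p in prices:
            pj = join_prob(q, p) if q < Qmax else 0.0  # block when full
            up = lam * pj * (p + V[min(q + 1, Qmax)])  # customer joins, pays p
            stay = lam * (1 - pj) * V[q]               # customer balks
            down = mu * V[max(q - 1, 0)]               # service completion
            best = max(best, gamma * (up + stay + down) / (lam + mu))
        Vn[q] = best
    if np.max(np.abs(Vn - V)) < 1e-9:
        break
    V = Vn
# A myopic comparison policy: maximize immediate expected revenue only.
myopic = [prices[np.argmax([join_prob(q, p) * p for p in prices])]
          for q in range(Qmax + 1)]
```

Comparing the greedy prices recovered from `V` against `myopic` mirrors the paper's comparison of the approximate optimal policy with the easy-to-evaluate myopic one.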
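For item 3, a sketch of the model-free route: tabular Q-learning on a toy two-server scheduling chain after uniformization. The rates, rewards, and simulator are invented for illustration, not the paper's system model.

```python
import numpy as np

# Tabular Q-learning on a toy two-server scheduling chain (sketch; rates,
# rewards, and the simulator are invented, and the CTMDP is uniformized so
# it can be treated as a discrete-time MDP).
rng = np.random.default_rng(1)
n_states, n_actions = 10, 2        # queue length 0..9; action = pick server
svc = np.array([1.0, 1.6])         # heterogeneous service rates
lam, reward_done, hold = 0.8, 5.0, 0.5
alpha, gamma, eps = 0.1, 0.95, 0.1
Q = np.zeros((n_states, n_actions))

def step(s, a):
    """Simulate one uniformized transition of the toy scheduling chain."""
    if rng.random() < lam / (lam + svc[a]):      # arrival event
        return min(s + 1, n_states - 1), -hold * s
    if s == 0:                                   # idle: nothing completes
        return 0, 0.0
    return s - 1, reward_done - hold * s         # service completion

s = 0
for _ in range(200_000):
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
    s2, r = step(s, a)
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])  # Q-learning update
    s = s2
print("greedy server choice by queue length:", np.argmax(Q, axis=1))
```

The same chain could instead be solved exactly by value iteration on the Bellman optimality equation when the transition rates are known; Q-learning needs only the simulator.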
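For item 4, a sketch of the POMDP belief update over a hidden binary trust state; the transition and observation matrices below are illustrative, not values estimated from the experiments.

```python
import numpy as np

# Bayesian belief update over a hidden binary trust state in a POMDP
# (sketch; matrices are illustrative assumptions).
# states: 0 = low trust, 1 = high trust; observations: 0 = intervene, 1 = comply
T = {  # assumed trust transition per robot action
    "autonomous": np.array([[0.9, 0.1], [0.2, 0.8]]),
    "ask_help":   np.array([[0.6, 0.4], [0.1, 0.9]]),
}
O = np.array([[0.7, 0.3],   # P(obs | low trust): intervenes often
              [0.1, 0.9]])  # P(obs | high trust): mostly complies

def belief_update(b, action, obs):
    """b'(s') is proportional to O[s', obs] * sum_s T[a][s, s'] * b(s)."""
    pred = b @ T[action]          # predict through the action's transition
    post = O[:, obs] * pred       # correct with the observation likelihood
    return post / post.sum()

b = np.array([0.5, 0.5])
for action, obs in [("autonomous", 0), ("ask_help", 1), ("autonomous", 1)]:
    b = belief_update(b, action, obs)
    print(action, obs, np.round(b, 3))
```

A trust-aware policy then maps this belief (rather than a directly observed trust level) to the choice between acting autonomously and asking for assistance.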
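For item 5, a sketch of HMM-style forward filtering of the four joint trust/self-confidence states from discrete reliance observations; all probabilities are illustrative, not the fitted values from the 340-participant study.

```python
import numpy as np

# Forward filtering of the four joint cognitive states
# {low/high trust} x {low/high self-confidence} from reliance observations
# (sketch; all probabilities are illustrative, not fitted values).
states = ["(T-,S-)", "(T-,S+)", "(T+,S-)", "(T+,S+)"]
T = np.array([[0.70, 0.10, 0.10, 0.10],
              [0.10, 0.70, 0.10, 0.10],
              [0.10, 0.10, 0.70, 0.10],
              [0.05, 0.05, 0.10, 0.80]])   # assumed state persistence
# observations: 0 = relied on automation, 1 = self-performed the task
O = np.array([[0.3, 0.7],    # low trust, low self-confidence
              [0.1, 0.9],    # low trust, high self-confidence -> self-reliance
              [0.9, 0.1],    # high trust, low self-confidence -> automation
              [0.5, 0.5]])   # high trust, high self-confidence

b = np.full(4, 0.25)
for obs in [0, 0, 1, 0]:     # an example reliance trajectory
    b = O[:, obs] * (b @ T)  # predict with T, correct with observation model
    b /= b.sum()
print({s: round(p, 3) for s, p in zip(states, b)})
```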