Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. While recent online sampling-based POMDP algorithms that plan with observation likelihood weighting have shown practical effectiveness, a general theory characterizing the approximation error of the particle filtering techniques that these algorithms use has not previously been proposed. Our main contribution is bounding the error between any POMDP and its corresponding finite sample particle belief MDP (PB-MDP) approximation. This fundamental bridge between PB-MDPs and POMDPs allows us to adapt any sampling-based MDP algorithm to a POMDP by solving the corresponding particle belief MDP, thereby extending the convergence guarantees of the MDP algorithm to the POMDP. Practically, this is implemented by using the particle filter belief transition model as the generative model for the MDP solver. While this requires access to the observation density model from the POMDP, it only increases the transition sampling complexity of the MDP solver by a factor of O(C), where C is the number of particles. Thus, when combined with sparse sampling MDP algorithms, this approach can yield algorithms for POMDPs that have no direct theoretical dependence on the size of the state and observation spaces. In addition to our theoretical contribution, we perform five numerical experiments on benchmark POMDPs to demonstrate that a simple MDP algorithm adapted using PB-MDP approximation, Sparse-PFT, achieves performance competitive with other leading continuous observation POMDP solvers.
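The particle belief transition used as the MDP generative model can be sketched as follows. This is a minimal illustration of observation likelihood weighting, not the paper's implementation; `step`, `sample_obs`, and `obs_density` are hypothetical stand-ins for the POMDP generative model and the observation density Z(o | a, s'):

```python
import math
import random

def pb_transition(particles, weights, action, step, sample_obs, obs_density, rng):
    """One transition of the particle belief MDP (PB-MDP).

    particles, weights : current particle belief (C states, normalized weights)
    step               : POMDP generative model, (s, a, rng) -> (s', r)
    sample_obs         : draws o ~ Z(. | a, s')
    obs_density        : evaluates Z(o | a, s')
    Returns the propagated particles, reweighted (normalized) weights, the
    sampled observation, and the weighted mean reward (the PB-MDP reward).
    """
    propagated, rewards = [], []
    for s in particles:
        s_next, r = step(s, action, rng)
        propagated.append(s_next)
        rewards.append(r)
    # Sample one observation from a particle chosen by its current weight.
    anchor = rng.choices(propagated, weights=weights, k=1)[0]
    obs = sample_obs(anchor, action, rng)
    # Reweight every particle by its observation likelihood.
    new_w = [w * obs_density(obs, s, action) for w, s in zip(weights, propagated)]
    total = sum(new_w)
    if total == 0.0:  # degenerate case: fall back to a uniform belief
        new_w = [1.0 / len(new_w)] * len(new_w)
    else:
        new_w = [w / total for w in new_w]
    reward = sum(w * r for w, r in zip(weights, rewards))
    return propagated, new_w, obs, reward

# Toy 1-D demo (hypothetical Gaussian model, purely illustrative).
rng = random.Random(0)
step = lambda s, a, rng: (s + a + rng.gauss(0.0, 0.1), -abs(s))
sample_obs = lambda s, a, rng: s + rng.gauss(0.0, 0.5)
obs_density = lambda o, s, a: math.exp(-0.5 * ((o - s) / 0.5) ** 2)
C = 100
belief0 = [rng.gauss(0.0, 1.0) for _ in range(C)]
w0 = [1.0 / C] * C
belief, belief_w, obs, reward = pb_transition(belief0, w0, 0.5, step,
                                              sample_obs, obs_density, rng)
```

Note that a single PB-MDP transition calls the POMDP generative model once per particle, which is the O(C) factor in sampling complexity mentioned above.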
PBVI for Optimal Photoplethysmography Noise Filter Selection Using Human Activity Recognition Observations for Improved Heart Rate Estimation on Multi-Sensor Systems
Abstract This work details the partially observable Markov decision process (POMDP) and point-based value iteration (PBVI) algorithms for use in multisensor systems, specifically a sensor system capable of heart rate (HR) estimation through wearable photoplethysmography (PPG) and accelerometer signals. PPG sensors are highly susceptible to motion artifact (MA); however, current methods focus on overall MA filters rather than action-specific filtering. An end-to-end embedded human activity recognition (HAR) system is developed to represent the observation uncertainty, and two action-specific PPG MA-reducing filters are proposed as actions. PBVI allows optimal action decision-making based on an uncertain observation, effectively balancing correct action choice against sensor system cost. Two central systems are proposed to accompany these algorithms, one with unlimited observation access and one with limited observation access. Through simulation, it can be shown that the unlimited observation system performs optimally when sensor cost is negligible, while the limited observation system performs optimally when a negative reward for sensor use is considered. The final general framework for POMDP and PBVI was applied to a specific HR estimation example. This work can be expanded on and used as a basis for future work on similar multisensor systems.
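Executing a PBVI policy at run time reduces to a dot product between the belief and each action's alpha vector. A minimal sketch of that action-selection step, with hypothetical activity states and filter actions (the alpha-vector values are made up for illustration, not taken from the paper):

```python
def best_action(belief, alpha_vectors):
    """Greedy PBVI policy execution: each action's alpha vector gives the
    long-term value of that action in each hidden state; choose the action
    maximizing the belief-weighted value."""
    best, best_val = None, float("-inf")
    for action, alpha in alpha_vectors.items():
        val = sum(b * a for b, a in zip(belief, alpha))
        if val > best_val:
            best, best_val = action, val
    return best

# Hypothetical example: belief over activities {walking, running} from the
# HAR observation, and one MA filter tuned to each activity.
belief = [0.7, 0.3]
alpha_vectors = {
    "walking_filter": [1.0, -0.5],   # good while walking, poor while running
    "running_filter": [-0.5, 1.0],   # the reverse
}
chosen = best_action(belief, alpha_vectors)
```

Here the belief leans toward walking, so the walking-tuned filter wins; a cost on sensor queries would enter through the alpha vectors computed offline by PBVI.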
- Award ID(s):
- 1828010
- PAR ID:
- 10516061
- Publisher / Repository:
- ASME International
- Date Published:
- Journal Name:
- Journal of Medical Devices
- Volume:
- 18
- Issue:
- 1
- ISSN:
- 1932-6181
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
The paper introduces a new algorithm for planning in partially observable Markov decision processes (POMDPs) based on the idea of aggregate simulation. The algorithm uses product distributions to approximate the belief state and shows how to build a representation graph of an approximate action-value function over belief space. The graph captures the result of simulating the model in aggregate under independence assumptions, giving a symbolic representation of the value function. The algorithm supports large observation spaces using sampling networks, a representation of the process of sampling values of observations, which is integrated into the graph representation. Following previous work in MDPs, this approach enables action selection in POMDPs through gradient optimization over the graph representation. It complements recent POMDP algorithms that are based on particle representations of belief states and an explicit search for action selection, and it enables scaling to large factored action spaces in addition to large state and observation spaces. An experimental evaluation demonstrates that the algorithm provides excellent performance relative to the state of the art on large POMDP problems.
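The core of the product-distribution approximation is treating each state variable's marginal as independent, so a factored state can be sampled variable by variable. A toy sketch under that independence assumption (hypothetical names; the paper's actual computation is symbolic over a graph, whereas this uses plain Monte Carlo as a stand-in):

```python
import random

def sample_state(marginals, rng):
    """Draw a factored binary state from a product-distribution belief:
    each state variable is sampled independently from its marginal."""
    return [1 if rng.random() < p else 0 for p in marginals]

def estimate_value(marginals, value_fn, rng, n=2000):
    """Monte Carlo estimate of a value function under the independence
    assumption; a crude numeric stand-in for aggregate simulation."""
    return sum(value_fn(sample_state(marginals, rng)) for _ in range(n)) / n

# Hypothetical belief over two binary state variables, each true w.p. 0.5;
# the value function counts how many variables are set, so the true
# expectation under independence is 1.0.
rng = random.Random(0)
estimate = estimate_value([0.5, 0.5], sum, rng)
```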
-
Abstract To be responsive to dynamically changing real-world environments, an intelligent agent needs to perform complex sequential decision-making tasks that are often guided by commonsense knowledge. Previous work on this line of research led to the framework called interleaved commonsense reasoning and probabilistic planning (icorpp), which used P-log for representing commonsense knowledge and Markov decision processes (MDPs) or partially observable MDPs (POMDPs) for planning under uncertainty. A main limitation of icorpp is that its implementation requires non-trivial engineering effort to bridge the commonsense reasoning and probabilistic planning formalisms. In this paper, we present a unified framework to integrate icorpp's reasoning and planning components. In particular, we extend the probabilistic action language pBC+ to express utility, belief states, and observations as in POMDP models. Inheriting the advantages of action languages, the new action language provides an elaboration-tolerant representation of POMDPs that reflects commonsense knowledge. The idea led to the design of the system pbcplus2pomdp, which compiles a pBC+ action description into a POMDP model that can be directly processed by off-the-shelf POMDP solvers to compute an optimal policy for the pBC+ action description. Our experiments show that it retains the advantages of icorpp while avoiding the manual effort of bridging the commonsense reasoner and the probabilistic planner.
-
The objective of this study was to investigate the accuracy of a wearable photoplethysmography (PPG) sensor in monitoring the heart rate (HR) of sheep housed in high-temperature environments. We hypothesized that the PPG sensor would be capable of differentiating low, normal, and high HR but would struggle to produce exact HR estimates. The sensor was open source and comprised a microprocessor (SparkFun® Thing Plus), a photoplethysmography sensor (SparkFun® MAX30101 & MAX32664), and a data storage module (16 GB SD card), all sewn into a nylon collar with hook-and-loop closure. Sheep (n=4) were divided into 2 groups and exposed to different thermal environments in a cross-over design. The collar was placed around the neck of each sheep during the data collection phase, and manual HR measurements were collected twice a day using a stethoscope. Precision and accuracy of numeric heart rate estimates were analyzed in R software using Pearson correlation and root mean squared prediction errors. Random forest regression was used to classify HR as low, medium, or high to determine opportunities to leverage PPG sensors for HR classification. Sensitivity, specificity, and accuracy were measured to evaluate the classification approach. Our results indicated that the PPG-based sensor measured sheep HR with poor accuracy and with higher average estimates than those measured manually with a stethoscope. Categorical classification of HR was also poor, with accuracies ranging from 32% to 49%. Additional work focusing on data analytics and signal optimization is needed to further rely on PPG sensors for accurately measuring HR in sheep.
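The two numeric accuracy measures named above, Pearson correlation and root mean squared prediction error, are standard and easy to state directly. A small self-contained sketch with made-up HR values (the actual study data are not reproduced here):

```python
import math

def rmse(pred, true):
    """Root mean squared prediction error between paired readings."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical paired readings: PPG sensor estimate vs. stethoscope count.
sensor = [72, 80, 95, 110]
manual = [70, 78, 100, 105]
err = rmse(sensor, manual)
r = pearson_r(sensor, manual)
```

A high correlation with a large RMSE would match the study's hypothesis: the sensor tracks HR trends (low vs. high) better than it matches exact values.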
-
Remaining useful life (RUL) estimation is critical in many engineering systems where proper predictive maintenance is needed to increase a unit's effectiveness and reduce the time and cost of repairs. Typically in such systems, multiple sensors are used to monitor performance, which creates difficulties for system state identification. In this paper, we develop a semi-supervised left-to-right constrained hidden Markov model (HMM), which is effective in estimating the RUL while capturing the jumps among states in condition dynamics. In addition, based on the HMM learned from multiple sensors, we build a partially observable Markov decision process (POMDP) to demonstrate how such RUL estimation can be effectively used for optimal preventive maintenance decision-making. We apply this technique to the NASA engine degradation data and demonstrate the effectiveness of the proposed method.
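State identification with an HMM of this kind rests on forward-algorithm filtering: the posterior over hidden degradation states after each observation. A minimal sketch with a hypothetical two-state left-to-right model (a healthy state that can only transition toward a degraded one, matching the left-to-right constraint; the matrices are illustrative, not learned from the NASA data):

```python
def hmm_forward(obs_seq, pi, A, B):
    """Forward-algorithm filtering for a discrete HMM.

    pi : initial state distribution
    A  : transition matrix, A[i][j] = P(next state j | state i)
    B  : emission matrix,   B[i][o] = P(observation o | state i)
    Returns the normalized posterior over hidden states after each step.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs_seq[0]] for i in range(n)]
    z = sum(alpha)
    alpha = [a / z for a in alpha]
    posteriors = [alpha]
    for o in obs_seq[1:]:
        alpha = [B[j][o] * sum(alpha[i] * A[i][j] for i in range(n))
                 for j in range(n)]
        z = sum(alpha)
        alpha = [a / z for a in alpha]
        posteriors.append(alpha)
    return posteriors

# Hypothetical left-to-right model: state 0 = healthy, state 1 = degraded.
A = [[0.9, 0.1],   # healthy can stay healthy or degrade ...
     [0.0, 1.0]]   # ... but degraded never recovers (left-to-right).
B = [[0.8, 0.2],   # healthy mostly emits observation 0
     [0.2, 0.8]]   # degraded mostly emits observation 1
pi = [1.0, 0.0]
posteriors = hmm_forward([0, 1, 1], pi, A, B)
```

The filtered posterior is what a POMDP layer on top would consume as its belief when choosing between continuing operation and preventive maintenance.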