NSF PAR Search | NSF Public Access Repository


Scalable Decision-Making In Stochastic Environments Through Learned Temporal Abstraction

Luo, Baiting; Pettet, Ava; Laszka, Aron; Dubey, Abhishek; Mukhopadhyay, Ayan (January 2025, International Conference on Learning Representations)

Free, publicly-accessible full text available January 23, 2026
Reinforcement Learning-based Approach for Vehicle-to-Building Charging with Heterogeneous Agents and Long Term Rewards

Liu, Fangqi; Sen, Rishav; Talusan, Jose; Pettet, Ava; Kandel, Aaron; Suzue, Yoshinori; Mukhopadhyay, Ayan; Dubey, Abhishek (December 2024, International Conference on Autonomous Agents and Multi-Agent Systems)

Free, publicly-accessible full text available December 20, 2025
Shrinking POMCP: A Framework for Real-Time UAV Search and Rescue

https://doi.org/10.1109/ICAA64256.2024.00016

Zhang, Yunuo; Luo, Baiting; Mukhopadhyay, Ayan; Stojcsics, Daniel; Elenius, Daniel; Roy, Anirban; Jha, Susmit; Maroti, Miklos; Koutsoukos, Xenofon; Karsai, Gabor; et al. (October 2024, IEEE)

Full Text Available
Act as You Learn: Adaptive Decision-Making in Non-Stationary Markov Decision Processes

Luo, Baiting; Zhang, Yunuo; Dubey, Abhishek; Mukhopadhyay, Ayan (May 2024, 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS))

A fundamental (and largely open) challenge in sequential decision-making is dealing with non-stationary environments, where exogenous environmental conditions change over time. Such problems are traditionally modeled as non-stationary Markov decision processes (NSMDPs). However, existing approaches for decision-making in NSMDPs have two major shortcomings: first, they assume that the updated environmental dynamics at the current time are known (although future dynamics can change); and second, planning is largely pessimistic, i.e., the agent acts "safely" to account for the non-stationary evolution of the environment. We argue that both assumptions are invalid in practice: updated environmental conditions are rarely known, and as the agent interacts with the environment, it can learn about the updated dynamics and avoid being pessimistic, at least in states whose dynamics it is confident about. We present a heuristic search algorithm called Adaptive Monte Carlo Tree Search (ADA-MCTS) that addresses these challenges. We show that the agent can learn the updated dynamics of the environment over time and then act as it learns, i.e., if the agent is in a region of the state space about which it has updated knowledge, it can avoid being pessimistic. To quantify "updated knowledge," we disentangle the aleatoric and epistemic uncertainty in the agent's updated belief and show how the agent can use these estimates for decision-making. We compare the proposed approach with multiple state-of-the-art decision-making approaches across multiple well-established open-source problems and empirically show that our approach is faster and highly adaptive without sacrificing safety.
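The split between aleatoric and epistemic uncertainty mentioned in this abstract can be illustrated with a small sketch. It assumes an ensemble of learned dynamics models, each predicting a mean and a variance, and applies the law of total variance; the abstract does not say how ADA-MCTS computes these quantities, so the function name and inputs below are hypothetical.

```python
from statistics import fmean, pvariance

def decompose_uncertainty(member_means, member_variances):
    """Split an ensemble's predictive variance into two parts.

    member_means[i] and member_variances[i] are the mean and variance
    predicted by ensemble member i (hypothetical inputs, not taken
    from the paper).
    """
    aleatoric = fmean(member_variances)  # average noise each member predicts
    epistemic = pvariance(member_means)  # disagreement between members
    return aleatoric, epistemic

# Members agree on the mean: epistemic uncertainty is zero.
print(decompose_uncertainty([1.0, 1.0, 1.0], [0.2, 0.2, 0.2]))
# Members disagree on the mean: epistemic uncertainty is large.
print(decompose_uncertainty([0.0, 1.0, 2.0], [0.2, 0.2, 0.2]))
```

Under this assumption, an agent could treat states with low epistemic uncertainty as regions it has "updated knowledge" about and plan less pessimistically there.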
Full Text Available
Decision Making in Non-Stationary Environments with Policy-Augmented Search

Pettet, Ava; Zhang, Yunuo; Luo, Baiting; Wray, Kyle; Baier, Hendrik; Laszka, Aron; Dubey, Abhishek; Mukhopadhyay, Ayan (May 2024, International Conference on Autonomous Agents and Multiagent Systems)

Sequential decision-making under uncertainty is present in many important problems. Two popular approaches for tackling such problems are reinforcement learning and online search (e.g., Monte Carlo tree search). While the former learns a policy by interacting with the environment (typically before execution), the latter uses a generative model of the environment to sample promising action trajectories at decision time. Decision-making is particularly challenging in non-stationary environments, where the environment in which an agent operates can change over time. Both approaches have shortcomings in such settings. On the one hand, policies learned before execution become stale when the environment changes, and relearning takes both time and computational effort. Online search, on the other hand, can return sub-optimal actions when the allowed runtime is limited. In this paper, we introduce Policy-Augmented Monte Carlo tree search (PA-MCTS), which combines action-value estimates from an out-of-date policy with an online search using an up-to-date model of the environment. We prove theoretical results showing conditions under which PA-MCTS selects the one-step optimal action and also bound the error accrued while following PA-MCTS as a policy. We compare and contrast our approach with AlphaZero, another hybrid planning approach, and Deep Q Learning on several OpenAI Gym environments. Through extensive experiments, we show that under non-stationary settings with tight time constraints, PA-MCTS outperforms these baselines.
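A minimal sketch of how the two action-value estimates might be combined follows. The abstract does not state the combination rule, so the convex blend and the weight `alpha` below are assumptions made for illustration, and all names are hypothetical.

```python
def select_action(q_policy, q_search, alpha):
    """Pick the action maximizing a weighted blend of two value estimates.

    q_policy: action -> value from the stale learned policy
    q_search: action -> value from online search on the current model
    alpha:    weight on the stale policy, in [0, 1] (hypothetical parameter)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    blended = {
        a: alpha * q_policy[a] + (1.0 - alpha) * q_search[a]
        for a in q_search
    }
    return max(blended, key=blended.get)

q_policy = {"left": 1.0, "right": 0.2}  # learned before the environment changed
q_search = {"left": 0.1, "right": 0.9}  # estimated on the up-to-date model
print(select_action(q_policy, q_search, alpha=0.9))  # leans on the stale policy
print(select_action(q_policy, q_search, alpha=0.1))  # leans on the online search
```

With a weight near 1 the stale policy dominates, and with a weight near 0 the online search dominates; an intermediate value trades off the two sources of error described in the abstract.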
Full Text Available
HPRoP: Hierarchical Privacy-Preserving Route Planning for Smart Cities

https://doi.org/10.1145/3616874

Tiausas, Francis; Yasumoto, Keiichi; Talusan, Jose Paolo; Yamana, Hayato; Yamaguchi, Hirozumi; Bhattacharjee, Shameek; Dubey, Abhishek; Das, Sajal K. (August 2023, ACM Transactions on Cyber-Physical Systems)

Route Planning Systems (RPS) are a core component of autonomous personal transport systems, essential for safe and efficient navigation of dynamic urban environments with the support of edge-based smart city infrastructure, but they also raise concerns about user route privacy for both privately owned and commercial vehicles. Numerous high-profile data breaches in recent years have motivated research on privacy-preserving RPS, but most proposed approaches are rendered impractical by greatly increased communication and processing overhead. We address this by proposing an approach called Hierarchical Privacy-Preserving Route Planning (HPRoP), which divides and distributes the route planning task across multiple levels and protects locations along the entire route. This is done by combining Inertial Flow partitioning, Private Information Retrieval (PIR), and Edge Computing techniques with our novel route planning heuristic algorithm. Normalized metrics were also formulated to quantify the privacy of the source/destination points (endpoint location privacy) and the route itself (route privacy). Evaluation on a simulated road network showed that HPRoP reliably produces routes that differ in length from optimal shortest paths by ≤20%, with completion times within ∼25 seconds, which is reasonable for a PIR-based approach. In addition, more than half of the produced routes achieved near-optimal endpoint location privacy (∼1.0) and good route privacy (≥0.8).
Full Text Available
Scalable Pythagorean Mean based Incident Detection in Smart Transportation Systems

https://doi.org/10.1145/3603381

Islam, Md. Jaminur; Talusan, Jose Paolo; Bhattacharjee, Shameek; Tiausas, Francis; Dubey, Abhishek; Yasumoto, Keiichi; Das, Sajal K. (June 2023, ACM Transactions on Cyber-Physical Systems)

Modern smart cities need smart transportation solutions that quickly detect traffic emergencies and incidents in order to avoid cascading traffic disruptions. To this end, roadside units and ambient transportation sensors are being deployed to collect speed data that enables the monitoring of traffic conditions on each road segment. In this paper, we first propose a scalable, data-driven, anomaly-based traffic incident detection framework for a city-scale smart transportation system. Specifically, we propose an incremental region-growing approximation algorithm for optimal spatio-temporal clustering of road segments and their data, such that road segments are strategically divided into highly correlated clusters. The highly correlated clusters enable identifying a Pythagorean mean-based invariant as an anomaly detection metric that is highly stable when there are no incidents but deviates in the presence of incidents. We learn the bounds of the invariant in a robust manner such that anomaly detection can generalize to unseen events, even when learning from real noisy data. Second, building on cluster-level detection, we propose a folded Gaussian classifier to automatically pinpoint the particular segment in a cluster where the incident happened. We perform extensive experimental validation using mobility data collected from four cities in Tennessee and compare with state-of-the-art ML methods, showing that our method can detect incidents within each cluster in real time and outperforms known ML methods.
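To illustrate why a Pythagorean mean-based quantity can act as a stable invariant, the sketch below uses the ratio of the harmonic mean to the arithmetic mean of segment speeds in one cluster. The abstract does not specify which means are used or how the bounds are learned, so this particular ratio, the bounds, and the speed values are assumptions for illustration only.

```python
from statistics import fmean, harmonic_mean

def mean_ratio(speeds):
    """Ratio of harmonic to arithmetic mean; lies in (0, 1] for positive speeds."""
    return harmonic_mean(speeds) / fmean(speeds)

def is_anomalous(speeds, lower, upper):
    """Flag a cluster snapshot whose ratio falls outside the learned bounds.

    lower and upper stand in for bounds learned from incident-free data
    (hypothetical values here).
    """
    return not (lower <= mean_ratio(speeds) <= upper)

normal = [52.0, 55.0, 50.0, 53.0]    # correlated segments, similar speeds
incident = [52.0, 55.0, 8.0, 53.0]   # one segment slows sharply

print(mean_ratio(normal), is_anomalous(normal, 0.95, 1.0))
print(mean_ratio(incident), is_anomalous(incident, 0.95, 1.0))
```

The ratio stays close to 1 while speeds in the cluster move together and drops when a single segment diverges, which is the kind of behavior the abstract attributes to its invariant.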
Full Text Available
Synchrophasor Data Event Detection using Unsupervised Wavelet Convolutional Autoencoders

https://doi.org/10.1109/SMARTCOMP58114.2023.00080

Buckelew, Jacob; Basumallik, Sagnik; Sivaramakrishnan, Vasavi; Mukhopadhyay, Ayan; Srivastava, Anurag K.; Dubey, Abhishek (June 2023, 2023 IEEE International Conference on Smart Computing (SMARTCOMP))

Timely and accurate detection of events affecting the stability and reliability of power transmission systems is crucial for safe grid operation. This paper presents an efficient unsupervised machine-learning algorithm for event detection using a combination of discrete wavelet transform (DWT) and convolutional autoencoders (CAE) with synchrophasor measurements. These measurements are collected from a hardware-in-the-loop testbed equipped with a digital real-time simulator. Using DWT, the detail coefficients of the measurements are obtained. The decomposed data is then fed into the CAE, which captures the underlying structure of the transformed data. Anomalies are identified when significant errors are detected between input samples and their reconstructed outputs. We demonstrate our approach on the IEEE-14 bus system considering different events such as generator faults, line-to-line faults, line-to-ground faults, load shedding, and line outages simulated on a real-time digital simulator (RTDS). The proposed implementation achieves a classification accuracy of 97.7%, precision of 98.0%, recall of 99.5%, and F1 score of 98.7%, and proves to be efficient in both time and space requirements compared to baseline approaches.
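The two steps that bracket the autoencoder can be sketched in plain Python: extracting detail coefficients and thresholding reconstruction error. The sketch assumes a single-level Haar wavelet and a mean-squared-error threshold, neither of which the abstract specifies; the autoencoder itself is not implemented, and its reconstructions are passed in as ordinary lists.

```python
import math

def haar_detail(signal):
    """Single-level Haar DWT detail coefficients of an even-length signal."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    return [
        (signal[i] - signal[i + 1]) / math.sqrt(2)
        for i in range(0, len(signal), 2)
    ]

def flag_anomalies(inputs, reconstructions, threshold):
    """Flag samples whose mean squared reconstruction error exceeds threshold.

    reconstructions stands in for the autoencoder output (hypothetical here).
    """
    flags = []
    for x, x_hat in zip(inputs, reconstructions):
        mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
        flags.append(mse > threshold)
    return flags

steady = haar_detail([1.0, 1.0, 1.0, 1.0])  # flat signal, no detail energy
step = haar_detail([1.0, 1.0, 1.0, 0.2])    # abrupt drop shows up in the details
print(steady, step)

# Pretend the autoencoder reproduces only the steady pattern it was trained on.
print(flag_anomalies([steady, step], [steady, steady], threshold=0.05))
```

Because the detail coefficients are near zero for smooth signals, an abrupt event produces a pattern the autoencoder has not learned to reconstruct, which raises the reconstruction error.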
Full Text Available
Traffic Anomaly Detection Via Conditional Normalizing Flow

https://doi.org/10.1109/ITSC55140.2022.9922061

Kang, Zhuangwei; Mukhopadhyay, Ayan; Gokhale, Aniruddha; Wen, Shijie; Dubey, Abhishek (October 2022, 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC))

Traffic congestion anomaly detection is of paramount importance in intelligent traffic systems. The goals of transportation agencies are two-fold: to monitor the general traffic conditions in the area of interest and to locate road segments under abnormal congestion states. Modeling congestion patterns can achieve these goals for citywide roadways, which amounts to learning the distribution of multivariate time series (MTS). However, existing works are either not scalable or unable to capture the spatial-temporal information in MTS simultaneously. To this end, we propose a principled and comprehensive framework consisting of a data-driven generative approach that can perform tractable density estimation for detecting traffic anomalies. Our approach first clusters segments in the feature space and then uses conditional normalizing flow to identify anomalous temporal snapshots at the cluster level in an unsupervised setting. Then, we identify anomalies at the segment level by using a kernel density estimator on the anomalous cluster. Extensive experiments on synthetic datasets show that our approach significantly outperforms several state-of-the-art congestion anomaly detection and diagnosis methods in terms of Recall and F1-Score. We also use the generative model to sample labeled data, which can train classifiers in a supervised setting, alleviating the lack of labeled data for anomaly detection in sparse settings.
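The segment-level step can be illustrated with a one-dimensional Gaussian kernel density estimate. The sketch assumes each segment in the anomalous cluster is described by a single speed value and is scored by its leave-one-out density; the bandwidth, the scoring rule, and the data are hypothetical and not taken from the paper.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate as a function."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

def least_likely_segment(speeds, bandwidth=2.0):
    """Index of the segment with the lowest leave-one-out density."""
    scores = []
    for i, x in enumerate(speeds):
        others = speeds[:i] + speeds[i + 1:]
        scores.append(gaussian_kde(others, bandwidth)(x))
    return min(range(len(speeds)), key=scores.__getitem__)

cluster = [51.0, 53.0, 52.0, 20.0, 54.0]  # one segment is far slower
print(least_likely_segment(cluster))
```

Scoring each segment against a density fitted on the remaining segments keeps an outlier from inflating its own density, so the segment that least resembles its cluster receives the lowest score.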
Full Text Available
