NSF PAR Search | NSF Public Access Repository

A deep learned nanowire segmentation model using synthetic data augmentation

https://doi.org/10.1038/s41524-022-00767-x

Lin, Binbin; Emami, Nima; Santos, David A.; Luo, Yuting; Banerjee, Sarbajit; Xu, Bai-Xiang (April 2022, npj Computational Materials)

Automated particle segmentation and feature analysis of experimental image data are indispensable for data-driven materials science. Deep learning-based image segmentation algorithms are promising techniques for achieving this goal but are challenging to use because they require a large number of training images. In the present work, synthetic images that resemble the experimental images in terms of geometrical and visual features are used to train state-of-the-art Mask region-based convolutional neural networks to segment vanadium pentoxide nanowires, a cathode material, within optical density-based images acquired using spectromicroscopy. The results demonstrate the instance segmentation power in real optical intensity-based spectromicroscopy images of complex nanowires in overlapped networks and provide reliable statistical information. The model can further be used to segment nanowires in scanning electron microscopy images, which are fundamentally different from the training dataset known to the model. The proposed methodology can be extended to any optical intensity-based images of variable particle morphology, material class, and beyond.
Chemomechanical damage prediction from phase-field simulation video sequences using a deep-learning-based methodology

https://doi.org/10.1016/j.isci.2024.110822

Zeng, Quan; Rezaei, Shahed; Carrillo, Luis; Davidson, Rachel; Xu, Bai-Xiang; Banerjee, Sarbajit; Ding, Yu (September 2024, iScience)

Full Text Available
Finite Time Logarithmic Regret Bounds for Self-Tuning Regulation

Singh, R; Mete, A; Kar, A; Kumar, P R (July 2024, Proceedings of the 41st International Conference on Machine Learning)

We establish the first finite-time logarithmic regret bounds for the self-tuning regulation problem. We introduce a modified version of the certainty equivalence algorithm, which we call PIECE, that clips inputs in addition to utilizing probing inputs for exploration. We show that it has a C log T upper bound on the regret after T time-steps for bounded noise, and C log³ T in the case of sub-Gaussian noise, unlike the LQ problem, where logarithmic regret has been shown not to be possible. The PIECE algorithm is also designed to address the critical challenge of poor initial transient performance of reinforcement learning algorithms for linear systems. Comparative simulation results illustrate the improved performance of PIECE.
Full Text Available
TAP: The Attention Patch for Cross-Modal Knowledge Transfer from Unlabeled Modality

Wang, Yinsong; Shahrampour, Shahin (June 2024, Transactions on Machine Learning Research)

Full Text Available
Provable Policy Gradient Methods for Average-Reward Markov Potential Games

Cheng, Min; Zhou, Ruida; Kumar, P R; Tian, Chao (May 2024, Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS) 2024)

We study Markov potential games under the infinite horizon average reward criterion. Most previous studies have been for discounted rewards. We prove that both algorithms based on independent policy gradient and independent natural policy gradient converge globally to a Nash equilibrium for the average reward criterion. To set the stage for gradient-based methods, we first establish that the average reward is a smooth function of policies and provide sensitivity bounds for the differential value functions, under certain conditions on ergodicity and the second largest eigenvalue of the underlying Markov decision process (MDP). We prove that three algorithms, policy gradient, proximal-Q, and natural policy gradient (NPG), converge to an ϵ-Nash equilibrium with time complexity O(1/ϵ²), given a gradient/differential Q function oracle. When policy gradients have to be estimated, we propose an algorithm with Õ(1/(min_{s,a} π(a|s) · δ)) sample complexity to achieve δ approximation error with respect to the ℓ₂ norm. Equipped with the estimator, we derive the first sample complexity analysis for a policy gradient ascent algorithm, featuring a sample complexity of Õ(1/ϵ⁵). Simulation studies are presented.
Full Text Available
Linear Convergence of Independent Natural Policy Gradient in Games With Entropy Regularization

https://doi.org/10.1109/LCSYS.2024.3410149

Sun, Youbang; Liu, Tao; Kumar, P R; Shahrampour, Shahin (January 2024, IEEE Control Systems Letters)

Full Text Available
TAKDE: Temporal Adaptive Kernel Density Estimator for Real-Time Dynamic Density Estimation

https://doi.org/10.1109/TPAMI.2023.3297950

Wang, Yinsong; Ding, Yu; Shahrampour, Shahin (November 2023, IEEE Transactions on Pattern Analysis and Machine Intelligence)

Full Text Available
Systolic Array Placement on FPGAs

https://doi.org/10.1109/ICCAD57390.2023.10323742

Hu, Hailiang; Fang, Donghao; Li, Wuxi; Yuan, Bo; Hu, Jiang (October 2023, IEEE)

Full Text Available
Near-Equivalence Between Bounded Regret and Delay Robustness in Interactive Decision Making

Kang, E H; Kumar, P R (October 2023, NeurIPS 2023 Workshop on Adaptive Experimental Design and Active Learning in the Real World)

Interactive decision making, encompassing bandits, contextual bandits, and reinforcement learning, has recently been of interest to theoretical studies of experimentation design and recommender system algorithm research. Recently, it has been shown that the well-known Graves-Lai constant being zero is a necessary and sufficient condition for achieving bounded (or constant) regret in interactive decision making. As this condition may be a strong requirement for many applications, the practical usefulness of pursuing bounded regret has been questioned. In this paper, we show that the condition of the Graves-Lai constant being zero is also necessary to achieve delay model robustness when reward delays are unknown (i.e., when feedback is anonymous). Here, model robustness is measured in terms of ϵ-robustness, one of the most widely used and one of the least adversarial robustness concepts in the robust statistics literature. In particular, we show that ϵ-robustness cannot be achieved for a consistent (i.e., uniformly sub-polynomial regret) algorithm, however small the nonzero ϵ value is, when the Graves-Lai constant is not zero. While this is a strongly negative result, we also provide a positive result for linear reward models (linear contextual bandits, reinforcement learning with linear MDPs): the Graves-Lai constant being zero is also sufficient for achieving bounded regret without any knowledge of delay models, i.e., the best of both the efficiency world and the delay robustness world.
Full Text Available
The Reward Biased Method: An Optimism based Approach for Reinforcement Learning

https://doi.org/10.1109/Allerton58177.2023.10313396

Mete, Akshay; Singh, Rahul; Kumar, P. R. (September 2023, IEEE)

Full Text Available
