NSF PAR Search | NSF Public Access Repository


Farsighter: Efficient Multi-Step Exploration for Deep Reinforcement Learning

https://doi.org/10.5220/0011800600003393

Liu, Yongshuai; Liu, Xin (January 2023, SCITEPRESS - Science and Technology Publications)
CLARA: A Constrained Reinforcement Learning Based Resource Allocation Framework for Network Slicing

https://doi.org/10.1109/BigData52589.2021.9671840

Liu, Yongshuai; Ding, Jiaxin; Zhang, Zhi-Li; Liu, Xin (December 2021, IEEE International Conference on Big Data)

Full Text Available
Policy Learning with Constraints in Model-free Reinforcement Learning: A Survey

https://doi.org/10.24963/ijcai.2021/614

Liu, Yongshuai; Halev, Avishai; Liu, Xin (August 2021, The 30th International Joint Conference on Artificial Intelligence (IJCAI))

Full Text Available
Resource Allocation Method for Network Slicing Using Constrained Reinforcement Learning

https://doi.org/10.23919/IFIPNetworking52078.2021.9472202

Liu, Yongshuai; Ding, Jiaxin; Liu, Xin (June 2021, IFIP Networking Conference (IFIP Networking))
Full Text Available
CTS2: Time Series Smoothing with Constrained Reinforcement Learning

Liu, Yongshuai; Liu, Xin (January 2021, Asian Conference on Machine Learning)

Full Text Available
CTS2: Time Series Smoothing with Constrained Reinforcement Learning

Liu, Yongshuai; Liu, Xin (January 2021, The 13th Asian Conference on Machine Learning (ACML))

Full Text Available
IPO: Interior-point Policy Optimization under Constraints

https://doi.org/10.1609/aaai.v34i04.5932

Liu, Yongshuai; Ding, Jiaxin; Liu, Xin (January 2020, Proceedings of the AAAI Conference on Artificial Intelligence)

In this paper, we study reinforcement learning (RL) algorithms for real-world decision problems in which the objective is to maximize the long-term reward while satisfying cumulative constraints. We propose a novel first-order policy optimization method, Interior-point Policy Optimization (IPO), which augments the objective with logarithmic barrier functions, inspired by the interior-point method. Our proposed method is easy to implement, comes with performance guarantees, and can handle general types of cumulative multi-constraint settings. We conduct extensive evaluations to compare our approach with state-of-the-art baselines. Our algorithm outperforms the baseline algorithms in terms of both reward maximization and constraint satisfaction.
Full Text Available
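The IPO abstract above describes augmenting the objective with logarithmic barrier functions. As a rough illustration only, the sketch below shows a generic log-barrier augmentation of a scalar objective. The function names, the sharpness parameter `t`, and the form `log(limit - cost) / t` are assumptions chosen for illustration and are not taken from the listing or confirmed as the authors' exact formulation.

```python
import math


def log_barrier(constraint_value, limit, t=20.0):
    """Generic log barrier: log(limit - constraint_value) / t.

    Returns -inf when the constraint is violated or exactly tight.
    `t` is a hypothetical sharpness parameter (larger t = weaker penalty
    inside the feasible region).
    """
    slack = limit - constraint_value
    if slack <= 0.0:
        return float("-inf")
    return math.log(slack) / t


def barrier_augmented_objective(reward_objective, constraint_values, limits, t=20.0):
    """Add one barrier term per cumulative constraint to the reward objective."""
    penalty = sum(log_barrier(c, d, t) for c, d in zip(constraint_values, limits))
    return reward_objective + penalty


if __name__ == "__main__":
    # A policy far from its cost limit is barely penalized...
    print(barrier_augmented_objective(1.0, [0.1], [1.0]))
    # ...while one close to the limit is penalized more heavily.
    print(barrier_augmented_objective(1.0, [0.99], [1.0]))
```

In this sketch the barrier term tends to negative infinity as a constraint value approaches its limit, so a maximizer of the augmented objective is pushed to stay strictly inside the feasible region.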
Less is More: Culling the Training Set to Improve Robustness of Deep Neural Networks

https://doi.org/10.1007/978-3-030-01554-1_6

Liu, Yongshuai; Chen, Jiyu; Chen, Hao (September 2018, International Conference on Decision and Game Theory for Security)

Deep neural networks are vulnerable to adversarial examples. Prior defenses attempted to make deep networks more robust by either changing the network architecture or augmenting the training set with adversarial examples, but both approaches have inherent limitations. Motivated by recent research showing that outliers in the training set have a high negative influence on the trained model, we studied the relationship between model robustness and the quality of the training set. We first show that outliers give the model better generalization ability but weaker robustness. Next, we propose an adversarial example detection framework, in which we design two methods for removing outliers from the training set to obtain a sanitized model, and then detect adversarial examples by calculating the difference between the outputs of the original and the sanitized models. We evaluated the framework on both MNIST and SVHN. Based on the difference measured by Kullback-Leibler divergence, we could detect adversarial examples with accuracy between 94.67% and 99.89%.
Full Text Available
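The "Less is More" abstract above describes detecting adversarial examples from the difference, measured by Kullback-Leibler divergence, between the outputs of the original and the sanitized models. A minimal sketch of that comparison follows. The function names, the direction of the divergence, the smoothing constant `eps`, and the `threshold` value are all assumptions for illustration; the abstract does not specify them.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete probability vectors of equal length.

    `eps` is a hypothetical smoothing constant to avoid log(0).
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def is_adversarial(original_probs, sanitized_probs, threshold=0.5):
    """Flag an input when the two models' output distributions disagree strongly.

    `threshold` is an illustrative value, not one reported in the abstract.
    """
    return kl_divergence(original_probs, sanitized_probs) > threshold


if __name__ == "__main__":
    # The two models agree: not flagged.
    print(is_adversarial([0.9, 0.1], [0.88, 0.12]))
    # The two models disagree sharply: flagged.
    print(is_adversarial([0.9, 0.1], [0.1, 0.9]))
```

In practice the probability vectors would be the softmax outputs of the two trained classifiers on the same input, and the threshold would be tuned on held-out data.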
