Search for: All records

Creators/Authors contains: "Chen, Yudong"


  1. We study the infinite-horizon restless bandit problem with the average-reward criterion, in both discrete-time and continuous-time settings. A fundamental goal is to design computationally efficient policies whose optimality gap diminishes as the number of arms, N, grows large. Existing results on asymptotic optimality all rely on the uniform global attractor property (UGAP), a complex and difficult-to-verify assumption. In this paper, we propose a general, simulation-based framework, Follow-the-Virtual-Advice, that converts any single-armed policy into a policy for the original N-armed problem. It does so by simulating the single-armed policy on each arm and carefully steering the real state toward the simulated state. Our framework can be instantiated to produce a policy with an O(1/√N) optimality gap. In the discrete-time setting, our result holds under a simpler synchronization assumption, which covers some problem instances that violate UGAP. More notably, in the continuous-time setting, we do not require any additional assumptions beyond the standard unichain condition. In both settings, ours is the first asymptotic-optimality result that does not require UGAP. (A toy simulation sketch of this steering idea appears after this list.)
  2. Free, publicly-accessible full text available September 30, 2024
  3. Free, publicly-accessible full text available June 19, 2024
  4. The practicality of reinforcement learning algorithms has been limited by poor scaling with the problem size: the sample complexity of learning an ε-optimal policy is Ω(|S||A|H/ε²) over worst-case instances of an MDP with state space S, action space A, and horizon H. We consider a class of MDPs for which the associated optimal Q* function is low rank, where the latent features are unknown. While one would hope to achieve linear sample complexity in |S| and |A| due to the low-rank structure, we show that without further assumptions beyond low rank of Q*, if one is constrained to estimate the Q function using only observations from a subset of entries, there is a worst-case instance in which one must incur a sample complexity exponential in the horizon H to learn a near-optimal policy. We subsequently show that under stronger low-rank structural assumptions, given access to a generative model, Low Rank Monte Carlo Policy Iteration (LR-MCPI) and Low Rank Empirical Value Iteration (LR-EVI) achieve the desired sample complexity of Õ((|S|+|A|)·poly(d,H)/ε²) for a rank-d setting, which is minimax optimal with respect to the scaling of |S|, |A|, and ε. In contrast to the literature on linear and low-rank MDPs, we do not require a known feature mapping, our algorithm is computationally simple, and our results hold for long time horizons. Our results provide insight into the minimal low-rank structural assumptions required on the MDP, with respect to the transition kernel versus the optimal action-value function. (A generic low-rank matrix-estimation sketch appears after this list.)
    Free, publicly-accessible full text available May 19, 2024
  5. We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games with simultaneous moves. To incorporate function approximation, we consider a family of Markov games where the reward function and transition kernel possess a linear structure. Both the offline and online settings of the problem are considered. In the offline setting, we control both players and aim to find the Nash equilibrium by minimizing the duality gap. In the online setting, we control a single player playing against an arbitrary opponent and aim to minimize the regret. For both settings, we propose an optimistic variant of the least-squares minimax value iteration algorithm. We show that our algorithm is computationally efficient and provably achieves an [Formula: see text] upper bound on the duality gap and regret, where d is the linear dimension, H the horizon, and T the total number of timesteps. Our results do not require additional assumptions on the sampling model. Our setting requires overcoming several new challenges that are absent in Markov decision processes or turn-based Markov games. In particular, to achieve optimism with simultaneous moves, we construct both upper and lower confidence bounds of the value function, and then compute the optimistic policy by solving a general-sum matrix game with these bounds as the payoff matrices. As finding the Nash equilibrium of a general-sum game is computationally hard, our algorithm instead solves for a coarse correlated equilibrium (CCE), which can be obtained efficiently. To the best of our knowledge, such a CCE-based scheme for optimism has not appeared in the literature and might be of interest in its own right. (An LP sketch for computing a CCE of a bimatrix game appears after this list.)
  6. We develop a convex-optimization clustering algorithm for heterogeneous financial networks, in the presence of arbitrary or even adversarial outliers. In the stochastic block model with heterogeneity parameters, we penalize nodes whose degrees exhibit unusual behavior beyond inlier heterogeneity. We prove that under mild conditions, this method achieves exact recovery of the underlying clusters. In the absence of any assumptions on the outliers, they are shown not to hinder the clustering of the inliers. We test the performance of the algorithm on semi-synthetic heterogeneous networks reconstructed to match aggregate data on the Korean financial sector. Our method allows for recovery of sub-sectors with significantly lower error rates compared to existing algorithms. For overlapping portfolio networks, we uncover a clustering structure supporting diversification effects in investment management. (A generic SDP-relaxation sketch for block-model clustering appears after this list.)
  7. We consider the problem of estimating discrete clustering structures under the sub-Gaussian mixture model. Our main results establish a hidden integrality property of a semidefinite programming (SDP) relaxation for this problem: while the optimal solution to the SDP is not integer-valued in general, its estimation error can be upper bounded by that of an idealized integer program. The error of the integer program, and hence that of the SDP, is further shown to decay exponentially in the signal-to-noise ratio. In addition, we show that the SDP relaxation is robust under the semirandom setting, in which an adversary can modify the data generated from the mixture model. In particular, we generalize the hidden integrality property to the semirandom model and thereby show that the SDP achieves the optimal error bound in this setting. These results together highlight the “global-to-local” mechanism that drives the performance of the SDP relaxation. To the best of our knowledge, our result is the first exponentially decaying error bound for convex relaxations of mixture models. A corollary of our results shows that in certain regimes, the SDP solutions are in fact integral and exact. More generally, our results establish sufficient conditions for the SDP to correctly recover the cluster memberships of [Formula: see text] fraction of the points for any [Formula: see text]. As a special case, we show that under the [Formula: see text]-dimensional stochastic ball model, SDP achieves nontrivial (sometimes exact) recovery when the center separation is as small as [Formula: see text], which improves upon previous exact recovery results that require constant separation. (An SDP-relaxation sketch for k-means-type clustering appears after this list.)
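
Sketch for item 1 (restless bandits). The toy Python sketch below illustrates the general simulation-based idea described in the abstract: each arm carries a virtual copy driven by a fixed single-armed policy, and real arms are activated (under a budget of roughly αN arms per step) in a way that favors arms whose real state already tracks the virtual one. The two-state chains, reward vector, single-armed policy, and matching-based priority rule are all hypothetical illustrations, not the paper's Follow-the-Virtual-Advice construction or its O(1/√N) guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state arm: transition matrices under "activate" / "idle",
# and a reward earned when an arm is activated in a given state.
P_act = np.array([[0.3, 0.7], [0.6, 0.4]])
P_idle = np.array([[0.9, 0.1], [0.8, 0.2]])
reward = np.array([0.0, 1.0])

def single_arm_policy(s):
    # A fixed single-armed policy (assumed, for illustration): activate in state 1.
    return int(s == 1)

def step(states, actions):
    # Sample next states for all arms, each according to its own action.
    nxt = states.copy()
    for i, (s, a) in enumerate(zip(states, actions)):
        P = P_act if a else P_idle
        nxt[i] = rng.choice(2, p=P[s])
    return nxt

def follow_virtual_advice_sketch(N=100, alpha=0.4, T=2000):
    budget = int(alpha * N)                # at most alpha*N arms activated per step
    real = rng.integers(0, 2, size=N)      # real system states
    virtual = real.copy()                  # simulated (virtual) states
    total = 0.0
    for _ in range(T):
        advice = np.array([single_arm_policy(s) for s in virtual])
        # Crude steering rule: prefer arms the virtual policy wants to activate,
        # breaking ties toward arms whose real state matches the virtual one.
        score = advice + 0.5 * (real == virtual)
        actions = np.zeros(N, dtype=int)
        actions[np.argsort(-score)[:budget]] = 1
        total += reward[real[actions == 1]].sum()
        real = step(real, actions)
        virtual = step(virtual, advice)    # virtual arms follow the single-armed policy
    return total / T

print("average reward per step:", follow_virtual_advice_sketch())
```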
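Sketch for item 4 (low-rank Q*). The core estimation step in that abstract is recovering a low-rank Q-matrix from samples of only a subset of its entries. The sketch below uses a generic estimator (inverse-propensity fill-in followed by a rank-d truncated SVD) on a synthetic rank-2 matrix; it is not the LR-MCPI or LR-EVI procedure from the paper, and the matrix sizes, noise level, and sampling rate are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def low_rank_q_estimate(q_samples, mask, d):
    # Generic rank-d estimate of a |S| x |A| Q-matrix from a subset of noisy
    # entries: fill unobserved entries with zero, rescale by the observation
    # rate, then keep the top-d singular directions.
    p = mask.mean()
    filled = np.where(mask, q_samples, 0.0) / p
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    return (U[:, :d] * s[:d]) @ Vt[:d]

# Hypothetical rank-2 ground truth Q* over 50 states and 20 actions.
S, A, d = 50, 20, 2
Q_star = rng.normal(size=(S, d)) @ rng.normal(size=(d, A))
mask = rng.random((S, A)) < 0.3                    # observe ~30% of entries
noisy = Q_star + 0.1 * rng.normal(size=(S, A))     # noisy generative-model samples
Q_hat = low_rank_q_estimate(noisy, mask, d)
print("relative error:", np.linalg.norm(Q_hat - Q_star) / np.linalg.norm(Q_star))
```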
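Sketch for item 5 (CCE of a general-sum matrix game). The abstract's optimistic step solves a general-sum matrix game for a coarse correlated equilibrium rather than a Nash equilibrium. One standard way to obtain a CCE is a small linear program over joint action distributions, shown below with scipy; the 2x2 payoff matrices are hypothetical, and in the paper's scheme the payoff matrices would be the upper and lower confidence bounds of the value function.

```python
import numpy as np
from scipy.optimize import linprog

def coarse_correlated_eq(U1, U2):
    # Find a CCE of the bimatrix game (U1, U2) by LP feasibility:
    # no player gains by deviating to any fixed action, given the joint play.
    m, n = U1.shape
    dim = m * n
    A_ub, b_ub = [], []
    for a_dev in range(m):   # player 1 deviations to row a_dev
        coeff = np.tile(U1[a_dev], (m, 1)) - U1          # U1[a',b] - U1[a,b]
        A_ub.append(coeff.ravel())
        b_ub.append(0.0)
    for b_dev in range(n):   # player 2 deviations to column b_dev
        coeff = np.tile(U2[:, b_dev].reshape(-1, 1), (1, n)) - U2
        A_ub.append(coeff.ravel())
        b_ub.append(0.0)
    res = linprog(c=np.zeros(dim), A_ub=np.array(A_ub), b_ub=b_ub,
                  A_eq=np.ones((1, dim)), b_eq=[1.0], bounds=(0, 1))
    return res.x.reshape(m, n)

# Example: a small 2x2 game with hypothetical payoffs.
U1 = np.array([[1.0, 0.0], [0.0, 2.0]])
U2 = np.array([[2.0, 0.0], [0.0, 1.0]])
print(coarse_correlated_eq(U1, U2))
```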
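Sketch for item 6 (convex clustering of networks). Below is a generic semidefinite relaxation for two-cluster recovery under a planted block model, written with cvxpy: maximize ⟨A − λJ, X⟩ over PSD matrices with unit diagonal and entries in [0, 1], then round via the top eigenvector. The paper's formulation additionally handles degree heterogeneity and penalizes outlier nodes, which this sketch does not attempt; the network size, edge probabilities, and the choice λ = (p_in + p_out)/2 are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)

def sdp_cluster(adj, lam):
    # Maximize <adj - lam*J, X> subject to X PSD, diag(X) = 1, 0 <= X <= 1.
    n = adj.shape[0]
    X = cp.Variable((n, n), PSD=True)
    objective = cp.Maximize(cp.sum(cp.multiply(adj - lam, X)))
    constraints = [cp.diag(X) == 1, X >= 0, X <= 1]
    cp.Problem(objective, constraints).solve()
    # Round: the sign pattern of the top eigenvector splits the nodes in two.
    _, vecs = np.linalg.eigh(X.value)
    return (vecs[:, -1] >= 0).astype(int)

# Small planted two-block network (hypothetical parameters).
n, p_in, p_out = 40, 0.7, 0.2
truth = np.repeat([0, 1], n // 2)
probs = np.where(truth[:, None] == truth[None, :], p_in, p_out)
upper = np.triu((rng.random((n, n)) < probs).astype(float), 1)
adj = upper + upper.T
labels = sdp_cluster(adj, lam=(p_in + p_out) / 2)
print("agreement:", max(np.mean(labels == truth), np.mean(labels != truth)))
```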
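Sketch for item 7 (SDP relaxation for mixture clustering). The sketch below sets up a standard k-means-type SDP relaxation (maximize ⟨YYᵀ, X⟩ over PSD matrices with nonnegative entries, unit row sums, and trace k) on a synthetic two-component Gaussian mixture, followed by a crude thresholding round. The exact relaxation, rounding scheme, and separation conditions analyzed in the paper differ; the data dimensions, separation, and the 1/(2n) threshold here are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(3)

def kmeans_sdp(Y, k):
    # Standard k-means SDP relaxation: maximize <Y Y^T, X> subject to
    # X PSD, X >= 0 entrywise, each row of X summing to 1, and trace(X) = k.
    n = Y.shape[0]
    G = Y @ Y.T
    X = cp.Variable((n, n), PSD=True)
    constraints = [X >= 0, cp.sum(X, axis=1) == 1, cp.trace(X) == k]
    cp.Problem(cp.Maximize(cp.sum(cp.multiply(G, X))), constraints).solve()
    return X.value

# Two spherical Gaussian clusters in 5 dimensions (hypothetical instance).
n, dim, sep = 60, 5, 4.0
truth = np.repeat([0, 1], n // 2)
centers = np.vstack([np.zeros(dim), sep * np.ones(dim) / np.sqrt(dim)])
Y = centers[truth] + rng.normal(size=(n, dim))
X_hat = kmeans_sdp(Y, k=2)
# Crude rounding: points strongly linked to point 0 form one cluster.
labels = (X_hat[0] < 1.0 / (2 * n)).astype(int)
print("agreement:", max(np.mean(labels == truth), np.mean(labels != truth)))
```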