NSF PAR Search | NSF Public Access Repository


On the Model-Misspecification in Reinforcement Learning

Li, Yunfan; Yang, Lin (April 2024, International Conference on Artificial Intelligence and Statistics)

Full Text Available
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling

Li, Yunfan; Wang, Yiran; Cheng, Yu; Yang, Lin (July 2023, Proceedings of Machine Learning Research)

Full Text Available
Automated Assessment of Critical View of Safety in Laparoscopic Cholecystectomy

https://doi.org/10.1109/ICHI57859.2023.00051

Li, Yunfan; Gupta, Himanshu; Ling, Haibin; Ramakrishnan, IV; Prasanna, Prateek; Georgakis, Georgios; Sasson, Aaron (June 2023, IEEE)

Full Text Available
Automated Assessment of Critical View of Safety in Laparoscopic Cholecystectomy

Li, Yunfan; Gupta, Himanshu; Ling, Haibin; Ramakrishnan, IV; Georgakis, Georgios; Sasson, Aaron; Prasanna, Prateek (January 2023, Critical View of Safety in Laparoscopic Cholecystectomy)

Full Text Available
UVMBench: A Comprehensive Benchmark Suite for Researching Unified Virtual Memory in GPUs

Gu, Yongbin; Wu, Wenxuan; Li, Yunfan; Chen, Lizhong (July 2021, International Conference on Scientific Computing)

The recent introduction of Unified Virtual Memory (UVM) in GPUs offers a new programming model that allows GPUs and CPUs to share the same virtual memory space. This shifts complex memory management from programmers to the GPU driver/hardware and enables kernel execution even when memory is oversubscribed. However, UVM may also incur considerable performance overhead due to tracking and data migration, along with special handling of page faults and page table walks. As UVM is attracting significant attention from the research community to develop innovative solutions to these problems, in this paper we propose a comprehensive UVM benchmark suite named UVMBench to facilitate future research on this important topic. The proposed UVMBench consists of 32 representative benchmarks from a wide range of application domains. The suite also features a unified programming implementation and diverse memory access patterns across benchmarks, thus allowing thorough evaluation and comparison with the current state of the art. A set of experiments has been conducted on real GPUs to verify and analyze the behavior of the benchmark suite under various scenarios.
Full Text Available
Joint mean–covariance estimation via the horseshoe

https://doi.org/10.1016/j.jmva.2020.104716

Li, Yunfan; Datta, Jyotishka; Craig, Bruce A.; Bhadra, Anindya (May 2021, Journal of Multivariate Analysis)
Full Text Available
EquiNox: Equivalent NoC Injection Routers for Silicon Interposer-Based Throughput Processors

https://doi.org/10.1109/HPCA47549.2020.00043

Li, Yunfan; Chen, Lizhong (February 2020, IEEE International Symposium on High Performance Computer Architecture (HPCA))
Throughput-oriented many-core processors demand a highly efficient network-on-chip (NoC) architecture for data transfer. The recent advent of silicon interposers, stacked memory, and 2.5D integration has further increased data transfer rates. This greatly intensifies the traffic bottleneck in the NoC but, at the same time, also brings a significant new opportunity to utilize wiring resources in the interposer. In this paper, we propose a novel concept called Equivalent Injection Routers (EIRs) which, together with interposer links, transform the few-to-many traffic pattern into a many-to-many pattern, thus fundamentally solving the bottleneck problem. We have developed EquiNox as a design example. We utilize N-Queen and Monte Carlo Tree Search (MCTS) methods to help select EIRs by jointly considering topological, architectural, and physical aspects. Evaluation results show that, compared with prior work, the proposed EquiNox is able to reduce execution time by 23.5%, energy consumption by 18.9%, and energy-delay product (EDP) by 32.8%, under similar hardware cost.
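The abstract above mentions N-Queen methods for selecting routers. As a rough illustration only, the sketch below solves the textbook N-Queens problem by backtracking, picking one cell per row of an n x n grid so that no two cells share a column or a diagonal. Reading the grid as a mesh of routers is an assumption made here for illustration; the function name solve_n_queens is hypothetical, and this is not the selection procedure used in EquiNox, which also involves MCTS and physical constraints not modeled here.

```python
from typing import List, Optional


def solve_n_queens(n: int) -> Optional[List[int]]:
    """Return a list cols where cols[row] is the chosen column in that row.

    No two chosen cells share a column or a diagonal. Returns None when
    no such placement exists (for example n = 2 or n = 3).
    """
    cols: List[int] = []
    used_cols = set()   # columns already taken
    used_diag = set()   # occupied "row - col" diagonals
    used_anti = set()   # occupied "row + col" anti-diagonals

    def place(row: int) -> bool:
        if row == n:
            return True  # every row has a cell
        for col in range(n):
            if col in used_cols or (row - col) in used_diag or (row + col) in used_anti:
                continue  # conflicts with an earlier row
            cols.append(col)
            used_cols.add(col)
            used_diag.add(row - col)
            used_anti.add(row + col)
            if place(row + 1):
                return True
            # Undo the choice and try the next column (backtracking).
            cols.pop()
            used_cols.remove(col)
            used_diag.remove(row - col)
            used_anti.remove(row + col)
        return False

    return cols if place(0) else None


if __name__ == "__main__":
    placement = solve_n_queens(8)
    print(placement)
```

A placement found this way spreads the chosen cells across all rows and columns, which is the property that makes the N-Queens formulation a plausible starting point for spacing out injection points on a grid.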
Full Text Available
Characterizing On-Chip Traffic Patterns in General-Purpose GPUs: A Deep Learning Approach

https://doi.org/10.1109/ICCD46524.2019.00016

Li, Yunfan; Penney, Drew; Ramamurthy, Abhishek; Chen, Lizhong (November 2019, IEEE 37th International Conference on Computer Design (ICCD))
Architectural optimizations in general-purpose graphics processing units (GPGPUs) often exploit workload characteristics to reduce power and latency while improving performance. This paper finds, however, that prevailing assumptions about GPGPU traffic pattern characterization are inaccurate. These assumptions must therefore be re-evaluated, and more appropriate new patterns must be identified. This paper proposes a methodology to classify GPGPU traffic patterns, combining a convolutional neural network (CNN) for feature extraction and a t-distributed stochastic neighbor embedding (t-SNE) algorithm to determine traffic pattern clusters. A traffic pattern dataset is generated from common GPGPU benchmarks, transformed using heat mapping, and iteratively refined to ensure appropriate and highly accurate labels. The proposed classification model achieves 98.8% validation accuracy and 94.24% test accuracy. Furthermore, traffic in 96.6% of examined kernels can be classified into the eight identified traffic pattern categories.
Full Text Available
Express Link Placement for NoC-Based Many-Core Platforms

https://doi.org/10.1145/3337821.3337877

Li, Yunfan; Zhu, Di; Chen, Lizhong (August 2019, International Conference on Parallel Processing)

Full Text Available
