skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Yuan, Yuan"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available December 1, 2026
  2. The emerging Ray-tracing cores on GPUs have been repurposed for non-ray-tracing tasks by researchers recently. In this paper, we explore the benefits and effectiveness of executing graph algorithms on RT cores. We re-design breadth-first search and triangle counting on the new hardware as graph algorithm representatives. Our implementations focus on how to convert the graph operations to bounding volume hierarchy construction and ray generation, which are computational paradigms specific to ray tracing. We evaluate our RT-based methods on a wide range of real-world datasets. The results do not show the advantage of the RT-based methods over CUDA-based methods. We extend the experiments to the set intersection workload on synthesized datasets, and the RT-based method shows superior performance when the skew ratio is high. By carefully comparing the RT-based and CUDA-based binary search, we discover that RT cores are more efficient at searching for elements, but this comes with a constant and non-trivial overhead of the execution pipeline. Furthermore, the overhead of BVH construction is substantially higher than sorting on CUDA cores for large datasets. Our case studies unveil several rules of adapting graph algorithms to ray-tracing cores that might benefit future evolution of the emerging hardware towards general-computing tasks. 
    more » « less
    Free, publicly-accessible full text available May 27, 2026
  3. Free, publicly-accessible full text available March 1, 2026
  4. This paper proposes a real-size, single-shot, high-speed, and energy-efficient tensorized optical multimodal fusion network (TOMFuN) on an electro-photonic large-scale III–V-on-Si in-memory compute engine. The TOMFuN architecture leverages a memory-efficient and low-complexity self-attention for the embedding network for the text information and tensor-train and CANDECOMP/PARAFAC decompositions for compressing the model parameters in the large-scale fully connected layers. Compared to full-size counterparts, our proposed network maintains a compatible inference accuracy in multimodal sentiment analysis tasks while requiring 92.8× fewer model parameters and 51.3× fewer hardware resources. Furthermore, the impact of photonic device imperfections on the TOMFuN architecture is investigated. The simulation results show that noise-aware on-chip training exhibits superior robustness. Finally, chip performance analysis shows that our TOMFuN inference accelerator has 230.73 PetaOps computational speed, 6.51 TOPS/W power efficiency, and 2.7 µs latency with the input dimensions of 1024. 
    more » « less
    Free, publicly-accessible full text available March 1, 2026
  5. Predicting extreme events in chaotic systems, characterized by rare but intensely fluctuating properties, is of great importance due to their impact on the performance and reliability of a wide range of systems. Some examples include weather forecasting, traffic management, power grid operations, and financial market analysis, to name a few. Methods of increasing sophistication have been developed to forecast events in these systems. However, the boundaries that define the maximum accuracy of forecasting tools are still largely unexplored from a theoretical standpoint. Here, we address the question: What is the minimum possible error in the prediction of extreme events in complex, chaotic systems? We derive the minimum probability of error in extreme event forecasting along with its information-theoretic lower and upper bounds. These bounds are universal for a given problem, in that they hold regardless of the modeling approach for extreme event prediction: from traditional linear regressions to sophisticated neural network models. The limits in predictability are obtained from the cost-sensitive Fano’s and Hellman’s inequalities using the Rényi entropy. The results are also connected to Takens’ embedding theorem using the information can’t hurt inequality. Finally, the probability of error for a forecasting model is decomposed into three sources: uncertainty in the initial conditions, hidden variables, and suboptimal modeling assumptions. The latter allows us to assess whether prediction models are operating near their maximum theoretical performance or if further improvements are possible. The bounds are applied to the prediction of extreme events in the Rössler system and the Kolmogorov flow. 
    more » « less
    Free, publicly-accessible full text available November 1, 2025