TOMFuN: A tensorized optical multimodal fusion network

Xiao, Xian; Zhao, Yequan; Yuan, Yuan; Kurczveil, Geza; Fiorentino, Marco; Beausoleil, Ray; Zhang, Zheng

doi:10.1063/5.0255883


This content will become publicly available on March 1, 2026


This paper proposes a real-size, single-shot, high-speed, and energy-efficient tensorized optical multimodal fusion network (TOMFuN) on an electro-photonic large-scale III–V-on-Si in-memory compute engine. The TOMFuN architecture uses a memory-efficient, low-complexity self-attention mechanism in the embedding network for text information, and tensor-train and CANDECOMP/PARAFAC decompositions to compress the model parameters of the large-scale fully connected layers. Compared to full-size counterparts, the proposed network maintains comparable inference accuracy in multimodal sentiment analysis tasks while requiring 92.8× fewer model parameters and 51.3× fewer hardware resources. Furthermore, the impact of photonic device imperfections on the TOMFuN architecture is investigated. The simulation results show that noise-aware on-chip training exhibits superior robustness. Finally, chip performance analysis shows that the TOMFuN inference accelerator achieves a computational speed of 230.73 PetaOps, a power efficiency of 6.51 TOPS/W, and a latency of 2.7 µs for an input dimension of 1024.
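To illustrate how a tensor-train (TT) decomposition can shrink a fully connected layer, the following is a minimal NumPy sketch, not the authors' implementation. The mode sizes, TT ranks, core layout `(rank_in, out_mode, in_mode, rank_out)`, and all function names are assumptions chosen for illustration; they are not taken from the paper and do not reproduce its reported compression ratios. The CANDECOMP/PARAFAC case is not shown.

```python
import numpy as np


def tt_param_count(in_modes, out_modes, ranks):
    """Number of parameters stored by the TT cores (ranks has len(in_modes) + 1 entries)."""
    return sum(
        ranks[k] * out_modes[k] * in_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


def make_tt_cores(in_modes, out_modes, ranks, rng):
    """Random TT cores; core k has shape (r_k, out_k, in_k, r_{k+1}). Illustrative only."""
    return [
        rng.standard_normal((ranks[k], out_modes[k], in_modes[k], ranks[k + 1]))
        for k in range(len(in_modes))
    ]


def tt_to_dense(cores):
    """Contract all cores into the equivalent dense matrix of shape (prod(out), prod(in))."""
    full = cores[0]
    for core in cores[1:]:
        # full: (1, M, N, r), core: (r, m, n, s) -> (1, M, m, N, n, s)
        full = np.einsum("aMNr,rmns->aMmNns", full, core)
        a, M, m, N, n, s = full.shape
        full = full.reshape(a, M * m, N * n, s)
    return full[0, :, :, 0]


def tt_matvec(cores, x):
    """Compute W @ x core by core, without ever forming the dense matrix W."""
    # state axes: (output modes processed so far, current TT rank, input modes remaining)
    state = x.reshape(1, 1, -1)
    for core in cores:
        r, m, n, s = core.shape
        p = state.shape[0]
        state = state.reshape(p, r, n, -1)
        # Sum over the incoming rank and the current input mode.
        state = np.einsum("prnq,rmns->pmsq", state, core)
        state = state.reshape(p * m, s, -1)
    return state.reshape(-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical factorization of a 64 -> 16 layer into three modes.
    in_modes, out_modes, ranks = (4, 4, 4), (2, 4, 2), (1, 3, 3, 1)
    cores = make_tt_cores(in_modes, out_modes, ranks, rng)
    dense_params = int(np.prod(in_modes)) * int(np.prod(out_modes))
    print("dense parameters:", dense_params)
    print("TT parameters:   ", tt_param_count(in_modes, out_modes, ranks))
    x = rng.standard_normal(int(np.prod(in_modes)))
    print("matvec matches dense:", np.allclose(tt_to_dense(cores) @ x, tt_matvec(cores, x)))
```

In this sketch the layer stores only the small cores, and the matrix-vector product is evaluated one core at a time, which is the general reason a TT-factorized layer needs fewer stored weights than its dense counterpart. How such cores map onto photonic hardware is outside the scope of this example.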

Award ID(s): 2311295; 2419889

PAR ID: 10612166

Author(s) / Creator(s): Xiao, Xian; Zhao, Yequan; Yuan, Yuan; Kurczveil, Geza; Fiorentino, Marco; Beausoleil, Ray; Zhang, Zheng

Publisher / Repository: AIP Publishing

Date Published: 2025-03-01

Journal Name: APL Machine Learning

Volume: 3

Issue: 1

ISSN: 2770-9019

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article:
https://doi.org/10.1063/5.0255883
