TMModel: Modeling Texture Memory and Mobile GPU Performance to Accelerate DNN Computations

Guan, J; Hu, Z; Antonopoulus, C; Bellas, N; Lalis, S; Smirni, E; Zhou, G; Agrawal, G; Ren, B

Citation Details

This content will become publicly available on June 9, 2026

The demand for Deep Neural Network (DNN) execution (including both inference and training) on mobile systems-on-a-chip (SoCs) has surged, driven by factors like the need for real-time latency, privacy, and reducing vendors’ costs. Mainstream mobile GPUs (e.g., Qualcomm Adreno GPUs) usually have a 2.5D L1 texture cache that offers throughput superior to that of on-chip memory. However, to date, there is limited understanding of the performance features of such a 2.5D cache, which limits the optimization potential. This paper introduces TMModel, a framework with three components: 1) a set of micro-benchmarks and a novel performance assessment methodology to characterize a poorly documented architecture with 2D memory, 2) a complete analytical performance model configurable for different data access pattern(s), tiling size(s), and other GPU execution parameters for a given operator (and associated size and shape), and 3) a compilation framework incorporating this model and generating optimized code with low overhead. TMModel is validated both on a set of DNN kernels and for training complete models on mobile GPUs.

Award ID(s): 2333895

PAR ID: 10613664

Author(s) / Creator(s): Guan, J; Hu, Z; Antonopoulus, C; Bellas, N; Lalis, S; Smirni, E; Zhou, G; Agrawal, G; Ren, B

Publisher / Repository: ACM - Proceedings of ICS 2025

Date Published: 2025-06-09

ISSN: 9798-4007

ISBN: 979-8-4007-1537-2

Format(s): Medium: X

Location: Salt Lake City

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
The DOI is not currently available.
