RandAR: Decoder-only Autoregressive Visual Generation in Random Orders

Pang, Z; Zhang, T; Luan, F; Man, Y; Tan, H; Zhang, K; Freeman, W T; Wang, Y

Citation Details

This content will become publicly available on June 11, 2026

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders

We introduce RandAR, a decoder-only visual autoregressive (AR) model capable of generatng images in arbitrary token orders. Unlike previous decoder-only AR models that rely on a predefined generation order, RandAR removes this inductive bias, unlocking new capabilities in decoder-only generation. Our essential design enabling random order is to insert a "position instruction token" before each image token to be predicted, representing the spatial location of the next image token. Trained on randomly permuted token sequences -- a more challenging task than fixed-order generation, RandAR achieves comparable performance to conventional raster-order counterpart. More importantly, decoder-only transformers trained from random orders acquire new capabilities. For the efficiency bottleneck of AR models, RandAR adopts parallel decoding with KV-Cache at inference time, enjoying 2.5x acceleration without sacrificing generation quality. Additionally, RandAR supports in-painting, outpainting and resolution extrapolation in a zero-shot manner.We hope RandAR inspires new directions for decoder-only visual generation models and broadens their applications across diverse scenarios. Our project page is at https://rand-ar.github.io/. more »

Award ID(s):: 1955864

PAR ID:: 10646691

Author(s) / Creator(s):: Pang, Z; Zhang, T; Luan, F; Man, Y; Tan, H; Zhang, K; Freeman, W T; Wang, Y

Publisher / Repository:: IEEE / CVF CVPR 2025

Date Published:: 2025-06-11

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
This content will become publicly available on June 11, 2026
Conference Paper:
The DOI is not currently available.

More Like this