Distributed VLMs: Efficient vision-language processing through cloud-edge collaboration

Li, Y; Gumaste, D; Turkcan, M; Ghaderi, LJ; Zussman, G; Kostic, Z

This content will become publicly available on February 1, 2026

Vision-language models (VLMs) have transformed generative AI by enabling systems to interpret and respond to multi-modal data in real time. While advancements in edge computing have made it possible to deploy smaller Large Language Models (LLMs) on smartphones and laptops, deploying competent VLMs on edge devices remains challenging due to their high computational demands. Furthermore, cloud-only deployments fail to utilize the evolving processing capabilities at the edge and limit responsiveness. This paper introduces a distributed architecture for VLMs that addresses these limitations by partitioning model components between edge devices and central servers. In this setup, the vision components run on edge devices for immediate processing, while the language generation of the VLM is handled by a centralized server, resulting in up to 33% improvement in throughput over traditional cloud-only solutions. Moreover, our approach enhances the computational efficiency of off-the-shelf VLMs without the need for model compression techniques. This work demonstrates the scalability and efficiency of a hybrid architecture for VLM deployment and contributes to the discussion on how distributed approaches can improve VLM performance.

Index Terms: vision-language models (VLMs), edge computing, distributed computing, inference optimization, edge-cloud collaboration.
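The partitioning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `VisionFeatures`, `edge_encode`, `server_generate`, and `run_pipeline` are hypothetical, and both model stages are replaced by toy functions so the example runs without any ML library. It only shows the shape of the split, in which the edge device turns each frame into a compact set of visual tokens and the server produces text from those tokens.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VisionFeatures:
    """Payload sent from the edge device to the server (hypothetical format)."""
    frame_id: int
    tokens: List[List[float]]  # one small feature vector per image patch


def edge_encode(frame_id: int, image: List[List[float]], patch: int = 2) -> VisionFeatures:
    """Toy stand-in for the vision component running on the edge device.

    Splits the image into patch x patch tiles and summarizes each tile by its
    mean and max. A real deployment would run the VLM's vision encoder here.
    """
    tokens = []
    for r in range(0, len(image), patch):
        for c in range(0, len(image[0]), patch):
            tile = [image[i][j]
                    for i in range(r, min(r + patch, len(image)))
                    for j in range(c, min(c + patch, len(image[0])))]
            tokens.append([sum(tile) / len(tile), max(tile)])
    return VisionFeatures(frame_id, tokens)


def server_generate(features: VisionFeatures, prompt: str) -> str:
    """Toy stand-in for language generation on the central server."""
    brightness = sum(t[0] for t in features.tokens) / len(features.tokens)
    label = "bright" if brightness > 0.5 else "dark"
    return (f"{prompt} frame {features.frame_id} looks {label} "
            f"({len(features.tokens)} visual tokens)")


def run_pipeline(frames: List[List[List[float]]], prompt: str) -> List[str]:
    """Encode each frame at the edge, then generate text on the server."""
    return [server_generate(edge_encode(i, frame), prompt)
            for i, frame in enumerate(frames)]


if __name__ == "__main__":
    dark = [[0.0] * 4 for _ in range(4)]
    bright = [[1.0] * 4 for _ in range(4)]
    for line in run_pipeline([dark, bright], "Describe:"):
        print(line)
```

In this sketch the two stages are called in the same process; in a distributed setting the `VisionFeatures` payload would cross the network instead of the raw frame, and the serialization and transport are left out here as an assumption-free gap.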

Award ID(s): 2038984

PAR ID: 10639785

Author(s) / Creator(s): Li, Y; Gumaste, D; Turkcan, M; Ghaderi, LJ; Zussman, G; Kostic, Z

Publisher / Repository: Proc. 4th IEEE Workshop on Pervasive and Resource-constrained Artificial Intelligence (PeRConAI), 2025

Date Published: 2025-02-01

Format(s): Medium: X

Location: Washington, DC

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Proceeding:
The DOI is not currently available.
