Title: A Parallel Gumbel-Softmax VAE Framework with Performance-Based Tuning
Traditional training algorithms for Gumbel-Softmax Variational Autoencoders (GS-VAEs) typically rely on an annealing scheme that gradually reduces the softmax temperature τ according to a fixed schedule, which can lead to suboptimal results. To improve performance, we propose a parallel framework for GS-VAEs, which embraces dual latent layers and multiple sub-models with diverse temperature strategies. Instead of relying on a fixed function for adjusting τ, our training algorithm uses the loss difference as performance feedback to dynamically update each sub-model's temperature τ, inspired by the need to balance exploration and exploitation in learning. By combining diversity in temperature strategies with performance-based tuning, our design helps prevent sub-models from becoming trapped in local optima and finds the GS-VAE model that best fits the given dataset. In experiments on four classic image datasets, our model significantly surpasses a standard GS-VAE that employs a temperature annealing scheme across multiple tasks, including data reconstruction, generalization, anomaly detection, and adversarial robustness. Our implementation is publicly available at https://github.com/wxzg7045/Gumbel-Softmax-VAE-2024/tree/main.
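The abstract does not specify the feedback rule beyond "uses loss difference as performance feedback," so the following is a minimal sketch, not the authors' implementation: standard Gumbel-Softmax sampling plus one plausible loss-difference update for τ (cool on improvement to exploit, heat on regression to explore). The function names, step size, and temperature bounds are illustrative assumptions.

```python
import numpy as np

def gumbel_softmax_sample(logits, tau, rng=None):
    """Draw a relaxed categorical sample: perturb the logits with
    Gumbel noise, then apply a temperature-scaled softmax."""
    if rng is None:
        rng = np.random.default_rng()
    u = rng.uniform(1e-20, 1.0, size=logits.shape)
    perturbed = (logits - np.log(-np.log(u))) / tau
    perturbed -= perturbed.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(perturbed)
    return probs / probs.sum(axis=-1, keepdims=True)

def update_temperature(tau, prev_loss, curr_loss,
                       step=0.05, tau_min=0.1, tau_max=5.0):
    """Hypothetical loss-difference feedback rule: if the loss improved,
    lower tau (sharper, more exploitative samples); otherwise raise it
    (smoother, more exploratory samples)."""
    if curr_loss < prev_loss:
        return max(tau_min, tau - step)
    return min(tau_max, tau + step)
```

In the paper's parallel framework, each sub-model would presumably maintain its own τ and apply its own strategy for updates of this kind; the clipping bounds here simply keep the relaxation from collapsing to a degenerate argmax or a uniform distribution.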
Award ID(s):
2245853
PAR ID:
10581841
Author(s) / Creator(s):
Publisher / Repository:
IOS Press
Date Published:
ISBN:
978-1-64368-548-9
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Chaudhuri, Kamalika (Ed.)
    While deep generative models have succeeded in image processing, natural language processing, and reinforcement learning, training that involves discrete random variables remains challenging due to the high variance of the gradient estimation process. Monte Carlo sampling is a common solution used in most variance-reduction approaches, but it involves time-consuming resampling and multiple function evaluations. We propose a Gapped Straight-Through (GST) estimator to reduce the variance without incurring resampling overhead. This estimator is inspired by the essential properties of Straight-Through Gumbel-Softmax; we identify these properties and show via an ablation study that they are essential. Experiments demonstrate that the proposed GST estimator outperforms strong baselines on two discrete deep generative modeling tasks, MNIST-VAE and ListOps. 
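    For reference, the Straight-Through Gumbel-Softmax estimator that GST refines can be sketched as below. This is a forward-pass-only NumPy illustration under stated assumptions (function and parameter names are mine, not from the paper); in an autograd framework, the gradient is routed through the soft sample via `hard - stop_grad(soft) + soft`.

```python
import numpy as np

def st_gumbel_softmax(logits, tau, rng=None):
    """Straight-Through Gumbel-Softmax, forward pass only.
    Returns (hard, soft): the one-hot sample used in the forward
    computation, and the soft relaxation whose gradient the
    straight-through trick substitutes during backpropagation."""
    if rng is None:
        rng = np.random.default_rng(0)
    u = rng.uniform(1e-20, 1.0, size=logits.shape)
    perturbed = (logits - np.log(-np.log(u))) / tau   # Gumbel noise, scaled by tau
    perturbed -= perturbed.max(axis=-1, keepdims=True)  # numerical stability
    soft = np.exp(perturbed)
    soft /= soft.sum(axis=-1, keepdims=True)
    hard = np.eye(logits.shape[-1])[soft.argmax(axis=-1)]  # discretize
    return hard, soft
```

    The variance problem GST targets arises because the soft surrogate gradient is a biased, noisy stand-in for the true discrete gradient; GST closes part of that gap without the repeated resampling that Monte Carlo variance reduction requires.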
  2. Generative models, such as Variational Autoencoders (VAEs), are increasingly employed for atypical pattern detection in brain imaging. During training, these models learn to capture the underlying patterns within "normal" brain images and generate new samples from those patterns. Neurodivergent states can be observed by measuring the dissimilarity between the generated/reconstructed images and the input images. This paper leverages VAEs to conduct Functional Connectivity (FC) analysis from functional Magnetic Resonance Imaging (fMRI) scans of individuals with Autism Spectrum Disorder (ASD), aiming to uncover atypical interconnectivity between brain regions. In the first part of our study, we compare multiple VAE architectures—Conditional VAE, Recurrent VAE, and a hybrid of a CNN in parallel with an RNN VAE—aiming to establish the effectiveness of VAEs in application to FC analysis. Given the nature of the disorder, ASD exhibits a higher prevalence among males than females. Therefore, in the second part of this paper, we investigate whether introducing phenotypic data could improve the performance of VAEs and, consequently, FC analysis. We compare our results with the findings from previous studies in the literature. The results showed that the CNN-based VAE architecture is more effective for this application than the other models. 
  3. The underwater acoustic (UWA) channel is a complex and stochastic process with large spatial and temporal dynamics. This work studies the adaptation of the communication strategy to the channel dynamics. Specifically, a set of communication strategies is considered, including frequency shift keying (FSK), single-carrier communication, and multicarrier communication. Based on the channel condition, a reinforcement learning (RL) algorithm, the Deep Deterministic Policy Gradient (DDPG) method along with a Gumbel-Softmax scheme, is employed for intelligent and adaptive switching among those communication strategies. The adaptive switching is performed on a transmission block-by-block basis, with the goal of maximizing long-term system performance. The reward function is defined based on the energy efficiency and the spectral efficiency of the communication strategies. Simulation results reveal that the proposed method outperforms a random selection method in time-varying channels. 
  4.
    Disentangled generative models map a latent code vector to a target space, while enforcing that a subset of the learned latent codes are interpretable and associated with distinct properties of the target distribution. Recent advances have been dominated by Variational AutoEncoder (VAE)-based methods, while training disentangled generative adversarial networks (GANs) remains challenging. In this work, we show that the dominant challenges facing disentangled GANs can be mitigated through the use of self-supervision. We make two main contributions. First, we design a novel approach for training disentangled GANs with self-supervision: we propose a contrastive regularizer, inspired by a natural notion of disentanglement, latent traversal. This achieves higher disentanglement scores than state-of-the-art VAE- and GAN-based approaches. Second, we propose an unsupervised model selection scheme called ModelCentrality, which uses generated synthetic samples to compute the medoid (a multi-dimensional generalization of the median) of a collection of models. The current common practice of hyper-parameter tuning requires ground-truth samples, each labelled with known, perfectly disentangled latent codes. As real datasets are not equipped with such labels, we propose an unsupervised model selection scheme and show that it finds a model close to the best one, for both VAEs and GANs. Combining contrastive regularization with ModelCentrality, we improve upon the state-of-the-art disentanglement scores significantly, without accessing the supervised data. 
  5.
    Within the context of event modeling and understanding, we propose a new method for neural sequence modeling that takes partially-observed sequences of discrete, external knowledge into account. We construct a sequential neural variational autoencoder, which uses Gumbel-Softmax reparametrization within a carefully defined encoder, to allow for successful backpropagation during training. The core idea is to allow semi-supervised external discrete knowledge to guide, but not restrict, the variational latent parameters during training. Our experiments indicate that our approach not only outperforms multiple baselines and the state-of-the-art in narrative script induction, but also converges more quickly. 