This content will become publicly available on May 14, 2026
DNAGAST: Generative Adversarial Set Transformers for High-throughput Sequencing
High-throughput sequencing (HTS) is a modern DNA sequencing technology used to rapidly read thousands of genomic fragments from the microorganisms in a given sample. The large amount of data produced by this process makes deep learning, whose performance often scales with dataset size, a suitable fit for processing HTS samples. While deep learning models have utilized sets of DNA sequences to make informed predictions, to our knowledge there are no models in the current literature capable of generating synthetic HTS samples, a tool that could enable experimenters to predict HTS samples given some environmental parameters. Furthermore, the unordered nature of HTS samples poses a challenge to nearly all deep learning architectures because they have an inherent dependence on input order. To address this gap in the literature, we introduce the DNA Generative Adversarial Set Transformer (DNAGAST), the first model capable of generating synthetic HTS samples. We qualitatively and quantitatively demonstrate DNAGAST's ability to produce realistic synthetic samples and explore various methods to mitigate mode collapse. Additionally, we propose novel quantitative diversity metrics to measure the effects of mode collapse on unstructured set-based data.
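To make the set-generation idea concrete, the following is a minimal, hypothetical sketch of a GAN over sets: the generator maps noise to a set of element embeddings (stand-ins for encoded DNA sequences), and the discriminator scores the whole set through a permutation-invariant pooling, so its judgment does not depend on element order. The module names, dimensions, and the simple mean pooling are illustrative assumptions, not the published DNAGAST architecture, which builds on Set Transformer blocks.

```python
# Illustrative sketch of a GAN over sets (not the published DNAGAST model).
import torch
import torch.nn as nn

class SetGenerator(nn.Module):
    def __init__(self, noise_dim: int = 64, elem_dim: int = 32, hidden: int = 128):
        super().__init__()
        # Per-element MLP: each noise vector becomes one element of the set.
        self.net = nn.Sequential(
            nn.Linear(noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, elem_dim),
        )

    def forward(self, noise: torch.Tensor) -> torch.Tensor:
        # noise: (batch, set_size, noise_dim) -> generated set (batch, set_size, elem_dim)
        return self.net(noise)

class SetDiscriminator(nn.Module):
    def __init__(self, elem_dim: int = 32, hidden: int = 128):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(elem_dim, hidden), nn.ReLU())
        self.score = nn.Linear(hidden, 1)

    def forward(self, elements: torch.Tensor) -> torch.Tensor:
        # elements: (batch, set_size, elem_dim); mean pooling over the set axis
        # makes the real/fake score invariant to element order.
        pooled = self.encode(elements).mean(dim=1)
        return self.score(pooled)

gen, disc = SetGenerator(), SetDiscriminator()
fake_set = gen(torch.randn(4, 100, 64))  # batch of 4 synthetic sets, 100 elements each
logits = disc(fake_set)                  # (4, 1) real/fake logits
```

Mean pooling is the simplest way to obtain order invariance; an attention-based pooling, as in the Set Transformer, plays the same role with more capacity.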
- Award ID(s): 1933925
- PAR ID: 10623700
- Publisher / Repository: University of Florida Press
- Date Published:
- Journal Name: The International FLAIRS Conference Proceedings
- Volume: 38
- ISSN: 2334-0754
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Current deep-learning techniques for processing sets are limited to a fixed cardinality, causing a steep increase in computational complexity when the set is large. To address this, we have taken techniques used to model long-term dependencies from natural language processing and combined them with the permutation-equivariant architecture Set Transformer (STr). The result is Set Transformer XL (STrXL), a novel deep learning model capable of extending to sets of arbitrary cardinality given fixed computing resources. STrXL's extension capability lies in its recurrent architecture: rather than processing the entire set at once, STrXL processes only a portion of the set at a time and uses a memory mechanism to provide additional input from the past. STrXL is particularly applicable to processing sets of high-throughput sequencing (HTS) samples of DNA sequences, as their set sizes can range into the hundreds of thousands. When tasked with classifying HTS prairie soil samples and MNIST digits, results show that STrXL exhibits an expected memory size-accuracy trade-off that scales proportionally with the complexity of downstream tasks but, unlike STr, is capable of generalizing to sets of arbitrary cardinality. (A minimal, illustrative sketch of this chunk-by-chunk processing with a memory appears after this list.)
- Motivation: High-throughput sequencing (HTS) is a modern sequencing technology used to profile microbiomes by sequencing thousands of short genomic fragments from the microorganisms within a given sample. This technology presents a unique opportunity for artificial intelligence to comprehend the underlying functional relationships of microbial communities. However, due to the unstructured nature of HTS data, nearly all computational models are limited to processing DNA sequences individually. This limitation causes them to miss key interactions between microorganisms, significantly hindering our understanding of how these interactions influence microbial communities as a whole. Furthermore, most computational methods rely on post-processing of samples, which could inadvertently introduce protocol-specific bias. Results: Addressing these concerns, we present SetBERT, a robust pre-training methodology for creating generalized deep learning models that process HTS data to produce contextualized embeddings and can be fine-tuned for downstream tasks with explainable predictions. By leveraging sequence interactions, we show that SetBERT significantly outperforms other models in taxonomic classification, with a genus-level classification accuracy of 95%. Furthermore, we demonstrate that SetBERT is able to accurately explain its predictions autonomously by confirming the biological relevance of the taxa identified by the model. Availability and implementation: All source code is available at https://github.com/DLii-Research/setbert. SetBERT may be used through the q2-deepdna QIIME 2 plugin, whose source code is available at https://github.com/DLii-Research/q2-deepdna.
- Generative adversarial networks (GANs) are a technique for learning generative models of complex data distributions from samples. Despite remarkable advances in generating realistic images, a major shortcoming of GANs is that they tend to produce samples with little diversity, even when trained on diverse datasets. This phenomenon, known as mode collapse, has been the focus of much recent work. We study a principled approach to handling mode collapse, which we call packing. The main idea is to modify the discriminator to make decisions based on multiple samples from the same class, either real or artificially generated. We draw analysis tools from binary hypothesis testing (in particular the seminal result of Blackwell [4]) to prove a fundamental connection between packing and mode collapse. We show that packing naturally penalizes generators with mode collapse, thereby favoring generator distributions with less mode collapse during the training process. Numerical experiments on benchmark datasets suggest that packing provides significant improvements. (A simplified sketch of such a packed discriminator appears after this list.)
- Generative adversarial networks (GANs) are innovative techniques for learning generative models of complex data distributions from samples. Despite remarkable recent improvements in generating realistic images, one of their major shortcomings is that, in practice, they tend to produce samples with little diversity, even when trained on diverse datasets. This phenomenon, known as mode collapse, has been the main focus of several recent advances in GANs. Yet there is little understanding of why mode collapse happens and why recently proposed approaches are able to mitigate it. We propose a principled approach to handling mode collapse, which we call packing. The main idea is to modify the discriminator to make decisions based on multiple samples from the same class, either real or artificially generated. We borrow analysis tools from binary hypothesis testing (in particular the seminal result of Blackwell [6]) to prove a fundamental connection between packing and mode collapse. We show that packing naturally penalizes generators with mode collapse, thereby favoring generator distributions with less mode collapse during the training process. Numerical experiments on benchmark datasets suggest that packing provides significant improvements in practice as well.
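As referenced in the Set Transformer XL abstract above, the chunk-by-chunk processing with a memory can be illustrated with a minimal sketch. Everything below (class name, memory slot count, chunk size) is an assumption for illustration, not the published STrXL architecture: the set is encoded one chunk at a time while a small, fixed-size memory of attention slots summarizes earlier chunks, so the cost per step stays bounded regardless of set cardinality.

```python
# Illustrative sketch of chunked set encoding with a fixed-size memory
# (not the published STrXL architecture).
import torch
import torch.nn as nn

class ChunkedSetEncoder(nn.Module):
    def __init__(self, elem_dim: int = 32, mem_slots: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(elem_dim, num_heads=4, batch_first=True)
        self.memory_init = nn.Parameter(torch.randn(1, mem_slots, elem_dim))

    def forward(self, elements: torch.Tensor, chunk_size: int = 64) -> torch.Tensor:
        # elements: (batch, set_size, elem_dim); set_size may be arbitrarily large.
        batch = elements.shape[0]
        memory = self.memory_init.expand(batch, -1, -1)
        for start in range(0, elements.shape[1], chunk_size):
            chunk = elements[:, start:start + chunk_size]
            # The memory slots attend over [past memory, current chunk] and are
            # overwritten with the result; detaching the old memory mimics a
            # truncated recurrence so compute per step stays constant.
            context = torch.cat([memory.detach(), chunk], dim=1)
            memory, _ = self.attn(memory, context, context)
        return memory  # fixed-size summary of the whole set

encoder = ChunkedSetEncoder()
big_set = torch.randn(2, 1000, 32)  # two sets with 1000 elements each
summary = encoder(big_set)          # (2, 4, 32) regardless of set size
```

Because the memory has a fixed number of slots, an encoder of this kind can, in principle, consume sets far larger than what fits in a single forward pass.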
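The packing idea described in the two GAN abstracts above admits an equally small sketch: the discriminator scores several samples drawn jointly from the same source (all real or all generated), which makes a generator that keeps producing near-identical outputs easy to detect. The class name, pack size, and layer widths below are assumptions for illustration only.

```python
# Illustrative sketch of a "packed" GAN discriminator: it judges a pack of
# samples at once instead of a single sample.
import torch
import torch.nn as nn

class PackedDiscriminator(nn.Module):
    def __init__(self, sample_dim: int, pack_size: int = 3, hidden: int = 128):
        super().__init__()
        self.pack_size = pack_size
        # The packed samples are concatenated along the feature axis.
        self.net = nn.Sequential(
            nn.Linear(sample_dim * pack_size, hidden),
            nn.LeakyReLU(0.2),
            nn.Linear(hidden, hidden),
            nn.LeakyReLU(0.2),
            nn.Linear(hidden, 1),  # one real/fake logit for the whole pack
        )

    def forward(self, samples: torch.Tensor) -> torch.Tensor:
        # samples: (batch, pack_size, sample_dim), all real or all generated
        batch = samples.shape[0]
        return self.net(samples.reshape(batch, -1))

disc = PackedDiscriminator(sample_dim=64, pack_size=3)
fake_pack = torch.randn(8, 3, 64)  # stand-in for three independent generator draws
logits = disc(fake_pack)           # (8, 1) logits, one per pack
```

During training, each discriminator input is a pack of independent draws rather than a single sample; the rest of the GAN objective is unchanged.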