Search for: All records
Total Resources: 3
- Author / Contributor
  - Aggarwal, Vaneet (2)
  - Tamboli, Dipesh (2)
  - Yu, Denny (2)
  - Anton, Nicholas E. (1)
  - Chen, Jiayu (1)
  - Hornbeck, Tera (1)
  - Jotheeswaran, Kiran Pranesh (1)
  - Kothandaraman, Harish (1)
  - Lanman, Nadia A (1)
  - Malusare, Aditya (1)
  - Nagle, Amy M. (1)
  - Norman, Susan (1)
  - Shroff, Anand D. (1)
  - Zhou, Guoyang (1)
- Editor
  - Lengauer, Thomas (1)
- Lengauer, Thomas (Ed.)
  Abstract
  Summary: This article presents the Ensemble Nucleotide Byte-level Encoder-Decoder (ENBED) foundation model, which analyzes DNA sequences at byte-level precision with an encoder–decoder Transformer architecture. ENBED uses a subquadratic implementation of attention to develop an efficient model capable of sequence-to-sequence transformations, generalizing previous genomic models with encoder-only or decoder-only architectures. We use Masked Language Modeling to pretrain the foundation model using reference genome sequences and apply it in the following downstream tasks: (i) identification of enhancers, promoters, and splice sites; (ii) recognition of sequences containing base call mismatches and insertion/deletion errors, an advantage over tokenization schemes involving multiple base pairs, which lose the ability to analyze with byte-level precision; (iii) identification of biological function annotations of genomic sequences; and (iv) generating mutations of the Influenza virus using the encoder–decoder architecture and validating them against real-world observations. In each of these tasks, we demonstrate significant improvement over the existing state-of-the-art results.
  Availability and implementation: The source code used to develop and fine-tune the foundation model has been released on GitHub (https://github.itap.purdue.edu/Clan-labs/ENBED).
- Tamboli, Dipesh; Chen, Jiayu; Jotheeswaran, Kiran Pranesh; Yu, Denny; Aggarwal, Vaneet (IEEE Journal of Biomedical and Health Informatics)
- Anton, Nicholas E.; Zhou, Guoyang; Hornbeck, Tera; Nagle, Amy M.; Norman, Susan; Shroff, Anand D.; Yu, Denny (Applied Ergonomics)
