eCOMPASS: evaluative comparison of multiple protein alignments by statistical score

Neuwald, Andrew F; Kolaczkowski, Bryan D; Altschul, Stephen F

doi:10.1093/bioinformatics/btab374

Abstract

Motivation: Detecting subtle biologically relevant patterns in protein sequences often requires the construction of a large and accurate multiple sequence alignment (MSA). Methods for constructing MSAs are usually evaluated using benchmark alignments, which, however, typically contain very few sequences and are therefore inappropriate when dealing with large numbers of proteins.

Results: eCOMPASS addresses this problem using a statistical measure of relative alignment quality based on direct coupling analysis (DCA): to maintain protein structural integrity over evolutionary time, substitutions at one residue position typically result in compensating substitutions at other positions. eCOMPASS computes the statistical significance of the congruence between high-scoring directly coupled pairs and 3D contacts in corresponding structures, which depends upon properly aligned homologous residues. We illustrate eCOMPASS using both simulated and real MSAs.

Availability and Implementation: The eCOMPASS executable, C++ open source code and input data sets are available at https://www.igs.umaryland.edu/labs/neuwald/software/compass.

Supplementary information: Supplementary data are available at Bioinformatics online.
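
As a rough illustration of what "statistical significance of the congruence" between directly coupled pairs and 3D contacts can mean, the sketch below ranks column pairs by a coupling score, counts how many of the top-ranked pairs are structural contacts, and computes a hypergeometric tail probability for that overlap. This is not the statistic eCOMPASS uses; the abstract does not specify it. The function names, the dictionary input format, and the choice of a hypergeometric test are all assumptions made for illustration only.

```python
from math import comb


def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return favourable / total


def contact_congruence(dca_scores, contacts, top_n):
    """Overlap between the top_n highest-scoring pairs and known 3D contacts.

    dca_scores: dict mapping a column pair (i, j) to a coupling score
                (hypothetical input format, not eCOMPASS's file format).
    contacts:   set of column pairs (i, j) that are in contact in a structure.
    top_n:      number of top-ranked pairs to examine; must not exceed
                len(dca_scores).
    Returns (hits, p_value), where p_value is the chance of seeing at least
    `hits` contacts among top_n pairs drawn at random without replacement.
    """
    ranked = sorted(dca_scores, key=dca_scores.get, reverse=True)[:top_n]
    hits = sum(1 for pair in ranked if pair in contacts)
    n_pairs = len(dca_scores)
    n_contacts = sum(1 for pair in dca_scores if pair in contacts)
    return hits, hypergeom_tail(n_pairs, n_contacts, top_n, hits)


# Toy data: six column pairs, two of which are structural contacts.
toy_scores = {
    (1, 5): 0.9, (2, 7): 0.8, (1, 2): 0.3,
    (3, 4): 0.2, (2, 6): 0.1, (4, 7): 0.05,
}
toy_contacts = {(1, 5), (2, 7)}
hits, p_value = contact_congruence(toy_scores, toy_contacts, top_n=2)
print(hits, p_value)
```

Under this toy setup, a better alignment would be expected to place more true contacts among the top-ranked pairs and so yield a smaller tail probability, which is the intuition behind comparing alignments by such a score.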

Award ID(s): 1817942

PAR ID: 10238805

Author(s) / Creator(s): Neuwald, Andrew F; Kolaczkowski, Bryan D; Altschul, Stephen F

Editor(s): Ponty, Yann

Date Published: 2021-05-13

Journal Name: Bioinformatics

ISSN: 1367-4803

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.1093/bioinformatics/btab374
