Correlations between alignment gaps and nucleotide substitution or amino acid replacement

Seo, Tae-Kun; Redelings, Benjamin D.; Thorne, Jeffrey L.

doi:10.1073/pnas.2204435119

To assess the conventional treatment in evolutionary inference of alignment gaps as missing data, we propose a simple nonparametric test of the null hypothesis that the locations of alignment gaps are independent of the nucleotide substitution or amino acid replacement process. When we apply the test to 1,390 protein alignments that are informed by protein tertiary structure and use a 5% significance level, the null hypothesis of independence between amino acid replacement and gap location is rejected for ∼65% of datasets. Via simulations that include substitution and insertion–deletion, we show that the test performs well with true alignments. When we simulate according to the null hypothesis and then apply the test to optimal alignments that are inferred by each of four widely used software packages, the null hypothesis is rejected too frequently. Via further simulations and analyses, we show that the overly frequent rejections of the null hypothesis are not solely due to weaknesses of widely used software for finding optimal alignments. Instead, our evidence suggests that optimal alignments are unrepresentative of true alignments and that biased evolutionary inferences may result from relying upon individual optimal alignments.

Award ID(s): 1754142

PAR ID: 10446633

Author(s) / Creator(s): Seo, Tae-Kun; Redelings, Benjamin D.; Thorne, Jeffrey L.

Date Published: 2022-08-23

Journal Name: Proceedings of the National Academy of Sciences

Volume: 119

Issue: 34

ISSN: 0027-8424

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1073/pnas.2204435119
