Title: Sample Complexity of Branch-length Estimation by Maximum Likelihood
We consider the branch-length estimation problem on a bifurcating tree: a character evolves along the edges of a binary tree according to a two-state symmetric Markov process, and we seek to recover the edge transition probabilities from repeated observations at the leaves. This problem arises in phylogenetics and is related to latent tree graphical model inference. In general, the log-likelihood function is non-concave and may admit many critical points. Nevertheless, simple coordinate maximization has been known to perform well in practice, defying the complexity of the likelihood landscape. In this work, we provide the first theoretical guarantee as to why this might be the case. We show that, deep inside the Kesten-Stigum reconstruction regime and given polynomially many samples m (assuming the tree is balanced), there exists a universal parameter regime (independent of the size of the tree) in which the log-likelihood function is strongly concave and smooth with high probability. On this high-probability likelihood-landscape event, we show that the standard coordinate maximization algorithm, given a sufficiently close initial point, converges exponentially fast to the maximum likelihood estimator, which is within O(1/sqrt(m)) of the true parameter.
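Below is a minimal sketch of the setup and of plain coordinate maximization, assuming a small balanced 4-leaf tree under the two-state symmetric (CFN) model; the tree shape, sample size, initial point, and use of a bounded scalar optimizer are illustrative assumptions rather than the paper's exact procedure.

```python
# Minimal sketch: coordinate maximization of the log-likelihood of edge flip
# probabilities under the two-state symmetric model on an assumed 4-leaf tree.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)

# Balanced 4-leaf tree: root -> u -> {leaf0, leaf1}, root -> v -> {leaf2, leaf3}.
# Edges 0..5: (root,u), (root,v), (u,leaf0), (u,leaf1), (v,leaf2), (v,leaf3).
TRUE_P = np.array([0.10, 0.15, 0.05, 0.20, 0.08, 0.12])    # true edge flip probabilities

def transition(p):
    """2x2 symmetric channel: a binary state flips with probability p along the edge."""
    return np.array([[1 - p, p], [p, 1 - p]])

def simulate(p, m):
    """Draw m i.i.d. leaf patterns by propagating states down the tree."""
    root = rng.integers(0, 2, size=m)
    flip = lambda s, q: s ^ (rng.random(m) < q)
    u, v = flip(root, p[0]), flip(root, p[1])
    return np.stack([flip(u, p[2]), flip(u, p[3]), flip(v, p[4]), flip(v, p[5])], axis=1)

def log_likelihood(p, leaves):
    """Felsenstein pruning: marginalize the hidden states of u, v, and the root."""
    Tu, Tv = transition(p[0]), transition(p[1])
    T0, T1, T2, T3 = (transition(q) for q in p[2:])
    cond_u = T0[:, leaves[:, 0]].T * T1[:, leaves[:, 1]].T   # P(leaf0, leaf1 | state of u)
    cond_v = T2[:, leaves[:, 2]].T * T3[:, leaves[:, 3]].T   # P(leaf2, leaf3 | state of v)
    at_root = (cond_u @ Tu.T) * (cond_v @ Tv.T)              # P(all leaves | root state)
    return np.log(0.5 * at_root.sum(axis=1)).sum()           # uniform root distribution

def coordinate_maximization(leaves, n_sweeps=30):
    """Cycle through the edges, maximizing the log-likelihood one coordinate at a time."""
    p = np.full(6, 0.15)                                     # assumed nearby initial point
    for _ in range(n_sweeps):
        for e in range(6):
            def neg_ll(q, e=e):
                trial = p.copy(); trial[e] = q
                return -log_likelihood(trial, leaves)
            p[e] = minimize_scalar(neg_ll, bounds=(1e-4, 0.5 - 1e-4), method="bounded").x
    return p

m = 20_000
leaves = simulate(TRUE_P, m)
p_hat = coordinate_maximization(leaves)
print("true    :", TRUE_P)
print("estimate:", np.round(p_hat, 3))   # expect entries within O(1/sqrt(m)) of the truth
```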
Award ID(s):
2023239
PAR ID:
10625993
Author(s) / Creator(s):
; ;
Publisher / Repository:
International Conference on Machine Learning
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Berry, Jonathan; Shmoys, David; Cowen, Lenore; Naumann, Uwe (Ed.)
    Continuous DR-submodular functions are a class of functions that satisfy the Diminishing Returns (DR) property, which implies that they are concave along non-negative directions. Existing works have studied monotone continuous DR-submodular maximization subject to a convex constraint and have proposed efficient algorithms with approximation guarantees. However, in many applications, e.g., computing the stability number of a graph and mean-field inference for probabilistic log-submodular models, the DR-submodular function has the additional property of being strongly concave along non-negative directions, which could be utilized to obtain faster convergence rates. In this paper, we first introduce and characterize the class of strongly DR-submodular functions and show how such a property implies strong concavity along non-negative directions. Then, we study L-smooth monotone strongly DR-submodular functions that have bounded curvature, and we show how to exploit such additional structure to obtain algorithms with improved approximation guarantees and faster convergence rates for the maximization problem. In particular, we propose the SDRFW algorithm, which matches the provably optimal approximation ratio after a number of iterations determined by L and μ, where c ∈ [0,1] and μ ≥ 0 are the curvature and the strong DR-submodularity parameter. Furthermore, we study the Projected Gradient Ascent (PGA) method for this problem and provide a refined analysis of the algorithm with an improved approximation ratio (compared to ½ in prior works) and a linear convergence rate. Given that both algorithms require knowledge of the smoothness parameter L, we provide a novel characterization of L for DR-submodular functions, showing that in many cases computing L can be formulated as a convex optimization problem, i.e., a geometric program, that can be solved efficiently. Experimental results illustrate and validate the efficiency and effectiveness of our algorithms.
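    As a concrete illustration of the PGA approach discussed above, here is a minimal sketch of projected gradient ascent for a monotone DR-submodular objective, using the multilinear extension of a weighted coverage function over the constraint set {x in [0,1]^n : sum(x) <= k}; the instance, step size, and budget are assumptions, and this is not the SDRFW algorithm or the paper's refined analysis.

```python
# Minimal sketch: projected gradient ascent for a monotone DR-submodular function
# (multilinear extension of a weighted coverage function) under a sum constraint.
import numpy as np

rng = np.random.default_rng(1)
n_sets, n_elems, k = 10, 30, 3.0
covers = rng.random((n_sets, n_elems)) < 0.2         # covers[j, u]: set j covers element u
w = rng.random(n_elems)                               # element weights

def f(x):
    # Element u stays uncovered only if every set containing it is "absent",
    # which happens with probability prod_j (1 - x_j) over the sets covering u.
    miss = np.prod(np.where(covers, 1.0 - x[:, None], 1.0), axis=0)
    return float(w @ (1.0 - miss))

def grad_f(x):
    g = np.zeros(n_sets)
    for j in range(n_sets):
        others = np.prod(np.where(np.delete(covers, j, axis=0),
                                  1.0 - np.delete(x, j)[:, None], 1.0), axis=0)
        g[j] = w @ (covers[j] * others)
    return g

def project(y, k, tol=1e-8):
    """Euclidean projection onto {x in [0,1]^n : sum(x) <= k} by bisection on the
    Lagrange multiplier of the sum constraint."""
    x = np.clip(y, 0.0, 1.0)
    if x.sum() <= k:
        return x
    lo, hi = 0.0, y.max()
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)
        if np.clip(y - lam, 0.0, 1.0).sum() > k:
            lo = lam
        else:
            hi = lam
    return np.clip(y - hi, 0.0, 1.0)

x = np.zeros(n_sets)
eta = 0.1                                             # step size (assumed)
for _ in range(200):
    x = project(x + eta * grad_f(x), k)
print("objective:", round(f(x), 4), "  budget used:", round(x.sum(), 3))
```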
  2. Krause, Andreas (Ed.)
    In this paper we study the fundamental problems of maximizing a continuous non-monotone submodular function over the hypercube, both with and without coordinate-wise concavity. This family of optimization problems has several applications in machine learning, economics, and communication systems. Our main result is the first 1/2-approximation algorithm for continuous submodular function maximization; this approximation factor of 1/2 is the best possible for algorithms that only query the objective function at polynomially many points. For the special case of DR-submodular maximization, i.e., when the submodular function is also coordinate-wise concave along all coordinates, we provide a different 1/2-approximation algorithm that runs in quasi-linear time. Both these results improve upon prior work (Bian et al., 2017a,b; Soma and Yoshida, 2017). Our first algorithm uses novel ideas such as reducing the guaranteed approximation problem to analyzing a zero-sum game for each coordinate, and incorporates the geometry of this zero-sum game to fix the value at that coordinate. Our second algorithm exploits coordinate-wise concavity to identify a monotone equilibrium condition sufficient for obtaining the required approximation guarantee, and hunts for the equilibrium point using binary search. We further run experiments to verify the performance of our proposed algorithms in related machine learning applications.
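    To illustrate how coordinate-wise concavity enables binary search on each coordinate, here is a minimal sketch of coordinate-wise maximization for a non-monotone DR-submodular quadratic over the hypercube; it is a simplified illustration under assumed constants, not the paper's 1/2-approximation algorithms, and it only finds a coordinate-wise stationary point.

```python
# Minimal sketch: coordinate-wise maximization over [0,1]^n for a DR-submodular
# quadratic (Hessian with non-positive entries).  Along each coordinate the function
# is concave, so the partial derivative is non-increasing and its zero crossing can
# be located by binary search.
import numpy as np

rng = np.random.default_rng(2)
n = 8
H = -rng.random((n, n)); H = (H + H.T) / 2            # non-positive entries => DR-submodular
h = rng.random(n) * 2.0                                # linear term making f non-monotone

f = lambda x: 0.5 * x @ H @ x + h @ x
partial = lambda x, i: (H @ x)[i] + h[i]               # d f / d x_i, non-increasing in x_i

def maximize_coordinate(x, i, iters=40):
    """Binary search on [0,1] for where the i-th partial derivative crosses zero."""
    x = x.copy()
    x[i] = 0.0
    if partial(x, i) <= 0:        # derivative already non-positive: best value is 0
        return x
    x[i] = 1.0
    if partial(x, i) >= 0:        # derivative still non-negative at 1: best value is 1
        return x
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        x[i] = mid
        lo, hi = (mid, hi) if partial(x, i) > 0 else (lo, mid)
    x[i] = 0.5 * (lo + hi)
    return x

x = np.full(n, 0.5)                                    # start in the interior of the cube
for _ in range(20):                                    # a few sweeps over the coordinates
    for i in range(n):
        x = maximize_coordinate(x, i)
print("f(x) =", round(f(x), 4), "  x =", np.round(x, 3))
```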
  3. This paper considers the problems of maximizing a continuous non-monotone submodular function over the hypercube, both with and without coordinate-wise concavity. This family of optimization problems has several applications in machine learning, economics, and communication systems. The main result is the first 1/2-approximation algorithm for continuous submodular function maximization; this approximation factor of 1/2 is the best possible for algorithms that only query the objective function at polynomially many points. For the special case of DR-submodular maximization, i.e., when the submodular function is also coordinate-wise concave along all coordinates, we provide a different 1/2-approximation algorithm that runs in quasi-linear time.
  4. We consider the estimation of the marginal likelihood in Bayesian statistics, with primary emphasis on Gaussian graphical models, where the intractability of the marginal likelihood in high dimensions is a frequently researched problem. We propose a general algorithm that can be widely applied to a variety of problem settings and excels particularly when dealing with near log-concave posteriors. Our method builds upon a previously posited algorithm that uses MCMC samples to partition the parameter space and forms piecewise constant approximations over these partition sets as a means of estimating the normalizing constant. In this paper, we refine the aforementioned local approximations by taking advantage of the shape of the target distribution and leveraging an expectation propagation algorithm to approximate Gaussian integrals over rectangular polytopes. Our numerical experiments show the versatility and accuracy of the proposed estimator, even as the parameter space increases in dimension and becomes more complicated. 
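    The following minimal sketch illustrates the partition-and-piecewise-constant idea on a toy 2-D Gaussian whose normalizing constant is known; a fixed rectangular grid stands in for the sample-driven partition, the expectation propagation refinement from the paper is omitted, and all constants and settings are assumptions.

```python
# Minimal sketch: estimate Z = integral of q(theta) d(theta) by drawing MCMC samples
# from q/Z, partitioning a region into rectangles, and using a piecewise-constant
# approximation of q over each partition set.
import numpy as np

rng = np.random.default_rng(3)
cov = np.array([[1.0, 0.6], [0.6, 2.0]])
prec = np.linalg.inv(cov)
log_q = lambda t: -0.5 * t @ prec @ t                  # unnormalized log-density
true_Z = 2 * np.pi * np.sqrt(np.linalg.det(cov))       # known answer for the toy target

# Random-walk Metropolis samples from the normalized density q / Z.
m, step = 50_000, 0.8
theta, samples = np.zeros(2), np.empty((m, 2))
for i in range(m):
    prop = theta + step * rng.normal(size=2)
    if np.log(rng.random()) < log_q(prop) - log_q(theta):
        theta = prop
    samples[i] = theta

# Partition [-5, 5]^2 into rectangles and count samples per partition set.
edges = np.linspace(-5.0, 5.0, 41)                     # 40 x 40 cells
counts, _, _ = np.histogram2d(samples[:, 0], samples[:, 1], bins=[edges, edges])
centers = 0.5 * (edges[:-1] + edges[1:])
area = (edges[1] - edges[0]) ** 2
q_grid = np.exp([[log_q(np.array([a, b])) for b in centers] for a in centers])

# Piecewise-constant local estimate on each non-empty cell A_k:
#   integral of q over A_k ~ q(center_k) * area,   P(A_k) ~ n_k / m,
#   so  Z_k ~ q(center_k) * area / (n_k / m);
# combine the local estimates, weighting each cell by its sample fraction.
mask = counts > 0
frac = counts[mask] / m
Z_local = q_grid[mask] * area / frac
Z_hat = float((frac * Z_local).sum() / frac.sum())
print("estimate:", round(Z_hat, 4), "   true Z:", round(true_Z, 4))
```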
  5. Daumé, H; Singh, A (Ed.)
    An acknowledged weakness of neural networks is their vulnerability to adversarial perturbations of the inputs. To improve the robustness of these models, one of the most popular defense mechanisms is to alternately maximize the loss over constrained perturbations of the inputs (the adversaries) using projected gradient ascent and minimize over the weights. In this paper, we analyze the dynamics of the maximization step towards understanding the experimentally observed effectiveness of this defense mechanism. Specifically, we investigate the non-concave landscape of the adversaries for a two-layer neural network with a quadratic loss. Our main result proves that projected gradient ascent finds a local maximum of this non-concave problem in a polynomial number of iterations with high probability. To our knowledge, this is the first work that provides a convergence analysis of the first-order adversaries. Moreover, our analysis demonstrates that, in the initial phase of adversarial training, the scale of the inputs matters in the sense that a smaller input scale leads to faster convergence of adversarial training and a “more regular” landscape. Finally, we show that these theoretical findings are in excellent agreement with a series of experiments.
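    As a concrete illustration of the inner maximization step, here is a minimal sketch of projected gradient ascent over an l_infinity-bounded input perturbation for a two-layer ReLU network with quadratic loss; the widths, epsilon, step size, and iteration count are illustrative assumptions, not the paper's exact setting.

```python
# Minimal sketch: the inner maximization of adversarial training, i.e. projected
# gradient ascent on the loss over {delta : ||delta||_inf <= eps}, for an assumed
# two-layer network f(x) = a^T relu(W x) with quadratic loss.
import numpy as np

rng = np.random.default_rng(4)
d, width = 20, 50
W = rng.normal(size=(width, d)) / np.sqrt(d)           # first-layer weights (fixed here)
a = rng.normal(size=width) / np.sqrt(width)            # second-layer weights (fixed here)

relu = lambda z: np.maximum(z, 0.0)

def loss_and_grad(x, y):
    """Quadratic loss 0.5 (f(x) - y)^2 and its gradient with respect to the input x."""
    pre = W @ x
    resid = a @ relu(pre) - y
    grad_x = resid * (W.T @ (a * (pre > 0)))            # chain rule through the ReLU
    return 0.5 * resid ** 2, grad_x

def pgd_attack(x, y, eps=0.1, eta=0.02, steps=40):
    """Projected gradient ascent on the loss; projection is a clip onto the eps-box."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        _, g = loss_and_grad(x + delta, y)
        delta = np.clip(delta + eta * g, -eps, eps)     # ascent step + projection
    return delta

x = rng.normal(size=d)
y = 0.0
clean, _ = loss_and_grad(x, y)
delta = pgd_attack(x, y)
adv, _ = loss_and_grad(x + delta, y)
print(f"clean loss {clean:.4f} -> adversarial loss {adv:.4f}")
```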