Statistical binning leads to profound model violation due to gene tree error incurred by trying to avoid gene tree error
- Award ID(s):
- 1655571
- Publication Date:
- NSF-PAR ID:
- 10090086
- Journal Name:
- Molecular Phylogenetics and Evolution
- Volume:
- 134
- Issue:
- C
- Page Range or eLocation-ID:
- 164 to 171
- ISSN:
- 1055-7903
- Sponsoring Org:
- National Science Foundation
More Like this
-
When forest conditions are mapped from empirical models, uncertainty in remotely sensed predictor variables can cause the systematic overestimation of low values, underestimation of high values, and suppression of variability. This regression dilution or attenuation bias is a well-recognized problem in remote sensing applications, with few practical solutions. Attenuation is of particular concern for applications that are responsive to prediction patterns at the high end of observed data ranges, where systematic error is typically greatest. We addressed attenuation bias in models of tree species relative abundance (percent of total aboveground live biomass) based on multitemporal Landsat and topoclimatic predictor data. We developed a multi-objective support vector regression (MOSVR) algorithm that simultaneously minimizes total prediction error and systematic error caused by attenuation bias. Applied to 13 tree species in the Acadian Forest Region of the northeastern U.S., MOSVR performed well compared to other prediction methods including single-objective SVR (SOSVR) minimizing total error, Random Forest (RF), gradient nearest neighbor (GNN), and Random Forest nearest neighbor (RFNN) algorithms. SOSVR and RF yielded the lowest total prediction error but produced the greatest systematic error, consistent with strong attenuation bias. Underestimation at high relative abundance caused strong deviations between predicted patterns of species dominance/codominance andmore »