Model Evaluation Guidelines for Geomagnetic Index Predictions

Liemohn, Michael W.  (ORCID:0000000270392631); McCollough, James P.  (ORCID:0000000336158857); Jordanova, Vania K.  (ORCID:0000000304758743); Ngwira, Chigomezyo M.  (ORCID:0000000185013246); Morley, Steven K.  (ORCID:0000000185200199); Cid, Consuelo  (ORCID:0000000228633745); Tobiska, W. Kent  (ORCID:0000000204158484); Wintoft, Peter  (ORCID:000000023680126X); Ganushkina, Natalia Yu.  (ORCID:000000029259850X); Welling, Daniel T.  (ORCID:0000000205901022); Bingham, Suzy; Balikhin, Michael A.  (ORCID:0000000281105626); Opgenoorth, Hermann J.  (ORCID:0000000175735165); Engel, Miles A.  (ORCID:0000000342489636); Weigel, Robert S.  (ORCID:0000000295215228); Singer, Howard J.  (ORCID:0000000253646505); Buresova, Dalia  (ORCID:0000000294023152); Bruinsma, Sean  (ORCID:0000000285263314); Zhelavskaya, Irina S.  (ORCID:0000000270295372); Shprits, Yuri Y.  (ORCID:0000000296250834); Vasile, Ruggero

doi:10.1029/2018SW002067

Abstract Geomagnetic indices are convenient quantities that distill the complicated physics of some region or aspect of near‐Earth space into a single parameter. Most of the best‐known indices are calculated from ground‐based magnetometer data sets, such as Dst, SYM‐H, Kp, AE, AL, and PC. Many models have been created that predict the values of these indices, often using solar wind measurements upstream from Earth as the input variables to the calculation. This document reviews the current state of models that predict geomagnetic indices and the methods used to assess their ability to reproduce the target index time series. These existing methods are synthesized into a baseline collection of metrics for benchmarking a new or updated geomagnetic index prediction model. These methods fall into two categories: (1) fit performance metrics such as root‐mean‐square error and mean absolute error that are applied to a time series comparison of model output and observations and (2) event detection performance metrics such as Heidke Skill Score and probability of detection that are derived from a contingency table that compares model and observation values exceeding (or not) a threshold value. A few examples of codes being used with this set of metrics are presented, and other aspects of metrics assessment best practices, limitations, and uncertainties are discussed, including several caveats to consider when using geomagnetic indices.
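The two metric categories named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the sample values, and the convention that an "event" is an index value at or below the threshold (as is natural for a negative-going index such as Dst) are all assumptions made for illustration. The Heidke Skill Score uses the standard 2x2 contingency-table form.

```python
import math


def fit_metrics(model, obs):
    """Return (RMSE, MAE) for paired model and observed values."""
    errors = [m - o for m, o in zip(model, obs)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    return rmse, mae


def contingency_table(model, obs, threshold):
    """Count hits, false alarms, misses, and correct negatives.

    Assumption: an event is a value at or below the threshold.
    """
    hits = false_alarms = misses = correct_negatives = 0
    for m, o in zip(model, obs):
        model_event = m <= threshold
        obs_event = o <= threshold
        if model_event and obs_event:
            hits += 1
        elif model_event:
            false_alarms += 1
        elif obs_event:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def event_metrics(hits, false_alarms, misses, correct_negatives):
    """Return (probability of detection, Heidke Skill Score)."""
    h, f, m, c = hits, false_alarms, misses, correct_negatives
    # POD: fraction of observed events that the model also flagged.
    pod = h / (h + m) if (h + m) else float("nan")
    # HSS: skill relative to the agreement expected by chance.
    denom = (h + m) * (m + c) + (h + f) * (f + c)
    hss = 2.0 * (h * c - f * m) / denom if denom else float("nan")
    return pod, hss


# Hypothetical hourly values in nT, invented for this example only.
obs = [-10, -60, -80, -20, -55, -5]
model = [-15, -50, -90, -55, -40, -10]

rmse, mae = fit_metrics(model, obs)
table = contingency_table(model, obs, threshold=-50)
pod, hss = event_metrics(*table)
```

For an index where activity corresponds to large positive values (Kp or AE, for example), the comparison in `contingency_table` would be reversed to `>=`.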
