SEMBLEU: A Robust Metric for AMR Parsing Evaluation

Song, Linfeng; Gildea, Daniel

Citation Details

Evaluating AMR parsing accuracy involves comparing pairs of AMR graphs. The major evaluation metric, SMATCH (Cai and Knight, 2013), searches for one-to-one mappings between the nodes of two AMRs with a greedy hill-climbing algorithm, which leads to search errors. We propose SEMBLEU, a robust metric that extends BLEU (Papineni et al., 2002) to AMRs. It does not suffer from search errors and considers non-local correspondences in addition to local ones. SEMBLEU is fully content-driven and punishes situations where a system’s output does not preserve most information from the input. Preliminary experiments on both sentence and corpus levels show that SEMBLEU has slightly higher consistency with human judgments than SMATCH. Our code is available at http://github.com/freesunshine0316/sembleu.

Award ID(s): 1813823

PAR ID: 10112009

Author(s) / Creator(s): Song, Linfeng; Gildea, Daniel

Date Published: 2019-08-01

Journal Name: Association for Computational Linguistics (ACL)

Page Range / eLocation ID: 4547–4552

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text: Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
