Title: Iterative Paraphrastic Augmentation with Discriminative Span Alignment
Abstract: We introduce a novel paraphrastic augmentation strategy based on sentence-level lexically constrained paraphrasing and discriminative span alignment. Our approach allows for the large-scale expansion of existing datasets or the rapid creation of new datasets using a small, manually produced seed corpus. We demonstrate our approach with experiments on the Berkeley FrameNet Project, a large-scale language understanding effort spanning more than two decades of human labor. With four days of training data collection for a span alignment model and one day of parallel compute, we automatically generate and release to the community 495,300 unique (Frame, Trigger) pairs in diverse sentential contexts, a roughly 50-fold expansion atop FrameNet v1.7. The resulting dataset is intrinsically and extrinsically evaluated in detail, showing positive results on a downstream task.
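As a rough illustration of the paraphrase-then-align loop the abstract describes, the sketch below runs a seed (Frame, Trigger, sentence) triple through a paraphraser and a span aligner. Both components are deliberately trivial stand-ins (string templates and a difflib similarity score), not the lexically constrained paraphraser or trained discriminative aligner from the paper; they only show the shape of the data flow.

```python
# Minimal sketch of an iterative paraphrase-then-align augmentation loop.
# The paraphraser and aligner below are trivial stand-ins, NOT the authors'
# models: a real system would use lexically constrained neural paraphrasing
# and a trained discriminative span aligner.
from difflib import SequenceMatcher

def paraphrase(sentence, keep_phrase):
    """Stand-in for lexically constrained paraphrasing: returns rewrites
    that are forced to retain `keep_phrase` (here, crude templates)."""
    assert keep_phrase in sentence
    return [sentence.replace("quickly", "rapidly"),
            "It is true that " + sentence[0].lower() + sentence[1:]]

def align_span(paraphrase_text, source_span):
    """Stand-in for discriminative span alignment: pick the candidate span
    in the paraphrase most similar to the source span."""
    tokens = paraphrase_text.split()
    best, best_score = None, -1.0
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + 4, len(tokens)) + 1):
            cand = " ".join(tokens[i:j])
            score = SequenceMatcher(None, cand, source_span).ratio()
            if score > best_score:
                best, best_score = cand, score
    return best

seed = [("Motion", "moved", "The crowd moved quickly toward the exit.")]
augmented = []
for frame, trigger, sent in seed:
    for para in paraphrase(sent, trigger):
        new_trigger = align_span(para, trigger)   # recover the trigger span
        augmented.append((frame, new_trigger, para))

for row in augmented:
    print(row)
```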
Award ID(s): 1749025
PAR ID: 10293091
Author(s) / Creator(s):
Date Published:
Journal Name: Transactions of the Association for Computational Linguistics
Volume: 9
ISSN: 2307-387X
Page Range / eLocation ID: 494 to 509
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Measuring the organization of the cellular cytoskeleton and the surrounding extracellular matrix (ECM) is currently of wide interest, as changes in both local and global alignment can highlight alterations in cellular functions and in the material properties of the extracellular environment. Different approaches have been developed to quantify these structures, typically based on fiber segmentation or on matrix representation and transformation of the image, each with its own advantages and disadvantages. Here we present AFT (Alignment by Fourier Transform), a workflow to quantify the alignment of fibrillar features in microscopy images using 2D Fast Fourier Transforms (FFTs). Using pre-existing datasets of cell and ECM images, we demonstrate our approach and compare and contrast this workflow with two other well-known ImageJ algorithms for quantifying image feature alignment. These comparisons reveal that AFT has a number of advantages owing to its grid-based FFT approach. (1) Flexibility in defining the window and neighborhood sizes allows a parameter search to determine the optimal length scale at which to compute alignment metrics; the approach can thus easily accommodate different image resolutions and biological systems. (2) The length scale of decay in alignment can be extracted by comparing neighborhood sizes, revealing the overall distance over which features remain anisotropic. (3) The approach is agnostic to the signal source, making it applicable to a wide range of imaging modalities, and it depends on fewer input parameters than segmentation methods. (4) Finally, compared with segmentation methods, the algorithm is computationally inexpensive: high-resolution images can be evaluated in less than a second on a standard desktop computer, making it feasible to screen numerous experimental perturbations or examine large images over long length scales. Implementation is made available in both MATLAB and Python for wider accessibility, with example datasets for single images and batch processing. Additionally, we include an approach to automatically search for optimal window and neighborhood sizes, as well as to measure the decay in alignment over progressively increasing length scales.
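A minimal sketch of the grid-based FFT idea behind AFT is given below: each window's dominant orientation is estimated from the second moments of its FFT power spectrum, and a global order parameter summarizes how consistent those orientations are. The window size, step, and order parameter are illustrative choices, not the released MATLAB/Python implementation.

```python
# Minimal sketch of grid-based 2D-FFT alignment measurement in the spirit of
# AFT (not the released implementation). Window/step sizes are illustrative.
import numpy as np

def window_orientation(window):
    """Dominant fiber orientation of one window from its FFT power spectrum."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(window - window.mean()))) ** 2
    h, w = spec.shape
    y, x = np.mgrid[0:h, 0:w]
    y, x = y - h // 2, x - w // 2
    # Second moments of the power spectrum; the spectrum is elongated
    # perpendicular to the real-space fiber direction.
    m = spec.sum()
    cxx = (spec * x * x).sum() / m
    cyy = (spec * y * y).sum() / m
    cxy = (spec * x * y).sum() / m
    theta_spec = 0.5 * np.arctan2(2 * cxy, cxx - cyy)
    return theta_spec + np.pi / 2  # rotate back to real-space orientation

def alignment_map(image, win=64, step=32):
    """Orientation per grid window, plus a global order parameter |<exp(2i*theta)>|."""
    thetas = []
    for r in range(0, image.shape[0] - win + 1, step):
        for c in range(0, image.shape[1] - win + 1, step):
            thetas.append(window_orientation(image[r:r + win, c:c + win]))
    thetas = np.array(thetas)
    order = np.abs(np.mean(np.exp(2j * thetas)))  # 1 = perfectly aligned
    return thetas, order

# Synthetic test: horizontal stripes should give a high order parameter.
img = np.sin(np.linspace(0, 40 * np.pi, 256))[:, None] * np.ones((1, 256))
_, order = alignment_map(img)
print(f"order parameter: {order:.2f}")
```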
  2. Recent work on Question Answering (QA) and Conversational QA (ConvQA) emphasizes the role of retrieval: a system first retrieves evidence from a large collection and then extracts answers. This open-retrieval setting typically assumes that each question is answerable by a single span of text within a particular passage (a span answer). The supervision signal is thus derived from whether or not the system can recover an exact match of this ground-truth answer span from the retrieved passages. This method is referred to as span-match weak supervision. However, information-seeking conversations are challenging for this span-match method, since long answers, especially freeform answers, are not necessarily strict spans of any passage. We therefore introduce a learned weak supervision approach that can identify a paraphrased span of the known answer in a passage. Our experiments on the QuAC and CoQA datasets show that although a span-match weak supervisor can handle conversations with span answers, it is not sufficient for freeform answers generated by people. We further demonstrate that our method is more flexible, since it can handle both span answers and freeform answers. In particular, our method outperforms the span-match method on conversations with freeform answers, and it can be more powerful when combined with the span-match method. We also conduct in-depth analyses to provide further insight into open-retrieval ConvQA under a weak supervision setting.
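The contrast between span-match weak supervision and a learned supervisor that accepts paraphrased spans can be sketched as follows. Plain token F1 stands in here for the trained matcher described in the abstract; the passage, answer, and threshold are invented for illustration.

```python
# Minimal sketch contrasting span-match weak supervision with a "learned"
# supervisor that scores paraphrased spans. Token F1 is a stand-in scorer.
def token_f1(candidate, gold):
    a, b = candidate.lower().split(), gold.lower().split()
    common = sum(min(a.count(t), b.count(t)) for t in set(a))
    if not common:
        return 0.0
    p, r = common / len(a), common / len(b)
    return 2 * p * r / (p + r)

def span_match_signal(passage, gold_answer):
    """Classic weak supervision: positive only if the exact answer appears."""
    return gold_answer.lower() in passage.lower()

def learned_signal(passage, gold_answer, width=8, threshold=0.6):
    """Stand-in learned supervisor: best-scoring candidate span vs. the answer."""
    toks = passage.split()
    best = max(
        (token_f1(" ".join(toks[i:i + w]), gold_answer)
         for i in range(len(toks)) for w in range(1, width + 1)),
        default=0.0,
    )
    return best >= threshold, best

passage = "The treaty was signed in Paris during the autumn of 1783."
freeform_answer = "It was signed in the fall of 1783 in Paris"
print(span_match_signal(passage, freeform_answer))   # False: no exact span
print(learned_signal(passage, freeform_answer))      # (True, score ~0.67)
```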
  3. Beginner musicians often struggle to identify specific errors in their performances, such as playing incorrect notes or rhythms. Existing tools for music error detection have two limitations: (1) existing approaches rely on automatic alignment and are therefore prone to errors caused by small deviations between alignment targets; (2) there is insufficient data to train music error detection models, resulting in over-reliance on heuristics. To address (1), we propose a novel transformer model, Polytune, that takes audio inputs and outputs annotated music scores. This model can be trained end-to-end to implicitly align and compare performance audio with music scores through latent-space representations. To address (2), we present a novel data generation technique capable of creating large-scale synthetic music error datasets. Our approach achieves a 64.1% average Error Detection F1 score, improving upon prior work by 40 percentage points across 14 instruments. Additionally, unlike existing transcription methods repurposed for music error detection, our model can handle multiple instruments.
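The data-generation idea (injecting known errors into clean scores so that labels come for free) might look roughly like the sketch below. The note representation, perturbation types, and rates are illustrative assumptions, not Polytune's actual pipeline.

```python
# Minimal sketch of synthetic error injection for music error-detection data.
# The score representation and perturbation rates are illustrative only.
import random

def inject_errors(score, p_wrong_note=0.1, p_rhythm=0.1, seed=0):
    """score: list of (midi_pitch, onset_beats, duration_beats).
    Returns (corrupted_score, labels) where each note is labeled
    'correct', 'wrong_note', or 'rhythm_error'."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for pitch, onset, dur in score:
        r = rng.random()
        if r < p_wrong_note:
            # replace the pitch by a nearby wrong note
            corrupted.append((pitch + rng.choice([-2, -1, 1, 2]), onset, dur))
            labels.append("wrong_note")
        elif r < p_wrong_note + p_rhythm:
            # shift the onset to simulate a rhythmic mistake
            corrupted.append((pitch, onset + rng.choice([-0.25, 0.25]), dur))
            labels.append("rhythm_error")
        else:
            corrupted.append((pitch, onset, dur))
            labels.append("correct")
    return corrupted, labels

reference = [(60, 0.0, 1.0), (62, 1.0, 1.0), (64, 2.0, 1.0), (65, 3.0, 1.0)]
performed, labels = inject_errors(reference, p_wrong_note=0.5, seed=3)
for note, label in zip(performed, labels):
    print(note, label)
```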
  4. Robust state tracking for task-oriented dialogue systems currently remains restricted to a few popular languages. This paper shows that, given a large-scale dialogue dataset in one language, we can automatically produce an effective semantic parser for other languages using machine translation. We propose automatic translation of dialogue datasets with alignment to ensure faithful translation of slot values and to eliminate the costly human supervision used in previous benchmarks. We also propose a new contextual semantic parsing model, which encodes the formal slots and values, and only the last agent and user utterances. We show that this succinct representation reduces the compounding effect of translation errors without harming accuracy in practice. We evaluate our approach on several dialogue state tracking benchmarks. On the RiSAWOZ, CrossWOZ, CrossWOZ-EN, and MultiWOZ-ZH datasets, we improve the state of the art by 11%, 17%, 20%, and 0.3% in joint goal accuracy, respectively. We present a comprehensive error analysis for all three datasets, showing that erroneous annotations can lead to misguided judgments of model quality. Finally, we present RiSAWOZ English and German datasets, created using our translation methodology. On these datasets, accuracy is within 11% of the original, showing that high-accuracy multilingual dialogue datasets are possible without relying on expensive human annotations. We release our datasets and software as open source.
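A small sketch of the succinct parser input described above (the formal slots and values plus only the last agent and user utterances) is shown below; the separator tokens and slot names are illustrative, not the paper's exact serialization.

```python
# Minimal sketch of a compact contextual-parsing input: formal dialogue state
# plus only the last agent and user turns. Field names/separators are assumed.
def build_parser_input(state, last_agent_utt, last_user_utt):
    # serialize the state as "domain.slot = value" pairs in a stable order
    state_str = " ; ".join(f"{d}.{s} = {v}" for (d, s), v in sorted(state.items()))
    return f"<state> {state_str} <agent> {last_agent_utt} <user> {last_user_utt}"

state = {("hotel", "area"): "centre", ("hotel", "stars"): "4"}
print(build_parser_input(
    state,
    "I found three 4-star hotels in the centre. Any price preference?",
    "Something cheap, and book it for two nights.",
))
```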
  5. In contrast to the conventional approach of directly comparing genomic sequences using sequence alignment tools, we propose a computational approach that performs comparisons between sequence generators. These sequence generators are learned via a data-driven approach that empirically computes the state machine generating the genomic sequence of interest. Because the state-machine-based generator of a sequence is independent of the sequence length, it provides an efficient way to compute the statistical distance between large sets of genomic sequences. Moreover, our technique provides a fast and efficient method to cluster large datasets of genomic sequences, characterize their temporal and spatial evolution in a continuous manner, and gain insight into locality-sensitive information about the sequences, all without any need for alignment. Furthermore, we show that the technique can detect local regions with mutation activity, which can then be used to aid alignment techniques in the fast discovery of mutations. To demonstrate the efficacy of our technique on real genomic data, we cluster different strains of SARS-CoV-2 viral sequences, characterize their evolution, and identify regions of the viral sequence with mutations.
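One simple way to picture comparison via learned generators is an order-k Markov "state machine" over k-mers, with sequences compared by a distance between their transition distributions. The sketch below uses that stand-in; the paper's actual generator learning and statistical distance may differ.

```python
# Minimal sketch of alignment-free comparison through learned generators:
# each sequence is summarized by an empirical order-k Markov model over
# k-mers, and sequences are compared by a distance between those models.
from collections import Counter

def transition_model(seq, k=3):
    """Empirical P(next_base | previous k-mer) as nested Counters."""
    model = {}
    for i in range(len(seq) - k):
        state, nxt = seq[i:i + k], seq[i + k]
        model.setdefault(state, Counter())[nxt] += 1
    return model

def model_distance(m1, m2, alphabet="ACGT"):
    """Mean total-variation distance between the two generators, averaged
    over the k-mer states observed in either sequence."""
    states = set(m1) | set(m2)
    total = 0.0
    for s in states:
        c1, c2 = m1.get(s, Counter()), m2.get(s, Counter())
        n1, n2 = sum(c1.values()) or 1, sum(c2.values()) or 1
        total += 0.5 * sum(abs(c1[b] / n1 - c2[b] / n2) for b in alphabet)
    return total / len(states)

a = "ACGTACGTTACGGACGTACGTACGAACGT" * 5
b = a.replace("GGA", "GTA")          # a small local "mutation"
c = "TTTTGGGGCCCCAAAATTTTGGGGCCCC" * 5
print(model_distance(transition_model(a), transition_model(b)))  # small
print(model_distance(transition_model(a), transition_model(c)))  # large
```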