Semantically informed data augmentation for unscoped episodic logical forms

Juvekar, Mandar; Kim, Gene; Schubert, Lenhart

This content will become publicly available on June 1, 2024

Unscoped Logical Form (ULF) of Episodic Logic is a meaning representation format that captures the overall semantic type structure of natural language while leaving certain finer details, such as word sense and quantifier scope, underspecified for ease of parsing and annotation. While a learned parser exists to convert English to ULF, its performance is severely limited by the lack of a large training dataset. We present a ULF dataset augmentation method that samples type-coherent ULF expressions using the ULF semantic type system and filters out samples corresponding to implausible English sentences using a pretrained language model. Our data augmentation method is configurable with parameters that trade off sample plausibility against sample novelty and augmentation size. We find that the best configuration of this augmentation method substantially improves parser performance beyond using the existing unaugmented dataset.
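The sample-then-filter idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two-type lexicon, the ULF-like strings, the function names (`sample_type_coherent`, `augment`, `toy_score`), and the parameters `threshold` and `max_new` are all hypothetical, and `toy_score` is an arbitrary placeholder standing in for a pretrained language model's plausibility score.

```python
import random
from typing import Callable, List, Set, Tuple

# Hypothetical toy lexicon: each atom is paired with a simplified type.
# "e" is an entity and "e->t" is a one-place predicate. These strings are
# only loosely modeled on ULF notation and are not guaranteed to be valid ULF.
LEXICON: List[Tuple[str, str]] = [
    ("|Mary|", "e"),
    ("|John|", "e"),
    ("(the.d dog.n)", "e"),
    ("sleep.v", "e->t"),
    ("run.v", "e->t"),
]


def sample_type_coherent(rng: random.Random) -> str:
    """Pair an atom of type e with an atom of type e->t, so the resulting
    application is well-typed by construction."""
    subjects = [atom for atom, typ in LEXICON if typ == "e"]
    predicates = [atom for atom, typ in LEXICON if typ == "e->t"]
    return f"({rng.choice(subjects)} {rng.choice(predicates)})"


def augment(
    existing: Set[str],
    score: Callable[[str], float],
    threshold: float,
    max_new: int,
    attempts: int = 200,
    seed: int = 0,
) -> List[str]:
    """Collect up to max_new samples that are novel (absent from existing
    and from earlier samples) and score at least threshold."""
    rng = random.Random(seed)
    seen = set(existing)
    accepted: List[str] = []
    for _ in range(attempts):
        if len(accepted) >= max_new:
            break
        candidate = sample_type_coherent(rng)
        if candidate in seen:
            continue  # novelty filter
        if score(candidate) < threshold:
            continue  # plausibility filter
        seen.add(candidate)
        accepted.append(candidate)
    return accepted


def toy_score(expression: str) -> float:
    """Arbitrary placeholder score. A real system would instead score the
    corresponding English sentence with a pretrained language model."""
    return 0.2 if "|John|" in expression else 0.9


if __name__ == "__main__":
    seed_data = {"(|Mary| sleep.v)"}
    for sample in augment(seed_data, toy_score, threshold=0.5, max_new=3):
        print(sample)
```

In this sketch, raising `threshold` favors plausibility at the cost of fewer accepted samples, while `max_new` caps the augmentation size, mirroring in a simplified way the trade-off the abstract describes.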

Award ID(s): 1940981

NSF-PAR ID: 10435036

Author(s) / Creator(s): Juvekar, Mandar; Kim, Gene; Schubert, Lenhart

Date Published: 2023-06-01

Journal Name: 15th International Conference on Computational Semantics

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
The DOI is not currently available.