Unsupervised Learning of PCFGs with Normalizing Flow

Jin, Lifeng; Doshi-Velez, Finale; Miller, Timothy; Schwartz, Lane; Schuler, William

Unsupervised PCFG inducers hypothesize sets of compact context-free rules as explanations for sentences. PCFG induction not only provides tools for low-resource languages, but also plays an important role in modeling language acquisition (Bannard et al., 2009; Abend et al., 2017). However, current PCFG induction models, which take word tokens as input, cannot incorporate semantics and morphology into induction and may suffer from vocabulary sparsity on morphologically rich languages. This paper describes a neural PCFG inducer that employs context embeddings (Peters et al., 2018) in a normalizing flow model (Dinh et al., 2015) to extend PCFG induction to use semantic and morphological information. Linguistically motivated sparsity and categorical distance constraints are imposed on the inducer as regularization. Experiments show that the PCFG induction model with normalizing flow produces grammars with state-of-the-art accuracy on a variety of languages. Ablation further shows a positive effect of normalizing flow, context embeddings, and the proposed regularizers.

Award ID(s): 1816891

PAR ID: 10109681

Author(s) / Creator(s): Jin, Lifeng; Doshi-Velez, Finale; Miller, Timothy; Schwartz, Lane; Schuler, William

Date Published: 2019-01-01

Journal Name: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Page Range / eLocation ID: 2442–2452

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper: The DOI is not currently available.
