NSF PAR Search | NSF Public Access Repository


Representations of syntax [MASK] useful: Effects of constituency and dependency structure in recursive LSTMs

Lepori, Michael; Linzen, Tal; McCoy, R. Thomas (July 2020, Association for Computational Linguistics)

Sequence-based neural networks show significant sensitivity to syntactic structure, but they still perform less well on syntactic tasks than tree-based networks. Such tree-based networks can be provided with a constituency parse, a dependency parse, or both. We evaluate which of these two representational schemes more effectively introduces biases for syntactic structure that increase performance on the subject-verb agreement prediction task. We find that a constituency-based network generalizes more robustly than a dependency-based one, and that combining the two types of structure does not yield further improvement. Finally, we show that the syntactic robustness of sequential models can be substantially improved by fine-tuning on a small amount of constructed data, suggesting that data augmentation is a viable alternative to explicit constituency structure for imparting the syntactic biases that sequential models are lacking.
Full Text Available
Does Syntax Need to Grow on Trees? Sources of Hierarchical Inductive Bias in Sequence-to-Sequence Networks

https://doi.org/10.1162/tacl_a_00304

McCoy, R. Thomas; Frank, Robert; Linzen, Tal (January 2020, Transactions of the Association for Computational Linguistics)

Learners that are exposed to the same training data might generalize differently due to differing inductive biases. In neural network models, inductive biases could in theory arise from any aspect of the model architecture. We investigate which architectural factors affect the generalization behavior of neural sequence-to-sequence models trained on two syntactic tasks, English question formation and English tense reinflection. For both tasks, the training set is consistent with a generalization based on hierarchical structure and a generalization based on linear order. All architectural factors that we investigated qualitatively affected how models generalized, including factors with no clear connection to hierarchical structure. For example, LSTMs and GRUs displayed qualitatively different inductive biases. However, the only factor that consistently contributed a hierarchical bias across tasks was the use of a tree-structured model rather than a model with sequential recurrence, suggesting that human-like syntactic generalization requires architectural syntactic structure.
Full Text Available
Finding Hierarchical Structure in Neural Stacks Using Unsupervised Parsing

https://doi.org/10.18653/v1/W19-4823

Merrill, William; Khazan, Lenny; Amsel, Noah; Hao, Yiding; Mendelsohn, Simon; Frank, Robert (August 2019, Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP)

Full Text Available
Open Sesame: Getting Inside BERT’s Linguistic Knowledge

https://doi.org/10.18653/v1/W19-4825

Lin, Yongjie; Tan, Yi Chern; Frank, Robert (August 2019, Proceedings of the Second BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP)

