Title: Modeling Syntactic Knowledge With Neuro-Symbolic Computation
To overcome limitations of prevailing NLP methods, a Hybrid-Architecture Symbolic Parser and Neural Lexicon system (HASPNeL) is proposed to detect structural ambiguity by producing as many syntactic representations as there are interpretations of an utterance. HASPNeL comprises a symbolic-AI feature-unification parser, a lexicon generated through manual classification and machine learning, and a neural network encoder that tags each lexical item in a synthetic corpus and estimates the likelihood of each interpretation of an utterance with respect to the corpus. Language variation is accounted for by lexical adjustments in feature specifications and by minimal parameter settings. Unlike purely probabilistic systems, HASPNeL’s neuro-symbolic architecture will: perform grammaticality judgements of utterances that do not correspond to the rankings of probabilistic systems; offer greater system stability, as it is not susceptible to perturbations in the training data; detect lexical and structural ambiguity by producing all possible grammatical representations, regardless of their presence in the training data; eliminate the effects of diminishing returns, as it does not require massive amounts of annotated data, which are unavailable for underrepresented languages; avoid overparameterization and potential overfitting; test current syntactic theory by implementing a Minimalist grammar formalism; and model human language competence by satisfying conditions of learnability, evolvability, and universality.
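
As an illustration of the parsing half of this design, a minimal sketch follows; it is not HASPNeL itself, but a toy feature-unification chart parser built with NLTK's feature-grammar machinery, over a PP-attachment grammar invented for this example. It detects structural ambiguity exactly as the abstract describes: by returning one parse per interpretation.

from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

# Toy feature grammar (invented for illustration): NUM percolates by
# unification, and PP attachment is deliberately ambiguous.
grammar = FeatureGrammar.fromstring("""
S -> NP[NUM=?n] VP[NUM=?n]
NP[NUM=?n] -> Det N[NUM=?n]
NP[NUM=?n] -> NP[NUM=?n] PP
NP[NUM=sg] -> 'I'
VP[NUM=?n] -> V[NUM=?n] NP[NUM=?m]
VP[NUM=?n] -> VP[NUM=?n] PP
PP -> P NP[NUM=?n]
Det -> 'the'
N[NUM=sg] -> 'man' | 'telescope'
V[NUM=sg] -> 'saw'
P -> 'with'
""")

parser = FeatureChartParser(grammar)
trees = list(parser.parse("I saw the man with the telescope".split()))
print(len(trees))   # 2: one parse per reading, so ambiguity is detected
for tree in trees:
    print(tree)

The two printed trees correspond to the two readings of the prepositional phrase (instrument of seeing vs. modifier of the man); a purely probabilistic parser would typically surface only its top-ranked tree.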
Award ID(s):
2219712
PAR ID:
10557863
Author(s) / Creator(s):
Publisher / Repository:
SCITEPRESS - Science and Technology Publications
Date Published:
ISSN:
2184-433X; 2184-3589
ISBN:
978-989-758-623-1
Page Range / eLocation ID:
608 to 616
Subject(s) / Keyword(s):
Minimalist Syntax; Parser; Lexicon; Structural Ambiguity; Cognitive Modeling; Computational Linguistics; Natural Language Processing; Symbolic Computation; Neural Networks; Explainable Artificial Intelligence
Format(s):
Medium: X
Location:
Lisbon, Portugal
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents a conversational pipeline for crafting domain knowledge for complex neuro-symbolic models through natural language prompts. It leverages large language models to generate declarative programs in the DomiKnowS framework, which express concepts and their relationships as a graph, along with logical constraints between them. The graph can later be connected to trainable neural models according to those specifications. The proposed pipeline uses techniques such as dynamic in-context demonstration retrieval, model refinement based on feedback from a symbolic parser, visualization, and user interaction to generate the tasks' structure and formal knowledge representation. This approach empowers domain experts, even those not well-versed in ML/AI, to formally declare their knowledge for incorporation into customized neural models in the DomiKnowS framework.
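
To make the generated artifact concrete, here is a hypothetical, stripped-down stand-in in plain Python for the kind of declarative program such a pipeline emits; the class and names are invented for illustration and do not reproduce the actual DomiKnowS API.

from dataclasses import dataclass, field

# Hypothetical stand-in for a generated declarative program; the real
# DomiKnowS framework uses its own graph/concept classes and constraint language.
@dataclass
class Concept:
    name: str
    is_a: list = field(default_factory=list)   # parent concepts in the graph

email = Concept("email")
spam = Concept("spam", is_a=[email])
regular = Concept("regular", is_a=[email])

# A logical constraint between concepts: every email is exactly one of the two.
constraints = [("exactly_one", (spam, regular))]

# The conversational pipeline's job is to produce declarations like these
# from a natural-language prompt, after which the concept graph is bound
# to trainable neural models.
print([c.name for c in (email, spam, regular)], constraints[0][0])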
  2. This tutorial will provide an overview of recent advances in neuro-symbolic approaches for information retrieval. A decade ago, knowledge graphs and semantic annotation technology led to active research on how best to leverage symbolic knowledge. At the same time, neural methods have proven versatile and highly effective. From a neural network perspective, the same representation approach can serve document ranking or knowledge graph reasoning, and end-to-end training makes it possible to optimize complex methods for downstream tasks. We are at the point where both the symbolic and the neural research advances are coalescing into neuro-symbolic approaches. The underlying research questions are how best to combine symbolic and neural approaches, what kind of symbolic/neural approach is most suitable for which use case, and how best to integrate both ideas to advance the state of the art in information retrieval. Materials are available online: https://github.com/laura-dietz/neurosymbolic-representations-for-IR
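
One common recipe from this space, sketched below under assumed inputs (a neural relevance score in [0, 1] and knowledge-graph entity links for query and document), is to interpolate the neural score with a symbolic entity-overlap feature; the function and weighting are illustrative, not taken from the tutorial.

# Illustrative only: neural_score is assumed to lie in [0, 1], and entities
# are knowledge-graph identifiers attached to the query and document.
def neuro_symbolic_score(neural_score, query_entities, doc_entities, alpha=0.7):
    """Interpolate a neural ranker's score with an entity-overlap feature."""
    q = set(query_entities)
    overlap = len(q & set(doc_entities)) / max(1, len(q))  # symbolic signal in [0, 1]
    return alpha * neural_score + (1 - alpha) * overlap

# A document covering both query entities gets a boost over its neural score alone.
print(neuro_symbolic_score(0.62, ["Q42", "Q5"], ["Q42", "Q5", "Q7"]))  # 0.734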
  3. In standard models of language production or comprehension, the elements which are retrieved from memory and combined into a syntactic structure are “lemmas” or “lexical items.” Such models implicitly take a “lexicalist” approach, which assumes that lexical items store meaning, syntax, and form together, that syntactic and lexical processes are distinct, and that syntactic structure does not extend below the word level. Across the last several decades, linguistic research examining a typologically diverse set of languages has provided strong evidence against this approach. These findings suggest that syntactic processes apply both above and below the “word” level, and that both meaning and form are partially determined by the syntactic context. This has significant implications for psychological and neurological models of language processing as well as for the way that we understand different types of aphasia and other language disorders. As a consequence of the lexicalist assumptions of these models, many kinds of sentences that speakers produce and comprehend—in a variety of languages, including English—are challenging for them to account for. Here we focus on language production as a case study. In order to move away from lexicalism in psycho- and neuro-linguistics, it is not enough to simply update the syntactic representations of words or phrases; the processing algorithms involved in language production are constrained by the lexicalist representations that they operate on, and thus also need to be reimagined. We provide an overview of the arguments against lexicalism, discuss how lexicalist assumptions are represented in models of language production, and examine the types of phenomena that they struggle to account for as a consequence. We also outline what a non-lexicalist alternative might look like, as a model that does not rely on a lemma representation, but instead represents that knowledge as separate mappings between (a) meaning and syntax and (b) syntax and form, with a single integrated stage for the retrieval and assembly of syntactic structure. By moving away from lexicalist assumptions, this kind of model provides better cross-linguistic coverage and aligns better with contemporary syntactic theory. 
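
A hypothetical toy rendering of that two-mapping proposal, with all symbols invented for this sketch, might look as follows: meaning maps to syntactic heads, and a separate spell-out step maps syntax to form, so the same root surfaces differently depending on its syntactic context, with no stored lemma uniting the two.

# Hypothetical toy fragments (invented for this sketch, not the authors' model):
# meaning maps to syntax, and syntax maps separately to form.
meaning_to_syntax = {
    "SING_EVENT": "ROOT_SING",    # event concept -> verbal root
    "PAST": "T[past]",            # tense meaning -> T head
}
syntax_to_form = {
    ("ROOT_SING", "T[past]"): "sang",   # form chosen by syntactic context
    ("ROOT_SING",): "sing",
}

def spell_out(root, context=()):
    """Map a syntactic root to its form, given its syntactic context."""
    return syntax_to_form.get((root, *context), syntax_to_form[(root,)])

print(spell_out(meaning_to_syntax["SING_EVENT"]))                # sing
print(spell_out(meaning_to_syntax["SING_EVENT"], ("T[past]",)))  # sang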
  4. Paraphrasing natural language sentences is a multifaceted process: it might involve replacing individual words or short phrases, local rearrangement of content, or high-level restructuring like topicalization or passivization. Past approaches struggle to cover this space of paraphrase possibilities in an interpretable manner. Our work, inspired by the pre-ordering literature in machine translation, uses syntactic transformations to softly "reorder" the source sentence and guide our neural paraphrasing model. First, given an input sentence, we derive a set of feasible syntactic rearrangements using an encoder-decoder model. This model operates over a partially lexical, partially syntactic view of the sentence and can reorder large chunks. Next, we use each proposed rearrangement to produce a sequence of position embeddings, which encourages our final encoder-decoder paraphrase model to attend to the source words in a particular order. Our evaluation, both automatic and human, shows that the proposed system retains the quality of the baseline approaches while giving a substantial increase in the diversity of the generated paraphrases.
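
A minimal sketch of the position-embedding step, using illustrative names and dimensions rather than the authors' code, would feed the encoder position indices taken from a proposed rearrangement instead of the monotone sequence 0..n-1:

import torch

# Illustrative sizes (not the authors' code).
vocab_size, d_model, max_len = 1000, 16, 32
tok_emb = torch.nn.Embedding(vocab_size, d_model)
pos_emb = torch.nn.Embedding(max_len, d_model)

source_ids = torch.tensor([[11, 42, 7, 93]])   # four source tokens
reordering = torch.tensor([[2, 3, 0, 1]])      # proposed rearrangement: last chunk first

# Feed the encoder the *reordered* position indices instead of 0..n-1,
# nudging attention to follow the proposed source order.
guided_input = tok_emb(source_ids) + pos_emb(reordering)
print(guided_input.shape)   # torch.Size([1, 4, 16])

Because only the position signal changes, the same paraphrase model can be steered toward a different source order for each proposed rearrangement, which is what yields the diversity of outputs.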