Title: Deep neural networks easily learn unnatural infixation and reduplication patterns
Morphological patterns can involve simple concatenation of fixed strings (e.g., unkind, kindness) or ‘nonconcatenative’ processes such as infixation (e.g., Chamorro l-um-iʔeʔ ‘saw (actor-focus)’, Topping, 1973) and reduplication (e.g., Amele ba-bagawen ‘as he came out’, Roberts, 1987), among many others (e.g., Anderson, 1992; Inkelas, 2014). Recent work has established that deep neural networks are capable of inducing both concatenative and nonconcatenative patterns (e.g., Kann and Schütze, 2017; Nelson et al., 2020). In this paper, we verify that encoder-decoder networks can learn and generalize attested types of infixation and reduplication from modest training sets. We show further that the same networks readily learn many infixation and reduplication patterns that are unattested in natural languages, raising questions about their relationship to linguistic theory and viability as models of human learning.
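To make the pattern types concrete, here is a minimal sketch of the two nonconcatenative operations the abstract mentions, written as plain string transforms. The function names, the five-vowel inventory, and the fallback behaviors are illustrative assumptions, not the paper's data-generation procedure.

```python
def reduplicate_initial_cv(word):
    """Copy the initial consonant-vowel sequence (partial reduplication),
    roughly Amele-style ba-bagawen from bagawen."""
    vowels = "aeiou"
    # take everything up to and including the first vowel as the reduplicant
    for i, ch in enumerate(word):
        if ch in vowels:
            return word[: i + 1] + word
    return word + word  # no vowel found: fall back to full copy

def infix_after_first_consonant(word, infix="um"):
    """Insert an infix after the initial consonant,
    roughly Chamorro-style l-um-i'e' from li'e'."""
    vowels = "aeiou"
    if word and word[0] not in vowels:
        return word[0] + infix + word[1:]
    return infix + word  # vowel-initial base: attach at the left edge

print(reduplicate_initial_cv("bagawen"))   # -> babagawen
print(infix_after_first_consonant("lie"))  # -> lumie
```

An "unnatural" variant of either pattern (e.g., copying the final CV to the left edge, or infixing after the second consonant) is an equally trivial string transform, which is why it is notable when learners treat the attested and unattested versions differently.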
Award ID(s):
1941593
PAR ID:
10217150
Author(s) / Creator(s):
Editor(s):
Ettinger, Allyson; Pavlich, Ellie; Prickett, Brandon
Date Published:
Journal Name:
Proceedings of the Society for Computation in Linguistics
Volume:
4
Page Range / eLocation ID:
Article 52
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Reduplication is common, but analogous reversal processes are rare, even though reversal, which involves nested rather than crossed dependencies, is less complex on the Chomsky hierarchy. We hypothesize that the explanation is that repetitions can be recognized when they match and reactivate a stored trace in short-term memory, but recognizing a reversal requires rearranging the input in working memory before attempting to match it to the stored trace. Repetitions can thus be recognized, and repetition patterns learned, implicitly, whereas reversals require explicit, conscious awareness. To test these hypotheses, participants were trained to recognize either a reduplication or a syllable-reversal pattern, and then asked to state the rule. In two experiments, above-chance classification performance on the Reversal pattern was confined to Correct Staters, whereas above-chance performance on the Reduplication pattern was found with or without correct rule-stating. Final proportion correct was positively correlated with final response time for the Reversal Correct Staters but no other group. These results support the hypothesis that reversal, unlike reduplication, requires conscious, time-consuming computation. 
  2. Abstract Artificial neural networks are increasingly used for geophysical modeling to extract complex nonlinear patterns from geospatial data. However, it is difficult to understand how networks make predictions, limiting trust in the model, debugging capacity, and physical insights. EXplainable Artificial Intelligence (XAI) techniques expose how models make predictions, but XAI results may be influenced by correlated features. Geospatial data typically exhibit substantial autocorrelation. With correlated input features, learning methods can produce many networks that achieve very similar performance (e.g., arising from different initializations). Since the networks capture different relationships, their attributions can vary. Correlated features may also cause inaccurate attributions because XAI methods typically evaluate isolated features, whereas networks learn multifeature patterns. Few studies have quantitatively analyzed the influence of correlated features on XAI attributions. We use a benchmark framework of synthetic data with increasingly strong correlation, for which the ground truth attribution is known. For each dataset, we train multiple networks and compare XAI-derived attributions to the ground truth. We show that correlation may dramatically increase the variance of the derived attributions, and investigate the cause of the high variance: is it because different trained networks learn highly different functions or because XAI methods become less faithful in the presence of correlation? Finally, we show XAI applied to superpixels, instead of single grid cells, substantially decreases attribution variance. Our study is the first to quantify the effects of strong correlation on XAI, to investigate the reasons that underlie these effects, and to offer a promising way to address them. 
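The instability that the XAI abstract describes can be reproduced in a much simpler setting: when two input features are strongly correlated and only one of them truly drives the target, the per-feature weights a fitted model assigns vary wildly across resamples of the data. The following sketch uses bootstrap-resampled ordinary least squares as a stand-in for retrained networks; the function name and constants are illustrative assumptions.

```python
import numpy as np

def weight_variance(rho, n=200, trials=50, seed=0):
    """Fit OLS on bootstrap resamples; return the std of the weight on x1.
    The target depends only on x1; x2 is correlated with x1 at level rho."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    y = X[:, 0] + 0.1 * rng.standard_normal(n)  # ground truth: only x1 matters
    w1 = []
    for _ in range(trials):
        idx = rng.integers(0, n, size=n)  # bootstrap resample
        w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        w1.append(w[0])
    return float(np.std(w1))

print(weight_variance(0.0))   # weak correlation: stable attribution
print(weight_variance(0.99))  # strong correlation: much larger spread
```

The high-correlation case yields many near-equivalent weight splits between x1 and x2, mirroring the abstract's observation that equally performing networks can carry very different attributions.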
  3. Unbounded productivity is a hallmark of linguistic competence. Here, we asked whether this capacity automatically applies to signs. Participants saw video-clips of novel signs in American Sign Language (ASL) produced by a signer whose body appeared in a monochromatic color, and they quickly identified the signs’ color. The critical manipulation compared reduplicative (αα) signs to non-reduplicative (αβ) controls. Past research has shown that reduplication is frequent in ASL, and frequent structures elicit stronger Stroop interference. If signers automatically generalize the reduplication function, then αα signs should elicit stronger color-naming interference. Results showed no effect of reduplication for signs whose base (α) consisted of native ASL features (possibly, due to the similarity of α items to color names). Remarkably, signers were highly sensitive to reduplication when the base (α) included novel features. These results demonstrate that signers can freely extend their linguistic knowledge to novel forms, and they do so automatically. Unbounded productivity thus defines all languages, irrespective of input modality. 
  4. Abstract Assessing forced climate change requires the extraction of the forced signal from the background of climate noise. Traditionally, tools for extracting forced climate change signals have focused on one atmospheric variable at a time; however, using multiple variables can reduce noise and allow for easier detection of the forced response. Following previous work, we train artificial neural networks to predict the year of single‐ and multi‐variable maps from forced climate model simulations. To perform this task, the neural networks learn patterns that allow them to discriminate between maps from different years—that is, the neural networks learn the patterns of the forced signal amidst the shroud of internal variability and climate model disagreement. When presented with combined input fields (multiple seasons, variables, or both), the neural networks are able to detect the signal of forced change earlier than when given single fields alone by utilizing complex, nonlinear relationships between multiple variables and seasons. We use layer‐wise relevance propagation, a neural network explainability tool, to identify the multivariate patterns learned by the neural networks that serve as reliable indicators of the forced response. These “indicator patterns” vary in time and between climate models, providing a template for investigating inter‐model differences in the time evolution of the forced response. This work demonstrates how neural networks and their explainability tools can be harnessed to identify patterns of the forced signal within combined fields.
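Layer-wise relevance propagation, mentioned in the abstract above, redistributes a network's output score backward so that each input receives a share proportional to its contribution. A minimal sketch of the epsilon-stabilized rule for one dense layer is below; this is a generic textbook form of LRP, not the specific implementation used in the cited work.

```python
import numpy as np

def lrp_dense(x, W, b, R_out, eps=1e-9):
    """LRP-epsilon rule for a dense layer z = W @ x + b:
    each input's relevance is its share of the pre-activations it feeds."""
    z = W @ x + b
    s = R_out / (z + eps * np.sign(z))  # stabilized ratio per output unit
    return x * (W.T @ s)                # redistribute back to the inputs
```

With zero bias, the rule (approximately) conserves relevance: the input relevances sum to the output relevance, which is what makes the resulting maps interpretable as a decomposition of the prediction.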
  5. Accurate traffic speed prediction is critical to many applications, from routing and urban planning to infrastructure management. With sufficient training data where all spatio-temporal patterns are well-represented, machine learning models such as Spatial-Temporal Graph Convolutional Networks (STGCN), can make reasonably accurate predictions. However, existing methods fail when the training data distribution (e.g., traffic patterns on regular days) is different from test distribution (e.g., traffic patterns on special days). We address this challenge by proposing a traffic-law-informed network called Reaction-Diffusion Graph Ordinary Differential Equation (RDGODE) network, which incorporates a physical model of traffic speed evolution based on a reliable and interpretable reaction-diffusion equation that allows the RDGODE to adapt to unseen traffic patterns. We show that with mismatched training data, RDGODE is more robust than the state-of-the-art machine learning methods in the following cases. (1) When the test dataset exhibits spatio-temporal patterns not represented in the training dataset, the performance of RDGODE is more consistent and reliable. (2) When the test dataset has missing data, RDGODE can maintain its accuracy by intrinsically imputing the missing values.
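The reaction-diffusion dynamics underlying the last abstract can be illustrated with one explicit Euler step on a small road graph. The diffusion term (graph Laplacian) spreads values between neighboring segments, and a reaction term pulls each segment back toward capacity. The logistic reaction, the constants, and the function name here are illustrative assumptions, not the RDGODE architecture itself.

```python
import numpy as np

def rd_step(u, A, alpha=0.1, beta=0.5, u_max=1.0, dt=0.1):
    """One explicit Euler step of a graph reaction-diffusion equation:
        du/dt = -alpha * L u + beta * u * (1 - u / u_max),
    where L = D - A is the graph Laplacian of adjacency matrix A.
    Diffusion exchanges speed between adjacent segments; the logistic
    reaction models local recovery toward the free-flow value u_max."""
    D = np.diag(A.sum(axis=1))
    L = D - A
    diffusion = -alpha * (L @ u)
    reaction = beta * u * (1.0 - u / u_max)
    return u + dt * (diffusion + reaction)

# three road segments in a line: 0 - 1 - 2
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
u = np.array([0.9, 0.2, 0.9])  # a congested segment between two free ones
print(rd_step(u, A))
```

Because the update only needs the current graph state and the equation's parameters, a model built on it can evolve speeds on patterns it never saw in training, which is the intuition behind the robustness claims in the abstract.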