Title: On the Robustness of Neural Models for Full Sentence Transformation
This paper describes the LECS Lab submission to the AmericasNLP 2024 Shared Task on the Creation of Educational Materials for Indigenous Languages. The task requires transforming a base sentence with regard to one or more linguistic properties (such as negation or tense). We observe that this task shares many similarities with the well-studied task of word-level morphological inflection, and we explore whether the findings from inflection research are applicable to this task. In particular, we experiment with a number of augmentation strategies, finding that they can significantly benefit performance, but that not all augmented data is necessarily beneficial. Furthermore, we find that our character-level neural models show high variability in performance on unseen data, and may not be the best choice when training data is limited.
Award ID(s):
2149404
PAR ID:
10539622
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ;
Publisher / Repository:
Association for Computational Linguistics
Date Published:
Page Range / eLocation ID:
159 to 173
Format(s):
Medium: X
Location:
Mexico City, Mexico
Sponsoring Org:
National Science Foundation
More Like this
  1. Life Cycle Analysis (LCA) has long been utilized for decision making about the sustainability of products. LCA provides information about the total emissions generated for a given functional unit of a product, which is utilized by industries or consumers for comparing two products with regard to environmental performance. However, many existing LCAs utilize data that is representative of an average system with regard to life cycle stage, thus providing an aggregate picture. It has been shown that regional variation may lead to large variation in the environmental impacts of a product, specifically in energy consumption, related emissions, and resource consumption. Hence, improving the reliability of LCA results for decision making with regard to environmental performance requires incorporating regional models to build a life cycle inventory that is representative of the origin of products from a certain region. In this work, we present the integration of regionalized data from process systems models and other sources to build regional LCA models and quantify the spatial variations in environmental impact per unit of biodiesel produced in the state of Indiana. In order to include regional variation, we have incorporated information about plant capacity for producing biodiesel from North and Central Indiana. The LCA model is cradle-to-gate. Once the region-specific models were built, the data were utilized in SimaPro to integrate with upstream processes to perform a life cycle impact assessment (LCIA). We report the results per liter of biodiesel from northern and central Indiana facilities in this work. The impact categories studied were global warming potential (kg CO2 eq) and freshwater eutrophication (kg P eq). While there was considerable variation at the individual county level, both regions had a similar global warming potential impact and the northern region had relatively lower eutrophication impacts.
  2. Gutkin, Boris S. (Ed.)
    Converging evidence suggests the brain encodes time in dynamic patterns of neural activity, including neural sequences, ramping activity, and complex dynamics. Most temporal tasks, however, require more than just encoding time, and can have distinct computational requirements, including the need to exhibit temporal scaling, generalize to novel contexts, or be robust to noise. It is not known how neural circuits can encode time and satisfy distinct computational requirements, nor is it known whether similar patterns of neural activity at the population level can exhibit dramatically different computational or generalization properties. To begin to answer these questions, we trained RNNs on two timing tasks based on behavioral studies. The tasks had different input structures but required producing identically timed output patterns. Using a novel framework, we quantified whether RNNs encoded two intervals using any of three different timing strategies: scaling, absolute, or stimulus-specific dynamics. We found that similar neural dynamic patterns at the level of single intervals could exhibit fundamentally different properties, including generalization, the connectivity structure of the trained networks, and the contribution of excitatory and inhibitory neurons. Critically, depending on the task structure, RNNs were better suited for generalization or robustness to noise. Further analysis revealed different connection patterns underlying the different regimes. Our results predict that apparently similar neural dynamic patterns at the population level (e.g., neural sequences) can exhibit fundamentally different computational properties with regard to their ability to generalize to novel stimuli and their robustness to noise, and that these differences are associated with differences in network connectivity and distinct contributions of excitatory and inhibitory neurons. We also predict that the task structure used in different experimental studies accounts for some of the experimentally observed variability in how networks encode time.
  3. Multilevel modeling and multi-task learning are two widely used approaches for modeling nested (multi-level) data, which contain observations that can be clustered into groups characterized by their group-level features. Despite the similarity of the problems they address, the explicit relationship between multilevel modeling and multi-task learning has not been carefully examined. In this paper, we present a comparative analysis of the two methods to illustrate their strengths and limitations when applied to two-level nested data. We provide a detailed analysis demonstrating the equivalence of their formulations under a mild condition from an optimization perspective. We also demonstrate their limitations in terms of predictive performance and, especially, their difficulty in identifying potential cross-scale interactions between the local and group-level features when applied to datasets with either a small number of groups or limited training examples per group. To overcome these limitations, we propose a novel method for disaggregating the coarse-scale values of the group-level features in the nested data. Experimental results on both synthetic and real-world data show that the disaggregated group-level features can help enhance the prediction accuracy of the models significantly and identify the cross-scale interactions more effectively.
  4. Recently, several task-parallel programming models have emerged to address the high synchronization and load imbalance issues as well as data movement overheads in modern shared memory architectures. OpenMP, the most commonly used shared memory parallel programming model, has added task execution support with dataflow dependencies. HPX and Regent are two more recent runtime systems that also support the dataflow execution model and extend it to distributed memory environments. We focus on parallelization of sparse matrix computations on shared memory architectures. We evaluate the OpenMP, HPX and Regent runtime systems in terms of performance and ease of implementation, and compare them against the traditional BSP model for two popular eigensolvers, Lanczos and LOBPCG. We give a general outline for achieving parallelism using these runtime systems, and present a heuristic for tuning their performance to balance tasking overheads with the degree of parallelism that can be exposed. We then demonstrate their merits on two architectures, Intel Broadwell (a multicore processor) and AMD EPYC (a modern manycore processor). We observe that these frameworks achieve up to 13.7× fewer cache misses than an efficient BSP implementation across L1, L2 and L3 cache layers. They also obtain up to 9.9× improvement in execution time over the same BSP implementation.
  5. Kim, Been (Ed.)
    Integrating and processing information from various sources or modalities are critical for obtaining a comprehensive and accurate perception of the real world in autonomous systems and cyber-physical systems. Drawing inspiration from neuroscience, we develop the Information-Theoretic Hierarchical Perception (ITHP) model, which utilizes the concept of information bottleneck. Unlike most traditional fusion models, which incorporate all modalities identically in neural networks, our model designates a prime modality and regards the remaining modalities as detectors in the information pathway, serving to distill the flow of information. Our proposed perception model focuses on constructing an effective and compact information flow by achieving a balance between the minimization of mutual information between the latent state and the input modal state, and the maximization of mutual information between the latent states and the remaining modal states. This approach leads to compact latent state representations that retain relevant information while minimizing redundancy, thereby substantially enhancing the performance of multimodal representation learning. Experimental evaluations on the MUStARD, CMU-MOSI, and CMU-MOSEI datasets demonstrate that our model consistently distills crucial information in multimodal learning scenarios, outperforming state-of-the-art benchmarks. Remarkably, on the CMU-MOSI dataset, ITHP surpasses human-level performance in the multimodal sentiment binary classification task across all evaluation metrics (i.e., Binary Accuracy, F1 Score, Mean Absolute Error, and Pearson Correlation).