This content will become publicly available on March 1, 2026

Title: Linear Relational Decoding of Morphology in Language Models
A two-part affine approximation has been found to approximate transformer computations well over certain subject-object relations. Adapting the Bigger Analogy Test Set, we show that the linear transformation Ws, where s is a middle-layer representation of a subject token and W is derived from model derivatives, is also able to accurately reproduce final object states for many relations. This linear technique achieves 90% faithfulness on morphological relations, and we show similar findings multi-lingually and across models. Our findings indicate that some conceptual relationships in language models, such as morphology, are readily interpretable from latent space and are sparsely encoded by cross-layer linear transformations.
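The idea of decoding with Ws alone can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes that "derived from model derivatives" means a Jacobian of the final state with respect to the middle-layer state, averaged over a few example subjects, and it replaces the transformer with a hypothetical stand-in map `toy_map`. All function names, dimensions, and the cosine-similarity check are illustrative assumptions; the paper's faithfulness metric is not reproduced here.

```python
import math
import random


def toy_map(s, A):
    """Hypothetical stand-in for the model: o = tanh(A s), not a transformer."""
    return [math.tanh(sum(a * x for a, x in zip(row, s))) for row in A]


def matvec(W, s):
    """Plain matrix-vector product W s."""
    return [sum(w * x for w, x in zip(row, s)) for row in W]


def jacobian(f, s, eps=1e-5):
    """Central finite-difference Jacobian of f at s (rows: outputs, cols: inputs)."""
    columns = []
    for j in range(len(s)):
        up, down = list(s), list(s)
        up[j] += eps
        down[j] -= eps
        columns.append([(u - d) / (2 * eps) for u, d in zip(f(up), f(down))])
    # Each entry of `columns` is d f / d s_j; transpose to get the usual layout.
    return [list(row) for row in zip(*columns)]


def mean_jacobian(f, samples):
    """Assumed recipe for W: average the Jacobian over example subject states."""
    jacobians = [jacobian(f, s) for s in samples]
    n = len(jacobians)
    rows, cols = len(jacobians[0]), len(jacobians[0][0])
    return [[sum(J[i][j] for J in jacobians) / n for j in range(cols)]
            for i in range(rows)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def demo(dim=4, n_train=8, seed=0):
    """Build W from training states, then compare W s with the true output
    on a held-out state. Returns the cosine similarity of the two vectors."""
    rng = random.Random(seed)
    A = [[rng.uniform(-0.3, 0.3) for _ in range(dim)] for _ in range(dim)]

    def f(s):
        return toy_map(s, A)

    train = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_train)]
    held_out = [rng.uniform(-1, 1) for _ in range(dim)]
    W = mean_jacobian(f, train)
    return cosine(matvec(W, held_out), f(held_out))
```

Calling `print(demo())` reports how closely the purely linear prediction Ws aligns with the toy map's actual output on an unseen input. In a real model, the derivatives would more plausibly come from automatic differentiation than from finite differences; finite differences are used here only to keep the sketch dependency-free.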
Award ID(s):
2349452
PAR ID:
10655817
Author(s) / Creator(s):
;
Publisher / Repository:
NAACL, aclanthology.org
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this work, we present a Linear Matrix Inequality (LMI) based method to synthesize an optimal H∞ estimator for a large class of linear coupled partial differential equations (PDEs) utilizing only finite-dimensional measurements. Our approach extends the newly developed framework for representing and analyzing distributed parameter systems using operators on the space of square integrable functions that are equipped with multipliers and kernels of semi-separable class. We show that by redefining the state, the PDEs can be represented using operators that embed the boundary conditions and input-output relations explicitly. The optimal estimator synthesis problem is formulated as a convex optimization subject to LMIs that require no approximation or discretization. A scalable algorithm is presented to synthesize the estimator. The algorithm is illustrated by suitable examples.
  2. Training a semantic segmentation model requires large densely-annotated image datasets that are costly to obtain. Once the training is done, it is also difficult to add new object categories to such segmentation models. In this paper, we tackle the few-shot semantic segmentation problem, which aims to perform image segmentation on unseen object categories merely based on one or a few support example(s). The key to solving this few-shot segmentation problem lies in effectively utilizing object information from support examples to separate target objects from the background in a query image. While existing methods typically generate object-level representations by averaging local features in support images, we demonstrate that such object representations are typically noisy and less distinguishing. To solve this problem, we design an object representation generator (ORG) module which can effectively aggregate local object features from support image(s) and produce better object-level representation. The ORG module can be embedded into the network and trained end-to-end in a weakly-supervised fashion without extra human annotation. We incorporate this design into a modified encoder-decoder network to present a powerful and efficient framework for few-shot semantic segmentation. Experimental results on the Pascal-VOC and MS-COCO datasets show that our approach achieves better performance compared to existing methods under both one-shot and five-shot settings.
  3. A petrophysical model that accurately relates bulk electrical conductivity (σ) to pore fluid conductivity (σw) is critical to the interpretation of geophysical measurements. Classical models are either only applicable over a limited salinity regime or incorrectly explain the nonlinear-to-linear behavior of the σ(σw) relationship. In this study, asymptotic limits at zero and infinite salinity are first established in which σ is expressed as a linear function of σw with four parameters: cementation exponent (m), the equivalent value of volumetric surface electrical conductivity (σs), the volume fraction of overlapped diffuse layer (ϕod) and parameter χ representing the ratio of the volume fraction of the water phase to that of the solid phases in the surface conduction pathway. Subsequently, we bridge the gap between the two extremes by employing the Padé approximant (PA). Given that parameter χ exhibits a marginal influence on the σ(σw) curve, based on measurements for 15 samples, we identify its optimal value to be 0.4. After setting the optimal value of χ, we proceed to evaluate the performance of the PA model by comparing its estimates and estimates made by two existing models to measured values from 27 rock samples and eight sediment samples. The comparison confirms that the PA model estimates are more accurate than estimates made by existing models, particularly at low salinity and for samples with higher cation exchange capacity. The PA model is advantageous in scenarios involving the interpretation of electrical data in freshwater environments.
  4. Language models are typically evaluated on their success at predicting the distribution of specific words in specific contexts. Yet linguistic knowledge also encodes relationships between contexts, allowing inferences between word distributions. We investigate the degree to which pre-trained transformer-based large language models (LLMs) represent such relationships, focusing on the domain of argument structure. We find that LLMs perform well in generalizing the distribution of a novel noun argument between related contexts that were seen during pre-training (e.g., the active object and passive subject of the verb spray), succeeding by making use of the semantically organized structure of the embedding space for word embeddings. However, LLMs fail at generalizations between related contexts that have not been observed during pre-training, but which instantiate more abstract, but well-attested structural generalizations (e.g., between the active object and passive subject of an arbitrary verb). Instead, in this case, LLMs show a bias to generalize based on linear order. This finding points to a limitation with current models and points to a reason for which their training is data-intensive.
  5. Large-scale software verification relies critically on the use of compositional languages, semantic models, specifications, and verification techniques. Recent work on certified abstraction layers synthesizes game semantics, the refinement calculus, and algebraic effects to enable the composition of heterogeneous components into larger certified systems. However, in existing models of certified abstraction layers, compositionality is restricted by the lack of encapsulation of state. In this paper, we present a novel game model for certified abstraction layers where the semantics of layer interfaces and implementations are defined solely based on their observable behaviors. Our key idea is to leverage Reddy's pioneering work on modeling the semantics of imperative languages not as functions on global states but as objects with their observable behaviors. We show that a layer interface can be modeled as an object type (i.e., a layer signature) plus an object strategy. A layer implementation is then essentially a regular map, in the sense of Reddy, from an object with the underlay signature to that with the overlay signature. A layer implementation is certified when its composition with the underlay object strategy implements the overlay object strategy. We also describe an extension that allows for non-determinism in layer interfaces. After formulating layer implementations as regular maps between object spaces, we move to concurrency and design a notion of concurrent object space, where sequential traces may be identified modulo permutation of independent operations. We show how to express protected shared object concurrency, and a ticket lock implementation, in a simple model based on regular maps between concurrent object spaces.