skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on December 16, 2025

Title: Uncovering Meanings of Embeddings via Partial Orthogonality
Machine learning tools often rely on embedding text as vectors of real numbers.In this paper, we study how the semantic structure of language is encoded in the algebraic structure of such embeddings.Specifically, we look at a notion of "semantic independence" capturing the idea that, e.g., "eggplant" and "tomato" are independent given "vegetable". Although such examples are intuitive, it is difficult to formalize such a notion of semantic independence. The key observation here is that any sensible formalization should obey a set of so-called independence axioms, and thus any algebraic encoding of this structure should also obey these axioms. This leads us naturally to use partial orthogonality as the relevant algebraic structure. We develop theory and methods that allow us to demonstrate that partial orthogonality does indeed capture semantic independence.Complementary to this, we also introduce the concept of independence preserving embeddings where embeddings preserve the conditional independence structures of a distribution, and we prove the existence of such embeddings and approximations to them.  more » « less
Award ID(s):
1956330
PAR ID:
10542247
Author(s) / Creator(s):
; ;
Publisher / Repository:
Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track
Date Published:
Subject(s) / Keyword(s):
machine learning embeddings semantic meaning Markov boundary partial orthogonality
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Raz, Ran (Ed.)
    We give upper and lower bounds on the power of subsystems of the Ideal Proof System (IPS), the algebraic proof system recently proposed by Grochow and Pitassi, where the circuits comprising the proof come from various restricted algebraic circuit classes. This mimics an established research direction in the boolean setting for subsystems of Extended Frege proofs whose lines are circuits from restricted boolean circuit classes. Essentially all of the subsystems considered in this paper can simulate the well-studied Nullstellensatz proof system, and prior to this work there were no known lower bounds when measuring proof size by the algebraic complexity of the polynomials (except with respect to degree, or to sparsity). Our main contributions are two general methods of converting certain algebraic lower bounds into proof complexity ones. Both require stronger arithmetic lower bounds than common, which should hold not for a specific polynomial but for a whole family defined by it. These may be likened to some of the methods by which Boolean circuit lower bounds are turned into related proof-complexity ones, especially the "feasible interpolation" technique. We establish algebraic lower bounds of these forms for several explicit polynomials, against a variety of classes, and infer the relevant proof complexity bounds. These yield separations between IPS subsystems, which we complement by simulations to create a partial structure theory for IPS systems. Our first method is a functional lower bound, a notion of Grigoriev and Razborov, which is a function f' from n-bit strings to a field, such that any polynomial f agreeing with f' on the boolean cube requires large algebraic circuit complexity. We develop functional lower bounds for a variety of circuit classes (sparse polynomials, depth-3 powering formulas, read-once algebraic branching programs and multilinear formulas) where f'(x) equals 1/p(x) for a constant-degree polynomial p depending on the relevant circuit class. We believe these lower bounds are of independent interest in algebraic complexity, and show that they also imply lower bounds for the size of the corresponding IPS refutations for proving that the relevant polynomial p is non-zero over the boolean cube. In particular, we show super-polynomial lower bounds for refuting variants of the subset-sum axioms in these IPS subsystems. Our second method is to give lower bounds for multiples, that is, to give explicit polynomials whose all (non-zero) multiples require large algebraic circuit complexity. By extending known techniques, we give lower bounds for multiples for various restricted circuit classes such sparse polynomials, sums of powers of low-degree polynomials, and roABPs. These results are of independent interest, as we argue that lower bounds for multiples is the correct notion for instantiating the algebraic hardness versus randomness paradigm of Kabanets and Impagliazzo. Further, we show how such lower bounds for multiples extend to lower bounds for refutations in the corresponding IPS subsystem. 
    more » « less
  2. Effect systems are lightweight extensions to type systems that can verify a wide range of important properties with modest developer burden. But our general understanding of effect systems is limited primarily to systems where the order of effects is irrelevant. Understanding such systems in terms of a semilattice of effects grounds understanding of the essential issues and provides guidance when designing new effect systems. By contrast, sequential effect systems—where the order of effects is important—lack an established algebraic structure on effects. We present an abstract polymorphic effect system parameterized by an effect quantale—an algebraic structure with well-defined properties that can model the effects of a range of existing sequential effect systems. We define effect quantales, derive useful properties, and show how they cleanly model a variety of known sequential effect systems. We show that for most effect quantales, there is an induced notion of iterating a sequential effect; that for systems we consider the derived iteration agrees with the manually designed iteration operators in prior work; and that this induced notion of iteration is as precise as possible when defined. We also position effect quantales with respect to work on categorical semantics for sequential effect systems, clarifying the distinctions between these systems and our own in the course of giving a thorough survey of these frameworks. Our derived iteration construct should generalize to these semantic structures, addressing limitations of that work. Finally, we consider the relationship between sequential effects and Kleene Algebras, where the latter may be used as instances of the former. 
    more » « less
  3. Traditional sentence embedding models encode sentences into vector representations to capture useful properties such as the semantic similarity between sentences. However, in addition to similarity, sentence semantics can also be interpreted via compositional operations such as sentence fusion or difference. It is unclear whether the compositional semantics of sentences can be directly reflected as compositional operations in the embedding space. To more effectively bridge the continuous embedding and discrete text spaces, we explore the plausibility of incorporating various compositional properties into the sentence embedding space that allows us to interpret embedding transformations as compositional sentence operations. We propose InterSent, an end-to-end framework for learning interpretable sentence embeddings that supports compositional sentence operations in the embedding space. Our method optimizes operator networks and a bottleneck encoder-decoder model to produce meaningful and interpretable sentence embeddings. Experimental results demonstrate that our method significantly improves the interpretability of sentence embeddings on four textual generation tasks over existing approaches while maintaining strong performance on traditional semantic similarity tasks. 
    more » « less
  4. We prove that the simplicial cocommutative coalgebra of singular chains on a connected topological space determines the homotopy type rationally and one prime at a time, without imposing any restriction on the fundamental group. In particular, the fundamental group and the homology groups with coefficients in arbitrary local systems of vector spaces are completely determined by the natural algebraic structure of the chains. The algebraic structure is presented as the class of the simplicial cocommutative coalgebra of chains under a notion of weak equivalence induced by a functor from coalgebras to algebras coined by Adams as the cobar construction. The fundamental group is determined by a quadratic equation on the zeroth homology of the cobar construction of the normalized chains which involves Steenrod’s chain homotopies for cocommutativity of the coproduct. The homology groups with local coefficients are modeled by an algebraic analog of the universal cover which is invariant under our notion of weak equivalence. We conjecture that the integral homotopy type is also determined by the simplicial coalgebra of integral chains, which we prove when the universal cover is of finite type. 
    more » « less
  5. Free monads (and their variants) have become a popular general-purpose tool for representing the semantics of effectful programs in proof assistants. These data structures support the compositional definition of semantics parameterized by uninterpreted events, while admitting a rich equational theory of equivalence. But monads are not the only way to structure effectful computation, why should we limit ourselves? In this paper, inspired by applicative functors, selective functors, and other structures, we define a collection of data structures and theories, which we call program adverbs, that capture a variety of computational patterns. Program adverbs are themselves composable, allowing them to be used to specify the semantics of languages with multiple computation patterns. We use program adverbs as the basis for a new class of semantic embeddings called Tlön embeddings. Compared with embeddings based on free monads, Tlön embeddings allow more flexibility in computational modeling of effects, while retaining more information about the program's syntactic structure. 
    more » « less