One of the goals of natural language understanding is to develop models that map sentences into meaning representations. However, training such models requires expensive annotation of complex structures, which hinders their adoption. Learning to actively learn (LTAL) is a recent paradigm for reducing the amount of labeled data by learning a policy that selects which samples should be labeled. In this work, we examine LTAL for learning semantic representations, such as QA-SRL. We show that even an oracle policy that is allowed to pick examples that maximize performance on the test set (and thus constitutes an upper bound on the potential of LTAL) does not substantially improve performance compared to a random policy. We investigate factors that could explain this finding and show that a distinguishing characteristic of successful applications of LTAL is the interaction between optimization and the oracle policy selection process. In successful applications of LTAL, the examples selected by the oracle policy do not substantially depend on the optimization procedure, while in our setup the stochastic nature of optimization strongly affects the examples selected by the oracle. We conclude that the current applicability of LTAL for improving data efficiency in learning semantic meaning representations is limited.
This content will become publicly available on April 9, 2026
Algorithms for Affirmative Action
This paper illustrates how fundamental concepts from optimization, such as greedy algorithms, matroids, maximum weight matching, and NP-completeness, arise in domains where policymakers wish to select a set of applicants while ensuring representation for specific groups. Examples of such settings include visa lotteries in the United States, the election for Chile’s constitutional assembly, affordable housing lotteries in New York City, selection for Indian civil service positions, and admission to Indian and Brazilian universities. By providing these examples alongside sample exercises, I aim to offer educators tools to make optimization theory accessible to students at all levels, while highlighting its policy relevance. Supplemental Material: The online data files are available at https://doi.org/10.1287/ited.2023.0039.
- Award ID(s): 2339912
- PAR ID: 10627012
- Publisher / Repository: INFORMS Transactions on Education
- Date Published:
- Journal Name: INFORMS Transactions on Education
- ISSN: 1532-0545
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Modern fabrication methods have greatly simplified manufacturing of complex free-form shapes at an affordable cost, and opened up new possibilities for improving functionality and customization through automatic optimization, shape optimization in particular. However, most existing shape optimization methods focus on single parts. In this work, we focus on supporting shape optimization for assemblies, more specifically, assemblies that are held together by contact and friction. Examples include furniture joints, construction set assemblies, certain types of prosthetic devices, and many others. To enable this optimization, we present a framework supporting robust and accurate optimization of a number of important functionals, while enforcing constraints essential for assembly functionality: weight, stress, difficulty of putting the assembly together, and how reliably it stays together. Our framework is based on a smoothed formulation of elasticity equations with contact, analytically derived shape derivatives, and robust remeshing to enable large changes of shape while maintaining accuracy. We demonstrate the improvements it can achieve for a number of computational and experimental examples.
- The expectation is an example of a descriptive statistic that is monotone with respect to stochastic dominance, and additive for sums of independent random variables. We provide a complete characterization of such statistics, and explore a number of applications to models of individual and group decision‐making. These include a representation of stationary monotone time preferences, extending the work of Fishburn and Rubinstein (1982) to time lotteries. This extension offers a new perspective on risk attitudes toward time, as well as on the aggregation of multiple discount factors. We also offer a novel class of non‐expected utility preferences over gambles which satisfy invariance to background risk as well as betweenness, but are versatile enough to capture mixed risk attitudes.
- The ability to speak and understand a host country’s primary language is strongly associated with measures of immigrant integration. We estimate the causal effects of English language training for adult immigrants on participants’ civic and economic outcomes using randomized enrollment lotteries from a public adult education program in Massachusetts. Participation doubles voter participation and increases annual earnings by $2,400 (56 percent). Increased tax revenue from earnings gains covers program costs over time, generating a 6 percent return for taxpayers. Ours is the first randomized evaluation of adult English language training as a standalone intervention in the United States. (JEL D72, H75, I21, I26, J15, J24, J31)
- Modern data analytics applications, such as knowledge graph reasoning and machine learning, typically involve recursion through aggregation. Such computations pose great challenges to both system builders and theoreticians: first, to derive simple yet powerful abstractions for these computations; second, to define and study the semantics for the abstractions; third, to devise optimization techniques for these computations. In recent work we presented a generalization of Datalog called Datalog°, which addresses these challenges. Datalog° is a simple abstraction, which allows aggregates to be interleaved with recursion, and retains much of the simplicity and elegance of Datalog. We define its formal semantics based on an algebraic structure called Partially Ordered Pre-Semirings, and illustrate through several examples how Datalog° can be used for a variety of applications. Finally, we describe a new optimization rule for Datalog°, called the FGH-rule, then illustrate the FGH-rule on several examples, including a simple magic-set rewriting, generalized semi-naïve evaluation, and a bill-of-material example, and briefly discuss the implementation of the FGH-rule and present some experimental validation of its effectiveness.
