skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Teacher Improves Learning by Selecting a Training Subset
We call a learner super-teachable if a teacher can trim down an iid training set while making the learner learn even better. We provide sharp super-teaching guarantees on two learners: the maximum likelihood estimator for the mean of a Gaussian, and the large margin classifier in 1D. For general learners, we provide a mixed-integer nonlinear programming-based algorithm to find a super teaching set. Empirical experiments show that our algorithm is able to find good super-teaching sets for both regression and classification problems.  more » « less
Award ID(s):
1712596 1740751
PAR ID:
10059913
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Proceedings of Machine Learning Research
Volume:
84
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We study the problem of active learning with the added twist that the learner is assisted by a helpful teacher. We consider the following natural interaction protocol: At each round, the learner proposes a query asking for the label of an instance xq, the teacher provides the requested label {xq,yq} along with explanatory information to guide the learning process. In this paper, we view this information in the form of an additional contrastive example ({xc,yc}) where xc is picked from a set constrained by xq (e.g., dissimilar instances with the same label). Our focus is to design a teaching algorithm that can provide an informative sequence of contrastive examples to the learner to speed up the learning process. We show that this leads to a challenging sequence optimization problem where the algorithm's choices at a given round depend on the history of interactions. We investigate an efficient teaching algorithm that adaptively picks these contrastive examples. We derive strong performance guarantees for our algorithm based on two problem-dependent parameters and further show that for specific types of active learners (e.g., a generalized binary search learner), the proposed teaching algorithm exhibits strong approximation guarantees. Finally, we illustrate our bounds and demonstrate the effectiveness of our teaching framework via two numerical case studies. 
    more » « less
  2. Machine teaching has traditionally been constrained by the assumption of a fixed learner’s model. In this paper, we challenge this notion by proposing a novel black-box Markov learner model, drawing inspiration from decision psychology and neuroscience where learners are often viewed as black boxes with adaptable parameters. We model the learner’s dynamics as a Markov decision process (MDP) with unknown parameters, encompassing a wide range of learner types studied in machine teaching literature. This approach reduces teaching complexity to finding an optimal policy for the underlying MDP. Building on this, we introduce an algorithm for teaching in this black-box setting and provide an analysis of teaching costs under different scenarios. We further establish a connection between our model and two types of learners in psychology and neuroscience, the epiphany learner and the non-epiphany learner, linking them with discounted and non-discounted black-box Markov learners respectively. This alignment offers a psychologically and neuroscientifically grounded perspective to our work. Supported by numerical study results, this paper delivers a significant contribution to machine teaching, introducing a robust, versatile learner model with a rigorous theoretical foundation. 
    more » « less
  3. null (Ed.)
    One widely-studied model of teaching calls for a teacher to provide the minimal set of labeled examples that uniquely specifies a target concept. The assumption is that the teacher knows the learner’s hypothesis class, which is often not true of real-life teaching scenarios. We consider the problem of teaching a learner whose representation and hypothesis class are unknown—that is, the learner is a black box. We show that a teacher who does not interact with the learner can do no better than providing random examples. We then prove, however, that with interaction, a teacher can efficiently find a set of teaching examples that is a provably good approximation to the optimal set. As an illustration, we show how this scheme can be used to shrink training sets for any family of classifiers: that is, to find an approximately-minimal subset of training instances that yields the same classifier as the entire set. 
    more » « less
  4. null (Ed.)
    In distributed systems, a group of learners achieve consensus when, by observing the output of some acceptors, they all arrive at the same value. Consensus is crucial for ordering transactions in failure-tolerant systems. Traditional consensus algorithms are homogeneous in three ways: (1) all learners are treated equally, (2) all acceptors are treated equally, and (3) all failures are treated equally.These assumptions, however, are unsuitable for cross-domain applications, including blockchains, where not all acceptors are equally trustworthy, and not all learners have the same assumptions and priorities. We present the first algorithm to be heterogeneous in all three respects. Learners set their own mixed failure tolerances over differently trusted sets of acceptors. We express these assumptions in a novel Learner Graph, and demonstrate sufficient conditions for consensus. We present Heterogeneous Paxos, an extension of Byzantine Paxos. Heterogeneous Paxos achieves consensus for any viable Learner Graph in best-case three message sends, which is optimal. We present a proof-of-concept implementation and demonstrate how tailoring for heterogeneous scenarios can save resources and reduce latency. 
    more » « less
  5. Simulated learners represent computational theories of human learning that can be used to evaluate educational technologies, provide practice opportunities for teachers, and advance our theoretical understanding of human learning. A key challenge in working with simulated learners is evaluating the accuracy of the simulation compared to the behavior of real human students. One way this evaluation is done is by comparing the error-rate learning curves from a population of human learners and a corresponding set of simulated learners. In this paper, we argue that this approach misses an opportunity to more accurately capture nuances in learning by treating all errors as the same. We present a simulated learner system, the Apprentice Learner (AL) Architecture, and use this more nuanced evaluation to demonstrate ways in which it does and does not explain and accurately predict student learning in terms of the reduction of different kinds of errors over time as it learns, as human students do, from an Intelligent Tutoring System (ITS). 
    more » « less