

Search for: All records

Creators/Authors contains: "Shafto, Patrick"


  1. Obtaining solutions to optimal transportation (OT) problems is typically intractable when the marginal spaces are continuous. Recent research has focused on approximating continuous solutions with discretization methods based on i.i.d. sampling, which have been shown to converge as the sample size increases. However, obtaining OT solutions with large sample sizes requires intensive computational effort, which can be prohibitive in practice. In this paper, we propose an algorithm that computes discretizations with a given number of weighted points for the marginal distributions by minimizing the (entropy-regularized) Wasserstein distance, and we provide bounds on its performance. The results suggest that our plans are comparable to those obtained with much larger numbers of i.i.d. samples and are more efficient than existing alternatives. Moreover, we propose a local, parallelizable version of such discretizations for applications, which we demonstrate by approximating images.

     
    Free, publicly-accessible full text available June 1, 2024
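
    A minimal sketch of the quantization objective behind entry 1, in Python. This is not the authors' algorithm: as an assumption for illustration, it substitutes a plain Lloyd-style (k-means) update for the entropy-regularized Wasserstein minimization, since for the squared-Euclidean cost the Lloyd iteration performs coordinate descent on the 2-Wasserstein error between the weighted points and the empirical target; the function name and toy target are hypothetical.

        import numpy as np

        def lloyd_discretization(target_samples, n_points, n_iters=50, seed=0):
            """Approximate a distribution (given via samples) by n_points
            weighted support points: alternately assign samples to the
            nearest point and move each point to its cell's mean."""
            rng = np.random.default_rng(seed)
            points = target_samples[rng.choice(len(target_samples), n_points,
                                               replace=False)].copy()
            for _ in range(n_iters):
                # Assignment step: nearest support point for every sample.
                d2 = ((target_samples[:, None, :] - points[None, :, :]) ** 2).sum(-1)
                assign = d2.argmin(axis=1)
                # Update step: each support point moves to its cell's mean.
                for k in range(n_points):
                    cell = target_samples[assign == k]
                    if len(cell):
                        points[k] = cell.mean(axis=0)
            # The weight of a point is the fraction of target mass in its cell.
            weights = np.bincount(assign, minlength=n_points) / len(target_samples)
            return points, weights

        # Toy usage: a 16-point weighted discretization of a 2-D Gaussian.
        samples = np.random.default_rng(0).normal(size=(5000, 2))
        points, weights = lloyd_discretization(samples, n_points=16)
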
  2.
    It is desirable to combine the expressive power of deep learning with Gaussian processes (GPs) in one expressive Bayesian learning model. Deep kernel learning showed success by using a deep network for feature extraction and a GP as the function model. Recently, it was suggested that, although trained with the marginal likelihood, the deterministic nature of the feature extractor might lead to overfitting, and that replacing it with a Bayesian network seemed to cure the problem. Here, we propose the conditional deep Gaussian process (DGP), in which the intermediate GPs in the hierarchical composition are supported by hyperdata while the exposed GP remains zero mean. Motivated by the inducing points in sparse GPs, the hyperdata also play the role of function supports, but they are hyperparameters rather than random variables. Following our previous moment-matching approach, we approximate the marginal prior of the conditional DGP with a GP carrying an effective kernel. Thus, as in empirical Bayes, the hyperdata are learned by optimizing the approximate marginal likelihood, which depends on the hyperdata implicitly via the kernel. We show equivalence with deep kernel learning in the limit of dense hyperdata in latent space; however, the conditional DGP and the corresponding approximate inference enjoy the benefit of being more Bayesian than deep kernel learning. Preliminary extrapolation results demonstrate the expressive power gained from the depth of the hierarchy, by exploiting the exact covariance and hyperdata learning, in comparison with GP kernel composition, DGP variational inference, and deep kernel learning. We also address the non-Gaussian aspects of our model and ways of upgrading to full Bayesian inference.
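
    A toy empirical-Bayes sketch of the "hyperdata as hyperparameters" idea in entry 2, under heavy assumptions: a single inner GP's posterior mean, conditioned on m hyperdata pairs (Z, U), warps the inputs of an outer zero-mean GP, and (Z, U) are fit by maximizing the marginal likelihood. This is a minimal stand-in, not the paper's conditional DGP or its moment-matched effective kernel; all names are ours.

        import numpy as np
        from scipy.optimize import minimize

        def rbf(A, B, ls=1.0):
            # Squared-exponential kernel between the rows of A and B.
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-0.5 * d2 / ls ** 2)

        def neg_log_marginal(theta, X, y, m, noise=1e-2):
            """Inner GP posterior mean, conditioned on m hyperdata pairs
            (Z, U), warps the inputs; an outer zero-mean GP models y on the
            warped inputs. (Z, U) are hyperparameters, not random variables.
            The additive constant of the log marginal likelihood is dropped."""
            Z = theta[:m].reshape(m, 1)                  # hyperdata inputs
            U = theta[m:2 * m]                           # hyperdata outputs
            Kzz = rbf(Z, Z) + 1e-6 * np.eye(m)
            warped = (rbf(X, Z) @ np.linalg.solve(Kzz, U)).reshape(-1, 1)
            K = rbf(warped, warped) + noise * np.eye(len(X))
            L = np.linalg.cholesky(K)
            alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
            return 0.5 * y @ alpha + np.log(np.diag(L)).sum()

        # Toy 1-D regression: learn 5 hyperdata pairs by empirical Bayes.
        rng = np.random.default_rng(0)
        X = rng.uniform(-3, 3, size=(40, 1))
        y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=40)
        theta0 = np.concatenate([np.linspace(-3, 3, 5), np.zeros(5)])
        result = minimize(neg_log_marginal, theta0, args=(X, y, 5),
                          method="L-BFGS-B")
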
  3. Abstract

    State-of-the-art deep-learning systems use decision rules that are challenging for humans to model. Explainable AI (XAI) attempts to improve human understanding but rarely accounts for how people typically reason about unfamiliar agents. We propose explicitly modelling the human explainee via Bayesian teaching, which evaluates explanations by how much they shift explainees’ inferences toward a desired goal. We assess Bayesian teaching in a binary image classification task across a variety of contexts. Absent intervention, participants predict that the AI’s classifications will match their own, but explanations generated by Bayesian teaching improve their ability to predict the AI’s judgements by moving them away from this prior belief. Bayesian teaching further allows each case to be broken down into sub-examples (here, saliency maps). These sub-examples complement whole examples by improving error detection for familiar categories, whereas whole examples help predict correct AI judgements of unfamiliar cases.
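
    In the simplest discrete case, the selection rule behind Bayesian teaching in entry 3 reduces to scoring each candidate explanation by the posterior it induces on the target hypothesis in a simulated learner. A minimal sketch under that assumption; the toy hypothesis space and likelihoods are hypothetical.

        import numpy as np

        def bayesian_teaching_scores(likelihood, prior, target):
            """Score candidate explanations by the posterior they induce on a
            target hypothesis in a simulated learner.
            likelihood[h, e]: P(e | h) for hypothesis h, explanation e.
            prior[h]: the learner's prior over hypotheses.
            Returns P_learner(target | e) for every candidate e."""
            joint = likelihood * prior[:, None]       # P(e | h) P(h)
            posterior = joint / joint.sum(axis=0)     # P(h | e), per column
            return posterior[target]

        # Toy example: 3 hypotheses, 4 candidate explanations.
        lik = np.array([[0.7, 0.1, 0.2, 0.5],
                        [0.2, 0.8, 0.3, 0.3],
                        [0.1, 0.1, 0.5, 0.2]])
        prior = np.array([1 / 3, 1 / 3, 1 / 3])
        scores = bayesian_teaching_scores(lik, prior, target=0)
        best = scores.argmax()   # explanation that best teaches hypothesis 0
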
  4. Pham, Tien; Solomon, Latasha; Hohil, Myron E. (Eds.)
  5. Abstract

    What maximizes instructional impact in early childhood? We propose a simple intervention employing “Pedagogical Questions”. We explore whether swapping some instructional language with questions in psychosomatic storybooks improves preschoolers’ memory, learning, and generalization. Seventy-two preschoolers were randomly assigned to one of three conditions and were read storybooks employing either Direct Instruction, Pedagogical Questions, or Control content. Posttest measures of psychosomatic understanding, judgments about the possibility of psychosomatic events, and memory for storybook details showed that children in the Pedagogical Questions condition demonstrated greater memory for relevant storybook details and improved psychosomatic understanding. Our results suggest that pedagogical questions are a relatively simple educational manipulation to improve memory, learning, and transfer of theory-rich content.

     
  6.
    In studies involving human subjects, voluntary participation may lead to sampling bias, thus limiting the generalizability of findings. This effect may be especially pronounced in developmental studies, where parents serve as both the primary environmental input and the decision maker of whether their child participates in a study. We present a novel empirical and modeling approach to estimate how parental consent may bias measurements of children’s behavior. Specifically, we coupled naturalistic observations of parent–child interactions in public spaces with a behavioral test with children, and used modeling methods to impute the behavior of children who did not participate. Results showed that parents’ tendency to use questions to teach was associated with both children’s behavior in the test and parents’ tendency to participate. Exploiting these associations with model-based multiple imputation and a propensity score–matching procedure, we estimated that the means of the participating and non-participating groups could differ by as much as 0.23 standard deviations on the test measurements, and that the standard deviations themselves are likely underestimated. These results suggest that ignoring factors associated with consent may lead to systematic biases when generalizing beyond lab samples, and the proposed general approach provides a way to estimate these biases in future research.
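
    A generic sketch of the propensity-score side of entry 6, with hypothetical variable names and only the matching step (the paper additionally uses model-based multiple imputation): fit consent propensity from covariates observed for all families, impute each non-consenting child's test score from the nearest consenting match, and report the group gap in standard-deviation units.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def consent_bias_gap(X, consented, y_obs):
            """Estimate the standardized difference in a test score between
            consenting and non-consenting groups. X: covariates observed for
            all families (e.g., coded parent-child interaction features);
            consented: 0/1 per family; y_obs: scores, NaN where unmeasured."""
            # Propensity to consent, from covariates available for everyone.
            ps = LogisticRegression().fit(X, consented).predict_proba(X)[:, 1]
            cons = np.where(consented == 1)[0]
            noncons = np.where(consented == 0)[0]
            # 1-nearest-neighbour match on the propensity score: impute each
            # non-consenting child's score from the closest consenting child.
            nearest = cons[np.abs(ps[noncons][:, None]
                                  - ps[cons][None, :]).argmin(1)]
            gap = np.nanmean(y_obs[cons]) - y_obs[nearest].mean()
            return gap / np.nanstd(y_obs)   # in standard-deviation units

    The 1-NN match is the crudest defensible choice here; caliper matching or inverse-propensity weighting are standard refinements.
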
  7. Abstract

    Neural network architectures are achieving superhuman performance on an expanding range of tasks. To effectively and safely deploy these systems, their decision‐making must be understandable to a wide range of stakeholders. Methods to explain artificial intelligence (AI) have been proposed to answer this challenge, but a lack of theory impedes the development of systematic abstractions, which are necessary for cumulative knowledge gains. We propose Bayesian Teaching as a framework for unifying explainable AI (XAI) by integrating machine learning and human learning. Bayesian Teaching formalizes explanation as a communication act of an explainer to shift the beliefs of an explainee. This formalization decomposes a wide range of XAI methods into four components: (a) the target inference, (b) the explanation, (c) the explainee model, and (d) the explainer model. The abstraction afforded by Bayesian Teaching to decompose XAI methods elucidates the invariances among them. The decomposition of XAI systems enables modular validation, as each of the first three components listed can be tested semi‐independently. This decomposition also promotes generalization through recombination of components from different XAI systems, which facilitates the generation of novel variants. These new variants need not be evaluated one by one provided that each component has been validated, leading to an exponential decrease in development time. Finally, by making the goal of explanation explicit, Bayesian Teaching helps developers to assess how suitable an XAI system is for its intended real‐world use case. Thus, Bayesian Teaching provides a theoretical framework that encourages systematic, scientific investigation of XAI.

     
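
    Entry 7's four-component decomposition suggests a natural modular interface. The sketch below is our illustration, not the paper's code: each component is a pluggable value or callable, so explainee models and explanation candidates can be swapped and validated semi-independently, as the abstract describes.

        from dataclasses import dataclass
        from typing import Any, Callable, Sequence

        @dataclass
        class XAISystem:
            """The four-part decomposition from the abstract as a plain
            container; the types and method are assumptions for illustration."""
            target_inference: Any          # (a) what the explanation is about
            explanations: Sequence         # (b) candidate explanation artifacts
            explainee_model: Callable      # (c) explanation -> learner beliefs
            explainer_model: Callable      # (d) (beliefs, target) -> score

            def best_explanation(self):
                # The explainer ranks candidates by how far the simulated
                # explainee's beliefs move toward the target inference.
                return max(self.explanations,
                           key=lambda e: self.explainer_model(
                               self.explainee_model(e), self.target_inference))
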
  8.