Theory of mind, the ability to model others’ thoughts and desires, is a cornerstone of human social intelligence. This makes it an important challenge for the machine learning community, but previous work has mainly attempted to design agents that model the "mental state" of others as passive observers or in specific predefined roles, such as speaker-listener scenarios. In contrast, we propose to model machine theory of mind in a more general, symmetric scenario. We introduce SymmToM, a multi-agent environment where, as in real life, all agents can speak, listen, see other agents, and move freely through the world. Effective strategies for maximizing an agent’s reward require it to develop a theory of mind. We show that reinforcement learning agents that model the mental states of others achieve significant performance improvements over agents with no such theory-of-mind model. Importantly, our best agents still fail to achieve performance comparable to agents with access to the gold-standard mental states of other agents, demonstrating that modeling theory of mind in multi-agent scenarios remains very much an open challenge.
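To make the symmetric setting concrete, the sketch below shows the kind of first-order belief bookkeeping such an agent might perform: tracking which facts each other agent could have heard, given who was within hearing range when something was said. Everything here (the `Agent` class, grid positions, the `hearing_range` parameter) is an illustrative assumption, not the actual SymmToM environment or API.

```python
# Minimal sketch of first-order belief tracking in a SymmToM-like world.
# The Agent class, grid positions, and hearing_range are illustrative
# assumptions, not the paper's environment: each agent records which
# facts every other agent could know, given who was in earshot.

from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    pos: tuple                                  # (x, y) grid position
    knows: set = field(default_factory=set)     # facts this agent holds
    belief: dict = field(default_factory=dict)  # name -> facts it thinks they hold

def in_range(a, b, hearing_range=2):
    """Chebyshev-distance check: can b hear a speak?"""
    return max(abs(a.pos[0] - b.pos[0]), abs(a.pos[1] - b.pos[1])) <= hearing_range

def speak(speaker, fact, agents):
    """Everyone in range learns the fact, and also updates its beliefs
    about the *other* hearers -- the core theory-of-mind bookkeeping."""
    hearers = [a for a in agents if in_range(speaker, a)]
    for h in hearers:
        h.knows.add(fact)
        for other in hearers:
            if other is not h:
                h.belief.setdefault(other.name, set()).add(fact)

agents = [Agent("A", (0, 0)), Agent("B", (1, 1)), Agent("C", (5, 5))]
speak(agents[0], "red-channel-info", agents)
print(agents[1].belief)  # {'A': {'red-channel-info'}} -- B models what A heard
print(agents[2].knows)   # set() -- C was out of range
```

A learned agent could feed such belief sets into its policy as features; that is the general flavor of the theory-of-mind models the abstract compares against belief-free baselines.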
Machine learning for evaluating and improving theories
We summarize our recent work that uses machine learning techniques as a complement to theoretical modeling, rather than a substitute for it. The key concepts are those of the completeness and restrictiveness of a model. A theory's completeness is how much it improves predictions over a naive baseline, relative to how much improvement is possible. When a theory is relatively incomplete, machine learning algorithms can help reveal regularities that the theory doesn't capture, and thus lead to the construction of theories that make more accurate predictions. Restrictiveness measures a theory's ability to match arbitrary hypothetical data: A very unrestrictive theory will be complete on almost any data, so the fact that it is complete on the actual data is not very instructive. We algorithmically quantify restrictiveness by measuring how well the theory approximates randomly generated behaviors. Finally, we propose "algorithmic experimental design" as a method to help select which experiments to run.
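To make the two measures concrete, here is a minimal sketch of how completeness and restrictiveness could be computed from prediction errors. The function names, the [0, 1] normalization, and the toy numbers are assumptions for illustration; the paper's exact estimators may differ.

```python
# Hedged sketch of the completeness / restrictiveness measures described
# above. naive_err, theory_err, and best_err would come from
# cross-validated prediction errors; the paper's exact estimators and
# normalizations may differ -- this only mirrors the definitions.

import numpy as np

def completeness(naive_err, theory_err, best_err):
    """Fraction of the achievable improvement over the naive baseline
    that the theory actually delivers (1.0 = fully complete)."""
    return (naive_err - theory_err) / (naive_err - best_err)

def restrictiveness(theory_fit, random_datasets):
    """Average inability of the theory to match randomly generated
    behaviors: 1.0 = maximally restrictive, 0.0 = fits anything.
    theory_fit(data) returns the theory's best achievable error on
    data, normalized to [0, 1]."""
    return float(np.mean([theory_fit(d) for d in random_datasets]))

# Toy usage: a theory that recovers 60% of the learnable structure.
print(completeness(naive_err=0.40, theory_err=0.28, best_err=0.20))  # 0.6
```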
- Award ID(s): 1951056
- PAR ID: 10271869
- Date Published:
- Journal Name: ACM SIGecom Exchanges
- Volume: 18
- Issue: 1
- ISSN: 1551-9031
- Page Range / eLocation ID: 4 to 11
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Recent work has shown that the input-output behavior of some common machine learning classifiers can be captured in symbolic form, allowing one to reason about the behavior of these classifiers using symbolic techniques. This includes explaining decisions, measuring robustness, and proving formal properties of machine learning classifiers by reasoning about the corresponding symbolic classifiers. In this work, we present a theory for unveiling the reasons behind the decisions made by Boolean classifiers and study some of its theoretical and practical implications. At the core of our theory is the notion of a complete reason, which can be viewed as a necessary and sufficient condition for why a decision was made. We show how the complete reason can be used for computing notions such as sufficient reasons (also known as PI-explanations and abductive explanations), how it can be used for determining decision and classifier bias, and how it can be used for evaluating counterfactual statements such as "a decision will stick even if ... because ...". We present a linear-time algorithm for computing the complete reason behind a decision, assuming the classifier is represented by a Boolean circuit of appropriate form. We then show how the computed complete reason can be used to answer many queries about a decision in linear or polynomial time. We finally conclude with a case study that illustrates the various notions and techniques we introduced.
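The toy sketch below illustrates the definition of sufficient reasons (PI-explanations) by exhaustive enumeration. The paper's contribution is computing the complete reason in linear time on suitably structured circuits, which this brute-force version does not attempt; the toy classifier and all names are illustrative assumptions.

```python
# Brute-force computation of sufficient reasons (PI-explanations) for a
# tiny Boolean classifier: minimal sets of the instance's literals that,
# once fixed, force the same decision regardless of the other features.
# This demonstrates the definition only, not the linear-time algorithm.

from itertools import combinations, product

def is_sufficient(f, x, subset):
    """Fixing the features in subset to their values in x must force f(x)."""
    free = [i for i in range(len(x)) if i not in subset]
    for vals in product([0, 1], repeat=len(free)):
        y = list(x)
        for i, v in zip(free, vals):
            y[i] = v
        if f(y) != f(x):
            return False
    return True

def sufficient_reasons(f, x):
    reasons = []
    for k in range(len(x) + 1):                 # smallest subsets first
        for subset in combinations(range(len(x)), k):
            # keep only minimal subsets: no proper subset already suffices
            if is_sufficient(f, x, set(subset)) and \
               not any(set(r) < set(subset) for r in reasons):
                reasons.append(subset)
    return reasons

# Toy classifier: approve if (good_credit AND employed) OR homeowner.
f = lambda v: (v[0] and v[1]) or v[2]
print(sufficient_reasons(f, [1, 1, 1]))  # [(2,), (0, 1)]
```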
-
In order to manage the public health crisis associated with COVID-19, it is critically important that healthcare workers can quickly identify high-risk patients in order to provide effective treatment with limited resources. Statistical learning tools have the potential to help predict serious infection early on in the progression of the disease. However, many of these techniques are unable to take full advantage of temporal data on a per-patient basis as they handle the problem as a single-instance classification. Furthermore, these algorithms rely on complete data to make their predictions. In this work, we present a novel approach to handle the temporal and missing data problems simultaneously; our proposed Simultaneous Imputation-Multi Instance Support Vector Machine method illustrates how multiple instance learning techniques and low-rank data imputation can be utilized to accurately predict clinical outcomes of COVID-19 patients. We compare our approach against recent methods used to predict outcomes on a public dataset with a cohort of 361 COVID-19 positive patients. In addition to improved prediction performance early on in the progression of the disease, our method identifies a collection of biomarkers associated with the liver, immune system, and blood that deserve additional study and may provide additional insight into causes of patient mortality due to COVID-19. We publish the source code for our method online.
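As a rough illustration of the low-rank imputation ingredient, here is a generic SoftImpute-style sketch that fills missing entries with a low-rank SVD reconstruction. This is not the authors' Simultaneous Imputation-Multi Instance SVM; the rank, iteration count, and toy data are all assumptions.

```python
# Generic low-rank imputation of the kind the approach above builds on
# (SoftImpute-style SVD thresholding). NOT the authors' exact method --
# only a sketch of filling missing lab values by exploiting low-rank
# structure across patients and time points.

import numpy as np

def soft_impute(X, mask, rank=5, n_iters=100):
    """X: data matrix; mask: True where observed. Iteratively replace
    the missing entries with a rank-`rank` reconstruction."""
    Z = np.where(mask, X, 0.0)           # initialize missing entries at 0
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        s[rank:] = 0.0                   # hard-threshold to a low-rank fit
        low_rank = (U * s) @ Vt
        Z = np.where(mask, X, low_rank)  # keep observed values fixed
    return Z

rng = np.random.default_rng(0)
true = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 20))  # rank-3 data
mask = rng.random(true.shape) > 0.3                         # ~30% missing
filled = soft_impute(true, mask, rank=3)
print(np.abs(filled - true)[~mask].mean())                  # small error
```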
-
Transfer learning has become an increasingly popular technique in machine learning as a way to leverage a pretrained model trained for one task to assist with building a finetuned model for a related task. This paradigm has been especially popular for privacy in machine learning, where the pretrained model is considered public, and only the data for finetuning is considered sensitive. However, there are reasons to believe that the data used for pretraining is still sensitive, making it essential to understand how much information the finetuned model leaks about the pretraining data. In this work, we propose a new membership-inference threat model where the adversary only has access to the finetuned model and would like to infer the membership of the pretraining data. To realize this threat model, we implement a novel metaclassifier-based attack, TMI, that leverages the influence of memorized pretraining samples on predictions in the downstream task. We evaluate TMI on both vision and natural language tasks across multiple transfer learning settings, including finetuning with differential privacy. Through our evaluation, we find that TMI can successfully infer membership of pretraining examples using query access to the finetuned model.
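For orientation, the sketch below shows the generic shadow-model recipe that metaclassifier membership-inference attacks build on: train shadow models on known member/non-member splits, then train a meta-classifier on their output confidences. TMI's actual attack targets pretraining data through a finetuned model and uses richer features; the models, feature choice, and toy data here are illustrative assumptions.

```python
# Generic shadow-model skeleton behind metaclassifier membership-inference
# attacks (a simplified stand-in, not TMI itself): shadow models trained on
# known splits supply labeled (confidence, member?) examples for a
# meta-classifier that then scores membership for a target model's outputs.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # toy labeling rule

feats, labels = [], []
for s in range(8):                           # 8 shadow models
    idx = rng.permutation(len(X))
    members, nonmembers = idx[:500], idx[500:1000]
    shadow = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300,
                           random_state=s).fit(X[members], y[members])
    for group, is_member in [(members, 1), (nonmembers, 0)]:
        probs = shadow.predict_proba(X[group])
        conf = probs[np.arange(len(group)), y[group]]  # true-class confidence
        feats.append(conf.reshape(-1, 1))
        labels.append(np.full(len(group), is_member))

meta = LogisticRegression().fit(np.vstack(feats), np.concatenate(labels))
# meta.predict_proba(target_confidences) now scores membership.
```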
-
Most attention in K-12 artificial intelligence and machine learning (AI/ML) education has been given to having youths train models, with much less attention to the equally important testing of models when creating machine learning applications. Testing ML applications allows for the evaluation of models against predictions and can help creators of applications identify and address failure and edge cases that could negatively impact user experiences. We investigate how testing each other's projects supported youths to take perspective about functionality, performance, and potential issues in their own projects. We analyzed testing worksheets and audio and video recordings collected during a two-week workshop in which 11 high school youths created physical computing projects that included (audio, pose, and image) ML classifiers. We found that through peer-testing, youths reflected on the size of their training datasets, the diversity of their training data, the design of their classes, and the contexts in which they produced training data. We discuss future directions for research on peer-testing in AI/ML education and current limitations for these kinds of activities.