Title: The Optimality of Bayes' Theorem
We show that Bayes' theorem is in fact an optimal way to blend prior and observational information, using three different perspectives: two from information theory and one from the duality of variational inference.
Award ID(s):
1808576 1845799
PAR ID:
10288595
Author(s) / Creator(s):
Date Published:
Journal Name:
SIAM news
Volume:
54
Issue:
6
ISSN:
1557-9573
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The computer science literature on identification of people using personal information paints a wide spectrum, from aggregate information that doesn’t contain information about individual people, to information that itself identifies a person. However, privacy laws and regulations often distinguish between only two types, often called personally identifiable information and de-identified information. We show that the collapse of this technological spectrum of identifiability into only two legal definitions results in the failure to encourage privacy-preserving practices. We propose a set of legal definitions that spans the spectrum. We start with anonymous information. Computer science has created anonymization algorithms, including differential privacy, that provide mathematical guarantees that a person cannot be identified. Although the California Consumer Privacy Act (CCPA) defines aggregate information, it treats aggregate information the same as de-identified information. We propose a definition of anonymous information based on the technological possibility of logical association of the information with other information. We argue for the exclusion of anonymous information from notice and consent requirements. We next consider de-identified information. Computer science has created de-identification algorithms, including generalization, that minimize (but do not eliminate) the risk of re-identification. GDPR defines anonymous information but not de-identified information, and CCPA defines de-identified information but not anonymous information. The definitions do not align. We propose a definition of de-identified information based on the reasonableness of association with other information. We propose legal controls to protect against re-identification. We argue for the inclusion of de-identified information in notice requirements, but the exclusion of de-identified information from choice requirements.
    We next address the distinction between trackable and non-trackable information. Computer science has shown how one-time identifiers can be used to protect reasonably linkable information from being tracked over time. Although both GDPR and CCPA discuss profiling, neither formally defines it as a form of personal information, and thus both fail to adequately protect against it. We propose definitions of trackable information and non-trackable information based on the likelihood of association with information from other contexts. We propose a set of legal controls to protect against tracking. We argue for requiring stronger forms of user choice for trackable information, which will encourage the use of non-trackable information. Finally, we address the distinction between pseudonymous and reasonably identifiable information. Computer science has shown how pseudonyms can be used to reduce identification. Neither GDPR nor CCPA makes a distinction between pseudonymous and reasonably identifiable information. We propose definitions based on the reasonableness of identifiability of the information, and we propose a set of legal controls to protect against identification. We argue for requiring stronger forms of user choice for reasonably identifiable information, which will encourage the use of pseudonymous information. Our definitions of anonymous information, de-identified information, non-trackable information, trackable information, and reasonably identifiable information can replace the over-simplified distinction between personally identifiable information and de-identified information. We hope that this full spectrum of definitions can be used in a comprehensive privacy law to tailor notice and consent requirements to the characteristics of each type of information.
  2. To understand the operation of the olfactory system, it is essential to know how information is encoded in the olfactory bulb. We applied Shannon information theoretic methods to address this, with signals from up to 57 glomeruli simultaneously optically imaged from presynaptic inputs in glomeruli in the mouse dorsal (dOB) and lateral (lOB) olfactory bulb, in response to six exemplar pure chemical odors. We discovered that, first, the tuning of these signals from glomeruli to a set of odors is remarkably broad, with a mean sparseness of 0.83 and a mean signal correlation of 0.64. Second, both of these factors contribute to the low information that is available from the responses of even populations of many tens of glomeruli, which was only 1.35 bits across 33 glomeruli on average, compared with the 2.58 bits required to perfectly encode these six odors. Third, although there is considerable interest in the possibility of temporal encoding of stimulus properties, including odor identity, the amount of information in the temporal aspects of the presynaptic glomerular responses was low (mean 0.11 bits) and, importantly, was redundant with respect to the information available from the rates. Fourth, the information from simultaneously recorded glomeruli asymptotes very gradually and nonlinearly, showing that glomeruli do not have independent responses. Fifth, the information from a population became available quite rapidly, within 100 ms of sniff onset, and the peak of the glomerular response was at 200 ms. Sixth, the information from the lOB was not additive with that of the dOB. NEW & NOTEWORTHY We report broad tuning and low odor information available across the lateral and dorsal bulb populations of glomeruli. Even though response latencies can be significantly predictive of stimulus identity, they contained very little information, and none that was not redundant with information based on rate coding alone.
    Last, in line with the emerging notion of the important role of the earliest stages of responses (“primacy”), we report a very rapid rise in information after each inhalation.
  3. Design can be seen as a series of decisions that are informed by information that the designer has gathered from the environment and transformed into actionable knowledge. The sheer volume and variety of available information compels designers to impose structure upon the desired information, which in turn may affect subsequent design activities. To better understand how information may inform design decisions, this study investigates the relationship between designers’ information organization behaviors and their generated ideas by recruiting eight professionals (four from software design and four from graphic design) for individual 3-hour design sessions. They were asked to generate ideas for a design problem (reducing pedestrian accidents in Nebraska) using the provided information. Results reveal that designers structured the information in three different ways (Clusters, Relations, and Nests), and that both designer background and organizational strategy play different roles in the features of their generated ideas.
  4. An important question in interactive information retrieval (IIR) is: How do task characteristics influence users’ needs? In this paper, we investigate the effects of cognitive task complexity on the types of information considered useful for a task. We characterize information types from two perspectives. From one perspective, we classify task-related information items based on inherent characteristics (referred to as info-types): factual statements, concepts/definitions, opinionated statements, and insights (tips or advice related to the task domain). From a second perspective, we used Byström and Järvelin’s framework [5] to define information types based on how the information might be used to complete the task (referred to as functional roles): (1) to help the task doer understand the task requirements (problem information); (2) to help the task doer strategize on how to approach the task (problem-solving information); and (3) to help the task doer learn about the task domain (domain information). Our results suggest that: (1) cognitive task complexity influences the functional roles of information items deemed useful for the task (RQ1); (2) certain info-types are more (or less) likely to play certain functional roles (RQ2); and (3) task complexity influences the variety of functional roles played by info-types (RQ3).
  5. The standard assumption in social learning environments is that agents learn from others through choice outcomes. We argue that in many settings, agents can also infer information from others’ response times (RT), which can increase efficiency. To investigate this, we conduct a standard information cascade experiment and find that RTs do contain information that is not revealed by choice outcomes alone. When RTs are observable, subjects extract this private information and are more likely to break from incorrect cascades. Our results suggest that in environments where RTs are publicly available, the information structure may be richer than previously thought. 