

Title: Information avoidance and overvaluation under epistemic constraints: Principles and implications for regulatory policies
The Value of Information (VoI) assesses the impact of data in a decision process. A risk-neutral agent, quantifying the VoI in monetary terms, prefers to collect data only if their VoI surpasses the cost of collecting them. For an agent acting without external constraints, data have non-negative VoI (free “information cannot hurt”), and data with an almost-negligible potential effect on the agent's belief have an almost-negligible VoI. However, these intuitive properties do not hold for an agent acting under external constraints on epistemic quantities, such as those posed by some regulations. For example, a manager forced to repair an asset when its probability of failure is too high can prefer to avoid collecting free information about the actual condition of the asset, and may even pay to avoid it; she can also assign a high VoI to almost-irrelevant data. Hence, by enforcing epistemic constraints in regulations, the policy-maker can induce a range of counter-intuitive, but rational, behaviors in the agents obeying the regulations, from information avoidance to overvaluation of barely relevant information. This paper illustrates how the structural properties of VoI change depending on such external epistemic constraints, and discusses how incentives and penalties can alleviate these induced attitudes toward information.
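The repair-threshold example can be sketched numerically. Everything below is an illustrative assumption, not the paper's model: a component is damaged with prior probability p, the agent's private expected failure loss is smaller than the repair cost (so she would never repair voluntarily), and a regulation mandates repair whenever the believed failure probability exceeds a threshold. A free, perfect inspection then has negative VoI for the agent.

```python
# Illustrative sketch (all numbers are assumptions, not from the paper) of a
# manager deciding whether to repair a component damaged with probability
# p_fail, under a regulation mandating repair whenever p_fail > threshold.

def expected_cost(p_fail, c_repair, c_fail, threshold):
    """Agent's expected cost when acting on the belief p_fail."""
    if p_fail > threshold:
        return c_repair                      # repair forced by the regulation
    return min(c_repair, p_fail * c_fail)    # otherwise: unconstrained optimum

def voi_perfect_inspection(p, c_repair, c_fail, threshold):
    """VoI of a free, perfect inspection: prior cost minus expected posterior cost."""
    prior_cost = expected_cost(p, c_repair, c_fail, threshold)
    # A perfect inspection yields posterior 1 (damaged, prob. p) or 0 (intact).
    posterior_cost = (p * expected_cost(1.0, c_repair, c_fail, threshold)
                      + (1 - p) * expected_cost(0.0, c_repair, c_fail, threshold))
    return prior_cost - posterior_cost

# Private failure loss (40) is below the repair cost (100), so the agent would
# not repair voluntarily; the 0.5 threshold forces repair on bad news.
print(voi_perfect_inspection(0.10, 100.0, 40.0, 0.5))   # -6.0: information hurts
print(voi_perfect_inspection(0.10, 100.0, 40.0, 1.0))   #  0.0: constraint removed
```

With the threshold relaxed to 1.0 the same inspection has VoI of exactly zero, recovering the usual "free information cannot hurt" property for this agent.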
Award ID(s):
1638327
NSF-PAR ID:
10161384
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Reliability Engineering & System Safety
Volume:
197
ISSN:
0951-8320
Page Range / eLocation ID:
106814
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The value of information (VoI) provides a rational metric to assess the impact of data in decision processes, including maintenance of engineering systems. According to the principle that “information never hurts”, VoI is guaranteed to be non-negative when a single agent aims at minimizing an expected cost. However, in other contexts such as non-cooperative games, where agents compete against each other, revealing a piece of information to all agents may have a negative impact on some of them, as the negative effect of the competitors being informed and adjusting their policies surpasses the direct VoI. Aware of this, some agents prefer to avoid having certain information collected when it must be shared with others, as the overall VoI is negative for them. A similar result may occur for managers of infrastructure assets following the prescriptions of codes and regulations. Modern codes require that the probability of some failure events be below a threshold, so managers are forced to retrofit assets if that probability is too high. If the economic incentive of those agents disagrees with the code requirements, the VoI associated with tests or inspections may be negative. In this paper, we investigate under what circumstances this happens, and how severe the effects of this issue can be.
  2. People are involved with the collection and curation of all biodiversity data, whether they are researchers, members of the public, taxonomists, conservationists, collection managers or wildlife managers. Knowing who those people are and connecting their biographical information to the biodiversity data they collect helps us contextualise their scientific work. We are particularly concerned with those people and communities involved in the collection and identification of biological specimens. People from herbaria and natural science museums have been collecting and preserving specimens from all over the world for more than 200 years. The problem is that many of these people are only known by unstandardized names written on specimen labels, often with only initials and without any biographical information. The process of identifying and linking individuals to their biographies enables us to improve the quality of the data held by collections while also quantifying the contributions of the often underappreciated people who collected and identified these specimens. This process improves our understanding of the history of collecting, and addresses current and future needs for maintaining the provenance of specimens so as to comply with national and international practices and regulations. In this talk we will outline the steps that collection managers, data scientists, curators, software engineers, and collectors can take to work towards fully disambiguated collections. With examples, we can show how they can use these data to help them in their work, in the evaluation of their collections, and in measuring the impact of individuals and organisations, local to global. 
  3. It takes great effort to manually or semi-automatically convert free-text phenotype narratives (e.g., morphological descriptions in taxonomic works) to a computable format before they can be used in large-scale analyses. We argue that neither a manual curation approach nor an information extraction approach based on machine learning is a sustainable solution to produce computable phenotypic data that are FAIR (Findable, Accessible, Interoperable, Reusable) (Wilkinson et al. 2016). This is because these approaches do not scale to all biodiversity, and they do not stop the publication of free-text phenotypes that would need post-publication curation. In addition, both manual and machine learning approaches face great challenges: the problem of inter-curator variation in manual curation (curators interpret or convert the same phenotype differently from each other), and the problem of translating keywords to ontology concepts in automated information extraction, make it difficult for either approach to produce data that are truly FAIR. Our empirical studies show that inter-curator variation in translating phenotype characters to Entity-Quality statements (Mabee et al. 2007) is as high as 40% even within a single project. With this level of variation, curated data integrated from multiple curation projects may still not be FAIR. The key causes of this variation have been identified as semantic vagueness in original phenotype descriptions and difficulties in using standardized vocabularies (ontologies). We argue that the authors describing characters are the key to the solution. Given the right tools and appropriate attribution, the authors should be in charge of developing a project's semantics and ontology. This will speed up ontology development and improve the semantic clarity of the descriptions from the moment of publication.
In this presentation, we will introduce the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists, which consists of three components: a web-based, ontology-aware software application called 'Character Recorder,' which features a spreadsheet as the data entry platform and provides authors with the flexibility of using their preferred terminology in recording characters for a set of specimens (this application also facilitates semantic clarity and consistency across species descriptions); a set of services that produces RDF graph data, collects terms added by authors, detects potential conflicts between terms, dispatches conflicts to the third component, and updates the ontology with resolutions; and an Android mobile application, 'Conflict Resolver,' which displays ontological conflicts and accepts solutions proposed by multiple experts. Fig. 1 shows the system diagram of the platform.
The presentation will consist of: a report on the findings from a recent survey of 90+ participants on the need for a tool like Character Recorder; a methods section that describes how we provide semantics to an existing vocabulary of quantitative characters through a set of properties that explain where and how a measurement (e.g., length of perigynium beak) is taken, and how a custom color palette of RGB values obtained from real specimens or high-quality specimen images can be used to help authors choose standardized color descriptions for plant specimens; and a software demonstration, where we show how Character Recorder and Conflict Resolver can work together to construct both human-readable descriptions and RDF graphs using morphological data derived from species in the plant genus Carex (sedges). The key difference of this system from other ontology-aware systems is that authors can directly add needed terms to the ontology as they wish and can update their data according to ontology updates.
The software modules currently incorporated in Character Recorder and Conflict Resolver have undergone formal usability studies. We are actively recruiting Carex experts to participate in a 3-day usability study of the entire system of the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists. Participants will use the platform to record 100 characters about one Carex species. In addition to usability data, we will collect the terms that participants submit to the underlying ontology and the data related to conflict resolution. Such data allow us to examine the types and the quantities of logical conflicts that may result from the terms added by the users and to use Discrete Event Simulation models to understand if and how term additions and conflict resolutions converge. We look forward to a discussion on how the tools (Character Recorder is online at http://shark.sbs.arizona.edu/chrecorder/public) described in our presentation can contribute to producing and publishing FAIR data in taxonomic studies.
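The color-palette idea can be illustrated with a toy matcher. The palette names and RGB values below are invented for illustration, not the actual Character Recorder palette: a measured RGB triple is snapped to the nearest entry of a small standardized vocabulary by Euclidean distance.

```python
# Hypothetical sketch of mapping a measured RGB value to the nearest entry of
# a standardized color vocabulary. Palette entries are invented examples.

PALETTE = {
    "green":        (60, 130, 60),
    "yellow-green": (150, 180, 60),
    "brown":        (120, 80, 40),
    "dark purple":  (70, 30, 80),
}

def nearest_color(rgb):
    """Return the palette name with the smallest squared Euclidean
    distance to the measured RGB triple."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(PALETTE, key=lambda name: dist2(PALETTE[name], rgb))

print(nearest_color((140, 170, 70)))   # -> 'yellow-green'
```

Plain RGB distance is a simplification; a perceptually uniform space such as CIELAB would match human color judgments more closely.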
  4. We assess the Value of Information (VoI) for inspecting components in systems managed by multiple agents, using game theory and Nash equilibrium analysis. We focus on binary systems made up of binary components, each of which can be either intact or damaged. Agents taking maintenance actions are responsible for the repair costs of their own components, while the penalty for system failure is shared among all agents. The precision of inspection is also considered, and we identify the prior and posterior Nash equilibria under perfect or imperfect inspections. The VoI is assessed for the individual agents as well as for the whole set of agents, and the analysis considers series, parallel and general systems. A negative VoI can trigger the phenomenon of Information Avoidance (IA), where rational agents prefer not to collect free information. We discuss whether it is possible for the VoI to be negative for one or for all agents, for the agents with inspected or uninspected components, and for the total sum of VoIs.
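A rough, self-contained sketch of this kind of setting (the model and all numbers below are simplifying assumptions, not the paper's): two agents each own one component of a two-component series system, pay their own repair costs, and split the system-failure penalty equally. Per-agent VoI of a publicly shared, perfect inspection of component 0 can then be computed by brute-force search for a pure-strategy Nash equilibrium before and after the inspection.

```python
from itertools import product

# Toy two-agent series system (illustrative assumptions, not the paper's model).
# Component i is damaged with probability p[i]; agent i may repair it (cost
# c[i], making it surely intact); if any unrepaired component is damaged the
# system fails and the penalty L is split equally between the two agents.

def costs(actions, p, c, L):
    """Each agent's expected cost for a pure action profile (1 = repair)."""
    surv = 1.0
    for a_i, p_i in zip(actions, p):
        surv *= 1.0 if a_i else (1.0 - p_i)   # repaired -> surely intact
    p_fail = 1.0 - surv
    return tuple(a_i * c_i + p_fail * L / 2 for a_i, c_i in zip(actions, c))

def pure_nash(p, c, L):
    """First pure-strategy Nash equilibrium found (unique for the numbers below)."""
    for prof in product((0, 1), repeat=2):
        cur = costs(prof, p, c, L)
        if all(costs(tuple(1 - a if i == j else a for j, a in enumerate(prof)),
                     p, c, L)[i] >= cur[i] for i in (0, 1)):
            return prof, cur
    raise ValueError("no pure-strategy equilibrium")

def voi_inspect_component_0(p, c, L):
    """Per-agent VoI of a perfect, publicly shared inspection of component 0."""
    _, prior = pure_nash(p, c, L)
    _, cost_dmg = pure_nash((1.0, p[1]), c, L)   # inspection says 'damaged'
    _, cost_ok = pure_nash((0.0, p[1]), c, L)    # inspection says 'intact'
    post = tuple(p[0] * d + (1 - p[0]) * o for d, o in zip(cost_dmg, cost_ok))
    return tuple(pr - po for pr, po in zip(prior, post))

print(voi_inspect_component_0(p=(0.5, 0.1), c=(4.0, 15.0), L=40.0))
```

With these particular numbers the VoI is about 2 for the inspected agent and 0 for the other; the paper's point is that other regimes can make some agents' VoI negative, and a routine like this can be used to search a toy model for such regimes.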
  5.
    Interest in physical therapy and individual exercises such as yoga/dance has increased alongside the well-being trend, and people globally enjoy such exercises at home/office via video streaming platforms. However, such exercises are hard to follow without expert guidance. Even if experts can help, it is almost impossible to give personalized feedback to every trainee remotely. Thus, automated pose correction systems are required more than ever, and we introduce a new captioning dataset named FixMyPose to address this need. We collect natural language descriptions of correcting a “current” pose to look like a “target” pose. To support a multilingual setup, we collect descriptions in both English and Hindi. The collected descriptions have interesting linguistic properties such as egocentric relations to the environment objects, analogous references, etc., requiring an understanding of spatial relations and commonsense knowledge about postures. Further, to avoid ML biases, we maintain a balance across characters with diverse demographics, who perform a variety of movements in several interior environments (e.g., homes, offices). From our FixMyPose dataset, we introduce two tasks: the pose-correctional-captioning task and its reverse, the target-pose-retrieval task. During the correctional-captioning task, models must generate the descriptions of how to move from the current to the target pose image, whereas in the retrieval task, models should select the correct target pose given the initial pose and the correctional description. We present strong cross-attention baseline models (uni/multimodal, RL, multilingual) and also show that our baselines are competitive with other models when evaluated on other image-difference datasets. 
We also propose new task-specific metrics (object-match, body-part-match, direction-match) and conduct human studies for more reliable evaluation, and we demonstrate a large human-model performance gap, suggesting room for promising future work. Finally, to verify the sim-to-real transfer of our FixMyPose dataset, we collect a set of real images and show promising performance on these images. Data and code are available: https://fixmypose-unc.github.io.