<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Information avoidance and overvaluation under epistemic constraints: Principles and implications for regulatory policies</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>01/22/2020</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10161384</idno>
					<idno type="doi">10.1016/j.ress.2020.106814</idno>
					<title level='j'>Reliability Engineering &amp; System Safety</title>
					<idno type="ISSN">0951-8320</idno>
<biblScope unit="volume">197</biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Matteo Pozzi</author><author>Carl Malings</author><author>Andreea Minca</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[The Value of Information (VoI) assesses the impact of data in a decision process. A risk-neutral agent, quantifying the VoI in monetary terms, prefers to collect data only if their VoI surpasses the cost to collect them. For an agent acting without external constraints, data have non-negative VoI (as free “information cannot hurt”) and data with an almost-negligible potential effect on the agent's belief have an almost-negligible VoI. However, these intuitive properties do not hold true for an agent acting under external constraints related to epistemic quantities, such as those posed by some regulations. For example, a manager forced to repair an asset when its probability of failure is too high can prefer to avoid collecting free information about the actual condition of the asset, and even pay in order to avoid this, or she can assign a high VoI to almost-irrelevant data. Hence, by enforcing epistemic constraints in the regulations, the policy-maker can induce a range of counter-intuitive, but rational, behaviors, from information avoidance to over-evaluation of barely relevant information, in the agents obeying the regulations. This paper illustrates how the structural properties of the VoI change depending on such external epistemic constraints, and discusses how incentives and penalties can alleviate these induced attitudes toward information.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Information collected by sensors and inspectors can significantly reduce the uncertainty in decision-making in many fields of engineering. In the management of urban systems, for example, pervasive integration of sensing technologies can be the base for a novel quantitative urban science that, in turn, would allow for a better control of these systems. However, such integration would follow from the decisions of managers and stakeholders of the urban assets (whom we will hereafter refer to as "agents"), who act under societal regulations which are often developed neglecting their effects on information collection.</p><p>Design, operation and maintenance of these assets can be formulated as a decision making process, under uncertainty on demands, capacities and long-term evolution. Agents take these "exploitative" decisions (e.g. repairing or replacing some assets) with the aim of optimizing their own utilities, or minimizing their own losses. As the consequences of these actions can potentially affect safety and economic prosperity of communities at a broader level, the society usually imposes regulations and public policies to affect or even control them. Specifically, as agents may be prone to accept risks higher than the society can tolerate, possibly because they do not include all societal costs in their analysis, society can impose constraints on the available actions, depending on the circumstances. For example, a building code can prevent a structure from being open to the public when the probability of its failure is too high, despite the owner's will to do so. Through these constraints, society is able to indirectly implement the decisions that it considers optimal, balancing costs for construction, maintenance, operation and renovation with risks related to failures and malfunctioning.</p><p>If agents are free to take explorative actions for supporting their exploitative strategy, e.g. by collecting information using sensors and inspectors, they can base their behavior on Value of Information (VoI), an utility-based metric to assesses the impact of data in a decision process introduced by the seminal work of Raiffa and Schlaifer <ref type="bibr">[16]</ref>. A risk-neutral agent, quantifying the VoI in monetary terms, prefers to collect data only if their VoI surpasses the cost to collect them. When an agent acts without external constraints, data have non-negative VoI (as free "information cannot hurt" <ref type="bibr">[5]</ref>) and those with an almost-negligible potential effect on the agent's belief have an almost-negligible VoI. However, these intuitive properties do not hold true for an agent acting under external constraints related to epistemic quantities, such as those posed by some regulations. For example, a manager forced to repair an asset when its probability of failure is too high will prefer to avoid collecting free information about the actual condition of the asset, and even to pay in order to avoid this, or she can assign a high VoI to almost-irrelevant data, if her economic perspective does not agree with that enforced by regulations. Hence, by enforcing epistemic constraints in the regulations, the policy maker can induce a range of counter-intuitive, but rational, behaviors, from information avoidance to overevaluation of barely relevant information, in the agents obeying the regulations.</p><p>Information is related to the legal distinction between "negligence", i.e. 
the failure to exercise the care against a risk which might be expected of a prudent person in the same circumstances, and "recklessness", i.e. the knowing and willful exposure of others to a risk <ref type="bibr">[3]</ref>: implicit in this distinction is whether or not the agent had knowledge of the risk in question. Of the two, recklessness is typically considered the more severe transgression, carrying with it a heavier penalty for the agent. From a legal standpoint, agents may prefer to avoid information so as to be merely responsible for "negligence", hedging against the heavier penalties of "recklessness". This provides a general example of how well-intentioned regulations can prompt seemingly counter-intuitive and counter-productive "willful ignorance" in agents acting under these constraints. A related example in the engineering domain, presented by Rayner <ref type="bibr">[17]</ref>, is an environmental remediation project where few field samples were taken, so as to minimize the possibility of these samples contradicting a computer model which predicted that the project was on course for success.</p><p>In other contexts, such as non-cooperative games where agents compete against each other, revealing a piece of information to all agents may have a negative impact on some of them, as the negative effect of the competitors being informed and adjusting their policies surpasses the direct VoI. Being aware of this, some agents prefer to avoid having certain information collected, when it must be shared with others, as the overall VoI is negative for them. For example, an entrepreneur sharing a market with a competitor may find a piece of information irrelevant for her, but key for her competitor, who can improve his strategy and reduce her share of the market <ref type="bibr">[2]</ref>. In that case, the impact of information is clearly negative, when assessed by that agent. These mechanisms are related to the topic of "information avoidance", extensively studied in the social sciences, including psychology and behavioral economics <ref type="bibr">[7,</ref><ref type="bibr">10,</ref><ref type="bibr">21]</ref>.</p><p>Recent interest in VoI analysis for civil infrastructure systems is attested by works such as those of Pozzi and Der Kiureghian <ref type="bibr">[13]</ref>, Srinivasan and Parlikad <ref type="bibr">[19]</ref>, Straub <ref type="bibr">[20]</ref>, Zonta et al. <ref type="bibr">[24]</ref>, Qin et al. <ref type="bibr">[15]</ref>, Goulet et al. <ref type="bibr">[8]</ref> and Thöns <ref type="bibr">[22]</ref>. VoI is used as an objective function to be maximized for optimizing information collection by Malings and Pozzi <ref type="bibr">[11]</ref> and Memarzadeh and Pozzi <ref type="bibr">[12]</ref>. A discussion of the effects of the discrepancies between agents' preferences, in relation to civil infrastructure management, can be found in Pozzi et al. <ref type="bibr">[14]</ref>, Tonelli et al. <ref type="bibr">[25]</ref> and Verzobio et al. <ref type="bibr">[26]</ref>.</p><p>In this paper, we illustrate how the structural properties of the VoI of observing the state of an engineering system change depending on external epistemic constraints, and discuss how regulatory design can alleviate undesired attitudes toward information. Our motivating question for writing this paper was: "why do rational agents sometimes prefer not to know?", i.e. "how can the VoI be negative, in some conditions?" 
To clarify our scope, we note that some phenomena that can also be described as cases of negative VoI are not the core of this paper. Experience suggests that sometimes it is better to neglect or refuse irrelevant information because "too much information can harm". This can happen because supposedly free information is not actually free, when considering all costs related to collecting and processing it, and so agents should neglect information with nil VoI. Also, it is a common experience that an agent can take a decision, then revise it based on noisy measures, even though the prior decision was actually correct and the information has misled her. However, we point out that "information cannot hurt" is a principle holding in the expected prior prediction while, in a given single empirical realization, it may not hold true. We can also argue that if the processing model is incorrect, then the impact of information can be detrimental. For example, consider an agent overconfident in the precision of a sensor, or unaware of its systematic bias. That agent can be misled by the information, so that she would have done better without it. Again, the previous results hold under model consistency: after all, probability models ignorance and, if an agent suspects that a model may be inappropriate, she should extend it until it captures the complete uncertainty in the relation between the system's state and the measures.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Problem formulation</head><p>We focus on a simple and paradigmatic case of reliability assessment and control. Consider a system exposed to the risk of failure, for example a structural system prone to deterioration and collapse. If the failure occurs, the agent controlling the system must sustain a significant loss, but the failure also has consequences at the societal level. The probability of such failure is assessed considering the current available information on the uncertain system condition, the stochastic evolution of capacity and demand. Expensive maintenance actions are available to the agent, to mitigate the risk. Hence, she is facing a decision-making problem under uncertainty: should she repair the system or not? We assume the agent is rational, so that her behavior is modeled by the principle of minimizing expected loss, and risk-neutral. To decide the best course of action, the agent compares the expected loss of doing nothing, accepting the failure risk, with that related to maintenance actions. However, the agent must also follow rules defined by a societal regulation: in practice, these rules may prescribe specific actions to be undertaken in certain circumstances. We simplify and generalize such regulations into a constraint to the agent's decision: the agent must take a maintenance action when the probability of system failure exceeds a threshold of societally-accepted risk. In the context of structural systems, for example, such regulation is motivated by the societal need of reducing the rate of failure events: a building code prevents owners from opening unsafe buildings to the public.</p><p>In this setting, we consider that information can be collected for reducing the uncertainty on the system condition state, for example by inspecting it. The agent has to assess whether it is worth inspecting, trading off the inspection cost with the cost reduction due to a more appropriate decision based on the information. Fig. <ref type="figure">1</ref>  Analyzing this problem, our goal is twofold: first, we aim at assessing whether inspecting is convenient for the agent, by quantifying the VoI under societal constraint; second, we aim at evaluating the overall impact of the constraint, to discuss the pros and cons of such constraint from the societal perspective. We outline the properties of unconstrained VoI in the next Section, the effect of epistemic societal constraints in Section 4, we discuss regulatory design in Section 5 and draw some conclusions in Section 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Why "information never hurts" in unconstrained decisionmaking</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Unconstrained decision making under uncertainty</head><p>We start modeling the problem of information collection when the agent is not subject to any external constraint (i.e., there is no societal regulation). In this section we illustrate why, in this case, the expected prior loss is a concave and thus continuous function in the convex domain of possible beliefs; hence, why the VoI is always non-negative in unconstrained decision-making.</p><p>As outlined in Fig. <ref type="figure">1</ref>, consider the agent controlling a system and facing a one-shot decision, aiming at minimizing her expected loss. She has to select action a among a finite set of options = &#8230; A A {1, 2, , | |} of size |A|, while her loss L(x, a) depends on the selected action and on the state x of the system, defined on finite set = &#8230; X X {1, 2, , | |} of size |X|. Set A includes maintenance actions, as repairing or replacing a component, and the do-nothing option. List X can include specific malfunctions, damages, up to the failure of the system, together with the system well-functioning. The agent's belief about the state of the system can be summarized in vector b, of length |X|, with entry i defined as</p><p>i , and it is evaluated considering all relevant background information G. Hence stochastic vector b lists non-negative entries and it has unitary norm-1:</p><p>. These belief components include the probability of damage and failure of the system. If she exactly knows that the system is in state j, than = b e j , where e j is the j unit vector in the standard basis (i.e. all entries of b are zeros except for a unitary entry at position j). The domain &#937; B of the possible beliefs is a convex set of dimension X (| | 1), resulting from the intersection of the hypercube where all |X| components are between zero and one and the hyperplane where the norm-1 is unitary.</p><p>We define</p><p>as the expected loss selecting action a under belief b (where f i j ( , )</p><p>indicates the expectation of function f with vector v assigning the distribution of variable i). Introducing, for each action a, |X|-dimensional vector &#955; a of possible losses depending on the system state, whose entry</p><p>, we derive by the definition of expectation that the expected loss under action a is a scalar product, a linear function of belief b:</p><p>The decision problem can be expressed as the minimization on the set of functions</p><p>| | . The optimal loss l*, as a function of belief b, is:</p><p>And the corresponding optimal policy &#928;*, mapping the current belief into the optimal action, is defined by using "argmin" instead of "min" in the previous equation. As is clear from Eq. (1), the optimal loss l* is the lower envelope of the set of |A| linear functions defined by "lambda" vectors &#8230; { , , , }</p><p>| | , and therefore, it is a concave, and hence continuous function. We can also define the optimality basin for action a as the subset B a &#8838;&#937; B of the belief domain where that action is optimal:</p><p>. Each basin is a convex set as it is the subset of a convex set resulting from imposing linear inequalities (of course, some basins can also be empty). Also, the set of basins are disjoint (defining, if necessary, an arbitrary criterion to pick one action among those with equal expected losses) and it completely covers &#937; B . , any of these two coordinates can completely define that domain, as indicated in picture (d). 
In this case the entire belief can be described by the probability of the system being in one state, e.g. the probability of failure. Picture (e) shows the expected losses for 4 actions, and the function l*. Picture (f) reports the optimal policy and the optimality basins.</p></div>
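<p>A minimal sketch of Eq. (1) in Python, with an assumed two-state, two-action loss matrix (its values are illustrative, not from the paper, and anticipate the maintenance example of Section 5.3): the optimal loss is the lower envelope of the linear functions l_a(b) = λ_a^T b. Later sketches reuse these definitions.</p><p><![CDATA[
import numpy as np

# Assumed loss matrix L[i, a]: rows = states, columns = actions.
L = np.array([[0.0, 1.0],   # state 1 (intact): doing nothing is free, repair costs 1
              [5.0, 1.0]])  # state 2 (damaged): failure costs 5, repair costs 1

def optimal_loss(b, L):
    """Eq. (1): return (l*, a*) by minimizing the expected loss over actions."""
    expected = b @ L                  # l_a(b) = sum_i b_i L(i, a), linear in b
    a_star = int(np.argmin(expected))
    return expected[a_star], a_star

b = np.array([0.7, 0.3])              # belief: 30% probability of the damaged state
print(optimal_loss(b, L))             # -> (1.0, 1): repairing is optimal here
]]></p>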
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Belief updating after information processing</head><p>In this section we show how the measurement can be related to a distribution of possible posterior beliefs, and how the expected posterior belief is equal to the prior one. Bayes' formula is the key for processing information. We assume a measure or observation y, that can be a multi-dimensional vector of arbitrary dimension, defined on domain Y, is related, directly or indirectly, to the system state x. The relation between the system and the measure is captured by emission function p(y|x), that defines the probability density of y (or its discrete probability distribution, if y is a discrete variable), when the system state is x. We define b &#960; as the prior belief, before considering the measure (i.e., when information G mentioned in the previous section is the empty set &#8709;).</p><p>By combining prior belief and the emission function for a given realized measure y, we derive the posterior belief b &#969; (y) by Bayes' formula, whose entry i is:</p><p>Eq. ( <ref type="formula">2</ref>) defines a deterministic map between measure y and posterior belief b &#969; . It shows how, under a specific emission function, the prior belief is moved to the posterior one, following a realized value of the measure. In the prior condition, measure y is a random variable, modeled by predictive distribution:</p><p>which serves as a normalizing constant for Eq. ( <ref type="formula">2</ref>). Consequently, in the prior condition the posterior belief b &#969; is also a random variable, whose distribution p &#969; derives from the transformation of distribution p y via the map of Eq. ( <ref type="formula">2</ref>). Variables y and b &#969; are of the same type: if the set of possible measures is discrete and finite, so is that of posterior beliefs while, if the former set is continuous, so is the latter one. Moreover, by the consistency of probability calculus, as:</p><p>we see that the expected posterior belief, evaluated in the prior condition, has to be identical with the prior one: = b b (where indicates the expectation with respect to distribution p &#969; ).</p><p>Fig. <ref type="figure">3(a-d</ref>) shows an example of the inference process, when |X| is 2 and there is a univariate continuous measure y. Emission functions are Gaussian (a), with mean</p><p>and variance</p><p>, thus distribution p y is the mixture of two Gaussian components (b), depending on prior belief = b [0.6 0.4] T . The pos- terior belief, as a function of the observed value y, is as plotted in graph (c). Graph (d) shows the corresponding distribution of the posterior belief, in terms of its second component. Collecting the measure can be seen as trading the prior belief for a random realization of the posterior belief, generated from a distribution whose expected value is identical to the prior belief. In this trading, the belief can move toward the extreme regions of the domain, where the state x is perfectly known. Graph (e) illustrates an example when |X| is 3, and only = Y | | 5 values of observation y are possible. Hence, the posterior belief can assume only 5 values, and discrete distribution P &#969; defines the probability of each possible outcome.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Pre-posterior analysis and value of information</head><p>The Value of Information (VoI) is the expected loss reduction due to the availability of a measurement. In the prior condition, an agent would select action &#928;*(b &#960; ), obtaining expected loss</p><p>Posterior optimal action &#928;*[b &#969; (y)], using the same policy identified above, and loss l*[b &#969; (y)] depend on the realized measure y. The expected posterior loss is = l l b * * ( ). The VoI can be defined as the difference between prior and posterior expected loss:</p><p>We now define a subset of the belief domain, &#937; Y , as the union of the prior belief and the set of possible posterior beliefs for any possible realization of the measure, so that p &#969; is nil outside that subset. Only the value of function l* on &#937; Y is relevant for assessing the VoI. Hence, the VoI can be computed by combining two functions defined on &#937; Y : the optimal loss function l* and the predictive distribution p &#969; , whose mean value is b &#960; .</p><p>As the expected loss function l* is a concave function, we derive from Eq. ( <ref type="formula">4</ref>) and Jensen's inequality <ref type="bibr">[9]</ref> that VoI is non-negative for any prior belief b &#960; , any loss function L and any emission function p(y|x). This is the "information never hurts" principle.</p><p>Also, consider the case when function l* is linear in &#937; Y : in that case, we can commute function l* and the expectation in Eq. ( <ref type="formula">4</ref>), the difference vanishes and the VoI is zero. A trivial limit case is when the measure does not affect the belief, so b &#969; is identical to b &#960; for all possible realized measures, and &#937; Y contains just one point. This can happen if variables x and y are independent: clearly, statistically negligible information has no value in decision making. A more general case is that when</p><p>, that is, for any possible realized measure the posterior optimal action is equal to the prior one, so that l* is a linear function in &#937; Y . Again, if no possible measure output is able to change the prior action, then the VoI is zero. Conversely, if there is a positive probability that the posterior optimal action is different than a * , because the corresponding posterior optimal loss is strictly less than l a * , then the VoI is strictly positive.</p><p>In the Appendix, we prove another property of the VoI, related to the continuity of l*: information with almost-negligible effect on the belief has almost negligible VoI. Intuitively, if the posterior belief is close to the prior one for all possible realizations of the measures, the expected posterior loss tends to be similar to the prior loss, and the VoI tends to be zero.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Posterior loss and value of perfect information</head><p>In this section, we show how posterior loss l *, seen as a function of prior belief b &#960; , is also a concave function, never above l *. That is, we show that the expected loss after taking a measurement is never higher than the expected loss before measuring. How should a rational agent react to information? As seen before, she should update her belief using Bayes' formula, and then act as if the updated belief was her prior one, identifying optimal action a * and getting corresponding expected loss l*. However, the posterior decision can also be described in terms of a "conditional plan", describing how to react to any realized observation, by selecting a specific action <ref type="bibr">[18]</ref>. If we now assume a finite set Y of |Y| observable values, the entire plan is a set of |Y| instructions such as "if observation is = y j, then take action = a k". For example, a plan pre- scribes to execute the first action no matter what the observation is, another to execute that action only after some observed values, and do the second otherwise. Plan j, &#966; j : Y &#8594; A, is therefore a map from the set of observable values Y to the set of actions A. The set M of all possible conditional plans contains</p><p>We can compute the expected loss l i j , , executing plan &#966; j when the state x is equal to i, combining the emission function and the loss function:</p><p>So l i j</p><p>, is a weighted average of the values of function L related to the system being in state i, and so it is bounded by the minimum and the maximum of these values. We can define a new loss matrix L&#8242;, of size |X| by |M|, populated following the formula in Eq. ( <ref type="formula">6</ref>). Now, the conditional plan plays the role played by the action in the original setting: depending on her prior belief, the agent selects the conditional plan that minimizes the expected loss. Following the same approach outlined above, we define linear function = l b b ( )</p><p>as the expected loss following conditional plan j, where vector &#945; j , of size |X|, lists the expected losses for all states, as = i l ( )</p><p>, . Using Eq. ( <ref type="formula">6</ref>), we can derive vector &#945; j from the set of lambda vectors, as</p><p>. Starting from a specific belief b, the optimal policy &#928;* can be translated into the specific plan related to the minimum loss. So optimal posterior loss l * can be expressed as the lower envelope of a set of |M| linear functions, defined by the set of "alpha" vectors</p><p>Hence, l * is a concave, continuous function. While we have showed that this property holds for a discrete set of |Y| observable values, the same property holds for an infinite number of possible observations, |Y| goes to infinity, and even when the set of observable values is uncountable.</p><p>Hence, we can define the VoI as a function on &#937; B :</p><p>* ( ) * ( ), depending on loss matrix and emission probabilities.</p><p>Also, among the |M| conditional plans, |A| of them assign the same action for all possible realized measures: those plans can be executed in the prior condition, as they do not require access to the information, and they are equivalent to the prior selection of an action, independent of the observation. Hence, the set of alpha vectors includes that of lambda vectors as a subset. By comparing Eqs. 
( <ref type="formula">1</ref>) and ( <ref type="formula">7</ref>), we conclude that l l * *, and this provides another proof that the function VoI, defined as the difference between l * and l *, is non-negative everywhere. In light of this, we can read the VoI as the benefit of adopting a more flexible reaction plan: while the expected loss l * derives from selecting a plan among the |A| possible rigid ones, independent of the measure, l * derives from the selection among the broader set of |M| adaptive ones that react to the measures.</p><p>The VoI is thus the difference of two concave functions: it is continuous (as the difference between two continuous functions is continuous), but it is not necessarily concave. Also, at the corners of the belief domain, where the state is known (vector b lists only zeros except for one unitary entry), then the posterior belief is surely equal to the prior one, so functions l * and l* are identical, and the VoI is nil.</p><p>In the limit case of perfect information, observable variable y is equivalent to state variable x, so that the posterior belief collapses on one unit vector of the standard basis: </p><p>for both states. In graph (a), posterior loss function l * is reported for &#963; &#603; equal to 1, 0.5 and 0 (the latter being the case of perfect information). Clearly, the posterior loss is lower if &#963; &#603; is lower, down to linear function l *. For a high value of &#963; &#603; , the posterior loss is indis- tinguishable from the prior one, and the VoI is nil. Graph (b) shows how the VoI is affected by &#963; &#603; : function EVPI is concave, while VoI has local maxima for higher values of &#963; &#603; .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5.">Interpretation and use of the VoI depending on the attitude toward risk</head><p>In our definition of the VoI, no cost or loss for the very act of collecting information is explicitly included. Information only has an effect on the epistemic belief of the agent, and not on the cost she has to pay. In the previous section, all loss functions and VoI can be intended as being in "loss units", corresponding to the utility unit <ref type="bibr">[23]</ref>. As such, the VoI represents the expected loss reduction due to free information. We have seen that this is always non-negative, indicating that free "information never hurts". For a version of the same principle from the standpoint of information theory, see Cover and Thomas <ref type="bibr">[5]</ref>. In terms of decision-making, this shows that the rational agent should always accept free information. But is the actual numerical VoI of any use? If the agent had to decide between collecting free information and an alternative action whose effect can be summarized by an expected loss reduction, then the agent should opt for the former option only if its VoI is above the reduction due to the latter. However, we may be more interested in a different setting, where information is expensive, and the agent has to decide on buying it before selecting an action. Information cost may impact the overall agent's loss. On one hand, the agent can assess the optimal expected loss without information, as we have seen above; on the other, to predict the overall effect of information, she can update the original loss function L including the impact of information cost for any pair of state and action and, using this updated loss function, she can assess the expected posterior loss. She should buy the information only if that posterior loss is less than that without information. This procedure, that we can call "the long route" is affected by the sensor cost from the beginning and the VoI, as defined before, plays no role in it. If we now consider a different information cost, we have to repeat the procedure from scratch. We may be tempted to follow a different and shorter route, first computing the VoI in terms of losses, and then converting this quantity in a monetary value, to compare with the sensor cost. However, if the conversion between monetary value is non-linear, as it is for risk-seeking or risk-adverse agents, it is unclear where this conversation between VoI and monetary cost should be made: on what interval along the cost to loss conversion? To find a monetary value V M consistent with the long route, one has to equate the prior loss with the posterior one as a function of unknown V M , and solve that non-linear equation to identify V M (this procedure is illustrated by <ref type="bibr">[4]</ref>). However, there is no guarantee that a single solution does exist. As an extreme case, consider a peculiar agent that loves monetary costs of even amounts of dollars, and hates costs of odd amounts. Also, consider that all costs related to pair of state and action are even. Clearly, for this agent the VoI (related to free information) is not of much use for deciding about buying an information whose cost is an odd amount of dollars, and not a single solution for V M may exist.</p><p>However, for risk neutral agents the VoI, as defined above, is also directly useful for deciding whether to buy expensive information. 
Indeed, for those agents all losses, and so the VoI itself, can be directly expressed in monetary costs. Moreover, due to the superposition of effects, the expected posterior loss buying an information for cost L I (as outlined in Fig. <ref type="figure">1</ref>) is simply the sum of L I and the expected posterior loss with free information. So it is worth buying the information + l L l * * I , i.e. when L I &#8804; VoI. Hence, by invoking the superposition of effects, we can interpret the VoI as the maximum cost the rational risk neutral agent should pay for getting that information, as paying more would induce an overall negative effect. Let us call D&#8838;&#937; B the subset of the belief domain where information should be collected. If cost L I is zero then D coincides with &#937; B , as the VoI is non-negative. When L I is positive, then D is a proper subset of &#937; B , that does not include its corners corresponding to perfect knowledge, where the VoI is zero. As the VoI is not generally concave, set D is generally not convex, or even connected. However, as function EVPI is concave, set D corresponding to perfect information is convex. If cost L I is above the maximum value of the VoI, set D is empty. I is above l * is also that where the VoI is above L I .</p><p>We briefly summarize the properties of the VoI in unconstrained decision making, discussed above. The VoI is a continuous non-negative function in the belief's domain. Information with almost-negligible effect on the belief has almost negligible VoI.</p></div>
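<p>For a risk-neutral agent, set D can be scanned numerically. A sketch reusing the assumed quantities above (the cost L_I is hypothetical); it confirms that D excludes the corners of the belief domain, where the VoI vanishes.</p><p><![CDATA[
def voi_at(b2, L, emission):
    """Unconstrained VoI for a two-state belief [1 - b2, b2]."""
    b = np.array([1.0 - b2, b2])
    p_y = emission.T @ b
    post = [posterior_belief(b, emission, y) for y in range(emission.shape[1])]
    return optimal_loss(b, L)[0] - sum(p * optimal_loss(bp, L)[0]
                                       for p, bp in zip(p_y, post))

L_I = 0.05   # assumed cost of the measurement
D = [b2 for b2 in np.linspace(0.01, 0.99, 99) if voi_at(b2, L, emission) >= L_I]
print(min(D), max(D))   # the buying region stays away from the corners 0 and 1
]]></p>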
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">VoI when acting under external constraints</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">External epistemic constraints</head><p>We now consider the effects of external constraints, e.g. societal regulations, in decision-making. Most external constraints can be captured by appropriate adjustments of loss matrix L: e.g., if one action a is forbidden when the state is x, this can be modeled by assigning an infinite loss to this pair, and similarly if a cost exceeds a budget constraint, to claim that those combinations cannot be accepted. Hence, those constrains can be embedded in the pre-processing that defines function L and, given that the structural properties that we have listed above hold for any loss function, they hold under any of these constraints.</p><p>A more complex setting occurs when the constraint restricts the available actions depending on the belief, and not depending on the state. We call these "epistemic constraints", as the belief is an epistemic descriptor. As outlined in Section 2, our motivating example is that of an agent following a set of regulations forbidding some actions when the risk is too high. These constraints cannot be embedded in any equivalent loss matrix. The agent's loss is modeled by matrix L but, following those constraints, the agent cannot generally implement her optimal policy &#928;*, as this may violate them. Hence, to investigate the exploitative and explorative behavior of an agent acting under external epistemic constraints we assume she has to follow a sub-optimal policy: the best one compatible with the constraints. In the next section, we show how the structural properties of VoI are not preserved in this setting.</p><p>We add a clarification about the nature of epistemic constraints. The decision-making problem under an epistemic constraint can be translated into an equivalent unconstrained problem where the constraint is represented by a change occurring in the external world the agent is interacting with. For example, the activation of the external epistemic constraint forbidding the agent to open the asset to the public when the probability of failure is above a threshold can be equivalently intended as a physical impediment (say a team of policemen), preventing that opening, which is activated following a mechanism consistent with that of activating the epistemic constraint. As a result of this exercise in the art of the analogy, the agent may think that now she has re-formulated the problem as one without any epistemic constraint, where the general properties of the unconstrained VoI hold. Now, while this is formally correct, we note that, in this setting, collecting information not only affects the agent's belief, but also triggers reactions of the "external world", and the properties described in Section 3 may not hold true. For example, even if an agent decides to maintain her posterior action identical to the prior one (and so neglecting the information), her expected posterior loss is not necessarily identical to the prior one, if others react to the information. So our analysis of external epistemic constraints applies to both equivalent settings: when the constraint is intended as an "internal" rule or when it models an external force acting on the agent.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Acting under external epistemic constraints</head><p>We consider that an agent adopts the best policy + A : B consistent with the external epistemic constraints, that is generally different with respect to the optimal unconstrained one &#928;*. The constraints, however, are inactive for perfect beliefs, at the corners of domain &#937; B , as constraints acting on those cases can be directly embedded in the loss matrix L. The corresponding expected loss + l is:</p><p>While function l* is concave and continuous, as it results from a minimization, function + l is not necessarily concave or even continuous. Clearly, it cannot be below</p><p>*, as the latter represents the minimum loss. So the non-negative expected loss increment due to constraints, &#916;l, is:</p><p>Increment &#916;l is also not necessarily continuous. We can express the prior loss as a function of the expected prior increment = l l b ( ), as:</p><p>And the expected posterior loss as related to the posterior increment = l l b ( ), as:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Value of information acting under an external epistemic constraints</head><p>In this section, we show how, when acting under an external constraint (e.g. a regulation), the VoI for a piece of information can be different from the VoI evaluated without considering the constraint. That is, we show external constraints can cause an agent to overvalue or under-value information. We define + VoI to be the value of information if acting following policy + . This quantity is related to the unconstrained VoI by increment = VoI l l , as:</p><p>Both &#916;l &#960; and &#916;l &#969; are non-negative, and &#916;VoI can be either positive or negative. As functions of the belief, &#916;l &#960; is not necessarily continuous, nor is &#916;VoI, while &#916;l &#969; is continuous under certain assumptions as we will see in a later example. At the corners of belief domain &#937; B , inhas no impact and both &#916;VoI and + VoI are zero, as is VoI. Due to the non-negativity of the increments, we can bound &#916;VoI as l VoI l . However, in general &#916;l &#960; and &#916;l &#969; are arbitrarily large, and so &#916;VoI and + VoI can be positive or negative, and their modulus arbitrary large. Negativity of the VoI can explain information avoidance for rational agents: actually, if + VoI is negative they should be ready to pay up to + VoI to avoid collecting information. Fig. <ref type="figure">6</ref>(a) shows a simple example of those limit cases. The first three actions are related to a low loss while the other two are related to a loss &#948; higher. From prior belief b B , the posterior belief can be b A or b C , depending on the realized observation, from prior belief b C , the posterior belief can be b B or b D . Thus, from prior belief b B , the agent would like to collect the information, as the corresponding + VoI is &#948;. From prior belief b C , instead, the agent prefers to avoid information, as</p><p>. This shows how + VoI can be positive or negative, as &#948; can be arbitrarily large. Fig. <ref type="figure">6(b,</ref><ref type="figure">c</ref>) shows the example of Figs. <ref type="figure">4</ref> and<ref type="figure">5</ref>, under an external epistemic constraint forcing action 2 for belief 15% &#8804; b &#960;, 2 &#8804; 85%. The corresponding function + l is discontinuous. We consider the same emission functions introduced above, with = 1. Function &#916;l &#960; is dis- and it is zero where policies + and &#928;* agree with each other. Function + l (and so function &#916;l &#969; ) is continuous, as the continuous distribution of the posterior belief acts as a smoother of function + l .</p><p>Where + l is continuous, a jump of function + l is identical of that of + VoI . Both functions &#916;VoI and + VoI are discontinuous, and they assume po- sitive and negative values in &#937; B . As illustrated above for the general case, we note that when b &#960;, 2 is just below 15% the</p><p>, when the expected loss is maximum: any update increasing belief component b &#960;, 2 , even for an arbitrary small quantity (i.e., even for almost negligible information), will significantly reduce the loss, hence + VoI is not only positive, but much higher than VoI. Generally, + VoI can be higher, equal or lower than VoI, depending on the prior belief. In the Appendix, we show how almost-negligible information can have arbitrarily large value, in the constrained setting.</p><p>We close this Section by focusing on two special cases. 
If, at the prior belief b &#960; , the two policies agree with each over (i.e., = + b b ( ) *( )) then &#916;l &#960; is zero, so = VoI l cannot be positive, and + VoI is not higher than Hence the constrained agent gives less value (or the same value) to the information with respect to the unconstrained one, due to the sub-optimality of the posterior action. Indeed, for such a belief, the constraint is inactive in the prior condition, and the information exposes her to fall under the constraints: the non-positive &#916;VoI quantifies this effect. Conversely, it can be the case that, while + and &#928;* are different at b &#960; , they are identical for all reachable posterior beliefs in &#937; Y : in that case, &#916;l &#969; is zero and + VoI is not lower than VoI: the agent is penalized by the constraint only in the prior setting, and the information has the additional value of letting the agent escape the constraint. This latter case happens, for example, in the case of perfect information.</p></div>
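<p>Putting the pieces together, a numerical instance of information avoidance (all numbers assumed): the prior failure probability 0.12 sits below the threshold 0.15, but one outcome of a noisy measure pushes the posterior to 0.18, forcing a repair the agent herself would not choose (her own threshold is L_R/L_F = 0.2). The unconstrained VoI is zero; the constrained VoI+ is negative.</p><p><![CDATA[
post_beliefs = [np.array([0.96, 0.04]), np.array([0.82, 0.18])]
p_w = np.array([3/7, 4/7])   # chosen so that E[b_w] equals the prior belief
b_prior = p_w[0] * post_beliefs[0] + p_w[1] * post_beliefs[1]   # -> [0.88, 0.12]

voi = (optimal_loss(b_prior, L)[0]
       - sum(p * optimal_loss(b, L)[0] for p, b in zip(p_w, post_beliefs)))
voi_plus = (constrained_loss(b_prior, L, P_BAR)[0]
            - sum(p * constrained_loss(b, L, P_BAR)[0]
                  for p, b in zip(p_w, post_beliefs)))
print(voi)        # -> 0.0: no measure outcome changes the unconstrained action
print(voi_plus)   # -> -0.057...: the agent would pay to avoid this information
]]></p>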
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Regulatory design to promote well-balanced information collection</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Reasons for external epistemic constraints and their impact on information collection</head><p>Why are external epistemic constraints imposed on the decisionmaker? We assume they are imposed by society, through regulations, with the purpose of influencing and controlling private decisions towards a common good ( <ref type="bibr">[6]</ref> presents a recent analysis in the context of structural design). The actions taken by the agent have public economic consequences that society can assess. Once those consequences are summarized in cost function C, as illustrated in Fig. <ref type="figure">1</ref>, society can identify the basins of optimality corresponding to the optimal policy according to societal evaluation, following the approach outlined above. If societal cost C differs from agent's cost L, the optimal policy identified by society will differ from that identified by the decisionmaker. Assessing function L and predicting the agents' behavior, society can calibrate a regulation. This regulation can forbid certain actions to be taken under specific circumstances, namely under certain epistemic beliefs. By appropriate regulation, society can enforce the adoption of its optimal policy by the agent. However, what is the effect of this enforcement on information collection? Society assigns a value to any possible information, and it wishes information to be collected when this value is above its cost. For example, society would like relevant information to be always collected, as, from their perspective, it "never hurts". However, the VoI as assessed by the decision-maker acting under external epistemic constraints can be different, much higher of that assessed by society, or even negative, as we have seen above. So, from the standpoint of society, agents collect information that is too expensive to be collected, and avoid other relevant information.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Incentives and penalties for influencing decision making</head><p>Policy makers can influence the behavior of agents in many ways, e.g. by funding an education system promoting respect to other citizens and to specific societal values. Here we are considering a more direct enforcement method: a regulation enforcing penalties and incentives to make loss function L, summarizing the economic cost the agent has to pay depending on the system condition and the adopted action, consistent with societal function C, modeling the corresponding societal costs. For example, society could enforce an economic penalty in case of failure of the asset, or pass laws to increase the liability of the agent, thereby transferring some costs from C to L. Let Z be the economic penalty (if positive) or incentives (if negative) society poses to the agent.</p><p>Then, the overall loss for the agent is</p><p>, agent's and societal losses coincide. After the implementation of these corrections, epistemic constraints are redundant for the agent, as her optimal policy is aligned with that preferred by society. We note that in order to induce an agent's behavior consistent with societal desiderata, it is not necessary that C and L&#8242; are identical: it is sufficient that they differ for an arbitrary positive scaling factor &#947; &gt; 0 and an arbitrary offset &#946;. If so, the VoI assessed by the agent is just scaled by factor &#947; with respect to that assessed by society, and it has the same intuitive properties of the unconstrained VoI. The same effect can also be achieved by providing epistemic incentives and penalties, depending on the belief, instead of the state, e.g. a regulation can impose a penalty proportional to the probability of failure. Society can provide economic</p><p>if action a is taken under belief b: The adjusted expected loss function l a is:</p><p>Society can also provide incentives directly for promoting information collection (however, it is harder to conceive that society could put a penalty against buying almost irrelevant information).</p><p>Overall, given the results of Section 3, incentives and penalties should aim at obtaining a concave loss function for the agent, so as not to trigger undesirable effects related to information avoidance and over-evaluation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.">A maintenance problem under a single reliability constraint</head><p>We consider a simple decision making problem, following <ref type="bibr">[14]</ref>, where the agent has the responsibility for an asset whose state is binary: it can be intact (this is state =</p><p>x 1), or damaged and doomed to failure (this is state = x 2), so = X | | 2. Thus, as in the examples shown in Figs. <ref type="figure">4</ref><ref type="figure">5</ref><ref type="figure">6</ref>, the agent's belief is univocally defined by failure probability b &#960;, 2 , which we now indicate simply as P, for ease of notation. She has two alternative options: do-nothing (this is action a 1 ) or repair the asset (this is action a 2 ), so = A | | 2. If she does nothing and the asset is doomed to failure, she has to sustain cost of failure L F , but the failure can be avoided by paying repair cost L R .Thus her unconstrained optimal policy is to repair the asset only if P is above = P L L / R F , and optimal unconstrained loss is = l P L P P * ( ) min{ / , 1} R . We assume that society the cost of failure and repair as C F and C R respectively, so the policy imposed by society, consistent with its proper evaluation, is to repair the asset if = P P C C &#175;/ R F , and we assume that &lt; P P &#175;. So the constrained loss + l is identical to l* for P below P &#175;and above P , but is different from l* and equal to L R in the interval</p><p>M . The expected loss increment is:</p><p>So the increment is nonzero only in the range of belief between the threshold imposed by society and that identified by the unconstrained If we define</p><p>M as the probability that the posterior belief falls in &#937; M , and = &#181; P P</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>[ | ]</head><p>M as the expected belief in interval &#937; M , then the expected posterior loss increment due to the constraint is</p><p>, because &#916;l is an affine function in &#937; M , according to Eq. <ref type="bibr">(15)</ref>  l . As expected, variation &#916;VoI is bounded by l and &#916;l &#960; . That graph also reports the unconstrained VoI, and function VoI C , that will be introduced in the next Section. While these high values of failure probability are selected for the sake of readability of the graphs, the reader is referred to Pozzi et al. <ref type="bibr">[14]</ref> for a similar example with smaller probability values, often encountered in reliability analysis of civil engineering components.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.4.">A corresponding regulation design</head><p>In the setting of previous Section, the agent would avoid collecting information when belief P is below P &#175;and, actually, she would be willing to pay up to the 14% of the repair cost L R , to avoid the information being collected. In the limit case of almost irrelevant information (i.e. when &#963; &#603; approaches infinity) the agent can pay up to 35% of the repair cost (i.e., half of the discontinuity in &#916;l &#960; ) to avoid collecting information or to collect information, depending on the fact that her belief P is just below or just above P &#175;. If P &#175;is close to zero, and under specific scenarios for almost irrelevant information, this VoI can be as large as the repair cost.</p><p>With references to the problem outlined in Section 5.3, we assume society assesses the cost of failure as C F &gt; L F , while that of repair is consistent with that assigned by the agent: , for lowering the repair cost. More generally, let &#916;L F and &#916;L R be the penalty and the incentive for failure and repair, respectively. By imposing that the adjusted policy of the agent is consistent with that of society (i.e.</p><p>), we get one linear equation, to define those quantities. Another equation can come from the condition that expected penalties should balance expected incentives, so that the regulation is self-sustainable. To do so, we need to assume a distribution of prior belief P. Let us define</p><p>F the probability that the prior belief is below the threshold, and the optimal action is doing nothing,</p><p>is the expected probability of failure below the threshold, and</p><p>is the probability that a repair is needed. If so, incentives and penalties should be:</p><p>where = r &#181; P P / FR F F R . We note that the penalty can be enforced as an epistemic one, asking for payment &#916;L F P for an asset when the probability of failure is P. Again, constraints becomes redundant under this regulation. The adjusted optimal loss function for the agent is continuous and concave, so the properties of unconstrained VoI hold. The VoI assessed by the agent, that we can now call VoI v for clarity, is proportional to that assessed by society, which we call VoI c , and is less than that for society: , society can make the adjusted VoI as assessed by the agent consistent with VoI c . Properly, this regulation is not equilibrated: on one hand, the incentive for information collection is a cost for society, on the other hand, society gets a benefit from the better induced attitude towards information. This consideration could be embedded info a different (equilibrated) selection of incentive and penalty values, after some assumptions on information availability.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>We have investigated the effect of societal constraints on agents' attitudes toward information. While these constraints are effective in forcing agents to take decisions consistent with society's will, they can have unwanted "second-order" effects on information collection, if this activity is controlled by these agents and unconstrained. Risk-neutral agents will collect information only if its cost is below its value. However, the VoI assessed by agents whose preferences are not aligned with society will differ from that assessed by society itself. In the illustrated example, two undesirable outcomes can occur. First, an agent forced by the constraint to repair an asset can be willing to pay too much for escaping the constraint by collecting information, up to, in the limit case, the repair cost for receiving almost irrelevant information. So, paradoxically enough, the very availability of information makes things worse for society, as its members waste resources. Second, the economic effect can be of the same magnitude (but much higher in relative terms) when the constraint is currently inactive, because the asset is judged to be safe enough. In this latter case, the agent can prefer to avoid information and, again in the limit case, pay up to the repair cost to do so.</p><p>To overcome these undesirable induced behaviors, society can provide economic incentives and penalties to agents, to make the loss function concave for the decision makers. The calibration of such measures to make constraints redundant, and even economically selfequilibrated, would depend on specific assumptions about agents' preferences, and we have provided an analysis of a simple problem. Society can also provide incentives for information collection, as illustrated above. Also, societal codes can require collecting data (e.g., <ref type="bibr">[1]</ref>), and they could even prescribe to evaluate VoI according to a given formula, encoding the assessment from the societal standpoint, and force agents to buy information when its cost is below that threshold, but the implementation of such a requirement would likely be controversial.</p><p>The analysis presented in this paper can be relevant for the legal distinction between negligence and reckleness, and for the attribution of liability. Apart from proposing practical solutions for overcoming the problems of information avoidance and over-evaluation, the first goal of this paper was to clarify how mathematical properties of VoI are affected by external epistemic constraints, as this is key for many engineering applications, e.g. for the optimization of information collection.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>CReditT author statement</head><p>CReditT for manuscript "Information avoidance and overvaluation under epistemic constraints: principles and implications for regulatory policies", by <ref type="bibr">Matteo</ref>  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Declaration of Competing Interest</head><p>Conflict of Interest and Authorship Conformation Form, for manuscript "Information avoidance and overvaluation under epistemic constraints: principles and implications for regulatory policies", by Matteo Pozzi, Carl Malings &amp; Andreea Minca, submitted to RESS.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Appendix. Effects of almost-irrelevant information</head><p>While, trivially, even under external epistemic constraints, an independent observation with no effect on the agent's belief has no value at all, the possible value of slightly dependent information strongly depends on those constraints.</p><p>By "almost irrelevant information", we refer to a source of information that leaves the posterior belief almost identical to the prior one. Quantitatively, we can define a ball or radius &#603;, centered on the prior belief b &#960; , so that the posterior belief b &#969; is always inside that ball, for any realized measure. We analyze the case of "almost irrelevant information" by computing the limit for &#603; going to zero. Alternatively, we can define the trace t &#969; of the covariance matrix &#931; &#969; of the posterior belief b &#969; , and compute that limit for t &#969; going to zero.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Almost irrelevant information in unconstrained decision making</head><p>We formally prove that, without any external epistemic constraint, information that have almost-negligible effects on the agent's belief have also almost-negligible VoI. Let us consider the case when the optimal expected loss function l* is a quadratic form:</p><p>where = b , matrix Q is negative definite, as l* is concave, Let us suppose we can identify a parameter of standard deviation &#963; &#969; , so that the covariance matrix is proportional to the corresponding variance:</p><p>, and trace t &#969; is proportional to 2 . From Eq.A.3, the VoI is proportional to that variance:</p><p>): this shows that function VoI(&#963; &#969; ) is continuous at &#963; &#969; equal to zero, where it is zero. Hence, for any monetary value, arbitrarily small, we can find a corresponding value of &#963; &#969; so that the VoI is equal to that small value. Now, generally the expected loss function l* cannot be expressed by a quadratic form as in Eq. (A.1). However, suppose we can find a quadratic form that is never above l* in the reachable belief's domain &#937; Y , and it is equal to l* at prior belief b &#960; . In this setting, the previous formula is an upperbound for the VoI:</p><p>So we have proven, again, that the VoI cannot be high, when &#963; &#969; is small. Actually, the quadratic bound is ineffective for some points in the belief's domain: the kinks where the curvature is infinite. For these points, we can define a piece-wise linear lower-bound for l *.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Almost irrelevant information under external epistemic constraints</head><p>Under external epistemic constraints, function l + is not necessarily continuous. We now prove that if there is a discontinuity of value &#948; &gt; 0 in l + , then almost irrelevant information can have any value in the [ , ] interval. To do so, we focus on a simple one-dimensional case of the belief domain, without losing generality as, in a higher dimensional space, the argument can be repeated for a segment passing through the discontinuity. Let us consider this simple loss function l*: So the VoI can be as large as &#948;. By slightly changing the condition in A.5, so that l + is zero at b equal to zero, and reversing the sign of variable b &#969; , we can also prove that the VoI can be as small as .</p></div></body>
		</text>
</TEI>
