Title: Individual Fairness Revisited: Transferring Techniques from Adversarial Robustness

We turn the definition of individual fairness on its head: rather than ascertaining the fairness of a model given a predetermined metric, we find a metric for a given model that satisfies individual fairness. This facilitates discussion of a model's fairness and addresses the difficulty of specifying a suitable metric a priori. Our contributions are twofold. First, we introduce the definition of a minimal metric and characterize the behavior of models in terms of minimal metrics. Second, for more complicated models, we apply the mechanism of randomized smoothing from adversarial robustness to make them individually fair under a given weighted Lp metric. Our experiments show that adapting the minimal metrics of linear models to more complicated neural networks can lead to meaningful and interpretable fairness guarantees at little cost to utility.
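To give a feel for the randomized smoothing mechanism mentioned above, here is a minimal sketch of smoothing a classifier under a weighted L2 metric. The names (`model`, `w`) and the specific noise scaling are illustrative assumptions rather than the paper's exact construction: the smoothed prediction is a majority vote over Gaussian-perturbed inputs, with per-feature noise scaled inversely to the metric weights so that the prediction changes slowly along directions the weighted metric treats as close.

```python
import numpy as np

def smoothed_predict(model, x, w, sigma=1.0, n_samples=1000, seed=0):
    """Majority-vote prediction of `model` under per-feature Gaussian noise.

    model : callable mapping a batch of inputs (n, d) to class labels (n,)
    x     : single input of shape (d,)
    w     : positive per-feature weights (d,) defining the weighted metric
    sigma : base noise level
    """
    rng = np.random.default_rng(seed)
    # Heavier-weighted features receive less noise, so the smoothed prediction
    # is more stable with respect to the weighted metric.
    noise = rng.normal(0.0, sigma / w, size=(n_samples, x.shape[0]))
    labels = model(x + noise)
    values, counts = np.unique(labels, return_counts=True)
    top = counts.argmax()
    return values[top], counts[top] / n_samples  # class and empirical vote share
```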

 
Award ID(s):
1704845
NSF-PAR ID:
10238795
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Twenty-Ninth International Joint Conference on Artificial Intelligence
Page Range / eLocation ID:
437 to 443
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Roth, A (Ed.)
It is well understood that classification algorithms, for example, for deciding on loan applications, cannot be evaluated for fairness without taking context into account. We examine what can be learned from a fairness oracle equipped with an underlying understanding of “true” fairness. The oracle takes as input a (context, classifier) pair satisfying an arbitrary fairness definition, and accepts or rejects the pair according to whether the classifier satisfies the underlying fairness truth. Our principal conceptual result is an extraction procedure that learns the underlying truth; moreover, the procedure can learn an approximation to this truth given access to a weak form of the oracle. Since every “truly fair” classifier induces a coarse metric, in which those receiving the same decision are at distance zero from one another and those receiving different decisions are at distance one, this extraction process provides the basis for ensuring a rough form of metric fairness, also known as individual fairness. Our principal technical result is a higher-fidelity extractor under a mild technical constraint on the weak oracle's conception of fairness. Our framework permits the scenario in which many classifiers, with differing outcomes, may all be considered fair. Our results have implications for interpretability, a highly desired but poorly defined property of classification systems that endeavors to permit a human arbiter to reject classifiers deemed to be “unfair” or illegitimately derived.
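The coarse metric described above is simple to state in code; a minimal illustration, assuming a hypothetical decision function `f`:

```python
def coarse_metric(f, x, y):
    """Coarse metric induced by a classifier f: inputs receiving the same
    decision are at distance 0, inputs receiving different decisions at 1."""
    return 0.0 if f(x) == f(y) else 1.0
```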
  2. null (Ed.)
Expanding on previous work on automating functional modeling, we have developed a more informed automation approach by assigning a weighted confidence metric to the wide variety of data in a design repository. Our work focuses on automating what we call linear functional chains, which are a component-based section of a full functional model. We mine the Design Repository to find correlations between component, function, and flow. The automation algorithm we developed organizes these connections by component-function-flow frequency (CFF frequency), thus allowing the creation of linear functional chains. In previous work, we found that CFF frequency is the best metric for formulating the linear functional chain for an individual component; however, this metric did not account for prevalence and consistency in the Design Repository data. To better understand our data, we developed a new metric, which we refer to as weighted confidence, to provide insight into the fidelity of the data; it is calculated by taking the harmonic mean of two metrics we extracted from our data: prevalence and consistency. This method could be applied to any dataset with a wide range of individual occurrences. The contribution of this research is not to replace CFF frequency as a method of finding the most likely component-function-flow correlations, but to improve the reliability of the automation results by providing additional information from the weighted confidence metric. Improving these automation results allows us to further the ultimate objective of this research, which is to enable designers to automatically generate functional models for a product given its constituent components.
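The weighted confidence metric is described above as the harmonic mean of prevalence and consistency; a minimal sketch (function and argument names are illustrative):

```python
def weighted_confidence(prevalence, consistency):
    """Harmonic mean of prevalence and consistency (both expected in [0, 1])."""
    if prevalence == 0 or consistency == 0:
        return 0.0
    return 2 * prevalence * consistency / (prevalence + consistency)
```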
Abstract: Internal variability is the dominant cause of projection uncertainty of Arctic sea ice in the short and medium term. However, it is difficult to determine the realism of simulated internal variability in climate models, as observations only provide one possible realization while climate models can provide numerous different realizations. To enable a robust assessment of simulated internal variability of Arctic sea ice, we use a resampling technique to build synthetic ensembles for both observations and climate models, focusing on interannual variability, which is the dominant time scale of Arctic sea ice internal variability. We assess the realism of the interannual variability of Arctic sea ice cover as simulated by six models from phase 5 of the Coupled Model Intercomparison Project (CMIP5) that provide large ensembles, comparing them against four observational datasets. We augment the standard definition of model and observational consistency by representing the full distribution of resamplings, analogous to the distribution of variability that could have randomly occurred. We find that modeled interannual variability typically lies within observational uncertainty. The three models with the smallest mean-state biases are the only ones consistent in the pan-Arctic for all months, but no model is consistent for all regions and seasons. Hence, choosing the right model for a given task, as well as using internal variability as an additional metric to assess sea ice simulations, is important. The fact that CMIP5 large ensembles broadly simulate interannual variability within observational uncertainty gives confidence in the projection uncertainty from internal variability for Arctic sea ice based on these models. Significance Statement: The purpose of this study is to evaluate the historical simulated internal variability of Arctic sea ice in climate models. Determining model realism is important to have confidence in the projected sea ice evolution from these models, but so far the mean state and trends are the only commonly assessed metrics. Here we assess internal variability with a focus on interannual variability, which is the dominant time scale for internal variability. We find that, in general, models agree well with observations, but as no model is within observational uncertainty for all months and locations, choosing the right model for a given task is crucial. Further refinement of internal variability realism assessments will require reduced observational uncertainty.
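One way to picture the resampling idea above: bootstrap a single series of detrended annual anomalies into many synthetic realizations and look at the spread of an interannual-variability statistic. This is only a rough sketch under that assumption; the study's actual resampling procedure may differ in detail.

```python
import numpy as np

def synthetic_variability(anomalies, n_realizations=1000, seed=0):
    """Bootstrap-resample detrended annual anomalies to build synthetic
    realizations and return the distribution of interannual variability
    (standard deviation) across those realizations."""
    rng = np.random.default_rng(seed)
    anomalies = np.asarray(anomalies, dtype=float)
    n = anomalies.size
    stds = np.empty(n_realizations)
    for i in range(n_realizations):
        sample = rng.choice(anomalies, size=n, replace=True)
        stds[i] = sample.std(ddof=1)
    return stds  # spread against which a model's variability can be compared
```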
As recommender systems have become more widespread and moved into areas with greater social impact, such as employment and housing, researchers have begun to seek ways to ensure fairness in the results that such systems produce. This work has primarily focused on developing recommendation approaches in which fairness metrics are jointly optimized along with recommendation accuracy. However, previous work has largely ignored how individual preferences may limit the ability of an algorithm to produce fair recommendations. Furthermore, with few exceptions, researchers have only considered scenarios in which fairness is measured relative to a single sensitive feature or attribute (such as race or gender). In this paper, we present a re-ranking approach to fairness-aware recommendation that learns individual preferences across multiple fairness dimensions and uses them to enhance provider fairness in recommendation results. Specifically, we show that our opportunistic and metric-agnostic approach achieves a better trade-off between accuracy and fairness than prior re-ranking approaches and does so across multiple fairness dimensions.
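A rough sketch of what a greedy, metric-agnostic re-ranker across multiple fairness dimensions could look like; all names and the penalty form are illustrative assumptions, not the paper's algorithm. Each slot picks the item with the highest relevance minus a penalty for how much exposure its groups have already received along each dimension.

```python
def rerank(candidates, dimensions, lam=0.1, k=10):
    """Greedy re-ranking trading off relevance against group exposure.

    candidates : list of (item_id, relevance, {dimension: group}) tuples
    dimensions : fairness dimensions, e.g. ["provider", "genre"]
    lam        : strength of the exposure penalty
    """
    exposure = {d: {} for d in dimensions}  # counts of groups already ranked
    remaining = list(candidates)
    ranked = []
    while remaining and len(ranked) < k:
        def score(c):
            _, rel, groups = c
            penalty = sum(exposure[d].get(groups[d], 0) for d in dimensions)
            return rel - lam * penalty
        best = max(remaining, key=score)
        remaining.remove(best)
        ranked.append(best)
        for d in dimensions:
            g = best[2][d]
            exposure[d][g] = exposure[d].get(g, 0) + 1
    return ranked
```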
  5. Abstract

Network analysis of infectious disease in wildlife can reveal traits or individuals critical to pathogen transmission and help inform disease management strategies. However, estimates of contact between animals are notoriously difficult to acquire. Researchers commonly use telemetry technologies to identify animal associations, but such data may have different sampling intervals and often capture only a small subset of the population. The objectives of this study were to outline best practices for telemetry sampling in network studies of infectious disease by determining (a) the consequences of telemetry sampling on our ability to estimate network structure, (b) whether contact networks can be approximated using purely spatial contact definitions, and (c) how wildlife spatial configurations may influence telemetry sampling requirements.

    We simulated individual movement trajectories for wildlife populations using a home range‐like movement model, creating full location datasets and corresponding ‘complete’ networks. To mimic telemetry data, we created ‘sample’ networks by subsampling the population (10%–100% of individuals) with a range of sampling intervals (every minute to every 3 days). We varied the definition of contact for sample networks, using either spatiotemporal or spatial overlap, and varied the spatial configuration of populations (random, lattice or clustered). To compare complete and sample networks, we calculated seven network metrics important for disease transmission and assessed mean ranked correlation coefficients and percent error between complete and sample network metrics.

    Telemetry sampling severely reduced our ability to calculate global node‐level network metrics, but had less impact on local and network‐level metrics. Even so, in populations with infrequent associations, high intensity telemetry sampling may still be necessary. Defining contact in terms of spatial overlap generally resulted in overly connected networks, but in some instances, could compensate for otherwise coarse telemetry data.

    By synthesizing movement and disease ecology with computational approaches, we characterized trade‐offs important for using wildlife telemetry data beyond ecological studies of individual movement, and found that careful use of telemetry data has the potential to inform network models. Thus, with informed application of telemetry data, we can make significant advances in leveraging its use for a better understanding and management of wildlife infectious disease.
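A toy version of the complete-versus-sample comparison described above, assuming `networkx` and `scipy` are available: subsample a fraction of nodes (mimicking partial telemetry coverage), recompute a node-level metric on the induced subgraph, and measure how well the sampled values preserve the ranking from the complete network.

```python
import networkx as nx
import numpy as np
from scipy.stats import spearmanr

def sampled_metric_agreement(G, fraction=0.5, seed=0):
    """Subsample nodes, recompute degree centrality on the induced subgraph,
    and return the Spearman rank correlation with the complete-network
    values for the sampled nodes."""
    rng = np.random.default_rng(seed)
    nodes = list(G.nodes)
    sample = rng.choice(nodes, size=int(fraction * len(nodes)), replace=False)
    sub = G.subgraph(sample)
    full_dc = nx.degree_centrality(G)    # metric on the 'complete' network
    samp_dc = nx.degree_centrality(sub)  # metric on the 'sample' network
    full_vals = [full_dc[n] for n in sample]
    samp_vals = [samp_dc[n] for n in sample]
    rho, _ = spearmanr(full_vals, samp_vals)
    return rho

# Example with a random 'complete' contact network of 200 animals:
# G = nx.erdos_renyi_graph(200, 0.05, seed=1)
# print(sampled_metric_agreement(G, fraction=0.3))
```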

     