skip to main content

Search for: All records

Creators/Authors contains: "Korolova, Aleksandra"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. The prevalence and importance of algorithmic two-sided marketplaces has drawn attention to the issue of fairness in such settings. Algorithmic decisions are used in assigning students to schools, users to advertisers, and applicants to job interviews. These decisions should heed the preferences of individuals, and simultaneously be fair with respect to their merits (synonymous with fit, future performance, or need). Merits conditioned on observable features are always uncertain, a fact that is exacerbated by the widespread use of machine learning algorithms to infer merit from the observables. As our key contribution, we carefully axiomatize a notion of individual fairness in the two-sided marketplace setting which respects the uncertainty in the merits; indeed, it simultaneously recognizes uncertainty as the primary potential cause of unfairness and an approach to address it. We design a linear programming framework to find fair utility-maximizing distributions over allocations, and we show that the linear program is robust to perturbations in the estimated parameters of the uncertain merit distributions, a key property in combining the approach with machine learning techniques. 
    more » « less
    Free, publicly-accessible full text available July 23, 2024
  2. Free, publicly-accessible full text available June 12, 2024
  3. Social media platforms curate access to information and opportunities, and so play a critical role in shaping public discourse today. The opaque nature of the algorithms these platforms use to curate content raises societal questions. Prior studies have used black-box methods led by experts or collaborative audits driven by everyday users to show that these algorithms can lead to biased or discriminatory outcomes. However, existing auditing methods face fundamental limitations because they function independent of the platforms. Concerns of potential harmful outcomes have prompted proposal of legislation in both the U.S. and the E.U. to mandate a new form of auditing where vetted external researchers get privileged access to social media platforms. Unfortunately, to date there have been no concrete technical proposals to provide such auditing, because auditing at scale risks disclosure of users' private data and platforms' proprietary algorithms. We propose a new method for platform-supported auditing that can meet the goals of the proposed legislation. The first contribution of our work is to enumerate the challenges and the limitations of existing auditing methods to implement these policies at scale. Second, we suggest that limited, privileged access to relevance estimators is the key to enabling generalizable platform-supported auditing of social media platforms by external researchers. Third, we show platform-supported auditing need not risk user privacy nor disclosure of platforms' business interests by proposing an auditing framework that protects against these risks. For a particular fairness metric, we show that ensuring privacy imposes only a small constant factor increase (6.34x as an upper bound, and 4× for typical parameters) in the number of samples required for accurate auditing. Our technical contributions, combined with ongoing legal and policy efforts, can enable public oversight into how social media platforms affect individuals and society by moving past the privacy-vs-transparency hurdle. 
    more » « less
  4. We consider the problem of allocating divisible items among multiple agents, and consider the setting where any agent is allowed to introduce {\emph diversity constraints} on the items they are allocated. We motivate this via settings where the items themselves correspond to user ad slots or task workers with attributes such as race and gender on which the principal seeks to achieve demographic parity. We consider the following question: When an agent expresses diversity constraints into an allocation rule, is the allocation of other agents hurt significantly? If this happens, the cost of introducing such constraints is disproportionately borne by agents who do not benefit from diversity. We codify this via two desiderata capturing {\em robustness}. These are {\emph no negative externality} -- other agents are not hurt -- and {\emph monotonicity} -- the agent enforcing the constraint does not see a large increase in value. We show in a formal sense that the Nash Welfare rule that maximizes product of agent values is {\emph uniquely} positioned to be robust when diversity constraints are introduced, while almost all other natural allocation rules fail this criterion. We also show that the guarantees achieved by Nash Welfare are nearly optimal within a widely studied class of allocation rules. We finally perform an empirical simulation on real-world data that models ad allocations to show that this gap between Nash Welfare and other rules persists in the wild. 
    more » « less
  5. null (Ed.)
  6. null (Ed.)
  7. The Domain Name System (DNS) is used in every website visit and e-mail transmission, so privacy is an obvious concern. In DNS, users ask recursive resolvers (or ``recursives'') to make queries on their behalf. Prior analysis of DNS privacy focused on privacy risks to individual end-users, mainly in traffic between users and recursives. Recursives cache and aggregate traffic for many users, factors that are commonly assumed to protect end-user privacy above the recursive. We document \emph{institutional privacy} as a new risk posed by DNS data collected at authoritative servers, even after caching and aggregation by DNS recursives. We are the first to demonstrate this risk by looking at leaks of e-mail exchanges which show communications patterns, and leaks from accessing sensitive websites, both of which can harm an institution's public image. We define a methodology to identify queries from institutions and identify leaks. We show the current practices of prefix-preserving anonymization of IP addresses and aggregation above the recursive are not sufficient to protect institutional privacy, suggesting the need for novel approaches. We demonstrate this claim by applying our methodology to real-world traffic from DNS servers that use partial prefix-preserving anonymization. Our work prompts additional privacy considerations for institutions that run their own resolvers and authoritative server operators that log and share DNS data. 
    more » « less
  8. null (Ed.)
    Political campaigns are increasingly turning to targeted advertising platforms to inform and mobilize potential voters. The appeal of these platforms stems from their promise to empower advertisers to select (or "target") users who see their messages with great precision, including through inferences about those users' interests and political affiliations. However, prior work has shown that the targeting may not work as intended, as platforms' ad delivery algorithms play a crucial role in selecting which subgroups of the targeted users see the ads. In particular, the platforms can selectively deliver ads to subgroups within the target audiences selected by advertisers in ways that can lead to demographic skews along race and gender lines, and do so without the advertiser's knowledge. In this work we demonstrate that ad delivery algorithms used by Facebook, the most advanced targeted advertising platform, shape the political ad delivery in ways that may not be beneficial to the political campaigns and to societal discourse. In particular, the ad delivery algorithms lead to political messages on Facebook being shown predominantly to people who Facebook thinks already agree with the ad campaign's message even if the political advertiser targets an ideologically diverse audience. Furthermore, an advertiser determined to reach ideologically non-aligned users is non-transparently charged a high premium compared to their more aligned competitor, a difference from traditional broadcast media. Our results demonstrate that Facebook exercises control over who sees which political messages beyond the control of those who pay for them or those who are exposed to them. Taken together, our findings suggest that the political discourse's increased reliance on profit-optimized, non-transparent algorithmic systems comes at a cost of diversity of political views that voters are exposed to. Thus, the work raises important questions of fairness and accountability desiderata for ad delivery algorithms applied to political ads. 
    more » « less