skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, June 13 until 2:00 AM ET on Friday, June 14 due to maintenance. We apologize for the inconvenience.


Search for: All records

Creators/Authors contains: "Kleinberg, Jon"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Instant runoff voting (IRV) is an increasingly-popular alternative to traditional plurality voting in which voters submit rankings over the candidates rather than single votes. In practice, elections using IRV often restrict the ballot length, the number of candidates a voter is allowed to rank on their ballot. We theoretically and empirically analyze how ballot length can influence the outcome of an election, given fixed voter preferences. We show that there exist preference profiles over k candidates such that up to k-1 different candidates win at different ballot lengths. We derive exact lower bounds on the number of voters required for such profiles and provide a construction matching the lower bound for unrestricted voter preferences. Additionally, we characterize which sequences of winners are possible over ballot lengths and provide explicit profile constructions achieving any feasible winner sequence. We also examine how classic preference restrictions influence our results—for instance, single-peakedness makes k-1 different winners impossible but still allows at least Ω(√k). Finally, we analyze a collection of 168 real-world elections, where we truncate rankings to simulate shorter ballots. We find that shorter ballots could have changed the outcome in one quarter of these elections. Our results highlight ballot length as a consequential degree of freedom in the design of IRV elections. 
    more » « less
    Free, publicly-accessible full text available June 27, 2024
  2. null (Ed.)
    Poverty and economic hardship are understood to be highly complex and dynamic phenomena. Due to the multi-faceted nature of welfare, assistance programs targeted at alleviating hardship can face challenges, as they often rely on simpler welfare measurements, such as income or wealth, that fail to capture to full complexity of each family's state. Here, we explore one important dimension – susceptibility to income shocks. We introduce a model of welfare that incorporates income, wealth, and income shocks and analyze this model to show that it can vary, at times substantially, from measures of welfare that only use income or wealth. We then study the algorithmic problem of optimally allocating subsidies in the presence of income shocks. We consider two well-studied objectives: the first aims to minimize the expected number of agents that fall below a given welfare threshold (a min-sum objective) and the second aims to minimize the likelihood that the most vulnerable agent falls below this threshold (a min-max objective). We present optimal and near-optimal algorithms for various general settings. We close with a discussion on future directions on allocating societal resources and ethical implications of related approaches. 
    more » « less
  3. Data collection often involves the partial measurement of a larger system. A common example arises in collecting network data: we often obtain network datasets by recording all of the interactions among a small set of core nodes, so that we end up with a measurement of the network consisting of these core nodes along with a potentially much larger set of fringe nodes that have links to the core. Given the ubiquity of this process for assembling network data, it is crucial to understand the role of such a “core-fringe” structure. Here we study how the inclusion of fringe nodes affects the standard task of network link prediction. One might initially think the inclusion of any additional data is useful, and hence that it should be beneficial to include all fringe nodes that are available. However, we find that this is not true; in fact, there is substantial variability in the value of the fringe nodes for prediction. Once an algorithm is selected, in some datasets, including any additional data from the fringe can actually hurt prediction performance; in other datasets, including some amount of fringe information is useful before prediction performance saturates or even declines; and in further cases, including the entire fringe leads to the best performance. While such variety might seem surprising, we show that these behaviors are exhibited by simple random graph models. 
    more » « less
  4. A typical way in which network data is recorded is to measure all interactions involving a specified set of core nodes, which produces a graph containing this core together with a potentially larger set of fringe nodes that link to the core. Interactions between nodes in the fringe, however, are not present in the resulting graph data. For example, a phone service provider may only record calls in which at least one of the participants is a customer; this can include calls between a customer and a non-customer, but not between pairs of non-customers. Knowledge of which nodes belong to the core is crucial for interpreting the dataset, but this metadata is unavailable in many cases, either because it has been lost due to difficulties in data provenance, or because the network consists of “found data” obtained in settings such as counter-surveillance. This leads to an algorithmic problem of recovering the core set. Since the core is a vertex cover, we essentially have a planted vertex cover problem, but with an arbitrary underlying graph. We develop a framework for analyzing this planted vertex cover problem, based on the theory of fixed-parameter tractability, together with algorithms for recovering the core. Our algorithms are fast, simple to implement, and out-perform several baselines based on core-periphery structure on various real-world datasets. 
    more » « less
  5. There are many settings in which a principal performs a task by delegating it to an agent, who searches over possible solutions and proposes one to the principal. This describes many aspects of the workflow within organizations, as well as many of the activities undertaken by regulatory bodies, who often obtain relevant information from the parties being regulated through a process of delegation. A fundamental tension underlying delegation is the fact that the agent's interests will typically differ -- potentially significantly -- from the interests of the principal, and as a result the agent may propose solutions based on their own incentives that are inefficient for the principal. A basic problem, therefore, is to design mechanisms by which the principal can constrain the set of proposals they are willing to accept from the agent, to ensure a certain level of quality for the principal from the proposed solution. In this work, we investigate how much the principal loses -- quantitatively, in terms of the objective they are trying to optimize -- when they delegate to an agent. We develop a methodology for bounding this loss of efficiency, and show that in a very general model of delegation, there is a family of mechanisms achieving a universal bound on the ratio between the quality of the solution obtained through delegation and the quality the principal could obtain in an idealized benchmark where they searched for a solution themself. Moreover, it is possible to achieve such bounds through mechanisms with a natural threshold structure, which are thus structurally simpler than the optimal mechanisms typically considered in the literature on delegation. At the heart of our framework is an unexpected connection between delegation and the analysis of prophet inequalities, which we leverage to provide bounds on the behavior of our delegation mechanisms. 
    more » « less
  6. ABSTRACT. A typical way in which network data is recorded is to measure all the interactions among a specified set of core nodes; this produces a graph containing this core together with a potentially larger set of fringe nodes that have links to the core. Interactions between pairs of nodes in the fringe, however, are not recorded by this process, and hence not present in the resulting graph data. For example, a phone service provider may only have records of calls in which at least one of the participants is a customer; this can include calls between a customer and a non-customer, but not between pairs of non-customers. Knowledge of which nodes belong to the core is an important piece of metadata that is crucial for interpreting the network dataset. But in many cases, this metadata is not available, either because it has been lost due to difficulties in data provenance, or because the network consists of “found data” obtained in settings such as counter-surveillance. This leads to a natural algorithmic problem, namely the recovery of the core set. Since the core set forms a vertex cover of the graph, we essentially have a planted vertex cover problem, but with an arbitrary underlying graph. We develop a theoretical framework for analyzing this planted vertex cover problem, based on results in the theory of fixed- parameter tractability, together with algorithms for recovering the core. Our algorithms are fast, simple to implement, and out-perform several methods based on network core-periphery structure on various real-world datasets. 
    more » « less