Personal mobility data from mobile phones and other sensors are increasingly used to inform policymaking during pandemics, natural disasters, and other humanitarian crises. However, even aggregated mobility traces can reveal private information about individual movements to potentially malicious actors. This paper develops and tests an approach for releasing private mobility data, which provides formal guarantees over the privacy of the underlying subjects. Specifically, we (1) introduce an algorithm for constructing differentially private mobility matrices and derive privacy and accuracy bounds on this algorithm; (2) use real-world data from mobile phone operators in Afghanistan and Rwanda to show how this algorithm can enable the use of private mobility data in two high-stakes policy decisions: pandemic response and the distribution of humanitarian aid; and (3) discuss practical decisions that need to be made when implementing this approach, such as how to optimally balance privacy and accuracy. Taken together, these results can help enable the responsible use of private mobility data in humanitarian response.
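As a rough illustration of what a differentially private mobility matrix can look like, the sketch below adds Laplace noise to origin-destination trip counts. It is not the algorithm from the paper, whose details are not given in this abstract; it is the standard Laplace mechanism under stated assumptions. The function names (`private_od_matrix`, `laplace_noise`), the `(user_id, origin, destination)` trip format, and the `max_trips_per_user` contribution cap are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def laplace_noise(rng, scale):
    # The difference of two i.i.d. exponential draws with rate 1/scale
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_od_matrix(trips, regions, epsilon, max_trips_per_user=1, seed=None):
    """Return a noisy origin-destination count matrix as a dict.

    trips: iterable of (user_id, origin, destination) tuples (assumed format).
    regions: list of region labels; every origin/destination must appear in it.
    epsilon: privacy budget; a smaller epsilon means more noise.
    max_trips_per_user: assumed cap on each user's contribution, which
        bounds the L1 sensitivity of the count matrix.
    """
    rng = random.Random(seed)
    kept = Counter()
    counts = {(o, d): 0 for o in regions for d in regions}

    # Keep at most max_trips_per_user trips per user, so that adding or
    # removing one user changes the counts by at most that amount in total.
    for user, origin, dest in trips:
        if kept[user] < max_trips_per_user:
            kept[user] += 1
            counts[(origin, dest)] += 1

    # Laplace mechanism: noise scale = sensitivity / epsilon, applied to
    # every cell, including cells with a true count of zero.
    scale = max_trips_per_user / epsilon
    return {cell: c + laplace_noise(rng, scale) for cell, c in counts.items()}


if __name__ == "__main__":
    demo_trips = [("u1", "A", "B"), ("u2", "A", "B"), ("u3", "B", "A")]
    print(private_od_matrix(demo_trips, ["A", "B"], epsilon=1.0, seed=42))
```

The privacy-accuracy trade-off mentioned in the abstract shows up here through `epsilon`: a smaller value gives stronger privacy but noisier counts, and a tighter contribution cap lowers the noise at the cost of discarding some trips.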
-
Poverty maps derived from satellite imagery are increasingly used to inform high-stakes policy decisions, such as the allocation of humanitarian aid and the distribution of government resources. Such poverty maps are typically constructed by training machine learning algorithms on a relatively modest amount of “ground truth” data from surveys, and then predicting poverty levels in areas where imagery exists but surveys do not. Using survey and satellite data from ten countries, this paper investigates disparities in representation, systematic biases in prediction errors, and fairness concerns in satellite-based poverty mapping across urban and rural lines, and shows how these phenomena affect the validity of policies based on predicted maps. Our findings highlight the importance of careful error and bias analysis before using satellite-based poverty maps in real-world policy decisions.
-
A key challenge in the design of effective anti-poverty programs is determining who should be eligible for program benefits. In developing countries, one of the most common criteria is a Proxy Means Test, a simple decision rule that determines eligibility based on basic information about each household (for example, the number of rooms in the household, the number of children, whether there is indoor plumbing, and other observable characteristics) [1, 3, 4, 7]. At the core of each Proxy Means Test (PMT) is a machine learning algorithm that uses the short list of household characteristics to predict whether the household should be deemed poor, and therefore eligible, or non-poor, and therefore ineligible [5, 6].
-
Abstract The COVID-19 pandemic has devastated many low- and middle-income countries, causing widespread food insecurity and a sharp decline in living standards [1]. In response to this crisis, governments and humanitarian organizations worldwide have distributed social assistance to more than 1.5 billion people [2]. Targeting is a central challenge in administering these programmes: it remains a difficult task to rapidly identify those with the greatest need given available data [3,4]. Here we show that data from mobile phone networks can improve the targeting of humanitarian assistance. Our approach uses traditional survey data to train machine-learning algorithms to recognize patterns of poverty in mobile phone data; the trained algorithms can then prioritize aid to the poorest mobile subscribers. We evaluate this approach by studying a flagship emergency cash transfer program in Togo, which used these algorithms to disburse millions of US dollars' worth of COVID-19 relief aid. Our analysis compares outcomes, including exclusion errors, total social welfare and measures of fairness, under different targeting regimes. Relative to the geographic targeting options considered by the Government of Togo, the machine-learning approach reduces errors of exclusion by 4–21%. Relative to methods requiring a comprehensive social registry (a hypothetical exercise; no such registry exists in Togo), the machine-learning approach increases exclusion errors by 9–35%. These results highlight the potential for new data sources to complement traditional methods for targeting humanitarian assistance, particularly in crisis settings in which traditional data are missing or out of date.
-
Recent papers demonstrate that non-traditional data, from mobile phones and other digital sensors, can be used to roughly estimate the wealth of individual subscribers. This paper asks a question more directly relevant to development policy: Can non-traditional data be used to more efficiently target development aid? By combining rich survey data from a "big push" anti-poverty program in Afghanistan with detailed mobile phone logs from program beneficiaries, we study the extent to which machine learning methods can accurately differentiate ultra-poor households eligible for program benefits from other households deemed ineligible. We show that supervised learning methods leveraging mobile phone data can identify ultra-poor households as accurately as standard survey-based measures of poverty, including consumption and wealth; and that combining survey-based measures with mobile phone data produces classifications more accurate than those based on a single data source. We discuss the implications and limitations of these methods for targeting extreme poverty in marginalized populations.
-
Abstract Much of our current risk assessment, especially for extreme events and natural disasters, comes from the assumption that the likelihood of future extreme events can be predicted based on the past. However, as global temperatures rise, established climate ranges may no longer be applicable, as historic records for extremes such as heat waves and floods may no longer accurately predict the changing future climate. To assess extremes (present‐day and future) over the contiguous United States, we used NOAA's Climate Extremes Index (CEI), which evaluates extremes in maximum and minimum temperature, extreme one‐day precipitation, days without precipitation, and the Palmer Drought Severity Index (PDSI). The CEI is a spatially sensitive index that uses percentile‐based thresholds rather than absolute values to determine climate “extremeness” and is thus well‐suited to compare extreme climate across regions. We used regional climate model data from the North American Regional Climate Change Assessment Program (NARCCAP) to compare a late 20th century reference period to a mid‐21st century “business as usual” (SRES A2) greenhouse gas‐forcing scenario. Results show a universal increase in extreme hot temperatures across all models, with annual average maximum and minimum temperatures exceeding 90th percentile thresholds consistently across the continental United States. Results for precipitation indicators have greater spatial variability from model to model, but indicate an overall movement towards less frequent but more extreme precipitation days in the future. Due to this difference in response between temperature and precipitation, the mid‐21st century CEI is primarily an index of temperature extremes, with 90th percentile temperatures contributing disproportionately to the overall increase in climate extremeness. We also examine the efficacy of the PDSI in this context in comparison to other drought indices.