Title: Machine learning and phone data can improve targeting of humanitarian aid
Abstract: The COVID-19 pandemic has devastated many low- and middle-income countries, causing widespread food insecurity and a sharp decline in living standards [1]. In response to this crisis, governments and humanitarian organizations worldwide have distributed social assistance to more than 1.5 billion people [2]. Targeting is a central challenge in administering these programmes: it remains a difficult task to rapidly identify those with the greatest need given available data [3,4]. Here we show that data from mobile phone networks can improve the targeting of humanitarian assistance. Our approach uses traditional survey data to train machine-learning algorithms to recognize patterns of poverty in mobile phone data; the trained algorithms can then prioritize aid to the poorest mobile subscribers. We evaluate this approach by studying a flagship emergency cash transfer program in Togo, which used these algorithms to disburse millions of US dollars' worth of COVID-19 relief aid. Our analysis compares outcomes (including exclusion errors, total social welfare and measures of fairness) under different targeting regimes. Relative to the geographic targeting options considered by the Government of Togo, the machine-learning approach reduces errors of exclusion by 4–21%. Relative to methods requiring a comprehensive social registry (a hypothetical exercise; no such registry exists in Togo), the machine-learning approach increases exclusion errors by 9–35%. These results highlight the potential for new data sources to complement traditional methods for targeting humanitarian assistance, particularly in crisis settings in which traditional data are missing or out of date.
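As a rough illustration of the exclusion-error metric mentioned in the abstract, the sketch below ranks people by a predicted poverty score, selects the lowest-scoring share for aid, and reports the fraction of the truly poorest who are left out. The definition used here (equal-sized "truly poor" and "targeted" groups), the function name `exclusion_error`, and the toy numbers are assumptions made for illustration. They are not taken from the paper and do not reproduce its data or method.

```python
def exclusion_error(true_consumption, predicted_score, share):
    """Fraction of the truly poorest `share` of people who are NOT selected
    when aid goes to the `share` of people with the lowest predicted score.

    Assumed definition for illustration; lower values mean better targeting.
    """
    n = len(true_consumption)
    k = max(1, round(n * share))  # number of people the programme can reach
    # Indices of the k people with the lowest true consumption.
    truly_poor = set(sorted(range(n), key=lambda i: true_consumption[i])[:k])
    # Indices of the k people with the lowest predicted score.
    targeted = set(sorted(range(n), key=lambda i: predicted_score[i])[:k])
    return len(truly_poor - targeted) / k


# Hypothetical toy data: ten people, ordered by true consumption.
true_consumption = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
# Hypothetical model output that misranks the second-poorest person.
predicted_score = [1.5, 4.5, 2.5, 3.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

error = exclusion_error(true_consumption, predicted_score, share=0.3)
print(f"exclusion error: {error:.3f}")
```

In this toy case one of the three poorest people falls outside the targeted group, so the metric is one third. Comparing this value across scoring methods (for example a phone-based score versus a region-level average) is one simple way to compare targeting regimes.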
Award ID(s):
1942702
NSF-PAR ID:
10417644
Journal Name:
Nature
Volume:
603
Issue:
7903
ISSN:
0028-0836
Page Range / eLocation ID:
864 to 870
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Hundreds of millions of poor families receive some form of targeted social assistance. Many of these antipoverty programs involve some degree of geographic targeting, where aid is prioritized to the poorest regions of the country. However, policy makers in many low-resource settings lack the disaggregated poverty data required to make effective geographic targeting decisions. Using several independent datasets from Nigeria, this paper shows that high-resolution poverty maps, constructed by applying machine learning algorithms to satellite imagery and other nontraditional geospatial data, can improve the targeting of government cash transfers to poor families. Specifically, we find that geographic targeting relying on machine learning–based poverty maps can reduce errors of exclusion and inclusion relative to geographic targeting based on recent nationally representative survey data. This result holds for antipoverty programs that target both the poor and the extreme poor and for initiatives of varying sizes. We also find no evidence that machine learning–based maps increase targeting disparities by demographic groups, such as gender or religion. Based in part on these findings, the Government of Nigeria used this approach to geographically target emergency cash transfers in response to the COVID-19 pandemic. 
  2. Many critical policy decisions, from strategic investments to the allocation of humanitarian aid, rely on data about the geographic distribution of wealth and poverty. Yet many poverty maps are out of date or exist only at very coarse levels of granularity. Here we develop microestimates of the relative wealth and poverty of the populated surface of all 135 low- and middle-income countries (LMICs) at 2.4 km resolution. The estimates are built by applying machine-learning algorithms to vast and heterogeneous data from satellites, mobile phone networks, and topographic maps, as well as aggregated and deidentified connectivity data from Facebook. We train and calibrate the estimates using nationally representative household survey data from 56 LMICs and then validate their accuracy using four independent sources of household survey data from 18 countries. We also provide confidence intervals for each microestimate to facilitate responsible downstream use. These estimates are provided free for public use in the hope that they enable targeted policy response to the COVID-19 pandemic, provide the foundation for insights into the causes and consequences of economic development and growth, and promote responsible policymaking in support of sustainable development. 
  3. Recent papers demonstrate that non-traditional data, from mobile phones and other digital sensors, can be used to roughly estimate the wealth of individual subscribers. This paper asks a question more directly relevant to development policy: Can non-traditional data be used to more efficiently target development aid? By combining rich survey data from a "big push" anti-poverty program in Afghanistan with detailed mobile phone logs from program beneficiaries, we study the extent to which machine learning methods can accurately differentiate ultra-poor households eligible for program benefits from other households deemed ineligible. We show that supervised learning methods leveraging mobile phone data can identify ultra-poor households as accurately as standard survey-based measures of poverty, including consumption and wealth; and that combining survey-based measures with mobile phone data produces classifications more accurate than those based on a single data source. We discuss the implications and limitations of these methods for targeting extreme poverty in marginalized populations.
  4. Purpose: When a large-scale outbreak such as the COVID-19 pandemic happens, organizations that are responsible for delivering relief may face a lack of both provisions and human resources. Governments are the primary source for the humanitarian supplies required during such a crisis; however, coordination with humanitarian NGOs in handling such pandemics is a vital form of public-private partnership (PPP). Aid organizations have to consider not only the total degree of demand satisfaction in such cases but also the obligation that relief goods such as medicine and food should be distributed as equitably as possible within the affected areas (AAs). Design/methodology/approach: Given the challenges of acquiring real data associated with procuring relief items during the COVID-19 outbreak, a comprehensive simulation-based plan is used to generate 243 small, medium and large-sized problems with uncertain demand, and these problems are solved to optimality using GAMS. Finally, post-optimality analyses are conducted, and some useful managerial insights are presented. Findings: The results imply that given a reasonable measure of deprivation costs, it can be important for managers to focus less on the logistical costs of delivering resources and more on the value associated with quickly and effectively reducing the overall suffering of the affected individuals. It is also important for managers to recognize that even though deprivation costs and transportation costs are both increasing as the time horizon increases, the actual growth rate of the deprivation costs decreases over time. Originality/value: In this paper, a novel mathematical model is presented to minimize the total costs of delivering humanitarian aid for pandemic relief. With a focus on sustainability of operations, the model incorporates total transportation and delivery costs, the cost of utilizing the transportation fleet (transportation mode cost), and equity and deprivation costs. Taking social costs such as deprivation and equity costs into account, in addition to other important classic cost terms, enables managers to organize the best possible response when such outbreaks happen.
  5. Background: Even before the onset of the COVID-19 pandemic, children and adolescents were experiencing a mental health crisis, partly due to a lack of quality mental health services. The rate of suicide for Black youth has increased by 80%. By 2025, the health care system will be short of 225,000 therapists, further exacerbating the current crisis. Therefore, it is of utmost importance for providers, schools, youth mental health, and pediatric medical providers to integrate innovation in digital mental health to identify problems proactively and rapidly for effective collaboration with other health care providers. Such approaches can help identify robust, reproducible, and generalizable predictors and digital biomarkers of treatment response in psychiatry. Among the many digital innovations currently aimed at identifying biomarkers for psychiatric diseases as part of the macrolevel digital health transformation, speech stands out as an attractive candidate because it is affordable, noninvasive, and nonintrusive. Objective: The protocol aims to develop speech-emotion recognition algorithms leveraging artificial intelligence/machine learning, which can establish a link between trauma, stress, and voice types, including disrupting speech-based characteristics, and detect clinically relevant emotional distress and functional impairments in children and adolescents. Methods: Informed by theoretical foundations (the Theory of Psychological Trauma Biomarkers and Archetypal Voice Categories), we developed our methodology to focus on 5 emotions: anger, happiness, fear, neutral, and sadness. Participants will be recruited from 2 local mental health centers that serve urban youths. Speech samples, along with responses to the Symptom and Functioning Severity Scale, Patient Health Questionnaire 9, and Adverse Childhood Experiences scales, will be collected using an Android mobile app. Our model development pipeline is informed by Gaussian mixture models (GMMs), recurrent neural networks, and long short-term memory networks. Results: We tested our model with a public data set. The GMM with 128 clusters showed an evenly distributed accuracy across all 5 emotions. Using utterance-level features, the GMM achieved an accuracy of 79.15% overall, while frame selection increased accuracy to 85.35%. This demonstrates that the GMM is a robust model for emotion classification of all 5 emotions and that emotion frame selection enhances accuracy, which is significant for scientific evaluation. Recruitment and data collection for the study were initiated in August 2021 and are currently underway. The study results are likely to be available and published in 2024. Conclusions: This study contributes to the literature as it addresses the need for speech-focused digital health tools to detect clinically relevant emotional distress and functional impairments in children and adolescents. The preliminary results show that our algorithm has the potential to improve outcomes. The findings will contribute to the broader digital health transformation. International Registered Report Identifier (IRRID): DERR1-10.2196/46970