NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Measuring sensitivity to social distancing behavior during the COVID-19 pandemic

https://doi.org/10.1038/s41598-022-20198-4

Kontokosta, Constantine E.; Hong, Boyeong; Bonczak, Bartosz J. (September 2022, Scientific Reports)

Abstract Social distancing remains an effective nonpharmaceutical behavioral interventions to limit the spread of COVID-19 and other airborne diseases, but monitoring and enforcement create nontrivial challenges. Several jurisdictions have turned to “311” resident complaint platforms to engage the public in reporting social distancing non-compliance, but differences in sensitivity to social distancing behaviors can lead to a mis-allocation of resources and increased health risks for vulnerable communities. Using hourly visit data to designated establishments and more than 71,000 social distancing complaints in New York City during the first wave of the pandemic, we develop a method, derived from the Weber-Fechner law, to quantify neighborhood sensitivity and assess how tolerance to social distancing infractions and complaint reporting behaviors vary with neighborhood characteristics. We find that sensitivity to non-compliance is lower in minority and low-income neighborhoods, as well as in lower density areas, resulting in fewer reported complaints than expected given measured levels of overcrowding.
more » « less
Estimating Reporting Bias in 311 Complaint Data

Boxer, Kate S; Hong, Boyeong; Kontokosta, Constantine E; Neill, Daniel B (June 2025, The annals of applied statistics)

Systems such as “311” enable residents of a community to report on their environments and to request non-emergency municipal services. While such systems provide an important link between community and government, resident-generated data suffer from reporting bias, with some subpopulations reporting at lower rates than others. Our research focuses on defining the under-reporting of heating and hot water problems to New York City’s 311 system and developing methods to estimate under-reporting. First, we estimate non-reporting by fitting a latent variable model which estimates both the probability of an underlying heating problem conditional on building characteristics, and the probability of reporting a problem conditional on population characteristics. Second, we analyze “less-than-expected” reporting: buildings with fewer 311 calls than expected as compared to similarly-sized buildings with similar estimated problem durations. Together, these analyses determine neighborhoods and neighborhood-level socioeconomic characteristics that are predictive of under-reporting of heating and hot water problems. Our approaches can aid government agencies wishing to use resident-generated data to assist in constructing fair public policies.
more » « less
Free, publicly-accessible full text available June 1, 2026
Socio-spatial inequality and the effects of density on COVID-19 transmission in US cities

https://doi.org/10.1038/s44284-023-00008-2

Kontokosta, Constantine E; Hong, Boyeong; Bonczak, Bartosz J (January 2024, Nature Cities)

Full Text Available
The measure and mismeasure of fairness

Corbett-Davies, Sam; Gaebler, Johann D.; Nilforoshan, Hamed; Shroff, Ravi; Goel, Sharad (August 2023, Journal of machine learning research)
Weinberger, Kilian (Ed.)
The field of fair machine learning aims to ensure that decisions guided by algorithms are equitable. Over the last decade, several formal, mathematical definitions of fairness have gained prominence. Here we first assemble and categorize these definitions into two broad families: (1) those that constrain the effects of decisions on disparities; and (2) those that constrain the effects of legally protected characteristics, like race and gender, on decisions. We then show, analytically and empirically, that both families of definitions typically result in strongly Pareto dominated decision policies. For example, in the case of college admissions, adhering to popular formal conceptions of fairness would simultaneously result in lower student-body diversity and a less academically prepared class, relative to what one could achieve by explicitly tailoring admissions policies to achieve desired outcomes. In this sense, requiring that these fairness definitions hold can, perversely, harm the very groups they were designed to protect. In contrast to axiomatic notions of fairness, we argue that the equitable design of algorithms requires grappling with their context-specific consequences, akin to the equitable design of policy. We conclude by listing several open challenges in fair machine learning and offering strategies to ensure algorithms are better aligned with policy goals.
more » « less
Full Text Available
Provable detection of propagating sampling bias in prediction models

Ravishankar, Pavan; Mo, Qingyu; McFowland III, Edward; Neill, Daniel B. (January 2023, Proceedings of the 37th AAAI Conference on Artificial Intelligence)

With an increased focus on incorporating fairness in machine learning models, it becomes imperative not only to assess and mitigate bias at each stage of the machine learning pipeline but also to understand the downstream impacts of bias across stages. Here we consider a general, but realistic, scenario in which a predictive model is learned from (potentially biased) training data, and model predictions are assessed post-hoc for fairness by some auditing method. We provide a theoretical analysis of how a specific form of data bias, differential sampling bias, propagates from the data stage to the prediction stage. Unlike prior work, we evaluate the downstream impacts of data biases quantitatively rather than qualitatively and prove theoretical guarantees for detection. Under reasonable assumptions, we quantify how the amount of bias in the model predictions varies as a function of the amount of differential sampling bias in the data, and at what point this bias becomes provably detectable by the auditor. Through experiments on two criminal justice datasets– the well-known COMPAS dataset and historical data from NYPD’s stop and frisk policy– we demonstrate that the theoretical results hold in practice even when our assumptions are relaxed.
more » « less
Full Text Available
Pretrial release judgments and decision fatigue

Shroff, Ravi; Vamvourellis, Konstantinos (November 2022, Judgment and decision making)

Field studies in many domains have found evidence of decision fatigue, a phenomenon describing how decision quality can be impaired by the act of making previous decisions. Debate remains, however, over posited psychological mechanisms underlying decision fatigue, and the size of effects in high-stakes settings. We examine an extensive set of pretrial arraignments in a large, urban court system to investigate how judicial release and bail decisions are influenced by the time an arraignment occurs. We find that release rates decline modestly in the hours before lunch and before dinner, and these declines persist after statistically adjusting for an extensive set of observed covariates. However, we find no evidence that arraignment time affects pretrial release rates in the remainder of each decision-making session. Moreover, we find that release rates remain unchanged after a meal break even though judges have the opportunity to replenish their mental and physical resources by resting and eating. In a complementary analysis, we find that the rate at which judges concur with prosecutorial bail requests does not appear to be influenced by either arraignment time or a meal break. Taken together, our results imply that to the extent that decision fatigue plays a role in pretrial release judgments, effects are small and inconsistent with previous explanations implicating psychological depletion processes.
more » « less
Full Text Available
Efficient Optimization of Partition Scan Statistics via the Consecutive Partitions Property

https://doi.org/10.1080/10618600.2022.2077351

Pehlivanian, Charles A.; Neill, Daniel B. (January 2022, Journal of Computational and Graphical Statistics)

We generalize the spatial and subset scan statistics from the single to the multiple subset case. The two main approaches to defining the log-likelihood ratio statistic in the single subset case—the population-based and expectation-based scan statistics—are considered, leading to risk partitioning and multiple cluster detection scan statistics, respectively. We show that, for distributions in a separable exponential family, the risk partitioning scan statistic can be expressed as a scaled f-divergence of the normalized count and baseline vectors, and the multiple cluster detection scan statistic as a sum of scaled Bregman divergences. In either case, however, maximization of the scan statistic by exhaustive search over all partitionings of the data requires exponential time. To make this optimization computationally feasible, we prove sufficient conditions under which the optimal partitioning is guaranteed to be consecutive. This Consecutive Partitions Property generalizes the linear-time subset scanning property from two partitions (the detected subset and the remaining data elements) to the multiple partition case. While the number of consecutive partitionings of n elements into t partitions scales as O(n^(t−1)), making it computationally expensive for large t, we present a dynamic programming approach which identifies the optimal consecutive partitioning in O(n^2 t) time, thus allowing for the exact and efficient solution of large-scale risk partitioning and multiple cluster detection problems. Finally, we demonstrate the detection performance and practical utility of partition scan statistics using simulated and real-world data. Supplementary materials for this article are available online.
more » « less
Full Text Available
Nonparametric Subset Scanning for Detection of Heteroscedasticity

https://doi.org/10.1080/10618600.2022.2026779

Doss, Charles R.; McFowland, Edward (January 2022, Journal of Computational and Graphical Statistics)

Full Text Available
Commentary on “Causal Decision Making and Causal Effect Estimation Are Not the Same… and Why It Matters”

https://doi.org/10.1287/ijds.2021.0010

McFowland, Edward (January 2022, INFORMS Journal on Data Science)

Full Text Available
Causal Conceptions of Fairness and their Consequences

Nilforoshan, Hamed; Gaebler, Johann D.; Shroff, Ravi; Goel, Sharad (January 2022, Proceedings of the 39th International Conference on Machine Learning)
Chaudhuri, Kamalika; Jegelka, Stefanie; Song, Le; Szepesvari, Csaba; Niu, Gang; Sabato, Sivan (Ed.)
Recent work highlights the role of causality in designing equitable decision-making algorithms. It is not immediately clear, however, how existing causal conceptions of fairness relate to one another, or what the consequences are of using these definitions as design principles. Here, we first assemble and categorize popular causal definitions of algorithmic fairness into two broad families: (1) those that constrain the effects of decisions on counterfactual disparities; and (2) those that constrain the effects of legally protected characteristics, like race and gender, on decisions. We then show, analytically and empirically, that both families of definitions almost always—in a measure theoretic sense—result in strongly Pareto dominated decision policies, meaning there is an alternative, unconstrained policy favored by every stakeholder with preferences drawn from a large, natural class. For example, in the case of college admissions decisions, policies constrained to satisfy causal fairness definitions would be disfavored by every stakeholder with neutral or positive preferences for both academic preparedness and diversity. Indeed, under a prominent definition of causal fairness, we prove the resulting policies require admitting all students with the same probability, regardless of academic qualifications or group membership. Our results highlight formal limitations and potential adverse consequences of common mathematical notions of causal fairness.
more » « less
Full Text Available

Search for: All records