This content will become publicly available on June 1, 2026

Title: Estimating Reporting Bias in 311 Complaint Data
Systems such as “311” enable residents of a community to report on their environments and to request non-emergency municipal services. While such systems provide an important link between community and government, resident-generated data suffer from reporting bias, with some subpopulations reporting at lower rates than others. Our research focuses on defining the under-reporting of heating and hot water problems to New York City’s 311 system and developing methods to estimate under-reporting. First, we estimate non-reporting by fitting a latent variable model which estimates both the probability of an underlying heating problem conditional on building characteristics, and the probability of reporting a problem conditional on population characteristics. Second, we analyze “less-than-expected” reporting: buildings with fewer 311 calls than expected as compared to similarly sized buildings with similar estimated problem durations. Together, these analyses determine neighborhoods and neighborhood-level socioeconomic characteristics that are predictive of under-reporting of heating and hot water problems. Our approaches can aid government agencies wishing to use resident-generated data to assist in constructing fair public policies.
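The two-part structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the logistic links, the coefficient vectors `beta` and `gamma`, the feature vectors, and the assumption that a report can occur only when a problem exists are all hypothetical choices made for illustration. Under those assumptions, the probability of observing a report is the product of the probability of an underlying problem (given building characteristics) and the probability of reporting it (given population characteristics).

```python
import math
from typing import Sequence, Tuple


def sigmoid(t: float) -> float:
    """Logistic link (an assumed choice, not taken from the source)."""
    return 1.0 / (1.0 + math.exp(-t))


def report_probability(
    building_x: Sequence[float],
    population_z: Sequence[float],
    beta: Sequence[float],
    gamma: Sequence[float],
) -> Tuple[float, float, float]:
    """Return (P(problem | x), P(report | problem, z), P(observed report)).

    Hypothetical sketch: the problem is latent, and a report is observed
    only if a problem exists AND the residents choose to report it.
    """
    # Probability of an underlying problem, from building characteristics.
    p_problem = sigmoid(sum(b * x for b, x in zip(beta, building_x)))
    # Probability of reporting given a problem, from population characteristics.
    p_report = sigmoid(sum(g * z for g, z in zip(gamma, population_z)))
    # Observed reporting probability is the product of the two stages.
    return p_problem, p_report, p_problem * p_report


# Usage with made-up features and coefficients (illustrative values only).
p_problem, p_report, p_observed = report_probability(
    building_x=[1.0, 0.8],    # e.g. intercept term, scaled building age
    population_z=[1.0, -0.5], # e.g. intercept term, a scaled demographic index
    beta=[0.2, 1.1],
    gamma=[0.4, 0.9],
)
print(f"P(problem)={p_problem:.3f}, P(report|problem)={p_report:.3f}, "
      f"P(observed report)={p_observed:.3f}")
```

The point of the sketch is that a low observed reporting rate is ambiguous on its own: it can come from a low problem probability or from a low reporting probability, which is why the two components must be estimated jointly.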
Award ID(s):
2040898 1926470
PAR ID:
10589600
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Institute of Mathematical Statistics
Date Published:
Journal Name:
The Annals of Applied Statistics
ISSN:
1932-6157
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Social distancing remains an effective nonpharmaceutical behavioral intervention to limit the spread of COVID-19 and other airborne diseases, but monitoring and enforcement create nontrivial challenges. Several jurisdictions have turned to “311” resident complaint platforms to engage the public in reporting social distancing non-compliance, but differences in sensitivity to social distancing behaviors can lead to a misallocation of resources and increased health risks for vulnerable communities. Using hourly visit data to designated establishments and more than 71,000 social distancing complaints in New York City during the first wave of the pandemic, we develop a method, derived from the Weber-Fechner law, to quantify neighborhood sensitivity and assess how tolerance to social distancing infractions and complaint reporting behaviors vary with neighborhood characteristics. We find that sensitivity to non-compliance is lower in minority and low-income neighborhoods, as well as in lower density areas, resulting in fewer reported complaints than expected given measured levels of overcrowding.
  2. Interest in earthquake resilience has increased in recent years, and the use of building cluster performance objectives has been shown to be an effective method for evaluating the resilience of the built environment. A building cluster is a portfolio of buildings that share the same role in a community; its performance objectives are defined by considering earthquake scenarios, hazard levels, and individual building performance. The methodology presented in this paper employs performance-based assessments to estimate the probability of achieving building cluster performance objectives immediately following a seismic event. It can be used to assess the immediate post-earthquake community resilience in five steps: 1) hazard analysis, 2) conditional assessment of individual building performance, 3) conditional assessment of building cluster performance, 4) building cluster performance assessment by aggregation, and 5) earthquake resilience assessment of building clusters considering all hazard levels of interest. The design and extreme hazard levels are formulated using ground motion records selected based on the conditional spectra considering characteristics of earthquake scenarios and spatial correlation. Three performance objectives are defined for both individual buildings and building clusters: functionality, safe and usable during repair, and collapse prevention. Two engineering demand parameters – the maximum transient and the permanent interstory drift indices – are used to estimate individual building performance. The probability of achieving a building cluster performance objective is calculated using the total probability theorem. The application of the proposed methodology is demonstrated using two clusters of reinforced concrete buildings, corresponding to ASCE 7 Risk Category II and IV structures, in San Francisco, CA.
  3. The local government’s continuous support is critical for the well-being of a community during disaster events. E-Government systems that establish and maintain ongoing connections with the community thus play a vital role in supporting crisis response and recovery. Such systems’ ability to adapt to the crisis circumstances and to address emergent needs helps them continue their fundamental functions during disasters. Considering various services might require different amounts and types of resources, prioritization strategies are helpful in determining the processing order of requests. This paper discusses the role of prioritizing services within an e-Government system, to better understand how such a system can be managed to best utilize available resources. The study examines how a well-functioning e-Government system, the Orange County, Florida 311 non-emergency service system, responded to the COVID-19 pandemic and how the changes in service operations requirements can affect service provision, specifically with respect to assigning or re-assigning priority levels. 
  4. Microbes form multispecies communities that play essential roles in our environment and health. Not surprisingly, there is an increasing need for understanding if certain invader species will modify a given microbial community, producing either a desired or undesired change in the observed collection of resident species. However, the complex interactions that species can establish between each other and the diverse external factors underlying their dynamics have made constructing such understanding context-specific. Here we integrate tractable theoretical systems with tractable experimental systems to find general conditions under which non-resident species can change the collection of resident communities (game-changing species). We show that non-resident colonizers are more likely to be game-changers than transients, whereas game-changers are more likely to suppress than to promote resident species. Importantly, we find general heuristic rules for game-changers under controlled environments by integrating mutual invasibility theory with in vitro experimental systems, and general heuristic rules under changing environments by integrating structuralist theory with in vivo experimental systems. Despite the strong context-dependency of microbial communities, our work shows that under an appropriate integration of tractable theoretical and experimental systems, it is possible to unveil regularities that can then be potentially extended to understand the behavior of complex natural communities.
  5. Carruthers, John; Duncan, Natasha; He, Canfei; Zhu, Shengjun (Ed.)
    This paper illustrates the application of machine learning algorithms in predictive analytics for local governments using administrative data. The developed and tested machine learning predictive algorithm overcomes known limitations of the conventional ordinary least squares method. Such limitations include, but are not limited to, imposed linearity, presumed causality with independent variables as presumed causes and dependent variables as presumed results, likely high multicollinearity among features, and spatial autocorrelation. The study applies the algorithms to 311 non-emergency service requests in the context of Miami-Dade County. The algorithms are applied to predict the volume of 311 service requests and the community characteristics affecting the volume across Census tract neighborhoods. Four common families of algorithms and an ensemble of them are applied. They are random forest, support vector machines, lasso and elastic-net regularized generalized linear models, and extreme gradient boosting. Two feature selection methods, namely Boruta and fscaret, are applied to identify the significant community characteristics. The results show that the machine learning algorithms capture spatial autocorrelation and clustering. The features generated by fscaret algorithms are parsimonious in predicting the 311 service request volume.