Search for: All records

Award ID contains: 1758808

  1. Abstract

    In this paper, we propose a flexible nested error regression small area model with a high-dimensional parameter that incorporates heterogeneity in regression coefficients and variance components. We develop a new robust small area-specific estimating equations method that allows appropriate pooling of a large number of areas in estimating small area-specific model parameters. We propose a parametric bootstrap method and a jackknife method to estimate not only the mean squared errors but also other commonly used uncertainty measures such as standard errors and coefficients of variation. We conduct both model-based and design-based simulation experiments and a real-life data analysis to evaluate the proposed methodology.

     
  2. Summary

    Computerised record linkage methods help us combine multiple data sets from different sources when a single data set with all necessary information is unavailable or when data collection on additional variables is time consuming and extremely costly. Linkage errors are inevitable in the linked data set because of the unavailability of error‐free unique identifiers. Even a small number of linkage errors can lead to substantial bias and increased variability in estimating parameters of a statistical model. In this paper, we propose a unified theory for statistical analysis with linked data. Our proposed method, unlike the ones available for secondary data analysis of linked data, exploits record linkage process data as an alternative to taking a costly sample to evaluate error rates from the record linkage procedure. A jackknife method is introduced to estimate the bias, covariance matrix and mean squared error of our proposed estimators. Simulation results are presented to evaluate the performance of the proposed estimators that account for linkage errors.

     
  3. Goal 1 of the 2030 Agenda for Sustainable Development, adopted by all United Nations member States in 2015, is to end poverty in all forms everywhere. The major indicator for monitoring the goal is the so-called headcount ratio or poverty rate, i.e., the proportion or percentage of people living in poverty. In India, where nearly a quarter of the population still lives below the poverty line, poverty needs to be monitored more closely and at shorter intervals (e.g., every year) to evaluate the effectiveness of the planning, programs and actions taken by the governments to eradicate poverty. Poverty rate computation for India depends on two basic ingredients: rural and urban poverty lines for different states and union territories, and average Monthly Per-capita Consumer Expenditure (MPCE). While MPCE can be obtained every year, usually from the Consumer Expenditure Survey on shorter schedules, with a few exceptions where the information is obtained from another survey, determination of poverty lines is a highly complex, costly and time-consuming process. Poverty lines are essentially determined by a panel of experts who draw their conclusions partly from their subjective opinions and partly from data from multiple sources. The main data source the panel uses is the Consumer Expenditure Survey data with a detailed schedule, which are usually available every five years or so. In this paper, we undertake a feasibility study to explore whether estimates of headcount ratios or poverty ratios in intervening years can be provided in the absence of poverty lines by relating poverty ratios to average MPCE through a statistical model. We can then use the fitted model to predict poverty rates for intervening years based on average MPCE. In this work, we explore a few models using Bayesian methodology. We call this ‘synthetic prediction’ because it rests on the synthetic assumption of model invariance over years, often used in the small area literature.

    While the data-based assessment of our Bayesian synthetic prediction procedure is encouraging, there is great potential for improving the models presented in this paper, e.g., by incorporating more auxiliary data as they become available. In any case, we expect that our preliminary work in this important area will encourage researchers to think about statistical modeling as a possible way to at least partially solve a problem for which no objective solution is currently available.
  4. Abstract

    Understanding the impacts of pandemics on public health and related societal issues at granular levels is of great interest. COVID-19 is affecting everyone around the globe, and mask wearing is one of the few precautions against it. To quantify people’s perception of mask effectiveness and to prevent the spread of COVID-19 for small areas, we use Understanding America Study’s (UAS) survey data on COVID-19 as our primary data source. Our data analysis shows that direct survey-weighted estimates for small areas could be highly unreliable. In this paper, we develop a synthetic estimation method to estimate proportions of perceived mask effectiveness for small areas using a logistic model that combines information from multiple data sources. We select our working model using an extensive data analysis facilitated by a new variable selection criterion for survey data and benchmarking ratios. We suggest a jackknife method to estimate the variance of our estimator. From our data analysis, it is evident that our proposed synthetic method outperforms the direct survey-weighted estimator with respect to commonly used evaluation measures.
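
Several of the abstracts above use a jackknife to estimate bias, variance or mean squared error. As a rough illustration only, the textbook delete-one jackknife can be sketched as follows. This is not the specific jackknife developed in any of the listed papers (those are tailored to small area models, linked data and survey estimators), and the function name `delete_one_jackknife` and its arguments are hypothetical names chosen for this sketch.

```python
from statistics import fmean
from typing import Callable, Sequence, Tuple


def delete_one_jackknife(
    data: Sequence[float],
    estimator: Callable[[Sequence[float]], float],
) -> Tuple[float, float, float]:
    """Generic delete-one jackknife (illustrative; not the papers' methods).

    Returns (full-sample estimate, jackknife bias estimate,
    jackknife variance estimate) for a scalar estimator.
    """
    n = len(data)
    if n < 2:
        raise ValueError("the jackknife needs at least two observations")

    # Estimate computed on the full sample.
    full = estimator(data)

    # Leave-one-out replicates: recompute the estimator with observation i removed.
    replicates = [
        estimator(list(data[:i]) + list(data[i + 1:])) for i in range(n)
    ]
    replicate_mean = fmean(replicates)

    # Standard delete-one jackknife formulas for bias and variance.
    bias = (n - 1) * (replicate_mean - full)
    variance = (n - 1) / n * sum((r - replicate_mean) ** 2 for r in replicates)
    return full, bias, variance


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0, 5.0]
    estimate, bias, variance = delete_one_jackknife(sample, fmean)
    print(estimate, bias, variance)
```

For the sample mean, the jackknife variance from this sketch reduces to the usual sample variance divided by the sample size, which gives a simple sanity check before applying the same idea to a more complicated estimator.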