Title: Survey Data Quality in Analyzing Harmonized Indicators of Protest Behavior: A Survey Data Recycling Approach
This article proposes a new approach to analyzing protest participation measured in surveys of uneven quality. Because single international survey projects cover only a fraction of the world’s nations in specific periods, researchers increasingly turn to ex-post harmonization of different survey data sets not a priori designed as comparable. However, very few scholars systematically examine the impact of survey data quality on substantive results. We argue that the variation in source data, especially deviations from standards of survey documentation, data processing, and computer files—proposed by methodologists of Total Survey Error, Survey Quality Monitoring, and Fitness for Intended Use—is important for analyzing protest behavior. In particular, we apply the Survey Data Recycling framework to investigate the extent to which indicators of attending demonstrations and signing petitions in 1,184 national survey projects are associated with measures of data quality, controlling for variability in the questionnaire items. We demonstrate that the null hypothesis of no impact of measures of survey quality on indicators of protest participation must be rejected. Measures of survey documentation, data processing, and computer records, taken together, explain over 5% of the intersurvey variance in the proportions of the populations attending demonstrations or signing petitions.
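To make the analytic setup concrete, the following is a minimal sketch, not the authors' code, of the kind of survey-level comparison the abstract describes: regress the proportion reporting protest on questionnaire-item controls, then add the data-quality measures and inspect the gain in explained intersurvey variance. All file and column names (e.g., pct_demonstrated, doc_quality) are hypothetical.

```python
# Minimal sketch (hypothetical data layout): one row per national survey,
# with the proportion reporting demonstration attendance, item controls,
# and survey-quality indicators.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("sdr_survey_level.csv")  # assumed file name

# Baseline: questionnaire-item variability only.
base = smf.ols("pct_demonstrated ~ C(item_wording) + C(answer_scale)",
               data=df).fit()

# Add measures of documentation, processing, and computer-file quality.
full = smf.ols("pct_demonstrated ~ C(item_wording) + C(answer_scale)"
               " + doc_quality + processing_errors + file_errors",
               data=df).fit()

# Share of intersurvey variance attributable to the quality measures.
print(f"R^2 gain from quality controls: {full.rsquared - base.rsquared:.3f}")
```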
Award ID(s):
1738502
NSF-PAR ID:
10337336
Author(s) / Creator(s):
Date Published:
Journal Name:
American Behavioral Scientist
Volume:
66
Issue:
4
ISSN:
0002-7642
Page Range / eLocation ID:
412 to 433
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A common claim about the affluent democracies is that protest is trending upward, becoming more legitimate and more widely used by all political contenders. In the new democracies, protest is seen as having contributed to democratization, but growing apathy has led to protest decline, while in authoritarian regimes protest may be spurring further democratization. Assessing these ideas requires comparative trend data covering 15 or more years, but constructing such data confronts problems. The major problem is that the most widely available survey item asks “have you ever joined (lawful) demonstrations,” making it difficult to determine when the reported protest behavior occurred. We advance a novel method for timing these “ever” responses by focusing on young adults (aged 18-23 years), who are likely reporting on participation within the past 5 years. Drawing on the Survey Data Recycling harmonized data set, we use a multilevel model including harmonization and survey quality controls to create predicted probabilities for young adult participation (576 surveys, 119 countries, 1966-2010). Aggregating these to country-year rate estimates, we find they compare favorably with overlapping estimates from surveys asking about “the past 5 years or so” and with event data from the PolDem project. Harmonization and survey quality controls improve these predicted values. These data provide 15+ year trend estimates for 60 countries, which we use to illustrate the possibilities of estimating comparative protest trends.
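The timing strategy can be sketched in a few lines. The article fits a multilevel model; the plain logistic regression below, with hypothetical column names, stands in for it to show the two-step logic: predict young adults' "ever" responses with harmonization and quality controls, then aggregate predicted probabilities into country-year rates.

```python
# Simplified sketch of the young-adult timing approach (assumed columns:
# ever_demonstrated, item_wording, doc_quality, country, year).
import pandas as pd
import statsmodels.formula.api as smf

resp = pd.read_csv("sdr_young_adults.csv")  # respondents aged 18-23

# The article uses a multilevel model; a single-level logit with
# harmonization and survey-quality controls is used here for illustration.
m = smf.logit("ever_demonstrated ~ C(item_wording) + doc_quality",
              data=resp).fit()

# Aggregate predicted probabilities to country-year participation rates.
resp["p_hat"] = m.predict(resp)
rates = resp.groupby(["country", "year"])["p_hat"].mean()
print(rates.head())
```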
  2. As active involvement in protest has been legitimized as an acceptable form of political activity, citizens’ protest potential has become an important measure for understanding contemporary democratic politics. However, the arbitrary use of a forced-choice question, which prevents those who have previously participated in protests from expressing willingness to engage in future protest, and the limited coverage of international surveys across countries and years have impeded comparative research on protest potential. This study develops a new systematic weighting method for measuring protest potential in comparative research. Using the 1996 International Social Survey Program survey, which asks two separate questions about “have done” and “would do” demonstrations, I create a weighting scale for the forced-choice question by estimating the predicted probabilities of protest potential for those who have already participated in demonstrations. Capitalizing on the survey data recycling framework, this study also controls for harmonization procedures and the quality of surveys, thereby expanding the cross-national and temporal coverage beyond the affluent Western democracies. The results show that this weighting scale provides a valid measure of protest potential and that the survey data recycling framework improves comparability between surveys.
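The weighting logic lends itself to a short sketch. Everything below is hypothetical, with variable names not taken from the article: fit a model of "would do" among ISSP 1996 respondents who report "have done," then score "have done" answers in a forced-choice survey with the predicted probability.

```python
# Hypothetical sketch of the weighting scale for forced-choice items.
import pandas as pd
import statsmodels.formula.api as smf

issp = pd.read_csv("issp_1996.csv")  # assumed extract with named columns

# Among past demonstrators, model willingness to demonstrate again.
doers = issp[issp["have_done_demo"] == 1]
m = smf.logit("would_do_demo ~ age + C(education)", data=doers).fit()

# Forced-choice survey: assumed to carry the same covariates so the
# fitted model can score its respondents.
fc = pd.read_csv("forced_choice_survey.csv")
fc["protest_potential"] = fc["would_do"].astype(float)
mask = fc["have_done"] == 1
fc.loc[mask, "protest_potential"] = m.predict(fc.loc[mask])
```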
  3. Response time (RT) – the time elapsing from the beginning of the reading of a given question until the start of the next question – is a potentially important indicator of data quality that can be reliably measured for all questions in a computer-administered survey using a latent timer (i.e., one triggered automatically by moving on to the next question). In interviewer-administered surveys, RTs index data quality by capturing the entire length of time spent on a question–answer sequence, including interviewer question-asking behaviors and respondent question-answering behaviors. Consequently, longer RTs may indicate longer processing or interaction on the part of the interviewer, respondent, or both. RTs are an indirect measure of data quality; they do not directly measure reliability or validity, and we do not directly observe what factors lengthen the administration time. In addition, RTs that are either too long or too short could signal a problem (Ehlen, Schober, and Conrad 2007). However, studies that link components of RTs (interviewers’ question reading and response latencies) to interviewer and respondent behaviors that index data quality strengthen the claim that RTs indicate data quality (Bergmann and Bristle 2019; Draisma and Dijkstra 2004; Olson, Smyth, and Kirchner 2019). In general, researchers tend to treat longer RTs as signaling processing problems for the interviewer, respondent, or both (Couper and Kreuter 2013; Olson and Smyth 2015; Yan and Olson 2013; Yan and Tourangeau 2008). Previous work demonstrates that RTs are associated with various characteristics of interviewers (where applicable), questions, and respondents in web, telephone, and face-to-face interviews (e.g., Couper and Kreuter 2013; Olson and Smyth 2015; Yan and Tourangeau 2008). We replicate and extend this research by examining how RTs are associated with various question characteristics and several established tools for evaluating questions. We also examine whether increased interviewer experience in the study shortens RTs for questions with characteristics that affect the complexity of the interviewer’s task (i.e., interviewer instructions and parenthetical phrases). We examine these relationships in a sample of racially diverse respondents who answered questions about participation in medical research and their health.
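The latent-timer definition of RT is straightforward to compute from paradata that record a per-question onset timestamp; a minimal sketch with assumed column names is below.

```python
# RT for question k = onset of question k+1 minus onset of question k,
# per interview case. Assumed columns: case_id, question_order, onset_ts.
import pandas as pd

para = pd.read_csv("paradata.csv")
para["onset_ts"] = pd.to_datetime(para["onset_ts"])
para = para.sort_values(["case_id", "question_order"])

para["rt_seconds"] = (
    para.groupby("case_id")["onset_ts"].shift(-1) - para["onset_ts"]
).dt.total_seconds()  # last question per case is NaN by construction
```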
  4. The SDR Database v.2.0 (SDR2) is a multi-country, multi-year database for research on political participation, social capital, and well-being. It comprises harmonized information from 23 international survey projects, covering over 4.4 million respondents from 156 countries in the period 1966–2017. SDR2 provides both target variables and methodological indicators that store source-survey and ex-post harmonization metadata. SDR2 consists of three datasets: the MASTER file, which stores harmonized information for a total of 4,402,489 respondents; the auxiliary PLUG-SURVEY file, which contains controls for source data quality and a set of technical variables needed for merging it with the MASTER file; and the PLUG-COUNTRY file, a dictionary of countries and territories used in the MASTER file. An overall description of the SDR2 Database and detailed information about its datasets are available in the SDR2 documentation. SDR2 is a product of the project Survey Data Recycling: New Analytic Framework, Integrated Database, and Tools for Cross-national Social, Behavioral and Economic Research, financed by the US National Science Foundation (PTE Federal award 1738502). We thank the Ohio State University and the Institute of Philosophy and Sociology, Polish Academy of Sciences, for organizational support.
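In practice, an analysis starts by linking the three files. A minimal pandas sketch follows; the merge keys and file names are placeholders, since the actual technical linking variables and file formats are specified in the SDR2 documentation.

```python
# Hypothetical sketch of assembling an analysis file from SDR2.
import pandas as pd

master = pd.read_stata("SDR2_MASTER.dta")            # assumed file names
plug_survey = pd.read_stata("SDR2_PLUG_SURVEY.dta")
plug_country = pd.read_stata("SDR2_PLUG_COUNTRY.dta")

# Attach survey-level quality controls to each respondent record.
df = master.merge(plug_survey, on=["survey_name", "survey_edition"],
                  how="left")  # placeholder keys

# Resolve country codes against the dictionary of countries/territories.
df = df.merge(plug_country, on="country_code", how="left")  # placeholder key
```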
  5. The explosive growth in citizen science combined with a recalcitrance on the part of mainstream science to fully embrace this data collection technique demands a rigorous examination of the factors influencing data quality and project efficacy. Patterns of contributor effort and task performance have been well reviewed in online projects; however, studies of hands-on citizen science are lacking. We used a single hands-on, out-of-doors project—the Coastal Observation and Seabird Survey Team (COASST)—to quantitatively explore the relationships among participant effort, task performance, and social connectedness as a function of the demographic characteristics and interests of participants, placing these results in the context of a meta-analysis of 54 citizen science projects. Although online projects were typified by high (>90%) rates of one-off participation and low retention (<10%) past 1 y, regular COASST participants were highly likely to continue past their first survey (86%), with 54% active 1 y later. Project-wide, task performance was high (88% correct species identifications over the 31,450 carcasses and 163 species found). However, there were distinct demographic differences. Age, birding expertise, and previous citizen science experience had the greatest impact on participant persistence and performance, albeit occasionally in opposite directions. Gender and sociality were relatively inconsequential, although highly gregarious social types, i.e., “nexus people,” were extremely influential at recruiting others. Our findings suggest that hands-on citizen science can produce high-quality data especially if participants persist, and that understanding the demographic data of participation could be used to maximize data quality and breadth of participation across the larger societal landscape.