Title: Factors Associated with Interviewers’ Evaluations of Respondents’ Performance in Telephone Interviews: Behavior, Response Quality Indicators, and Characteristics of Respondents and Interviewers

Interviewers’ postinterview evaluations of respondents’ performance (IEPs) are paradata, used to describe the quality of the data obtained from respondents. IEPs are driven by a combination of factors, including respondents’ and interviewers’ sociodemographic characteristics and what actually transpires during the interview. However, relatively few studies examine how IEPs are associated with features of the response process, including facets of the interviewer-respondent interaction and patterns of responding that index data quality. We examine whether features of the response process—various respondents’ behaviors and response quality indicators—are associated with IEPs in a survey with a diverse set of respondents focused on barriers and facilitators to participating in medical research. We also examine whether there are differences in IEPs across respondents’ and interviewers’ sociodemographic characteristics. Our results show that both respondents’ behaviors and response quality indicators predict IEPs, indicating that IEPs reflect what transpires in the interview. In addition, interviewers appear to approach the task of evaluating respondents with differing frameworks, as evidenced by the variation in IEPs attributable to interviewers and associations between IEPs and interviewers’ gender. Further, IEPs were associated with respondents’ education and ethnoracial identity, net of respondents’ behaviors, response quality indicators, and sociodemographic characteristics of respondents and interviewers. Future research should continue to build on studies that examine the correlates of IEPs to better inform whether, when, and how to use IEPs as paradata about the quality of the data obtained.

Publisher / Repository:
Oxford University Press
Journal Name:
Public Opinion Quarterly
Page Range / eLocation ID:
p. 480-506
Sponsoring Org:
National Science Foundation
More Like this
  1. Response time (RT) – the time elapsing from the beginning of question reading for a given question until the start of the next question – is a potentially important indicator of data quality that can be reliably measured for all questions in a computer-administered survey using a latent timer (i.e., triggered automatically by moving on to the next question). In interviewer-administered surveys, RTs index data quality by capturing the entire length of time spent on a question–answer sequence, including interviewer question-asking behaviors and respondent question-answering behaviors. Consequently, longer RTs may indicate longer processing or interaction on the part of the interviewer, respondent, or both. RTs are an indirect measure of data quality; they do not directly measure reliability or validity, and we do not directly observe what factors lengthen the administration time. In addition, either too long or too short RTs could signal a problem (Ehlen, Schober, and Conrad 2007). However, studies that link components of RTs (interviewers’ question reading and response latencies) to interviewer and respondent behaviors that index data quality strengthen the claim that RTs indicate data quality (Bergmann and Bristle 2019; Draisma and Dijkstra 2004; Olson, Smyth, and Kirchner 2019). In general, researchers tend to consider longer RTs as signaling processing problems for the interviewer, respondent, or both (Couper and Kreuter 2013; Olson and Smyth 2015; Yan and Olson 2013; Yan and Tourangeau 2008). Previous work demonstrates that RTs are associated with various characteristics of interviewers (where applicable), questions, and respondents in web, telephone, and face-to-face interviews (e.g., Couper and Kreuter 2013; Olson and Smyth 2015; Yan and Tourangeau 2008). We replicate and extend this research by examining how RTs are associated with various question characteristics and several established tools for evaluating questions. 
    We also examine whether increased interviewer experience in the study shortens RTs for questions with characteristics that impact the complexity of the interviewer’s task (i.e., interviewer instructions and parenthetical phrases). We examine these relationships in the context of a sample of racially diverse respondents who answered questions about participation in medical research and their health.
  2. Brenner, P. S. (Ed.)
    Features of the survey measurement process may affect responses from respondents in various racial, ethnic, or cultural groups in different ways. When responses from multiethnic populations are combined, such variability in responding could increase variable error or bias results. The current study examines the survey response process among Black and White respondents answering questions about trust in medical researchers and participation in medical research. Using transcriptions from telephone interviews, we code a rich set of behaviors produced by respondents that past research has shown to be associated with measurement error, including long question-answer sequences, uncodable answers, requests for repetition or clarification, affective responses, and tokens. In analysis, we test for differences between Black and White respondents in the likelihood with which behaviors occur and examine whether the behaviors vary by specific categorizations of the questions, including whether the questions are racially focused. Overall, we find that White respondents produce more behaviors that indicate cognitive processing problems for racially focused questions, which may be interpreted as demonstrating a “cultural” difference in the display of cognitive processing and interaction. Data are provided by the 2013–2014 Voices Heard Survey, a computer-assisted telephone survey designed to measure respondents’ perceptions of barriers and facilitators to participating in medical research. 
  3. Ethnoracial identity refers to the racial and ethnic categories that people use to classify themselves and others. How it is measured in surveys has implications for understanding inequalities. Yet how people self-identify may not conform to the categories standardized survey questions use to measure ethnicity and race, leading to potential measurement error. In interviewer-administered surveys, answers to survey questions are achieved through interviewer–respondent interaction. An analysis of interviewer–respondent interaction can illuminate whether, when, how, and why respondents experience problems with questions. In this study, we examine how indicators of interviewer–respondent interactional problems vary across ethnoracial groups when respondents answer questions about ethnicity and race. Further, we explore how interviewers respond in the presence of these interactional problems. Data are provided by the 2013–2014 Voices Heard Survey, a computer-assisted telephone survey designed to measure perceptions of participating in medical research among an ethnoracially diverse sample of respondents.

  4. Summary

    When potential survey respondents decide whether or not to participate in a telephone interview, they may consider what it would be like to converse with the interviewer who is currently inviting them to respond, e.g. how he or she sounds, speaks and interacts. In the study that is reported here, we examine the effect of three interactional speech behaviours on the outcome of survey invitations: interviewer fillers (e.g. ‘um’ and ‘uh’), householders’ backchannels (e.g. ‘uh huh’ and ‘I see’) and simultaneous speech or ‘overspeech’ between interviewer and householder. We examine how these behaviours are related to householders’ decisions to participate (agree), to decline the invitation (refusal) or to defer the decision (scheduled call-back) in a corpus of 1380 audiorecorded survey invitations (contacts). Agreement was highest when interviewers were moderately disfluent—neither robotic nor so disfluent as to appear incompetent. Further, household members produced more backchannels, a behaviour which is often assumed to reflect a listener’s engagement, when they ultimately agreed to participate than when they refused. Finally, there was more simultaneous speech in contacts where householders ultimately refused to participate; however, interviewers interrupted household members more when they ultimately scheduled a call-back, seeming to pre-empt householders’ attempts to refuse. We discuss implications for hiring and training interviewers, as well as the development of automated speech interviewing systems.

  5. Abstract

    Asking questions fluently, exactly as worded, and at a reasonable pace is a fundamental part of a survey interviewer’s role. Doing so allows the question to be asked as intended by the researcher and may decrease the risk of measurement error and contribute to rapport. Despite the central importance placed on reading questions exactly as worded, interviewers commonly misread questions, and it is not always clear why. Thus, understanding the risk of measurement error requires understanding how different interviewers, respondents, and question features may trigger question reading problems. In this article, we evaluate the effects of question features on question asking behaviors, controlling for interviewer and respondent characteristics. We also examine how question asking behaviors are related to question-asking time. Using two nationally representative telephone surveys in the United States, we find that longer questions and questions with transition statements are less likely to be read exactly and fluently, that questions with higher reading levels and parentheticals are less likely to be read exactly across both surveys and that disfluent readings decrease as interviewers gain experience across the field period. Other question characteristics vary in their associations with the outcomes across the two surveys. We also find that inexact and disfluent question readings are longer, but read at a faster pace, than exact and fluent question reading. We conclude with implications for interviewer training and questionnaire design.
