skip to main content


Title: Data Exclusion in Policy Survey and Questionnaire Data: Aberrant Responses and Missingness

Data preprocessing is an integral step prior to analyzing data in psychological science, with implications for its potentially guiding policy. This article reports how psychological researchers address data preprocessing or quality concerns, with a focus on aberrant responses and missing data in self-report measures. 240 articles were sampled from four journals: Psychological Science, Journal of Personality and Social Psychology, Developmental Psychology, and Abnormal Psychology from 2012 to 2018. Nearly half of the studies did not report any missing data treatment (111/240; 46.25%), and if they did, the most common approach was listwise deletion (71/240; 29.6%). Studies that remove data due to missingness removed, on average, 12% of the sample. Likewise, most studies do not report any aberrant responses (194/240; 80%), but if they did, they classified 4% of the sample as suspect. Most studies are either not transparent enough about their data preprocessing steps or may be leveraging suboptimal procedures. Recommendations can improve transparency and data quality.

 
more » « less
Award ID(s):
1853166
NSF-PAR ID:
10401828
Author(s) / Creator(s):
 ;  ;  ;  
Publisher / Repository:
SAGE Publications
Date Published:
Journal Name:
Policy Insights from the Behavioral and Brain Sciences
Volume:
10
Issue:
1
ISSN:
2372-7322
Page Range / eLocation ID:
p. 11-17
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Two primary goals of psychological science should be to understand what aspects of human psychology are universal and the way that context and culture produce variability. This requires that we take into account the importance of culture and context in the way that we write our papers and in the types of populations that we sample. However, most research published in our leading journals has relied on sampling WEIRD (Western, educated, industrialized, rich, and democratic) populations. One might expect that our scholarly work and editorial choices would by now reflect the knowledge that Western populations may not be representative of humans generally with respect to any given psychological phenomenon. However, as we show here, almost all research published by one of our leading journals,Psychological Science, relies on Western samples and uses these data in an unreflective way to make inferences about humans in general. To take us forward, we offer a set of concrete proposals for authors, journal editors, and reviewers that may lead to a psychological science that is more representative of the human condition.

     
    more » « less
  2. Abstract

    Soil microbial communities play critical roles in various ecosystem processes, but studies at a large spatial and temporal scale have been challenging due to the difficulty in finding the relevant samples in available data sets as well as the lack of standardization in sample collection and processing. The National Ecological Observatory Network (NEON) has been collecting soil microbial community data multiple times per year for 47 terrestrial sites in 20 eco‐climatic domains, producing one of the most extensive standardized sampling efforts for soil microbial biodiversity to date. Here, we introduce the neonMicrobe R package—a suite of downloading, preprocessing, data set assembly, and sensitivity analysis tools for NEON’s newly published 16S and ITS amplicon sequencing data products which characterize soil bacterial and fungal communities, respectively. neonMicrobe is designed to make these data more accessible to ecologists without assuming prior experience with bioinformatic pipelines. We describe quality control steps used to remove quality‐flagged samples, report on sensitivity analyses used to determine appropriate quality filtering parameters for the DADA2 workflow, and demonstrate the immediate usability of the output data by conducting standard analyses of soil microbial diversity. The sequence abundance tables produced byneonMicrobecan be linked to NEON’s other data products (e.g., soil physical and chemical properties, plant community composition) and soil subsamples archived in the NEON Biorepository. We provide recommendations for incorporatingneonMicrobeinto reproducible scientific workflows, discuss technical considerations for large‐scale amplicon sequence analysis, and outline future directions for NEON‐enabled microbial ecology. In particular, we believe that NEON marker gene sequence data will allow researchers to answer outstanding questions about the spatial and temporal dynamics of soil microbial communities while explicitly accounting for scale dependence. We expect that the data produced by NEON and theneonMicrobeR package will act as a valuable ecological baseline to inform and contextualize future experimental and modeling endeavors.

     
    more » « less
  3. Scientists who perform major survival surgery on laboratory animals face a dual welfare and methodological challenge: how to choose surgical anesthetics and post-operative analgesics that will best control animal suffering, knowing that both pain and the drugs that manage pain can all affect research outcomes. Scientists who publish full descriptions of animal procedures allow critical and systematic reviews of data, demonstrate their adherence to animal welfare norms, and guide other scientists on how to conduct their own studies in the field. We investigated what information on animal pain management a reasonably diligent scientist might find in planning for a successful experiment. To explore how scientists in a range of fields describe their management of this ethical and methodological concern, we scored 400 scientific articles that included major animal survival surgeries as part of their experimental methods, for the completeness of information on anesthesia and analgesia. The 400 articles (250 accepted for publication pre-2011, and 150 in 2014–15, along with 174 articles they reference) included thoracotomies, craniotomies, gonadectomies, organ transplants, peripheral nerve injuries, spinal laminectomies and orthopedic procedures in dogs, primates, swine, mice, rats and other rodents. We scored articles for Publication Completeness (PC), which was any mention of use of anesthetics or analgesics; Analgesia Use (AU) which was any use of post-surgical analgesics, and Analgesia Completeness (a composite score comprising intra-operative analgesia, extended post-surgical analgesia, and use of multimodal analgesia). 338 of 400 articles were PC. 98 of these 338 were AU, with some mention of analgesia, while 240 of 338 mentioned anesthesia only but not postsurgical analgesia. Journals’ caliber, as measured by their 2013 Impact Factor, had no effect on PC or AU. We found no effect of whether a journal instructs authors to consult the ARRIVE publishing guidelines published in 2010 on PC or AC for the 150 mouse and rat articles in our 2014–15 dataset. None of the 302 articles that were silent about analgesic use included an explicit statement that analgesics were withheld, or a discussion of how pain management or untreated pain might affect results. We conclude that current scientific literature cannot be trusted to present full detail on use of animal anesthetics and analgesics. We report that publication guidelines focus more on other potential sources of bias in experimental results, under-appreciate the potential for pain and pain drugs to skew data, PLOS ONE | DOI:10.1371/journal.pone.0155001 May 12, 2016 1 / 24 a11111 OPEN ACCESS Citation: Carbone L, Austin J (2016) Pain and Laboratory Animals: Publication Practices for Better Data Reproducibility and Better Animal Welfare. PLoS ONE 11(5): e0155001. doi:10.1371/journal. pone.0155001 Editor: Chang-Qing Gao, Central South University, CHINA Received: December 29, 2015 Accepted: April 22, 2016 Published: May 12, 2016 Copyright: © 2016 Carbone, Austin. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Data Availability Statement: All relevant data are within the paper and its Supporting Information files. Authors may be contacted for further information. Funding: This study was funded by the United States National Science Foundation Division of Social and Economic Sciences. Award #1455838. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist. and thus mostly treat pain management as solely an animal welfare concern, in the jurisdiction of animal care and use committees. At the same time, animal welfare regulations do not include guidance on publishing animal data, even though publication is an integral part of the cycle of research and can affect the welfare of animals in studies building on published work, leaving it to journals and authors to voluntarily decide what details of animal use to publish. We suggest that journals, scientists and animal welfare regulators should revise current guidelines and regulations, on treatment of pain and on transparent reporting of treatment of pain, to improve this dual welfare and data-quality deficiency. 
    more » « less
  4. The COVID-19 pandemic has dramatically altered family life in the United States. Over the long duration of the pandemic, parents had to adapt to shifting work conditions, virtual schooling, the closure of daycare facilities, and the stress of not only managing households without domestic and care supports but also worrying that family members may contract the novel coronavirus. Reports early in the pandemic suggest that these burdens have fallen disproportionately on mothers, creating concerns about the long-term implications of the pandemic for gender inequality and mothers’ well-being. Nevertheless, less is known about how parents’ engagement in domestic labor and paid work has changed throughout the pandemic, what factors may be driving these changes, and what the long-term consequences of the pandemic may be for the gendered division of labor and gender inequality more generally.

    The Study on U.S. Parents’ Divisions of Labor During COVID-19 (SPDLC) collects longitudinal survey data from partnered U.S. parents that can be used to assess changes in parents’ divisions of domestic labor, divisions of paid labor, and well-being throughout and after the COVID-19 pandemic. The goal of SPDLC is to understand both the short- and long-term impacts of the pandemic for the gendered division of labor, work-family issues, and broader patterns of gender inequality.

    Survey data for this study is collected using Prolifc (www.prolific.co), an opt-in online platform designed to facilitate scientific research. The sample is comprised U.S. adults who were residing with a romantic partner and at least one biological child (at the time of entry into the study). In each survey, parents answer questions about both themselves and their partners. Wave 1 of SPDLC was conducted in April 2020, and parents who participated in Wave 1 were asked about their division of labor both prior to (i.e., early March 2020) and one month after the pandemic began. Wave 2 of SPDLC was collected in November 2020. Parents who participated in Wave 1 were invited to participate again in Wave 2, and a new cohort of parents was also recruited to participate in the Wave 2 survey. Wave 3 of SPDLC was collected in October 2021. Parents who participated in either of the first two waves were invited to participate again in Wave 3, and another new cohort of parents was also recruited to participate in the Wave 3 survey. This research design (follow-up survey of panelists and new cross-section of parents at each wave) will continue through 2024, culminating in six waves of data spanning the period from March 2020 through October 2024. An estimated total of approximately 6,500 parents will be surveyed at least once throughout the duration of the study.

    SPDLC data will be released to the public two years after data is collected; Waves 1 and 2 are currently publicly available. Wave 3 will be publicly available in October 2023, with subsequent waves becoming available yearly. Data will be available to download in both SPSS (.sav) and Stata (.dta) formats, and the following data files will be available: (1) a data file for each individual wave, which contains responses from all participants in that wave of data collection, (2) a longitudinal panel data file, which contains longitudinal follow-up data from all available waves, and (3) a repeated cross-section data file, which contains the repeated cross-section data (from new respondents at each wave) from all available waves. Codebooks for each survey wave and a detailed user guide describing the data are also available. Response Rates: Of the 1,157 parents who participated in Wave 1, 828 (72%) also participated in the Wave 2 study. Presence of Common Scales: The following established scales are included in the survey:
    • Self-Efficacy, adapted from Pearlin's mastery scale (Pearlin et al., 1981) and the Rosenberg self-esteem scale (Rosenberg, 2015) and taken from the American Changing Lives Survey
    • Communication with Partner, taken from the Marriage and Relationship Survey (Lichter & Carmalt, 2009)
    • Gender Attitudes, taken from the National Survey of Families and Households (Sweet & Bumpass, 1996)
    • Depressive Symptoms (CES-D-10)
    • Stress, measured using Cohen's Perceived Stress Scale (Cohen, Kamarck, & Mermelstein, 1983)
    Full details about these scales and all other items included in the survey can be found in the user guide and codebook
    The second wave of the SPDLC was fielded in November 2020 in two stages. In the first stage, all parents who participated in W1 of the SPDLC and who continued to reside in the United States were re-contacted and asked to participate in a follow-up survey. The W2 survey was posted on Prolific, and messages were sent via Prolific’s messaging system to all previous participants. Multiple follow-up messages were sent in an attempt to increase response rates to the follow-up survey. Of the 1,157 respondents who completed the W1 survey, 873 at least started the W2 survey. Data quality checks were employed in line with best practices for online surveys (e.g., removing respondents who did not complete most of the survey or who did not pass the attention filters). After data quality checks, 5.2% of respondents were removed from the sample, resulting in a final sample size of 828 parents (a response rate of 72%).

    In the second stage, a new sample of parents was recruited. New parents had to meet the same sampling criteria as in W1 (be at least 18 years old, reside in the United States, reside with a romantic partner, and be a parent living with at least one biological child). Also similar to the W1 procedures, we oversampled men, Black individuals, individuals who did not complete college, and individuals who identified as politically conservative to increase sample diversity. A total of 1,207 parents participated in the W2 survey. Data quality checks led to the removal of 5.7% of the respondents, resulting in a final sample size of new respondents at Wave 2 of 1,138 parents.

    In both stages, participants were informed that the survey would take approximately 20 minutes to complete. All panelists were provided monetary compensation in line with Prolific’s compensation guidelines, which require that all participants earn above minimum wage for their time participating in studies.
    To be included in SPDLC, respondents had to meet the following sampling criteria at the time they enter the study: (a) be at least 18 years old, (b) reside in the United States, (c) reside with a romantic partner (i.e., be married or cohabiting), and (d) be a parent living with at least one biological child. Follow-up respondents must be at least 18 years old and reside in the United States, but may experience changes in relationship and resident parent statuses. Smallest Geographic Unit: U.S. State

    This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. In accordance with this license, all users of these data must give appropriate credit to the authors in any papers, presentations, books, or other works that use the data. A suggested citation to provide attribution for these data is included below:            

    Carlson, Daniel L. and Richard J. Petts. 2022. Study on U.S. Parents’ Divisions of Labor During COVID-19 User Guide: Waves 1-2.  

    To help provide estimates that are more representative of U.S. partnered parents, the SPDLC includes sampling weights. Weights can be included in statistical analyses to make estimates from the SPDLC sample representative of U.S. parents who reside with a romantic partner (married or cohabiting) and a child aged 18 or younger based on age, race/ethnicity, and gender. National estimates for the age, racial/ethnic, and gender profile of U.S. partnered parents were obtained using data from the 2020 Current Population Survey (CPS). Weights were calculated using an iterative raking method, such that the full sample in each data file matches the nationally representative CPS data in regard to the gender, age, and racial/ethnic distributions within the data. This variable is labeled CPSweightW2 in the Wave 2 dataset, and CPSweightLW2 in the longitudinal dataset (which includes Waves 1 and 2). There is not a weight variable included in the W1-W2 repeated cross-section data file.
     
    more » « less
  5. High levels of stress and anxiety are common amongst college students, particularly engineering students. Students report lack of sleep, grades, competition, change in lifestyle, and other significant stressors throughout their undergraduate education (1, 2). Stress and anxiety have been shown to negatively impact student experience (3-6), academic performance (6-8), and retention (9). Previous studies have focused on identifying factors that cause individual students stress while completing undergraduate engineering degree programs (1). However, it not well-understood how a culture of stress is perceived and is propagated in engineering programs or how this culture impacts student levels of identification with engineering. Further, the impact of student stress has not been directly considered in engineering regarding recruitment, retention, and success. Therefore, our guiding research question is: Does the engineering culture create stress for students that hinder their engineering identity development? To answer our research question, we designed a sequential mixed methods study with equal priority of quantitative survey data and qualitative individual interviews. Our study participants are undergraduate engineering students across all levels and majors at a large, public university. Our sample goal is 2000 engineering student respondents. We combined three published surveys to build our quantitative data collection instrument, including the Depression Anxiety Stress Scales (DASS), Identification with engineering subscale, and Engineering Department Inclusion Level subscale. The objective of the quantitative instrument is to illuminate individual perceptions of the existence of an engineering stress culture (ESC) and create an efficient tool to measure the impact ESC on engineering identity development. Specifically, we seek to understand the relationships among the following constructs; 1) identification with engineering, 2) stress and anxiety, and 3) feelings of inclusion within their department. The focus of this paper presents the results of the pilot of the proposed instrument with 20 participants and a detailed data collection and analysis process. In an effort to validate our instrument, we conducted a pilot study to refine our data collection process and the results will guide the data collection for the larger study. In addition to identifying relationships among construct, the survey data will be further analyzed to specify which demographics are mediating or moderating factors of these relationships. For example, does a student’s 1st generation status influence their perception of stress or engineering identity development? Our analysis may identify discipline-specific stressors and characterize culture components that promote student anxiety and stress. Our objective is to validate our survey instrument and use it to inform the protocol for the follow-up interviews to gain a deeper understanding of the responses to the survey instrument. Understanding what students view as stressful and how students identify stress as an element of program culture will support the development of interventions to mitigate student stress. References 1. Schneider L (2007) Perceived stress among engineering students. A Paper Presented at St. Lawrence Section Conference. Toronto, Canada. Retrieved from: www. asee. morrisville. edu. 2. Ross SE, Niebling BC, & Heckert TM (1999) Sources of stress among college students. Social psychology 61(5):841-846. 3. Goldman CS & Wong EH (1997) Stress and the college student. Education 117(4):604-611. 4. Hudd SS, et al. (2000) Stress at college: Effects on health habits, health status and self-esteem. College Student Journal 34(2):217-228. 5. Macgeorge EL, Samter W, & Gillihan SJ (2005) Academic Stress, Supportive Communication, and Health A version of this paper was presented at the 2005 International Communication Association convention in New York City. Communication Education 54(4):365-372. 6. Burt KB & Paysnick AA (2014) Identity, stress, and behavioral and emotional problems in undergraduates: Evidence for interaction effects. Journal of college student development 55(4):368-384. 7. Felsten G & Wilcox K (1992) Influences of stress and situation-specific mastery beliefs and satisfaction with social support on well-being and academic performance. Psychological Reports 70(1):291-303. 8. Pritchard ME & Wilson GS (2003) Using emotional and social factors to predict student success. Journal of college student development 44(1):18-28. 9. Zhang Z & RiCharde RS (1998) Prediction and Analysis of Freshman Retention. AIR 1998 Annual Forum Paper. 
    more » « less