Title: Criteria for collapsing rating scale responses: A case study of the CLASS
Assessments of students’ attitudes and beliefs often rely on questions with rating scales that ask students the extent to which they agree or disagree with a statement. Unlike traditional physics problems with a single correct answer, rating scale questions often have a spectrum of 5 or more responses, none of which is correct. Researchers have found that responses on rating scale items can generally be treated as continuous and that, unless there is good evidence to do otherwise, response categories should not be collapsed [1–3]. We discuss two potential reasons for collapsing response categories (lack of use and redundancy) and how to test for them empirically. To illustrate these methods, we apply them to the Colorado Learning Attitudes about Science Survey (CLASS). We found that students used all the response categories on the CLASS but that three of them were potentially redundant. This led us to conclude that the CLASS should be scored on a 5-point or 3-point scale, rather than the 2-point scale recommended by the instrument developers [4]. More broadly, we recommend the judicious use of data manipulations when scoring assessments and retaining all response categories unless there is a strong rationale for collapsing them.
Award ID(s): 1928596, 1525338
PAR ID: 10192528
Author(s) / Creator(s):
Date Published:
Journal Name: 2019 Physics Education Research Conference Proceedings
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Self-report assessments are used frequently in higher education to assess a variety of constructs, including attitudes, opinions, knowledge, and competence. Systems thinking is an example of one competence often measured using self-report assessments where individuals answer several questions about their perceptions of their own skills, habits, or daily decisions. In this study, we define systems thinking as the ability to see the world as a complex interconnected system where different parts can influence each other, and the interrelationships determine system outcomes. An alternative, less common assessment approach is to measure skills directly by providing a scenario about an unstructured problem and evaluating respondents’ judgment or analysis of the scenario (scenario-based assessment). This study explored the relationships between engineering students’ performance on self-report assessments and scenario-based assessments of systems thinking, finding that there were no significant relationships between the two assessment techniques. These results suggest that there may be limitations to using self-report assessments as a method to assess systems thinking and other competencies in educational research and evaluation, which could be addressed by incorporating alternative formats for assessing competence. Future work should explore these findings further and support the development of alternative assessment approaches.
  2. Healthcare applications on voice personal assistant systems (e.g., Amazon Alexa) have shown great promise for delivering personalized health services via a conversational interface. However, concerns have also been raised about privacy, safety, and service quality. In this paper, we propose VerHealth to systematically assess how well health-related applications on Alexa comply with existing privacy and safety policies. VerHealth contains a static module and a dynamic module based on machine learning that can trigger and detect violation behaviors hidden deep in the interaction threads. We use VerHealth to analyze 813 health-related applications on Alexa by sending over 855,000 probing questions and analyzing 863,000 responses. We also consult with three medical school students (domain experts) to confirm and assess the potential violations. We show that violations are quite common, e.g., 86.36% of them miss disclaimers when providing medical information; 30.23% of them store user physical or mental health data without approval. Domain experts believe that the applications' medical suggestions are often factually correct but of poor relevance, and that in over half of the cases the applications should have asked more questions before providing suggestions. Finally, we use our results to discuss possible directions for improvements.
  3. Prior research has shown that physics students often think about experimental procedures and data analysis very differently from experts. One key framework for analyzing student thinking has found that student thinking is more point-like, putting emphasis on the results of a single experimental trial, whereas set-like thinking relies on the results of many trials. Recent work, however, has found that students rarely fall into one of these two extremes, which may be a limitation of how student thinking is evaluated. Measurements of student thinking have focused on probing students’ procedural knowledge by asking them, for example, what steps they might take next in an experiment. Two common refrains are to collect more data, or to improve the experiment and collect better data. In both of these cases, the underlying reasons behind student responses could be based in point-like or set-like thinking. In this study we use individual student interviews to investigate how advanced physics students believe the collection of more and better data will affect the results of a classical and a quantum mechanical experiment. The results inform future frameworks and assessments for characterizing students’ thinking between the extremes of point and set reasoning in both classical and quantum regimes. 2020
  4. We report on an experiment that we performed when we taught the undergraduate artificial intelligence class at the University of Southern California. We taught it twice under very similar conditions, once with and once without an attendance requirement. The attendance requirement substantially increased the attendance of the students. It did not substantially affect their performance but decreased their course ratings across all categories in the official course evaluation, whose results happened to be biased toward the opinions of the students attending the lectures. For example, the overall rating of the instructor was 0.89 lower (on a 1–5 scale) with the attendance requirement and the overall rating of the class was 0.85 lower. Thus, the attendance requirement, combined with the policy for administering the course evaluation, had a large impact on the course ratings. This is a problem if the course ratings influence decisions on promotions, tenure, and salary increments for the instructors, and it also demonstrates the potential for the manipulation of course ratings.
  5. Background: Capturing measures of students’ attitudes toward science has long been a focus within the field of science education. The resulting interest has led to the development of many instruments over the years. There is considerable disagreement about how attitudes should be measured, and especially whether students’ attitudes toward science can or should be measured unidimensionally, or whether separate attitude dimensions or subscales should be considered. Even when it is agreed that the attitudes toward science construct should be measured along separate subscales, there is no consensus about which subscales should be used. Methods: A streamlined version of the modified Attitudes Towards Science Inventory (mATSI), a widely used science measurement instrument, was validated for a more diverse sample as compared to the original study (Weinburgh and Steele in Journal of Women and Minorities in Science and Engineering 6:87–94, 2000). The analytical approach used factor analyses and longitudinal measurement invariance. The study used a sample of 2016 self-reported responses from 6th and 7th grade students. The factor analysis elucidated the factor structure of students’ attitudes toward science, and some modifications were made in accordance with the results. Measurement invariance analysis was used to confirm the stability of the measure. Results: Our results support treating the subscales anxiety toward science and value and enjoyment of science as two factors that are stable over time. Conclusions: Our results suggest that our proposed modified factor structure for students’ attitudes toward science is reliable, valid, and appropriate for use in longitudinal studies. This study and its resulting streamlined mATSI survey could be of value to those interested in studying student engagement and measuring middle-school students' attitudes toward science.