Abstract Creativity is increasingly recognized as a core competency for the 21st century, making its development a priority in education, research, and industry. To effectively cultivate creativity, researchers and educators need reliable and accessible assessment tools. Recent software developments have significantly enhanced the administration and scoring of creativity measures; however, existing software often requires expertise in experiment design and computer programming, limiting its accessibility to many educators and researchers. In the current work, we introduce CAP—the Creativity Assessment Platform—a free web application for building creativity assessments, collecting data, and automatically scoring responses (cap.ist.psu.edu). CAP allows users to create custom creativity assessments in ten languages using a simple, point-and-click interface, selecting from tasks such as the Short Story Task, Drawing Task, and Scientific Creative Thinking Test. Users can automatically score task responses using machine learning models trained to match human creativity ratings—with multilingual capabilities, including the new Cross-Lingual Alternate Uses Scoring (CLAUS), a large language model achieving strong prediction of human creativity ratings in ten languages. CAP also provides a centralized dashboard to monitor data collection, score assessments, and automatically generate text for a Methods section based on the study’s tasks, metrics, and instructions—with a single click—promoting transparency and reproducibility in creativity assessment. Designed for ease of use, CAP aims to democratize creativity measurement for researchers, educators, and everyone in between.
Envisioning the Future of Creative Thinking Assessment
ABSTRACT The PISA 2022 assessment of creative thinking was a moonshot effort that introduced significant advances over existing creativity tests, including a broad range of domains (written, visual, social, and scientific), implementation in many languages, and sophisticated scoring methods. PISA 2022 demonstrated the general feasibility of comprehensively assessing creative thinking ability at an international scale. However, the complexity of its assessment approach—such as time-consuming scoring by human raters—raises the risk that it will not be easily adopted by the scientific community and practitioners. In this commentary, we outline important next steps that build on the PISA assessment to further enhance future assessments of creative thinking. Crucial future directions include 1) determining which tasks and scoring approaches ensure high psychometric quality, including content validity; 2) enabling efficient, objective scoring through AI methods such as large language models (LLMs); 3) ensuring broad language accessibility via multilingual tests; 4) targeting a wider age range; and 5) facilitating standardized, reproducible assessments via an open online testing platform. Together, these developments would yield an efficient, validated, multilingual test of creative thinking, enhancing the accessibility of effective creative thinking assessment and thereby supporting the democratization and reproducibility of creativity research.
- Award ID(s): 2155070
- PAR ID: 10594437
- Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name: The Journal of Creative Behavior
- Volume: 59
- Issue: 2
- ISSN: 0022-0175
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like This
ABSTRACT Automated scoring is a current hot topic in creativity research. However, most research has focused on the English language and popular verbal creative thinking tasks, such as the alternate uses task. Therefore, in this study, we present a large language model approach for automated scoring of a scientific creative thinking task that assesses divergent ideation in experimental tasks in the German language. Participants are required to generate alternative explanations for an empirical observation. This work analyzed a total of 13,423 unique responses. To predict human ratings of originality, we used XLM-RoBERTa (Cross-lingual Language Model-RoBERTa), a large, multilingual model. The prediction model was trained on 9,400 responses. Results showed a strong correlation between model predictions and human ratings in a held-out test set (n = 2,682; r = 0.80; 95% CI [0.79, 0.81]). These promising findings underscore the potential of large language models for automated scoring of scientific creative thinking in the German language. We encourage researchers to further investigate automated scoring of other domain-specific creative thinking tasks.
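To make this kind of scoring pipeline concrete, below is a minimal sketch of regression fine-tuning with the Hugging Face transformers library. The checkpoint, example data, and hyperparameters are illustrative assumptions, not the study's actual configuration.

```python
# Sketch: fine-tune XLM-RoBERTa to predict continuous human originality
# ratings. Checkpoint, toy data, and hyperparameters are assumptions.
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=1, problem_type="regression"  # one continuous output
)

class RatedResponses(Dataset):
    """Free-text responses paired with mean human originality ratings."""
    def __init__(self, texts, ratings):
        self.enc = tokenizer(texts, truncation=True, padding=True,
                             return_tensors="pt")
        self.ratings = torch.tensor(ratings, dtype=torch.float)
    def __len__(self):
        return len(self.ratings)
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.ratings[i]
        return item

# Toy stand-ins; the study trained on roughly 9,400 rated German responses.
train = RatedResponses(["Die Pflanze wächst zum Licht, weil ..."], [3.2])
loader = DataLoader(train, batch_size=16, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):
    for batch in loader:
        optimizer.zero_grad()
        loss = model(**batch).loss  # MSE loss for regression labels
        loss.backward()
        optimizer.step()
```

At evaluation time, predictions on a held-out set would be correlated with human ratings (e.g., Pearson's r), matching the kind of result reported above.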
Metaphor is crucial in human cognition and creativity, facilitating abstract thinking, analogical reasoning, and idea generation. Typically, human raters manually score the originality of responses to creative thinking tasks – a laborious and error-prone process. Previous research sought to remedy these risks by scoring creativity tasks automatically using semantic distance and large language models (LLMs). Here, we extend research on automatic creativity scoring to metaphor generation – the ability to creatively describe episodes and concepts using nonliteral language. Metaphor is arguably more abstract and naturalistic than prior targets of automated creativity assessment. We collected 4,589 responses from 1,546 participants to various metaphor prompts and corresponding human creativity ratings. We fine-tuned two open-source LLMs (RoBERTa and GPT-2) – effectively "teaching" them to score metaphors like humans – before testing their ability to accurately assess the creativity of new metaphors. Results showed both models reliably predicted new human creativity ratings (RoBERTa r = .72, GPT-2 r = .70), significantly more strongly than semantic distance (r = .42). Importantly, the fine-tuned models generalized accurately to metaphor prompts they had not been trained on (RoBERTa r = .68, GPT-2 r = .63). We provide open access to the fine-tuned models, allowing researchers to assess metaphor creativity in a reproducible and timely manner.
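As a point of reference, the semantic-distance baseline that the fine-tuned models outperform can be approximated as one minus the cosine similarity between pooled word vectors of the prompt and the response. The embedding source (GloVe via gensim), mean pooling, and the example pair below are assumptions; the study's exact semantic space and composition method may differ.

```python
# Sketch of a semantic-distance baseline: 1 - cosine(prompt, response)
# over mean-pooled word vectors. Embedding choice is an assumption.
import numpy as np
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-300")  # downloads pretrained vectors

def mean_vector(text):
    """Average the vectors of all in-vocabulary tokens (naive tokenization)."""
    words = [w for w in text.lower().split() if w in vectors]
    return np.mean([vectors[w] for w in words], axis=0)

def semantic_distance(prompt, response):
    a, b = mean_vector(prompt), mean_vector(response)
    cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - cosine  # higher = more semantically remote

# Hypothetical metaphor prompt/response pair:
print(semantic_distance("boredom", "a slow leak in the roof of the mind"))
```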
Fostering creativity is vital for tackling 21st-century challenges, and education plays a key role in nurturing this skill. According to the associative theory, creativity involves connecting distant concepts in semantic memory. Here, we explore how semantic memory changes following an educational intervention intended to promote creativity. Specifically, we examine how a scientific education curriculum—Scientific Creativity in Practice (SCIP) program—impacts the semantic memory networks of 10–18-year-old students in a chemistry class (n = 176). Students in an Intervention group who received the SCIP intervention, and a Control group who did not, completed creative thinking tests, as well as verbal fluency tasks to estimate semantic networks in science-specific (chemistry) and domain-general (animal) categories. Results showed that the SCIP intervention enhanced performance on one test of scientific creative thinking but showed no significant difference on another. Using network science methods, we observed increased interconnectedness in both science-specific and domain-general categories, with lower path distances between concepts and reduced modularity. These traits define a 'small-world' network, balancing connections between closely related and remote concepts. Notably, the chemistry semantic network showed substantially more reorganization, consistent with the chemistry contents of the SCIP intervention. The findings suggest that semantic memory reorganization may be a cognitive mechanism underlying successful creativity interventions in science education.
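The network measures reported here (shorter path distances, lower modularity, 'small-world' structure) can be illustrated with a toy graph in networkx. The edges below are placeholders rather than the study's estimated semantic network, which is derived from students' verbal fluency data through additional estimation steps.

```python
# Toy illustration of the reported network metrics; edges are placeholders,
# not the semantic network estimated from students' fluency data.
import networkx as nx
from networkx.algorithms import community

G = nx.Graph([
    ("acid", "base"), ("acid", "ph"), ("base", "ph"),
    ("atom", "molecule"), ("molecule", "bond"), ("bond", "electron"),
    ("ph", "molecule"),  # a "shortcut" edge linking the two clusters
])

aspl = nx.average_shortest_path_length(G)   # lower = more interconnected
parts = community.greedy_modularity_communities(G)
Q = community.modularity(G, parts)          # lower = less compartmentalized
C = nx.average_clustering(G)                # high C with low ASPL: small-world
print(f"ASPL={aspl:.2f}  modularity={Q:.2f}  clustering={C:.2f}")
```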
Creativity research often relies on human raters to judge the novelty of participants' responses on open-ended tasks, such as the Alternate Uses Task (AUT). Albeit useful, manual ratings are subjective and labor intensive. To address these limitations, researchers increasingly use automatic scoring methods based on a natural language processing technique for quantifying the semantic distance between words. However, many methodological choices remain open on how to obtain semantic distance scores for ideas, which can significantly impact reliability and validity. In this project, we propose a new semantic distance-based method, maximum associative distance (MAD), for assessing response novelty in AUT. Within a response, MAD uses the semantic distance of the word that is maximally remote from the prompt word to reflect response novelty. We compare the results from MAD with other competing semantic distance-based methods, including element-wise-multiplication—a commonly used compositional model—across three published datasets including a total of 447 participants. We found MAD to be more strongly correlated with human creativity ratings than the competing methods. In addition, MAD scores reliably predict external measures such as openness to experience. We further explored how idea elaboration affects the performance of various scoring methods and found that MAD is closely aligned with human raters in processing multi-word responses. The MAD method thus improves the psychometrics of semantic distance for automatic creativity assessment, and it provides clues about what human raters find creative about ideas.
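Based on the description above, MAD can be sketched as taking the maximum, rather than a composite, of word-level distances from the prompt. GloVe embeddings, whitespace tokenization, and the example below are assumptions; the published method's exact semantic space may differ.

```python
# Sketch of maximum associative distance (MAD): score a response by the
# cosine distance of its single most prompt-remote in-vocabulary word.
import numpy as np
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-300")  # assumed embedding space

def mad_score(prompt, response):
    p = vectors[prompt]
    p_norm = np.linalg.norm(p)
    distances = []
    for w in response.lower().split():
        if w in vectors:
            v = vectors[w]
            cosine = np.dot(p, v) / (p_norm * np.linalg.norm(v))
            distances.append(1.0 - cosine)
    return max(distances) if distances else None  # most remote word wins

# AUT example: prompt "brick" with a multi-word use; the maximally remote
# word (not the average over words) determines the novelty score.
print(mad_score("brick", "grind it into pigment for cave painting"))
```

Unlike a mean-pooled composite, the maximum over words keeps one remote word from being diluted by common filler words, which is consistent with the paper's finding that MAD tracks human raters well on elaborated, multi-word responses.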