Visualization literacy is an essential skill for accurately interpreting data to inform critical decisions. Consequently, it is vital to understand the evolution of this ability and devise targeted interventions to enhance it, requiring concise and repeatable assessments of visualization literacy for individuals. However, current assessments, such as the Visualization Literacy Assessment Test (VLAT), are time-consuming due to their fixed, lengthy format. To address this limitation, we develop two streamlined computerized adaptive tests (CATs) for visualization literacy, A-VLAT and A-CALVI, which measure the same set of skills as their original versions in half the number of questions. Specifically, we (1) employ item response theory (IRT) and non-psychometric constraints to construct adaptive versions of the assessments, (2) finalize the configurations of adaptation through simulation, (3) refine the composition of test items of A-CALVI via a qualitative study, and (4) demonstrate the test-retest reliability (ICC: 0.98 and 0.98) and convergent validity (correlation: 0.81 and 0.66) of both CATs via four online studies. We discuss practical recommendations for using our CATs and opportunities for further customization to leverage the full potential of adaptive assessments. All supplemental materials are available at https://osf.io/a6258/.
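As a rough illustration of the mechanism behind a computerized adaptive test, the sketch below implements the generic item-selection loop of an IRT-based CAT under a two-parameter logistic (2PL) model: choose the unseen item with maximum Fisher information at the current ability estimate, record the response, and update the estimate. The item parameters, fixed test length, and simulated response are toy placeholders, not the calibrated A-VLAT/A-CALVI item bank, constraints, or stopping rule.

```python
# Minimal sketch of a 2PL IRT adaptive testing loop. All item parameters and
# the examinee response below are illustrative placeholders.
import numpy as np

def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information an item provides at ability theta (2PL)."""
    p = p_correct(theta, a, b)
    return a**2 * p * (1.0 - p)

def eap_estimate(answered, grid=np.linspace(-4, 4, 161)):
    """Expected a posteriori ability estimate with a standard-normal prior.
    `answered` is a list of (a, b, correct) tuples for items seen so far."""
    log_post = -0.5 * grid**2  # log N(0, 1) prior, up to a constant
    for a, b, correct in answered:
        p = p_correct(grid, a, b)
        log_post += np.log(p if correct else 1.0 - p)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return float((grid * post).sum())

def next_item(theta, bank, used):
    """Pick the unused item with maximum information at the current estimate."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))

# Toy item bank of (discrimination a, difficulty b) pairs -- placeholders only.
bank = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.5)]
answered, used, theta = [], set(), 0.0
for _ in range(3):                      # fixed test length, for the sketch only
    i = next_item(theta, bank, used)
    correct = True                      # stand-in for the examinee's actual response
    answered.append((*bank[i], correct))
    used.add(i)
    theta = eap_estimate(answered)
print(f"final ability estimate: {theta:.2f}")
```

A deployed CAT would add content-balancing constraints (the "non-psychometric constraints" the abstract mentions) and a stopping rule, for example one based on the standard error of the ability estimate.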
Evaluating convergence between two data visualization literacy assessments
Abstract: Data visualizations play a crucial role in communicating patterns in quantitative data, making data visualization literacy a key target of STEM education. However, it is currently unclear to what degree different assessments of data visualization literacy measure the same underlying constructs. Here, we administered two widely used graph comprehension assessments (Galesic and Garcia-Retamero in Med Dec Mak 31:444–457, 2011; Lee et al. in IEEE Trans Vis Comput Graph 23(1):551–560, 2017) to both a university-based convenience sample and a demographically representative sample of adult participants in the USA (N = 1,113). Our analysis of individual variability in test performance suggests that overall scores are correlated between assessments and associated with the amount of prior coursework in mathematics. However, further exploration of individual error patterns suggests that these assessments probe somewhat distinct components of data visualization literacy, and we do not find evidence that these components correspond to the categories that guided the design of either test (e.g., questions that require retrieving values rather than making comparisons). Together, these findings suggest opportunities for the development of more comprehensive assessments of data visualization literacy, organized around components that better account for detailed behavioral patterns.
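As a rough sketch of the convergence analysis described above: correlate participants' total scores on the two assessments, then inspect item-level error rates. The file name responses.csv, the gg_/vlat_ column prefixes, and the 0/1 response coding are assumptions made for illustration, not the authors' actual data layout.

```python
# Sketch of a convergence check between two assessments, under assumed data layout.
import pandas as pd
from scipy.stats import pearsonr

# One row per participant; 0/1 response columns prefixed by assessment name (assumed).
df = pd.read_csv("responses.csv")
gg_items = [c for c in df.columns if c.startswith("gg_")]      # Galesic & Garcia-Retamero items
vlat_items = [c for c in df.columns if c.startswith("vlat_")]  # Lee et al. VLAT items

# Convergence at the level of total scores.
gg_total = df[gg_items].sum(axis=1)
vlat_total = df[vlat_items].sum(axis=1)
r, p = pearsonr(gg_total, vlat_total)
print(f"score correlation between assessments: r = {r:.2f} (p = {p:.3g})")

# Item-level error rates: one way to start probing whether the two tests
# tap the same underlying components.
error_rates = 1.0 - df[gg_items + vlat_items].mean()
print(error_rates.sort_values(ascending=False).head())
```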
- PAR ID: 10581241
- Publisher / Repository: Springer Science + Business Media
- Date Published:
- Journal Name: Cognitive Research: Principles and Implications
- Volume: 10
- Issue: 1
- ISSN: 2365-7464
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract: The increasing integration of Visual Language Models (VLMs) into visualization systems demands a comprehensive understanding of their visual interpretation capabilities and constraints. While existing research has examined individual models, systematic comparisons of VLMs' visualization literacy remain unexplored. We bridge this gap through a rigorous, first-of-its-kind evaluation of four leading VLMs (GPT-4, Claude, Gemini, and Llama) using standardized assessments: the Visualization Literacy Assessment Test (VLAT) and Critical Thinking Assessment for Literacy in Visualizations (CALVI). Our methodology uniquely combines randomized trials with structured prompting techniques to control for order effects and response variability, a critical consideration overlooked in many VLM evaluations. Our analysis reveals that while specific models demonstrate competence in basic chart interpretation (Claude achieving 67.9% accuracy on VLAT), all models exhibit substantial difficulties in identifying misleading visualization elements (maximum 30.0% accuracy on CALVI). We uncover distinct performance patterns: strong capabilities in interpreting conventional charts like line charts (76–96% accuracy) and detecting hierarchical structures (80–100% accuracy), but consistent difficulties with data-dense visualizations involving multiple encodings (bubble charts: 18.6–61.4%) and anomaly detection (25–30% accuracy). Significantly, we observe distinct uncertainty management behavior across models, with Gemini displaying heightened caution (22.5% question omission) compared to others (7–8%). These findings provide crucial insights for the visualization community by establishing reliable VLM evaluation benchmarks, identifying areas where current models fall short, and highlighting the need for targeted improvements in VLM architectures for visualization tasks. To promote reproducibility, encourage further research, and facilitate benchmarking of future VLMs, our complete evaluation framework, including code, prompts, and analysis scripts, is available at https://github.com/washuvis/VisLit-VLM-Eval.
-
A substantial amount of evidence suggests that students, particularly those from economically disadvantaged households, experience summer reading loss. Available evidence suggests this is due to a lack of participation in literacy-focused activities and access to books during the summer break from school. The current study investigated whether participation in Children’s Defense Fund’s Freedom Schools, a free, six-week, literacy-focused, culturally relevant summer camp, may help prevent summer reading loss. The sample consisted of 125 students who participated in three sites of the summer camp and completed pre- and post-test reading assessments. The results of this study suggest that the literacy-focused summer camp provides students with an academically enriching opportunity that may help prevent summer reading loss, particularly for students in Grades 3–5, who experienced small gains on average in vocabulary, fluency, and comprehension. Recommendations are provided regarding how the program can be modified to maximize potential benefits related to participation.
-
Purpose: This study aims to explore how network visualization provides opportunities for learners to explore data literacy concepts using locally and personally relevant data. Design/methodology/approach: The researchers designed six locally relevant network visualization activities to support students’ data reasoning practices toward understanding aggregate patterns in data. Cultural historical activity theory (Engeström, 1999) guides the analysis to identify how network visualization activities mediate students’ emerging understanding of aggregate data sets. Findings: Pre/posttest findings indicate that this implementation positively impacted students’ understanding of network visualization concepts, as they were able to identify and interpret key relationships from novel networks. Interaction analysis (Jordan and Henderson, 1995) of video data revealed nuances of how activities mediated students’ improved ability to interpret network data. Some challenges reported in other studies, such as students’ tendency to focus on familiar concepts, were also observed, and teachers supported conversations to help students move beyond them. Originality/value: To the best of the authors’ knowledge, this is the first study to support elementary students in exploring data literacy through network visualization. The authors discuss how network visualizations and locally/personally meaningful data provide opportunities for learning data literacy concepts across the curriculum.
-
Graphical perception studies typically measure visualization encoding effectiveness using the error of an “average observer”, leading to canonical rankings of encodings for numerical attributes: e.g., position > area > angle > volume. Yet different people may vary in their ability to read different visualization types, leading to variance in this ranking across individuals that is not captured by population-level metrics based on “average observer” models. One way to bridge this gap is to recast classic visual perception tasks as tools for assessing individual performance, in addition to overall visualization performance. In this article we replicate and extend Cleveland and McGill's graphical comparison experiment using Bayesian multilevel regression, and use these models to explore individual differences in visualization skill from multiple perspectives. The results from the experiments and modeling indicate that some people show patterns of accuracy that credibly deviate from the canonical rankings of visualization effectiveness. We discuss implications of these findings, such as the need for new ways to communicate visualization effectiveness to designers, how patterns in individuals’ responses may show systematic biases and strategies in visualization judgment, and how recasting classic visual perception tasks as tools for assessing individual performance may offer new ways to quantify aspects of visualization literacy. Experiment data, source code, and analysis scripts are available at the following repository: https://osf.io/8ub7t/?view_only=9be4798797404a4397be3c6fc2a68cc0
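For a concrete picture of the modeling approach in this last abstract, the sketch below shows a generic Bayesian varying-effects model in PyMC: a population-level (log) error for each encoding plus per-participant deviations, which is what lets individual rankings depart from the “average observer”. The file name, column names, priors, and non-centered parameterization are illustrative assumptions, not the authors' actual model specification.

```python
# Generic varying-effects sketch of a Bayesian multilevel regression for
# graphical perception data, under assumed column names and priors.
import pandas as pd
import pymc as pm

df = pd.read_csv("judgments.csv")                          # hypothetical long-format data
enc_idx, encodings = pd.factorize(df["encoding"])          # e.g. position, angle, area
part_idx, participants = pd.factorize(df["participant"])

with pm.Model() as model:
    # Population-level (log) error for each encoding: the "average observer".
    mu_enc = pm.Normal("mu_enc", mu=0.0, sigma=1.0, shape=len(encodings))
    # Per-participant, per-encoding deviations (non-centered parameterization).
    sigma_p = pm.HalfNormal("sigma_p", sigma=0.5)
    z = pm.Normal("z", mu=0.0, sigma=1.0, shape=(len(participants), len(encodings)))
    dev = pm.Deterministic("participant_dev", z * sigma_p)

    mu = mu_enc[enc_idx] + dev[part_idx, enc_idx]
    sigma = pm.HalfNormal("sigma", sigma=0.5)
    pm.Normal("log_error", mu=mu, sigma=sigma, observed=df["log_error"])

    idata = pm.sample(1000, tune=1000, target_accept=0.9)

# Individual encoding rankings can then be read off the posterior of
# mu_enc + participant_dev, rather than from the population means alone.
```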