-
Interpretable time series deep learning systems are often assessed by checking the temporal consistency of their explanations, implicitly treating this consistency as evidence of robustness. We show that this assumption can fail: predictions and explanations can be adversarially decoupled, enabling targeted misclassification while the explanation remains plausible and consistent with a chosen reference rationale. We propose TSEF (Time Series Explanation Fooler), a dual-target attack that jointly manipulates the classifier and explainer outputs. In contrast to single-objective misclassification attacks, which disrupt explanations and spread attribution mass broadly, TSEF achieves targeted prediction changes while keeping explanations consistent with the reference. Across multiple datasets and explainer backbones, our results consistently reveal that explanation stability is a misleading proxy for decision robustness, and they motivate coupling-aware robustness evaluations for trustworthy time series tasks.
-
In this paper, we propose a linear second-order numerical method for solving the Allen-Cahn equation with general mobility. The fully discrete scheme uses the Crank-Nicolson formula for temporal integration and the central difference method for spatial approximation, together with two additional stabilization terms. Under mild constraints on the two stabilizing parameters, the proposed numerical scheme is shown to unconditionally preserve the discrete maximum bound principle and the discrete original energy dissipation law. An error estimate in the L∞ norm is derived for the proposed scheme. Finally, numerical experiments are conducted to verify the theoretical results and to demonstrate the performance of the proposed scheme in combination with an adaptive time-stepping strategy.
-
In software development, many documents (e.g., tutorials for tools and mobile application websites) contain screenshots of graphical user interfaces (GUIs) to illustrate functionality. Although screenshots are critical in such documents, they can become outdated, especially if document developers forget to update them. Outdated screenshots can mislead users and diminish the credibility of documentation. Identifying outdated screenshots manually is tedious and error-prone, especially when documents are numerous. However, no existing tool detects outdated screenshots in GUI documents. To reduce this manual effort, we propose DOSUD, a novel approach for detecting outdated screenshots. Identifying outdated screenshots is challenging because the differences are subtle and only specific areas of a screenshot are useful for the task. To address these challenges, DOSUD automatically extracts and labels screenshots and trains a classification model to identify outdated ones. As a first exploration, we focus on Android applications and the most popular IDE, VS Code. We evaluated DOSUD on a benchmark comprising 10 popular applications, achieving high F1-scores. When applied in the wild, DOSUD identified 20 outdated screenshots across 50 Android application websites and 17 outdated screenshots in VS Code documentation. VS Code developers have confirmed and fixed all our bug reports.
-
Mills, Caitlin; Alexandron, Giora; Taibi, Davide; Lo Bosco, Giosuè; Paquette, Luc (Ed.)
Open-text responses provide researchers and educators with rich, nuanced insights that multiple-choice questions cannot capture. When reliably assessed, such responses have the potential to enhance teaching and learning. However, scaling and consistently capturing these nuances remain significant challenges, limiting the widespread use of open-text questions in educational research and assessments. In this paper, we introduce and evaluate GradeOpt, a unified multi-agent automatic short-answer grading (ASAG) framework that leverages large language models (LLMs) as graders for short-answer responses. More importantly, GradeOpt incorporates two additional LLM-based agents, the reflector and the refiner, into the multi-agent system. This enables GradeOpt to automatically optimize the original grading guidelines by performing self-reflection on its errors. To assess GradeOpt's effectiveness, we conducted experiments on two representative ASAG datasets, which include items designed to capture key aspects of teachers' pedagogical knowledge and students' learning progress. Our results demonstrate that GradeOpt consistently outperforms representative baselines in both grading accuracy and alignment with human evaluators across different knowledge domains. Finally, comprehensive ablation studies validate the contributions of GradeOpt's individual components, confirming their impact on overall performance.