Attention Based Transformer for Student Answers Assessment

Assessing the correctness of student answers in a dialog-based intelligent tutoring system (ITS) is a well-defined Natural Language Processing (NLP) task that has attracted the attention of many researchers in the field. Inspired by Vaswani's transformer, we propose in this paper an attention-based transformer neural network with a multi-head attention mechanism for the task of student answer assessment. Results show the competitiveness of our proposed model. The highest accuracy, 71.5%, was achieved when using ELMo embeddings, 10 attention heads, and 2 layers. This is very competitive and rivals the highest accuracy achieved by a previously proposed BI-GRU-Capsnet deep network (72.5%) on the same dataset. The main advantages of using transformers over BI-GRU-Capsnet are reduced training time and more room for parallelization.
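A minimal sketch of the kind of model the abstract describes: a Transformer encoder with multi-head self-attention (10 heads, 2 layers) applied on top of pre-computed ELMo embeddings, followed by a small classifier. The projection width, pooling scheme, and number of labels below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class AnswerAssessor(nn.Module):
    """Transformer-encoder classifier over frozen ELMo embeddings (illustrative)."""
    def __init__(self, elmo_dim=1024, d_model=300, num_heads=10,
                 num_layers=2, num_classes=4):
        super().__init__()
        # Project ELMo vectors to a width divisible by the number of attention heads.
        self.proj = nn.Linear(elmo_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=num_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, elmo_embeddings, padding_mask=None):
        # elmo_embeddings: (batch, seq_len, elmo_dim), produced by a frozen ELMo model.
        x = self.proj(elmo_embeddings)
        x = self.encoder(x, src_key_padding_mask=padding_mask)
        x = x.mean(dim=1)                 # mean-pool token representations
        return self.classifier(x)         # logits over answer-correctness labels

# Example: a batch of 2 student answers, 20 tokens each.
logits = AnswerAssessor()(torch.randn(2, 20, 1024))
```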
- Award ID(s): 1822752
- PAR ID: 10189532
- Date Published:
- Journal Name: Thirty-Third International FLAIRS Conference (FLAIRS-32)
- Page Range / eLocation ID: 3-8
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- FreqMixFormerV2: Lightweight Frequency-aware Mixed Transformer for Human Skeleton Action Recognition. Transformer-based human skeleton action recognition has been developed for years. However, the complexity and high parameter counts of these models hinder their practical application, especially in resource-constrained environments. In this work, we propose FreqMixFormerV2, built upon the Frequency-aware Mixed Transformer (FreqMixFormer), for identifying subtle and discriminative actions with pioneering frequency-domain analysis. We design a lightweight architecture that maintains robust performance while significantly reducing model complexity. This is achieved through a redesigned frequency operator that optimizes high-frequency and low-frequency parameter adjustments, and a simplified frequency-aware attention module. These improvements result in a substantial reduction in model parameters, enabling efficient deployment with only a minimal sacrifice in accuracy. Comprehensive evaluations on the standard NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets demonstrate that the proposed model achieves a superior balance between efficiency and accuracy, outperforming state-of-the-art methods with only 60% of the parameters. (A simplified sketch of a frequency operator appears after this list.)
- The key idea of current deep learning methods for dense prediction is to apply a model on a regular patch centered on each pixel to make pixel-wise predictions. These methods are limited in the sense that the patches are determined by the network architecture instead of learned from data. In this work, we propose dense transformer networks, which can learn the shapes and sizes of patches from data. The dense transformer networks employ an encoder-decoder architecture, and a pair of dense transformer modules are inserted into each of the encoder and decoder paths. The novelty of this work is that we provide technical solutions for learning the shapes and sizes of patches from data and for efficiently restoring the spatial correspondence required for dense prediction. The proposed dense transformer modules are differentiable, thus the entire network can be trained. We apply the proposed networks to biological image segmentation tasks and show that superior performance is achieved in comparison to baseline methods. (A loose, simplified sketch of data-dependent patch sampling appears after this list.)
- In this study, the analysis and control of a highly efficient, high-power full-bridge unidirectional resonant LLC solid-state transformer (SST) are discussed. A combination of pulse frequency modulation and phase-shift modulation is utilized to control this resonant converter over a wide load range. The converter is designed to maintain soft switching by using a resonant circuit to minimize the switching loss of the high-frequency converter. Zero-voltage switching (ZVS) is achieved for the H-bridge converter. The ZVS boundary for the proposed combined control method is also analyzed in detail. The experimental setup for the suggested configuration was implemented, and the performance of the proposed control scheme and resonant LLC SST has been verified with test results. The proposed combined control scheme improves control performance. The obtained results show that the proposed system can regulate the output voltage and maintain soft switching over a wide load range; thus, the efficiency of the system is improved, and an efficiency of 97.18% is achieved. (A generic LLC gain calculation illustrating frequency modulation appears after this list.)
- Code completion aims at speeding up code writing by predicting the next code token(s) the developer is likely to write. Works in this field focused on improving the accuracy of the generated predictions, with substantial leaps forward made possible by deep learning (DL) models. However, code completion techniques are mostly evaluated in the scenario of predicting the next token to type, with few exceptions pushing the boundaries to the prediction of an entire code statement. Thus, little is known about the performance of state-of-the-art code completion approaches in more challenging scenarios in which, for example, an entire code block must be generated. We present a large-scale study exploring the capabilities of state-of-the-art Transformer-based models in supporting code completion at different granularity levels, including single tokens, one or multiple entire statements, up to entire code blocks (e.g., the iterated block of a for loop). We experimented with several variants of two recently proposed Transformer-based models, namely RoBERTa and the Text-To-Text Transfer Transformer (T5), for the task of code completion. The achieved results show that Transformer-based models, and in particular T5, represent a viable solution for code completion, with perfect predictions ranging from ~29%, obtained when asking the model to guess entire blocks, up to ~69%, reached in the simpler scenario of few tokens masked from the same code statement. (A minimal masked-span completion example appears after this list.)
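For the FreqMixFormerV2 entry, a speculative sketch of what a learnable frequency operator on skeleton sequences could look like: joint features are mapped to the frequency domain, low- and high-frequency bands are re-weighted by learnable scalars, and the signal is mapped back. The FFT-based band split, cutoff ratio, and tensor layout are assumptions for illustration only, not the published FreqMixFormer design.

```python
import torch
import torch.nn as nn

class FrequencyOperator(nn.Module):
    """Re-weights low- and high-frequency motion components of a skeleton sequence (illustrative)."""
    def __init__(self, cutoff_ratio=0.25):
        super().__init__()
        self.cutoff_ratio = cutoff_ratio
        self.low_scale = nn.Parameter(torch.ones(1))   # emphasis on slow, global motion
        self.high_scale = nn.Parameter(torch.ones(1))  # emphasis on subtle, fast motion

    def forward(self, x):
        # x: (batch, frames, joints * channels) joint features over time
        spec = torch.fft.rfft(x, dim=1)
        cutoff = max(1, int(spec.shape[1] * self.cutoff_ratio))
        low = spec[:, :cutoff] * self.low_scale
        high = spec[:, cutoff:] * self.high_scale
        return torch.fft.irfft(torch.cat((low, high), dim=1), n=x.shape[1], dim=1)

# Example: 8 clips, 64 frames, 25 joints x 3 coordinates.
out = FrequencyOperator()(torch.randn(8, 64, 75))
```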
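For the dense transformer networks entry, a loose, highly simplified stand-in for data-dependent patch sampling: a small convolutional head predicts per-pixel sampling offsets, and grid_sample re-reads the feature map at the learned locations, so the effective receptive pattern is learned from data rather than fixed by the architecture. The actual dense transformer modules in the paper are more involved; everything below is an illustrative assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedSampler(nn.Module):
    """Resamples a feature map at learned, spatially varying locations (illustrative)."""
    def __init__(self, channels):
        super().__init__()
        self.offset_head = nn.Conv2d(channels, 2, kernel_size=3, padding=1)

    def forward(self, feat):
        b, _, h, w = feat.shape
        # Regular sampling grid in [-1, 1] x [-1, 1], as expected by grid_sample.
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                                torch.linspace(-1, 1, w), indexing="ij")
        grid = torch.stack((xs, ys), dim=-1).to(feat).expand(b, h, w, 2)
        # Learned, spatially varying offsets deform the grid (kept small via tanh).
        offsets = self.offset_head(feat).permute(0, 2, 3, 1)
        return F.grid_sample(feat, grid + 0.1 * torch.tanh(offsets), align_corners=True)

# Example: 2 feature maps with 16 channels, 32 x 32 spatial size.
out = LearnedSampler(16)(torch.randn(2, 16, 32, 32))
```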
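For the resonant LLC SST entry, a generic first-harmonic-approximation (FHA) gain model makes the pulse-frequency-modulation idea concrete: changing the switching frequency moves the operating point along the resonant tank's gain curve. The formula is the standard FHA result for an LLC tank; the component values are placeholders, not the paper's design.

```python
import math

def llc_gain(fs, Lr, Cr, Lm, Rac):
    """FHA voltage gain of an LLC resonant tank at switching frequency fs (Hz)."""
    fr = 1.0 / (2 * math.pi * math.sqrt(Lr * Cr))   # series resonant frequency
    fn = fs / fr                                     # normalized switching frequency
    Ln = Lm / Lr                                     # inductance ratio
    Q = math.sqrt(Lr / Cr) / Rac                     # quality factor (load dependent)
    real = 1 + (1 / Ln) * (1 - 1 / fn ** 2)
    imag = Q * (fn - 1 / fn)
    return 1.0 / math.hypot(real, imag)

# Example sweep around resonance with placeholder component values.
Lr, Cr, Lm, Rac = 60e-6, 48e-9, 300e-6, 40.0
for fs in (70e3, 93e3, 120e3):
    print(f"{fs / 1e3:.0f} kHz -> gain {llc_gain(fs, Lr, Cr, Lm, Rac):.3f}")
```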
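For the code completion entry, a minimal sketch of masked-span completion with a publicly available T5 code model. The study trains its own RoBERTa and T5 variants, so the Salesforce/codet5-base checkpoint, the snippet, and the generation settings below are stand-in assumptions.

```python
from transformers import RobertaTokenizer, T5ForConditionalGeneration

# Publicly available code-pretrained T5 checkpoint, used here as a stand-in.
tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

# Mask part of a statement with the T5 sentinel token and let the model fill it in,
# mirroring the token/statement-level completion scenarios described above.
code = "def greet(user): print(f'hello <extra_id_0>!')"
input_ids = tokenizer(code, return_tensors="pt").input_ids
generated = model.generate(input_ids, max_length=32, num_beams=5)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```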