Title: Attention Based Transformer for Student Answers Assessment
Assessing the correctness of student answers in a dialog-based intelligent tutoring system (ITS) is a well-defined Natural Language Processing (NLP) task that has attracted the attention of many researchers in the field. Inspired by Vaswani's transformer, we propose in this paper an attention-based transformer neural network with a multi-head attention mechanism for the task of student answer assessment. Results show the competitiveness of our proposed model. The highest accuracy, 71.5%, was achieved when using ELMo embeddings, 10 heads of attention, and 2 layers. This is very competitive and rivals the highest accuracy achieved by a previously proposed Bi-GRU-Capsnet deep network (72.5%) on the same dataset. The main advantages of transformers over Bi-GRU-Capsnet are reduced training time and more room for parallelization.
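The paper does not include code; the following is a minimal numpy sketch of the multi-head attention mechanism it builds on (scaled dot-product attention split across heads, then concatenated). The dimensions, weight names, and random inputs are illustrative only; 10 heads matches the paper's best-reported setting.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, wq, wk, wv, wo, n_heads):
    """Scaled dot-product attention computed per head, then concatenated.

    x: (seq_len, d_model); wq/wk/wv/wo: (d_model, d_model).
    """
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    # Project, then reshape to (n_heads, seq_len, d_head).
    def split(m):
        return m.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)    # (heads, seq, seq)
    out = softmax(scores) @ v                              # (heads, seq, d_head)
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model) # concatenate heads
    return out @ wo

# Toy usage: 10 attention heads, hypothetical 100-dim token embeddings.
rng = np.random.default_rng(0)
d_model, seq_len, n_heads = 100, 6, 10
ws = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4)]
y = multi_head_attention(rng.standard_normal((seq_len, d_model)), *ws, n_heads)
print(y.shape)  # (6, 100)
```

A full transformer layer would add residual connections, layer normalization, and a feed-forward sublayer around this block.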
Award ID(s):
1822816
PAR ID:
10188357
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the Thirty-Third International FLAIRS Conference (FLAIRS-33)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Motivated by the good results of capsule networks in text classification and other Natural Language Processing tasks, we present in this paper a Bi-GRU Capsule Network model to automatically assess freely-generated student answers within the context of dialogue-based intelligent tutoring systems. Our proposed model is composed of several important components: an embedding layer, a Bi-GRU layer, a capsule layer, and a SoftMax layer. We conducted a number of experiments on a binary classification task: correct or incorrect answers. Our model reached a highest accuracy of 72.5% when using ELMo word embeddings, as detailed in the body of the paper.
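The Bi-GRU layer in the pipeline above encodes the answer by reading its token embeddings in both directions. A minimal numpy sketch of one GRU cell and a bidirectional pass follows; for brevity the two directions share parameters here (a real Bi-GRU uses separate forward and backward weights), and all sizes are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, p):
    """One GRU update: update gate z, reset gate r, candidate state h_tilde."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h))
    return (1 - z) * h + z * h_tilde

def bi_gru(xs, p, hidden):
    """Run the cell left-to-right and right-to-left, concatenate final states."""
    fwd, bwd = np.zeros(hidden), np.zeros(hidden)
    for x in xs:
        fwd = gru_step(fwd, x, p)
    for x in reversed(xs):
        bwd = gru_step(bwd, x, p)
    return np.concatenate([fwd, bwd])  # encoding passed to the capsule layer

rng = np.random.default_rng(1)
emb, hidden = 8, 5
p = {k: rng.standard_normal((hidden, emb if k[0] == "W" else hidden)) * 0.3
     for k in ["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]}
tokens = [rng.standard_normal(emb) for _ in range(4)]  # embedded answer tokens
enc = bi_gru(tokens, p, hidden)
print(enc.shape)  # (10,)
```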
  3. Inspired by Vaswani's transformer, we propose in this paper an attention-based transformer neural network with a multi-head attention mechanism for the task of student answer assessment. Results show the competitiveness of our proposed model. A highest accuracy of 71.5% was achieved when using ELMo embeddings, 10 heads of attention, and 2 layers. This is very competitive and rivals the highest accuracy achieved by a previously proposed Bi-GRU-Capsnet deep network (72.5%) on the same dataset. The main advantages of using transformers over Bi-GRU-Capsnet are reducing the training time and giving more space for parallelization.
  4. In directed energy deposition (DED), accurately controlling and predicting melt pool characteristics is essential for ensuring desired material qualities and geometric accuracies. This paper introduces a robust surrogate model based on recurrent neural network (RNN) architectures—Long Short-Term Memory (LSTM), Bidirectional LSTM (Bi-LSTM), and Gated Recurrent Unit (GRU). Leveraging a time series dataset from multi-physics simulations and a three-factor, three-level experimental design, the model accurately predicts melt pool peak temperatures, lengths, widths, and depths under varying conditions. RNN algorithms, particularly Bi-LSTM, demonstrate high predictive accuracy, with an R-square of 0.983 for melt pool peak temperatures. For melt pool geometry, the GRU-based model excels, achieving R-square values above 0.88 and reducing computation time by at least 29%, showcasing its accuracy and efficiency. The RNN-based surrogate model built in this research enhances understanding of melt pool dynamics and supports precise DED system setups. 
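The melt-pool abstract above reports model quality as R-square (the coefficient of determination). As a quick reference, it can be computed as 1 − SS_res / SS_tot; the temperature values below are hypothetical, not from the paper.

```python
import numpy as np

def r_square(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)   # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variance around mean
    return 1.0 - ss_res / ss_tot

# Toy check: predictions close to the truth give R^2 near 1.
temps = [1650.0, 1700.0, 1725.0, 1760.0]  # hypothetical peak temperatures
preds = [1648.0, 1705.0, 1720.0, 1763.0]
print(round(r_square(temps, preds), 3))  # prints 0.99
```

An R-square of 0.983, as reported for the Bi-LSTM temperature model, means the surrogate explains 98.3% of the variance in the simulated values.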
  5. Robinson, Peter (Ed.)
    Abstract
    Motivation: Oxford Nanopore sequencing, which produces long reads at low cost, has enabled many breakthroughs in genomics studies. However, the large number of errors in Nanopore genome assemblies affects the accuracy of genome analysis. Polishing is a procedure to correct the errors in a genome assembly and can improve the reliability of downstream analysis. However, the performance of existing polishing methods is still not satisfactory.
    Results: We developed a novel polishing method, NeuralPolish, to correct the errors in assemblies based on alignment matrix construction and orthogonal Bi-GRU networks. In this method, we designed an alignment feature matrix for representing read-to-assembly alignment. Each row of the matrix represents a read, and each column represents the aligned bases at each position of the contig. In the network architecture, a bi-directional GRU network is used to extract the sequence information inside each read by processing the alignment matrix row by row. After that, the feature matrix is processed by another bi-directional GRU network column by column to calculate the probability distribution. Finally, a CTC decoder generates a polished sequence with a greedy algorithm. We used five real datasets and three assembly tools, including Wtdbg2, Flye and Canu, for testing, and compared the results of different polishing methods including NeuralPolish, Racon, MarginPolish, HELEN and Medaka. Comprehensive experiments demonstrate that NeuralPolish achieves more accurate assembly with fewer errors than other polishing methods and can improve the accuracy of assemblies obtained by different assemblers.
    Availability and implementation: https://github.com/huangnengCSU/NeuralPolish.git.
    Supplementary information: Supplementary data are available at Bioinformatics online.
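The final stage described above, greedy CTC decoding, is simple enough to sketch: take the argmax symbol at each position, collapse consecutive repeats, and drop blanks. The toy probability table and alphabet below are hypothetical, not taken from NeuralPolish.

```python
import numpy as np

BLANK = 0  # index of the CTC blank symbol

def ctc_greedy_decode(probs, alphabet):
    """Greedy CTC: argmax per position, collapse repeated symbols, drop blanks."""
    best = probs.argmax(axis=1)  # most likely symbol at each position
    out, prev = [], None
    for idx in best:
        if idx != prev and idx != BLANK:  # collapse runs; skip blanks
            out.append(alphabet[idx])
        prev = idx
    return "".join(out)

# Toy distributions over {blank, A, C, G, T} for 6 contig positions.
alphabet = ["-", "A", "C", "G", "T"]
probs = np.array([
    [0.1, 0.7, 0.1, 0.05, 0.05],    # A
    [0.1, 0.7, 0.1, 0.05, 0.05],    # A (repeat -> collapsed)
    [0.8, 0.05, 0.05, 0.05, 0.05],  # blank (separates the next A)
    [0.1, 0.6, 0.1, 0.1, 0.1],      # A again, kept because of the blank
    [0.05, 0.05, 0.1, 0.75, 0.05],  # G
    [0.05, 0.05, 0.8, 0.05, 0.05],  # C
])
print(ctc_greedy_decode(probs, alphabet))  # prints AAGC
```

The blank symbol is what lets CTC emit genuine repeated bases: without the blank between them, the two A columns on either side would collapse into one.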