BI-GRU Capsule Networks for Student Answers Assessment
Motivated by the good results of capsule networks in text classification and other Natural Language Processing tasks, we present in this paper a Bi-GRU Capsule Networks model to automatically assess freely generated student answers within the context of dialogue-based intelligent tutoring systems. Our proposed model is composed of four components: an embedding layer, a Bi-GRU layer, a capsule layer, and a softmax layer. We conducted a number of experiments on a binary classification task: labeling answers as correct or incorrect. Our model reached its highest accuracy, 72.50%, when using ELMo word embeddings, as detailed in the body of the paper.
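The last two components listed in the abstract (capsule layer, then softmax) can be illustrated with a minimal plain-Python sketch. It rests on assumptions the abstract does not state: it uses the squash nonlinearity commonly used in capsule networks, represents each of the two classes by one small capsule vector, and applies softmax to the capsule lengths. The names `squash`, `vector_length`, `softmax`, `classify`, and the example numbers are hypothetical; this is not the authors' implementation, and the embedding and Bi-GRU stages are omitted.

```python
import math

def squash(vec):
    """Squash nonlinearity (assumed): keeps direction, maps length into [0, 1)."""
    sq_norm = sum(v * v for v in vec)
    if sq_norm == 0.0:
        return [0.0 for _ in vec]
    # scale = (|v|^2 / (1 + |v|^2)) / |v|
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * v for v in vec]

def vector_length(vec):
    """Euclidean length of a capsule vector."""
    return math.sqrt(sum(v * v for v in vec))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(class_capsules, labels=("incorrect", "correct")):
    """Pick the label whose squashed capsule is longest (hypothetical helper)."""
    lengths = [vector_length(squash(c)) for c in class_capsules]
    probs = softmax(lengths)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs

# Made-up capsule outputs for one student answer, one vector per class.
label, probs = classify([[0.1, -0.2, 0.05], [1.5, 0.8, -1.1]])
print(label, [round(p, 3) for p in probs])
```

In this sketch the longer class capsule wins, so the second (hypothetical "correct") capsule determines the label; a trained model would produce these vectors from the Bi-GRU outputs.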
- Award ID(s): 1822816
- PAR ID: 10110457
- Date Published:
- Journal Name: In Proceedings of The 2019 KDD Workshop on Deep Learning for Education (DL4Ed) in conjunction with the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2019), August 4-8, 2019, Anchorage, Alaska, USA
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Assessing the correctness of student answers in a dialog-based intelligent tutoring system (ITS) is a well-defined Natural Language Processing (NLP) task that has attracted the attention of many researchers in the field. Inspired by the Transformer of Vaswani et al., we propose in this paper an attention-based transformer neural network with a multi-head attention mechanism for the task of student answer assessment. Results show the competitiveness of our proposed model. The highest accuracy, 71.5%, was achieved when using ELMo embeddings, 10 attention heads, and 2 layers. This rivals the highest accuracy achieved by a previously proposed BI-GRU-Capsnet deep network (72.5%) on the same dataset. The main advantages of using transformers over BI-GRU-Capsnet are reduced training time and more room for parallelization.
-
Convolutional neural networks (CNNs) have become a key asset in most fields of AI. Despite their successful performance, CNNs suffer from a major drawback: they fail to capture the hierarchy of spatial relations among different parts of an entity. As a remedy to this problem, the idea of capsules was proposed by Hinton. In this paper, we propose the SubSpace Capsule Network (SCN), which exploits the idea of capsule networks to model possible variations in the appearance or implicitly defined properties of an entity through a group of capsule subspaces instead of simply grouping neurons to create capsules. A capsule is created by projecting an input feature vector from a lower layer onto the capsule subspace using a learnable transformation. This transformation finds the degree of alignment of the input with the properties modeled by the capsule subspace. We show that SCN is a general capsule network that can successfully be applied to both discriminative and generative models without incurring computational overhead compared to CNNs at test time. The effectiveness of SCN is evaluated through a comprehensive set of experiments on supervised image classification, semi-supervised image classification, and high-resolution image generation using the generative adversarial network (GAN) framework. SCN significantly improves the performance of the baseline models in all three tasks.
-
Effectively filtering and categorizing the large volume of user-generated content on social media during disaster events can help emergency management and disaster response prioritize their resources. Deep learning approaches, including recurrent neural networks and transformer-based models, have previously been used for this purpose. Capsule Neural Networks (CapsNets), initially proposed for image classification, have proven useful for text analysis as well. However, to the best of our knowledge, CapsNets have not been used for classifying crisis-related messages and have not been extensively compared with state-of-the-art transformer-based models such as BERT. Therefore, in this study, we performed a thorough comparison between CapsNet models, state-of-the-art BERT models, and two popular recurrent neural network models that have been successfully used for tweet classification, specifically LSTM and Bi-LSTM models, on the task of classifying crisis tweets both in terms of their informativeness (binary classification) and their humanitarian content (multi-class classification). For this purpose, we used several benchmark datasets for crisis tweet classification, namely CrisisBench, CrisisNLP, and CrisisLex. Experimental results show that the performance of the CapsNet models is on a par with that of LSTM and Bi-LSTM models for all metrics considered, while the BERT models surpassed the other three models across different datasets and classes for both classification tasks. BERT could thus be considered the best overall model for classifying crisis tweets.
-
There is growing interest in low-power, highly efficient wearable devices for automatic dietary monitoring (ADM) [1]. The success of deep neural networks in audio event classification problems makes them ideal for this task. Deep neural networks are, however, not only computationally intensive and energy inefficient but also require a large amount of memory. To address these challenges, we propose a shallow gated recurrent unit (GRU) architecture suitable for resource-constrained applications. This paper describes the implementation of the Tiny Eats GRU, a shallow GRU neural network, on a low-power microcontroller, the Arm Cortex M0+, to classify eating episodes. Tiny Eats GRU is a hybrid of the traditional GRU [2] and eGRU [3], which makes it small and fast enough to fit on the Arm Cortex M0+ with accuracy comparable to the traditional GRU. The Tiny Eats GRU uses only 4% of the Arm Cortex M0+ memory and identifies eating or non-eating episodes with 6 ms latency and an accuracy of 95.15%.