Comparison of Text Mining Feature Extraction Methods Using Moderated vs Non-Moderated Blogs: An Autism Perspective

Md Tayeen, Abu Saleh; Masadeh, Saleem; Mtibaa, Abderrahmen; Misra, Satyajayant; Choudhury, Moumita

doi:10.1145/3357729.3357740

Online social media is widely used by social scientists to study human behavior. Researchers have explored different feature extraction (FE) and classification techniques to perform sentiment analysis, topic identification, and similar tasks. Most studies evaluate FE and classification methods using only one particular class of datasets: either well-defined with little or no noise, or with well-defined noise. When the datasets under study have different noise characteristics, however, various FE and/or classification methods may fail to identify a given topic. In this paper, we fill this gap by quantitatively comparing multiple FE methods and classifiers using three different datasets (two moderator-controlled blogs and one single-authored personal blog) related to Autism Spectrum Disorder (ASD). Our results show that no particular combination of FE method and classifier is the best overall, but choosing the right ones can improve accuracy by over 30%.

Award ID(s): 1757207

PAR ID: 10156046

Author(s) / Creator(s): Md Tayeen, Abu Saleh; Masadeh, Saleem; Mtibaa, Abderrahmen; Misra, Satyajayant; Choudhury, Moumita

Date Published: 2019-01-01

Journal Name: DPH 2019

Page Range / eLocation ID: 69 to 78

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1145/3357729.3357740
