Automatic Event Coding Framework for Spanish Political News Articles

Salam, Sayeed; Khan, Lamisah; El-Ghamry, Amir; Brandt, Patrick; Holmes, Jennifer; D'Orazio, Vito; Osorio, Javier

doi:10.1109/BigDataSecurity-HPSC-IDS49724.2020.00052

Citation Details

Today, Spanish-speaking countries face widespread political crises. These conflicts are reported in a large volume of Spanish news articles from Spanish agencies. Our goal is to create a fully functioning system that parses Spanish texts in real time and generates event codes at scale. Rather than translating Spanish text into English and using English event coders, we aim to create a tool that works on raw Spanish text with Spanish event coders, for better flexibility, coverage, and cost. To accommodate the processing of a large number of Spanish articles, we adapt a distributed framework based on Apache Spark. We highlight how to extend the existing ontology to support automated coding of Spanish texts. We also present experimental data that give insight into the data collection process, including filtering out unrelated articles, scaling the framework, and gathering basic statistics on the dataset.

Award ID(s): 1931541

PAR ID: 10376287

Author(s) / Creator(s): Salam, Sayeed; Khan, Lamisah; El-Ghamry, Amir; Brandt, Patrick; Holmes, Jennifer; D'Orazio, Vito; Osorio, Javier

Date Published: 2020-05-01

Journal Name: 2020 IEEE 6th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS)

Page Range / eLocation ID: 246 to 253

Format(s): Medium: X

Sponsoring Org: National Science Foundation
