Deterministic Routing between Layout Abstractions for Multi-Scale Classification of Visually Rich Documents

Sarkhel, Ritesh; Nandi, Arnab

doi:10.24963/ijcai.2019/466

Citation Details

Classifying heterogeneous visually rich documents is a challenging task. The difficulty increases further when the maximum allowed inference turnaround time is constrained by a threshold. The increased inference overhead of current multi-scale approaches, weighed against their limited gain in classification capability, makes them infeasible in such scenarios. This work makes two major contributions. First, we propose a spatial pyramid model that extracts highly discriminative multi-scale feature descriptors from a visually rich document by leveraging the inherent hierarchy of its layout. Second, we propose a deterministic routing scheme that accelerates end-to-end inference by utilizing the spatial pyramid model. A depth-wise separable multi-column convolutional network is developed to enable our method. We evaluated the proposed approach on four publicly available benchmark datasets of visually rich documents. Results suggest that the proposed approach performs robustly compared to state-of-the-art methods in both classification accuracy and total inference turnaround time.

Award ID(s): 1910356

PAR ID: 10173224

Author(s) / Creator(s): Sarkhel, Ritesh; Nandi, Arnab

Date Published: 2019-08-01

Journal Name: 28th International Joint Conference on Artificial Intelligence (IJCAI), 2019

Page Range / eLocation ID: 3360 to 3366

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.24963/ijcai.2019/466