DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks

Das, Sagnik; Ma, Ke; Shu, Zhixin; Samaras, Dimitris; Shilkrot, Roy

doi:10.1109/ICCV.2019.00022

Citation Details

Capturing document images with hand-held devices in unstructured environments is now common practice. However, “casual” photos of documents are usually unsuitable for automatic information extraction, mainly because of physical distortion of the paper, as well as varying camera positions and illumination conditions. In this work, we propose DewarpNet, a deep-learning approach for document image unwarping from a single image. Our insight is that the 3D geometry of the document not only determines the warping of its texture but also causes the illumination effects. Our novelty therefore lies in explicitly modeling the 3D shape of the document paper in an end-to-end pipeline. We also contribute Doc3D, the largest and most comprehensive dataset for document image unwarping to date. The dataset features multiple ground-truth annotations, including 3D shape, surface normals, UV map, and albedo image. Trained on Doc3D, DewarpNet demonstrates state-of-the-art performance in extensive qualitative and quantitative evaluations. Our network also significantly improves OCR performance on captured document images, decreasing character error rate by 42% on average. Both the code and the dataset are released.
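The abstract reports OCR gains as a drop in character error rate (CER). As a minimal sketch, assuming the common definition of CER as Levenshtein edit distance divided by reference length, the metric can be computed as follows. The function names are hypothetical, the sample figures in the usage note are invented for illustration, and this record does not state the evaluation protocol the authors actually used or whether the 42% figure is a relative or an absolute reduction.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER under the assumed definition: edit distance / reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Relative drop in an error rate (assumption: the reduction is relative)."""
    return (before - after) / before
```

For example, with invented numbers, a CER of 0.50 on the warped image and 0.29 on the unwarped image would correspond to a relative reduction of 42% under this reading.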

Award ID(s): 1650499

PAR ID: 10137869

Author(s) / Creator(s): Das, Sagnik; Ma, Ke; Shu, Zhixin; Samaras, Dimitris; Shilkrot, Roy

Date Published: 2019-10-01

Journal Name: IEEE/CVF International Conference on Computer Vision

Page Range / eLocation ID: 131 to 140

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1109/ICCV.2019.00022
