StruBERT: Structure-aware BERT for Table Search and Matching

Trabelsi, Mohamed; Chen, Zhiyu; Zhang, Shuo; Davison, Brian D.; Heflin, Jeff

doi:10.1145/3485447.3511972

A table is composed of data values organized in a 2D matrix whose rows and columns provide implicit structural information. A table is usually accompanied by secondary information, such as the caption and page title, that forms its textual information. Understanding the connection between the textual and structural information is an important yet neglected aspect of table retrieval, as previous methods treat each source of information independently. In this paper, we propose StruBERT, a structure-aware BERT model that fuses the textual and structural information of a data table to produce context-aware representations of both its textual and tabular content. We introduce the concept of horizontal self-attention, which extends the idea of vertical self-attention introduced in TaBERT and allows us to treat both dimensions of a table equally. StruBERT features are integrated into a new end-to-end neural ranking model to solve three table-related downstream tasks: keyword-based table retrieval, content-based table retrieval, and table similarity. We evaluate our approach using three datasets, and we demonstrate substantial improvements in terms of retrieval and classification metrics over state-of-the-art methods.
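
The contrast between vertical and horizontal self-attention mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a single attention head with identity projections (no learned weights), cell embeddings that are already computed, and made-up values. All function names (`self_attention`, `vertical_attention`, `horizontal_attention`) are hypothetical and chosen only for illustration. The point is simply that the two variants differ in which cells are grouped together before attention is applied.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the input vectors themselves
    (identity projections), which a trained model would not use.
    """
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        )
    return outputs


def vertical_attention(table):
    """Cells in the same column attend to each other across rows."""
    n_rows, n_cols = len(table), len(table[0])
    result = [[None] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        attended = self_attention([table[r][c] for r in range(n_rows)])
        for r in range(n_rows):
            result[r][c] = attended[r]
    return result


def horizontal_attention(table):
    """Cells in the same row attend to each other across columns."""
    return [self_attention(row) for row in table]


# Toy 2x3 table of 2-dimensional cell embeddings (made-up values).
table = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]],
]
vertical = vertical_attention(table)      # mixes information within each column
horizontal = horizontal_attention(table)  # mixes information within each row
```

Both functions return a grid of the same shape as the input, so in a sketch like this the two views could be computed side by side and combined downstream; how StruBERT actually fuses them is described in the full paper rather than in this record.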

Award ID(s): 1816325

PAR ID: 10393251

Author(s) / Creator(s): Trabelsi, Mohamed; Chen, Zhiyu; Zhang, Shuo; Davison, Brian D.; Heflin, Jeff

Date Published: 2022-04-25

Journal Name: Proceedings of the ACM Web Conference 2022

Page Range / eLocation ID: 442 to 451

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper:
https://doi.org/10.1145/3485447.3511972
