A Study of Extracting Causal Relationships from Text

Gujarathi, Pranav; Reddy, Manohar; Tayade, Neha; Chakraborty, Sunandan

doi:10.1007/978-3-031-16075-2_59

Discovering causal knowledge is an important aspect of much scientific research, and such findings are often recorded in scholarly articles. Automatically identifying this knowledge from article text can be a useful tool and can act as an impetus for further research on those topics. Numerous applications can be built on such an approach, including constructing causal knowledge graphs, building pipelines for root cause analysis, and finding opportunities for drug discovery; more generally, it offers a scalable building block for turning large bodies of text into organized information. However, this requires robust methods to identify and aggregate causal knowledge from a large set of articles. The main challenge in designing new methods is the absence of a large labeled dataset. As a result, existing methods, trained on existing datasets of limited size and limited variation in linguistic patterns, are unable to generalize well to unseen text. In this paper, we explore multiple unsupervised approaches, including a reinforcement learning-based model that learns to identify causal sentences from a small set of labeled sentences. We describe and discuss our experiments for each approach in detail to encourage further exploration of methods that can be reused for different tasks, as opposed to simply pursuing a supervised learning process which, although superior in performance, lacks the versatility to be repurposed for slightly different tasks. We evaluate our methods on a custom-created dataset and show unique techniques to extract cause-effect relationships from English-language text.
