Title: Anchor and Broadcast: An Efficient Concept Alignment Approach for Evaluation of Semantic Graphs
In this paper, we present AnCast, an intuitive and efficient tool for evaluating graph-based meaning representations (MRs). AnCast implements evaluation metrics that are well understood in the NLP community, including concept F1, unlabeled relation F1, labeled relation F1, and weighted relation F1. The efficiency of the tool comes from a novel anchor broadcast alignment algorithm that is not subject to the trappings of local maxima. We show through experimental results that the AnCast score is highly correlated with the widely used Smatch score, but its computation takes only about 40% of the time.
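As a rough illustration of the simplest of these metrics, concept F1 can be computed from the concept labels matched under a node alignment between the predicted and gold graphs. The function below is a minimal sketch, assuming concepts are given as node-id-to-label maps and the alignment as a predicted-to-gold node map; these names are illustrative and do not reflect AnCast's actual API.

```python
def concept_f1(pred_concepts, gold_concepts, alignment):
    """Concept F1 between two semantic graphs under a fixed alignment.

    pred_concepts / gold_concepts: dicts mapping node id -> concept label.
    alignment: dict mapping predicted node ids to gold node ids.
    (Sketch only; not AnCast's real interface.)
    """
    # A match is an aligned node pair whose concept labels agree.
    matches = sum(
        1
        for p, g in alignment.items()
        if pred_concepts.get(p) is not None
        and pred_concepts.get(p) == gold_concepts.get(g)
    )
    precision = matches / len(pred_concepts) if pred_concepts else 0.0
    recall = matches / len(gold_concepts) if gold_concepts else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Relation F1 variants extend the same idea from aligned nodes to aligned (and optionally labeled or weighted) edges.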
Background Diabetic retinopathy (DR) is a leading cause of blindness in American adults. If detected, DR can be treated to prevent further damage that causes blindness. There is increasing interest in developing artificial intelligence (AI) technologies to help detect DR using electronic health records. The lesion-related information documented in fundus image reports is a valuable resource that could help the diagnosis of DR in clinical decision support systems. However, most studies of AI-based DR diagnosis are based on medical images; few studies have explored the lesion-related information captured in free-text image reports. Methods In this study, we examined two state-of-the-art transformer-based natural language processing (NLP) models, BERT and RoBERTa, and compared them with a recurrent neural network implemented using long short-term memory (LSTM) to extract DR-related concepts from clinical narratives. We identified four categories of DR-related clinical concepts (lesions, eye parts, laterality, and severity), developed annotation guidelines, annotated a DR corpus of 536 image reports, and developed transformer-based NLP models for clinical concept extraction and relation extraction. We also examined relation extraction under two settings: a 'gold-standard' setting, where gold-standard concepts were used, and an end-to-end setting. Results For concept extraction, the BERT model pretrained on the MIMIC III dataset achieved the best performance (F1 of 0.9503 and 0.9645 for strict and lenient evaluation, respectively). For relation extraction, the BERT model pretrained on general English text achieved the best strict/lenient F1 score of 0.9316. The end-to-end system, BERT_general_e2e, achieved the best strict and lenient F1 scores of 0.8578 and 0.8881, respectively. Another end-to-end system based on the RoBERTa architecture, RoBERTa_general_e2e, matched BERT_general_e2e in strict scores.
Conclusions This study demonstrated the effectiveness of transformer-based NLP models for clinical concept extraction and relation extraction. Our results show that it is necessary to pretrain transformer models on clinical text to optimize performance for clinical concept extraction, whereas for relation extraction, transformers pretrained on general English text perform better.
Rahimi, Mahdi and Surdeanu, Mihai
(Proceedings of the 8th Workshop on Representation Learning for NLP (RepL4NLP))
While fully supervised relation classification (RC) models perform well on large-scale datasets, their performance drops drastically in low-resource settings. Because generating annotated examples is expensive, recent zero-shot methods reformulate RC as other NLP tasks for which supervision exists, such as textual entailment. However, these methods rely on manually created templates, which is costly and requires domain expertise. In this paper, we present a novel strategy for template generation for relation classification, based on adapting Harris' distributional similarity principle to templates encoded using contextualized representations. Further, we empirically evaluate different strategies for combining the automatically acquired templates with manual templates. Experimental results on TACRED show that our approach not only performs better than zero-shot RC methods that use only manual templates, but also achieves state-of-the-art performance for zero-shot TACRED with an F1 score of 64.3.
Wang, Yiwei; Hooi, Bryan; Wang, Fei; Cai, Yujun; Liang, Yuxuan; Zhou, Wenxuan; Tang, Jing; Duan, Manjuan; Chen, Muhao
(Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL))
Relation extraction (RE) aims to extract the relations between entity names from their textual context. In principle, the textual context determines the ground-truth relation, and RE models should be able to correctly identify the relations reflected by it. However, existing work has found that RE models memorize entity name patterns to make predictions while ignoring the textual context. This motivates us to ask: are RE models robust to entity replacements? In this work, we apply random and type-constrained entity replacements to the RE instances in TACRED and evaluate state-of-the-art RE models under these replacements. We observe F1 score drops of 30% to 50% on state-of-the-art RE models under entity replacements. These results suggest that more effort is needed to develop RE models that are robust to entity replacements. We release the source code at https://github.com/wangywUST/RobustRE.
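The type-constrained replacement described above can be sketched in a few lines: each subject and object mention is swapped for another name of the same entity type while the rest of the sentence is left untouched. The snippet below is a minimal sketch under assumed field names (`text`, `subj`, `subj_type`, and so on); it is not the paper's code or the TACRED schema.

```python
import random

def replace_entities(instance, names_by_type, rng=random):
    """Return a copy of an RE instance with its subject and object
    mentions replaced by random same-type names (type-constrained
    entity replacement). Field names here are illustrative."""
    out = dict(instance)
    for slot in ("subj", "obj"):
        etype = instance[f"{slot}_type"]
        # Draw a replacement name of the same entity type.
        out[slot] = rng.choice(names_by_type[etype])
        # Rewrite the mention in the sentence text as well.
        out["text"] = out["text"].replace(instance[slot], out[slot])
    return out
```

A robust RE model should predict the same relation for the rewritten instance, since the textual context is unchanged; the drop reported above measures how far models fall short of that.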
Reavis, Janie L.; Demir, H. Seckin; Witherington, Blair E.; Bresette, Michael J.; Blain Christen, Jennifer; Senko, Jesse F.; Ozev, Sule
(Frontiers in Marine Science)
Incidental capture, or bycatch, of marine species is a global conservation concern. Interactions with fishing gear can cause mortality in air-breathing marine megafauna, including sea turtles. Despite this, interactions between sea turtles and fishing gear, from a behavioral standpoint, are not sufficiently documented or described in the literature. Understanding sea turtle behavior in relation to fishing gear is key to discovering how they become entangled or entrapped in gear, and this information can also be used to reduce fisheries interactions. However, recording and analyzing these behaviors is difficult and time intensive. In this study, we present a machine learning-based sea turtle behavior recognition scheme. The proposed method uses visual object tracking and orientation estimation to extract important features for recognizing behaviors of interest, with green turtles (Chelonia mydas) as the study subject. These features are then combined into a color-coded feature image that represents the turtle behaviors occurring within a limited time frame. The spatiotemporal feature images are used along with a deep convolutional neural network model to recognize the desired behaviors, specifically the evasive behaviors we have labeled "reversal" and "U-turn." Experimental results show that the proposed method achieves an average F1 score of 85% in recognizing the target behavior patterns. This method is intended to be a tool for discovering why sea turtles become entangled in gillnet fishing gear.
Sun, Haibo, and Xue, Nianwen. Anchor and Broadcast: An Efficient Concept Alignment Approach for Evaluation of Semantic Graphs. Retrieved from https://par.nsf.gov/biblio/10527020.
Sun, Haibo, and Xue, Nianwen. "Anchor and Broadcast: An Efficient Concept Alignment Approach for Evaluation of Semantic Graphs". ELRA and ICCL. https://par.nsf.gov/biblio/10527020.
@article{osti_10527020,
place = {Country unknown/Code not available},
title = {Anchor and Broadcast: An Efficient Concept Alignment Approach for Evaluation of Semantic Graphs},
url = {https://par.nsf.gov/biblio/10527020},
abstractNote = {In this paper, we present AnCast, an intuitive and efficient tool for evaluating graph-based meaning representations (MRs). AnCast implements evaluation metrics that are well understood in the NLP community, including concept F1, unlabeled relation F1, labeled relation F1, and weighted relation F1. The efficiency of the tool comes from a novel anchor broadcast alignment algorithm that is not subject to the trappings of local maxima. We show through experimental results that the AnCast score is highly correlated with the widely used Smatch score, but its computation takes only about 40% of the time.},
publisher = {ELRA and ICCL},
author = {Sun, Haibo and Xue, Nianwen},
editor = {Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro and Sakti, Sakriani and Xue, Nianwen}
}