NSF PAR Search | NSF Public Access Repository


A Gated Recurrent Unit based architecture for recognizing ontology concepts from biological literature

https://doi.org/10.1186/s13040-022-00310-0

Devkota, Pratik; Mohanty, Somya D.; Manda, Prashanti (September 2022, BioData Mining)

Abstract
Background: Annotating scientific literature with ontology concepts is a critical task in biology and several other domains for knowledge discovery. Ontology-based annotations can power large-scale comparative analyses in applications ranging from evolutionary phenotypes to rare human diseases to the study of protein functions. Computational methods for tagging scientific text with ontology terms have included lexical/syntactic methods, traditional machine learning, and most recently, deep learning.
Results: Here, we present state-of-the-art deep learning architectures based on Gated Recurrent Units for annotating text with ontology concepts. We use the Colorado Richly Annotated Full Text Corpus (CRAFT) as a gold standard for training and testing. We explore a number of additional information sources, including NCBI’s BioThesaurus and the Unified Medical Language System (UMLS), to augment information from CRAFT and increase prediction accuracy. Our best model achieves an F1 and a semantic similarity of 0.84.
Conclusion: The results shown here underscore the impact of using deep learning architectures for automatically recognizing ontology concepts from literature. Augmenting the models with biological information beyond that present in the gold standard corpus yields a distinct improvement in prediction accuracy.
A deep semantic matching approach for identifying relevant messages for social media analysis

https://doi.org/10.1038/s41598-023-38761-y

Biggers, Frederick Brown; Mohanty, Somya D.; Manda, Prashanti (July 2023, Scientific Reports)

Abstract
There is growing interest in using social media content for Natural Language Processing applications. However, it is not easy to computationally identify the most relevant set of tweets related to a specific event. Challenging semantics, coupled with the varied ways natural language is used in social media, make it difficult to retrieve the most relevant set of data from any social media outlet. This paper seeks to demonstrate a way to present the changing semantics of Twitter within the context of a crisis event, specifically tweets during Hurricane Irma. These methods can be used to identify the most relevant corpus of text for analysis of a specific incident such as a hurricane. Using a Word2Vec neural network implementation to create word embeddings, this paper will: discuss how the relative meaning of words changes as events unfold; present a mechanism for scoring tweets based upon dynamic, relative context relatedness; and show that similarity between words is not necessarily static. We present different methods for training the vector model in Word2Vec to identify the most relevant tweets for any search query. We explored the impact of tuning parameters such as Word Window Size, Minimum Word Frequency, Hidden Layer Dimensionality, and Negative Sampling on model performance. The window containing the local maximum of AU_ROC for each parameter serves as a guide for other studies that apply these methods to social media data analysis.
Improving the Evaluation of NLP Approaches for Scientific Text Annotation with Ontology Embedding-Based Semantic Similarity Metrics

Devkota, Pratik; Mohanty, Somya; Manda, Prashanti (December 2023, ACL Anthology)

Full Text Available
Using ontology embeddings with deep learning architectures to improve prediction of ontology concepts from literature

Devkota, Pratik; Mohanty, Somya; Manda, Prashanti (September 2023, CEUR workshop proceedings)

Full Text Available
Ontology-Powered Boosting for Improved Recognition of Ontology Concepts from Biological Literature

https://doi.org/10.5220/0011683200003414

Devkota, Pratik; Mohanty, Somya; Manda, Prashanti (January 2023, 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023))

Full Text Available
Knowledge of the Ancestors: Intelligent Ontology-aware Annotation of Biological Literature using Semantic Similarity

Devkota, Pratik; Mohanty, Somya; Manda, Prashanti (September 2022, International Conference on Biomedical Ontology)

Full Text Available
Knowledge of the Ancestors: Intelligent Ontology-aware Annotation of Biological Literature using Semantic Similarity

Devkota, Pratik; Mohanty, Somya; Manda, Prashanti (January 2022, International Conference on Biomedical Ontology)

Full Text Available
Automated ontology-based annotation of scientific literature using deep learning

https://doi.org/10.1145/3391274.3393636

Manda, Prashanti; SayedAhmed, Saed; Mohanty, Somya D. (July 2020, Proceedings of The International Workshop on Semantic Big Data)
Full Text Available
