Identification of missing hierarchical relations in the vaccine ontology using acquired term pairs

Manuel, Warren; Abeysinghe, Rashmie; He, Yongqun; Tao, Cui; Cui, Licong

doi:10.1186/s13326-022-00276-2

Abstract

Background: The Vaccine Ontology (VO) is a biomedical ontology that standardizes vaccine annotation. Errors in VO affect the many applications that use it, so quality assurance of VO is imperative to ensure that it provides accurate domain knowledge to these downstream tasks. Manual review to identify and fix quality issues (such as missing hierarchical is-a relations) is challenging given the complexity of the ontology. Automated approaches are highly desirable to facilitate the quality assurance of VO.

Methods: We developed an automated lexical approach that identifies potentially missing is-a relations in VO. First, we construct two types of VO concept-pairs: (1) linked; and (2) unlinked. For each concept-pair, we then derive an Acquired Term Pair (ATP) from the lexical features of its two concepts. If the same ATP is obtained from a linked concept-pair and from an unlinked concept-pair, we take this to indicate a potentially missing is-a relation between the unlinked pair of concepts.

Results: Applying this approach to version 1.1.192 of VO, we identified 232 potentially missing is-a relations. A manual review by a VO domain expert of a random sample of 70 potentially missing is-a relations found that 65 were valid missing is-a relations in VO (a precision of 92.86%).

Conclusions: The results indicate that our approach is highly effective in identifying missing is-a relations in VO.
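The matching step described in the Methods can be illustrated with a minimal Python sketch. The abstract does not say how an ATP is derived, so the sketch assumes one plausible reading: the ATP of a concept-pair is the pair of word sets left over once the words shared by both names are removed. That derivation, the toy concept names, the identifiers `c1` to `c4`, and the function names are all hypothetical and are not taken from VO or from the authors' implementation.

```python
from itertools import permutations

# Toy data, invented for illustration only (not real VO content).
names = {
    "c1": "influenza virus vaccine",
    "c2": "virus vaccine",
    "c3": "live attenuated influenza virus vaccine",
    "c4": "live attenuated virus vaccine",
}
# Asserted is-a edges as (child, parent) pairs.
is_a = {("c1", "c2"), ("c3", "c1"), ("c4", "c2")}


def acquired_term_pair(child_name, parent_name):
    """ASSUMED ATP: the words unique to each name after dropping shared words."""
    a = set(child_name.lower().split())
    b = set(parent_name.lower().split())
    return (frozenset(a - b), frozenset(b - a))


def ancestors(node, parents):
    """All concepts reachable from `node` by following is-a edges upward."""
    seen, stack = set(), list(parents.get(node, ()))
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents.get(p, ()))
    return seen


def suggest_missing_is_a(names, is_a):
    """Return unlinked (child, parent) pairs whose ATP also occurs on a linked pair."""
    parents = {}
    for child, parent in is_a:
        parents.setdefault(child, set()).add(parent)
    anc = {c: ancestors(c, parents) for c in names}

    # ATPs of linked pairs (direct or transitive is-a).
    linked_atps = {
        acquired_term_pair(names[c], names[p]) for c in names for p in anc[c]
    }

    suggestions = []
    for a, b in permutations(sorted(names), 2):
        if b in anc[a] or a in anc[b]:
            continue  # already linked in one direction
        if acquired_term_pair(names[a], names[b]) in linked_atps:
            suggestions.append((a, b))
    return suggestions


print(suggest_missing_is_a(names, is_a))
```

In this toy hierarchy, the linked pair (`c1`, `c2`) and the unlinked pair (`c3`, `c4`) both yield the ATP ({influenza}, {}), so the sketch flags `c3` is-a `c4` as a candidate. Candidates produced this way are suggestions for expert review rather than confirmed relations, which is how the 65 valid cases out of 70 sampled (65/70 ≈ 92.86%) were established.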
