Abstract. Background: As biomedical knowledge is rapidly evolving, concept enrichment of biomedical terminologies is an active research area involving automatic identification of missing or new concepts. Previously, we prototyped a lexical-based formal concept analysis (FCA) approach, in which concepts were derived by intersecting bags of words, to identify potentially missing concepts in the National Cancer Institute (NCI) Thesaurus. However, this prototype did not handle concept naming and positioning. In this paper, we introduce a sequence-based FCA approach to identify potentially missing concepts, supporting concept naming and positioning. Methods: We consider the concept name sequences as FCA attributes to construct the formal context. The concept-forming process is performed by computing the longest common substrings of concept name sequences. After new concepts are formalized, we further predict their potential positions in the original hierarchy by identifying their supertypes and subtypes among the original concepts. Automated validation via external terminologies in the Unified Medical Language System (UMLS) and biomedical literature in PubMed is performed to evaluate the effectiveness of our approach. Results: We applied our sequence-based FCA approach to all the sub-hierarchies under Disease or Disorder in the NCI Thesaurus (19.08d version) and five sub-hierarchies under Clinical Finding and Procedure in SNOMED CT (US Edition, March 2020 release). In total, 1397 potentially missing concepts were identified in the NCI Thesaurus and 7223 in SNOMED CT. For the NCI Thesaurus, 85 potentially missing concepts were found in external terminologies and 315 of the remaining 1312 appeared in biomedical literature. For SNOMED CT, 576 were found in external terminologies and 1159 of the remaining 6647 were found in biomedical literature. Conclusion: Our sequence-based FCA approach shows promise for identifying potentially missing concepts in biomedical terminologies.
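The concept-forming step described in the Methods above (longest common substrings of concept name sequences) can be illustrated with a minimal sketch. It assumes that a concept name is tokenized into lowercase words on whitespace and that a "substring" is a contiguous run of words; the function name and the two example concept names are hypothetical illustrations, not taken from the paper or from the NCI Thesaurus.

```python
from typing import List


def longest_common_word_substring(name_a: str, name_b: str) -> List[str]:
    """Return the longest contiguous run of words shared by two concept names.

    Assumption: names are tokenized on whitespace and compared case-insensitively.
    If several runs tie for the longest, the first one found in name_a is returned.
    """
    x, y = name_a.lower().split(), name_b.lower().split()
    best_len, best_end = 0, 0
    # prev[j] holds the length of the common run ending at x[i-2] and y[j-1].
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        curr = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return x[best_end - best_len:best_end]


# Hypothetical concept names, used only for illustration.
candidate = longest_common_word_substring(
    "Recurrent Bladder Carcinoma", "Stage IV Bladder Carcinoma"
)
print(" ".join(candidate))
```

In this sketch the shared word sequence would serve as the name of a candidate concept; checking whether that name already exists in the terminology, and positioning it in the hierarchy, are separate steps that are not shown here.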
The diversity of experimental organisms in biomedical research may be influenced by biomedical funding
Contrary to concerns of some critics, we present evidence that biomedical research is not dominated by a small handful of model organisms. An exhaustive analysis of research literature suggests that the diversity of experimental organisms in biomedical research has increased substantially since 1975. There has been a longstanding worry that organism‐centric funding policies can lead to biases in experimental organism choice, and thus negatively impact the direction of research and the interpretation of results. Critics have argued that a focus on model organisms has unduly constrained the diversity of experimental organisms. The availability of large electronic databases of scientific literature, combined with interest in quantitative methods among philosophers of science, presents new opportunities for data‐driven investigations into organism choice in biomedical research. The diversity of organisms used in NIH‐funded research may be considerably lower than in the broader biomedical sciences, and may be subject to greater constraints on organism choice.
- Award ID(s):
- 1656284
- PAR ID:
- 10026900
- Publisher / Repository:
- Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name:
- BioEssays
- Volume:
- 39
- Issue:
- 5
- ISSN:
- 0265-9247
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- Abstract. Mormyroidea is a superfamily of weakly electric African fishes with great potential as a model in a variety of biomedical research areas, including systems neuroscience, muscle cell and craniofacial development, ion channel biophysics, and flagellar/ciliary biology. However, they are currently difficult to breed in the laboratory, and reliable laboratory breeding is essential for any tractable model organism. There is therefore a need to better understand the reproductive biology of mormyroids so that they can be bred more reliably and used effectively as a biomedical research model. This review seeks to (1) briefly highlight the biomedically relevant phenotypes of mormyroids and (2) compile information about mormyroid reproduction, including sex differences, breeding season, sexual maturity, gonads, gametes, and courtship/spawning behaviors. We also highlight areas of mormyroid reproductive biology that are currently unexplored or that merit further investigation and may provide insights into more successful laboratory breeding methods.
- Abstract. Motivation: Figures in biomedical papers communicate essential information with the potential to identify relevant documents in biomedical and clinical settings. However, academic search interfaces mainly search over text fields. Results: We describe a search system for biomedical documents that leverages image modalities and an existing index server. We integrate a problem-specific taxonomy of image modalities and image-based data into a custom search system. Our solution features a front-end interface to enhance classical document search results with image-related data, including page thumbnails, figures, captions and image-modality information. We demonstrate the system on a subset of the CORD-19 document collection. A quantitative evaluation demonstrates higher precision and recall for biomedical document retrieval. A qualitative evaluation with domain experts further highlights our solution’s benefits to biomedical search. Availability and implementation: A demonstration is available at https://runachay.evl.uic.edu/scholar. Our code and image models can be accessed via github.com/uic-evl/bio-search. The dataset is continuously expanded.
- Entity linking is the task of linking mentions of named entities in natural language text to entities in a curated knowledge base. This is of significant importance in the biomedical domain, where it could be used to semantically annotate a large volume of clinical records and biomedical literature with standardized concepts described in an ontology such as the Unified Medical Language System (UMLS). We observe that with precise type information, entity disambiguation becomes a straightforward task. However, fine-grained type information is usually not available in the biomedical domain. Thus, we propose LATTE, a LATent Type Entity Linking model, that improves entity linking by modeling the latent fine-grained type information about mentions and entities. Unlike previous methods that perform entity linking directly between the mentions and the entities, LATTE jointly performs entity disambiguation and latent fine-grained type learning, without direct supervision. We evaluate our model on two biomedical datasets: MedMentions, a large-scale public dataset annotated with UMLS concepts, and a de-identified corpus of dictated doctor’s notes that has been annotated with ICD concepts. Extensive experimental evaluation shows our model achieves significant performance improvements over several state-of-the-art techniques.
- Abstract. Unicellular organisms can engage in a process by which a cell purposefully destroys itself, termed programmed cell death (PCD). While it is clear that the death of specific cells within a multicellular organism could increase inclusive fitness (e.g., during development), the origin of PCD in unicellular organisms is less obvious. Kin selection has been shown to help maintain instances of PCD in existing populations of unicellular organisms; however, competing hypotheses exist about whether additional factors are necessary to explain its origin. Those factors could include an environmental shift that causes latent PCD to be expressed, PCD hitchhiking on a large beneficial mutation, and PCD being simply a common pathology. Here, we present results using an artificial life model to demonstrate that kin selection can, in fact, be sufficient to give rise to PCD in unicellular organisms. Furthermore, when benefits to kin are direct (that is, resources provided to nearby kin), PCD is more beneficial than when benefits are indirect (that is, nonkin are injured, thus increasing the relative amount of resources for kin). Finally, when considering how strict organisms are in determining kin or nonkin (in terms of mutations), direct benefits are viable in a narrower range than indirect benefits. Open Research Badges: This article has been awarded Open Data and Open Materials Badges. All materials and data are publicly accessible via the Open Science Framework at https://github.com/anyaevostinar/SuicidalAltruismDissertation/tree/master/LongTerm.