Understanding causality is a longstanding goal across many domains. Articles such as those published in medical journals disseminate newly discovered knowledge, which is often causal. In this paper, we use this intuition to build a model that leverages causal relations to unearth factors related to Sjögren’s syndrome. Sjögren’s syndrome is an autoimmune disease affecting up to 3.1 million Americans. Because the disease is uncommon and shares symptoms with other autoimmune conditions such as rheumatoid arthritis, it is difficult for clinicians to diagnose it in a timely manner. Diagnosis is further hampered by suboptimal communication between dentists and physicians, including rheumatologists and ophthalmologists, because the clinical manifestations of this disease require patients to visit physicians with different specialties. A centralized information system with easy access to common and uncommon factors related to Sjögren’s syndrome may alleviate the problem. We use causal relationships automatically extracted from text about Sjögren’s syndrome, collected from the medical literature, to identify a set of factors related to this disease, such as “signs and symptoms” and “associated conditions”. We show that our approach is capable of retrieving such factors with high precision and recall. Comparative experiments show that this approach leads to a 25% improvement in retrieval F1-score compared to several state-of-the-art biomedical models, including BioBERT and Gram-CNN.
Note: Using Causality to Mine Sjögren’s Syndrome related Factors from Medical Literature
Research articles published in medical journals often present findings from causal experiments. In this paper, we use this intuition to build a model that leverages causal relations expressed in text to unearth factors related to Sjögren’s syndrome. Sjögren’s syndrome is an autoimmune disease affecting up to 3.1 million Americans. The uncommon nature of the disease, coupled with the symptoms it shares with other autoimmune conditions, makes timely diagnosis very hard. A centralized information system with easy access to common and uncommon factors related to Sjögren’s syndrome may alleviate the problem. We use causal relationships automatically extracted from text about Sjögren’s syndrome, collected from the medical literature, to identify a set of factors related to this disease, such as “signs and symptoms” and “associated conditions”. We show that our approach is capable of retrieving such factors with high precision and recall. Comparative experiments show that this approach leads to a 25% improvement in retrieval F1-score compared to several state-of-the-art biomedical models, including BioBERT and Gram-CNN.
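For readers unfamiliar with the retrieval metrics named in the abstract, the following is a minimal sketch of how precision, recall, and F1-score are conventionally computed for a set of retrieved factors. It is not the authors' evaluation code: the function name `retrieval_scores` and the toy factor lists are assumptions made purely for illustration, and exact string matching is assumed as the matching criterion.

```python
def retrieval_scores(retrieved, relevant):
    """Return (precision, recall, F1) for retrieved vs. relevant factors.

    Hypothetical helper for illustration; assumes exact string matching.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy inputs, invented for this sketch (not data from the paper).
gold = {"dry eyes", "dry mouth", "fatigue", "rheumatoid arthritis"}
predicted = {"dry eyes", "dry mouth", "headache"}
precision, recall, f1 = retrieval_scores(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it rewards a system only when precision and recall are both high, which is why it is a common single-number summary for retrieval comparisons like the one reported above.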
- Award ID(s): 1948322
- PAR ID: 10446656
- Date Published:
- Journal Name: COMPASS '22: Proceedings of the 5th ACM SIGCAS/SIGCHI Conference on Computing and Sustainable Societies
- Page Range / eLocation ID: 674 to 681
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- In response to the COVID-19 outbreak, scientists and medical researchers are capturing a wide range of host responses, symptoms and lingering postrecovery problems within the human population. These variable clinical manifestations suggest differences in influential factors, such as innate and adaptive host immunity, existing or underlying health conditions, comorbidities, genetics and other factors, compounding the complexity of COVID-19 pathobiology and potential biomarkers associated with the disease, as they become available. The heterogeneous data pose challenges for efficient extrapolation of information into clinical applications. We have curated 145 COVID-19 biomarkers by developing a novel cross-cutting disease biomarker data model that allows integration and evaluation of biomarkers in patients with comorbidities. Most biomarkers are related to the immune (SAA, TNF-α and IP-10) or coagulation (D-dimer, antithrombin and VWF) cascades, suggesting complex vascular pathobiology of the disease. Furthermore, we observe commonality with established cancer biomarkers (ACE2, IL-6, IL-4 and IL-2) as well as biomarkers for metabolic syndrome and diabetes (CRP, NLR and LDL). We explore these trends as we put forth a COVID-19 biomarker resource (https://data.oncomx.org/covid19) that will help researchers and diagnosticians alike.
- Masanori Aikawa (Ed.) As binary switches, RAS proteins switch to an ON/OFF state during signaling and are on a leash under normal conditions. However, in RAS-related diseases such as cancer and RASopathies, mutations in the genes that regulate RAS signaling or the RAS itself permanently activate the RAS protein. The structural basis of this switch is well understood; however, the exact mechanisms by which RAS proteins are regulated are less clear. RAS/MAPK syndromes are multisystem developmental disorders caused by germline mutations in genes associated with the RAS/mitogen-activated protein kinase pathway, impacting 1 in 1,000–2,500 children. These include a variety of disorders such as Noonan syndrome (NS) and NS-related disorders (NSRD), such as cardiofaciocutaneous (CFC) syndrome, Costello syndrome (CS), and NS with multiple lentigines (NSML, also known as LEOPARD syndrome). The frequent manifestation of cardiomyopathy (CM) and hypertrophic cardiomyopathy associated with RASopathies suggests that RASopathies could be a potential causative factor for CM. However, the current supporting evidence is sporadic and unclear. RASopathy patients also display a broad spectrum of congenital heart disease (CHD). More than 15 genes encode components of the RAS/MAPK signaling pathway that are essential for the cell cycle and play regulatory roles in proliferation, differentiation, growth, and metabolism. These genes are linked to the molecular genetic pathogenesis of these syndromes. However, genetic heterogeneity for a given syndrome on the one hand and alleles for multiple syndromes on the other make classification difficult in diagnosing RAS/MAPK-related diseases. Although there is some genetic homogeneity in most RASopathies, several RASopathies are allelic diseases. This allelism points to the role of critical signaling nodes and sheds light on the overlap between these related syndromes. Even though considerable progress has been made in understanding the pathophysiology of RASopathy with the identification of causal mutations and the functional analysis of their pathophysiological consequences, there are still unidentified causal genes for many patients diagnosed with RASopathies.
- Amid the COVID-19 pandemic, it has been reported that greater than 35% of patients with confirmed or suspected COVID-19 develop postacute sequelae of SARS-CoV-2 (PASC). PASC is still a disease for which preliminary medical data are being collected (mostly measurements collected during hospital or clinical visits), and pathophysiological understanding is still in its infancy. The disease is notable for its prevalence and its variable symptom presentation, and as such, management plans could be more holistically made if health care providers had access to unobtrusive home-based wearable and contactless continuous physiologic and physical sensor data. Such between-hospital or between-clinic data can quantitatively elucidate a majority of the temporal evolution of PASC symptoms. Although not universally of comparable accuracy to gold standard medical devices, home-deployed sensors offer great insights into the development and progression of PASC. Suitable sensors include those providing vital signs and activity measurements that correlate directly or by proxy to documented PASC symptoms. Such continuous, home-based data can give care providers contextualized information from which symptom exacerbation or relieving factors may be classified. Such data can also improve the collective academic understanding of PASC by providing temporally and activity-associated symptom cataloging. In this viewpoint, we make a case for the utilization of home-based continuous sensing that can serve as a foundation from which medical professionals and engineers may develop and pursue long-term mitigation strategies for PASC.
- An ontology is a structured framework that categorizes entities, concepts, and relationships within a domain to facilitate shared understanding, and it is important in computational linguistics and knowledge representation. In this paper, we propose a novel framework to automatically extend an existing ontology from streaming data in a zero-shot manner. Specifically, the zero-shot ontology extension framework uses online and hierarchical clustering to integrate new knowledge into existing ontologies without substantial annotated data or domain-specific expertise. Focusing on the medical field, this approach leverages Large Language Models (LLMs) for two key tasks: Symptom Typing and Symptom Taxonomy among breast and bladder cancer survivors. Symptom Typing involves identifying and classifying medical symptoms from unstructured online patient forum data, while Symptom Taxonomy organizes and integrates these symptoms into an existing ontology. The combined use of online and hierarchical clustering enables real-time and structured categorization and integration of symptoms. The dual-phase model employs multiple LLMs to ensure accurate classification and seamless integration of new symptoms with minimal human oversight. The paper details the framework's development, experiments, quantitative analyses, and data visualizations, demonstrating its effectiveness in enhancing medical ontologies and advancing knowledge-based systems in healthcare.

