Title: Preparing for the next pandemic via transfer learning from existing diseases with hierarchical multi-modal BERT: a study on COVID-19 outcome prediction
Abstract

Developing prediction models for emerging infectious diseases from relatively small numbers of cases is a critical need for improving pandemic preparedness. Using COVID-19 as an exemplar, we propose a transfer learning methodology for developing predictive models from multi-modal electronic healthcare records by leveraging information from more prevalent diseases with shared clinical characteristics. Our novel hierarchical, multi-modal model (TransMED) integrates baseline risk factors from the natural language processing of clinical notes at admission, time-series measurements of biomarkers obtained from laboratory tests, and discrete diagnostic, procedure and drug codes. We demonstrate the alignment of TransMED’s predictions with well-established clinical knowledge about COVID-19 through univariate and multivariate risk factor driven sub-cohort analysis. TransMED’s superior performance over state-of-the-art methods shows that leveraging patient data across modalities and transferring prior knowledge from similar disorders is critical for accurate prediction of patient outcomes, and this approach may serve as an important tool in the early response to future pandemics.

 
Award ID(s):
1838730
NSF-PAR ID:
10368315
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Scientific Reports
Volume:
12
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    We continue the program of proving circuit lower bounds via circuit satisfiability algorithms. So far, this program has yielded several concrete results, proving that functions in $\mathsf{Quasi}\text{-}\mathsf{NP} = \mathsf{NTIME}[n^{(\log n)^{O(1)}}]$ and other complexity classes do not have small circuits (in the worst case and/or on average) from various circuit classes $\mathcal{C}$, by showing that $\mathcal{C}$ admits non-trivial satisfiability and/or #SAT algorithms which beat exhaustive search by a minor amount. In this paper, we present a new strong lower bound consequence of having a non-trivial #SAT algorithm for a circuit class $\mathcal{C}$. Say that a symmetric Boolean function $f(x_1,\ldots,x_n)$ is sparse if it outputs 1 on $O(1)$ values of $\sum_{i} x_{i}$. We show that for every sparse $f$, and for all “typical” $\mathcal{C}$, faster #SAT algorithms for $\mathcal{C}$ circuits imply lower bounds against the circuit class $f \circ \mathcal{C}$, which may be stronger than $\mathcal{C}$ itself. In particular:

    #SAT algorithms for $n^k$-size $\mathcal{C}$-circuits running in $2^n/n^k$ time (for all $k$) imply NEXP does not have $(f \circ \mathcal{C})$-circuits of polynomial size.

    #SAT algorithms for $2^{n^{\varepsilon}}$-size $\mathcal{C}$-circuits running in $2^{n-n^{\varepsilon}}$ time (for some $\varepsilon > 0$) imply Quasi-NP does not have $(f \circ \mathcal{C})$-circuits of polynomial size.

    Applying #SAT algorithms from the literature, one immediate corollary of our results is that Quasi-NP does not have $\mathsf{EMAJ} \circ \mathsf{ACC}^0 \circ \mathsf{THR}$ circuits of polynomial size, where $\mathsf{EMAJ}$ is the “exact majority” function, improving previous lower bounds against $\mathsf{ACC}^0$ [Williams JACM’14] and $\mathsf{ACC}^0 \circ \mathsf{THR}$ [Williams STOC’14], [Murray-Williams STOC’18]. This is the first nontrivial lower bound against such a circuit class.

     
  2. Abstract

    In this work, we aim to accurately predict the number of hospitalizations during the COVID-19 pandemic by developing a spatiotemporal prediction model. We propose HOIST, an Ising dynamics-based deep learning model for spatiotemporal COVID-19 hospitalization prediction. By drawing the analogy between locations and lattice sites in statistical mechanics, we use the Ising dynamics to guide the model to extract and utilize spatial relationships across locations and model the complex influence of granular information from real-world clinical evidence. By leveraging rich linked databases, including insurance claims, census information, and hospital resource usage data across the U.S., we evaluate the HOIST model on the large-scale spatiotemporal COVID-19 hospitalization prediction task for 2299 counties in the U.S. In the 4-week hospitalization prediction task, HOIST achieves 368.7 mean absolute error, 0.6 $R^2$ and 0.89 concordance correlation coefficient score on average. Our detailed number needed to treat (NNT) and cost analysis suggest that future COVID-19 vaccination efforts may be most impactful in rural areas. This model may serve as a resource for future county and state-level vaccination efforts.
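
The three scores quoted above (mean absolute error, $R^2$, and concordance correlation coefficient) have standard textbook definitions, sketched below in plain Python. This is only an illustration of those definitions: the abstract does not say how the HOIST authors computed or averaged them, and the function names and toy numbers are assumptions made here.

```python
from statistics import fmean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def concordance_correlation(y_true, y_pred):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(y_true)
    mean_true, mean_pred = fmean(y_true), fmean(y_pred)
    var_true = sum((t - mean_true) ** 2 for t in y_true) / n
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred) / n
    cov = sum((t - mean_true) * (p - mean_pred)
              for t, p in zip(y_true, y_pred)) / n
    return 2.0 * cov / (var_true + var_pred + (mean_true - mean_pred) ** 2)


if __name__ == "__main__":
    # Toy weekly counts, invented for illustration only (not from the paper).
    observed = [120.0, 340.0, 560.0, 410.0]
    predicted = [100.0, 360.0, 500.0, 450.0]
    print(mean_absolute_error(observed, predicted))
    print(r_squared(observed, predicted))
    print(concordance_correlation(observed, predicted))
```

Unlike a plain correlation, the concordance coefficient penalizes a systematic offset between predictions and observations through the squared difference of means in the denominator.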

     
  3. For each odd integer $n \geq 3$, we construct a rank-3 graph $\Lambda_n$ with involution $\gamma_n$ whose real $C^*$-algebra $C^*_{\scriptscriptstyle \mathbb{R}}(\Lambda_n, \gamma_n)$ is stably isomorphic to the exotic Cuntz algebra $\mathcal{E}_n$. This construction is optimal, as we prove that a rank-2 graph with involution $(\Lambda, \gamma)$ can never satisfy $C^*_{\scriptscriptstyle \mathbb{R}}(\Lambda, \gamma) \sim_{ME} \mathcal{E}_n$, and Boersema reached the same conclusion for rank-1 graphs (directed graphs) in [Münster J. Math. 10 (2017), pp. 485–521, Corollary 4.3]. Our construction relies on a rank-1 graph with involution $(\Lambda, \gamma)$ whose real $C^*$-algebra $C^*_{\scriptscriptstyle \mathbb{R}}(\Lambda, \gamma)$ is stably isomorphic to the suspension $S\mathbb{R}$. In the Appendix, we show that the $i$-fold suspension $S^i \mathbb{R}$ is stably isomorphic to a graph algebra iff $-2 \leq i \leq 1$.

     
  4. Abstract

    We consider the problem of covering multiple submodular constraints. Given a finite ground set $N$, a weight function $w: N \rightarrow \mathbb{R}_+$, $r$ monotone submodular functions $f_1, f_2, \ldots, f_r$ over $N$ and requirements $k_1, k_2, \ldots, k_r$, the goal is to find a minimum weight subset $S \subseteq N$ such that $f_i(S) \ge k_i$ for $1 \le i \le r$. We refer to this problem as Multi-Submod-Cover; it was recently considered by Har-Peled and Jones (Few cuts meet many point sets. CoRR. arxiv:abs1808.03260, Har-Peled and Jones 2018), who were motivated by an application in geometry. Even with $r=1$, Multi-Submod-Cover generalizes the well-known Submodular Set Cover problem (Submod-SC), and it can also be easily reduced to Submod-SC. A simple greedy algorithm gives an $O(\log(kr))$ approximation, where $k = \sum_i k_i$, and this ratio cannot be improved in the general case. In this paper, motivated by several concrete applications, we consider two ways to improve upon the approximation given by the greedy algorithm. First, we give a bicriteria approximation algorithm for Multi-Submod-Cover that covers each constraint to within a factor of $(1-1/e-\varepsilon)$ while incurring an approximation of $O(\frac{1}{\varepsilon}\log r)$ in the cost. Second, we consider the special case when each $f_i$ is obtained from a truncated coverage function and obtain an algorithm that generalizes previous work on partial set cover (Partial-SC), covering integer programs (CIPs) and multiple vertex cover constraints (Bera et al., Theoret Comput Sci 555:2–8, 2014). Both these algorithms are based on mathematical programming relaxations that avoid the limitations of the greedy algorithm. We demonstrate the implications of our algorithms and related ideas to several applications ranging from geometric covering problems to clustering with outliers. Our work highlights the utility of the high-level model and the lens of submodularity in addressing this class of covering problems.
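
The greedy baseline mentioned above can be sketched as follows. The abstract does not spell out the reduction to Submod-SC, so this sketch assumes one common choice: fold the constraints into a single function $g(S) = \sum_i \min(f_i(S), k_i)$ and repeatedly add the element with the best marginal gain per unit weight until $g(S) = \sum_i k_i$. The function name, the argument layout, and the toy coverage instance are all hypothetical, and this is not the paper's improved algorithm.

```python
def greedy_multi_submod_cover(ground, weight, functions, requirements):
    """Greedy cover on g(S) = sum_i min(f_i(S), k_i).

    ground: list of elements; weight: dict element -> positive weight;
    functions: monotone submodular set functions f_i(S) -> number;
    requirements: targets k_i, one per function.
    """
    def truncated_total(subset):
        return sum(min(f(subset), k) for f, k in zip(functions, requirements))

    target = sum(requirements)
    chosen = set()
    current = truncated_total(chosen)
    while current < target:
        best, best_ratio = None, 0.0
        for element in ground:
            if element in chosen:
                continue
            gain = truncated_total(chosen | {element}) - current
            ratio = gain / weight[element]
            if ratio > best_ratio:
                best, best_ratio = element, ratio
        if best is None:
            # No element makes progress, so the requirements are infeasible.
            raise ValueError("requirements cannot be met")
        chosen.add(best)
        current = truncated_total(chosen)
    return chosen


def coverage_function(covers):
    """Build f(S) = number of distinct items covered by the elements of S."""
    def f(subset):
        covered = set()
        for element in subset:
            covered |= covers[element]
        return len(covered)
    return f


# Toy instance (invented): two coverage constraints over elements a, b, c.
f1 = coverage_function({"a": {1, 2}, "b": {3}, "c": {1, 2, 3}})
f2 = coverage_function({"a": {10}, "b": {10, 11}, "c": {11}})
weights = {"a": 1, "b": 1, "c": 3}
solution = greedy_multi_submod_cover(["a", "b", "c"], weights, [f1, f2], [3, 1])
print(sorted(solution))
```

Truncating each $f_i$ at $k_i$ keeps the sum monotone and submodular, which is what lets a single greedy loop handle all $r$ constraints at once.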

     
  5. Abstract

    We explore properties of the family sizes arising in a linear birth process with immigration (BI). In particular, we study the correlation of the number of families observed during consecutive disjoint intervals of time. Letting $S(a,b)$ be the number of families observed in $(a,b)$, we study the expected sample variance and its asymptotics for $p$ consecutive sequential samples $S_p = (S(t_0,t_1), \dots, S(t_{p-1},t_p))$, for $0 = t_0 < t_1 < \cdots < t_p$. By conditioning on the sizes of the samples, we provide a connection between $S_p$ and $p$ sequential samples of sizes $n_1, n_2, \dots, n_p$, drawn from a single run of a Chinese Restaurant Process. Properties of the latter were studied in da Silva et al. (Bernoulli 29:1166–1194, 2023. https://doi.org/10.3150/22-BEJ1494). We show how the continuous-time framework helps to make asymptotic calculations easier than its discrete-time counterpart. As an application, for a specific choice of $t_1, t_2, \dots, t_p$, where the lengths of intervals are logarithmically equal, we revisit Fisher’s 1943 multi-sampling problem and give another explanation of what Fisher’s model could have meant in the world of sequential samples drawn from a BI process.

     