
Title: Coupled matrix–matrix and coupled tensor–matrix completion methods for predicting drug–target interactions
Abstract Predicting the interactions between drugs and targets plays an important role in new drug discovery and drug repurposing (also known as drug repositioning). There is a need to develop novel and efficient prediction approaches in order to avoid the costly and laborious process of determining drug–target interactions (DTIs) based on experiments alone. These computational prediction approaches should be capable of identifying potential DTIs in a timely manner. Matrix factorization methods have been proven to be the most reliable group of methods. Here, we first propose a matrix factorization-based method termed ‘Coupled Matrix–Matrix Completion’ (CMMC). Next, in order to utilize the more comprehensive information provided in different databases and to incorporate multiple types of scores for drug–drug similarities and target–target relationships, we extend CMMC to ‘Coupled Tensor–Matrix Completion’ (CTMC) by considering drug–drug and target–target similarity/interaction tensors. Results: Evaluation on two benchmark datasets, DrugBank and TTD, shows that CTMC outperforms the matrix-factorization-based methods GRMF, $L_{2,1}$-GRMF, NRLMF and NRLMF$\beta$. Based on the evaluation, CMMC and CTMC outperform the above methods in terms of area under the curve, F1 score, sensitivity and specificity, with a considerably shorter run time.
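The evaluation criteria named in the abstract (area under the curve, F1 score, sensitivity and specificity) can be illustrated with a minimal sketch. It assumes binary ground-truth labels (1 = known interaction, 0 = no known interaction), real-valued prediction scores, and a fixed decision threshold; the function names `threshold_metrics` and `roc_auc` are hypothetical and are not taken from the paper, whose exact evaluation protocol is not given here.

```python
def threshold_metrics(y_true, y_pred):
    """Sensitivity, specificity and F1 from binary labels and binary predictions."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    # Guard every ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


def roc_auc(y_true, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half. O(n^2), for illustration."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and scores, invented purely for illustration.
    labels = [1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]
    preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
    print(threshold_metrics(labels, preds))
    print(roc_auc(labels, scores))
```

The pairwise AUC is written for clarity rather than speed; on realistically sized DTI matrices a sort-based implementation from an established library would be the more practical choice.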
Journal Name: Briefings in Bioinformatics
Sponsoring Org: National Science Foundation
More Like this
  1. Knowledge graphs (KGs) are powerful tools that codify relational behaviour between entities in knowledge bases. KGs can simultaneously model many different types of subject-predicate-object and higher-order relations. As such, they offer a flexible modeling framework that has been applied to many areas, including biology and pharmacology – most recently, in the fight against COVID-19. The flexibility of KG modeling is both a blessing and a challenge from the learning point of view. In this paper we propose a novel coupled tensor-matrix framework for KG embedding. We leverage tensor factorization tools to learn concise representations of entities and relations in knowledge bases and employ these representations to perform drug repurposing for COVID-19. Our proposed framework is principled, elegant, and achieves 100% improvement over the best baseline in the COVID-19 drug repurposing task using a recently developed biological KG.
  2. Hyperspectral super-resolution refers to the task of fusing a hyperspectral image (HSI) and a multispectral image (MSI) in order to produce a super-resolution image (SRI) that has high spatial and spectral resolution. Popular methods leverage matrix factorization that models each spectral pixel as a convex combination of spectral signatures belonging to a few endmembers. These methods are considered state-of-the-art, but several challenges remain. First, multiband images are naturally three dimensional (3-d) signals, while matrix methods usually ignore the 3-d structure, which is prone to information losses. Second, these methods do not provide identifiability guarantees under which the reconstruction task is feasible. Third, a tacit assumption is that the degradation operators from SRI to MSI and HSI are known - which is hardly the case in practice. Recently [1], [2] proposed a coupled tensor factorization approach to handle these issues. In this work we propose a hybrid model that combines the benefits of tensor and matrix factorization approaches. We also develop a new algorithm that is mathematically simple, enjoys identifiability under relaxed conditions and is completely agnostic of the spatial degradation operator. Experimental results with real hyperspectral data showcase the effectiveness of the proposed approach.
  3. Abstract

    Composites can be tailored to specific applications by adjusting process variables. These variables include those related to composition, such as the volume fraction of the constituents, and those associated with processing methods that can affect composite topology. In the case of particle matrix composites, the orientation of the inclusions affects the resulting composite properties, particularly in instances where the particles can be oriented and arranged into structures. In this work, we study the effects of coupled electric and magnetic field processing with externally applied fields on those structures, and consequently on the resulting material properties. The ability to vary these processing conditions with the goal of generating microstructures that yield target material properties adds a further level of control to the design of composite material properties. Moreover, while analytical models allow for the prediction of resulting composite properties from constituents and composite topology, these models do not build upward from process variables to make these predictions.

    This work couples simulation of the formation of microscale architectures, which result from coupled electric and magnetic field processing of particulate filled polymer matrix composites, with finite element analysis (FEA) of those structures to provide direct and explicit linkages between process, structure, and properties. This work demonstrates the utility of these methods as a tool for determining composite properties from constituent and processing parameters. Initial particle dynamics simulations incorporating electromagnetic responses between particles and between the particles and the applied fields, including dielectrophoresis, are used to stochastically generate representative volume elements (RVEs) for a given set of process variables. Next, these RVEs are analyzed as periodic structures using FEA, yielding bulk material properties. The results are shown to converge for simulation size and discretization, validating the RVE as an appropriate representation of the composite volume. Calculated material properties are compared to traditional effective medium theory models. Simulations allow for mapping of composite properties with respect to not only composition, but also fundamentally from processing simulations that yield varying particle configurations, a step not present in traditional or more modern effective medium theories such as the Halpin Tsai or double-inclusion theories.

  4. Techniques of matrix completion aim to impute a large portion of missing entries in a data matrix through a small portion of observed ones. In practice, prior information and special structures are usually employed in order to improve the accuracy of matrix completion. In this paper, we propose a unified nonconvex optimization framework for matrix completion with linearly parameterized factors. In particular, by introducing a condition referred to as Correlated Parametric Factorization, we conduct a unified geometric analysis for the nonconvex objective by establishing uniform upper bounds for low-rank estimation resulting from any local minimizer. Perhaps surprisingly, the condition of Correlated Parametric Factorization holds for important examples including subspace-constrained matrix completion and skew-symmetric matrix completion. The effectiveness of our unified nonconvex optimization method is also empirically illustrated by extensive numerical simulations.
  5. Abstract Motivation

    Accurately predicting drug–target interactions (DTIs) in silico can guide the drug discovery process and thus facilitate drug development. Computational approaches for DTI prediction that adopt the systems biology perspective generally exploit the rationale that the properties of drugs and targets can be characterized by their functional roles in biological networks.


    Inspired by recent advances in information passing and aggregation techniques that generalize convolutional neural networks to mine large-scale graph data and greatly improve the performance of many network-related prediction tasks, we develop a new nonlinear end-to-end learning model, called NeoDTI, that integrates diverse information from heterogeneous network data and automatically learns topology-preserving representations of drugs and targets to facilitate DTI prediction. The substantial improvement in prediction performance over other state-of-the-art DTI prediction methods, as well as several novel predicted DTIs with supporting evidence from previous studies, demonstrates the superior predictive power of NeoDTI. In addition, NeoDTI is robust against a wide range of hyperparameter choices and is ready to integrate more drug- and target-related information (e.g. compound–protein binding affinity data). All these results suggest that NeoDTI can offer a powerful and robust tool for drug development and drug repositioning.

    Availability and implementation

    The source code and data used in NeoDTI are available at:

    Supplementary information

    Supplementary data are available at Bioinformatics online.
