Abstract Computational modeling of protein–DNA complex structures has important implications in biomedical applications such as structure‐based, computer‐aided drug design. A key step in developing methods for accurate modeling of protein–DNA complexes is similarity assessment between models and their reference complex structures. Existing methods primarily rely on distance‐based metrics and generally do not consider important functional features of the complexes, such as interface hydrogen bonds that are critical to specific protein–DNA interactions. Here, we present a new scoring function, ComparePD, which takes interface hydrogen bond energy and strength into account, in addition to the distance‐based metrics, to measure the similarity of protein–DNA complexes accurately. ComparePD was tested on two datasets of computational models of protein–DNA complexes generated using docking (classified as easy, intermediate, and difficult cases) and homology modeling methods. The results were compared with PDDockQ, a modified version of DockQ tailored for protein–DNA complexes, as well as the metrics employed by the community‐wide experiment CAPRI (Critical Assessment of PRedicted Interactions). We demonstrated that ComparePD provides an improved similarity measure over PDDockQ and the CAPRI classification method by considering both conformational similarity and functional importance of the complex interface. In the cases where ComparePD and PDDockQ selected different top models, ComparePD identified more meaningful models than PDDockQ in all but one intermediate docking case.
Evaluating Quantitative Measures for Assessing Functional Similarity in Engineering Design
Abstract The development of example-based design support tools, such as those used for design-by-analogy, relies heavily on the computation of similarity between designs. Various vector- and graph-based similarity measures operationalize different principles to assess the similarity of designs. Despite the availability of various types of similarity measures and the widespread adoption of some, these measures have not been tested for cross-measure agreement, especially in a design context. In this paper, several vector- and graph-based similarity measures are tested across two datasets of functional models of products to explore the ways in which they find functionally similar designs. The results show that the network-based measures fundamentally operationalize functional similarity in a different way than vector-based measures. Based upon the findings, we recommend a graph-based similarity measure such as NetSimile in the early stages of design when divergence is desirable and a vector-based measure such as cosine similarity in a period of convergence, when the scope of the desired function implementation is clearer.
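As a rough illustration of the vector-based side of this comparison, the sketch below computes cosine similarity between two functional models represented as bags of function labels. The product names and function labels are hypothetical and are not taken from the paper's datasets; the representation (simple function counts) is an assumption made for the example.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(count * b.get(label, 0) for label, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty models
    return dot / (norm_a * norm_b)


# Hypothetical functional models: each product is a bag of function labels.
drill = Counter(["import energy", "convert energy", "transfer energy", "guide solid"])
mixer = Counter(["import energy", "convert energy", "transfer energy", "mix liquid"])
kettle = Counter(["import energy", "convert energy", "store liquid", "export liquid"])

drill_vs_mixer = cosine_similarity(drill, mixer)    # three of four functions shared
drill_vs_kettle = cosine_similarity(drill, kettle)  # two of four functions shared
```

Because this measure looks only at which functions occur and how often, it ignores how the functions are connected, which is the kind of information a graph-based measure would use.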
- Award ID(s): 2034448
- PAR ID: 10567499
- Publisher / Repository: ASME
- Date Published:
- Journal Name: Journal of Mechanical Design
- Volume: 144
- Issue: 3
- ISSN: 1050-0472
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Quantifying the association between components of multivariate random curves is of general interest and is a ubiquitous and basic problem that can be addressed with functional data analysis. An important application is the problem of assessing functional connectivity based on functional magnetic resonance imaging (fMRI), where one aims to determine the similarity of fMRI time courses that are recorded on anatomically separated brain regions. In the functional brain connectivity literature, the static temporal Pearson correlation has been the prevailing measure for functional connectivity. However, recent research has revealed temporally changing patterns of functional connectivity, leading to the study of dynamic functional connectivity. This motivates new similarity measures for pairs of random curves that reflect the dynamic features of functional similarity. Specifically, we introduce gradient synchronization measures in a general setting. These similarity measures are based on the concordance and discordance of the gradients between paired smooth random functions. Asymptotic normality of the proposed estimates is obtained under regularity conditions. We illustrate the proposed synchronization measures via simulations and an application to resting-state fMRI signals from the Alzheimer’s Disease Neuroimaging Initiative, where they are found to improve discrimination between subjects with different disease status.
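The idea of gradient concordance and discordance can be sketched with finite differences. The function below is an illustrative simplification, not the estimator proposed in the paper: it assumes both curves are sampled on the same grid, approximates gradients by first differences, and reports the share of concordant steps minus the share of discordant steps. The name `gradient_concordance` is hypothetical.

```python
def gradient_concordance(x, y):
    """Concordant minus discordant share of gradient signs for two sampled curves.

    Returns a value in [-1, 1]: 1 when the curves always move in the same
    direction, -1 when they always move in opposite directions.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("curves must share a grid with at least two points")
    # First differences stand in for the gradients of the smooth curves.
    dx = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    dy = [y[i + 1] - y[i] for i in range(len(y) - 1)]
    concordant = sum(1 for a, b in zip(dx, dy) if a * b > 0)
    discordant = sum(1 for a, b in zip(dx, dy) if a * b < 0)
    return (concordant - discordant) / len(dx)


# Toy curves: the second rises and falls with the first, the third mirrors it.
region_a = [0.0, 1.0, 2.0, 1.0]
region_b = [0.0, 2.0, 3.0, 1.0]
region_c = [3.0, 2.0, 1.0, 2.0]

same_direction = gradient_concordance(region_a, region_b)
opposite_direction = gradient_concordance(region_a, region_c)
```

Unlike a static Pearson correlation, a measure of this kind depends on how the curves change from one time point to the next rather than on their levels.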
-
Mathelier, Anthony (Ed.) Abstract Motivation: Recent breakthroughs of single-cell RNA sequencing (scRNA-seq) technologies offer an exciting opportunity to identify heterogeneous cell types in complex tissues. However, the unavoidable biological noise and technical artifacts in scRNA-seq data as well as the high dimensionality of expression vectors make the problem highly challenging. Consequently, although numerous tools have been developed, their accuracy remains to be improved. Results: Here, we introduce a novel clustering algorithm and tool RCSL (Rank Constrained Similarity Learning) to accurately identify various cell types using scRNA-seq data from a complex tissue. RCSL considers both local similarity and global similarity among the cells to discern the subtle differences among cells of the same type as well as larger differences among cells of different types. RCSL uses Spearman’s rank correlations of a cell’s expression vector with those of other cells to measure its global similarity, and adaptively learns neighbor representation of a cell as its local similarity. The overall similarity of a cell to other cells is a linear combination of its global similarity and local similarity. RCSL automatically estimates the number of cell types defined in the similarity matrix, and identifies them by constructing a block-diagonal matrix, such that its distance to the similarity matrix is minimized. Each block-diagonal submatrix is a cell cluster/type, corresponding to a connected component in the cognate similarity graph. When tested on 16 benchmark scRNA-seq datasets in which the cell types are well-annotated, RCSL substantially outperformed six state-of-the-art methods in accuracy and robustness as measured by three metrics. Availability and implementation: The RCSL algorithm is implemented in R and can be freely downloaded at https://cran.r-project.org/web/packages/RCSL/index.html.
Supplementary information: Supplementary data are available at Bioinformatics online.
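The global-similarity step and the linear combination described in this abstract can be sketched as follows. This is a minimal Python illustration, not the RCSL package (which is implemented in R): the helper names, the toy expression vectors, and the mixing `weight` are assumptions, and the local similarity is passed in as a given number because the abstract does not specify how the neighbor representation is learned.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(u, v):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if sd_u == 0 or sd_v == 0:
        return 0.0
    return cov / (sd_u * sd_v)


def spearman(u, v):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(u), average_ranks(v))


def combined_similarity(global_sim, local_sim, weight):
    """Linear combination of global and local similarity (weight is hypothetical)."""
    return weight * global_sim + (1 - weight) * local_sim


# Toy expression vectors for two cells over four genes (made-up values).
cell_1 = [1.0, 2.0, 3.0, 4.0]
cell_2 = [1.0, 4.0, 9.0, 100.0]  # same gene ordering, very different scale

global_sim = spearman(cell_1, cell_2)
overall = combined_similarity(global_sim, local_sim=0.4, weight=0.5)
```

Working on ranks makes the global similarity insensitive to monotone rescaling of expression values, which is why the two toy cells above are treated as perfectly similar.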
-
Objective: This study explores subjective and objective driving style similarity to identify how similarity can be used to develop driver-compatible vehicle automation. Background: Similarity in the ways that interaction partners perform tasks can be measured subjectively, through questionnaires, or objectively by characterizing each agent’s actions. Although subjective measures have advantages in prediction, objective measures are more useful when operationalizing interventions based on these measures. Showing how objective and subjective similarity are related is therefore prudent for aligning future machine performance with human preferences. Methods: A driving simulator study was conducted with stop-and-go scenarios. Participants experienced conservative, moderate, and aggressive automated driving styles and rated the similarity between their own driving style and that of the automation. Objective similarity between the manual and automated driving speed profiles was calculated using three distance measures: dynamic time warping, Euclidean distance, and time alignment measure. Linear mixed effects models were used to examine how different components of the stopping profile and the three objective similarity measures predicted subjective similarity. Results: Objective similarity using Euclidean distance best predicted subjective similarity. However, this was only observed for participants’ approach to the intersection and not their departure. Conclusion: Developing driving styles that drivers perceive to be similar to their own is an important step toward driver-compatible automation. In determining what constitutes similarity, it is important to (a) use measures that reflect the driver’s perception of similarity, and (b) understand what elements of the driving style govern subjective similarity.
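Two of the three distance measures named in the Methods can be sketched in a few lines. The code below is a generic textbook version of Euclidean distance and dynamic time warping, not the study's implementation; the speed profiles are made-up values, and the time alignment measure is omitted.

```python
import math


def euclidean_distance(a, b):
    """Point-by-point distance between two equally long speed profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Dynamic time warping cost with absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Hypothetical stopping profiles (speed in m/s at equal time steps): the
# automated profile has the same shape but starts braking one step later.
manual = [10.0, 8.0, 5.0, 2.0, 0.0, 0.0]
automated = [10.0, 10.0, 8.0, 5.0, 2.0, 0.0]

euclid = euclidean_distance(manual, automated)
dtw = dtw_distance(manual, automated)
```

In this toy case the Euclidean distance penalizes the one-step delay, while dynamic time warping absorbs it and reports zero cost, which illustrates why the choice of measure can change which automated style counts as similar to a driver's own.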
-
While scientific collaboration is critical for a scholar, some collaborators can be more significant than others, e.g., lifetime collaborators. It has been shown that lifetime collaborators are more influential on a scholar’s academic performance. However, little research has investigated how to predict such special relationships in academic networks. To this end, we propose Scholar2vec, a novel neural network embedding for representing scholar profiles. First, our approach creates scholars’ research interest vector from textual information, such as demographics, research, and influence. After bridging research interests with a collaboration network, vector representations of scholars can be gained with graph learning. Meanwhile, since scholars have various attributes, we propose to incorporate four types of scholar attributes for learning scholar vectors. Finally, the early-stage similarity sequence based on Scholar2vec is used to predict lifetime collaborators with machine learning methods. Extensive experiments on two real-world datasets show that Scholar2vec outperforms state-of-the-art methods in lifetime collaborator prediction. Our work presents a new way to measure the similarity between two scholars by vector representation, which links network embedding and academic relationship mining.

