Title: Nucleic Acid Quantification by Multi-Frequency Impedance Cytometry and Machine Learning
Determining nucleic acid concentrations in a sample is an important step before downstream analysis in molecular diagnostics. Given the need to test DNA amount and purity in many samples, including samples with very small input DNA, novel machine learning approaches are useful for accurate, high-throughput DNA quantification. Here, we demonstrate the ability of a neural network to predict the amount of DNA coupled to paramagnetic beads. To this end, a custom-made microfluidic chip is applied to detect DNA molecules bound to beads by measuring the impedance peak response (IPR) at multiple frequencies. We used electrical measurements, including the frequency and the real and imaginary parts of the peak intensity within a microfluidic channel, as the input to deep learning models that predict DNA concentration. Specifically, 10 different deep learning architectures are examined. The proposed regression model achieves an R-squared of 97% with a slope of 0.68. Consequently, machine learning models can be a suitable, fast, and accurate method for measuring nucleic acid concentration in a sample. The results presented in this study demonstrate that the proposed neural network can use the information embedded in raw impedance data to predict DNA concentration.
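The abstract reports two summary metrics, an R-squared and a slope. The sketch below shows one common way to compute them from paired actual and predicted concentrations. It assumes the slope comes from an ordinary least-squares line of predicted on actual values and that R-squared is the squared Pearson correlation of that fit; the abstract does not state the exact definitions used. The function name and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean


def fit_slope_and_r2(actual, predicted):
    """Fit predicted ~ slope * actual + intercept by least squares.

    Returns (slope, intercept, r2), where r2 is the squared Pearson
    correlation between actual and predicted values.
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x, mean_y = mean(actual), mean(predicted)
    sxx = sum((x - mean_x) ** 2 for x in actual)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    if sxx == 0 or syy == 0:
        raise ValueError("inputs must not be constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r2


# Hypothetical, made-up values: predictions that track the true
# concentrations perfectly in rank and spacing but compress their range.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [0.68 * x + 0.5 for x in actual]
slope, intercept, r2 = fit_slope_and_r2(actual, predicted)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.2f}")
```

Under these definitions, a high R-squared together with a slope below 1 means the predictions follow the true values closely in a linear sense while underestimating how much they vary, so the two numbers are best read together.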
Award ID(s):
1846740
PAR ID:
10403207
Journal Name:
Biosensors
Volume:
13
Issue:
3
ISSN:
2079-6374
Page Range / eLocation ID:
316
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Electronic biosensors for DNA detection typically use oligonucleotide probes immobilized on a signal transducer, which outputs an electronic signal when target molecules bind to the probes. However, limitations in probe selectivity and variable levels of non-target material in complex biological samples can lead to nonspecific binding and reduced sensitivity. Here we introduce the integration of 2.8 μm paramagnetic beads with DNA fragments. We apply a custom-made microfluidic chip to detect DNA molecules bound to beads by measuring Impedance Peak Response (IPR) at multiple frequencies. Technical and analytical performance was evaluated using beads containing purified Polymerase Chain Reaction (PCR) products of different lengths (157, 300, 613 bp) with DNA concentration ranging from 0.039 amol to 7.8 fmol. Multi-frequency IPR correlated positively with DNA amounts and was used to calculate a DNA quantification score. The minimum DNA amount of a 300 bp fragment coupled on beads that could be robustly detected was 0.0039 fmol (1.54 fg or 4750 copies/bead). Additionally, our approach allowed distinguishing beads carrying DNA fragments of different lengths at similar molar concentrations. Using this impedance sensor, purified PCR products could be analyzed within ten minutes to determine DNA fragment length and quantity based on comparison to a known DNA standard.
  2. Nucleic acid-binding proteins (NABPs), including DNA-binding proteins (DBPs) and RNA-binding proteins (RBPs), play important roles in essential biological processes. To facilitate functional annotation and accurate prediction of different types of NABPs, many machine learning-based computational approaches have been developed. However, the datasets used for training and testing as well as the prediction scopes in these studies have limited their applications. In this paper, we developed new strategies to overcome these limitations by generating more accurate and robust datasets and developing deep learning-based methods including both hierarchical and multi-class approaches to predict the types of NABPs for any given protein. The deep learning models employ two layers of convolutional neural network and one layer of long short-term memory. Our approaches outperform existing DBP and RBP predictors with a balanced prediction between DBPs and RBPs, and are more practically useful in identifying novel NABPs. The multi-class approach greatly improves the prediction accuracy of DBPs and RBPs, especially for the DBPs with ~12% improvement. Moreover, we explored the prediction accuracy of single-stranded DNA binding proteins and their effect on the overall prediction accuracy of NABP predictions.
  3. Protein language models (pLMs) trained on a large corpus of protein sequences have shown unprecedented scalability and broad generalizability in a wide range of predictive modeling tasks, but their power has not yet been harnessed for predicting protein–nucleic acid binding sites, critical for characterizing the interactions between proteins and nucleic acids. Here, we present EquiPNAS, a new pLM-informed E(3) equivariant deep graph neural network framework for improved protein–nucleic acid binding site prediction. By combining the strengths of pLM and symmetry-aware deep graph learning, EquiPNAS consistently outperforms the state-of-the-art methods for both protein–DNA and protein–RNA binding site prediction on multiple datasets across a diverse set of predictive modeling scenarios ranging from using experimental input to AlphaFold2 predictions. Our ablation study reveals that the pLM embeddings used in EquiPNAS are sufficiently powerful to dramatically reduce the dependence on the availability of evolutionary information without compromising on accuracy, and that the symmetry-aware nature of the E(3) equivariant graph-based neural architecture offers remarkable robustness and performance resilience. EquiPNAS is freely available at https://github.com/Bhattacharya-Lab/EquiPNAS.
  4. Rapid, efficient and accurate nucleic acid molecule detection is important in the screening of diseases and pathogens, yet remains a limiting factor at point of care (POC) treatment. Microfluidic systems are characterized by fast, integrated, miniaturized features which provide an effective platform for qualitative and quantitative detection of nucleic acid molecules. The nucleic acid detection process mainly includes sample preparation and target molecule amplification. Given the advancements in theoretical research and technological innovations to date, nucleic acid extraction and amplification integrated with microfluidic systems has advanced rapidly. The primary goal of this review is to outline current approaches used for nucleic acid detection in the context of microfluidic systems. The secondary goal is to identify new approaches that will help shape future trends at the intersection of nucleic acid detection and microfluidics, particularly with regard to increasing disease and pathogen detection for improved diagnosis and treatment. 
  5. It is critical for biological studies to annotate amino acid sequences and understand how proteins function. Protein function is important to medical research in the health industry (e.g., drug discovery). With the advancement of deep learning, accurate models have been developed for alignment-free protein annotation. In this paper, we develop a deep learning model with an attention mechanism that can predict Gene Ontology labels given a protein sequence input. We believe this model can produce accurate predictions as well as maintain good interpretability. We further show how the model can be interpreted by examining and visualizing the intermediate layer output in our deep neural network.
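The per-bead detection limit quoted in item 1 (1.54 fg or 4750 copies/bead for a 300 bp fragment) can be cross-checked with a short conversion from copy number to mass. This is a sketch under a stated assumption: it uses the common approximation of about 650 g/mol per double-stranded base pair, which the abstract does not specify. The function and constant names are hypothetical.

```python
AVOGADRO = 6.02214076e23          # molecules per mole (exact SI value)
BP_MASS_G_PER_MOL = 650.0         # assumed average mass of one dsDNA base pair


def copies_to_femtograms(copies, length_bp, bp_mass=BP_MASS_G_PER_MOL):
    """Convert a number of dsDNA copies of a given length to mass in femtograms."""
    moles = copies / AVOGADRO               # copies -> moles of fragment
    grams = moles * length_bp * bp_mass     # moles -> grams via fragment molar mass
    return grams * 1e15                     # grams -> femtograms


fg_per_bead = copies_to_femtograms(copies=4750, length_bp=300)
print(f"{fg_per_bead:.2f} fg per bead")
```

With the assumed 650 g/mol per base pair, 4750 copies of a 300 bp fragment come to roughly 1.54 fg, which agrees with the per-bead figure in the abstract.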