Title: MotifAnalyzer‐PDZ: A computational program to investigate the evolution of PDZ‐binding target specificity
Abstract

Recognition of short linear motifs (SLiMs) or peptides by proteins is an important component of many cellular processes. However, due to limited and degenerate binding motifs, prediction of cellular targets is challenging. In addition, many of these interactions are transient and of relatively low affinity. Here, we focus on one of the largest families of SLiM‐binding domains in the human proteome, the PDZ domain. These domains bind the extreme C‐terminus of target proteins, and are involved in many signaling and trafficking pathways. To predict endogenous targets of PDZ domains, we developed MotifAnalyzer‐PDZ, a program that filters and compares all motif‐satisfying sequences in any publicly available proteome. This approach enables us to determine possible PDZ binding targets in humans and other organisms. Using this program, we predicted and biochemically tested novel human PDZ targets by looking for strong sequence conservation in evolution. We also identified three C‐terminal sequences in choanoflagellates that bind a choanoflagellate PDZ domain, the Monosiga brevicollis SHANK1 PDZ domain (mbSHANK1), with endogenously‐relevant affinities, despite a lack of conservation with the targets of a homologous human PDZ domain, SHANK1. All three are predicted to be signaling proteins, with strong sequence homology to cytosolic and receptor tyrosine kinases. Finally, we analyzed and compared the positional amino acid enrichments in PDZ motif‐satisfying sequences from over a dozen organisms. Overall, MotifAnalyzer‐PDZ is a versatile program to investigate potential PDZ interactions. This proof‐of‐concept work is poised to enable similar types of analyses for other SLiM‐binding domains (e.g., MotifAnalyzer‐Kinase). MotifAnalyzer‐PDZ is available at http://motifAnalyzerPDZ.cs.wwu.edu.
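
The filtering step described in the abstract can be illustrated with a minimal Python sketch. This is not the MotifAnalyzer‐PDZ implementation: the motif (a class I PDZ-binding pattern with S/T at P-2 and V/L/I at the C-terminal P0 position), the six-residue window, the function names, and the toy proteome are all assumptions chosen for illustration. The sketch selects sequences whose C-terminus satisfies the motif and then tabulates per-position amino acid frequencies, loosely mirroring the positional enrichment comparison mentioned above.

```python
import re
from collections import Counter

# Assumed class I PDZ-binding motif: S/T at P-2, any residue at P-1,
# and V/L/I at the C-terminal P0 position. Illustrative only.
CLASS_I_MOTIF = re.compile(r"[ST].[VLI]$")


def c_terminal_matches(proteome, motif=CLASS_I_MOTIF, length=6):
    """Map protein ID -> last `length` residues for motif-satisfying sequences."""
    hits = {}
    for protein_id, sequence in proteome.items():
        # Skip sequences shorter than the window; match only at the C-terminus.
        if len(sequence) >= length and motif.search(sequence):
            hits[protein_id] = sequence[-length:]
    return hits


def positional_frequencies(peptides):
    """Amino acid frequencies per position, indexed from the C-terminus (P0 first)."""
    peptides = list(peptides)
    if not peptides:
        return []
    freqs = []
    for offset in range(len(peptides[0])):
        counts = Counter(p[-1 - offset] for p in peptides)
        total = sum(counts.values())
        freqs.append({aa: n / total for aa, n in counts.items()})
    return freqs


# Hypothetical toy proteome (sequences invented for this example).
toy_proteome = {
    "protA": "MKTAYIAKQRETSV",   # ends in T-S-V, satisfies the motif
    "protB": "MGSSHHHHHHSQDP",   # ends in Q-D-P, does not
    "protC": "MADEEKLPPGWQTRL",  # ends in T-R-L, satisfies the motif
}

hits = c_terminal_matches(toy_proteome)
freqs = positional_frequencies(hits.values())
print(hits)
print(freqs[0])  # residue frequencies at P0
```

In practice the input would be parsed from a proteome FASTA file, and other PDZ motif classes would need their own patterns; both are outside the scope of this sketch.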

 
Award ID(s):
1904711
NSF-PAR ID:
10372387
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Protein Science
Volume:
28
Issue:
12
ISSN:
0961-8368
Page Range / eLocation ID:
p. 2127-2143
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Choanoflagellates are single-celled eukaryotes with complex signaling pathways. They are considered the closest non-metazoan ancestors to mammals and other metazoans and form multicellular-like states called rosettes. The choanoflagellate Monosiga brevicollis contains over 150 PDZ domains, an important peptide-binding domain in all three domains of life (Archaea, Bacteria, and Eukarya). Therefore, an understanding of PDZ domain signaling pathways in choanoflagellates may provide insight into the origins of multicellularity. PDZ domains recognize the C-terminus of target proteins and regulate signaling and trafficking pathways, as well as cellular adhesion. Here, we developed a computational software suite, Domain Analysis and Motif Matcher (DAMM), that analyzes peptide-binding cleft sequence identity as compared with human PDZ domains and that can be used in combination with literature searches of known human PDZ-interacting sequences to predict target specificity in choanoflagellate PDZ domains. We used this program, protein biochemistry, fluorescence polarization, and structural analyses to characterize the specificity of A9UPE9_MONBE, a M. brevicollis PDZ domain-containing protein with no homology to any metazoan protein, finding that its PDZ domain is most similar to those of the DLG family. We then identified two endogenous sequences that bind A9UPE9 PDZ with <100 μM affinity, a value commonly considered the threshold for cellular PDZ–peptide interactions. Taken together, this approach can be used to predict cellular targets of previously uncharacterized PDZ domains in choanoflagellates and other organisms. Our data contribute to investigations into choanoflagellate signaling and how it informs metazoan evolution. 
  2. Abstract

    Identification of the molecular networks that facilitated the evolution of multicellular animals from their unicellular ancestors is a fundamental problem in evolutionary cellular biology. Choanoflagellates are recognized as the closest extant nonmetazoan ancestors to animals. These unicellular eukaryotes can adopt a multicellular‐like “rosette” state. Therefore, they are compelling models for the study of early multicellularity. Comparative studies revealed that a number of putative human orthologs are present in choanoflagellate genomes, suggesting that a subset of these genes were necessary for the emergence of multicellularity. However, previous work is largely based on sequence alignments alone, which does not confirm structural or functional similarity. Here, we focus on the PDZ domain, a peptide‐binding domain which plays critical roles in myriad cellular signaling networks and which underwent a gene family expansion in metazoan lineages. Using a customized sequence similarity search algorithm, we identified 178 PDZ domains in the Monosiga brevicollis proteome. This includes 11 previously unidentified sequences, which we analyzed using Rosetta and homology modeling. To assess conservation of protein structure, we solved high‐resolution crystal structures of representative M. brevicollis PDZ domains that are homologous to human Dlg1 PDZ2, Dlg1 PDZ3, GIPC, and SHANK1 PDZ domains. To assess functional conservation, we calculated binding affinities for mbGIPC, mbSHANK1, mbSNX27, and mbDLG‐3 PDZ domains from M. brevicollis. Overall, we find that peptide selectivity is generally conserved between these two disparate organisms, with one possible exception, mbDLG‐3. Together, our results provide novel insight into signaling pathways in a choanoflagellate model of primitive multicellularity.

     
  3. Abstract

    Protein–protein interactions that involve recognition of short peptides are critical in cellular processes. Protein–peptide interaction surface areas are relatively small and shallow, and there are often overlapping specificities in families of peptide‐binding domains. Therefore, dissecting selectivity determinants can be challenging. PDZ domains are a family of peptide‐binding domains located in several intracellular signaling and trafficking pathways. These domains are also directly targeted by pathogens, and a hallmark of many oncogenic viral proteins is a PDZ‐binding motif. However, amidst sequences that target PDZ domains, there is a wide spectrum in relative promiscuity. For example, the viral HPV16 E6 oncoprotein recognizes over double the number of PDZ domain‐containing proteins as the cystic fibrosis transmembrane conductance regulator (CFTR) in the cell, despite similar PDZ targeting‐sequences and identical motif residues. Here, we determine binding affinities for PDZ domains known to bind either HPV16 E6 alone or both CFTR and HPV16 E6, using peptides matching WT and hybrid sequences. We also use energy minimization to model PDZ–peptide complexes and use sequence analyses to investigate this difference. We find that while the majority of single mutations had marginal effects on overall affinity, the additive effect on the free energy of binding accurately describes the selectivity observed. Taken together, our results describe how complex and differing PDZ interactomes can be programmed in the cell.

     
  4. BACKGROUND Diverse organisms, from archaea and bacteria to plants and humans, use receptor systems to recognize both pathogens and dangerous self-derived or environmentally derived stimuli. These intricate, well-coordinated immune systems, composed of innate and adaptive components, ensure host survival. In the late 20th century, researchers identified the Toll/interleukin-1/resistance gene (TIR) domain as an evolutionarily conserved component of animal and plant innate immune systems. Today, TIR-domain proteins are known to be broadly distributed across the tree of life. The TIR domain was first recognized as an adaptor for the assembly of macromolecular signaling complexes in mammalian innate immune pathways. Work on axon degeneration in animals, as well as on plant, archaeal, and bacterial immune systems, has uncovered additional enzymatic activities for TIR domains. ADVANCES Mammalian axons initiate a self-destruct program upon injury and during disease that is mediated by the sterile alpha and TIR motif containing 1 (SARM1) protein. The SARM1 TIR domain enzymatically consumes the essential metabolic cofactor nicotinamide adenine dinucleotide (NAD+) to promote axonal death. Identification of the SARM1 NAD+-consuming enzyme (NADase) revealed that TIR domains can function as enzymes. Given the evolutionary conservation of TIR domains, studies investigated whether the SARM1 TIR NADase was also conserved. Indeed, bacteria, archaea, and plant TIR domains possess NADase activity. In prokaryotes, TIR NADase activity is found in an ancient antiphage immune system. In plants, identification of TIR NADase activity and linkage of TIR enzymatic products to downstream signaling components addressed the question of how nucleotide-binding, leucine-rich repeat (NLR) receptors trigger hypersensitive cell death during an immune response.
Studies in plants show that their TIR domains can cleave nucleic acids and possess 2′,3′ cyclic adenosine monophosphate (2′,3′-cAMP) and 2′,3′ cyclic guanosine monophosphate (2′,3′-cGMP) synthetase activity that aids cell death programs in plant innate immunity. Thus, TIR domains constitute an ancient family of enzymes that are activated in immune and cell death pathways. OUTLOOK The discovery of TIR-domain enzyme activities carries implications for innate immunity and neurodegeneration. The identification of the SARM1 NADase defined a drug target for a wide number of neurodegenerative diseases that is being exploited in both preclinical and clinical studies. Hyperactive mutations in the SARM1 NADase have been discovered in amyotrophic lateral sclerosis (ALS) patients. Future work will seek to clarify the contribution of the SARM1 axon degeneration pathway to ALS pathogenesis. NAD+ biology influences cellular processes from metabolism to DNA repair to aging. How TIR enzymes influence the NAD+ metabolome and its associated pathways in bacteria, archaea, plants, and animals will be an exciting area for upcoming investigation. The discovery of the diversity of TIR enzymatic products is revealing signaling pathways across kingdoms. Discovery of TIR enzymatic function in plants and animals may yet inspire studies of enzymatic functions for Toll-like receptors in animals. We anticipate that cross-kingdom studies of TIR-domain function will guide interventions that will span the tree of life, from treating human neurodegenerative disorders and bacterial infections to preventing plant diseases. Conserved TIR-domain enzymatic activity. TIR-domain proteins from prokaryotes and eukaryotes cleave NAD+ into nicotinamide (Nam), ADP-ribose (ADPR), cyclic ADP-ribose (cADPR), isomers of cyclic ADP-ribose (2′ or 3′cADPR), and related molecules [e.g., phosphoribosyl adenosine monophosphate (pRib-AMP)].
Plant TIR domains also possess a nuclease activity, can degrade DNA and RNA, and can function as a 2′,3′-cAMP or 2′,3′-cGMP synthetase. TIR enzymatic activity drives cell death and immune pathways across kingdoms. TIR activity can kill cells directly through NAD+ depletion or indirectly using enzymatic products as signal molecules. The representative TIR domain structure shown here is Protein Data Bank ID 6O0Q. EDS1, enhanced disease susceptibility 1; ThsA, Thoeris A.
  5. Polen, Tino (Ed.)
    ABSTRACT Regulation of gene expression is a vital component of cellular biology. Transcription factor proteins often bind regulatory DNA sequences upstream of transcription start sites to facilitate the activation or repression of RNA polymerase. Research laboratories have devoted many projects to understanding the transcription regulatory networks for transcription factors, as these regulated genes provide critical insight into the biology of the host organism. Various in vivo and in vitro assays have been developed to elucidate transcription regulatory networks. Several assays, including SELEX-seq and ChIP-seq, capture DNA-bound transcription factors to determine the preferred DNA-binding sequences, which can then be mapped to the host organism’s genome to identify candidate regulatory genes. In this protocol, we describe an alternative in vitro, iterative selection approach to ascertaining DNA-binding sequences of a transcription factor of interest using restriction endonuclease, protection, selection, and amplification (REPSA). Contrary to traditional antibody-based capture methods, REPSA selects for transcription factor-bound DNA sequences by challenging binding reactions with a type IIS restriction endonuclease. Cleavage-resistant DNA species are amplified by PCR and then used as inputs for the next round of REPSA. This process is repeated until a protected DNA species is observed by gel electrophoresis, which is an indication of a successful REPSA experiment. Subsequent high-throughput sequencing of REPSA-selected DNAs accompanied by motif discovery and scanning analyses can be used for determining transcription factor consensus binding sequences and potential regulated genes, providing critical first steps in determining organisms’ transcription regulatory networks. IMPORTANCE Transcription regulatory proteins are an essential class of proteins that help maintain cellular homeostasis by adapting the transcriptome based on environmental cues.
Dysregulation of transcription factors can lead to diseases such as cancer, and many eukaryotic and prokaryotic transcription factors have become enticing therapeutic targets. Additionally, in many understudied organisms, the transcription regulatory networks for uncharacterized transcription factors remain unknown. As such, the need for experimental techniques to establish transcription regulatory networks is paramount. Here, we describe a step-by-step protocol for REPSA, an inexpensive, iterative selection technique to identify transcription factor-binding sequences without the need for antibody-based capture methods. 