Title: The N‐terminus of obscurin is flexible in solution
Abstract

The N‐terminal half of the giant cytoskeletal protein obscurin comprises more than 50 Ig‐like domains arranged in tandem. Domains 18–51 are connected to each other through short 5‐residue linkers, and this arrangement has been previously shown to form a semi‐flexible rod in solution. Domains 1–18 generally have slightly longer ~7‐residue interdomain linkers, and the multidomain structure and motion conferred by this kind of linker are understudied. Here, we use NMR, SAXS, and MD to show that these longer linkers are associated with significantly more domain/domain flexibility, with the resulting multidomain structure being moderately compact. Further examination of the relationship between interdomain flexibility and linker length shows that there is a 5‐residue “sweet spot” linker length that results in dual‐domain systems being extended, and conversely that both longer and shorter linkers result in a less extended structure. This detailed knowledge of the structure and flexibility of the obscurin N‐terminus allowed for mathematical modeling of domains 1–18, which suggests that this region likely forms tangles if left alone in solution. Given how infrequently protein tangles occur in nature, and given the pathological outcomes that occur when tangles do arise, our data suggest that obscurin is likely either significantly scaffolded or else externally extended in the cell.

 
Award ID(s):
2024182 1757874
NSF-PAR ID:
10442599
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Proteins: Structure, Function, and Bioinformatics
Volume:
91
Issue:
4
ISSN:
0887-3585
Page Range / eLocation ID:
p. 485-496
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Obscurin, a giant modular cytoskeletal protein, is composed mostly of tandem immunoglobulin‐like (Ig‐like) domains. This architecture allows obscurin to connect distal targets within the cell. The linkers connecting the Ig domains are usually short (3–4 residues). The physical effect arising from these short linkers is not known; such linkers may lead to a stiff elongated molecule or, conversely, may lead to a more compact and dynamic structure. In an effort to better understand how linkers affect obscurin flexibility, and to better understand the physical underpinnings of this flexibility, here we study the structure and dynamics of four representative sets of dual obscurin Ig domains using experimental and computational techniques. We find in all cases tested that tandem obscurin Ig domains interact at the poles of each domain and tend to stay relatively extended in solution. NMR, SAXS, and MD simulations reveal that while tandem domains are elongated, they also bend and flex significantly. By applying this behavior to a simplified model, it becomes apparent that obscurin can link targets more than 200 nm away. However, as targets get farther apart, obscurin begins acting as a spring and requires progressively more energy to elongate further.

     
  2. Obeid, Iyad; Selesnick, Ivan; Picone, Joseph (Eds.)
    The Temple University Hospital Seizure Detection Corpus (TUSZ) [1] has been in distribution since April 2017. It is a subset of the TUH EEG Corpus (TUEG) [2] and the most frequently requested corpus from our 3,000+ subscribers. It was recently featured as the challenge task in the Neureka 2020 Epilepsy Challenge [3]. A summary of the development of the corpus is shown below in Table 1. The TUSZ Corpus is a fully annotated corpus, which means every seizure event that occurs within its files has been annotated. The data is selected from TUEG using a screening process that identifies files most likely to contain seizures [1]. Approximately 7% of the TUEG data contains a seizure event, so it is important we triage TUEG for high yield data. One hour of EEG data requires approximately one hour of human labor to complete annotation using the pipeline described below, so it is important from a financial standpoint that we accurately triage data. A summary of the labels being used to annotate the data is shown in Table 2. Certain standards are put into place to optimize the annotation process while not sacrificing consistency. Due to the nature of EEG recordings, some records start off with a segment of calibration. This portion of the EEG is instantly recognizable and transitions from what resembles lead artifact to a flat line on all the channels. For the sake of seizure annotation, the calibration is ignored, and no time is wasted on it. During the identification of seizure events, a hard “3 second rule” is used to determine whether two events should be combined into a single larger event. This greatly reduces the time that it takes to annotate a file with multiple events occurring in succession. In addition to the required minimum 3 second gap between seizures, part of our standard dictates that no seizure less than 3 seconds be annotated. 
Although there is no universally accepted definition for how long a seizure must be, we find that it is difficult to distinguish a seizure with confidence from burst suppression or other morphologically similar impressions when the event is only a couple of seconds long. This is due to several reasons, the most notable being the lack of evolution, which is oftentimes crucial for the determination of a seizure. After the EEG files have been triaged, a team of annotators at NEDC is provided with the files to begin data annotation. An example of an annotation is shown in Figure 1. A summary of the workflow for our annotation process is shown in Figure 2. Several passes are performed over the data to ensure the annotations are accurate. Each file undergoes three passes to ensure that no seizures were missed or misidentified. The first pass of TUSZ involves identifying which files contain seizures and annotating them using our annotation tool. The time it takes to fully annotate a file can vary drastically depending on the specific characteristics of each file; however, on average a file containing multiple seizures takes 7 minutes to fully annotate. This includes the time that it takes to read the patient report as well as traverse the entire file. Once an event has been identified, the start and stop times of the seizure are stored in our annotation tool. This is done on a channel-by-channel basis, resulting in an accurate representation of the seizure spreading across different parts of the brain. Files that do not contain any seizures take approximately 3 minutes to complete. Even though there is no annotation being made, the file is still carefully examined to make sure that nothing was overlooked. In addition to scrolling through a file from start to finish, a file is often examined through different lenses. Depending on the situation, low-pass filters are applied and the amplitude of certain channels is increased. 
These techniques are never used in isolation and are meant to further increase our confidence that nothing was missed. Once each file in a given set has been looked at once, the annotators start the review process. The reviewer checks a file and comments on any changes that they recommend. This takes about 3 minutes per seizure-containing file, which is significantly less time than the first pass. After each file has been commented on, the third pass commences. This step takes about 5 minutes per seizure file and requires the reviewer to accept or reject the changes that the second reviewer suggested. Since tangible changes are made to the annotation using the annotation tool, this step takes a bit longer than the previous one. Assuming 18% of the files contain seizures, a set of 1,000 files takes roughly 127 work hours to annotate. Before an annotator contributes to the data interpretation pipeline, they are trained for several weeks on previous datasets. A new annotator can be trained using data that resembles what they would see under normal circumstances. An additional benefit of using released data to train is that it serves as a means of constantly checking our work. If a trainee stumbles across an event that was not previously annotated, it is promptly added, and the data release is updated. It takes about three months to train an annotator to a point where their annotations can be trusted. Even though we carefully screen potential annotators during the hiring process, only about 25% of the annotators we hire survive more than one year doing this work. To keep the annotators consistent in their annotations, the team periodically conducts an interrater agreement evaluation. The annotation standards are discussed in Ochal et al. [4]. An extended discussion of interrater agreement can be found in Shah et al. [5]. 
The most recent release of TUSZ, v1.5.2, represents our efforts to review the quality of the annotations for two challenges we hosted: an internal deep learning challenge at IBM [6] and the Neureka 2020 Epilepsy Challenge [3]. One of the biggest changes made to the annotations was the imposition of a stricter standard for determining the start and stop time of a seizure. Although evolution is still included in the annotations, the start times were altered to start when the spike-wave pattern becomes distinct as opposed to merely when the signal starts to shift from background. This cuts down on background that was mislabeled as a seizure. For seizure end times, all post-ictal slowing that was included was removed. The v1.5.2 release did not include any new data files, apart from two EEG files that were corrupted in v1.5.1 and have since been recovered and added. The progression from v1.5.0 to v1.5.1 and later to v1.5.2 included the re-annotation of all of the EEG files in order to develop a dataset with reliable seizure identification. Starting with v1.4.0, we have also developed a blind evaluation set that is withheld for use in competitions. The annotation team is currently working on the next release of TUSZ, v1.6.0, which is expected to occur in August 2020. It will include new data from 2016 to mid-2019. This release will contain 2,296 files from 2016 as well as several thousand files representing the remaining data through mid-2019. In addition to files that were obtained with our standard triaging process, a part of this release consists of EEG files that do not have associated patient reports. Since actual seizure events are in short supply, we are mining a large chunk of data for which we have EEG recordings but no reports. 
Some of this data contains interesting seizure events collected during long-term EEG sessions or data collected from patients with a history of frequent seizures. It is being mined to increase the number of files in the corpus that have at least one seizure event. We expect v1.6.0 to be released before IEEE SPMB 2020. The TUAR Corpus is an open-source database that is currently available for use by any registered member of our consortium. To register and receive access, please follow the instructions provided at this web page: https://www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml. The data is located here: https://www.isip.piconepress.com/projects/tuh_eeg/downloads/tuh_eeg_artifact/v2.0.0/. 
  3. KCNE3 is a single-pass integral membrane protein that regulates the function of numerous voltage-gated potassium channels, such as KCNQ1. Previous solution NMR studies suggested a moderate degree of curved α-helical structure in the transmembrane domain (TMD) of KCNE3 in lyso-myristoylphosphatidylcholine (LMPC) micelles and isotropic bicelles, with the residues T71, S74 and G78 situated along the concave face of the curved helix. During the interaction of KCNE3 and KCNQ1, KCNE3 pushes its transmembrane domain against KCNQ1 to lock the voltage sensor in its depolarized conformation. A cryo-EM study of KCNE3 complexed with KCNQ1 in nanodiscs suggested a deviation of the KCNE3 structure from its independent structure in isotropic bicelles. Despite the biological significance of the KCNE3 TMD, the conformational properties of KCNE3 are poorly understood. Here, all-atom molecular dynamics (MD) simulations were utilized to investigate the conformational dynamics of the transmembrane domain of KCNE3 in a lipid bilayer containing a mixture of POPC and POPG lipids (3:1). Further, the effect of the interaction-impairing mutations (V72A, I76A and F68A) on the conformational properties of the KCNE3 TMD in lipid bilayers was investigated. Our MD simulation results suggest that the KCNE3 TMD adopts a nearly linear α-helical conformation in POPC-POPG lipid bilayers. Additionally, the results showed no significant change in the nearly linear α-helical conformation of the KCNE3 TMD in the presence of interaction-impairing mutations within the sampled time frame. The KCNE3 TMD is more stable, with lower flexibility, than the N- and C-termini of KCNE3 in lipid bilayers. The overall conformational flexibility of KCNE3 also varies in the presence of the interaction-impairing mutations. The MD simulation data further suggest that the membrane bilayer width is similar for wild-type KCNE3 and KCNE3 containing mutations. 
The Z-distance measurement data revealed that the TMD residue site A69 is close to the lipid bilayer center, and residue sites S57 and S82 are close to the surfaces of the lipid bilayer membrane for wild-type KCNE3 and KCNE3 containing interaction-impairing mutations. These results agree with earlier KCNE3 biophysical studies. The results of these MD simulations will provide complementary data to the experimental outcomes of KCNE3 to help understand its conformational dynamic properties in a more native lipid bilayer environment.

     
  4. Abstract

    Acyl carrier proteins (ACPs) are essential to the production of fatty acids. In some species of marine bacteria, ACPs are arranged into tandem repeats joined by peptide linkers, an arrangement that results in high fatty acid yields. By contrast, Escherichia coli, a relatively low producer of fatty acids, uses a single-domain ACP. In this work, we have engineered the native E. coli ACP into tandem di- and tri-domain constructs joined by a naturally occurring peptide linker from the PUFA synthase of Photobacterium profundum. The size of these tandem fused ACPs was determined by size exclusion chromatography to be higher (21 kDa, 36 kDa and 141 kDa) than expected based on the amino acid sequence (12 kDa, 24 kDa and 37 kDa, respectively), suggesting the formation of a flexible extended conformation. Structural studies using small-angle X-ray scattering (SAXS) confirmed this conformational flexibility. The thermal stability for the di- and tri-domain constructs was similar to that of the unfused ACP, indicating a lack of interaction between domains. Lastly, E. coli cultures harboring tandem ACPs produced up to 1.6 times more fatty acids than wild-type ACP, demonstrating the viability of ACP fusion as a method to enhance fatty acid yield in bacteria.

     
  5. Protein nanomaterial design is an emerging discipline with applications in medicine and beyond. A long-standing design approach uses genetic fusion to join protein homo-oligomer subunits via α-helical linkers to form more complex symmetric assemblies, but this method is hampered by linker flexibility and a dearth of geometric solutions. Here, we describe a general computational method for rigidly fusing homo-oligomer and spacer building blocks to generate user-defined architectures, yielding far more geometric solutions than previous approaches. The fusion junctions are then optimized using Rosetta to minimize flexibility. We apply this method to design and test 92 dihedral symmetric protein assemblies using a set of designed homodimers and repeat protein building blocks. Experimental validation by native mass spectrometry, small-angle X-ray scattering, and negative-stain single-particle electron microscopy confirms the assembly states for 11 designs. Most of these assemblies are constructed from designed ankyrin repeat proteins (DARPins), held in place on one end by α-helical fusion and on the other by a designed homodimer interface, and we explored their use for cryogenic electron microscopy (cryo-EM) structure determination by incorporating DARPin variants selected to bind targets of interest. Although the target resolution was limited by preferred orientation effects and small scaffold size, we found that the dual anchoring strategy reduced the flexibility of the target-DARPin complex with respect to the overall assembly, suggesting that multipoint anchoring of binding domains could contribute to cryo-EM structure determination of small proteins.

     
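The annotation rules and time figures quoted in the TUSZ abstract (item 2 above) can be illustrated with a minimal Python sketch. It is not the authors' tooling. The function names, the `(start, stop)` tuple representation, the merge-before-filter order, and the `recheck_clean_files` option are all assumptions introduced here for illustration. Only the 3-second thresholds and the per-file minutes (7, 3, 3, 5) come from the abstract.

```python
from typing import List, Tuple

Event = Tuple[float, float]  # (start_s, stop_s) of one annotated event

MIN_GAP_S = 3.0       # abstract: seizures closer together than this are combined
MIN_DURATION_S = 3.0  # abstract: seizures shorter than this are not annotated


def apply_three_second_rule(events: List[Event]) -> List[Event]:
    """Merge events separated by under 3 s, then drop events shorter than 3 s.

    Assumption: merging happens before the duration filter.
    """
    merged: List[Event] = []
    for start, stop in sorted(events):
        if merged and start - merged[-1][1] < MIN_GAP_S:
            # Gap to the previous event is too small: extend that event.
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return [(a, b) for a, b in merged if b - a >= MIN_DURATION_S]


def annotation_hours(n_files: int, seizure_fraction: float,
                     recheck_clean_files: bool = False) -> float:
    """Estimate work hours from the per-file minutes quoted in the abstract."""
    seizure_files = round(n_files * seizure_fraction)
    clean_files = n_files - seizure_files
    # Pass 1: 7 min per seizure file, 3 min per file without seizures.
    minutes = 7 * seizure_files + 3 * clean_files
    # Pass 2 (review, 3 min) and pass 3 (accept/reject, 5 min) per seizure file.
    minutes += (3 + 5) * seizure_files
    if recheck_clean_files:
        # Hypothetical: one extra 3-minute look at each file without seizures.
        minutes += 3 * clean_files
    return minutes / 60


if __name__ == "__main__":
    print(apply_three_second_rule([(0, 5), (6, 10), (20, 21), (30, 40)]))
    print(annotation_hours(1000, 0.18))
    print(annotation_hours(1000, 0.18, recheck_clean_files=True))
```

With only the per-file times stated in the abstract, 1,000 files at 18% seizure content come to 86 hours. The abstract's figure of roughly 127 hours is reproduced when each non-seizure file is assumed to receive one additional 3-minute check. That assumption is a guess that happens to match the number; the abstract does not confirm it.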