Title: The complete genome sequences of 56 Erythroxylum species
We present the whole genome sequences of 56 wild Erythroxylum species from Africa, China, and the American tropics. Deep Illumina sequencing was performed on a single leaf of each voucher. We de novo assembled the sequence reads, then identified conserved regions across all preassemblies and used them to join contigs in a finishing step. The raw and assembled data are publicly available via GenBank.
Award ID(s):
2010821
NSF-PAR ID:
10400001
Journal Name:
Biodiversity genomes
ISSN:
2687-7945
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Global biodiversity is declining at rates faster than at any other point in human history. Experimental manipulations at small spatial scales have demonstrated that communities with fewer species consistently produce less biomass than higher diversity communities. Understanding the consequences of the global extinction crisis for ecosystem functioning requires understanding how local experimental results are likely to change with increasing spatial and temporal scales and from experiments to naturally assembled systems.

    Scaling across time and space in a changing world requires baseline predictions. Here, we provide a graphical null model for area scaling of biodiversity–ecosystem functioning relationships using observed macroecological patterns: the species–area curve and the biomass–area curve. We use species–area and biomass–area curves to predict how species richness–biomass relationships are likely to change with increasing sampling extent. We then validate these predictions with data from two naturally assembled ecosystems: a Minnesota savanna and a Panamanian tropical dry forest.

    Our graphical null model predicts that biodiversity–ecosystem functioning relationships are scale‐dependent. However, we note two important caveats. First, our results indicate an apparent contradiction between predictions based on measurements in biodiversity–ecosystem functioning experiments and those from scaling theory. When ecosystem functioning is measured per unit area (e.g. biomass per m²), as is common in biodiversity–ecosystem functioning experiments, the slope of the biodiversity–ecosystem functioning relationship should decrease with increasing scale. Alternatively, when ecosystem functioning is not measured per unit area (e.g. summed total biomass), as is common in scaling studies, the slope of the biodiversity–ecosystem functioning relationship should increase with increasing spatial scale. Second, the underlying macroecological patterns of biodiversity experiments are predictably different from those of some naturally assembled systems. These differences between the underlying patterns of experiments and naturally assembled systems may enable us to better understand when patterns from biodiversity–ecosystem functioning experiments will be valid in naturally assembled systems.

    Synthesis. This paper provides a simple graphical null model that can be extended to any relationship between biodiversity and any ecosystem functioning across space or time. Furthermore, these predictions provide crucial insights into how and when we may be able to extend results from small‐scale biodiversity experiments to naturally assembled regional and global ecosystems where biodiversity is changing.

     
  2. Carbone, Alessandra ; El-Kebir, Mohammed (Ed.)
    Compressed full-text indexes are one of the main success stories of bioinformatics data structures, but even they struggle to handle some DNA readsets. This may seem surprising since, at least when dealing with short reads from the same individual, the readset will be highly repetitive and, thus, highly compressible. If we are not careful, however, this advantage can be more than offset by two disadvantages: first, since most base pairs are included in at least tens of reads each, the uncompressed readset is likely to be at least an order of magnitude larger than the individual’s uncompressed genome; second, these indexes usually pay some space overhead for each string they store, and the total overhead can be substantial when dealing with millions of reads. The most successful compressed full-text indexes for readsets so far are based on the Extended Burrows-Wheeler Transform (EBWT) and use a sorting heuristic to try to reduce the space overhead per read, but they still treat the reads as separate strings and thus may not take full advantage of the readset’s structure. For example, if we have already assembled an individual’s genome from the readset, then we can usually use it to compress the readset well: e.g., we store the gap-coded list of reads’ starting positions; we store the list of their lengths, which is often highly compressible; and we store information about the sequencing errors, which are rare with short reads. There is nowhere, however, where we can plug an assembled genome into the EBWT. In this paper we show how to use one or more assembled or partially assembled genomes as the basis for a compressed full-text index of the corresponding readset. Specifically, we build a labelled tree by taking the assembled genome as a trunk and grafting onto it the reads that align to it, at the starting positions of their alignments. Next, we compute the eXtended Burrows-Wheeler Transform (XBWT) of the resulting labelled tree and build a compressed full-text index on that. Although this index can occasionally return false positives, it is usually much more compact than the alternatives. Following the established practice for datasets with many repetitions, we compare different full-text indexes by looking at the number of runs in the transformed strings. For a human Chr19 readset, our preliminary experiments show that eliminating separator characters from the EBWT reduces the number of runs by 19%, from 220 million to 178 million, and using the XBWT reduces it by a further 15%, to 150 million.
  3. Abstract

    Microbial genomes can be assembled from short-read sequencing data, but the assembly contiguity of these metagenome-assembled genomes is constrained by repeat elements. Correct assignment of genomic positions of repeats is crucial for understanding the effect of genome structure on genome function. We applied nanopore sequencing and our workflow, named Lathe, which incorporates long-read assembly and short-read error correction, to assemble closed bacterial genomes from complex microbiomes. We validated our approach with a synthetic mixture of 12 bacterial species. Seven genomes were completely assembled into single contigs and three genomes were assembled into four or fewer contigs. Next, we used our methods to analyze metagenomics data from 13 human stool samples. We assembled 20 circular genomes, including genomes of Prevotella copri and a candidate Cibiobacter sp. Despite the decreased nucleotide accuracy compared with alternative sequencing and assembly approaches, our methods improved assembly contiguity, allowing for investigation of the role of repeat elements in microbial function and adaptation.

     
  4. We report simulation studies on the self-assembly of a binary mixture of snowman- and dumbbell-shaped lobed particles. Depending on the lobe size and temperature, different types of self-assembled structures (random aggregates, spherical aggregates, liquid droplets, amorphous wire-like structures, amorphous ring structures, crystalline structures) are observed. At lower temperatures, heterogeneous structures are formed for lobed particles of both shapes. At higher temperatures, homogeneous self-assembled structures are formed mainly by the dumbbell-shaped particles, while the snowman-shaped particles remain in a dissociated state. We also investigated the porosities of self-assembled structures. The pore diameters in self-assemblies increased with an increase in temperature for a given lobe size. The particles having smaller lobes produced structures with larger pores than the particles having larger lobes. We further investigated the effect of σ, a parameter in the surface-shifted Lennard-Jones potential, on the self-assembled morphologies and their porosities. The self-assembled structures formed at a higher σ value are found to produce larger pores than those at a lower σ.
  5. Hydrogen bonding plays a critical role in the self-assembly of peptide amphiphiles (PAs). Herein, we studied the effect of replacing the amide linkage between the peptide and lipid portions of the PA with a urea group, which possesses an additional hydrogen bond donor. We prepared three PAs with the peptide sequence Phe-Phe-Glu-Glu (FFEE): two are amide-linked with hydrophobic tails of different lengths and the other possesses an alkylated urea group. The differences in the self-assembled structures formed by these PAs were assessed using diverse microscopies, nuclear magnetic resonance (NMR), and dichroism techniques. We found that the urea group influences the morphology and internal arrangement of the assemblies. Molecular dynamics simulations suggest that there are about 50% more hydrogen bonds in nanostructures assembled from the urea-PA than in those assembled from the other PAs. Furthermore, in silico studies suggest the presence of urea−π stacking interactions with the phenyl group of Phe, which results in distinct peptide conformations in comparison to the amide-linked PAs. We then studied the effect of the urea modification on the mechanical properties of PA hydrogels. We found that the hydrogel made of the urea-PA exhibits increased stability and self-healing ability. In addition, it allows cell adhesion, spreading, and growth as a matrix. This study reveals that the inclusion of urea bonds might be useful in controlling the morphology and the mechanical and biological properties of self-assembled nanostructures and hydrogels formed by the PAs.
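
Item 2 above compares full-text indexes by counting runs in the transformed strings. The sketch below illustrates only that run-count measure, using the plain Burrows-Wheeler Transform of a single string. It is not the EBWT or XBWT construction from that paper, and the function names `bwt` and `count_runs`, along with the `$` terminator, are hypothetical choices made for this illustration.

```python
def bwt(text: str, terminator: str = "$") -> str:
    """Plain Burrows-Wheeler Transform via sorted rotations.

    Quadratic in time and memory, so suitable for toy inputs only.
    The terminator is assumed to sort before every character in text.
    """
    if terminator in text:
        raise ValueError("terminator must not occur in the input text")
    s = text + terminator
    # Sort all cyclic rotations and read off the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(s: str) -> int:
    """Number of maximal runs of equal adjacent characters in s."""
    if not s:
        return 0
    # A new run starts wherever a character differs from its predecessor.
    return 1 + sum(1 for a, b in zip(s, s[1:]) if a != b)


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed, count_runs(transformed))
```

Fewer runs generally means the transformed string compresses better under run-length encoding, which is why run counts serve as a proxy for index size on repetitive data.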