This is a pre-release of this MAG set, which is now published in ENA under BioProject PRJNA386568.

FILES:
mags_emerge_20230110.tar.gz - Archive containing MAG files (.fna).
metadata_MAGs_EMERGE.tsv - Table containing MIMAG(5.0)-formatted sample attributes, genome information, and other metadata for the MAGs. This table also includes JGI or NCBI genome accession #s for some additional MAGs that are not part of the .tar.gz archive.

NEW in Version 1.0.0:
Added source metagenome accessions, including SRA runs (derived_from) and BioSamples (metaG_biosample), for all MAGs where this info was available.
Added other metadata (including SampleID__, assembly methods, and sequencing technology) that was previously absent for the externally-cited MAGs.

FUNDING: This research is a contribution of the EMERGE Biology Integration Institute (https://emerge-bii.github.io/), funded by the National Science Foundation, Biology Integration Institutes Program, Award # 2022070. This study was also funded by the Genomic Science Program of the United States Department of Energy Office of Biological and Environmental Research, grant #s DE-SC0004632, DE-SC0010580, and DE-SC0016440. We thank the Swedish Polar Research Secretariat and SITES for the support of the work done at the Abisko Scientific Research Station. SITES is supported by the Swedish Research Council's grant 4.3-2021-00164.

Data collected at the Joint Genome Institute was generated under the following awards: The majority of sequencing at JGI was supported by BER Support Science Proposal 503530 (DOI: 10.46936/10.25585/60001148), conducted by the U.S. Department of Energy Joint Genome Institute (https://ror.org/04xm1d337), a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Sequencing of SIP samples was performed under the Facilities Integrating Collaborations for User Science (FICUS) initiative (proposal 503547; award DOI: 10.46936/fics.proj.2017.49950/60006215) and used resources at the DOE Joint Genome Institute (https://ror.org/04xm1d337) and the Environmental Molecular Sciences Laboratory (https://ror.org/04rc0xn13), which are DOE Office of Science User Facilities. Both facilities are sponsored by the Office of Biological and Environmental Research and operated under Contract Nos. DE-AC02-05CH11231 (JGI) and DE-AC05-76RL01830 (EMSL).
                    
                            
                            Metagenome-assembled genomes from Stordalen Mire, Sweden (MAGs v2)
                        
                    
    
This release (MAGs v2) is a major new version of this metagenome-assembled genome (MAG) set. All previous releases on this page (which differ only in the metadata) are designated "MAGs v1." The current release (MAGs v2) uses CheckM2 v1.0.2 filtering (≥70% completeness, ≤10% contamination) to expand this dataset to include 36,419 MAGs, with the following subcategories:
Cronin_v1: Manually curated subset of the "Field" category from MAGs v1.
Cronin_v2: MAGs from raw bin filtering on the same assemblies used to generate Cronin_v1.
Woodcroft_v2: MAGs from raw bin filtering on the same assemblies used to generate the MAGs reported in Woodcroft & Singleton et al. (2018).
SIPS: Updated genomes from samples originating from a stable isotope probing (SIP) incubation experiment by Moira Hough et al. ("SIP" in MAGs v1), re-analyzed due to read truncation and sample linkage issues in MAGs v1.
JGI: Expanded set of genomes from the Joint Genome Institute's metagenome annotation pipeline.

FILES:
Emerge_MAGs_v2.tar.gz - Archive containing the MAG files (.fna).
metadata_MAGs_v2_EMERGE.tsv - Table containing source sample names and accessions, GTDB taxonomy information, CheckM2 quality reports, NCBI GenomeBatch- and MIMAG(6.0)-formatted sample attributes, and other metadata for the MAGs.

FUNDING: This research is a contribution of the EMERGE Biology Integration Institute (https://emerge-bii.github.io/), funded by the National Science Foundation, Biology Integration Institutes Program, Award # 2022070. This study was also funded by the Genomic Science Program of the United States Department of Energy Office of Biological and Environmental Research, grant #s DE-SC0004632, DE-SC0010580, and DE-SC0016440. We thank the Swedish Polar Research Secretariat and SITES for the support of the work done at the Abisko Scientific Research Station. SITES is supported by the Swedish Research Council's grant 4.3-2021-00164.

Data collected at the Joint Genome Institute was generated under the following awards: The majority of sequencing at JGI was supported by BER Support Science Proposal 503530 (DOI: 10.46936/10.25585/60001148), conducted by the U.S. Department of Energy Joint Genome Institute (https://ror.org/04xm1d337), a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Sequencing of SIP samples was performed under the Facilities Integrating Collaborations for User Science (FICUS) initiative (proposal 503547; award DOI: 10.46936/fics.proj.2017.49950/60006215) and used resources at the DOE Joint Genome Institute (https://ror.org/04xm1d337) and the Environmental Molecular Sciences Laboratory (https://ror.org/04rc0xn13), which are DOE Office of Science User Facilities. Both facilities are sponsored by the Office of Biological and Environmental Research and operated under Contract Nos. DE-AC02-05CH11231 (JGI) and DE-AC05-76RL01830 (EMSL).
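
The quality thresholds stated for MAGs v2 (≥70% completeness, ≤10% contamination) can be re-applied to a metadata table as a sanity check. The following is only a minimal sketch: the column names `Completeness` and `Contamination` are assumptions for illustration and may not match the headers actually used in metadata_MAGs_v2_EMERGE.tsv, so check the file before running it.

```python
import csv
import io

# Thresholds as stated in the release description for MAGs v2.
MIN_COMPLETENESS = 70.0
MAX_CONTAMINATION = 10.0


def passes_quality(completeness, contamination):
    """Return True if a genome meets the stated MAGs v2 thresholds."""
    return completeness >= MIN_COMPLETENESS and contamination <= MAX_CONTAMINATION


def filter_mags(tsv_handle,
                completeness_col="Completeness",    # hypothetical column name
                contamination_col="Contamination"):  # hypothetical column name
    """Read a tab-separated metadata table and keep rows that pass the filter.

    Rows with missing or non-numeric quality values are skipped.
    A wrong column name raises KeyError rather than silently dropping rows.
    """
    kept = []
    for row in csv.DictReader(tsv_handle, delimiter="\t"):
        try:
            completeness = float(row[completeness_col])
            contamination = float(row[contamination_col])
        except (TypeError, ValueError):
            continue
        if passes_quality(completeness, contamination):
            kept.append(row)
    return kept


# Made-up example rows, not taken from the dataset.
example_tsv = (
    "genome\tCompleteness\tContamination\n"
    "mag_a\t95.2\t1.3\n"
    "mag_b\t69.9\t0.5\n"
    "mag_c\t88.0\t10.5\n"
    "mag_d\t70.0\t10.0\n"
)
kept_ids = [row["genome"] for row in filter_mags(io.StringIO(example_tsv))]
```

To run this against the real table, open the file with `open(path, newline="")` and pass the handle to `filter_mags`, adjusting the two column names to the headers found in the file.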
        
    
- Award ID(s): 2022070
- PAR ID: 10591369
- Publisher / Repository: Zenodo
- Date Published:
- Subject(s) / Keyword(s): EMERGE Biology Integration Institute; Stordalen Mire
- Format(s): Medium: X
- Right(s): Creative Commons Attribution 4.0 International
- Sponsoring Org: National Science Foundation
More Like this
- 
            METHODS: Soil samples (6 total) were collected at the Stordalen Mire site in 2019 from two depths (1-5 & 20-24 cm below ground) across three habitats (Palsa, Bog, and Fen). DNA was extracted based on the protocol described by Li et al. (2024). For short reads, libraries were prepared at the Joint Genome Institute (JGI) with the KAPA Hyperprep kit, and sequenced with Illumina NovaSeq 6000. For long reads, libraries were prepared with the SMRTbell Express Template Prep Kit 2.0 (PacBio), then sequenced using PacBio Sequel IIe at JGI. PacBio data was processed at JGI to form filtered CCS (Circular Consensus Sequencing) reads. Assemblies were generated with short-only, long-only, and hybrid read sources: Short-only was assembled with metaSPAdes (v3.15.4) using Aviary (v0.5.3) with default parameters. Long-only was assembled with metaFlye (v2.9-b1768) using Aviary (v0.5.3) with default parameters. Hybrid assembly was performed using Aviary v0.5.3 with default parameters. This involved a step-down procedure with long-read assembly through metaFlye (v2.9-b1768), followed by short-read polishing by Racon (v1.4.3), Pilon (v1.24) and then Racon again. Next, reads that didn't map to high-quality metaFlye contigs were hybrid assembled with SPAdes (--meta option) and binned out with MetaBAT2 (v2.1.5). For each bin, the reads within the bin were hybrid assembled using Unicycler (v0.4.8). The high-coverage metaFlye contigs and Unicycler contigs were then combined to form the assembly fasta file. Genome recovery was performed using Aviary v0.5.3 with samples chosen for differential abundance binning by Bin Chicken (v0.4.2) using SingleM metapackage S3.0.5. This involved initial read mapping through CoverM (v0.6.1) using minimap2 (v2.18) and binning by MetaBAT, MetaBAT2 (v2.1.5), VAMB (v3.0.2), SemiBin (v1.3.1), Rosella (v0.4.2), CONCOCT (v1.1.0) and MaxBin2 (v2.2.7). Genomes were analyzed using CheckM2 (v1.0.2) and clustered at 95% ANI using Galah (v0.4.0). 
FILES: EMERGE_MAGs_2019_long-short-hybrid.tar.gz - Archive containing the MAG files (.fna). metadata_MAGs_2019_EMERGE.tsv - Table containing source sample names and accessions, GTDB classifications, CheckM2 quality information, NCBI GenomeBatch- and MIMAG(6.0)-formatted attributes, and other metadata for the MAGs. FUNDING: This research is a contribution of the EMERGE Biology Integration Institute (https://emerge-bii.github.io/), funded by the National Science Foundation, Biology Integration Institutes Program, Award # 2022070. This study was also funded by the Genomic Science Program of the United States Department of Energy Office of Biological and Environmental Research, grant #s DE-SC0004632, DE-SC0010580, and DE-SC0016440. We thank the Swedish Polar Research Secretariat and SITES for the support of the work done at the Abisko Scientific Research Station. SITES is supported by the Swedish Research Council's grant 4.3-2021-00164. Data from the Joint Genome Institute (JGI) was collected under BER Support Science Proposal 503530 (DOI: 10.46936/10.25585/60001148), conducted by the U.S. Department of Energy Joint Genome Institute (https://ror.org/04xm1d337), a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
- 
            Pairwise geographic distances (m) between mire-wide plots in Stordalen Mire, northern Sweden. Distances are in the file Mirewide_Plots_distances-m.csv. This file was generated with the script Mirewide_Plots_Distances.R, using Mirewide_Plots_GPS.csv as input, and geosphere package version 1.5-10. Details of the plots, including latitude, longitude, and vegetation cover, are in the dataset "Stordalen Mire mire-wide survey: Vegetation cover" (https://doi.org/10.5281/zenodo.15048198). The latitude & longitude provided in that dataset represent more precise versions of the coordinates in Mirewide_Plots_GPS.csv (which also omits plot 8); the coordinates are otherwise identical in both datasets. FUNDING: National Aeronautics and Space Administration, Interdisciplinary Science program: From Archaea to the Atmosphere (award # NNX17AK10G). National Science Foundation, Biology Integration Institutes Program: EMERGE Biology Integration Institute (award # 2022070). United States Department of Energy Office of Biological and Environmental Research, Genomic Science Program: The IsoGenie Project (grant #s DE-SC0004632, DE-SC0010580, and DE-SC0016440). We thank the Swedish Polar Research Secretariat and SITES for the support of the work done at the Abisko Scientific Research Station. SITES is supported by the Swedish Research Council's grant 4.3-2021-00164.
- 
            Relative abundances of all microbial PICRUSt-inferred functional pathways for all samples, based on 16S rRNA amplicon sequencing data from a mire-wide survey (2015) and co-analyzed autochamber site samples (2014-2015). The 16S rRNA amplicon sequencing data is available under NCBI BioProject PRJNA1236848. The sample metadata and SRA accessions are available at https://doi.org/10.5281/zenodo.15047156. FUNDING: National Aeronautics and Space Administration, Interdisciplinary Science program: From Archaea to the Atmosphere (award # NNX17AK10G). National Science Foundation, Biology Integration Institutes Program: EMERGE Biology Integration Institute (award # 2022070). United States Department of Energy Office of Biological and Environmental Research, Genomic Science Program: The IsoGenie Project (grant #s DE-SC0004632, DE-SC0010580, and DE-SC0016440). Sequencing was performed using startup funding from the University of Arizona to Virginia Rich. We thank the Swedish Polar Research Secretariat and SITES for the support of the work done at the Abisko Scientific Research Station. SITES is supported by the Swedish Research Council's grant 4.3-2021-00164.
- 
            Stordalen Mire microbial ASV table (mire-wide_ASV_table.tsv) and taxonomy (mire-wide_taxonomy.tsv), based on 16S rRNA amplicon sequencing data from a mire-wide survey (2015) and co-analyzed autochamber site samples (2014-2015). The 16S rRNA amplicon sequencing data is available under NCBI BioProject PRJNA1236848. The sample metadata and SRA accessions are available at https://doi.org/10.5281/zenodo.15047156. FUNDING: National Aeronautics and Space Administration, Interdisciplinary Science program: From Archaea to the Atmosphere (award # NNX17AK10G). National Science Foundation, Biology Integration Institutes Program: EMERGE Biology Integration Institute (award # 2022070). United States Department of Energy Office of Biological and Environmental Research, Genomic Science Program: The IsoGenie Project (grant #s DE-SC0004632, DE-SC0010580, and DE-SC0016440). Sequencing was performed using startup funding from the University of Arizona to Virginia Rich. We thank the Swedish Polar Research Secretariat and SITES for the support of the work done at the Abisko Scientific Research Station. SITES is supported by the Swedish Research Council's grant 4.3-2021-00164.
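
For the pairwise plot distances dataset listed above, the general idea of computing distances in meters from latitude and longitude can be sketched as follows. This is not the authors' script (Mirewide_Plots_Distances.R, which uses the R geosphere package); it is a plain-Python haversine illustration. The choice of the haversine formula, the sphere radius (the WGS84 equatorial radius), the plot names, and the coordinates are all assumptions made for the example.

```python
import math
from itertools import combinations

# Assumed sphere radius in meters (WGS84 equatorial radius).
EARTH_RADIUS_M = 6378137.0


def haversine_m(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_M):
    """Great-circle distance in meters between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


def pairwise_distances(plots):
    """Map each unordered pair of plot names to its distance in meters.

    `plots` is a dict of name -> (latitude, longitude) in decimal degrees.
    """
    return {
        (a, b): haversine_m(*plots[a], *plots[b])
        for a, b in combinations(sorted(plots), 2)
    }


# Hypothetical plot names and coordinates, not taken from Mirewide_Plots_GPS.csv.
example_plots = {
    "plot_1": (68.3530, 19.0470),
    "plot_2": (68.3540, 19.0490),
    "plot_3": (68.3550, 19.0450),
}
distances = pairwise_distances(example_plots)
```

The result is a dictionary keyed by plot pairs, which can be written out as a long-format table or pivoted into a square matrix.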
 An official website of the United States government
				
			 
					 
					
