The following repository contains the data used for the manuscript: "Searching for Low-Mass Exoplanets Amid Stellar Variability with a Fixed Effects Linear Model of Line-by-Line Shape Changes". Within line_property_files.zip there is a file, "line_property_files_README.md", that describes each column of completeLines.csv. The important columns of this csv are:
- date: time of observation
- line_order: unique identifier of a line, a combination of that line's central wavelength and order ID
- rv_template_0.5: the RV measure for a given line on a given day
- fit_gauss_a, fit_gauss_b, fit_gauss_depth, fit_gauss_sigmasq, proj_hg_coeff_0, proj_hg_coeff_2, proj_hg_coeff_3, proj_hg_coeff_4, proj_hg_coeff_5, proj_hg_coeff_6, proj_hg_coeff_7, proj_hg_coeff_8, proj_hg_coeff_9, proj_hg_coeff_10: the shape-change covariates that we use to control for stellar activity
Below is a description of each of the data files:
- completeLines.csv: this csv file contains the time series of every line's shape measurements and RVs. It is used throughout the analysis.
- line_property_files.zip: this directory contains .h5 files that contain all line-shape information and contaminated RVs for each line used in our analysis. The script clean_data.Rmd uses these as input and combines them all into a single csv file called completeLines.csv.
- project_template_deriv_onto_gh.h5: this contains the projection vector described in the paper to produce the orthogonal HG coefficients. The script clean_data.Rmd uses this as input and combines them all into a single csv file called completeLines.csv.
- models.zip: this directory contains the results from each model that was fit for our paper.
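As an illustration of how the columns described above fit together, the following is a minimal Python sketch that groups the per-line RV measurements into one time series per line. It is not the authors' code (their pipeline uses clean_data.Rmd). The function name rv_series_by_line and the sample rows are hypothetical, and the date and line_order formats shown are assumptions rather than the real file's formats.

```python
import csv
import io
from collections import defaultdict

# Column names as given in the description of completeLines.csv.
DATE_COL = "date"
LINE_COL = "line_order"
RV_COL = "rv_template_0.5"


def rv_series_by_line(csv_file):
    """Group (date, RV) pairs by spectral line.

    Assumption: RV values parse as floats. Rows whose RV field is
    missing or non-numeric are skipped.
    """
    series = defaultdict(list)
    for row in csv.DictReader(csv_file):
        try:
            rv = float(row[RV_COL])
        except (KeyError, TypeError, ValueError):
            continue
        series[row[LINE_COL]].append((row[DATE_COL], rv))
    return dict(series)


# Made-up sample rows (not real data); the date and line_order
# formats here are placeholders for illustration only.
SAMPLE = """date,line_order,rv_template_0.5
2020-01-01,5000.1_40,1.5
2020-01-02,5000.1_40,-0.5
2020-01-01,5100.2_41,0.25
"""

example = rv_series_by_line(io.StringIO(SAMPLE))
```

To run this against the real file, pass an open file handle for completeLines.csv in place of the in-memory sample.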
River floc data extracted from a river suspended sediment concentration-depth profile data compilation
The file "riverfloc_datacompilation.csv" contains the data in csv format. The file "metadata.txt" contains the metadata describing the data in the csv file. This version corrects an error in which the ionic strength and relative charge density (variables 48 and 50) were underestimated by a factor of 1000.
- Award ID(s):
- 2136991
- PAR ID:
- 10615210
- Publisher / Repository:
- CaltechDATA
- Date Published:
- Subject(s) / Keyword(s):
- flocculation mud river suspended sediment geomorphology
- Format(s):
- Medium: X
- Right(s):
- mit
- Sponsoring Org:
- National Science Foundation
More Like this
-
This resource contains source code and select data products behind the following Master's Thesis: Platt, L. (2024). Basins modulate signatures of river salinization (Master's thesis). University of Wisconsin-Madison, Freshwater and Marine Sciences. The source code represents an R-based data processing and modeling pipeline written using the R package "targets". Some of the folders in the source code zipfile are intentionally left empty (except for a hidden file ".placeholder") so that the code repository can be set up with the required folder structure. To execute this code, download the zip folder, unzip it, and open the salt-modeling-data.Rproj file. Then follow the instructions in the README.md file for installing packages, building the pipeline, and examining the results. Newer versions of this repository may be available on GitHub at github.com/lindsayplatt/salt-modeling-data. In addition to the source code, this resource contains three data files containing intermediate products of the pipeline. The first two represent data prepared for the random forest modeling. Data download and processing were completed in pipeline phases 1 - 5, and the random forest modeling was completed in phase 6 (see source code).
- site_attributes.csv contains the USGS gage site numbers and their associated basin attributes.
- site_classifications.csv contains the classification of a site for both episodic signatures ("Episodic" or "Not episodic") and baseflow salinization signatures ("positive", "none", "negative", or NA). Note that an NA in the baseflow classification column means that the site did not meet minimum data requirements for calculating a trend and was not used in the random forest model for baseflow salinization.
- site_attribute_details.csv contains a table of each attribute shorthand used as column names in site_attributes.csv and their names, units, description, and data source.
-
This data set contains measurements from real HVAC (heating, ventilation, and air conditioning) systems of real buildings in the US. Each ZIP file contains CSV data files of a building for different scenarios. Refer to the README file in each ZIP file for details. The document `data_info.pdf` provides explanations of the variables/columns in the data files. This work was supported by the U.S. National Science Foundation (NSF) under grants 2514584 and 2513096.
-
The data files in this data set contain climate information from sites on the North Slope of Alaska in or near the Kuparuk River basin. The data were collected for a hydrologic study of rivers in the North Slope region from 1985 to present. Hydro-meteorological stations were established at various locations throughout the Kuparuk, but also in the Putuligayuk and Sagavanirktok watersheds. The variables collected at most stations were air temperature, humidity, wind speed and direction, soil temperature, snow temperature, precipitation, snow depth, and radiation. In the Imnavait Creek watershed (headwaters of the Kuparuk River), the Imnavait B site (IB) meteorological station has operated from 1986 to present. This data package contains meteorological data from the Imnavait B site (IB) station and snow depth from the nearby station in the valley bottom (Imnavait Creek weir [IH]) collected from 2017 to 2023. Variables in this data package include air temperature, relative humidity, wind speed and direction, rainfall, and radiation at the Imnavait B site (IB) (2018-2023) and winter snow depth at Imnavait Weir (IH) (2017-2023). IMPORTANT NOTE: This dataset contains Imnavait B site (IB) meteorological data for 2018-2023. Updates and corrections to Imnavait B site (IB) (and others) were made in 2021 to the original datasets by the investigators, and all of the previously published data files (prior to 2008) should be replaced with the updated dataset (1985-2018) available at https://arcticdata.io/catalog/view/doi%3A10.18739%2FA2TQ5RF72.
The following corrections were made to the datasets originally published in 2008 and 2010 (for data collected 1985-2008): 1) merged data from annual .csv files into one .csv file (for each station) containing all years of data, 2) appended new data collected from 2008 to 2018 to the .csv file, 3) standardized file headers, 4) standardized variable names, units, and sensor installation height above ground surface, 5) reviewed all data for quality assurance and added qualifiers to erroneous data, 6) added a data qualifier to wind data during periods of extensive riming on wind sensors, 7) added a qualifier when air temperatures are below -39 degrees Celsius (C) (minimum reporting temperature of some air temperature sensors), and 8) removed duplicative data and fixed timestamp issues. See https://arcticdata.io/catalog/view/urn%3Auuid%3Ad5fa4cfa-b84b-4970-926a-8dd10b418e6d for additional climate data from other nearby stations in our studies.
-
The historical settlement data compilation for Spain (HISDAC-ES) is a geospatial dataset consisting of over 240 gridded surfaces measuring the physical, functional, age-related, and evolutionary characteristics of the Spanish building stock. We scraped, harmonized, and aggregated cadastral building footprint data for Spain, covering over 12,000,000 building footprints including construction year attributes, to create a multi-faceted series of gridded surfaces (GeoTIFF format), describing the evolution of human settlements in Spain from 1900 to 2020, at 100 m spatial and 5-year temporal resolution. The dataset also contains aggregated characteristics and completeness statistics at the municipality level, in CSV and GeoPackage format.

UPDATE 08-2023: We provide a new, improved version of HISDAC-ES. Specifically, we fixed two bugs in the production code that caused an incorrect rasterization of the multitemporal BUFA layers and of the PHYS layers (BUFA, BIA, DWEL, BUNITS sum and mean). Moreover, we added decadal raster datasets measuring residential building footprint and building indoor area (1900-2020), and provide a country-wide, harmonized building footprint centroid dataset in GeoPackage vector data format.

File descriptions:
Datasets are available in three spatial reference systems:
- HISDAC-ES_All_LAEA.zip: Raster data in Lambert Azimuthal Equal Area (LAEA) covering all Spanish territory.
- HISDAC-ES_IbericPeninsula_UTM30.zip: Raster data in UTM Zone 30N covering all of the Iberian Peninsula + Ceuta and Melilla.
- HISDAC-ES_CanaryIslands_REGCAN.zip: Raster data in REGCAN-95, covering the Canary Islands only.
- HISDAC-ES_MunicipAggregates.zip: Municipality-level aggregates and completeness statistics (CSV, GeoPackage), in LAEA projection.
- ES_building_centroids_merged_spatjoin.gpkg: 7,000,000+ building footprint centroids in GeoPackage format, harmonized from the different cadastral systems, representing the input data for HISDAC-ES. These data can be used for sanity checks or for the creation of further, user-defined gridded surfaces.

Source data:
HISDAC-ES is derived from cadastral building footprint data, available from different authorities in Spain:
- Araba province: https://geo.araba.eus/WFS_Katastroa?SERVICE=WFS&VERSION=1.1.0&REQUEST=GetCapabilities
- Bizkaia province: https://web.bizkaia.eus/es/inspirebizkaia
- Gipuzkoa province: https://b5m.gipuzkoa.eus/web5000/es/utilidades/inspire/edificios/
- Navarra region: https://inspire.navarra.es/services/BU/wfs
- Other regions: http://www.catastro.minhap.es/INSPIRE/buildings/ES.SDGC.bu.atom.xml
Data source of municipality polygons: Centro Nacional de Información Geográfica (https://centrodedescargas.cnig.es/CentroDescargas/index.jsp)

Technical notes:
Gridded data
File nomenclature: ./region_projection_theme/hisdac_es_theme_variable_version_resolution[m][_year].tif
Regions:
- all: complete territory of Spain
- can: Canary Islands only
- ibe: Iberian Peninsula + Ceuta + Melilla
Projections:
- laea: Lambert azimuthal equal area (EPSG:3035)
- regcan: REGCAN95 / UTM zone 28N (EPSG:4083)
- utm: ETRS89 / UTM zone 30N (EPSG:25830)
Themes:
- evolution / evol: multi-temporal physical measurements
- landuse: multi-temporal building counts per land use (i.e., building function) class
- physical / phys: physical building characteristics in 2020
- temporal / temp: temporal characteristics (construction year statistics)
Variables: evolution
- budens: building density (count per grid cell area)
- bufa: building footprint area
- deva: developed area (any grid cell containing at least one building)
- resbufa: residential building footprint area
- resbia: residential building indoor area
Variables: physical
- bia: building indoor area
- bufa: building footprint area
- bunits: number of building units
- dwel: number of dwellings
Variables: temporal
- mincoy: minimum construction year per grid cell
- maxcoy: maximum construction year per grid cell
- meancoy: mean construction year per grid cell
- medcoy: median construction year per grid cell
- modecoy: mode (most frequent) construction year per grid cell
- varcoy: variety of construction years per grid cell
Variable: landuse
Counts of buildings per grid cell and land use type.

Municipality-level data
- hisdac_es_municipality_stats_multitemporal_longform_v1.csv: This CSV file contains the zonal sums of the gridded surfaces (e.g., number of buildings per year and municipality) in long form. Note that a value of 0 for the year attribute denotes the statistics for records without construction year information.
- hisdac_es_municipality_stats_multitemporal_wideform_v1.csv: This CSV file contains the zonal sums of the gridded surfaces (e.g., number of buildings per year and municipality) in wide form. Note that a value of 0 for the year suffix denotes the statistics for records without construction year information.
- hisdac_es_municipality_stats_completeness_v1.csv: This CSV file contains the missingness rates (in %) of the building attribute per municipality, ranging from 0.0 (attribute exists for all buildings) to 100.0 (attribute exists for none of the buildings) in a given municipality.
Column names for the completeness statistics tables:
- NATCODE: National municipality identifier*
- num_total: number of buildings per municipality
- perc_bymiss: Percentage of buildings with missing built year (construction year)
- perc_lumiss: Percentage of buildings with missing landuse attribute
- perc_luother: Percentage of buildings with landuse type "other"
- perc_num_floors_miss: Percentage of buildings without valid number of floors attribute
- perc_num_dwel_miss: Percentage of buildings without valid number of dwellings attribute
- perc_num_bunits_miss: Percentage of buildings without valid number of building units attribute
- perc_offi_area_miss: Percentage of buildings without valid official area (building indoor area, BIA) attribute
- perc_num_dwel_and_num_bunits_miss: Percentage of buildings missing both number of dwellings and number of building units attribute
The same statistics are available as a GeoPackage file including municipality polygons in Lambert azimuthal equal area (EPSG:3035).
*From the NATCODE, other regional identifiers can be derived as follows:
- NATCODE: 34 01 04 04001
- Country: 34
- Comunidad autónoma (CA_CODE): 01
- Province (PROV_CODE): 04
- LAU code: 04001 (province + municipality code)
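The NATCODE breakdown given above can be sketched in Python as follows. This is a minimal illustration rather than part of the dataset's tooling: it assumes the NATCODE is an 11-digit string laid out exactly as in the example (2 + 2 + 2 + 5 digits). The function name parse_natcode and the dictionary keys "country" and "LAU_CODE" are hypothetical names chosen for this example.

```python
def parse_natcode(natcode):
    """Split a NATCODE into the regional identifiers it encodes.

    Assumption: the code has 11 digits, laid out as country (2),
    comunidad autonoma (2), province (2), LAU code (5), following
    the worked example 34 01 04 04001. Spaces are ignored.
    """
    digits = str(natcode).replace(" ", "")
    if len(digits) != 11 or not digits.isdigit():
        raise ValueError(f"expected an 11-digit NATCODE, got {natcode!r}")
    return {
        "country": digits[0:2],
        "CA_CODE": digits[2:4],
        "PROV_CODE": digits[4:6],
        # LAU code = province code + municipality code
        "LAU_CODE": digits[6:11],
    }


parts = parse_natcode("34 01 04 04001")
```

Keeping every part as a string preserves leading zeros, which would be lost if the codes were converted to integers.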
