Title: HISDAC-ES: Historical Settlement Data Compilation for Spain (1900-2020)
The historical settlement data compilation for Spain (HISDAC-ES) is a geospatial dataset consisting of over 240 gridded surfaces measuring the physical, functional, age-related, and evolutionary characteristics of the Spanish building stock. We scraped, harmonized, and aggregated cadastral building footprint data for Spain, covering over 12,000,000 building footprints including construction year attributes, to create a multi-faceted series of gridded surfaces (GeoTIFF format) describing the evolution of human settlements in Spain from 1900 to 2020, at 100 m spatial and 5-year temporal resolution. The dataset also contains aggregated characteristics and completeness statistics at the municipality level, in CSV and GeoPackage format.

UPDATE 08-2023: We provide a new, improved version of HISDAC-ES. Specifically, we fixed two bugs in the production code that caused an incorrect rasterization of the multitemporal BUFA layers and of the PHYS layers (BUFA, BIA, DWEL, BUNITS sum and mean). Moreover, we added decadal raster datasets measuring residential building footprint and building indoor area (1900-2020), and provide a country-wide, harmonized building footprint centroid dataset in GeoPackage vector data format.

File descriptions:

Raster datasets are available in three spatial reference systems:
- HISDAC-ES_All_LAEA.zip: Raster data in Lambert Azimuthal Equal Area (LAEA) covering all Spanish territory.
- HISDAC-ES_IbericPeninsula_UTM30.zip: Raster data in UTM Zone 30N covering the Iberian Peninsula plus Ceuta and Melilla.
- HISDAC-ES_CanaryIslands_REGCAN.zip: Raster data in REGCAN-95, covering the Canary Islands only.

Further files:
- HISDAC-ES_MunicipAggregates.zip: Municipality-level aggregates and completeness statistics (CSV, GeoPackage), in LAEA projection.
- ES_building_centroids_merged_spatjoin.gpkg: 7,000,000+ building footprint centroids in GeoPackage format, harmonized from the different cadastral systems, representing the input data for HISDAC-ES. These data can be used for sanity checks or for the creation of further, user-defined gridded surfaces.

Source data:

HISDAC-ES is derived from cadastral building footprint data, available from different authorities in Spain:
- Araba province: https://geo.araba.eus/WFS_Katastroa?SERVICE=WFS&VERSION=1.1.0&REQUEST=GetCapabilities
- Bizkaia province: https://web.bizkaia.eus/es/inspirebizkaia
- Gipuzkoa province: https://b5m.gipuzkoa.eus/web5000/es/utilidades/inspire/edificios/
- Navarra region: https://inspire.navarra.es/services/BU/wfs
- Other regions: http://www.catastro.minhap.es/INSPIRE/buildings/ES.SDGC.bu.atom.xml

Data source of municipality polygons: Centro Nacional de Información Geográfica (https://centrodedescargas.cnig.es/CentroDescargas/index.jsp)

Technical notes:

Gridded data

File nomenclature:
./region_projection_theme/hisdac_es_theme_variable_version_resolution[m][_year].tif

Regions:
- all: complete territory of Spain
- can: Canary Islands only
- ibe: Iberian Peninsula + Ceuta + Melilla

Projections:
- laea: Lambert azimuthal equal area (EPSG:3035)
- regcan: REGCAN95 / UTM zone 28N (EPSG:4083)
- utm: ETRS89 / UTM zone 30N (EPSG:25830)

Themes:
- evolution / evol: multi-temporal physical measurements
- landuse: multi-temporal building counts per land use (i.e., building function) class
- physical / phys: physical building characteristics in 2020
- temporal / temp: temporal characteristics (construction year statistics)

Variables: evolution
- budens: building density (count per grid cell area)
- bufa: building footprint area
- deva: developed area (any grid cell containing at least one building)
- resbufa: residential building footprint area
- resbia: residential building indoor area

Variables: physical
- bia: building indoor area
- bufa: building footprint area
- bunits: number of building units
- dwel: number of dwellings

Variables: temporal
- mincoy: minimum construction year per grid cell
- maxcoy: maximum construction year per grid cell
- meancoy: mean construction year per grid cell
- medcoy: median construction year per grid cell
- modecoy: mode (most frequent) construction year per grid cell
- varcoy: variety of construction years per grid cell

Variable: landuse
Counts of buildings per grid cell and land use type.

Municipality-level data

- hisdac_es_municipality_stats_multitemporal_longform_v1.csv: This CSV file contains the zonal sums of the gridded surfaces (e.g., number of buildings per year and municipality) in long form. Note that a value of 0 for the year attribute denotes the statistics for records without construction year information.
- hisdac_es_municipality_stats_multitemporal_wideform_v1.csv: This CSV file contains the zonal sums of the gridded surfaces (e.g., number of buildings per year and municipality) in wide form. Note that a value of 0 for the year suffix denotes the statistics for records without construction year information.
- hisdac_es_municipality_stats_completeness_v1.csv: This CSV file contains the missingness rates (in %) of the building attributes per municipality, ranging from 0.0 (attribute exists for all buildings) to 100.0 (attribute exists for none of the buildings) in a given municipality.

Column names for the completeness statistics tables:
- NATCODE: National municipality identifier*
- num_total: number of buildings per municipality
- perc_bymiss: Percentage of buildings with missing built year (construction year)
- perc_lumiss: Percentage of buildings with missing landuse attribute
- perc_luother: Percentage of buildings with landuse type "other"
- perc_num_floors_miss: Percentage of buildings without valid number of floors attribute
- perc_num_dwel_miss: Percentage of buildings without valid number of dwellings attribute
- perc_num_bunits_miss: Percentage of buildings without valid number of building units attribute
- perc_offi_area_miss: Percentage of buildings without valid official area (building indoor area, BIA) attribute
- perc_num_dwel_and_num_bunits_miss: Percentage of buildings missing both number of dwellings and number of building units attribute

The same statistics are available as a GeoPackage file including municipality polygons in Lambert azimuthal equal area (EPSG:3035).

*From the NATCODE, other regional identifiers can be derived as follows:
NATCODE: 34 01 04 04001
- Country: 34
- Comunidad autónoma (CA_CODE): 01
- Province (PROV_CODE): 04
- LAU code: 04001 (province + municipality code)
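The NATCODE layout described above can be sketched in a few lines of Python. This is only an illustration, assuming the identifier is stored as an 11-digit value without the spaces shown in the example; the function name split_natcode and the dictionary keys are hypothetical and not part of the dataset.

```python
def split_natcode(natcode):
    """Split an 11-digit NATCODE into the identifiers listed in the description.

    Assumption: the code is stored without spaces, e.g. "34010404001".
    """
    code = str(natcode).strip()
    if len(code) != 11 or not code.isdigit():
        raise ValueError(f"expected an 11-digit NATCODE, got {natcode!r}")
    return {
        "country": code[0:2],    # 34
        "ca_code": code[2:4],    # Comunidad autónoma (CA_CODE)
        "prov_code": code[4:6],  # Province (PROV_CODE)
        "lau_code": code[6:11],  # LAU code: province + municipality code
    }


# The example from the description: NATCODE 34 01 04 04001
parts = split_natcode("34010404001")
print(parts)
```

Keeping the parts as strings preserves leading zeros such as the "01" of the CA_CODE, which would be lost if the pieces were converted to integers.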
Award ID(s):
1924670 2121976
PAR ID:
10482844
Publisher / Repository:
figshare
Subject(s) / Keyword(s):
Urban geography Urban analysis and development History and theory of the built environment (excl. architecture) Other built environment and design not elsewhere classified Geoinformatics not elsewhere classified Spatial data and applications Geospatial information systems and geospatial data modelling Time series and spatial modelling
Format(s):
Medium: X Size: 3869477411 Bytes
Sponsoring Org:
National Science Foundation
More Like this
  1. These GeoTIFF files represent road network statistics for each core-based statistical area (CBSA) in the conterminous U.S., within grid cells of 1 km x 1 km. The road network statistics are based on the National Transportation Dataset (USGS-NTD) v2019. These statistics include:
     - gridcell_stats_azimuthvariety_1km_all_cbsas.tif: The number of unique road angles (azimuth / orientation) in bins of 10 degrees per 1 sqkm grid cell.
     - gridcell_stats_deadendrate_1km_all_cbsas.tif: The proportion of dead ends (nodes of degree 1) of all nodes per 1 sqkm grid cell.
     - gridcell_stats_kmroad_1km_all_cbsas.tif: The approximate total road network length per 1 sqkm grid cell. This is based on the road segment length appended to each road segment centroid and may be biased for very long road segments.
     - gridcell_stats_meandegree_1km_all_cbsas.tif: The average nodal degree of all nodes per 1 sqkm grid cell.
     - gridcell_stats_meangriddedness_1km_all_cbsas.tif: The average griddedness of all nodes per 1 sqkm grid cell.
     - gridcell_stats_nodedensity_1km_all_cbsas.tif: The number of nodes per 1 sqkm grid cell.
     - gridcell_stats_nodesperkmroad_1km_all_cbsas.tif: The number of nodes per km road within each 1 sqkm grid cell.
     - gridcell_stats_firstbuiltup_1km_all_cbsas.tif: The approximate settlement age per 1 sqkm grid cell. This layer is derived from the HISDAC-US First-built-up year (FBUY) layer, which is derived from Zillow's Transaction and Assessment Dataset (ZTRAX). The FBUY data is available here: Leyk, Stefan; Uhl, Johannes H., 2018, "FBUY.tar.gz", Historical settlement composite layer for the U.S. 1810 - 2015, https://doi.org/10.7910/DVN/PKJ90M/BOA5YC, Harvard Dataverse, V2
     - gridcell_stats_1km_all_cbsas_arcmap10.8.mxd: ESRI ArcMap 10.8 MXD file for quick visualization of the gridded surfaces.
     Spatial resolution: 1x1km
     Spatial reference: SR-ORG:7480, USA_Contiguous_Albers_Equal_Area_Conic_USGS_version
     Source data: USGS-NTD, HISDAC-US.
     File format: GeoTIFF.
     Spatial coverage of the road network metrics: All CBSAs in the conterminous U.S.
     Spatial coverage of the "first built-up year" surface: all U.S. counties that are covered by the HISDAC-US historical settlement layers. This dataset includes around 2,700 U.S. counties. In the remaining counties, construction year coverage in the underlying ZTRAX data (Zillow Transaction and Assessment Dataset) is low. See Leyk & Uhl (2018) for details.
     All data created by Johannes H. Uhl, University of Colorado Boulder, USA.
     Code available at https://github.com/johannesuhl/USRoadNetworkEvolution.
     References:
     Burghardt, K., Uhl, J., Lerman, K., & Leyk, S. (2022). Road Network Evolution in the Urban and Rural United States Since 1900. Computers, Environment and Urban Systems.
     Leyk, S., & Uhl, J. H. (2018). HISDAC-US, historical settlement data compilation for the conterminous United States over 200 years. Scientific data, 5(1), 1-14. DOI: https://doi.org/10.1038/sdata.2018.175
  2. An ESRI Shapefile containing spatially generalized built-up areas for each decade from 1900 to 2010, and for 2015, for each core-based statistical area (CBSA, i.e., metropolitan and micropolitan statistical area) in the conterminous United States. These areas are derived from historical settlement layers from the Historical settlement data compilation for the U.S. (HISDAC-US, Leyk & Uhl 2018). See Burghardt et al. (2022) for details on the data processing.
     Additionally, there is a CSV file (HISDAC-US_patch_statistics.csv) containing the counts of built-up property records (BUPR) and built-up property locations (BUPL), as well as total building indoor area (BUI) and built-up area (BUA) per CBSA, year, and patch, extracted from the HISDAC-US data (Uhl & Leyk 2018, Uhl et al. 2021). This CSV can be joined to the shapefile (column uid2) by concatenating the columns msaid_year_Id.
     Spatial coverage: all CBSAs that are covered by the HISDAC-US historical settlement layers. This dataset includes around 2,700 U.S. counties. In the remaining counties, construction year coverage in the underlying ZTRAX data (Zillow Transaction and Assessment Dataset) is low. See Uhl et al. (2021) for details.
     All data created by Johannes H. Uhl, University of Colorado Boulder, USA.
     Code available at https://github.com/johannesuhl/USRoadNetworkEvolution.
     References:
     Burghardt, K., Uhl, J., Lerman, K., & Leyk, S. (2022). Road Network Evolution in the Urban and Rural United States Since 1900. Computers, Environment and Urban Systems.
     Leyk, S., & Uhl, J. H. (2018). HISDAC-US, historical settlement data compilation for the conterminous United States over 200 years. Scientific data, 5(1), 1-14. DOI: https://doi.org/10.1038/sdata.2018.175
     Uhl, J. H., Leyk, S., McShane, C. M., Braswell, A. E., Connor, D. S., & Balk, D. (2021). Fine-grained, spatiotemporal datasets measuring 200 years of land development in the United States. Earth system science data, 13(1), 119-153. DOI: https://doi.org/10.5194/essd-13-119-2021
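The join described for the patch statistics CSV can be sketched as follows. This is a minimal illustration under stated assumptions: the key is taken to be the three columns msaid, year, and Id joined by underscores (suggested by the notation msaid_year_Id, but not confirmed by the description), and the sample rows, the bupr column name, and the helper patch_uid are invented for the example.

```python
import csv
import io


def patch_uid(row, sep="_"):
    """Build the join key from the msaid, year and Id columns.

    Assumption: the shapefile column uid2 uses an underscore separator.
    """
    return sep.join(str(row[col]) for col in ("msaid", "year", "Id"))


# Hypothetical rows standing in for HISDAC-US_patch_statistics.csv
sample_csv = io.StringIO(
    "msaid,year,Id,bupr\n"
    "10180,1950,3,120\n"
    "10180,1960,3,180\n"
)

# Index the statistics by the concatenated key so that each shapefile
# feature can look up its record through its uid2 value.
stats_by_uid = {patch_uid(row): row for row in csv.DictReader(sample_csv)}

print(sorted(stats_by_uid))
```

It is worth checking a handful of uid2 values from the shapefile against the generated keys before joining, since the separator and any zero padding are not documented here.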
  3. Tabulated statistics of road networks at the level of intersections and for built-up areas for each decade from 1900 to 2010, and for 2015, for each core-based statistical area (CBSA, i.e., metropolitan and micropolitan statistical area) in the conterminous United States. These areas are derived from historical road networks developed by Johannes Uhl. See Burghardt et al. (2022) for details on the data processing.
     Spatial coverage: all CBSAs that are covered by the HISDAC-US historical settlement layers. This dataset includes around 2,700 U.S. counties. In the remaining counties, construction year coverage in the underlying ZTRAX data (Zillow Transaction and Assessment Dataset) is low. See Uhl et al. (2021) for details.
     All data created by Keith A. Burghardt, USC Information Sciences Institute, USA.
     Codebook: these CBSA statistics are stratified by degree of aggregation.
     - CBSA_stats_diffFrom1950: Change in CBSA-aggregated patch statistics between 1950 and 2015
     - CBSA_stats_by_decade: CBSA-aggregated patch statistics for each decade from 1900-2010 plus 2015
     - CBSA_stats_by_decade: CBSA-aggregated cumulative patch statistics for each decade from 1900-2010 plus 2015. All roads created up to a given decade are used for calculating statistics.
     - Patch_stats_by_decade: Individual patch statistics for each decade from 1900-2010 plus 2015
     - Patch_stats_by_decade: Individual cumulative patch statistics for each decade from 1900-2010 plus 2015. All roads created up to a given decade are used for calculating statistics.
     The statistics are the following:
     - msaid: CBSA code
     - id: (if patch statistics) arbitrary int unique to each patch within the CBSA that year
     - year: year of statistics
     - pop: population within all CBSA counties
     - patch_bupr: built-up property records (BUPR) within a patch (or sum of patches within CBSA)
     - patch_bupl: built-up property locations (BUPL) within a patch (or sum of patches within CBSA)
     - patch_bua: built-up area (BUA) within a patch (or sum of patches within CBSA)
     - all_bupr: Same as above but for all data in 2015 regardless of whether properties were in patches
     - all_bupl: Same as above but for all data in 2015 regardless of whether properties were in patches
     - all_bua: Same as above but for all data in 2015 regardless of whether properties were in patches
     - num_nodes: number of nodes (intersections)
     - num_edges: number of edges (roads between intersections)
     - distance: total road length in km
     - k_mean: mean number of undirected roads per intersection
     - k1: fraction of nodes with degree 1
     - k4plus: fraction of nodes with degree 4+
     - bearing: histogram of different bearings between intersections
     - entropy: entropy of bearing histogram
     - mean_local_gridness: Griddedness used in text
     - mean_local_gridness_max: Same as griddedness used in text but assumes we can have up to 3 quadrilaterals for degree 3 (maximum possible, although intersections will not necessarily create right angles)
     Code available at https://github.com/johannesuhl/USRoadNetworkEvolution.
     References:
     Burghardt, K., Uhl, J., Lerman, K., & Leyk, S. (2022). Road Network Evolution in the Urban and Rural United States Since 1900. Computers, Environment and Urban Systems.
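To make the degree-based and entropy columns above concrete, the following Python sketch computes them for a toy network. It is not the authors' code (that is in the linked repository); it assumes an undirected edge list of (node, node) pairs and Shannon entropy with the natural logarithm, since the description does not state the log base. The names degree_stats and histogram_entropy are hypothetical.

```python
import math
from collections import Counter


def degree_stats(edges):
    """Compute num_nodes, num_edges, k_mean, k1 and k4plus from an edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    if n == 0:
        raise ValueError("edge list is empty")
    return {
        "num_nodes": n,
        "num_edges": len(edges),
        "k_mean": sum(degree.values()) / n,                       # mean roads per intersection
        "k1": sum(1 for d in degree.values() if d == 1) / n,      # dead-end fraction
        "k4plus": sum(1 for d in degree.values() if d >= 4) / n,  # degree 4+ fraction
    }


def histogram_entropy(counts):
    """Shannon entropy (natural log, an assumption) of a histogram of counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Toy network: one four-way intersection (node 0) with four dead ends.
stats = degree_stats([(0, 1), (0, 2), (0, 3), (0, 4)])
print(stats)

# Four equally populated bearing bins give the maximum entropy for four bins.
print(histogram_entropy([5, 5, 5, 5]))
```

Nodes without any incident edge do not appear in an edge list, so they are not counted here; whether the published statistics include such nodes is not stated in the description.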
  4. Data files were used in support of the research paper titled "Mitigating RF Jamming Attacks at the Physical Layer with Machine Learning", which has been submitted to the IET Communications journal.

     All data was collected using the SDR implementation shown here: https://github.com/mainland/dragonradio/tree/iet-paper. Particularly for antenna state selection, the files developed for this paper are located in 'dragonradio/scripts/':
     - 'ModeSelect.py': class used to define the antenna state selection algorithm
     - 'standalone-radio.py': SDR implementation for normal radio operation with reconfigurable antenna
     - 'standalone-radio-tuning.py': SDR implementation for hyperparameter tuning
     - 'standalone-radio-onmi.py': SDR implementation for omnidirectional mode only

     Authors: Marko Jacovic, Xaime Rivas Rey, Geoffrey Mainland, Kapil R. Dandekar
     Contact: krd26@drexel.edu

     Top-level directories and content are described below. Detailed descriptions of the experiments performed are provided in the paper.

     classifier_training: files used for training classifiers that are integrated into the SDR platform
     - 'logs-8-18' directory contains OTA SDR collected log files for each jammer type and under normal operation (including congested and weaklink states)
     - 'classTrain.py' is the main parser for training the classifiers
     - 'trainedClassifiers' contains the output classifiers generated by 'classTrain.py'

     post_processing_classifier: contains logs of online classifier outputs and processing script
     - 'class' directory contains .csv logs of each RTE and OTA experiment for each jamming and operation scenario
     - 'classProcess.py' parses the log files and provides a classification report and confusion matrix for each multi-class and binary classifier for each observed scenario, found in 'results->classifier_performance'

     post_processing_mgen: contains MGEN receiver logs and parser
     - 'configs' contains JSON files to be used with the parser for each experiment
     - 'mgenLogs' contains MGEN receiver logs for each OTA and RTE experiment described. Within each experiment, logs are separated by 'mit' for mitigation used, 'nj' for no jammer, and 'noMit' for no mitigation technique used. File names take the form *_cj_* for constant jammer, *_pj_* for periodic jammer, *_rj_* for reactive jammer, and *_nj_* for no jammer. Performance figures are found in 'results->mitigation_performance'

     ray_tracing_emulation: contains files related to the Drexel area, Art Museum, and UAV Drexel area validation RTE studies.
     - Directory contains a detailed 'readme.txt' for understanding.
     - Please note: the processing files and data logs present in the 'validation' folder were developed by Wolfe et al. and should be cited as such, unless explicitly stated differently.
       S. Wolfe, S. Begashaw, Y. Liu and K. R. Dandekar, "Adaptive Link Optimization for 802.11 UAV Uplink Using a Reconfigurable Antenna," MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM), 2018, pp. 1-6, doi: 10.1109/MILCOM.2018.8599696.

     results: contains results obtained from the study
     - 'classifier_performance' contains .txt files summarizing binary and multi-class performance of the online SDR system. Files obtained using 'post_processing_classifier.'
     - 'mitigation_performance' contains figures generated by 'post_processing_mgen.'
     - 'validation' contains RTE and OTA performance comparison obtained by 'ray_tracing_emulation->validation->matlab->outdoor_hover_plots.m'

     tuning_parameter_study: contains the OTA log files for the antenna state selection hyperparameter study
     - 'dataCollect' contains a folder for each jammer considered in the study, and inside each folder there is a CSV file corresponding to a different configuration of the learning parameters of the reconfigurable antenna. The configuration selected was the one that performed the best across all these experiments and is described in the paper.
     - 'data_summary.txt': this file contains the summaries from all the CSV files for convenience.
  5. Abstract. Multi-temporal measurements quantifying the changes to the Earth's surface are critical for understanding many natural, anthropogenic, and social processes. Researchers typically use remotely sensed Earth observation data to quantify and characterize such changes in land use and land cover (LULC). However, such data sources are limited in their availability prior to the 1980s. While an observational window of 40 to 50 years is sufficient to study most recent LULC changes, processes such as urbanization, land development, and the evolution of urban and coupled nature–human systems often operate over longer time periods covering several decades or even centuries. Thus, to quantify and better understand such processes, alternative historical–geospatial data sources are required that extend farther back in time. However, such data are rare, and processing is labor-intensive, often involving manual work. To overcome the resulting lack in quantitative knowledge of urban systems and the built environment prior to the 1980s, we leverage cadastral data with rich thematic property attribution, such as building usage and construction year. We scraped, harmonized, and processed over 12 000 000 building footprints including construction years to create a multi-faceted series of gridded surfaces, describing the evolution of human settlements in Spain from 1900 to 2020, at 100 m spatial and 5-year temporal resolution. These surfaces include measures of building density, built-up intensity, and built-up land use. We evaluated our data against a variety of data sources including remotely sensed human settlement data and land cover data, model-based historical land use depictions, and historical maps and historical aerial imagery and find high levels of agreement. 
This new data product, the Historical Settlement Data Compilation for Spain (HISDAC-ES), is publicly available (https://doi.org/10.6084/m9.figshare.22009643, Uhl et al., 2023a) and represents a rich source for quantitative, long-term analyses of the built environment and related processes over large spatial and temporal extents and at fine resolutions. 