

Title: Focal accuracy estimates and landscape metrics for built-up areas from the GHS-BUILT R2018A in Massachusetts, USA.
These files are supplementary data for the publication: Uhl JH & Leyk S (2022). "Assessing the relationship between morphology and mapping accuracy of built-up areas derived from global human settlement data" (https://doi.org/10.1080/15481603.2022.2131192). Each GeoPackage (GPKG) file contains a set of point locations (in EPSG:3857) attributed with focal accuracy metrics of the GHS-BUILT R2018A epochs 1975 and 2014, calculated within different levels of spatial support (i.e., focal window size) and for different analytical units (i.e., 30 m grid cells and blocks of 3x3 grid cells). Moreover, each location is attributed with focal landscape metrics of built-up areas, calculated in the same focal windows using the software Fragstats. These landscape metrics are calculated from both the GHS built-up areas and the reference built-up areas. Reference built-up areas were derived from the Multi-temporal building footprint database for 33 U.S. counties (MTBF-33). These datasets can be used for spatially explicit predictive modeling of the GHS-BUILT R2018A data accuracy, using landscape metrics as predictor variables.
File nomenclature:
lsm_ref_accuracy_sample_2014_1000.gpkg: landscape metrics calculated from the reference built-up areas, for the epoch 2014, using a square focal window of 1,000 m x 1,000 m.
lsm_ghs_accuracy_sample_1975_10000.gpkg: landscape metrics calculated from the GHS built-up areas, for the epoch 1975, using a square focal window of 10,000 m x 10,000 m.
Data processing: Johannes H. Uhl, University of Colorado Boulder (USA), 2020-2022.
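The file nomenclature above can be unpacked programmatically. The following is a minimal sketch, assuming every file follows the pattern `lsm_<source>_accuracy_sample_<epoch>_<window>.gpkg` seen in the two examples; the function name `parse_gpkg_name` and the field names it returns are illustrative and not part of the dataset.

```python
import re
from typing import NamedTuple

# Pattern inferred from the two example file names (an assumption).
_NAME_RE = re.compile(r"^lsm_(ref|ghs)_accuracy_sample_(\d{4})_(\d+)\.gpkg$")


class GpkgName(NamedTuple):
    source: str    # "ref" = reference built-up areas, "ghs" = GHS built-up areas
    epoch: int     # GHS-BUILT epoch, e.g. 1975 or 2014
    window_m: int  # side length of the square focal window, in metres


def parse_gpkg_name(filename: str) -> GpkgName:
    """Split a file name into source, epoch, and focal window size."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    source, epoch, window = match.groups()
    return GpkgName(source, int(epoch), int(window))


print(parse_gpkg_name("lsm_ref_accuracy_sample_2014_1000.gpkg"))
```

Once a file has been identified this way, it could be opened with any GeoPackage reader (for example `geopandas.read_file`); the attribute column names are not listed in this record, so they are left out of the sketch.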
Award ID(s):
1924670
PAR ID:
10482845
Author(s) / Creator(s):
;
Publisher / Repository:
figshare
Date Published:
Subject(s) / Keyword(s):
Urban analysis and development Photogrammetry and remote sensing Landscape ecology
Format(s):
Medium: X Size: 477894312 Bytes
Size(s):
477894312 Bytes
Sponsoring Org:
National Science Foundation
More Like this
  1. An ESRI Shapefile containing spatially generalized built-up areas for each decade from 1900 to 2010, and for 2015, for each core-based statistical area (CBSA, i.e., metropolitan and micropolitan statistical area) in the conterminous United States. These areas are derived from historical settlement layers from the Historical settlement data compilation for the U.S. (HISDAC-US, Leyk & Uhl 2018). See Burghardt et al. (2022) for details on the data processing. Additionally, there is a CSV file (HISDAC-US_patch_statistics.csv) containing the counts of built-up property records (BUPR) and built-up property locations (BUPL), as well as total building indoor area (BUI) and built-up area (BUA) per CBSA, year, and patch, extracted from the HISDAC-US data (Uhl & Leyk 2018, Uhl et al. 2021). This CSV can be joined to the shapefile (column uid2) by concatenating the columns msaid, year, and Id (msaid_year_Id). Spatial coverage: all CBSAs that are covered by the HISDAC-US historical settlement layers. This dataset includes around 2,700 U.S. counties. In the remaining counties, construction year coverage in the underlying ZTRAX data (Zillow Transaction and Assessment Dataset) is low. See Uhl et al. (2021) for details. All data created by Johannes H. Uhl, University of Colorado Boulder, USA. Code available at https://github.com/johannesuhl/USRoadNetworkEvolution.
     References:
     Burghardt, K., Uhl, J., Lerman, K., & Leyk, S. (2022). Road Network Evolution in the Urban and Rural United States Since 1900. Computers, Environment and Urban Systems.
     Leyk, S., & Uhl, J. H. (2018). HISDAC-US, historical settlement data compilation for the conterminous United States over 200 years. Scientific data, 5(1), 1-14. DOI: https://doi.org/10.1038/sdata.2018.175
     Uhl, J. H., Leyk, S., McShane, C. M., Braswell, A. E., Connor, D. S., & Balk, D. (2021). Fine-grained, spatiotemporal datasets measuring 200 years of land development in the United States. Earth system science data, 13(1), 119-153. DOI: https://doi.org/10.5194/essd-13-119-2021
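The join between the patch statistics CSV and the shapefile described in item 1 can be sketched as follows. This is an illustration only: it assumes the key is formed by joining the msaid, year, and Id values with underscores (as the notation "msaid_year_Id" suggests), and the sample rows and the `BUPR` column header are invented stand-ins for the real file.

```python
import csv
import io


def make_uid2(row: dict) -> str:
    """Build the join key from the msaid, year and Id columns.

    Assumption: the three values are joined with underscores.
    """
    return f"{row['msaid']}_{row['year']}_{row['Id']}"


# Tiny made-up stand-in for HISDAC-US_patch_statistics.csv (values are invented).
sample_csv = io.StringIO(
    "msaid,year,Id,BUPR\n"
    "10100,1900,1,12\n"
    "10100,1900,2,7\n"
)

# Index the CSV rows by the constructed key so they can be looked up per patch.
stats_by_uid2 = {make_uid2(row): row for row in csv.DictReader(sample_csv)}
print(sorted(stats_by_uid2))
```

Each shapefile feature's uid2 value can then be used to look up its statistics row in `stats_by_uid2`.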
  2. These GeoTIFF files represent road network statistics for each core-based statistical area (CBSA) in the conterminous U.S., within grid cells of 1 km x 1 km. The road network statistics are based on the National transportation dataset (USGS-NTD) v2019. The files are:
     - gridcell_stats_azimuthvariety_1km_all_cbsas.tif: The number of unique road angles (azimuth / orientation), in bins of 10 degrees, per 1 sqkm grid cell.
     - gridcell_stats_deadendrate_1km_all_cbsas.tif: The proportion of dead ends (nodes of degree 1) among all nodes per 1 sqkm grid cell.
     - gridcell_stats_kmroad_1km_all_cbsas.tif: The approximate total road network length per 1 sqkm grid cell. This is based on the road segment length appended to each road segment centroid and may be biased for very long road segments.
     - gridcell_stats_meandegree_1km_all_cbsas.tif: The average nodal degree of all nodes per 1 sqkm grid cell.
     - gridcell_stats_meangriddedness_1km_all_cbsas.tif: The average griddedness of all nodes per 1 sqkm grid cell.
     - gridcell_stats_nodedensity_1km_all_cbsas.tif: The number of nodes per 1 sqkm grid cell.
     - gridcell_stats_nodesperkmroad_1km_all_cbsas.tif: The number of nodes per km of road within each 1 sqkm grid cell.
     - gridcell_stats_firstbuiltup_1km_all_cbsas.tif: The approximate settlement age per 1 sqkm grid cell. This layer is derived from the HISDAC-US First-built-up year (FBUY) layer, which is derived from Zillow's Transaction and Assessment Dataset (ZTRAX). The FBUY data is available here: Leyk, Stefan; Uhl, Johannes H., 2018, "FBUY.tar.gz", Historical settlement composite layer for the U.S. 1810 - 2015, https://doi.org/10.7910/DVN/PKJ90M/BOA5YC, Harvard Dataverse, V2.
     - gridcell_stats_1km_all_cbsas_arcmap10.8.mxd: ESRI ArcMap 10.8 MXD file for quick visualization of the gridded surfaces.
     Spatial resolution: 1 km x 1 km. Spatial reference: SR-ORG:7480, USA_Contiguous_Albers_Equal_Area_Conic_USGS_version. Source data: USGS-NTD, HISDAC-US. File format: GeoTIFF.
     Spatial coverage of the road network metrics: all CBSAs in the conterminous U.S. Spatial coverage of the "first built-up year" surface: all U.S. counties that are covered by the HISDAC-US historical settlement layers. This dataset includes around 2,700 U.S. counties. In the remaining counties, construction year coverage in the underlying ZTRAX data (Zillow Transaction and Assessment Dataset) is low. See Leyk & Uhl (2018) for details. All data created by Johannes H. Uhl, University of Colorado Boulder, USA. Code available at https://github.com/johannesuhl/USRoadNetworkEvolution.
     References:
     Burghardt, K., Uhl, J., Lerman, K., & Leyk, S. (2022). Road Network Evolution in the Urban and Rural United States Since 1900. Computers, Environment and Urban Systems.
     Leyk, S., & Uhl, J. H. (2018). HISDAC-US, historical settlement data compilation for the conterminous United States over 200 years. Scientific data, 5(1), 1-14. DOI: https://doi.org/10.1038/sdata.2018.175
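To make the per-cell statistics listed in item 2 concrete, here is a small sketch that computes several of them for a toy road network inside a single grid cell. It is not the authors' code (that is in the linked GitHub repository); the toy coordinates, the helper names, and the choice to treat orientations as undirected (0 to 180 degrees) are assumptions, and road length is summed directly from the edges rather than from segment centroids.

```python
import math
from collections import Counter

# Toy network in one cell: edges between (x, y) node coordinates in metres.
edges = [
    ((0, 0), (100, 0)),
    ((100, 0), (200, 0)),
    ((100, 0), (100, 100)),
    ((100, 100), (200, 200)),
]


def node_degrees(edges):
    """Count how many edges touch each node."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def azimuth_variety(edges, bin_width=10):
    """Number of distinct orientation bins, ignoring travel direction."""
    bins = set()
    for (x1, y1), (x2, y2) in edges:
        # Azimuth measured clockwise from north, folded into [0, 180).
        azimuth = math.degrees(math.atan2(x2 - x1, y2 - y1)) % 180
        bins.add(int(azimuth // bin_width))
    return len(bins)


degrees = node_degrees(edges)
n_nodes = len(degrees)
road_km = sum(math.dist(a, b) for a, b in edges) / 1000.0

stats = {
    "nodedensity": n_nodes,
    "meandegree": sum(degrees.values()) / n_nodes,
    "deadendrate": sum(1 for k in degrees.values() if k == 1) / n_nodes,
    "kmroad": road_km,
    "nodesperkmroad": n_nodes / road_km,
    "azimuthvariety": azimuth_variety(edges),
}
print(stats)
```

Griddedness is omitted because its definition is given in Burghardt et al. (2022) rather than in this record.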
  3. This CSV file contains geometric and topological road network statistics for the majority of counties in the conterminous U.S. The underlying road network data is the USGS-NTD v2019. These road network data from 2019 were clipped to historical settlement extents obtained from the HISDAC-US dataset. Road network statistics are multi-temporal, calculated in time slices for the years 1810-1900, 1880-1920, 1900-1940, 1920-1960, 1940-1980, 1960-2000, and 1980-2015. The historical built-up areas used to model the historical road networks are derived from historical settlement layers from the Historical settlement data compilation for the U.S. (HISDAC-US, Leyk & Uhl 2018). See Burghardt et al. (2022) for details on the modelling strategy. Spatial coverage: all U.S. counties that are covered by the HISDAC-US historical settlement layers. This dataset includes around 2,700 U.S. counties. In the remaining counties, construction year coverage in the underlying ZTRAX data (Zillow Transaction and Assessment Dataset) is low. See Uhl et al. (2021) for details. All data created by Johannes H. Uhl, University of Colorado Boulder, USA. Code available at https://github.com/johannesuhl/USRoadNetworkEvolution.
     References:
     Burghardt, K., Uhl, J., Lerman, K., & Leyk, S. (2022). Road Network Evolution in the Urban and Rural United States Since 1900. Computers, Environment and Urban Systems.
     Leyk, S., & Uhl, J. H. (2018). HISDAC-US, historical settlement data compilation for the conterminous United States over 200 years. Scientific data, 5(1), 1-14. DOI: https://doi.org/10.1038/sdata.2018.175
     Uhl, J. H., Leyk, S., McShane, C. M., Braswell, A. E., Connor, D. S., & Balk, D. (2021). Fine-grained, spatiotemporal datasets measuring 200 years of land development in the United States. Earth system science data, 13(1), 119-153. DOI: https://doi.org/10.5194/essd-13-119-2021
  4. Tabulated statistics of road networks at the level of intersections and for built-up areas for each decade from 1900 to 2010, and for 2015, for each core-based statistical area (CBSA, i.e., metropolitan and micropolitan statistical area) in the conterminous United States. These areas are derived from historical road networks developed by Johannes Uhl. See Burghardt et al. (2022) for details on the data processing. Spatial coverage: all CBSAs that are covered by the HISDAC-US historical settlement layers. This dataset includes around 2,700 U.S. counties. In the remaining counties, construction year coverage in the underlying ZTRAX data (Zillow Transaction and Assessment Dataset) is low. See Uhl et al. (2021) for details. All data created by Keith A. Burghardt, USC Information Sciences Institute, USA.
     Codebook: these CBSA statistics are stratified by degree of aggregation.
     - CBSA_stats_diffFrom1950: Change in CBSA-aggregated patch statistics between 1950 and 2015
     - CBSA_stats_by_decade: CBSA-aggregated patch statistics for each decade from 1900-2010 plus 2015
     - CBSA_stats_by_decade: CBSA-aggregated cumulative patch statistics for each decade from 1900-2010 plus 2015. All roads created up to a given decade are used for calculating statistics.
     - Patch_stats_by_decade: Individual patch statistics for each decade from 1900-2010 plus 2015
     - Patch_stats_by_decade: Individual cumulative patch statistics for each decade from 1900-2010 plus 2015. All roads created up to a given decade are used for calculating statistics.
     The statistics are the following:
     - msaid: CBSA code
     - id: (if patch statistics) arbitrary int unique to each patch within the CBSA that year
     - year: year of statistics
     - pop: population within all CBSA counties
     - patch_bupr: built-up property records (BUPR) within a patch (or sum of patches within CBSA)
     - patch_bupl: built-up property locations (BUPL) within a patch (or sum of patches within CBSA)
     - patch_bua: built-up area (BUA) within a patch (or sum of patches within CBSA)
     - all_bupr: Same as above but for all data in 2015 regardless of whether properties were in patches
     - all_bupl: Same as above but for all data in 2015 regardless of whether properties were in patches
     - all_bua: Same as above but for all data in 2015 regardless of whether properties were in patches
     - num_nodes: number of nodes (intersections)
     - num_edges: number of edges (roads between intersections)
     - distance: total road length in km
     - k_mean: mean number of undirected roads per intersection
     - k1: fraction of nodes with degree 1
     - k4plus: fraction of nodes with degree 4+
     - bearing: histogram of different bearings between intersections
     - entropy: entropy of bearing histogram
     - mean_local_gridness: Griddedness used in text
     - mean_local_gridness_max: Same as griddedness used in text but assumes we can have up to 3 quadrilaterals for degree 3 (maximum possible, although intersections will not necessarily create right angles)
     Code available at https://github.com/johannesuhl/USRoadNetworkEvolution.
     References:
     Burghardt, K., Uhl, J., Lerman, K., & Leyk, S. (2022). Road Network Evolution in the Urban and Rural United States Since 1900. Computers, Environment and Urban Systems.
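The "entropy of bearing histogram" field in item 4 can be illustrated with a short sketch. It assumes the standard Shannon entropy with the natural logarithm; the record does not state the logarithm base or the number of bearing bins, so both are assumptions here, and `bearing_entropy` is a hypothetical helper name.

```python
import math


def bearing_entropy(histogram):
    """Shannon entropy (in nats) of a histogram of bearing counts."""
    total = sum(histogram)
    if total <= 0:
        raise ValueError("histogram must contain at least one observation")
    # Empty bins contribute nothing to the entropy, so they are skipped.
    probs = [count / total for count in histogram if count > 0]
    return -sum(p * math.log(p) for p in probs)


# Roads concentrated in two bearings versus spread evenly over 36 bins.
concentrated = [50, 0, 0, 50] + [0] * 32
even = [10] * 36
print(bearing_entropy(concentrated), bearing_entropy(even))
```

Under this definition, a network whose roads follow only a few directions yields a low entropy, while roads spread over many directions yield a value close to the logarithm of the number of bins.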
  5. The data provided here accompany the publication "Drought Characterization with GPS: Insights into Groundwater and Reservoir Storage in California" [Young et al. (2024)], which is currently under review with Water Resources Research (as of 28 May 2024). Please refer to the manuscript and its supplemental materials for full details. (A link will be appended following publication.) File formatting information is listed below, followed by a sub-section of the text describing the Geodetic Drought Index calculation.
     The longitude, latitude, and label for grid points are provided in the file "loading_grid_lon_lat".
     Time series for each Geodetic Drought Index (GDI) time scale are provided within "GDI_time_series.zip". The included time scales are the 00- (daily), 1-, 3-, 6-, 12-, 18-, 24-, 36-, and 48-month GDI solutions. Files are formatted as follows. Title: "grid point label L****"_"time scale"_month. File format: ["decimal date" "GDI value"].
     Gridded, epoch-by-epoch solutions for each time scale are provided within "GDI_grids.zip". Files are formatted as follows. Title: GDI_"decimal date"_"time scale"_month. File format: ["longitude" "latitude" "GDI value" "grid point label L****"].
     2.2 GEODETIC DROUGHT INDEX CALCULATION
     We develop the GDI following Vicente-Serrano et al. (2010) and Tang et al. (2023), such that the GDI mimics the derivation of the SPEI, and utilize the log-logistic distribution (further details below). While we apply hydrologic load estimates derived from GPS displacements as the input for this GDI (Figure 1a-d), we note that alternate geodetic drought indices could be derived using other types of geodetic observations, such as InSAR, gravity, strain, or a combination thereof. Therefore, the GDI is a generalizable drought index framework. A key benefit of the SPEI is that it is a multi-scale index, allowing the identification of droughts which occur across different time scales.
     For example, flash droughts (Otkin et al., 2018), which may develop over the period of a few weeks, and persistent droughts (>18 months) may not be observed or fully quantified in a uni-scale drought index framework. However, by adopting a multi-scale approach these signals can be better identified (Vicente-Serrano et al., 2010). Similarly, in the case of this GPS-based GDI, hydrologic drought signals are expected to develop at time scales that are characteristic of both the drought and the source of the load variation (i.e., groundwater versus surface water and their respective drainage basin/aquifer characteristics). Thus, to test a range of time scales, the TWS time series are summarized with a retrospective rolling average window of D (daily with no averaging), 1, 3, 6, 12, 18, 24, 36, and 48 months width (where one month equals 30.44 days). From these time-scale averaged time series, representative compilation window load distributions are identified for each epoch. The compilation window distributions include, for every year, all dates within ±15 days of the epoch in question. This allows a characterization of the estimated loads for each day relative to all past/future loads near that day, in order to bolster the sample size and provide more robust parametric estimates [similar to Ford et al. (2016)]; this is a key difference between our GDI derivation and that presented by Tang et al. (2023). Figure 1d illustrates the representative distribution for 01 December of each year at the grid cell co-located with GPS station P349 for the daily TWS solution. Here all epochs between 16 November and 16 December of each year (red dots) are compiled to form the distribution presented in Figure 1e. This approach allows inter-annual variability in the phase and amplitude of the signal to be retained (which is largely driven by variation in the hydrologic cycle), while removing the primary annual and semi-annual signals.
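The compilation window described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `compilation_window`, the dictionary-based time series, and the handling of a 29 February target (skipped in non-leap years) are all assumptions.

```python
from datetime import date, timedelta


def compilation_window(series, month, day, half_width=15):
    """Collect values within +/- half_width days of (month, day) in every year.

    `series` maps datetime.date -> load value.
    """
    years = {d.year for d in series}
    selected = []
    # Also try the years just outside the record so that windows which
    # straddle a year boundary are not truncated.
    for year in sorted(years | {min(years) - 1, max(years) + 1}):
        try:
            centre = date(year, month, day)
        except ValueError:  # e.g. 29 February in a non-leap year
            continue
        for offset in range(-half_width, half_width + 1):
            d = centre + timedelta(days=offset)
            if d in series:
                selected.append((d, series[d]))
    return selected


# Synthetic daily series covering 2006-2008 (values are placeholders).
start = date(2006, 1, 1)
series = {start + timedelta(days=i): float(i) for i in range(1096)}

window = compilation_window(series, month=12, day=1)
print(len(window), window[0][0], window[-1][0])
```

For a 01 December target, each year contributes the 31 days from 16 November to 16 December, matching the example in the text.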
     Solutions converge for compilation windows >±5 days, and show a minor increase in scatter of the GDI time series for windows of ±3-4 days (below which instability becomes more prevalent). To ensure robust characterization of drought characteristics, we opt for an extended ±15-day compilation window. While Tang et al. (2023) found the log-logistic distribution to be unstable and opted for a normal distribution, we find that, by using the extended compiled distribution, the solutions are stable with negligible differences compared to the use of a normal distribution. Thus, to remain aligned with the SPEI solution, we retain the three-parameter log-logistic distribution to characterize the anomalies. Probability weighted moments for the log-logistic distribution are calculated following Singh et al. (1993) and Vicente-Serrano et al. (2010). The individual moments are calculated following Equation 3. These are then used to calculate the L-moments for the shape, scale, and location parameters of the three-parameter log-logistic distribution (Equations 4 – 6). The probability density function (PDF) and the cumulative distribution function (CDF) are then calculated following Equations 7 and 8, respectively. The inverse Gaussian function is used to transform the CDF from estimates of the parametric sample quantiles to standard normal index values that represent the magnitude of the standardized anomaly. Here, positive/negative values represent greater/lower than normal hydrologic storage. Thus, an index value of -1 indicates that the estimated load is approximately one standard deviation drier than the expected average load on that epoch. *Equations can be found in the main text.
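The standardization step in item 5 can be sketched as below. Because Equations 3 to 8 are not reproduced in this record, the sketch uses the probability-weighted-moment formulas of the SPEI formulation in Vicente-Serrano et al. (2010) as an assumption; the manuscript's exact equations may differ. The symbols `alpha` (scale), `beta` (shape), and `gamma` (location) and all function names are illustrative, and the sample values are invented.

```python
import math
from statistics import NormalDist


def fit_log_logistic(sample):
    """Estimate (alpha, beta, gamma) = (scale, shape, location) from PWMs."""
    data = sorted(sample)
    n = len(data)
    # Probability weighted moments w_0, w_1, w_2 with F_i = (i - 0.35) / n.
    w = [
        sum((1 - (i - 0.35) / n) ** s * x for i, x in enumerate(data, start=1)) / n
        for s in range(3)
    ]
    beta = (2 * w[1] - w[0]) / (6 * w[1] - w[0] - 6 * w[2])
    g = math.gamma(1 + 1 / beta) * math.gamma(1 - 1 / beta)
    alpha = (w[0] - 2 * w[1]) * beta / g
    gamma = w[0] - alpha * g
    return alpha, beta, gamma


def log_logistic_cdf(x, alpha, beta, gamma):
    """CDF of the three-parameter log-logistic distribution."""
    # Assumes alpha and (x - gamma) share a sign, as for values within the fitted range.
    return 1.0 / (1.0 + (alpha / (x - gamma)) ** beta)


def standardized_index(x, params):
    """Map a value to a standard normal index via the fitted CDF."""
    return NormalDist().inv_cdf(log_logistic_cdf(x, *params))


# Invented, positively skewed sample standing in for one compiled distribution.
sample = [1, 2, 3, 4, 5, 7, 9, 12, 16, 25]
params = fit_log_logistic(sample)
for value in (sample[0], sample[-1]):
    print(value, round(standardized_index(value, params), 2))
```

In this sketch the smallest sample value maps to a negative index and the largest to a positive one, consistent with the sign convention described in the text.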