

Title: The Majority Theorem for the Single (p = 1) Median Problem and Local Spatial Autocorrelation

Except for about a half dozen papers, virtually all (co)authored by Griffith, the existing literature lacks much content about the interface between spatial optimization, a popular form of geographic analysis, and spatial autocorrelation, a fundamental property of georeferenced data. The popular p‐median location‐allocation problem highlights this situation: the empirical geographic distribution of demand virtually always exhibits positive spatial autocorrelation. This property of geospatial data offers additional overlooked information for solving such spatial optimization problems when it actually relates to their solutions. With a proof‐of‐concept outlook, this paper articulates connections between the well‐known Majority Theorem of the 1‐median minisum problem and local indices of spatial autocorrelation; the LISA statistics appear to be the more useful of these latter statistics because they better embrace negative spatial autocorrelation. The relationship articulation outlined here results in the positing of a new proposition labeled the egalitarian theorem.
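As a rough illustration of the two ingredients named in the abstract, the sketch below checks the Majority Theorem on a toy example (a demand point holding at least half of the total weight is a minisum 1‐median) and computes a local Moran statistic for the same weights. It is not the paper's method: the function names, the vertex‐restricted brute‐force search, the Euclidean distances, and the row‐standardized chain neighborhood are all assumptions made for this example.

```python
import math

def one_median(points, weights):
    """Brute-force, vertex-restricted 1-median (assumed Euclidean distance):
    index j minimising sum_i w_i * d(i, j)."""
    def cost(j):
        return sum(w * math.dist(points[j], p) for p, w in zip(points, weights))
    return min(range(len(points)), key=cost)

def majority_node(weights):
    """Index of the first node holding at least half the total weight, else None."""
    total = sum(weights)
    for j, w in enumerate(weights):
        if 2 * w >= total:
            return j
    return None

def local_moran(values, neighbors):
    """Local Moran's I_i = z_i * (mean of z_j over the neighbours of i),
    with z standardised by the population standard deviation.
    Every location is assumed to have at least one neighbour."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    z = [(v - mean) / sd for v in values]
    return [z[i] * sum(z[j] for j in neighbors[i]) / len(neighbors[i])
            for i in range(n)]

# Hypothetical toy data: five demand points on a line, one dominant weight.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
weights = [1, 1, 1, 1, 6]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

print(majority_node(weights), one_median(points, weights))
print(local_moran(weights, chain))
```

In this toy case the majority node and the brute‐force 1‐median coincide, and that node has a negative local Moran value because its heavy weight sits next to a light neighbor. This is only one constructed example of the kind of link between negative local spatial autocorrelation and the 1‐median that the abstract describes.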

 
Award ID(s):
1951344
NSF-PAR ID:
10363767
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Geographical Analysis
Volume:
55
Issue:
1
ISSN:
0016-7363
Page Range / eLocation ID:
p. 107-124
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Both historically and in terms of practiced academic organization, the anticipation should be that a flourishing synergistic interface exists between statistics and operations research in general, and between spatial statistics/econometrics and spatial optimization in particular. Unfortunately, for the most part, this expectation is false. The purpose of this paper is to address this existential missing link by focusing on the beneficial contributions of spatial statistics to spatial optimization, via spatial autocorrelation (i.e., dis/similar attribute values tend to cluster together on a map), in order to encourage considerably more future collaboration and interaction between contributors to their two parent bodies of knowledge. The key basic statistical concept in this pursuit is the median in its bivariate form, with special reference to the global and to sets of regional spatial medians. One-dimensional examples illustrate situations that the narrative then extends to two-dimensional illustrations, which, in turn, connects these treatments to the spatial statistics centrography theme. Because of computational time constraints (reported results include some for timing experiments), the summarized analysis restricts attention to problems involving one global and two or three regional spatial medians. The fundamental and foundational spatial, statistical, conceptual tool employed here is spatial autocorrelation: geographically informed sampling designs—which acknowledge a non-random mixture of geographic demand weight values that manifests itself as local, homogeneous, spatial clusters of these values—can help spatial optimization techniques determine the spatial optima, at least for location-allocation problems. A valuable discovery by this study is that existing but ignored spatial autocorrelation latent in georeferenced demand point weights undermines spatial optimization algorithms. 
All in all, this paper should help initiate a dissipation of the existing isolation between statistics and operations research, hopefully inspiring substantially more collaborative work by their professionals in the future. 
  2. Abstract

    Geospatial data conflation is the process of combining multiple datasets about a geographic phenomenon to produce a single, richer dataset. It has received increased research attention due to its many applications in map making, transportation, planning, and temporal geospatial analyses, among many others. One approach to conflation, attempted from the outset in the literature, is the use of optimization‐based conflation methods. Conflation is treated as a natural optimization problem of minimizing the total number of discrepancies while finding corresponding features from two datasets. Optimization‐based conflation has several advantages over traditional methods including conciseness, being able to find an optimal solution, and ease of implementation. However, current optimization‐based conflation methods are also limited. A main shortcoming with current optimized conflation models (and other traditional methods as well) is that they are often too weak and cannot utilize the spatial context in each dataset while matching corresponding features. In particular, current optimal conflation models match a feature to targets independently from other features and therefore treat each GIS dataset as a collection of unrelated elements, reminiscent of the spaghetti GIS data model. Important contextual information such as the connectivity between adjacent elements (such as roads) is neglected during the matching. Consequently, such models may produce topologically inconsistent results. In this article, we address this issue by introducing new optimization‐based conflation models with structural constraints to preserve the connectivity and contiguity relation among features. The model is implemented using integer linear programming and compared with traditional spaghetti‐style models on multiple test datasets. Experimental results show that the new element connectivity (ec‐bimatching) model reduces false matches and consistently outperforms traditional models.

     
  3. Abstract Aim

    Patterns of genetic diversity within species’ ranges can reveal important insights into effects of past climate on species’ biogeography and current population dynamics. While numerous biogeographic hypotheses have been proposed to explain patterns of genetic diversity within species’ ranges, formal comparisons and rigorous statistical tests of these hypotheses remain rare. Here, we compared seven hypotheses for their abilities to describe the geographic pattern of two metrics of genetic diversity in balsam poplar (Populus balsamifera), a northern North American tree species.

    Location

    North America.

    Taxon

    Balsam poplar (Populus balsamifera L.).

    Methods

    We compared seven hypotheses, representing effects of past climate and current range position, for their ability to describe the geographic pattern of expected heterozygosity and per cent polymorphic loci across 85 populations of balsam poplar. We tested each hypothesis using spatial and non‐spatial least‐squares regression to assess the importance of spatial autocorrelation on model performance.

    Results

    We found that both expected heterozygosity and per cent polymorphic loci could best be explained by the current range position and genetic structure of populations within the contemporary range. Genetic diversity showed a clear gradient of being highest near the geographic and climatic range centre and lowest near range edges. Hypotheses accounting for the effects of past climate (e.g. past climatic suitability, distance from the southern edge), in contrast, had comparatively little support. Model ranks were similar among spatial and non‐spatial models, but residuals of all non‐spatial models were significantly autocorrelated, violating the assumption of independence in least‐squares regression.

    Main conclusions

    Our work adds strong support for the “Central‐Periphery Hypothesis” as providing a predictive framework for understanding the forces structuring genetic diversity across species’ ranges, and illustrates the value of applying a robust comparative model selection framework and accounting for spatial autocorrelation when comparing biogeographic models of genetic diversity.

     
  4. Abstract

    Human commensal species such as rodent pests are often widely distributed across cities and threaten both infrastructure and public health. Spatially explicit population genomic methods provide insights into movements for cryptic pests that drive evolutionary connectivity across multiple spatial scales. We examined spatial patterns of neutral genomewide variation in brown rats (Rattus norvegicus) across Manhattan, New York City (NYC), using 262 samples and 61,401 SNPs to understand (i) relatedness among nearby individuals and the extent of spatial genetic structure in a discrete urban landscape; (ii) the geographic origin of NYC rats, using a large, previously published data set of global rat genotypes; and (iii) heterogeneity in gene flow across the city, particularly deviations from isolation by distance. We found that rats separated by ≤200 m exhibit strong spatial autocorrelation (r = .3, p = .001) and the effects of localized genetic drift extend to a range of 1,400 m. Across Manhattan, rats exhibited a homogeneous population origin from rats that likely invaded from Great Britain. While traditional approaches identified a single evolutionary cluster with clinal structure across Manhattan, recently developed methods (e.g., fineSTRUCTURE, sPCA, EEMS) provided evidence of reduced dispersal across the island's less residential Midtown region resulting in fine‐scale genetic structuring (FST = 0.01) and two evolutionary clusters (Uptown and Downtown Manhattan). Thus, while some urban populations of human commensals may appear to be continuously distributed, landscape heterogeneity within cities can drive differences in habitat quality and dispersal, with implications for the spatial distribution of genomic variation, population management and the study of widely distributed pests.

     
  5. Tree-form sequential decision making (TFSDM) extends classical one-shot decision making by modeling tree-form interactions between an agent and a potentially adversarial environment. It captures the online decision-making problems that each player faces in an extensive-form game, as well as Markov decision processes and partially-observable Markov decision processes where the agent conditions on observed history. Over the past decade, there has been considerable effort into designing online optimization methods for TFSDM. Virtually all of that work has been in the full-feedback setting, where the agent has access to counterfactuals, that is, information on what would have happened had the agent chosen a different action at any decision node. Little is known about the bandit setting, where that assumption is reversed (no counterfactual information is available), despite this latter setting being well understood for almost 20 years in one-shot decision making. In this paper, we give the first algorithm for the bandit linear optimization problem for TFSDM that offers both (i) linear-time iterations (in the size of the decision tree) and (ii) O(√T) cumulative regret in expectation compared to any fixed strategy, at all times T. This is made possible by new results that we derive, which may have independent uses as well: 1) geometry of the dilated entropy regularizer, 2) autocorrelation matrix of the natural sampling scheme for sequence-form strategies, 3) construction of an unbiased estimator for linear losses for sequence-form strategies, and 4) a refined regret analysis for mirror descent when using the dilated entropy regularizer.