skip to main content


Search for: All records

Creators/Authors contains: "Kim, Joon-Seok"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Benenson, Itzhak (Ed.)
    With the onset of COVID-19 and the resulting shelter in place guidelines combined with remote working practices, human mobility in 2020 has been dramatically impacted. Existing studies typically examine whether mobility in specific localities increases or decreases at specific points in time and relate these changes to certain pandemic and policy events. However, a more comprehensive analysis of mobility change over time is needed. In this paper, we study mobility change in the US through a five-step process using mobility footprint data. (Step 1) Propose the Delta Time Spent in Public Places (ΔTSPP) as a measure to quantify daily changes in mobility for each US county from 2019-2020. (Step 2) Conduct Principal Component Analysis (PCA) to reduce the ΔTSPP time series of each county to lower-dimensional latent components of change in mobility. (Step 3) Conduct clustering analysis to find counties that exhibit similar latent components. (Step 4) Investigate local and global spatial autocorrelation for each component. (Step 5) Conduct correlation analysis to investigate how various population characteristics and behavior correlate with mobility patterns. Results show that by describing each county as a linear combination of the three latent components, we can explain 59% of the variation in mobility trends across all US counties. Specifically, change in mobility in 2020 for US counties can be explained as a combination of three latent components: 1) long-term reduction in mobility, 2) no change in mobility, and 3) short-term reduction in mobility. Furthermore, we find that US counties that are geographically close are more likely to exhibit a similar change in mobility. Finally, we observe significant correlations between the three latent components of mobility change and various population characteristics, including political leaning, population, COVID-19 cases and deaths, and unemployment. We find that our analysis provides a comprehensive understanding of mobility change in response to the COVID-19 pandemic. 
    more » « less
  2. null (Ed.)
  3. null (Ed.)
    In response to the COVID-19 pandemic, there have been various attempts to develop realistic models to both predict the spread of the disease and evaluate policy measures aimed at mitigation. Different models that operate under different parameters and assumptions produce radically different predictions, creating confusion among policy-makers and the general population and limiting the usefulness of the models. This newsletter article proposes a novel ensemble modeling approach that uses representative clustering to identify where existing model predictions of COVID-19 spread agree and unify these predictions into a smaller set of predictions. The proposed ensemble prediction approach is composed of the following stages: (1) the selection of the ensemble components, (2) the imputation of missing predictions for each component, and (3) representative clustering in application to time-series data to determine the degree of agreement between simulation predictions. The results of the proposed approach will produce a set of ensemble model predictions that identify where simulation results converge so that policy-makers and the general public are informed with more comprehensive predictions and the uncertainty among them. 
    more » « less
  4. null (Ed.)
    Agent-based models (ABM) play a prominent role in guiding critical decision-making and supporting the development of effective policies for better urban resilience and response to the COVID-19 pandemic. However, many ABMs lack realistic representations of human mobility, a key process that leads to physical interaction and subsequent spread of disease. Therefore, we propose the application of Latent Dirichlet Allocation (LDA), a topic modeling technique, to foot-traffic data to develop a realistic model of human mobility in an ABM that simulates the spread of COVID-19. In our novel approach, LDA treats POIs as "words" and agent home census block groups (CBGs) as "documents" to extract "topics" of POIs that frequently appear together in CBG visits. These topics allow us to simulate agent mobility based on the LDA topic distribution of their home CBG. We compare the LDA based mobility model with competitor approaches including a naive mobility model that assumes visits to POIs are random. We find that the naive mobility model is unable to facilitate the spread of COVID-19 at all. Using the LDA informed mobility model, we simulate the spread of COVID-19 and test the effect of changes to the number of topics, various parameters, and public health interventions. By examining the simulated number of cases over time, we find that the number of topics does indeed impact disease spread dynamics, but only in terms of the outbreak's timing. Further analysis of simulation results is needed to better understand the impact of topics on simulated COVID-19 spread. This study contributes to strengthening human mobility representations in ABMs of disease spread. 
    more » « less
  5. null (Ed.)
    Location-Based Services are often used to find proximal Points of Interest PoI - e.g., nearby restaurants and museums, police stations, hospitals, etc. - in a plethora of applications. An important recently addressed variant of the problem not only considers the distance/proximity aspect, but also desires semantically diverse locations in the answer-set. For instance, rather than picking several close-by attractions with similar features - e.g., restaurants with similar menus; museums with similar art exhibitions - a tourist may be more interested in a result set that could potentially provide more diverse types of experiences, for as long as they are within an acceptable distance from a given (current) location. Towards that goal, in this work we propose a novel approach to efficiently retrieve a path that will maximize the semantic diversity of the visited PoIs that are within distance limits along a given road network. We introduce a novel indexing structure - the Diversity Aggregated R-tree, based on which we devise efficient algorithms to generate the answer-set - i.e., the recommended locations among a set of given PoIs - relying on a greedy search strategy. Our experimental evaluations conducted on real datasets demonstrate the benefits of proposed methodology over the baseline alternative approaches. 
    more » « less
  6. Our ability to extract knowledge from evolving spatial phenomena and make it actionable is often impaired by unreliable, erroneous, obsolete, imprecise, sparse, and noisy data. Integrating the impact of this uncertainty is a paramount when estimating the reliability/confidence of any time-varying query result from the underlying input data. The goal of this advanced seminar is to survey solutions for managing, querying and mining uncertain spatial and spatio-temporal data. We survey different models and show examples of how to efficiently enrich query results with reliability information. We discuss both analytical solutions as well as approximate solutions based on geosimulation. 
    more » « less
  7. Congested traffic wastes billions of liters of fuel and is a significant contributor to Green House Gas (GHG) emissions. Although convenient, ride sharing services such as Uber and Lyft are becoming a significant contributor to these emissions not only because of added traffic but by spending time on the road while waiting for passengers. To help improve the impact of ride sharing, we propose an algorithm to optimize the efficiency of drivers searching for customers. In our model, the main goal is to direct drivers represented as idle agents, i.e., not currently assigned a customer or resource, to locations where we predict new resources to appear. Our approach uses non-negative matrix factorization (NMF) to model and predict the spatio-temporal distributions of resources. To choose destinations for idle agents, we employ a greedy heuristic that strikes a balance between distance greed, i.e., to avoid long trips without resources and resource greed, i.e., to move to a location where resources are expected to appear following the NMF model. To ensure that agents do not oversupply areas for which resources are predicted and under supply other areas, we randomize the destinations of agents using the predicted resource distribution within the local neighborhood of an agent. Our experimental evaluation shows that our approach reduces the search time of agents and the wait time of resources using real-world data from Manhattan, New York, USA. 
    more » « less
  8. Location-based social networks (LBSNs) have been studied extensively in recent years. However, utilizing real-world LBSN data sets yields several weaknesses: sparse and small data sets, privacy concerns, and a lack of authoritative ground-truth. To overcome these weaknesses, we leverage a large-scale LBSN simulation to create a framework to simulate human behavior and to create synthetic but realistic LBSN data based on human patterns of life. Such data not only captures the location of users over time but also their interactions via social networks. Patterns of life are simulated by giving agents (i.e., people) an array of “needs” that they aim to satisfy, e.g., agents go home when they are tired, to restaurants when they are hungry, to work to cover their financial needs, and to recreational sites to meet friends and satisfy their social needs. While existing real-world LBSN data sets are trivially small, the proposed framework provides a source for massive LBSN benchmark data that closely mimics the real-world. As such, it allows us to capture 100% of the (simulated) population without any data uncertainty, privacy-related concerns, or incompleteness. It allows researchers to see the (simulated) world through the lens of an omniscient entity having perfect data. Our framework is made available to the community. In addition, we provide a series of simulated benchmark LBSN data sets using different synthetic towns and real-world urban environments obtained from OpenStreetMap. The simulation software and data sets, which comprise gigabytes of spatio-temporal and temporal social network data, are made available to the research community. 
    more » « less
  9. Data generators have been heavily used in creating massive trajectory datasets to address common challenges of real-world datasets, including privacy, cost of data collection, and data quality. However, such generators often overlook social and physiological characteristics of individuals and as such their results are often limited to simple movement patterns. To address these shortcomings, we propose an agent-based simulation framework that facilitates the development of behavioral models in which agents correspond to individuals that act based on personal preferences, goals, and needs within a realistic geographical environment. Researchers can use a drag-and-drop interface to design and control their own world including the geospatial and social (i.e. geo-social) properties. The framework is capable of generating and streaming very large data that captures the basic patterns of life in urban areas. Streaming data from the simulation can be accessed in real time through a dedicated API. 
    more » « less
  10. Location-based social networks (LBSNs) have been studied extensively in recent years. However, utilizing real-world LBSN datasets in such studies has severe weaknesses: sparse and small datasets, privacy concerns, and a lack of authoritative ground-truth. Our vision is to create a large scale geo-simulation framework to simulate human behavior and to create synthetic but realistic LBSN data that captures the location of users over time as well as social interactions of users in a social network. While existing LBSN datasets are trivially small, such a framework would provide the first source of massive LBSN benchmark data which would closely mimic the real world, containing high-fidelity information of location, and social connections of millions of simulated agents over several years of simulated time. Therefore, it would serve the research community by revitalizing and reshaping research on LBSNs by allowing researchers to see the (simulated) world through the lens of an omniscient entity having perfect data. These evaluations will guide future research enabling us to develop solutions to improve LBSN applications such as user-location recommendation, friend recommendation, location prediction, and location privacy. 
    more » « less