Title: Location Prediction for Tweets
Geographic information provides important insight for many data mining and social media systems. However, users are reluctant to provide such information because of concerns such as inconvenience and privacy. In this paper, we aim to develop a deep learning based solution to predict geographic information for tweets. Current approaches have two major limitations: (a) they struggle to model long-term information, and (b) it is hard to explain to end users what the model learns. To address these issues, our proposed model embraces three key ideas. First, we introduce a multi-head self-attention model for text representation. Second, to further improve results on informal language, we treat subwords as a feature in our model. Lastly, the model is trained jointly on city and country labels to incorporate the information coming from different labels. Experiments on the W-NUT 2016 Geo-tagging shared task show that our proposed model is competitive with state-of-the-art systems in accuracy while achieving a better distance measure than existing approaches.
Award ID(s):
1947135 1651203 1715385
NSF-PAR ID:
10158483
Author(s) / Creator(s):
Date Published:
Journal Name:
Frontiers in Big Data
Volume:
2
Issue:
5
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Unoccupied Aerial Vehicles (UAVs), or drone technologies, with their high spatial resolution, temporal flexibility, and ability to repeat photogrammetry, afford a significant advancement over other remote sensing approaches for coastal mapping, habitat monitoring, and environmental management. However, geographical drone mapping and in situ fieldwork often come with a steep learning curve requiring a background in drone operations, Geographic Information Systems (GIS), remote sensing and related analytical techniques. Such a learning curve can be an obstacle to field implementation for researchers, community organizations and citizen scientists wishing to include introductory drone operations in their work. In this study, we develop a comprehensive drone training program for research partners and community members to use cost-effective, consumer-quality drones to engage in introductory drone mapping of coastal seagrass monitoring sites along the west coast of North America. As a first step toward a longer-term Public Participation GIS process in the study area, the training program includes lessons for beginner drone users related to flying drones, autonomous route planning and mapping, field safety, GIS analysis, image correction and processing, and Federal Aviation Administration (FAA) certification and regulations. Training our research partners and students, who are in most cases novice users, is the first step in a larger process to increase participation in a broader project for seagrass monitoring in our case study. While our training program originated in the United States, we discuss our experiences so that research partners and communities around the globe can become more confident in introductory drone operations for basic science. In particular, our work targets novice users without a strong background in geographic research or remote sensing. Such training provides technical guidance on the implementation of a drone mapping program for coastal research, and synthesizes our approaches to provide broad guidance for using drones in support of a developing Public Participation GIS process.
  2. The geographic settings and interests of diverse groups of rights- and stakeholders figure prominently in the need for internationally coordinated Arctic observing systems. Global and regional observing systems exist to coordinate observations across sectors and national boundaries, leveraging limited resources into widely available observational data and information products. Observing system design and coordination approaches developed for more focused networks at mid- and low latitudes are not necessarily directly applicable in more complex Arctic settings. Requirements for the latter are more demanding because of a greater need for cross-disciplinary and cross-sectoral prioritization and refinement from the local to the pan-Arctic scale, in order to maximize the use of resources in challenging environmental settings. Consideration of Arctic Indigenous Peoples’ observing priorities and needs has emerged as a core tenet of governance and coordination frameworks. We evaluate several different types of observing systems relative to the needs of the Arctic observing community and information users to identify the strengths and weaknesses of each framework. A typology of three approaches emerges from this assessment: “essential variable,” “station model,” and “central question.” We define and assess, against the requirements of Arctic settings, the concept of shared Arctic variables (SAVs) emerging from the Arctic Observing Summit 2020 and prior work by the Sustaining Arctic Observing Networks Road Mapping Task Force. SAVs represent measurable phenomena or processes that are important enough to multiple communities and sectors to make the effort to coordinate observation efforts worthwhile. SAVs align with essential variables as defined, for example, by global observing frameworks, in that they guide coordinated observations across processes that are of interest to multiple sectors. 
SAVs are responsive to the information needs of Arctic Indigenous Peoples and draw on their capacity to codesign and comanage observing efforts. SAVs are also tailored to accommodate the logistical challenges of Arctic operations and address unique aspects of the Arctic environment, such as the central role of the cryosphere. Specific examples illustrate the flexibility of the SAV framework in reconciling different observational approaches and standards such that the strengths of global and regional observing programs can be adapted to the complex Arctic environment. 
  3. Today’s recommender systems are criticized for recommending items that are too obvious to arouse users’ interests. Therefore the research community has advocated some “beyond accuracy” evaluation metrics such as novelty, diversity, and serendipity with the hope of promoting information discovery and sustaining users’ interests over a long period of time. While bringing in new perspectives, most of these evaluation metrics have not considered individual users’ differences in their capacity to experience those “beyond accuracy” items. Open-minded users may embrace a wider range of recommendations than conservative users. In this paper, we proposed to use curiosity traits to capture such individual users’ differences. We developed a model to approximate an individual’s curiosity distribution over different stimulus levels. We used an item’s surprise level to estimate the stimulus level and whether such a level is in the range of the user’s appetite for stimulus, called Comfort Zone. We then proposed a recommender system framework that considers both user preference and their Comfort Zone where the curiosity is maximally aroused. Our framework differs from a typical recommender system in that it leverages human’s Comfort Zone for stimuli to promote engagement with the system. A series of evaluation experiments have been conducted to show that our framework is able to rank higher the items with not only high ratings but also high curiosity stimulation. The recommendation list generated by our algorithm has higher potential of inspiring user curiosity compared to the state-of-the-art deep learning approaches. The personalization factor for assessing the surprise stimulus levels further helps the recommender model achieve smaller (better) inter-user similarity.
  4. Conservation laws are key theoretical and practical tools for understanding, characterizing, and modeling nonlinear dynamical systems. However, for many complex systems, the corresponding conserved quantities are difficult to identify, making it hard to analyze their dynamics and build stable predictive models. Current approaches for discovering conservation laws often depend on detailed dynamical information or rely on black box parametric deep learning methods. We instead reformulate this task as a manifold learning problem and propose a non-parametric approach for discovering conserved quantities. We test this new approach on a variety of physical systems and demonstrate that our method is able to both identify the number of conserved quantities and extract their values. Using tools from optimal transport theory and manifold learning, our proposed method provides a direct geometric approach to identifying conservation laws that is both robust and interpretable without requiring an explicit model of the system or accurate time information.
  5. Lierler, Yuliya ; Morales, Jose F ; Dodaro, Carmine ; Dahl, Veronica ; Gebser, Martin ; Tekle, Tuncay (Ed.)
    Knowledge representation and reasoning (KRR) systems represent knowledge as collections of facts and rules. Like databases, KRR systems contain information about domains of human activities like industrial enterprises, science, and business. KRRs can represent complex concepts and relations, and they can query and manipulate information in sophisticated ways. Unfortunately, KRR technology has been hindered by the fact that specifying the requisite knowledge requires skills that most domain experts do not have, and professional knowledge engineers are hard to find. One solution could be to extract knowledge from English text, and a number of works have attempted to do so (OpenSesame, Google's Sling, etc.). Unfortunately, at present, extraction of logical facts from unrestricted natural language is still too inaccurate to be used for reasoning, while restricting the grammar of the language (so-called controlled natural language, or CNL) is hard for users to learn and use. Nevertheless, some recent CNL-based approaches, such as the Knowledge Authoring Logic Machine (KALM), have been shown to have very high accuracy compared to others, and a natural question is to what extent the CNL restrictions can be lifted. In this paper, we address this issue by transplanting the KALM framework to a neural natural language parser, mStanza. Here we limit our attention to authoring facts and queries, and therefore our focus is on what we call factual English statements. Authoring other types of knowledge, such as rules, will be considered in our follow-up work. As it turns out, neural-network-based parsers have problems of their own, and the mistakes they make range from part-of-speech tagging to lemmatization to dependency errors. We present a number of techniques for combating these problems and test the new system, KALMFL (i.e., KALM for factual language), on a number of benchmarks, which show that KALMFL achieves correctness in excess of 95%.