
Title: Graph-based Namespaces and Load Sharing for Efficient Information Dissemination in Disasters
Timely, flexible and accurate information dissemination can make a life-and-death difference in managing disasters. Complex command structures and information organization make such dissemination challenging. Thus, it is vital to have an architecture with appropriate naming frameworks, adaptable to the changing roles of participants, focused on content rather than network addresses. To address this, we propose POISE, a name-based and recipient-based publish/subscribe architecture for efficient content dissemination in disaster management. POISE proposes an information layer, improving on state-of-the-art Information-Centric Networking (ICN) solutions such as Named Data Networking (NDN) in two major ways: 1) support for complex graph-based namespaces, and 2) automatic name-based load-splitting. To capture the complexity and dynamicity of disaster response command chains and information flows, POISE proposes a graph-based naming framework, leveraged in a dissemination protocol which exploits information layer rendezvous points (RPs) that perform name expansions. For improved robustness and scalability, POISE allows load-sharing via multiple RPs each managing a subset of the namespace graph. However, excessive workload on one RP may turn it into a “hot spot”, thus impeding performance and reliability. To eliminate such traffic concentration, we propose an automatic load-splitting mechanism, consisting of a namespace graph partitioning complemented by a seamless, loss-less core migration procedure. Due to the nature of our graph partitioning and its complex objectives, off-the-shelf graph partitioning, e.g., METIS, is inadequate. We propose a hybrid partitioning solution, consisting of an initial and a refinement phase. Our simulation results show that POISE outperforms state-of-the-art solutions, demonstrating its effectiveness in timely delivery and load-sharing.
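To make the idea of name expansion over a graph-based namespace concrete, the following is a minimal sketch and not the POISE implementation. It assumes that expanding a name means collecting that name plus every name reachable below it in the namespace graph, and that a publication is delivered to the subscribers of all expanded names. The role names, the subscriber names, and the functions `expand_name` and `recipients` are hypothetical and do not come from the paper.

```python
from collections import deque

# Hypothetical namespace graph: each name maps to the names directly below it.
# "Rescue" has two parents, so the structure is a graph rather than a tree.
NAMESPACE = {
    "IncidentCommander": ["Fire", "Medical"],
    "Fire": ["FireEngine1", "Rescue"],
    "Medical": ["Ambulance1", "Rescue"],
    "FireEngine1": [],
    "Ambulance1": [],
    "Rescue": [],
}

# Hypothetical subscriptions: name -> set of subscriber identifiers.
SUBSCRIPTIONS = {
    "Fire": {"alice"},
    "Rescue": {"bob"},
    "Ambulance1": {"carol"},
}


def expand_name(graph, name):
    """Return `name` followed by every name reachable below it (breadth-first)."""
    seen = {name}
    order = [name]
    queue = deque([name])
    while queue:
        current = queue.popleft()
        for child in graph.get(current, []):
            if child not in seen:  # shared children are visited only once
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def recipients(graph, subscriptions, name):
    """Collect the subscribers of every name produced by the expansion."""
    result = set()
    for expanded in expand_name(graph, name):
        result |= subscriptions.get(expanded, set())
    return result


if __name__ == "__main__":
    print(expand_name(NAMESPACE, "Fire"))
    print(sorted(recipients(NAMESPACE, SUBSCRIPTIONS, "Medical")))
```

In this sketch a rendezvous point would run the expansion once per publication, so a publisher only needs to know the single name it targets, not the roles beneath it.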
Award ID(s):
1818971
NSF-PAR ID:
10190698
Journal Name:
2019 IEEE 27th International Conference on Network Protocols (ICNP)
Page Range or eLocation-ID:
1 to 12
Sponsoring Org:
National Science Foundation
More Like this
  1. Graph-based namespaces are being increasingly used to represent the organization of complex and ever-growing information ecosystems and individual user roles. Timely and accurate information dissemination requires an architecture with appropriate naming frameworks, adaptable to changing roles, focused on content rather than network addresses. Today's complex information organization structures make such dissemination very challenging. To address this, we propose POISE, a name-based publish/subscribe architecture for efficient topic-based and recipient-based content dissemination. POISE proposes an information layer, improving on state-of-the-art Information-Centric Networking solutions in two major ways: 1) support for complex graph-based namespaces, and 2) automatic name-based load-splitting. POISE supports in-network graph-based naming, leveraged in a dissemination protocol which exploits information layer rendezvous points (RPs) that perform name expansions. For improved robustness and scalability, POISE supports adaptive load-sharing via multiple RPs, each managing a dynamically chosen subset of the namespace graph. Excessive workload may cause one RP to turn into a “hot spot”, impeding performance and reliability. To eliminate such traffic concentration, we propose an automated load-splitting mechanism, consisting of an enhanced namespace graph partitioning complemented by a seamless, loss-less core migration procedure. Due to the nature of our graph partitioning and its complex objectives, off-the-shelf graph partitioning, e.g., METIS, is inadequate. We propose a hybrid, iterative bi-partitioning solution, consisting of an initial and a refinement phase. We also implemented POISE on a DPDK-based platform. Using the important application of emergency response, our experimental results show that POISE outperforms state-of-the-art solutions, demonstrating its effectiveness in timely delivery and load-sharing.
  2. Delivering the right information to the right people in a timely manner can greatly improve outcomes and save lives in emergency response. A communication framework that flexibly and efficiently brings victims, volunteers, and first responders together for timely assistance can be very helpful. With the burden of more frequent and intense disaster situations and first responder resources stretched thin, people increasingly depend on social media for communicating vital information. This paper proposes ONSIDE, a framework for coordinating disaster response that leverages social media and integrates it with information-centric dissemination for timely, relevant delivery. We use a graph-based pub/sub namespace that captures the complex hierarchy of the incident management roles. Regular citizens and volunteers using social media may not know of or have access to the full namespace. Thus, we utilize a social media engine (SME) to identify disaster-related social media posts and then automatically map them to the right name(s) in near-real-time. Using NLP and classification techniques, we direct the posts to the appropriate first responder(s) who can help with the posted issue. A major challenge for classifying social media in real-time is the labeling effort for model training. Furthermore, as a disaster hits, there may not be enough data points available for labeling, and there may be concept drift in the content of the posts over time. To address these issues, our SME employs stream-based active learning methods, adapting as social media posts come in. Preliminary evaluation results show the proposed solution can be effective.
  3. During disasters, it is critical to deliver emergency information to appropriate first responders. Name-based information delivery provides efficient, timely dissemination of relevant content to first responder teams assigned to different incident response roles. People increasingly depend on social media for communicating vital information, using free-form text. Thus, a method that delivers these social media posts to the right first responders can significantly improve outcomes. In this paper, we propose FLARE, a framework using 'Social Media Engines' (SMEs) to map social media posts (SMPs), such as tweets, to the right names. SMEs perform natural language processing-based classification and exploit several machine learning capabilities, in an online real-time manner. To reduce the manual labeling effort required for learning during the disaster, we leverage active learning, complemented by dispatchers with specific domain-knowledge performing limited labeling. We also leverage federated learning across various public-safety departments with specialized knowledge to handle notifications related to their roles in a cooperative manner. We implement three different classifiers: for incident relevance, organization, and fine-grained role prediction. Each class is associated with a specific subset of the namespace graph. The novelty of our system is the integration of the namespace with federated active learning and inference procedures to identify and deliver vital SMPs to the right first responders in a distributed multi-organization environment, in real-time. Our experiments using real-world data, including tweets generated by citizens during the wildfires in California in 2018, show our approach outperforming both a simple keyword-based classification and several existing NLP-based classification techniques.
  4. Name-based pub/sub allows for efficient and timely delivery of information to interested subscribers. A challenge is assigning the right name to each piece of content, so that it reaches the most relevant recipients. An example scenario is the dissemination of social media posts to first responders during disasters. We present FLARE, a framework using federated active learning assisted by naming. FLARE integrates machine learning and name-based pub/sub for accurate timely delivery of textual information. In this demo, we show FLARE’s operation.
  5. Advanced imaging and DNA sequencing technologies now enable the diverse biology community to routinely generate and analyze terabytes of high resolution biological data. The community is rapidly heading toward the petascale in single investigator laboratory settings. As evidence, the single NCBI SRA central DNA sequence repository contains over 45 petabytes of biological data. Given the geometric growth of this and other genomics repositories, an exabyte of mineable biological data is imminent. The challenges of effectively utilizing these datasets are enormous, as they are not only large in size but also stored in geographically distributed repositories such as the National Center for Biotechnology Information (NCBI), DNA Data Bank of Japan (DDBJ), European Bioinformatics Institute (EBI), and NASA’s GeneLab. In this work, we first systematically point out the data-management challenges of the genomics community. We then introduce Named Data Networking (NDN), a novel but well-researched Internet architecture that is capable of solving these challenges at the network layer. NDN performs all operations such as forwarding requests to data sources, content discovery, access, and retrieval using content names (that are similar to traditional filenames or filepaths) and eliminates the need for a location layer (the IP address) for data management. Utilizing NDN for genomics workflows simplifies data discovery, speeds up data retrieval using in-network caching of popular datasets, and allows the community to create infrastructure that supports operations such as creating federation of content repositories, retrieval from multiple sources, remote data subsetting, and others. Name-based operations also streamline deployment and integration of workflows with various cloud platforms.
Our contributions in this work are as follows: 1) we enumerate the cyberinfrastructure challenges of the genomics community that NDN can alleviate, and 2) we describe our efforts in applying NDN for a contemporary genomics workflow (GEMmaker) and quantify the improvements. The preliminary evaluation shows a sixfold speed up in data insertion into the workflow. 3) As a pilot, we have used an NDN naming scheme (agreed upon by the community and discussed in Section 4) to publish data from broadly used data repositories including the NCBI SRA. We have loaded the NDN testbed with these pre-processed genomes that can be accessed over NDN and used by anyone interested in those datasets. Finally, we discuss our continued effort in integrating NDN with cloud computing platforms, such as the Pacific Research Platform (PRP). The reader should note that the goal of this paper is to introduce NDN to the genomics community and discuss NDN’s properties that can benefit the genomics community. We do not present an extensive performance evaluation of NDN; we are working on extending and evaluating our pilot deployment and will present systematic results in a future work.
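Item 5 describes NDN forwarding requests by content name instead of by IP address. A minimal sketch of that idea follows, assuming hierarchical names split on `/` and a forwarding table that picks the longest registered prefix. The names under `/genomics`, the face labels, and the helper functions are hypothetical, and they are not the naming scheme the paper discusses in its Section 4.

```python
def split_name(name):
    """Split a hierarchical name such as '/a/b/c' into its components."""
    return tuple(component for component in name.split("/") if component)


def longest_prefix_match(fib, name):
    """Return the next hop for the longest registered prefix of `name`, or None."""
    components = split_name(name)
    # Try the full name first, then drop one trailing component at a time.
    for length in range(len(components), 0, -1):
        prefix = components[:length]
        if prefix in fib:
            return fib[prefix]
    return None


# Hypothetical forwarding table: name prefix -> outgoing face label.
FIB = {
    split_name("/genomics"): "face-default",
    split_name("/genomics/ncbi/sra"): "face-ncbi",
}

if __name__ == "__main__":
    print(longest_prefix_match(FIB, "/genomics/ncbi/sra/run123"))
    print(longest_prefix_match(FIB, "/genomics/ebi/sample7"))
    print(longest_prefix_match(FIB, "/unrelated/data"))
```

Because the lookup depends only on the name, the same request can be answered by any repository or cache that holds the data, which is the property the abstract relies on for retrieval from multiple sources.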