Title: A Hardware-Software Codesign Approach to Identity, Trust, and Resilience for IoT/CPS at Scale
Advancements in communication technologies and the Internet of Things (IoT) are driving adoption in smart cities, which aim to increase operational efficiency and improve the quality of services and citizen welfare, among other potential benefits. The privacy, reliability, and integrity of communications must be ensured so that actions can be appropriate, safe, accurate, and implemented promptly after receiving actionable information. In this work, we present a multi-tier methodology consisting of an authentication and trust-building/distribution framework designed to ensure the safety and validity of the information exchanged in the system. Blockchain protocols and Radio Frequency-Distinct Native Attributes (RF-DNA) combine to provide a hardware-software codesigned system for enhanced device identity and overall system trustworthiness. Our threat model accounts for counterfeiting, breakout fraud, and bad-mouthing of one entity by others. Entity trust (e.g., of IoT devices) depends on the quality and level of participation, the quality of messages, the lifetime of a given entity in the system, and the number of known "bad" (non-consensus) messages sent by that entity. Based on this approach, trust can be adjusted upward and downward as a function of real-time and past behavior, providing other participants with a trust value upon which to judge information from, and interactions with, the given entity. This approach thereby reduces the potential for manipulation of an IoT system by a bad or Byzantine actor.
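As a rough illustration of this kind of behavior-based trust adjustment (the paper's exact scoring function is not given in the abstract; the weights, field names, and penalty factor below are hypothetical), a short Python sketch:

```python
from dataclasses import dataclass

@dataclass
class EntityRecord:
    participation: float  # level of participation, normalized to 0..1
    msg_quality: float    # average quality of messages, 0..1
    lifetime: float       # time in the system, normalized to 0..1
    bad_messages: int     # known "bad" (non-consensus) messages sent

def trust_score(e: EntityRecord,
                w_part: float = 0.3, w_qual: float = 0.4,
                w_life: float = 0.3, bad_penalty: float = 0.1) -> float:
    """Hypothetical trust function: a weighted sum of positive evidence,
    discounted multiplicatively for each known bad message."""
    positive = (w_part * e.participation
                + w_qual * e.msg_quality
                + w_life * e.lifetime)
    # Each bad message shrinks trust; trust can recover over time as the
    # positive terms grow with continued good behavior.
    return positive * (1.0 - bad_penalty) ** e.bad_messages

# Example: a long-lived, active entity that has sent one bad message.
print(trust_score(EntityRecord(0.9, 0.8, 0.7, bad_messages=1)))
```

Other participants can then compare such a score against a local threshold before acting on the entity's messages.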
Award ID(s):
1812404
NSF-PAR ID:
10140221
Journal Name:
2019 International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData)
Page Range / eLocation ID:
1125 to 1134
Sponsoring Org:
National Science Foundation
More Like this
  1. It takes great effort to manually or semi-automatically convert free-text phenotype narratives (e.g., morphological descriptions in taxonomic works) to a computable format before they can be used in large-scale analyses. We argue that neither a manual curation approach nor an information extraction approach based on machine learning is a sustainable solution to produce computable phenotypic data that are FAIR (Findable, Accessible, Interoperable, Reusable) (Wilkinson et al. 2016). This is because these approaches do not scale to all biodiversity, and they do not stop the publication of free-text phenotypes that would need post-publication curation. In addition, both manual and machine learning approaches face great challenges: the problem of inter-curator variation (curators interpret/convert a phenotype differently from each other) in manual curation, and keyword-to-ontology-concept translation in automated information extraction, make it difficult for either approach to produce data that are truly FAIR. Our empirical studies show that inter-curator variation in translating phenotype characters to Entity-Quality statements (Mabee et al. 2007) is as high as 40% even within a single project. With this level of variation, curated data integrated from multiple curation projects may still not be FAIR. The key causes of this variation have been identified as semantic vagueness in original phenotype descriptions and difficulties in using standardized vocabularies (ontologies). We argue that the authors describing characters are the key to the solution. Given the right tools and appropriate attribution, the authors should be in charge of developing a project's semantics and ontology. This will speed up ontology development and improve the semantic clarity of the descriptions from the moment of publication. In this presentation, we will introduce the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists, which consists of three components: a web-based, ontology-aware software application called 'Character Recorder,' which features a spreadsheet as the data entry platform and provides authors with the flexibility of using their preferred terminology in recording characters for a set of specimens (this application also facilitates semantic clarity and consistency across species descriptions); a set of services that produce RDF graph data, collect terms added by authors, detect potential conflicts between terms, dispatch conflicts to the third component, and update the ontology with resolutions; and an Android mobile application, 'Conflict Resolver,' which displays ontological conflicts and accepts solutions proposed by multiple experts. Fig. 1 shows the system diagram of the platform.
The presentation will consist of: a report on the findings from a recent survey of 90+ participants on the need for a tool like Character Recorder; a methods section that describes how we provide semantics to an existing vocabulary of quantitative characters through a set of properties that explain where and how a measurement (e.g., length of perigynium beak) is taken, and how a custom color palette of RGB values obtained from real specimens or high-quality specimen images can be used to help authors choose standardized color descriptions for plant specimens; and a software demonstration, where we show how Character Recorder and Conflict Resolver can work together to construct both human-readable descriptions and RDF graphs using morphological data derived from species in the plant genus Carex (sedges). The key difference of this system from other ontology-aware systems is that authors can directly add needed terms to the ontology as they wish and can update their data according to ontology updates. The software modules currently incorporated in Character Recorder and Conflict Resolver have undergone formal usability studies. We are actively recruiting Carex experts to participate in a 3-day usability study of the entire Platform for Author-Driven Computable Data and Ontology Production for Taxonomists. Participants will use the platform to record 100 characters about one Carex species. In addition to usability data, we will collect the terms that participants submit to the underlying ontology and the data related to conflict resolution. Such data allow us to examine the types and quantities of logical conflicts that may result from the terms added by the users and to use Discrete Event Simulation models to understand if and how term additions and conflict resolutions converge. We look forward to a discussion on how the tools (Character Recorder is online at http://shark.sbs.arizona.edu/chrecorder/public) described in our presentation can contribute to producing and publishing FAIR data in taxonomic studies.
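A minimal sketch of the kind of Entity-Quality RDF output such a pipeline might produce, written in Python with rdflib (the namespace, property names, and values are illustrative, not the platform's actual vocabulary):

```python
from rdflib import Graph, Namespace, Literal, BNode, RDF

g = Graph()
EX = Namespace("http://example.org/chrecorder/")  # hypothetical namespace
g.bind("ex", EX)

# One Entity-Quality statement: length of perigynium beak = 2.5 mm.
stmt = BNode()
g.add((stmt, RDF.type, EX.EQStatement))
g.add((stmt, EX.entity, EX.perigynium_beak))
g.add((stmt, EX.quality, EX.length))
g.add((stmt, EX.value, Literal(2.5)))
g.add((stmt, EX.unit, EX.millimetre))

print(g.serialize(format="turtle"))
```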
  2. In this work, we demonstrate the design and implementation of a novel privacy-preserving blockchain for the resource-constrained Internet of Things (IoT). Blockchain, by design, ensures trust and provides built-in information integrity and the security of immutability in an IoT system without the need for a centralized entity. However, its slow transaction rate, lack of transaction privacy, and high resource consumption are three of the major hindrances to the practical realization of blockchain in IoT. While directed acyclic graph (DAG)-based blockchain variants (e.g., hashgraph) improve the transaction rate, the other two problems remain open. To this end, we designed and constructed the prototype of a blockchain that draws its high transaction rate and miner-free transaction validation process from hashgraph. The proposed blockchain, coined PrivLiteChain, implements the concept of local differential privacy to provide transaction privacy and a temporal constraint on the lifecycle of the blockchain to make it lightweight.
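The abstract does not spell out which local differential privacy mechanism PrivLiteChain uses; a standard instantiation is the Laplace mechanism, in which each device perturbs its own transaction value before it ever leaves the device. A minimal Python sketch under that assumption:

```python
import numpy as np

def randomize(value: float, epsilon: float, sensitivity: float = 1.0) -> float:
    """Local DP via the Laplace mechanism: noise is added on-device,
    so the ledger never records the raw value."""
    return value + np.random.laplace(0.0, sensitivity / epsilon)

# A device submits a perturbed reading to the blockchain.
raw_reading = 21.7
print(randomize(raw_reading, epsilon=0.5))
```

A smaller epsilon means stronger privacy but noisier transactions, a tradeoff that matters on resource-constrained IoT devices.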
  3. Abstract: With the proliferation of Dynamic Spectrum Access (DSA), Internet of Things (IoT), and Mobile Edge Computing (MEC) technologies, various methods have been proposed to deduce key network and user information in cellular systems, such as available cell bandwidths, as well as user locations and mobility. Not only is such information about cellular networks of vital significance to other systems co-located spectrum-wise and/or geographically, but applications within cellular systems can also benefit remarkably from inferring it, as exemplified by the efforts of video streaming services to predict cell bandwidth. Hence, we are motivated to develop a new tool that uncovers, with off-the-shelf products, as much information as possible that used to be closed to outsiders or user devices. Given the widespread deployment of LTE and its continuous evolution to 5G, we design and implement U-CIMAN, a client-side system to accurately UnCover as much Information in Mobile Access Networks as allowed by LTE encryption. Among the many potential applications of U-CIMAN, we highlight one use case: accurately measuring the spectrum tenancy of a commercial LTE cell. Besides measuring spectrum tenancy in units of resource blocks, U-CIMAN discovers user mobility and traffic types associated with spectrum usage through decoded control messages and user data bytes. We conducted four months of detailed, accurate spectrum measurement on a commercial LTE cell; the observations include the predictive power of the Modulation and Coding Scheme on spectrum tenancy and channel off-times bounded under 10 seconds, to name a few.
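As a toy example of the spectrum-tenancy computation (not U-CIMAN's actual code; the total resource-block count is an assumption corresponding to a 20 MHz LTE carrier), suppose the decoded control messages yield the number of allocated resource blocks per 1 ms subframe:

```python
def spectrum_tenancy(allocated_rbs_per_subframe, total_rbs: int = 100) -> float:
    """Fraction of resource blocks occupied over the observation window.
    total_rbs=100 assumes a 20 MHz LTE downlink carrier."""
    used = sum(allocated_rbs_per_subframe)
    return used / (total_rbs * len(allocated_rbs_per_subframe))

# e.g., three subframes with 40, 0, and 75 resource blocks allocated:
print(spectrum_tenancy([40, 0, 75]))  # 115 / 300 ≈ 0.383
```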
  4. An important task for Information Extraction from Microblogs is Named Entity Recognition (NER), which extracts mentions of real-world entities from microblog messages along with meta-information like entity type for better entity characterization. Many microblog NER systems have rightly sought to prioritize modeling the non-literary nature of microblog text. These systems are trained on offline static datasets and extract a combination of surface-level features – orthographic, lexical, and semantic – from individual messages for noisy text modeling and entity extraction. But given the constantly evolving nature of microblog streams, detecting all entity mentions from such varying yet limited context in short messages remains a difficult problem to generalize. In this paper, we propose the NER Globalizer pipeline, better suited for NER on microblog streams. It characterizes the isolated message processing by existing NER systems as modeling local contextual embeddings, where knowledge learned from the immediate context of a message is used to suggest seed entity candidates. Additionally, it recognizes that messages within a microblog stream are topically related and often repeat mentions of the same entity. This suggests building NER systems that go beyond localized processing. By leveraging occurrence mining, the proposed system therefore follows up traditional NER modeling by extracting additional mentions of seed entity candidates that were previously missed. Candidate mentions are separated into well-defined clusters, which are then used to generate a pooled global embedding drawn from the collective context of the candidate within a stream. The global embeddings are used to separate false positives from true entities, whose mentions are produced in the final NER output. Our experiments show that the proposed NER system exhibits superior effectiveness on multiple NER datasets, with an average Macro F1 improvement of 47.04% over the best NER baseline, while adding only a small computational overhead.
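A minimal sketch of the pooled-global-embedding idea (the paper's actual pooling and filtering functions are not given in the abstract; mean pooling and a cosine-similarity threshold are assumptions):

```python
import numpy as np

def pooled_embedding(mention_vectors: list[np.ndarray]) -> np.ndarray:
    """Average the local contextual embeddings of all clustered mentions
    of one candidate entity to form its global embedding."""
    return np.mean(np.stack(mention_vectors), axis=0)

def keep_candidate(global_vec: np.ndarray, prototype: np.ndarray,
                   threshold: float = 0.7) -> bool:
    """Filter likely false positives: keep a candidate only if its pooled
    context is close to an expected entity representation."""
    cos = global_vec @ prototype / (
        np.linalg.norm(global_vec) * np.linalg.norm(prototype))
    return cos >= threshold
```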
  5. Trust in data collected by and passing through Internet of Things (IoT) networks is paramount. The quality of decisions made based on this collected data is highly dependent upon its accuracy. Currently, most trust assessment methodologies assume that collected data follow a stationary Gaussian distribution; often, a trust score is estimated based upon the deviation from this distribution. However, the underlying state of a system monitored by an IoT network can change over time, and the data collected from the network may not consistently follow a Gaussian distribution. Further, faults that occur within the estimated Gaussian distribution may go undetected. In this study, we present a model-based trust estimation system that allows for concept drift, i.e., distributions that can change over time. The presented methodology uses data-driven models to estimate the value produced by a sensor from the values produced by the other sensors in the network. We assume that an untrustworthy piece of data falls in the tails of the residual distribution, and we use this concept to assign a trust score. The method is evaluated on a smart home data set consisting of temperature, humidity, and energy sensors.
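A minimal sketch of this residual-based scoring (the synthetic data and linear model are illustrative; the paper's data-driven models and score mapping may differ), where one sensor is predicted from the others and trust is the empirical tail probability of the residual:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
others = rng.random((500, 2))                        # e.g., humidity, energy
temp = 3.0 * others[:, 0] + rng.normal(0, 0.1, 500)  # sensor to be checked

model = LinearRegression().fit(others, temp)
residuals = temp - model.predict(others)

def trust(reading: float, context: np.ndarray) -> float:
    """Empirical tail probability of the residual: readings that fall
    deep in the tails of the residual distribution get low trust."""
    r = abs(reading - model.predict(context.reshape(1, -1))[0])
    return float(np.mean(np.abs(residuals) >= r))

print(trust(temp[0], others[0]))         # typical reading -> high trust
print(trust(temp[0] + 5.0, others[0]))   # anomalous reading -> trust near 0
```

Because the score depends only on the empirical residual distribution, it does not assume the data are Gaussian, and the model can be refit as the underlying state drifts.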