While our society accelerates its transition to the Internet of Things, billions of IoT devices are now linked to the network. While these gadgets provide enormous convenience, they generate a large amount of data that has already beyond the network’s capacity. To make matters worse, the data acquired by sensors on such IoT devices also include sensitive user data that must be appropriately treated. At the moment, the answer is to provide hub services for data storage in data centers. However, when data is housed in a centralized data center, data owners lose control of the data, since data centers are centralized solutions that rely on data owners’ faith in the service provider. In addition, edge computing enables edge devices to collect, analyze, and act closer to the data source, the challenge of data privacy near the edge is also a tough nut to crack. A large number of user information leakage both for IoT hub and edge made the system untrusted all along. Accordingly, building a decentralized IoT system near the edge and bringing real trust to the edge is indispensable and significant. To eliminate the need for a centralized data hub, we present a prototype of a unique, secure, and decentralized IoT framework called Reja, which is built on a permissioned Blockchain and an intrusion-tolerant messaging system ChiosEdge, and the critical components of ChiosEdge are reliable broadcast and BFT consensus. We evaluated the latency and throughput of Reja and its sub-module ChiosEdge.
more »
« less
Bringing Decentralized Search to Decentralized Services
This paper addresses a key missing piece in the current ecosystem of decentralized services and blockchain apps: the lack of decentralized, verifiable, and private search. Existing decentralized systems like Steemit, OpenBazaar, and the growing number of blockchain apps provide alternatives to existing services. And yet, they continue to rely on centralized search engines and indexers to help users access the content they seek and navigate the apps. Such centralized engines are in a perfect position to censor content and violate users’ privacy, undermining some of the key tenets behind decentralization. To remedy this, we introduce DeSearch, the first decentralized search engine that guarantees the integrity and privacy of search results for decentralized services and blockchain apps. DeSearch uses trusted hardware to build a network of workers that execute a pipeline of small search engine tasks (crawl, index, aggregate, rank, query). DeSearch then introduces a witness mechanism to make sure the completed tasks can be reused across different pipelines, and to make the final search results verifiable by end users. We implement DeSearch for two existing decentralized services that handle over 80 million records and 240 GBs of data, and show that DeSearch can scale horizontally with the number of workers and can process 128 million search queries per day.
more »
« less
- Award ID(s):
- 2045861
- PAR ID:
- 10312008
- Date Published:
- Journal Name:
- Symposium on Operating Systems Design and Implementation (OSDI)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The continuous rise of the blockchain technology is moving various information systems towards decentralization. Blockchain-based decentralized storage networks (DSNs) offer significantly higher privacy and lower costs to customers compared with centralized cloud storage associated with specific vendors. Coding is required to retrieve data stored on failing components. While coding solutions for centralized storage have been intensely studied, those for DSNs have not yet been discussed. In this paper, we propose a coding scheme where each node receives extra protection through cooperation with nodes in its neighborhood in a heterogeneous DSN with any given topology. Our scheme can achieve faster recovery speed compared with existing network coding methods, and can correct more erasure patterns compared with our previous work.more » « less
-
Agapito, G. (Ed.)The portable document format (PDF) is currently one of the most popular formats for offline sharing biomedical information. Recently, HTML-based formats for web-first biomedical information sharing have gained popularity. However, machine-interpretable information is required by literature search engines, such as Google Scholar, to index articles in a context-aware manner for accurate biomedical literature searches. The lack of technological infrastructure to add machine-interpretable metadata to expanding biomedical information, on the other hand, renders them unreachable to search engines. Therefore, we developed a portable technical infrastructure (goSemantically) and packaged it as a Google Docs add-ons. The “goSemantically” assists authors in adding machine-interpretable metadata at the terminology and document structural levels While authoring biomedical content. The “goSemantically” leverages the NCBO Bioportal resources and introduces a mechanism to annotate biomedical information with relevant machine-interpretable metadata (semantic vocabularies). The “goSemantically” also acquires schema.org meta tags designed for search engine optimization and tailored to accommodate biomedical information. Thus, individual authors can conveniently author and publish biomedical content in a truly decentralized fashion. Users can also export and host content with relevant machine-interpretable metadata (semantic vocabularies) in interoperable formats such as HTML and JSON-LD. To experience the described features, run this code with Google Docmore » « less
-
Diabetes is a global epidemic with severe consequences for individuals and healthcare systems. While early and personalized prediction can significantly improve outcomes, traditional centralized prediction models suffer from privacy risks and limited data diversity. This paper introduces a novel framework that integrates blockchain and federated learning to address these challenges. Blockchain provides a secure, decentralized foundation for data management, access control, and auditability. Federated learning enables model training on distributed datasets without compromising patient privacy. This collaborative approach facilitates the development of more robust and personalized diabetes prediction models, leveraging the combined data resources of multiple healthcare institutions. We have performed extensive evaluation experiments and security analyses. The results demonstrate good performance while significantly enhancing privacy and security compared to centralized approaches. Our framework offers a promising solution for the ethical and effective use of healthcare data in diabetes prediction.more » « less
-
Mobile and web apps are increasingly relying on the data generated or provided by users such as from their uploaded documents and images. Unfortunately, those apps may raise significant user privacy concerns. Specifically, to train or adapt their models for accurately processing huge amounts of data continuously collected from millions of app users, app or service providers have widely adopted the approach of crowdsourcing for recruiting crowd workers to manually annotate or transcribe the sampled ever-changing user data. However, when users' data are uploaded through apps and then become widely accessible to hundreds of thousands of anonymous crowd workers, many human-in-the-loop related privacy questions arise concerning both the app user community and the crowd worker community. In this paper, we propose to investigate the privacy risks brought by this significant trend of large-scale crowd-powered processing of app users' data generated in their daily activities. We consider the representative case of receipt scanning apps that have millions of users, and focus on the corresponding receipt transcription tasks that appear popularly on crowdsourcing platforms. We design and conduct an app user survey study (n=108) to explore how app users perceive privacy in the context of using receipt scanning apps. We also design and conduct a crowd worker survey study (n=102) to explore crowd workers' experiences on receipt and other types of transcription tasks as well as their attitudes towards such tasks. Overall, we found that most app users and crowd workers expressed strong concerns about the potential privacy risks to receipt owners, and they also had a very high level of agreement with the need for protecting receipt owners' privacy. Our work provides insights on app users' potential privacy risks in crowdsourcing, and highlights the need and challenges for protecting third party users' privacy on crowdsourcing platforms. We have responsibly disclosed our findings to the related crowdsourcing platform and app providers.more » « less
An official website of the United States government

