Title: Scalability Analysis of Blockchain on a Serverless Cloud
While adopting Blockchain technologies to automate their enterprise functionality, organizations are recognizing the challenges of scalability and manual configuration that the state of the art presents. Scalability of Hyperledger Fabric is an open challenge recognized by the research community. We have automated many of the configuration steps of installing Hyperledger Fabric Blockchain on AWS infrastructure and have benchmarked the scalability of that system. We have used the UCR (University of California, Riverside) Time Series Archive with 128 time series datasets containing over 191,177 rows of data totaling 76,453,742 numbers. Using an automated Serverless approach, we have loaded this dataset, in chunks, into different AWS instances, triggering the load by SQS messaging. In this paper, we present the results of this benchmarking study and describe the approach we took to automate the Hyperledger Fabric processes using serverless Lambda functions and SQS triggering. We also discuss what is needed to make the Blockchain technology more robust and scalable.
Award ID(s):
1919159
NSF-PAR ID:
10272164
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2019 IEEE International Conference on Big Data (Big Data)
Page Range / eLocation ID:
4214 to 4222
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Cloud computing has become a major approach to help reproduce computational experiments. Yet there are still two main difficulties in reproducing batch-based big data analytics (including descriptive and predictive analytics) in the cloud. The first is how to automate end-to-end scalable execution of analytics, including distributed environment provisioning, analytics pipeline description, parallel execution, and resource termination. The second is that an application developed for one cloud is difficult to reproduce in another cloud, which is known as the vendor lock-in problem. To tackle these problems, we leverage serverless computing and containerization techniques for automated scalable execution and reproducibility, and utilize the adapter design pattern to enable application portability and reproducibility across different clouds. We propose and develop an open-source toolkit that supports 1) fully automated end-to-end execution and reproduction via a single command, 2) automated data and configuration storage for each execution, 3) flexible client modes based on user preferences, 4) execution history query, and 5) simple reproduction of existing executions in the same environment or a different environment. We conducted extensive experiments on both AWS and Azure using four big data analytics applications that run on virtual CPU/GPU clusters. The experiments show our toolkit can achieve good execution performance, scalability, and efficient reproducibility for cloud-based big data analytics.
  2. An essential requirement of any information management system is to protect data and resources against breach or improper modifications, while at the same time ensuring data access to legitimate users. Systems handling personal data are mandated to track its flow to comply with data protection regulations. We have built a novel framework that integrates a semantically rich data privacy knowledge graph with Hyperledger Fabric blockchain technology to develop an automated access-control and audit mechanism that enforces users' data privacy policies while sharing their data with third parties. Our blockchain-based data-sharing solution addresses two of the most critical challenges: transaction verification and permissioned data obfuscation. Our solution ensures accountability for data sharing in the cloud by incorporating a secure and efficient system for end-to-end provenance. In this paper, we describe this framework along with the comprehensive, semantically rich knowledge graph that we have developed to capture rules embedded in data privacy policy documents. Our framework can be used by organizations to automate compliance of their cloud datasets.
  3. The potential of blockchain technology is immense, and it is currently regarded as a new technological trend with a rapid growth rate. Blockchain platforms like Bitcoin are public, open, and permissionless. They are also decentralized, immutable, append-only ledgers; those ledgers can store any type of data and are shared among all the participants of the network. These platforms provide a high degree of anonymity for their users' identity and full transparency of the activities recorded on the ledger while simultaneously ensuring data security and tamper-resistance. All nodes on the network collectively work to validate the same set of data and to achieve group consensus. Blockchain platforms like Ethereum make it possible to develop smart contracts and embed business logic. This allows the use of blockchain beyond cryptocurrency as a business management solution. Besides the issues of scalability and the expensive nature of most blockchain systems, many attributes of traditional public blockchains, such as anonymity, full transparency, and permissionless access, are not desirable in a business or enterprise context. Permissioned blockchain platforms like Hyperledger Fabric are designed and built with enterprise and business in mind, retaining the desirable qualities of blockchain for enterprise while replacing the qualities of blockchain that are undesirable for the enterprise. In this paper, we present a comprehensive review of the Hyperledger enterprise blockchain technologies.
  4. Computerized systems and software that optimize and plan the processes of production, storage, transportation, sale, and distribution of goods have emerged in the industry. Scheduling systems, in particular, are designed to control and optimize the manufacturing process. Such a tool can have a significant effect on the productivity of the industry because it reduces time and cost through well-defined optimization algorithms. Recently, the applicability of blockchain technology has been demonstrated in scheduling systems to add decentralization, traceability, auditability, and verifiability of the immutable information that this technology provides. This is a novel contribution that provides scheduling systems with an additional layer of security. With the latest version of Hyperledger Fabric, the appropriate levels of permission and policies for access to information can be established with significant levels of privacy and security, which prevent malicious actors from trying to cheat or abuse the system. Different alternatives exist to manage all processes associated with the operation of a blockchain network, and among them, providers of blockchain as a service have emerged. Chainstack stands out for its simplicity and scalability features to deploy and operate a blockchain network. Our goal in this work is to create a solution for secure storage of and access to a task-scheduling scheme on a consortium blockchain and the InterPlanetary File System, as a proof of concept to demonstrate its potential and usability.
  5. With the rapid development of blockchain platforms, it is important that different implementations are tested and analyzed for comparative purposes. One such implementation is Hyperledger Sawtooth, a new member of the Hyperledger family. Sawtooth blockchain is a permissioned implementation developed in part by Intel. While research has been done on Hyperledger Fabric, research on Sawtooth is not well documented. Using the Hyperledger Caliper benchmarking tool, we aim to test the performance of the blockchain and identify potential issues.