skip to main content


Title: An Attribute-Based Access Control Model for Secure Big Data Processing in Hadoop Ecosystem
Apache Hadoop is a predominant software framework for distributed compute and storage with capability to handle huge amounts of data, usually referred to as Big Data. This data collected from different enterprises and government agencies often includes private and sensitive information, which needs to be secured from unauthorized access. This paper proposes extensions to the current authorization capabilities offered by Hadoop core and other ecosystem projects, specifically Apache Ranger and Apache Sentry. We present a fine-grained attribute-based access control model, referred as HeABAC, catering to the security and privacy needs of multi-tenant Hadoop ecosystem. The paper reviews the current multi-layered access control model used primarily in Hadoop core (2.x), Apache Ranger (version 0.6) and Sentry (version 1.7.0), as well as a previously proposed RBAC extension (OT-RBAC). It then presents a formal attribute-based access control model for Hadoop ecosystem, including the novel concept of cross Hadoop services trust. It further highlights different trust scenarios, presents an implementation approach for HeABAC using Apache Ranger and, discusses the administration requirements of HeABAC operational model. Some comprehensive, real-world use cases are also discussed to reflect the application and enforcement of the proposed HeABAC model in Hadoop ecosystem.  more » « less
Award ID(s):
1736209 1538418 1423481 1111925
PAR ID:
10072092
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
ABAC’18: 3rd ACM Workshop on Attribute-Based Access Control, March 19–21, 2018, Tempe, AZ,
Page Range / eLocation ID:
13 to 24
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. There is an increasing demand for processing large volumes of unstructured data for a wide variety of applications. However, protection measures for these big data sets are still in their infancy, which could lead to significant security and privacy issues. Attribute-based access control (ABAC) provides a dynamic and flexible solution that is effective for mediating access. We analyzed and implemented a prototype application of ABAC to large dataset processing in Amazon Web Services, using open-source versions of Apache Hadoop, Ranger, and Atlas. The Hadoop ecosystem is one of the most popular frameworks for large dataset processing and storage and is adopted by major cloud service providers. We conducted a rigorous analysis of cybersecurity in implementing ABAC policies in Hadoop, including developing a synthetic dataset of information at multiple sensitivity levels that realistically represents healthcare and connected social media data. We then developed Apache Spark programs that extract, connect, and transform data in a manner representative of a realistic use case. Our result is a framework for securing big data. Applying this framework ensures that serious cybersecurity concerns are addressed. We provide details of our analysis and experimentation code in a GitHub repository for further research by the community.

     
    more » « less
  2. Smart homes are interconnected homes in which a wide variety of digital devices with limited resources communicate with multiple users and among themselves using multiple protocols. The deployment of resource-limited devices and the use of a wide range of technologies expand the attack surface and position the smart home as a target for many potential security threats. Access control is among the top security challenges in smart home IoT. Several access control models have been developed or adapted for IoT in general, with a few specifically designed for the smart home IoT domain. Most of these models are built on the role-based access control (RBAC) model or the attribute-based access control (ABAC) model. However, recently some researchers demonstrated that the need arises for a hybrid model combining ABAC and RBAC, thereby incorporating the benefits of both models to better meet IoT access control challenges in general and smart homes requirements in particular. In this paper, we used two approaches to develop two different hybrid models for smart home IoT. We followed a role-centric approach and an attribute-centric approach to develop HyBAC RC and HyBAC AC , respectively. We formally define these models and illustrate their features through a use case scenario demonstration. We further provide a proof-of-concept implementation for each model in Amazon Web Services (AWS) IoT platform. Finally, we conduct a theoretical comparison between the two models proposed in this paper in addition to the EGRBAC model (RBAC model for smart home IoT) and HABAC model (ABAC model for smart home IoT), which were previously developed to meet smart homes’ challenges. 
    more » « less
  3. Smart homes are interconnected homes in which a wide variety of digital devices with limited resources communicate with multiple users and among themselves using multiple protocols. The deployment of resource-limited devices and the use of a wide range of technologies expand the attack surface and position the smart home as a target for many potential security threats. Access control is among the top security challenges in smart home IoT. Several access control models have been developed or adapted for IoT in general, with a few specifically designed for the smart home IoT domain. Most of these models are built on the role-based access control (RBAC) model or the attribute-based access control (ABAC) model. However, recently some researchers demonstrated that the need arises for a hybrid model combining ABAC and RBAC, thereby incorporating the benefits of both models to better meet IoT access control challenges in general and smart homes requirements in particular. In this paper, we used two approaches to develop two different hybrid models for smart home IoT. We followed a role-centric approach and an attribute-centric approach to develop HyBAC RC and HyBAC AC , respectively. We formally define these models and illustrate their features through a use case scenario demonstration. We further provide a proof-of-concept implementation for each model in Amazon Web Services (AWS) IoT platform. Finally, we conduct a theoretical comparison between the two models proposed in this paper in addition to the EGRBAC model (RBAC model for smart home IoT) and HABAC model (ABAC model for smart home IoT), which were previously developed to meet smart homes’ challenges. 
    more » « less
  4. The area of smart homes is one of the most popular for deploying smart connected devices. One of the most vulnerable aspects of smart homes is access control. Recent advances in IoT have led to several access control models being developed or adapted to IoT from other domains, with few specifically designed to meet the challenges of smart homes. Most of these models use role-based access control (RBAC) or attribute-based access control (ABAC) models. As of now, it is not clear what the advantages and disadvantages of ABAC over RBAC are in general, and in the context of smart-home IoT in particular. In this paper, we introduce HABACα, an attribute-based access control model for smart-home IoT. We formally define HABACα and demonstrate its features through two use-case scenarios and a proof-of-concept implementation. Furthermore, we present an analysis of HABACα as compared to the previously published EGRBAC (extended generalized role-based access control) model for smart-home IoT by first describing approaches for constructing HABACα specification from EGRBAC and vice versa in order to compare the theoretical expressiveness power of these models, and second, analyzing HABACα and EGRBAC models against standard criteria for access control models. Our findings suggest that a hybrid model that combines both HABACα and EGRBAC capabilities may be the most suitable for smart-home IoT, and probably more generally. 
    more » « less
  5. It is important in software development to enforce proper restrictions on protected services and resources. Typically software services can be accessed through REST API endpoints where restrictions can be applied using the Role-Based Access Control (RBAC) model. However, RBAC policies can be inconsistent across services, and they require proper assessment. Currently, developers use penetration testing, which is a costly and cumbersome process for a large number of APIs. In addition, modern applications are split into individual microservices and lack a unified view in order to carry out automated RBAC assessment. Often, the process of constructing a centralized perspective of an application is done using Systematic Architecture Reconstruction (SAR). This article presents a novel approach to automated SAR to construct a centralized perspective for a microservice mesh based on their REST communication pattern. We utilize the generated views from SAR to propose an automated way to find RBAC inconsistencies. 
    more » « less