There is an increasing demand for processing large volumes of unstructured data for a wide variety of applications. However, protection measures for these big data sets are still in their infancy, which could lead to significant security and privacy issues. Attribute-based access control (ABAC) provides a dynamic and flexible solution that is effective for mediating access. We analyzed and implemented a prototype application of ABAC to large dataset processing in Amazon Web Services, using open-source versions of Apache Hadoop, Ranger, and Atlas. The Hadoop ecosystem is one of the most popular frameworks for large dataset processing and storage and is adopted by major cloud service providers. We conducted a rigorous analysis of cybersecurity in implementing ABAC policies in Hadoop, including developing a synthetic dataset of information at multiple sensitivity levels that realistically represents healthcare and connected social media data. We then developed Apache Spark programs that extract, connect, and transform data in a manner representative of a realistic use case. Our result is a framework for securing big data. Applying this framework ensures that serious cybersecurity concerns are addressed. We provide details of our analysis and experimentation code in a GitHub repository for further research by the community.
An Attribute-Based Access Control Model for Secure Big Data Processing in Hadoop Ecosystem
Apache Hadoop is a predominant software framework for distributed compute and storage, with the capability to handle huge amounts of data, usually referred to as Big Data. This data, collected from different enterprises and government agencies, often includes private and sensitive information that must be secured against unauthorized access. This paper proposes extensions to the current authorization capabilities offered by Hadoop core and other ecosystem projects, specifically Apache Ranger and Apache Sentry. We present a fine-grained attribute-based access control model, referred to as HeABAC, that caters to the security and privacy needs of a multi-tenant Hadoop ecosystem. The paper reviews the current multi-layered access control model used primarily in Hadoop core (2.x), Apache Ranger (version 0.6), and Sentry (version 1.7.0), as well as a previously proposed RBAC extension (OT-RBAC). It then presents a formal attribute-based access control model for the Hadoop ecosystem, including the novel concept of cross Hadoop services trust. It further highlights different trust scenarios, presents an implementation approach for HeABAC using Apache Ranger, and discusses the administration requirements of the HeABAC operational model. Several comprehensive, real-world use cases are also discussed to reflect the application and enforcement of the proposed HeABAC model in the Hadoop ecosystem.
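To make the idea of an attribute-based decision concrete, the following is a minimal Python sketch of a default-deny ABAC check. It is not the HeABAC formal model and does not use the Apache Ranger or Sentry APIs. The `Rule` type, the `is_permitted` function, and the attribute names (`tenant`, `clearance`, `sensitivity`, `owner`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Attributes are plain key/value maps (hypothetical representation).
Attrs = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    """One illustrative ABAC rule: permit `action` when `condition` holds."""
    action: str
    condition: Callable[[Attrs, Attrs], bool]


def same_tenant(user: Attrs, obj: Attrs) -> bool:
    # Both sides must carry a tenant attribute, and the values must match.
    return "tenant" in user and user.get("tenant") == obj.get("tenant")


def can_read(user: Attrs, obj: Attrs) -> bool:
    # Unlabeled objects are denied: sensitivity must be present.
    if "sensitivity" not in obj:
        return False
    return same_tenant(user, obj) and user.get("clearance", 0) >= obj["sensitivity"]


def can_write(user: Attrs, obj: Attrs) -> bool:
    # Only the owner within the same tenant may write.
    return same_tenant(user, obj) and obj.get("owner") == user.get("name")


def is_permitted(rules: List[Rule], user: Attrs, obj: Attrs, action: str) -> bool:
    """Default deny: permit only if some rule for this action matches."""
    return any(r.action == action and r.condition(user, obj) for r in rules)


# Hypothetical policy and attribute assignments.
RULES = [Rule("read", can_read), Rule("write", can_write)]

alice = {"name": "alice", "tenant": "finance", "clearance": 2}
mallory = {"name": "mallory", "tenant": "marketing", "clearance": 3}
ledger = {"tenant": "finance", "sensitivity": 2, "owner": "bob"}

print(is_permitted(RULES, alice, ledger, "read"))    # same tenant, clearance suffices
print(is_permitted(RULES, alice, ledger, "write"))   # alice is not the owner
print(is_permitted(RULES, mallory, ledger, "read"))  # different tenant
```

In this sketch the decision depends only on the attributes of the user and the object, so changing an attribute value changes the outcome without touching the rules. That is the property that distinguishes an attribute-based model from a purely role-based one.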
- PAR ID: 10072092
- Date Published:
- Journal Name: ABAC’18: 3rd ACM Workshop on Attribute-Based Access Control, March 19–21, 2018, Tempe, AZ
- Page Range / eLocation ID: 13 to 24
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Smart homes are interconnected homes in which a wide variety of resource-limited digital devices communicate with multiple users and with one another using multiple protocols. The deployment of resource-limited devices and the use of a wide range of technologies expand the attack surface and make the smart home a target for many potential security threats. Access control is among the top security challenges in smart home IoT. Several access control models have been developed or adapted for IoT in general, with a few specifically designed for the smart home IoT domain. Most of these models are built on the role-based access control (RBAC) model or the attribute-based access control (ABAC) model. However, some researchers have recently demonstrated the need for a hybrid model combining ABAC and RBAC, thereby incorporating the benefits of both models to better meet IoT access control challenges in general and smart home requirements in particular. In this paper, we used two approaches to develop two different hybrid models for smart home IoT. We followed a role-centric approach and an attribute-centric approach to develop HyBAC RC and HyBAC AC, respectively. We formally define these models and illustrate their features through a use case scenario demonstration. We further provide a proof-of-concept implementation for each model on the Amazon Web Services (AWS) IoT platform. Finally, we conduct a theoretical comparison between the two models proposed in this paper and the EGRBAC model (an RBAC model for smart home IoT) and the HABAC model (an ABAC model for smart home IoT), which were previously developed to meet smart home challenges.
- The area of smart homes is one of the most popular for deploying smart connected devices. One of the most vulnerable aspects of smart homes is access control. Recent advances in IoT have led to several access control models being developed or adapted to IoT from other domains, with few specifically designed to meet the challenges of smart homes. Most of these models use role-based access control (RBAC) or attribute-based access control (ABAC) models. As of now, it is not clear what the advantages and disadvantages of ABAC over RBAC are in general, and in the context of smart-home IoT in particular. In this paper, we introduce HABACα, an attribute-based access control model for smart-home IoT. We formally define HABACα and demonstrate its features through two use-case scenarios and a proof-of-concept implementation. Furthermore, we present an analysis of HABACα as compared to the previously published EGRBAC (extended generalized role-based access control) model for smart-home IoT. First, we describe approaches for constructing a HABACα specification from EGRBAC and vice versa in order to compare the theoretical expressive power of these models. Second, we analyze the HABACα and EGRBAC models against standard criteria for access control models. Our findings suggest that a hybrid model that combines both HABACα and EGRBAC capabilities may be the most suitable for smart-home IoT, and probably more generally.
- In today's mobile-first, cloud-enabled world, where simulation-enabled training is designed for use anywhere and from multiple types of devices, new paradigms are needed to control access to sensitive data. Large, distributed data sets sourced from a wide variety of sensors require advanced approaches to authorization and access control (AC). Motivated by large-scale, publicized data breaches and data privacy laws, data protection policies and fine-grained AC mechanisms are imperative in data-intensive simulation systems. Although the public may suffer security incident fatigue, there are significant impacts on corporations and government organizations in the form of settlement fees and senior executive dismissal. This paper presents an analysis of the challenges of controlling access to big data sets. Implementation guidelines are provided based upon new attribute-based access control (ABAC) standards. Best practices start with AC for the security of large data sets processed by models and simulations (M&S). The currently widely supported eXtensible Access Control Markup Language (XACML) is the predominant framework for big data ABAC. The more recently developed Next Generation Access Control (NGAC) standard addresses additional areas in securing distributed, multi-owner big data sets. We present a comparison and evaluation of standards and technologies for different simulation data protection requirements. A concrete example is included to illustrate the differences. The example scenario is based upon synthetically generated, very sensitive health care data combined with less sensitive data. This model data set is accessed by representative groups with a range of trust, from highly trusted roles to general users. The AC security challenges and approaches to mitigate risk are discussed.

