Security Architecture for Apache Hadoop


Over the years, there has been a growing demand for a robust security framework for Apache Hadoop. Given the massive amount of data that cluster nodes hold, there is an increasing need to focus on security architecture for the cluster. Further, there is growing awareness of the regulatory and legal norms that enterprise firms must comply with.

hadoopsphere.com presents below a security architecture that can be adapted to your Apache Hadoop cluster. The tools used may range from off-the-shelf utilities to custom in-house monitoring programs. Each firm, depending on its business use case, must put the necessary guards and checks in place to protect its Hadoop nodes. The following 10 components should serve as your discussion guide when implementing a security architecture for Apache Hadoop.


Key components required in a security architecture for Apache Hadoop:


1. Role-based authorization:
- Ensure separation of duties
- Restrict functional access
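
One concrete way to implement this is with HDFS extended ACLs (available since Hadoop 2.4 and enabled via dfs.namenode.acls.enabled). Below is a minimal Java sketch; the group name and path are illustrative, not prescriptive:

```java
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclEntryScope;
import org.apache.hadoop.fs.permission.AclEntryType;
import org.apache.hadoop.fs.permission.FsAction;

public class GrantAnalystAccess {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // Grant the "analysts" group read-only access to a data set
        // without widening its base POSIX permissions.
        AclEntry analystRead = new AclEntry.Builder()
                .setScope(AclEntryScope.ACCESS)
                .setType(AclEntryType.GROUP)
                .setName("analysts")
                .setPermission(FsAction.READ_EXECUTE)
                .build();

        fs.modifyAclEntries(new Path("/data/warehouse"),
                Arrays.asList(analystRead));
    }
}
```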

2. Administration and configuration:
- Role-based administration
- Configurable node and cluster parameters
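
As a small illustration, node and cluster parameters can be layered and inspected through Hadoop's Configuration API; the file location below is conventional and purely illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class ShowSecuritySettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Site-specific overrides are layered on top of the shipped defaults.
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));

        // Read back the effective security-related parameters.
        System.out.println("authentication = "
                + conf.get("hadoop.security.authentication", "simple"));
        System.out.println("authorization  = "
                + conf.getBoolean("hadoop.security.authorization", false));
    }
}
```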

3. Authentication framework:
- Validate nodes
- Validate client applications for access to the cluster and MapReduce jobs
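
In practice this usually means Kerberos. Below is a minimal sketch using Hadoop's UserGroupInformation API; the service principal and keytab path are illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosLogin {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Tell the Hadoop client libraries that the cluster requires Kerberos.
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // A keytab lets unattended jobs authenticate without an
        // interactive kinit; principal and path are illustrative.
        UserGroupInformation.loginUserFromKeytab(
                "etl-service@EXAMPLE.COM", "/etc/security/keytabs/etl.keytab");

        // All subsequent cluster access runs as the authenticated principal.
        System.out.println("Logged in as: " + UserGroupInformation.getLoginUser());
    }
}
```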

4. Audit log:
- Log transactions
- Log activities
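
As an illustration, the sketch below parses the default HDFS NameNode audit log format (allowed=... ugi=... cmd=... src=...) and flags denied operations; the sample line is made up for the example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AuditLineParser {
    // Matches the key fields of a default HDFS NameNode audit entry.
    private static final Pattern AUDIT = Pattern.compile(
            "allowed=(\\S+)\\s+ugi=(\\S+).*?cmd=(\\S+)\\s+src=(\\S+)");

    public static void main(String[] args) {
        String line = "allowed=false ugi=mallory (auth:SIMPLE) ip=/10.0.0.9 "
                + "cmd=delete src=/data/warehouse dst=null perm=null";
        Matcher m = AUDIT.matcher(line);
        if (m.find() && "false".equals(m.group(1))) {
            // Denied operations are prime candidates for alerting.
            System.out.printf("DENIED: user=%s cmd=%s path=%s%n",
                    m.group(2), m.group(3), m.group(4));
        }
    }
}
```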

5. Alerts:
- Real-time alerting
- Continuous monitoring
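
One lightweight approach is polling the JSON metrics every Hadoop daemon publishes through its built-in /jmx servlet. A rough sketch follows, assuming an illustrative NameNode address (50070 is the default web port in Hadoop 2.x); a production monitor would use a proper JSON parser and run on a schedule:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameNodeAlert {
    public static void main(String[] args) throws Exception {
        // Query the NameNode's FSNamesystem metrics over the /jmx servlet.
        URL url = new URL("http://namenode.example.com:50070/jmx"
                + "?qry=Hadoop:service=NameNode,name=FSNamesystem");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(5000);

        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
        }

        // A real monitor would use a JSON parser; a regex keeps the
        // sketch dependency-free.
        Matcher m = Pattern.compile("\"MissingBlocks\"\\s*:\\s*(\\d+)")
                .matcher(body);
        if (m.find() && Long.parseLong(m.group(1)) > 0) {
            System.out.println("ALERT: " + m.group(1) + " missing blocks");
        }
    }
}
```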

6. File encryption:
- Protect private information (SPI/BPI)
- Comply with regulatory norms
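
HDFS transparent encryption (available since Hadoop 2.6) is one way to achieve this: files under an encryption zone are encrypted on write and decrypted on read without any change to application code. In the sketch below, the NameNode URI is illustrative and the key pii-key is assumed to already exist in the configured key provider (see the next item):

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.client.HdfsAdmin;

public class CreateEncryptedZone {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        HdfsAdmin admin = new HdfsAdmin(
                new URI("hdfs://namenode.example.com:8020"), conf);

        // Everything written under this directory is encrypted with
        // keys derived from the named key in the key provider.
        admin.createEncryptionZone(new Path("/data/pii"), "pii-key");
        System.out.println("Encryption zone created at /data/pii");
    }
}
```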

7. Key/certificate server:
- Central key management server to manage different keys for different files
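
In the Hadoop ecosystem this role is typically played by the Hadoop KMS, accessed through the KeyProvider API. A sketch follows; the KMS address and key name are illustrative:

```java
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.crypto.key.KeyProvider;
import org.apache.hadoop.crypto.key.KeyProviderFactory;

public class CreateKeyOnKms {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point Hadoop clients at a central Key Management Server;
        // the address is illustrative.
        conf.set("hadoop.security.key.provider.path",
                "kms://http@kms.example.com:9600/kms");

        List<KeyProvider> providers = KeyProviderFactory.getProviders(conf);
        KeyProvider kms = providers.get(0);

        // Create a named key whose versions the server manages centrally;
        // different files or zones can reference different key names.
        kms.createKey("pii-key", KeyProvider.options(conf));
        kms.flush();
        System.out.println("Created key: pii-key");
    }
}
```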

8. Network security:
- Ensure secure communications between nodes, applications and other interfaces
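
In Hadoop terms this typically means SASL protection on RPC plus encrypted block data transfer. The sketch below sets the relevant properties programmatically for illustration; in production they belong in core-site.xml and hdfs-site.xml on every node:

```java
import org.apache.hadoop.conf.Configuration;

public class WireEncryptionSettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // "privacy" is the strongest SASL quality-of-protection level:
        // RPC traffic is authenticated, integrity-checked and encrypted.
        conf.set("hadoop.rpc.protection", "privacy");

        // RPC protection alone does not cover bulk block streams;
        // this property encrypts the HDFS data transfer protocol too.
        conf.setBoolean("dfs.encrypt.data.transfer", true);

        System.out.println("rpc protection = " + conf.get("hadoop.rpc.protection"));
    }
}
```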

9. Resource-slim:
- Minimal network consumption
- Minimal consumption of system resources (threads, processes)

10. Universal:
- Hadoop agnostic – compatible across distributions
- Heterogeneous support – compatible across the ecosystem



© hadoopsphere.com