(SM-2601) Hadoop

(SM-2601) Hadoop

Hadoop is an open-source distributed processing framework that manages data processing and storage for big data applications running in clustered systems. It can handle various forms of structured and unstructured data, giving users more flexibility for collecting, processing, and analyzing data than relational databases and data warehouses provide.

More information can be found in the chapter Apache Hadoop.

Supported versions

Hadoop Components

The Hadoop ecosystem refers to the various components of the Apache Hadoop software library, the accessories and tools provided by the Apache Software Foundation for these types of software projects, and the ways that they work together.

In this section:

In Kyano Storage Management, the following components of the Hadoop ecosystem are used:

Component

Supported versions

Component

Supported versions

Hive

1.1.0 and higher

Impala

2.3.0 and higher

WebHDFS

any

SNP is Cloudera certified.

Setup

For instructions on setting up Hadoop storage, see the chapter Hadoop Storage Setup.

Detailed technical information

For more detailed technical information regarding communication between SAP and Hadoop, see the chapter Communication Between SAP and Hadoop