(Glue-2311) Hadoop Prerequisites

High-level summary of necessary steps (more details can be found in the following documentation):

SAP Basis/Infrastructure tasks:

SAP Basis team:

  1. Create directories on each SAP application server (steps 1-3 are sketched as shell commands after this list):

    1. /sapmnt/<SID>/global/security/dvd_conn/hive

    2. /sapmnt/<SID>/global/security/dvd_conn/impala

  2. Upload the latest JDBC drivers for Hive and Impala to respective directories on each SAP application server:

    1. for Hive: /sapmnt/<SID>/global/security/dvd_conn/hive

    2. for Impala: /sapmnt/<SID>/global/security/dvd_conn/impala

  3. Upload the files provided by the Hadoop team to the directory /sapmnt/<SID>/global/security/dvd_conn:

    1. <SID>.keytab

    2. krb5.conf

    3. jssecacerts

  4. Import the .pem certificates for HttpFS/WebHDFS into STRUST.

  5. Enable HTTPS service in ICM.

  6. Create RFC destination for communication with HttpFS/WebHDFS service.

  7. Set the profile parameter ict/disable_cookie_urlencoding to 1 or 2, depending on the SAP kernel release (an illustrative profile fragment for steps 5 and 7 follows this list).

  8. Deploy the SNP Java Connector.

    1. Create the SNP_JAVA_CONN RFC destination:

      1. Connection type: TCP/IP Connection

      2. Activation type: Registered Server Program

      3. Fill in the Program ID, for example: SNP_JAVA_CONN

    2. Create a technical SAP user for ABAP -> JCo RFC communication:

      1. The transport provides a template role named /DVD/JCORFC (if you use this role, do not forget to generate it in transaction PFCG).

      2. Create the user with type System and assign it the role /DVD/JCORFC.

    3. Grant the external program SNP_JAVA_CONN access to the SAP gateway:

      1. In SMGW, choose Goto > Expert Functions > External Security > Maintenance of ACL Files and check whether external programs are allowed (either * or the specific Program ID; an example reginfo entry follows this list).
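
A minimal shell sketch for steps 1-3, assuming a shared /sapmnt/<SID>/global and execution as the <sid>adm user; the JDBC driver file names are illustrative:

  # Step 1: create the connector directories
  mkdir -p /sapmnt/<SID>/global/security/dvd_conn/hive
  mkdir -p /sapmnt/<SID>/global/security/dvd_conn/impala

  # Step 2: place the JDBC drivers (actual jar names depend on the driver version)
  cp HiveJDBC42.jar   /sapmnt/<SID>/global/security/dvd_conn/hive/
  cp ImpalaJDBC42.jar /sapmnt/<SID>/global/security/dvd_conn/impala/

  # Step 3: place the files provided by the Hadoop team
  cp <SID>.keytab krb5.conf jssecacerts /sapmnt/<SID>/global/security/dvd_conn/

  # Restrict access to the keytab (assumption: only <sid>adm needs to read it)
  chmod 600 /sapmnt/<SID>/global/security/dvd_conn/<SID>.keytab

If /sapmnt/<SID>/global is shared across the application servers, the copy needs to be done only once.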
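
For steps 5 and 7, an illustrative instance profile fragment; the port number and the parameter value are assumptions to be adapted to the landscape:

  # Step 5: open an HTTPS port in ICM
  icm/server_port_1 = PROT=HTTPS,PORT=44300

  # Step 7: value 1 or 2, depending on the SAP kernel release
  ict/disable_cookie_urlencoding = 1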
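
For step 8.3, a minimal reginfo (gateway ACL) entry that allows the external program to register; HOST=* is a permissive assumption and should be narrowed for production:

  # Allow registration of the SNP Java Connector program at the gateway
  P TP=SNP_JAVA_CONN HOST=* ACCESS=* CANCEL=*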

Networking team:

  1. Enable correct hostname resolution of the Hadoop nodes and the Kerberos KDC from the SAP system.

  2. Enable outbound communication from the SAP system to the Hadoop services (a quick connectivity check is sketched after the table):

Port    Service
10000   Hive
14000   HttpFS
21050   Impala
50070   WebHDFS (may be required if HttpFS is not available)
88      Kerberos Key Distribution Center (KDC)
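
A quick way to verify the connectivity from an SAP application server, sketched with netcat; the hostnames are placeholders:

  nc -vz <hive_host> 10000      # Hive
  nc -vz <httpfs_host> 14000    # HttpFS
  nc -vz <impala_host> 21050    # Impala
  nc -vz <webhdfs_host> 50070   # WebHDFS (if HttpFS is not available)
  nc -vz <kdc_host> 88          # Kerberos KDC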


Hadoop team:

  1. Create a Kerberos principal for each SAP system (<sid>hdp) and provide its keytab file to the SAP Basis team; this user is further referred to as the Hadoop technical user. (A provisioning sketch for steps 1, 5, 6 and 7 follows this list.)

  2. Provide the krb5.conf file to the SAP Basis team.

  3. Provide the Java truststore (jssecacerts) from the Hadoop environment to the SAP Basis team.

  4. Provide the .pem certificate of the HttpFS/WebHDFS service to the SAP Basis team.

  5. Create a home directory for each technical <sid>hdp user with appropriate ownership/permissions (e.g. /user/<sid>hdp).

    1. This can be tested by: curl --negotiate -u : -k "https://<HttpFS_host_FQDN>:14000/webhdfs/v1/?op=GETHOMEDIRECTORY".

  6. Create a Hive database for each SAP system (sap<sid>).

  7. Create Sentry/Ranger policies granting full privileges to the <sid>hdp user on the sap<sid> database and on the user's HDFS directory (e.g. /user/<sid>hdp).

  8. Test access of the Hadoop technical user (see the test sketch after this list).
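
A provisioning sketch for steps 1, 5, 6 and 7, assuming an MIT Kerberos KDC and the Sentry SQL syntax (Ranger policies are usually maintained in the Ranger UI or via its REST API instead); <REALM> and <hive_jdbc_url> are placeholders:

  # Step 1: Kerberos principal and keytab for the SAP system
  kadmin -q "addprinc -randkey <sid>hdp@<REALM>"
  kadmin -q "xst -k <SID>.keytab <sid>hdp@<REALM>"

  # Step 5: HDFS home directory with appropriate ownership
  hdfs dfs -mkdir -p /user/<sid>hdp
  hdfs dfs -chown <sid>hdp /user/<sid>hdp

  # Step 6: Hive database for the SAP system
  beeline -u "<hive_jdbc_url>" -e "CREATE DATABASE IF NOT EXISTS sap<sid>;"

  # Step 7 (Sentry variant): role with full privileges, granted to the user's group
  beeline -u "<hive_jdbc_url>" -e "CREATE ROLE sap<sid>_role; GRANT ALL ON DATABASE sap<sid> TO ROLE sap<sid>_role; GRANT ROLE sap<sid>_role TO GROUP <sid>hdp;"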
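
A test sketch for step 8, reusing the keytab and the curl call from step 5; the Hive JDBC URL and the service principal are assumptions:

  # Authenticate as the Hadoop technical user
  kinit -kt <SID>.keytab <sid>hdp@<REALM>

  # HDFS access via HttpFS
  curl --negotiate -u : -k "https://<HttpFS_host_FQDN>:14000/webhdfs/v1/?op=GETHOMEDIRECTORY"

  # Hive access to the sap<sid> database
  beeline -u "jdbc:hive2://<hive_host>:10000/sap<sid>;principal=hive/_HOST@<REALM>" -e "SHOW TABLES;"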