(SM-1811) Azure Data Lake

This guide describes the process required to establish a connection from Datavard Storage Management to Azure Data Lake Storage (ADLS). This storage can serve as a target for Glue extraction to CSV files, or as a storage layer for a Big Data solution such as Azure Databricks or Azure HDInsight.

Prerequisites

  • Azure Data Lake Storage account
  • Reuse Library transports imported
  • SAP NetWeaver 7.10+
  • Network connection between SAP system and Azure environment

Steps on Azure

These steps are usually performed by the customer in preparation for the implementation.

Creating Application Registration

The application registration is used for authentication to ADLS. To create a new application registration, follow these steps:

  1. Go to Azure Active Directory → App registrations → New application registration
  2. Fill in the required fields and click Create
  3. Note down the Application ID; it is required later during Storage Management configuration.
  4. Click Settings → Required permissions and add a permission for Azure Data Lake.
  5. Click Keys and generate a new key. Note down the key, as it is used later during configuration.

Creating landing folder 

The directory that will hold all new files extracted from the SAP system must be created up front with the correct permissions.

  1. Go to the Data explorer of your ADLS resource and click New Folder in the desired location
  2. Open the new folder, click Access, and grant access to the App Registration created in the previous section

Tenant

The unique identifier of your organization (the tenant ID) is also required for authentication. To get this value, follow the steps below:

  1. Go to Azure Active Directory → App Registrations → Endpoints
  2. Copy the OAUTH 2.0 AUTHORIZATION ENDPOINT and extract the ID part from it, that is, the path segment between the host name and /oauth2 (for example: https://login.microsoftonline.com/6fdc3117-ec29-4d73-8b33-028c513372/oauth2/authorize)
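Extracting the ID part by hand is easy to get wrong, so the step can be illustrated with a small Python sketch. It assumes the endpoint has the shape shown in the example above, with the tenant ID as the first path segment; the function name extract_tenant_id is hypothetical and not part of Storage Management.

```python
from urllib.parse import urlparse


def extract_tenant_id(endpoint_url: str) -> str:
    """Return the first path segment of an Azure AD endpoint URL.

    Assumption: the tenant ID is the segment directly after the host,
    as in the example endpoint from this guide.
    """
    parsed = urlparse(endpoint_url)
    segments = [s for s in parsed.path.split("/") if s]
    if not segments:
        raise ValueError(f"No tenant segment found in {endpoint_url!r}")
    return segments[0]


# Example endpoint copied from this guide.
endpoint = (
    "https://login.microsoftonline.com/"
    "6fdc3117-ec29-4d73-8b33-028c513372/oauth2/authorize"
)
tenant = extract_tenant_id(endpoint)
print(tenant)
```

The printed value is what goes into the TENANT field of the authentication profile later in this guide.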

Steps on SAP

After the preparation on the Azure side is complete, fill in the required information on the SAP side to establish the connection.

STRUST

The root CA certificate used by Microsoft must be present in STRUST to establish a secure SSL connection.

  1. Using your web browser, export the CA public certificate to a file.
  2. In STRUST, import this certificate into the SSL client (Anonymous) PSE.
  3. Go to transaction SMICM and restart the ICM service.


RFCs

Two RFC destinations must be created in SM59 to establish the connection to ADLS.

  1. Create an RFC destination of type G for Microsoft Active Directory with the target host set to login.microsoftonline.com. This destination represents the connection to the authority server that grants the authentication token for ADLS. Set SSL to active and the certificate list to ANONYM.

  2. Create an RFC destination of type G for Azure Data Lake. Set the target host to your ADLS address (e.g. clazuhdi02.azuredatalakestore.net) and the path prefix to /webhdfs/v1/<Path to landing folder>. Set SSL to active and the certificate list to ANONYM.
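To make the relationship between target host, path prefix, and the resulting request URL concrete, here is a minimal Python sketch. It only composes strings and sends nothing. The landing folder sap/landing and both function names are hypothetical; op=LISTSTATUS is a standard WebHDFS operation chosen for illustration, not necessarily the call Storage Management issues.

```python
from urllib.parse import quote, urlencode


def path_prefix(landing_folder: str) -> str:
    """Compose the SM59 path prefix for a given landing folder."""
    # Leading/trailing slashes are dropped so the prefix has exactly one
    # separator after /webhdfs/v1; special characters are percent-encoded.
    return "/webhdfs/v1/" + quote(landing_folder.strip("/"))


def build_webhdfs_url(host: str, landing_folder: str,
                      op: str = "LISTSTATUS") -> str:
    """Compose the HTTPS URL a WebHDFS client would call for the folder."""
    return f"https://{host}{path_prefix(landing_folder)}?{urlencode({'op': op})}"


# Host taken from the example in this guide; the folder is a placeholder.
print(path_prefix("/sap/landing"))
print(build_webhdfs_url("clazuhdi02.azuredatalakestore.net", "/sap/landing"))
```

Opening a URL composed this way without a bearer token should be rejected by ADLS, which is still a quick way to confirm that the host name and path are reachable.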


Authentication profile

An authentication profile containing the login information must be created in table /DVD/OAUTH_CONF. Fill the fields as follows:

OAUTH_PROFILE is any value chosen by the customer to identify the profile used for authentication

CLIENT_ID is the Application ID noted in section Creating Application Registration

CLIENT_SECRET is the key generated in section Creating Application Registration

GRANT_TYPE is the fixed value client_credentials

RESOURCE is the fixed value https://datalake.azure.net/

TENANT is the identifier obtained in section Tenant

URL is left blank
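The profile fields map onto an OAuth 2.0 client credentials token request. The Python sketch below shows how such a request could be composed from the profile; it builds the URL and form body but does not send anything. It assumes the Azure AD v1 token endpoint https://login.microsoftonline.com/<tenant>/oauth2/token. How Storage Management composes the request internally is not described in this guide, and the client ID and secret shown are placeholders.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode


def build_token_request(profile: Dict[str, str]) -> Tuple[str, bytes]:
    """Compose the token endpoint URL and form-encoded body for a profile.

    Assumption: Azure AD v1 endpoint layout (/<tenant>/oauth2/token).
    """
    url = f"https://login.microsoftonline.com/{profile['TENANT']}/oauth2/token"
    body = urlencode({
        "grant_type": profile["GRANT_TYPE"],
        "client_id": profile["CLIENT_ID"],
        "client_secret": profile["CLIENT_SECRET"],
        "resource": profile["RESOURCE"],
    }).encode("ascii")
    return url, body


# Placeholder values; TENANT reuses the example from section Tenant.
example_profile = {
    "CLIENT_ID": "00000000-0000-0000-0000-000000000000",  # hypothetical
    "CLIENT_SECRET": "example-secret",                    # hypothetical
    "GRANT_TYPE": "client_credentials",
    "RESOURCE": "https://datalake.azure.net/",
    "TENANT": "6fdc3117-ec29-4d73-8b33-028c513372",
}
url, body = build_token_request(example_profile)
print(url)
print(body.decode("ascii"))
```

Sending this body as an HTTP POST with content type application/x-www-form-urlencoded from a test client is one way to check the client ID, key, and tenant independently of the SAP configuration.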

Linking Authentication profile

The next step is to link the authentication profile with the RFC destinations created earlier. This is done in table /DVD/HDP_AUT_OA2.

Setting Authentication method

The authentication method must be set to OAUTH2.0 in table /DVD/HDP_CUS_C.

Creating storage in Datavard Storage Management

After the configuration is done, define a storage that will serve as the target for extraction.

  1. Go to transaction /DVD/SM_SETUP
  2. Switch to Edit Mode and click New Storage
  3. Create a new storage of type HADOOP and fill in the RFC destination