(SM-2111) Azure Data Lake Gen1

This guide describes the process for establishing a connection from Datavard Storage Management to Azure Data Lake Storage (ADLS). This storage can serve as a target for Glue extraction to .csv files or as a storage layer for big data solutions such as Azure Databricks or Azure HDInsight.


Prerequisites

  • Azure Data Lake Storage account
  • A network connection between the SAP system and the Azure environment
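
The network prerequisite can be checked with a plain TCP connection test on port 443, run from a host in the same network as the SAP application server. The following is a minimal Python sketch and not part of Datavard Storage Management. The helper name can_connect is hypothetical, and the ADLS host is the example address used later in this guide. A successful TCP connection does not prove that proxies or TLS inspection will let the HTTPS traffic through.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Hosts taken from this guide; replace the ADLS host with your own account.
    for host in ("login.microsoftonline.com", "clazuhdi02.azuredatalakestore.net"):
        status = "reachable" if can_connect(host) else "NOT reachable"
        print(host, status)
```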

Storage setup

The setup requires steps on both the Azure side and the SAP side. The following sections describe them step by step.

Azure storage configuration

Perform these steps on the Azure side before you start the configuration of the SAP system.

Application registration

The application registration is used to authenticate against ADLS. To register a new application, follow these steps:

  1. Go to Azure Active Directory > App registrations > New application registration.
  2. Fill in the required fields and click Create.
  3. Write down the Application ID and the Directory ID, as they will be required later during the Storage Management configuration.
  4. Click Keys and generate a new key. Write down the key, as it will be used later during the configuration.

Creating a landing folder 

Create a folder where all the new files extracted from the SAP system will be located and set the correct permissions for this folder.

  1. Go to the Data explorer of your ADLS resource and click New Folder in the desired location.
  2. In the new folder, click Access and grant access to the application registered in the previous section.



SAP system configuration

After preparation is complete on the Azure side, fill in the required information on the SAP side to establish a connection.

STRUST

The Microsoft root CA certificate must be imported via the transaction STRUST to enable a secure SSL connection.

  1. Using your web browser, export the CA public certificate to a file, as shown in the figure below.
  2. In STRUST, import this certificate into the SSL Client (Anonymous) PSE.
  3. Go to the transaction SMICM and restart the ICM services, as shown in the figure below.

RFCs

To establish a connection to ADLS, create two RFC destinations in the transaction SM59:

  1. Create an RFC destination of the type G for Azure Active Directory with Target Host set to login.microsoftonline.com. This RFC destination represents the connection to the authority server that grants the authentication token for ADLS. Set SSL to "Active" and Certificate list to "ANONYM SSL Client (Anonymous)".
  2. Create an RFC destination of the type G for Azure Data Lake. Set Target Host to your ADLS address (e.g. clazuhdi02.azuredatalakestore.net) and Path Prefix to /webhdfs/v1/<Path to landing folder>. Set SSL to "Active" and Certificate list to "ANONYM SSL Client (Anonymous)".
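
To illustrate how Target Host and Path Prefix of the second RFC destination combine into a request address, the following Python sketch builds a WebHDFS URL. It is not part of Datavard Storage Management. The function name webhdfs_url and the landing folder sap/landing are hypothetical. LISTSTATUS is a standard WebHDFS operation, but which operations Storage Management actually calls is an assumption.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, landing_folder, relative_path="", op="LISTSTATUS"):
    """Build an HTTPS WebHDFS URL from the values entered in SM59."""
    # Path Prefix as configured in the RFC destination: /webhdfs/v1/<landing folder>
    path = "/webhdfs/v1/" + landing_folder.strip("/")
    if relative_path:
        path += "/" + relative_path.strip("/")
    # quote() keeps "/" and escapes characters that are not URL-safe.
    return "https://" + host + quote(path) + "?" + urlencode({"op": op})


if __name__ == "__main__":
    # Example host from this guide; "sap/landing" is a hypothetical folder.
    print(webhdfs_url("clazuhdi02.azuredatalakestore.net", "/sap/landing"))
```

Opening such a URL with a valid bearer token is one way to confirm that the landing folder is reachable and that the registered application has access to it.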


Authentication profile

The authentication profile contains the login information. Create it in the table /DVD/OAUTH_CONF with the following fields:

  • OAUTH_PROFILE: any value of your choice that identifies the profile used for the authentication
  • CLIENT_ID: the Application ID noted in the section Application registration
  • CLIENT_SECRET: the key generated in the section Application registration. It can be hashed by the report /DVD/XOR_GEN.
  • GRANT_TYPE: the fixed value "client_credentials"
  • RESOURCE: the fixed value "https://datalake.azure.net/"
  • TENANT: the tenant identifier, which is the Directory ID noted in the section Application registration
  • URL: left blank
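
The profile fields correspond to the parameters of an OAuth 2.0 client credentials request against Azure Active Directory. The Python sketch below shows how such a request can be assembled. It is an illustration only, and how Storage Management composes the request internally is an assumption. The function name build_token_request and the placeholder values are hypothetical. The path /oauth2/token is the Azure AD v1 token endpoint.

```python
from urllib.parse import urlencode


def build_token_request(tenant, client_id, client_secret,
                        resource="https://datalake.azure.net/"):
    """Return (url, form_body) for a client credentials token request."""
    # Host matches the Target Host of the first RFC destination.
    url = "https://login.microsoftonline.com/" + tenant + "/oauth2/token"
    body = urlencode({
        "grant_type": "client_credentials",  # GRANT_TYPE
        "client_id": client_id,              # CLIENT_ID
        "client_secret": client_secret,      # CLIENT_SECRET (plain value)
        "resource": resource,                # RESOURCE
    })
    return url, body


if __name__ == "__main__":
    # Placeholder values; use the IDs noted during application registration.
    url, body = build_token_request("<directory-id>", "<application-id>", "<key>")
    print(url)
    print(body)
```

The body is sent as an HTTP POST with the content type application/x-www-form-urlencoded. The response contains an access token that is passed as a bearer token in requests to ADLS.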

Linking authentication profile

The next step is to link the authentication profile with the RFC destinations created earlier. Maintain this link in the table /DVD/HDP_AUT_OA2.

Setting the authentication method

The authentication method needs to be set to OAUTH2.0 in the table /DVD/HDP_CUS_C.

Creating storage in Datavard Storage Management

After the configuration is complete, define the storage that serves as the target for the extraction.

  1. Go to the transaction /DVD/SM_SETUP.
  2. Switch to Edit Mode and click New Storage.
  3. Create a new storage of the type HADOOP and fill in the RFC destination.