(SM-2102) Azure Data Lake Gen1
This guide describes the process required to establish a connection from Datavard Storage Management to Azure Data Lake Storage. The storage can serve as a target for Glue extraction to .csv files or as a storage layer for big data solutions such as Azure Databricks or Azure HDInsight.
Prerequisites
- Azure Data Lake Storage account
- A network connection between the SAP system and the Azure environment
Storage setup
The setup requires steps on both the SAP side and the Azure side.
The following sections describe, step by step, how to set up the connection.
Azure storage configuration
Perform these steps before you start the configuration on the SAP side.
Application registration
Application registration is used for authentication to ADLS. To create a new registration of an application, follow these steps:
1. Go to Azure Active Directory > App registrations > New application registration.
2. Fill in the required fields and click Create.
3. Write down the Application ID; it is required later during the Storage Management configuration.
4. Click Settings > Required permissions and add a permission for Azure Data Lake.
5. Click Keys and generate a new key. Write down the key; it is used later during the configuration.
Creating a landing folder
Create a folder where all the new files extracted from the SAP system will be located and set the correct permissions for this folder.
1. Go to Data explorer in your ADLS resource and click New Folder in the desired location.
2. In the new folder, click Access and grant access to the application registered in the previous section.
Tenant
The tenant is a unique identifier of your organization. To authenticate your organization, obtain the tenant value by following these steps:
1. Go to Azure Active Directory > App registrations > Endpoints.
2. Copy the OAUTH 2.0 AUTHORIZATION ENDPOINT and extract its ID part, which is the path segment directly after the host name (for example: https://login.microsoftonline.com/6fdc3117-ec29-4d73-8b33-028c513372/oauth2/authorize).
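Extracting the ID part by hand is easy to get wrong, so the step above can be sketched in a few lines of Python. This is only an illustration: the function name extract_tenant_id is hypothetical, and the all-zero GUID is a placeholder, not a real tenant.

```python
from urllib.parse import urlparse


def extract_tenant_id(authorization_endpoint: str) -> str:
    """Return the first path segment of the endpoint URL, which holds the tenant ID."""
    parsed = urlparse(authorization_endpoint)
    # Drop empty strings produced by the leading slash.
    segments = [segment for segment in parsed.path.split("/") if segment]
    if not segments:
        raise ValueError("endpoint URL has no path segment to use as the tenant ID")
    return segments[0]


# Placeholder endpoint; replace it with the value copied from the Endpoints page.
endpoint = (
    "https://login.microsoftonline.com/"
    "00000000-0000-0000-0000-000000000000/oauth2/authorize"
)
print(extract_tenant_id(endpoint))
```

The returned value is what goes into the TENANT field of the authentication profile.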
SAP system configuration
After preparation is complete on the Azure side, fill in the required information on the SAP side to establish a connection.
STRUST
The root certificate authority of Microsoft needs to be loaded via transaction STRUST to enable a secure SSL connection.
1. Using your internet browser, export the CA public certificate to a file, as shown in the figure below.
2. In STRUST, import this certificate into the SSL Client (Anonymous) PSE.
3. Go to transaction SMICM and restart the ICM services, as shown in the figure.
RFCs
To establish a connection to ADLS, create two RFCs in SM59.
1. Create an RFC of type G for Microsoft Active Directory with Target Host set to login.microsoftonline.com. This RFC represents the connection to the authority server that grants an authentication token for ADLS. Set SSL to "Active" and Certificate list to "ANONYM SSL Client (Anonymous)".
2. Create an RFC of type G for Microsoft Data Lake. Set Target Host to your ADLS address (e.g. clazuhdi02.azuredatalakestore.net) and Path Prefix to /webhdfs/v1/<Path to landing folder>. Set SSL to "Active" and Certificate list to "ANONYM".
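To make the relation between Target Host and Path Prefix concrete, the following Python sketch assembles the kind of WebHDFS URL that the second RFC would address. It is an illustration only: the helper build_webhdfs_url and the folder name sap/landing are hypothetical, and LISTSTATUS is used simply as a familiar WebHDFS operation for listing a directory.

```python
from urllib.parse import quote, urlencode


def build_webhdfs_url(host: str, landing_folder: str, operation: str = "LISTSTATUS") -> str:
    """Combine the RFC Target Host and Path Prefix into a WebHDFS request URL."""
    # Path Prefix from the RFC: /webhdfs/v1/<Path to landing folder>
    path_prefix = "/webhdfs/v1/" + quote(landing_folder.strip("/"))
    # WebHDFS selects the operation through the "op" query parameter.
    return f"https://{host}{path_prefix}?{urlencode({'op': operation})}"


# Host taken from the example above; the folder name is a made-up placeholder.
print(build_webhdfs_url("clazuhdi02.azuredatalakestore.net", "sap/landing"))
```

Opening such a URL with a valid bearer token from outside the SAP system can help to tell a wrong Path Prefix apart from a permission problem on the landing folder.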
Authentication profile
The authentication profile contains the login information. Create it in the table /DVD/OAUTH_CONF with the following fields:
- OAUTH_PROFILE: any value you choose to identify the profile used for the authentication
- CLIENT_ID: the Application ID obtained in the section Application registration
- CLIENT_SECRET: the key generated in the section Application registration. It can be hashed by the report /DVD/XOR_GEN.
- GRANT_TYPE: the fixed value "client_credentials"
- RESOURCE: the fixed value "https://datalake.azure.net/"
- TENANT: the identifier described in the section Tenant
- URL: left blank
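The profile fields map onto an OAuth 2.0 client-credentials token request against Azure Active Directory. The Python sketch below shows how such a request could be assembled from the fields; it only builds the URL and form body and does not send anything. How Datavard Storage Management constructs the request internally is not described here, so treat this as an assumption based on the standard Azure AD token endpoint. The function name and all profile values are placeholders, and the secret is shown in plain text rather than as output of /DVD/XOR_GEN.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode


def build_token_request(profile: Dict[str, str]) -> Tuple[str, bytes]:
    """Build the token endpoint URL and form-encoded body from a profile."""
    # The tenant identifies the directory that issues the token.
    url = f"https://login.microsoftonline.com/{profile['TENANT']}/oauth2/token"
    body = urlencode(
        {
            "grant_type": profile["GRANT_TYPE"],
            "client_id": profile["CLIENT_ID"],
            "client_secret": profile["CLIENT_SECRET"],
            "resource": profile["RESOURCE"],
        }
    ).encode("ascii")
    return url, body


# Placeholder values only; none of these are real credentials.
example_profile = {
    "TENANT": "00000000-0000-0000-0000-000000000000",
    "CLIENT_ID": "11111111-1111-1111-1111-111111111111",
    "CLIENT_SECRET": "example-secret",
    "GRANT_TYPE": "client_credentials",
    "RESOURCE": "https://datalake.azure.net/",
}
token_url, token_body = build_token_request(example_profile)
print(token_url)
print(token_body.decode("ascii"))
```

Sending this body as a POST with content type application/x-www-form-urlencoded should return a JSON document containing an access token, which is a quick way to verify the Application ID, key, and tenant independently of the SAP configuration.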
Linking authentication profile
The next step is to link the authentication profile with the RFCs created earlier. Maintain this link in the table /DVD/HDP_AUT_OA2.
Setting the authentication method
Set the authentication method to OAUTH2.0 in the table /DVD/HDP_CUS_C.
Creating storage in Datavard Storage Management
After the configuration is complete, define the storage that serves as a target for the extraction.
1. Go to transaction /DVD/SM_SETUP.
2. Switch to Edit Mode and click New Storage.
3. Create a new storage of type HADOOP and fill in the RFC destination.