(SM-1902) Azure Data Lake
This guide describes the process required to establish a connection from Datavard Storage Management to Azure Data Lake Storage (ADLS). This storage can serve as a target for Glue extraction to .csv, or as a storage layer for big data solutions like Azure Databricks or Azure HDInsight.
Prerequisites
- Azure Data Lake Storage account
- Reuse Library transports imported
- SAP NetWeaver 7.10+
- Network connection between the SAP system and the Azure environment
Steps on Azure
You should perform these steps before the implementation.
Application registration
An application registration is used for authentication to ADLS. To register a new application, follow these steps:
- Go to Azure Active Directory > App registrations > New application registration
- Fill in the required fields and click Create.
- Note down the Application ID, as it will be required later on during the Storage Management configuration.
- Click Settings > Required permissions and add a permission for Azure Data Lake.
- Click Keys and generate a new key. Note down the key, as it will be used later on during the configuration.
Creating a landing folder
Create a folder where all the new files extracted from the SAP system will be located and set the correct permissions for this folder.
- In your ADLS resource, open Data explorer and click New Folder in the desired location.
- In the new folder, click Access and grant access to the application registered in the previous section.
Tenant
The tenant is a unique identifier of your organization. It is required for authentication, and you can obtain its value by following these steps:
- Go to Azure Active Directory > App registrations > Endpoints
- Copy the OAUTH 2.0 AUTHORIZATION ENDPOINT and extract its ID part, which is the segment between the host name and /oauth2. (For example: https://login.microsoftonline.com/6fdc3117-ec29-4d73-8b33-028c513372/oauth2/authorize).
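The ID part is the first path segment after the host name. As a minimal sketch, assuming the endpoint has the form shown in the example above, it can be extracted in Python as follows. The function name `extract_tenant_id` is hypothetical and only serves the illustration.

```python
from urllib.parse import urlparse


def extract_tenant_id(endpoint_url: str) -> str:
    """Return the first path segment of the endpoint URL, i.e. the tenant ID."""
    path = urlparse(endpoint_url).path
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError("endpoint URL has no path segments")
    return segments[0]


# Example endpoint copied from the step above.
endpoint = "https://login.microsoftonline.com/6fdc3117-ec29-4d73-8b33-028c513372/oauth2/authorize"
tenant_id = extract_tenant_id(endpoint)
```

The resulting value is the one to enter later in the TENANT field of the authentication profile.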
Steps on SAP
After the preparation on the Azure side is complete, enter the required information on the SAP side to establish a connection.
STRUST
The Microsoft root CA certificate must be present in STRUST to establish a secure SSL connection.
- With the help of your internet browser, copy the CA public certificate into a file, as shown in the figure below.
- In STRUST, import this certificate into SSL Client (Anonymous) PSE.
- Go to the transaction SMICM and restart the ICM services as shown in the figure.
RFCs
To establish a connection to ADLS, create two RFC destinations in SM59.
- Start by creating an RFC destination of type G for Azure Active Directory with Target Host set to login.microsoftonline.com. This RFC represents the connection to the authorization server that grants an authentication token for ADLS. Set SSL to "Active" and Certificate list to "ANONYM".
- Create an RFC destination of type G for Azure Data Lake. Set Target Host to your ADLS address (e.g. clazuhdi02.azuredatalakestore.net) and Path Prefix to /webhdfs/v1/<Path to landing folder>. Set SSL to "Active" and Certificate list to "ANONYM".
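To illustrate how the Target Host and Path Prefix of the Data Lake RFC combine into the address that is called on ADLS, the following Python sketch builds a WebHDFS request URL. The landing folder `/landing/sap` and the helper name `build_webhdfs_url` are hypothetical; `LISTSTATUS` is a standard WebHDFS operation that lists a directory. The sketch only constructs the URL and does not send a request.

```python
from urllib.parse import quote, urlencode


def build_webhdfs_url(target_host: str, landing_folder: str, operation: str) -> str:
    """Combine the host, the /webhdfs/v1 path prefix, and a WebHDFS operation."""
    # Strip surrounding slashes so exactly one slash separates the parts.
    folder = landing_folder.strip("/")
    path_prefix = "/webhdfs/v1/" + quote(folder)
    return "https://" + target_host + path_prefix + "?" + urlencode({"op": operation})


# Host taken from the example above; the folder is a made-up placeholder.
url = build_webhdfs_url("clazuhdi02.azuredatalakestore.net", "/landing/sap", "LISTSTATUS")
```

A URL of this shape can be useful for checking the landing folder path outside of SAP, provided a valid bearer token is sent with the request.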
Authentication profile
The authentication profile contains the login information. Create it in the table /DVD/OAUTH_CONF with the following fields:
- OAUTH_PROFILE: any value of your choice that identifies the profile used for the authentication
- CLIENT_ID: the Application ID noted down in the section Application registration
- CLIENT_SECRET: the key generated in the section Application registration
- GRANT_TYPE: the fixed value "client_credentials"
- RESOURCE: the fixed value "https://datalake.azure.net/"
- TENANT: the identifier described in the section Tenant
- URL: left blank
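The profile fields correspond to the parameters of an OAuth 2.0 client credentials request. As a rough sketch of what such a request looks like, assuming the Azure AD token endpoint `https://login.microsoftonline.com/<tenant>/oauth2/token`, the Python code below assembles the URL and the form body without sending anything. How Storage Management builds the request internally is not described in this guide; the placeholder values and the function name `build_token_request` are hypothetical.

```python
from urllib.parse import urlencode


def build_token_request(profile: dict) -> tuple:
    """Return (url, form_body) for a client credentials token request."""
    # Assumption: Azure AD token endpoint on the host used by the first RFC.
    url = "https://login.microsoftonline.com/" + profile["TENANT"] + "/oauth2/token"
    body = urlencode({
        "grant_type": profile["GRANT_TYPE"],
        "client_id": profile["CLIENT_ID"],
        "client_secret": profile["CLIENT_SECRET"],
        "resource": profile["RESOURCE"],
    })
    return url, body


# Placeholder values; replace them with the ones noted down on Azure.
profile = {
    "TENANT": "<tenant-id>",
    "CLIENT_ID": "<application-id>",
    "CLIENT_SECRET": "<key>",
    "GRANT_TYPE": "client_credentials",
    "RESOURCE": "https://datalake.azure.net/",
}
token_url, token_body = build_token_request(profile)
```

Sending this body as a form-encoded POST from a test machine is one way to confirm that the application ID, key, and tenant are valid before entering them in SAP.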
Linking authentication profile
The next step is to link the authentication profile to the RFC destinations created earlier. Maintain this link in the table /DVD/HDP_AUT_OA2.
Setting authentication method
The authentication method needs to be set to OAUTH2.0 in the table /DVD/HDP_CUS_C.
Creating storage in Datavard Storage Management
After the configuration is complete, define a storage that serves as the target for the extraction.
- Go to the transaction /DVD/SM_SETUP.
- Switch to edit mode and click New Storage.
- Create a new storage of the type HADOOP and fill in the RFC destination.