(SM-2211) Azure Data Lake Gen2
This guide describes the process required to establish a connection from Storage Management to Azure Data Lake Storage Gen2.
Prerequisites
SAP requirements
- SAP NetWeaver 7.01 SP-level 015
- HTTPS service enabled
- SSFLIB Version 1.850.40; CommonCryptoLib (SAPCRYPTOLIB) Version 8.5.20 (+MT)
Azure storage configuration
This section describes the steps needed for the preparation of the landing area on the Azure side.
Storage Account and ADLS container
To create an MS Azure storage account, follow the official documentation: https://learn.microsoft.com/en-us/azure/storage/blobs/create-data-lake-storage-account.
Landing area (container/directory)
Create a container or a directory within the container where all files extracted from the SAP system will be stored.
In Microsoft Azure Storage Explorer, select the container/directory that will be used and set appropriate permissions depending on the authentication method chosen (see below).
Authentication
Authentication to ADLS can be performed using one of two methods:
- SAS Token: Recommended for Proof of Concepts, faster setup, but provides fewer security management options.
- OAuth 2.0: Recommended for production deployment, fully manageable, but requires application registration and detailed permission setup.
SAS Token
If you decide to use the SAS token, please ignore OAuth 2.0 Authentication steps. In the same manner, you can skip steps related to the OAuth profile in the SAP configuration section.
To generate the SAS token, go to the Azure portal (see the screenshot below for more information).
The SAS token should grant permissions to Create, Read, Write, Delete and List.
Click Generate SAS token and URL.
Copy Blob SAS token string and store it for configuration on the SAP side.
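Before storing the token, it can be useful to confirm that it carries all five permissions listed above. The following is a minimal Python sketch, not part of Storage Management. It assumes the standard SAS query-string layout in which the sp field holds one letter per permission (c, r, w, d, l). The function name and the example token are hypothetical.

```python
from typing import List
from urllib.parse import parse_qs

# Permission letters used in the `sp` field of a SAS token.
REQUIRED_PERMISSIONS = {
    "c": "Create",
    "r": "Read",
    "w": "Write",
    "d": "Delete",
    "l": "List",
}


def missing_sas_permissions(sas_token: str) -> List[str]:
    """Return the names of required permissions absent from the token's `sp` field."""
    # A leading '?' is tolerated here so the generated string can be pasted as-is.
    query = parse_qs(sas_token.lstrip("?"))
    granted = set(query.get("sp", [""])[0])
    return [name for letter, name in REQUIRED_PERMISSIONS.items() if letter not in granted]


# Hypothetical token for illustration only; the signature is not valid.
example = "sv=2022-11-02&sr=c&sp=rcwl&se=2030-01-01T00:00:00Z&sig=EXAMPLE"
print(missing_sas_permissions(example))  # the example token lacks Delete
```

An empty list means the token grants everything this guide requires.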
OAuth 2.0 Authentication
OAuth 2.0 requires application registration. To create a new registration, follow these steps:
1. Go to Azure Active Directory > App registrations > New application registration.
2. Fill in the required fields and click Create.
3. Write down the Application ID and Directory ID, as they will be required later during the Storage Management configuration.
4. Click Certificates & secrets and generate a New client secret. Write down the secret, as it will be used later during the configuration.
5. Allow the registered application (Client ID) to access the landing area.
6. Grant the registered application Execute permission on ALL parent directories and on the filesystem. This enables traversal of the directory structure.
SAP system configuration
After preparation is complete on the Azure side, fill in the required information on the SAP side to establish a connection.
STRUST
To enable verification of server authenticity and encrypted communication (HTTPS), SSL certificates need to be added to the SAP system's STRUST.
To download the relevant certificates, open a web browser in private mode and navigate to the storage account FQDN in the format: https://<storage account>.dfs.core.windows.net
The browser will return an InvalidUri error, but will offer an interface to download the certificates:
Please make sure to download all three certificates to have a full certificate chain:
In STRUST, import these certificates into SSL Client (Anonymous) PSE. After the import, click Save.
If OAuth authentication is to be used, repeat the same steps for certificate management with the address https://login.microsoftonline.com.
The address uses the following certificate chain:
When the PSE is updated, run the transaction SMICM and restart the ICM services.
Storage RFC
Create an RFC destination of type G for the Azure Data Lake Gen2 primary endpoint.
- Set Target Host to your ADLS address (e.g. dvdadls2.dfs.core.windows.net). The Target Host must contain <host>.dfs.core.windows.net in its FQDN, which guarantees that the ADLS Gen2 API is addressed.
- Set Path Prefix to the ADLS container.
- Set SSL to "Active" and Certificate list to "ANONYM SSL Client (Anonymous)".
- Set HTTP Version to 1.1 and Compression to Active.
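The Target Host rule can be checked before the destination is saved. The following is a small Python sketch under the assumption that the value entered is a bare host name without scheme or path. The function name is hypothetical and is not part of the SAP or Azure tooling.

```python
ADLS_GEN2_SUFFIX = ".dfs.core.windows.net"


def is_adls_gen2_host(target_host: str) -> bool:
    """Check that a Target Host value has the form <host>.dfs.core.windows.net."""
    host = target_host.strip().lower()
    if "/" in host or not host.endswith(ADLS_GEN2_SUFFIX):
        # Rejects URLs with a scheme or path and hosts of other endpoints (e.g. blob).
        return False
    account = host[: -len(ADLS_GEN2_SUFFIX)]
    # The part before the suffix must be a single, non-empty label.
    return account != "" and "." not in account


print(is_adls_gen2_host("dvdadls2.dfs.core.windows.net"))
print(is_adls_gen2_host("dvdadls2.blob.core.windows.net"))
```

A host ending in blob.core.windows.net is rejected because it would address the Blob API rather than the ADLS Gen2 API.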
OAuth 2.0 Authentication RFC destination
This step is required only if OAuth authentication was chosen.
Create an RFC destination of type G for Azure Active Directory. This RFC destination represents a connection to the authority server that grants an authentication token for ADLS.
- Set Target Host to: login.microsoftonline.com.
- Set SSL to "Active" and Certificate list to "ANONYM SSL Client (Anonymous)".
Storage Management
In this step, the actual connection to the landing area is created.
- Execute transaction /DVD/SM_SETUP.
- Switch to Edit Mode and click New Storage.
- Choose storage type ADLS_GEN2 and specify Storage ID.
Specify the following parameters:
- HTTP RFC Destination - the storage RFC destination created previously
- Path Prefix - /<path to landing area> within the ADLS container
- HTTP Call Repeat - the number of times the HTTP request is retried in case of failure (0 = no repetition)
- Repeat Delay (seconds) - the number of seconds between repetitions; if not filled, the default value is set to 3 seconds (instead of 0)
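How HTTP Call Repeat and Repeat Delay interact can be illustrated with a minimal Python sketch. It is not the Storage Management implementation. The function name call_with_repeat and the injectable sleep argument are hypothetical, and the sketch assumes that any exception counts as a failed request.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

DEFAULT_REPEAT_DELAY = 3.0  # seconds, used when Repeat Delay is not filled


def call_with_repeat(
    request: Callable[[], T],
    http_call_repeat: int = 0,
    repeat_delay: Optional[float] = None,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Run `request`, retrying up to `http_call_repeat` times after a failure."""
    delay = DEFAULT_REPEAT_DELAY if repeat_delay is None else repeat_delay
    attempts = 1 + max(0, http_call_repeat)  # 0 repeats means a single attempt
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except Exception:
            if attempt == attempts:
                raise  # no repetitions left, surface the last error
            sleep(delay)
    raise RuntimeError("unreachable")
```

With HTTP Call Repeat set to 2 and Repeat Delay left empty, a request that keeps failing is attempted three times in total, with a 3-second pause between attempts.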
The bottom part of the storage connection configuration depends on the authentication method chosen.
a) If a SAS token is used, either insert it directly into the respective field, or insert it into the field next to the Hash button and click Hash to convert it into a hashed string, which prevents plain-text visibility.
SAS token value alerts:
- Remove the question mark '?' character from the beginning of the generated SAS token.
b) If the OAuth authentication method is chosen, fill in the OAuth profile with information recorded during application registration:
- OAuth 2.0 profile - ID of the profile
- Client ID - Application ID recorded in the section OAuth 2.0 Authentication
- Client secret - Key (client secret) created in the section OAuth 2.0 Authentication
- Resource - Static address "https://storage.azure.com/"
- Tenant - Directory ID recorded in the section OAuth 2.0 Authentication
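To show how these profile fields fit together, the following Python sketch builds (but does not send) a client-credentials token request against login.microsoftonline.com. It is an illustration only. The source does not state which token endpoint Storage Management calls, so the /oauth2/token path is an assumption, and the function name and example values are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

AUTHORITY_HOST = "login.microsoftonline.com"
RESOURCE = "https://storage.azure.com/"  # static Resource value from the profile


def build_token_request(tenant: str, client_id: str, client_secret: str) -> Request:
    """Build an OAuth 2.0 client-credentials token request (nothing is sent)."""
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "resource": RESOURCE,
        }
    ).encode("ascii")
    return Request(
        # Assumed endpoint path; the tenant (Directory ID) is part of the URL.
        url=f"https://{AUTHORITY_HOST}/{tenant}/oauth2/token",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical placeholder values, not real credentials.
req = build_token_request("my-directory-id", "my-application-id", "my-secret")
print(req.get_method(), req.full_url)
```

The sketch makes visible why the Tenant ends up in the URL while Client ID, Client secret, and Resource travel in the request body.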
Confirm storage creation by saving the information (Ctrl+S).
To check the connection, click the Check button.
Use Compute hash to calculate the hash of the data.
Advanced
Failover storage
Failover (read-only) storage lets you use Azure Storage redundancy, e.g. read-access geo-redundant storage (RA-GRS) or read-access geo-zone-redundant storage (RA-GZRS), to improve high availability and disaster recovery. For more information about Azure replication strategies, refer to the official Microsoft documentation: https://docs.microsoft.com/en-us/azure/storage/common/storage-redundancy.
When the primary storage is unavailable, i.e. the connection check fails, the application automatically switches to the Failover storage and reads data from another data center in a secondary region. Follow the procedure below to enable it on the SAP side.
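The switching rule described above can be summarized in a few lines of Python. This is a simplified sketch, not the application's actual logic. The function name and the connection_check callback are hypothetical stand-ins for the connection check.

```python
from typing import Callable


def select_read_endpoint(
    primary: str,
    failover: str,
    connection_check: Callable[[str], bool],
) -> str:
    """Return the endpoint to read from: primary if reachable, otherwise failover."""
    if connection_check(primary):
        return primary
    # The primary check failed, so reads go to the read-only secondary endpoint.
    return failover


# Example with a stubbed check that reports the primary as unavailable.
endpoint = select_read_endpoint(
    "dvdadls2.dfs.core.windows.net",
    "dvdadls2-secondary.dfs.core.windows.net",
    connection_check=lambda host: False,
)
print(endpoint)
```

Because the Failover storage is read-only, a rule like this applies to read requests only.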
Enable Failover storage
- Run transaction /DVD/RL_SETT_EXPERT.
- Specify "SM" in the Tool name parameter and execute (F8).
- Find the parameter "Failover storage", click the "Edit" button, and set the value to "X".
After that, additional parameters appear in Storage Management settings (Tcode /DVD/SM_SETUP), where you can specify HTTP Destination to your Failover storage and SAS token.
Failover storage RFC
Create an RFC destination of type G for the Azure Data Lake Gen2 secondary endpoint.
- Set Target Host to your secondary ADLS address. It is the same as for the primary storage, with the suffix -secondary appended to the storage account name (e.g. dvdadls2-secondary.dfs.core.windows.net).
- Set Path Prefix to /<filesystem>/<Path to landing folder>.
- Set SSL to "Active" and Certificate list to "ANONYM SSL Client (Anonymous)".
- Set HTTP Version to 1.1 and Compression to Active.
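The secondary Target Host follows mechanically from the primary one. The following is a minimal Python sketch, assuming the primary host has the form <host>.dfs.core.windows.net. The function name is hypothetical.

```python
ADLS_SUFFIX = ".dfs.core.windows.net"
SECONDARY_SUFFIX = "-secondary"


def secondary_host(primary_host: str) -> str:
    """Derive the secondary endpoint host from the primary ADLS Gen2 host."""
    host = primary_host.strip().lower()
    if not host.endswith(ADLS_SUFFIX):
        raise ValueError(f"not an ADLS Gen2 host: {primary_host!r}")
    account = host[: -len(ADLS_SUFFIX)]
    if account.endswith(SECONDARY_SUFFIX):
        return host  # already a secondary endpoint
    return f"{account}{SECONDARY_SUFFIX}{ADLS_SUFFIX}"


print(secondary_host("dvdadls2.dfs.core.windows.net"))
```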