(SM-2311) Azure Data Lake Gen2
This article describes the process required to establish a connection from SAP to Azure Data Lake Storage Gen2 (ADLS Gen2).
Prerequisites
SAP NetWeaver 7.01 SP-level 015
HTTPS service enabled
SSFLIB Version 1.850.40; CommonCryptoLib (SAPCRYPTOLIB) Version 8.5.20 (+MT)
Azure storage configuration
This section describes the steps needed for the preparation of the landing area on the Azure side.
Storage Account and ADLS container
To create the MS Azure storage account, follow the official documentation Create a storage account to use with Azure Data Lake Storage Gen2.
Landing area (container/directory)
Create a container or a directory within the container where all files extracted from the SAP system will be stored.
In Microsoft Azure Storage Explorer, select the container/directory that will be used and set appropriate permissions depending on the authentication method chosen.
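As a rough illustration of how the landing area is addressed, the sketch below builds the ADLS Gen2 REST URL of a landing directory. The container name and path are hypothetical placeholders, and the `resource=directory` query parameter is taken from the public ADLS Gen2 Path Create operation. The request itself (a PUT with valid authentication) is not sent here.

```python
from urllib.parse import quote


def landing_directory_url(account: str, container: str, directory: str) -> str:
    """Build the ADLS Gen2 URL of a landing directory (names are placeholders)."""
    # The dfs endpoint (not blob) addresses the hierarchical namespace API.
    host = f"{account}.dfs.core.windows.net"
    # Keep "/" as the path separator and escape everything else.
    path = quote(f"{container}/{directory.strip('/')}", safe="/")
    return f"https://{host}/{path}?resource=directory"


print(landing_directory_url("dvdadls2", "sapdata", "/landing/extracts"))
# https://dvdadls2.dfs.core.windows.net/sapdata/landing/extracts?resource=directory
```

Creating the directory in Storage Explorer or in the Azure portal, as described above, achieves the same result without any code.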
Authentication
Authentication to ADLS can be performed using one of two methods:
- SAS Token: Recommended for Proof of Concepts, faster setup, but provides fewer security management options.
- OAuth 2.0: Recommended for production deployment, fully manageable, but requires application registration and detailed permission setup.
SAS Token
If you use the SAS token, please ignore the OAuth 2.0 Authentication steps. Similarly, you can skip steps related to the OAuth profile in the SAP configuration section.
To generate the SAS token, go to the Azure portal.
The SAS token should grant permission to Create, Read, Write, Delete and List.
Click Generate SAS token and URL.
Copy the Blob SAS token string and store it for the configuration on the SAP side.
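Before storing the token, it can be worth checking that it really grants the permissions listed above. The following is a minimal sketch, assuming the standard SAS query-string layout in which the `sp` field carries one letter per permission. The sample token and its signature are made up. The helper also strips a leading question mark, which the SAP-side configuration later requires to be removed.

```python
from urllib.parse import parse_qs

# Letters used in the `sp` (signed permissions) field of a SAS token.
REQUIRED = {"c": "Create", "r": "Read", "w": "Write", "d": "Delete", "l": "List"}


def normalize_sas_token(raw: str) -> str:
    """Trim whitespace and a leading '?' that a generated token may carry."""
    return raw.strip().lstrip("?")


def missing_sas_permissions(raw: str) -> list[str]:
    """Return the required permissions that the token does not grant."""
    fields = parse_qs(normalize_sas_token(raw))
    granted = set(fields.get("sp", [""])[0])
    return [name for letter, name in REQUIRED.items() if letter not in granted]


# Hypothetical token with a dummy signature; Delete is deliberately missing.
token = "?sv=2022-11-02&sr=c&sp=rcwl&se=2030-01-01T00:00:00Z&sig=dummy"
print(missing_sas_permissions(token))
# ['Delete']
```

An empty result means that all five permissions are present. The check does not validate the signature or the expiry time.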
OAuth 2.0 Authentication
OAuth 2.0 requires an application registration. To create a new registration, follow these steps:
1. Go to Azure Active Directory > App registrations > New application registration.
2. Fill in the required fields and click Create.
3. Write down the Application ID and Directory ID, as they will be required later during the Storage Management configuration.
4. Click Certificates & secrets and generate a New client secret. Write down the secret, which will be used later during the configuration.
5. Grant the registered application (Client ID) access to the landing area.
6. The registered application also needs the Execute permission on ALL parent directories and on the filesystem (this enables traversal of the directory structure).
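To make the last step concrete, the sketch below lists every directory above a landing path, starting at the filesystem root. These are the locations on which the registered application would need the Execute permission. The example path is hypothetical, and "/" stands for the root of the filesystem (container).

```python
from posixpath import dirname, normpath


def directories_needing_execute(landing_path: str) -> list[str]:
    """List every parent of the landing directory, from the root down."""
    current = normpath("/" + landing_path.strip("/"))
    parents = []
    while current != "/":
        current = dirname(current)
        parents.append(current)
    return list(reversed(parents))


print(directories_needing_execute("/landing/sap/extracts"))
# ['/', '/landing', '/landing/sap']
```

The landing directory itself is not part of the list, because its permissions are set in step 5.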
SAP system configuration
After preparation is complete on the Azure side, fill in the required information on the SAP side to establish a connection.
STRUST
To enable verification of server authenticity and encrypted communication (HTTPS), SSL certificates need to be added to the SAP system's STRUST.
To download the relevant certificates, open a web browser in private mode and navigate to the storage account FQDN in the format: https://<storage account>.dfs.core.windows.net
The browser will return an InvalidUri error, but will still offer an interface to view and download the certificates.
Make sure to download all three certificates so that the full certificate chain is available.
In STRUST, import these certificates into SSL Client (Anonymous) PSE. After the import, click Save.
If OAuth authentication is to be used, repeat the same certificate management steps for the address https://login.microsoftonline.com, again importing its full certificate chain.
When the PSE is updated, run the transaction SMICM and restart the ICM services.
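As an alternative way to inspect what the servers present, the sketch below lists the hosts whose certificates are needed and fetches the server certificate with Python's standard `ssl` module. This is only an illustration: `ssl.get_server_certificate` returns the leaf certificate alone, so the intermediate and root certificates of the chain still have to be exported as described above. The function names are hypothetical, and the fetch requires network access, so it is shown but not called.

```python
import ssl


def hosts_to_trust(storage_account: str, use_oauth: bool) -> list[str]:
    """Hosts whose certificate chains must be present in the SSL Client PSE."""
    hosts = [f"{storage_account}.dfs.core.windows.net"]
    if use_oauth:
        # Only needed when tokens are requested from the authority server.
        hosts.append("login.microsoftonline.com")
    return hosts


def fetch_leaf_certificate_pem(host: str, port: int = 443) -> str:
    """Fetch the server (leaf) certificate only, PEM encoded."""
    return ssl.get_server_certificate((host, port))


print(hosts_to_trust("dvdadls2", use_oauth=True))
# ['dvdadls2.dfs.core.windows.net', 'login.microsoftonline.com']

# Example call (needs network access):
# pem = fetch_leaf_certificate_pem("login.microsoftonline.com")
```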
Storage RFC
Create an RFC destination of type G for the Azure Data Lake Gen2 primary endpoint.
- Set Target Host to your ADLS address (e.g. dvdadls2.dfs.core.windows.net). The Target Host must contain <host>.dfs.core.windows.net in its FQDN, which guarantees that the ADLS Gen2 API is addressed.
- Set Path Prefix to the ADLS container.
- Set SSL to Active and Certificate list to ANONYM SSL Client (Anonymous).
- Set HTTP Version to 1.1 and Compression to Active.
Active compression can cause communication problems in combination with ICM kernel version 753 patch number 1100 and higher.
In this case we recommend switching the compression off (Inactive).
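The Target Host rule can be checked mechanically before the RFC destination is saved. The sketch below is one possible check, not part of the product. It assumes the Azure naming rule that storage account names consist of 3 to 24 lowercase letters and digits, and it also accepts the -secondary suffix used for failover endpoints.

```python
import re

# <account>.dfs.core.windows.net, optionally with the -secondary suffix.
_DFS_HOST = re.compile(r"[a-z0-9]{3,24}(-secondary)?\.dfs\.core\.windows\.net")


def is_adls_gen2_host(target_host: str) -> bool:
    """True if the value looks like an ADLS Gen2 (dfs) endpoint host name."""
    return _DFS_HOST.fullmatch(target_host.strip().lower()) is not None


print(is_adls_gen2_host("dvdadls2.dfs.core.windows.net"))   # True
print(is_adls_gen2_host("dvdadls2.blob.core.windows.net"))  # False
```

A value that includes the scheme (https://) is rejected as well, because the Target Host field expects a host name only.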
OAuth 2.0 Authentication RFC destination
This step is required only if OAuth authentication is chosen.
Create an RFC destination of type G for Microsoft Azure Active Directory. This RFC destination represents the connection to the authority server that grants the authentication token for ADLS.
- Set Target Host to: login.microsoftonline.com.
- Set SSL to Active and Certificate list to ANONYM SSL Client (Anonymous).
On some newer SAP kernel releases (Kernel 753, Patch 1210 and higher), the Compression setting causes problems and should always be set to Inactive.
Storage Management
In this step, the actual connection to the landing area is created.
- Execute transaction /DVD/SM_SETUP.
- Switch to Edit Mode and click New Storage.
- Choose storage type ADLS_GEN2 and specify Storage ID.
Specify the following parameters:
- HTTP RFC Destination: Storage RFC destination created previously.
- Path Prefix: /<path to landing area> within ADLS container.
- HTTP Call Repeat: Specify the number of times the HTTP request will be retried in case of failure (0 = no repetition).
- Repeat Delay (seconds): Number of seconds to wait between repetitions. If the field is left empty, the default value of 3 seconds is used (not 0).
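The interaction of HTTP Call Repeat and Repeat Delay can be sketched as follows. This is an illustrative model of the two parameters, not the product's implementation. Which errors actually trigger a repetition is an assumption (here: any `OSError`, which covers connection failures), and the `sleep` argument exists only so that the example can be exercised without waiting.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
DEFAULT_DELAY_SECONDS = 3  # applied when Repeat Delay is left empty


def call_with_repeats(
    request: Callable[[], T],
    repeats: int = 0,
    delay_seconds: Optional[float] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `request`; on failure repeat it up to `repeats` more times."""
    repeats = max(repeats, 0)  # 0 = no repetition, a single attempt only
    delay = DEFAULT_DELAY_SECONDS if delay_seconds is None else delay_seconds
    for attempt in range(repeats + 1):
        try:
            return request()
        except OSError:
            if attempt == repeats:
                raise  # repetitions exhausted, surface the last error
            sleep(delay)
    raise AssertionError("unreachable")
```

With `repeats=2` and no delay given, a request that fails twice and then succeeds is attempted three times, with a 3-second pause before each repetition.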
The bottom part of the storage connection configuration depends on the authentication method chosen.
a) If a SAS token is used, either insert it directly into the respective field, or insert it into the field next to the Hash button and click Hash to convert it into a hashed string, which prevents the token from being visible in plain text.
SAS token value alerts:
- Remove the leading question mark (?) from the generated SAS token.
b) If the OAuth authentication method is chosen, fill in the OAuth profile with the information recorded during application registration:
- OAuth 2.0 profile: The ID of the profile.
- Client ID: Application ID recorded during application registration (section OAuth 2.0 Authentication).
- Client secret: Client secret generated during application registration (section OAuth 2.0 Authentication).
- Resource: Static address https://storage.azure.com/.
- Tenant: Directory ID recorded during application registration.
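To show how these fields fit together, the sketch below prepares an OAuth 2.0 client-credentials token request. It assumes the Azure AD v1 token endpoint (`/<tenant>/oauth2/token`), which is the variant that accepts a `resource` parameter. Whether Storage Management uses exactly this endpoint is not stated in this article. The tenant, client ID, and secret values are placeholders, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

RESOURCE = "https://storage.azure.com/"  # static value from the OAuth profile


def build_token_request(tenant: str, client_id: str, client_secret: str) -> Request:
    """Prepare (but do not send) a client-credentials token request."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": RESOURCE,
    }).encode("ascii")
    # Assumption: v1 endpoint, because the profile uses a Resource field.
    url = f"https://login.microsoftonline.com/{tenant}/oauth2/token"
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


req = build_token_request("my-directory-id", "my-application-id", "my-secret")
print(req.get_method(), req.full_url)
# POST https://login.microsoftonline.com/my-directory-id/oauth2/token
```

The host part of the URL corresponds to the Target Host of the OAuth 2.0 RFC destination, and the tenant forms the first path segment.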
Confirm storage creation by saving the information (Ctrl+S).
To check the connection, click the Check button.
Use Compute hash to calculate the hash of data.
Advanced
Failover storage
Failover (Read-only) storage enables the utilization of Azure Storage Redundancy, e.g. read-access geo-redundant storage (RA-GRS) or read-access geo-zone-redundant storage (RA-GZRS), to improve high availability/disaster recovery. For more information about Azure replication strategies, refer to official Microsoft documentation Azure Storage redundancy.
When the primary storage is unavailable, i.e. the connection check fails, the application automatically switches to the Failover storage and reads data from another data center in the secondary region. Follow the procedure below to enable it on the SAP side.
Enable Failover storage
- Run transaction /DVD/RL_SETT_EXPERT.
- Specify SM in the Tool name parameter and execute (F8).
- Find the parameter Failover storage, click the Edit button, and set the value to X.
After that, additional parameters appear in Storage Management settings (transaction /DVD/SM_SETUP), where you can specify HTTP Destination to your Failover storage and SAS token.
Failover storage RFC
Create an RFC destination of type G for the Azure Data Lake Gen2 secondary endpoint.
- Set Target Host to your secondary ADLS address. It is the same as for the primary storage, with the suffix -secondary appended to the storage account name (e.g. dvdadls2-secondary.dfs.core.windows.net).
- Set Path Prefix to /<filesystem>/<Path to landing folder>.
- Set SSL to Active and Certificate list to ANONYM.
- Set HTTP Version to 1.1 and Compression to Active.
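The naming rule for the secondary endpoint, and the read-only switch described earlier, can be summarized in a short sketch. The function names are hypothetical, and the availability flag stands in for the result of the connection check. How the product performs that check is not covered here.

```python
def secondary_host(primary_host: str) -> str:
    """Derive the read-only secondary endpoint from the primary host name."""
    account, _, domain = primary_host.partition(".")
    return f"{account}-secondary.{domain}"


def read_host(primary_host: str, primary_available: bool) -> str:
    """Pick the host for reads; writes always go to the primary."""
    return primary_host if primary_available else secondary_host(primary_host)


print(secondary_host("dvdadls2.dfs.core.windows.net"))
# dvdadls2-secondary.dfs.core.windows.net
```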
Option to bypass ICM completely
There is an option to bypass ICM completely and use only JCO to communicate with Azure.
Reason: ICM and RFC destinations behave unpredictably (the behavior differs depending on the SAP release, cipher suites, etc.).
Too many issues have already been caused by changes to the SAP kernel, parameters, and other components that are outside our control and cause instability of SM.
To bypass ICM and use only JCO (Java) to communicate with the Azure ADLS Gen2 REST API, set up the Java connector RFC in transaction /DVD/SM_SETUP.
Target system settings such as Target Host and Path Prefix are read from the HTTP RFC Destination.
HTTP Proxy options
Proxy Host: Host name of the Proxy server to be used.
Proxy Port: Port number of the proxy server to be used.
Proxy User: User name for logging on to the proxy server.
Proxy Password: Password to use for logging on to the proxy server.
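For orientation, the four proxy options combine into a single proxy address in most HTTP clients. The sketch below shows this with Python's standard library. It is an analogy only and says nothing about how the Java connector applies the settings internally. The host, port, user, and password values are placeholders.

```python
from typing import Optional
from urllib.parse import quote
from urllib.request import ProxyHandler, build_opener


def proxy_url(host: str, port: int,
              user: Optional[str] = None, password: Optional[str] = None) -> str:
    """Combine Proxy Host, Port, User, and Password into one proxy URL."""
    credentials = ""
    if user:
        # Percent-encode so that characters such as '@' or ':' stay unambiguous.
        credentials = quote(user, safe="")
        if password:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    return f"http://{credentials}{host}:{port}"


url = proxy_url("proxy.example.com", 8080, "sapuser", "p@ss")
print(url)
# http://sapuser:p%40ss@proxy.example.com:8080

# Route HTTPS requests through the proxy (no request is sent here).
opener = build_opener(ProxyHandler({"https": url}))
```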