(SM-2305) Azure Data Lake Gen2

This article describes the process required to establish a connection from SAP to Azure Data Lake Storage Gen2.

Prerequisites

  • SAP NetWeaver 7.01 SP-level 015

  • HTTPS service enabled

  • SSFLIB Version 1.850.40; CommonCryptoLib (SAPCRYPTOLIB) Version 8.5.20 (+MT)

Azure storage configuration

This section describes the steps needed for the preparation of the landing area on the Azure side.

Storage Account and ADLS container

To create the MS Azure storage account, follow the official documentation Create a storage account to use with Azure Data Lake Storage Gen2.


Landing area (container/directory)

Create a container or a directory within the container where all files extracted from the SAP system will be stored.
In Microsoft Azure Storage Explorer, select the container/directory that will be used and set appropriate permissions depending on the authentication method chosen.

Authentication

Authentication to ADLS can be performed using one of two methods:

  • SAS Token: Recommended for Proof of Concepts, faster setup, but provides fewer security management options.
  • OAuth 2.0: Recommended for production deployment, fully manageable, but requires application registration and detailed permission setup.

SAS Token

If you use the SAS token, please ignore OAuth 2.0 Authentication steps. Similarly, you can skip steps related to the OAuth profile in the SAP configuration section.

To generate the SAS token, go to the Azure portal.

The SAS token should grant permission to Create, Read, Write, Delete and List.
Click Generate SAS token and URL.
Copy Blob SAS token string and store it for configuration on the SAP side.
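Before storing the token, it can be worth confirming that it really grants all five permissions. The following is a minimal sketch, assuming the standard Azure SAS query-string layout in which the `sp` field lists the granted permissions as single letters (c = Create, r = Read, w = Write, d = Delete, l = List). The function name and the example token are hypothetical and the signature is a placeholder.

```python
from urllib.parse import parse_qs

# Permission letters expected in the `sp` field of the SAS token:
# c = Create, r = Read, w = Write, d = Delete, l = List.
REQUIRED_PERMISSIONS = set("crwdl")


def missing_sas_permissions(sas_token: str) -> set:
    """Return the required permission letters that the token does not grant."""
    query = sas_token.strip().lstrip("?")
    fields = parse_qs(query)
    granted = set(fields.get("sp", [""])[0])
    return REQUIRED_PERMISSIONS - granted


# Made-up token for illustration only; the signature is not valid.
example = "?sv=2022-11-02&sr=c&sp=rcwl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
print(sorted(missing_sas_permissions(example)))  # ['d']
```

An empty result means that all required permissions are present in the token.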

OAuth 2.0 Authentication

OAuth 2.0 requires application registration. To create a new registration, follow these steps:

  1. Go to Azure Active Directory > App registrations > New application registration.

  2. Fill in the required fields and click Create.

  3. Write down the Application ID and Directory ID, as they will be required later during the Storage Management configuration.

  4. Click Certificates & secrets and generate a New client secret. Write down the secret, as it will be used later during the configuration.

  5. Grant the registered application (Client ID) access to the landing area.

  6. The registered application also needs Execute permission on ALL parent directories and filesystems to be able to traverse the directory structure.
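To see which directories the last step refers to, the parents of the landing directory can be listed programmatically. This is a minimal sketch, assuming the landing path is given relative to the container, so that "/" stands for the container root. The function name and the example path are hypothetical.

```python
from pathlib import PurePosixPath


def directories_needing_execute(landing_path: str) -> list:
    """List the container root and every parent directory of the landing path."""
    path = PurePosixPath("/") / landing_path.strip("/")
    # `parents` runs from the nearest parent up to the root; reverse it
    # to get the order in which the directories are traversed.
    return [str(parent) for parent in list(path.parents)[::-1]]


# Hypothetical landing directory inside the container.
print(directories_needing_execute("sap/landing/extracts"))
# ['/', '/sap', '/sap/landing']
```

Each listed directory needs Execute permission for the registered application. The landing directory itself is not in the list because its permissions are set in the previous step.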

SAP system configuration

After preparation is complete on the Azure side, fill in the required information on the SAP side to establish a connection.

STRUST

To enable verification of server authenticity and encrypted communication (HTTPS), SSL certificates need to be added to the SAP system's STRUST.
To download the relevant certificates, open a web browser in private mode and navigate to the storage account FQDN in the format: https://<storage account>.dfs.core.windows.net
The browser will return an InvalidUri error, but it will still offer an interface to download the certificates.

Make sure to download all three certificates so that you have the full certificate chain.


In STRUST, import these certificates into SSL Client (Anonymous) PSE. After the import, click Save.

If OAuth authentication is to be used, repeat the same steps for certificate management with the address https://login.microsoftonline.com.
The address uses the following certificate chain:

When the PSE is updated, run the transaction SMICM and restart the ICM services.

Storage RFC

Create an RFC destination of type G for the Azure Data Lake Gen2 primary endpoint.

  • Set Target Host to your ADLS address (e.g. dvdadls2.dfs.core.windows.net).

    The target host FQDN must have the form <host>.dfs.core.windows.net.
    This guarantees that the ADLS Gen2 API is addressed.

  • Set Path Prefix to the ADLS container.

  • Set SSL to Active and Certificate list to ANONYM SSL Client (Anonymous).
  • Set HTTP Version to 1.1 and Compression to Active.
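The Target Host rule above can be checked before the value is entered into the RFC destination. This is a minimal sketch with a hypothetical helper name. It only tests the shape of the host name and does not contact Azure.

```python
ADLS_GEN2_SUFFIX = ".dfs.core.windows.net"


def is_adls_gen2_host(target_host: str) -> bool:
    """Check that the host has the form <host>.dfs.core.windows.net."""
    host = target_host.strip().lower()
    if not host.endswith(ADLS_GEN2_SUFFIX):
        return False
    account = host[: -len(ADLS_GEN2_SUFFIX)]
    # The part before the suffix must be a plain name: letters, digits and
    # hyphens only, so a scheme such as https:// or a path is rejected.
    return account.replace("-", "").isalnum()


print(is_adls_gen2_host("dvdadls2.dfs.core.windows.net"))   # True
print(is_adls_gen2_host("dvdadls2.blob.core.windows.net"))  # False
```

The second call fails because a blob endpoint is not the ADLS Gen2 (dfs) endpoint.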


OAuth 2.0 Authentication RFC destination

This step is required only if OAuth authentication was chosen.

Create an RFC destination of type G for Azure Active Directory. This RFC destination represents a connection to the authorization server that grants an access token for ADLS.

Storage Management

In this step, the actual connection to the landing area is created.

  1. Execute transaction /DVD/SM_SETUP.
  2. Switch to Edit Mode and click New Storage.
  3. Choose storage type ADLS_GEN2 and specify Storage ID.
  4. Specify the following parameters:

      • HTTP RFC Destination: Storage RFC destination created previously.
      • Path Prefix: /<path to landing area> within ADLS container.
      • HTTP Call Repeat: Specify the number of times the HTTP request will be retried in case of failure (0 = no repetition).
      • Repeat Delay (seconds): Seconds between repetitions - if not filled, the default value will be set to 3 seconds (instead of 0).
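The interplay of HTTP Call Repeat and Repeat Delay can be pictured as a simple retry loop. This is only an illustrative sketch of the behavior described above, not the product's implementation. The function name, the `send` callable and the use of OSError as the failure signal are assumptions.

```python
import time

DEFAULT_REPEAT_DELAY = 3  # seconds, used when Repeat Delay is not filled in


def call_with_repeat(send, repeat=0, delay=None, sleep=time.sleep):
    """Call send() once, then retry up to `repeat` more times on failure."""
    if not delay:
        delay = DEFAULT_REPEAT_DELAY
    attempts = repeat + 1  # repeat = 0 means a single attempt, no repetition
    for attempt in range(attempts):
        try:
            return send()
        except OSError:
            if attempt == attempts - 1:
                raise  # no repetitions left, report the failure
            sleep(delay)
```

With `repeat=2`, a request that fails twice and then succeeds is sent three times, with a 3-second pause before each repetition when no delay is given.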


The bottom part of the storage connection configuration depends on the authentication method chosen.

a) If a SAS token is used, either insert it directly into the respective field, or insert it into the field next to the Hash button and click Hash to convert it into a hashed string, which prevents the token from being visible in plain text.

SAS token value alerts:

  • Remove the question mark character (?) from the beginning of the generated SAS token.
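If the token is prepared by a script before being pasted into the field, the cleanup can be done as in the following small sketch. The helper name is hypothetical.

```python
def normalize_sas_token(raw: str) -> str:
    """Strip surrounding whitespace and a single leading '?' from a SAS token."""
    token = raw.strip()
    if token.startswith("?"):
        token = token[1:]
    return token


print(normalize_sas_token("?sv=2022-11-02&sp=rcwdl&sig=PLACEHOLDER"))
# sv=2022-11-02&sp=rcwdl&sig=PLACEHOLDER
```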


b) If the OAuth authentication method is chosen, fill in the OAuth profile with information recorded during application registration:

OAuth 2.0 profile: The ID of the profile.
Client ID: Application ID recorded in the section OAuth 2.0 Authentication.
Client secret: Client secret generated in the section OAuth 2.0 Authentication.
Resource: Static address https://storage.azure.com/.
Tenant: Directory ID recorded in the section OAuth 2.0 Authentication. It is the part of the URL identifier described in the article (SM-2305) Azure Data Lake Gen1 (obsolete).
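To show how these fields fit together, the sketch below builds (but does not send) an OAuth 2.0 client-credentials token request. It assumes the Azure AD v1 token endpoint, which is suggested by the Resource field but not confirmed by this article. The function name and the example values are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant: str, client_id: str, client_secret: str) -> Request:
    """Build a client-credentials token request; nothing is sent here."""
    # Assumption: Azure AD v1 endpoint, where the tenant is part of the URL.
    url = f"https://login.microsoftonline.com/{tenant}/oauth2/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": "https://storage.azure.com/",
    }).encode("ascii")
    return Request(url, data=body, method="POST")


# Placeholder values for illustration only.
request = build_token_request("my-tenant-id", "my-app-id", "my-secret")
print(request.full_url)
# https://login.microsoftonline.com/my-tenant-id/oauth2/token
```

The host of this URL is the same login.microsoftonline.com address whose certificates are imported in STRUST.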

Confirm storage creation by saving the information (Ctrl+S).

To check the connection, click the Check button.


Use the Compute hash option to calculate the hash of the data.

Advanced

Failover storage

Failover (read-only) storage makes it possible to use Azure Storage redundancy, e.g. read-access geo-redundant storage (RA-GRS) or read-access geo-zone-redundant storage (RA-GZRS), to improve high availability and disaster recovery. For more information about Azure replication strategies, refer to the official Microsoft documentation Azure Storage redundancy.

When the primary storage is unavailable, i.e. the connection check fails, the application automatically switches to the failover storage and reads data from another data center in a secondary region. Follow the procedure below to enable it on the SAP side.
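The switching rule can be summarized in a few lines. This is an illustrative sketch only, not the product's logic. The function name, the `check` callable and the error raised for non-read operations are assumptions based on the failover storage being read-only.

```python
def select_storage(primary, failover, check, operation="read"):
    """Return the storage to use, depending on the primary connection check."""
    if check(primary):
        return primary
    if operation != "read":
        # The failover storage is read-only, so it cannot take over writes.
        raise RuntimeError("primary storage unavailable; failover storage is read-only")
    return failover


# Simulate a failed connection check on the primary storage.
print(select_storage("primary", "failover", check=lambda storage: False))
# failover
```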

Enable Failover storage

  • Run transaction /DVD/RL_SETT_EXPERT
  • Specify SM in Tool name parameter and execute (F8)

Find the parameter Failover storage, click on the Edit button, and set the value to X.




After that, additional parameters appear in the Storage Management settings (transaction /DVD/SM_SETUP), where you can specify the HTTP destination of your failover storage and its SAS token.

Failover storage RFC

Create an RFC destination of type G for the Azure Data Lake Gen2 secondary endpoint.

  • Set Target Host to your secondary ADLS address. It is the same as for the primary storage, with the suffix -secondary appended to the host name (e.g. dvdadls2-secondary.dfs.core.windows.net).

  • Set Path Prefix to /<filesystem>/<Path to landing folder>.

  • Set SSL to Active and Certificate list to ANONYM.
  • Set HTTP Version to 1.1 and Compression to Active.
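The secondary Target Host can be derived from the primary one by following the naming rule above. This is a minimal sketch with a hypothetical helper name.

```python
def secondary_host(primary_host: str) -> str:
    """Derive the secondary endpoint host from the primary endpoint host."""
    # Split at the first dot: the host name on the left, the domain on the right.
    account, _, domain = primary_host.partition(".")
    return f"{account}-secondary.{domain}"


print(secondary_host("dvdadls2.dfs.core.windows.net"))
# dvdadls2-secondary.dfs.core.windows.net
```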


Option to bypass ICM completely

It is possible to bypass ICM and use only JCo to communicate with Azure.

Reason: ICM and RFC destinations behave unpredictably (the behavior differs depending on the SAP release, cipher suites, etc.).
Changes to the SAP kernel, parameters, and other components outside our control have already caused too many issues and made SM unstable.

To bypass ICM and use only JCo (Java) to communicate with the Azure ADLS Gen2 REST API, set up the Java connector RFC in transaction /DVD/SM_SETUP.