(SM-2402) Snowflake Storage Setup

This page provides general guidelines on how to set up Snowflake Storage to work with Storage Management.

 

With the introduction of Snowpipe Streaming, the connection setup procedure is slightly different compared to the previous Batch Load data replication method.
The crucial differentiator is whether Snowflake Ingest SDK is to be used or not. Because this SDK contains an internal driver manager conflicting with the default one used by SNP Java Connector, the Snowpipe Streaming needs to be fully separated from any other connections.
This is further explained in the instructions below.

1. General prerequisites

1.1 Network Connectivity

To enable communication of SAP systems with the Snowflake host, the following host/ports need to be reachable from the SAP system:

Port

Address

Port

Address

443

<account name>.<region>.<platform>.snowflakecomputing.com

443

ocsp.snowflakecomputing.com

If your Snowflake account is hosted in a particular cloud environment like AWS or MS Azure, make sure their respective domains are reachable from the SAP system (e.g. *.amazonaws.com, *.windows.net); port 443.
This is needed for file upload to the Snowflake internal staging area if the Batch Load data replication method is used.
There is an automatic redirect happening during the PUT operation when transferring data to the Snowflake internal stage, which goes to the respective vendor’s file service (example: customer-stage.s3.us-east-2.amazonaws.com).

In case of connectivity problems, Snowflake offers SnowCD utility to check connectivity towards Snowflake service endpoints possibly involved in the communication: https://docs.snowflake.com/en/user-guide/snowcd.html#

1.2 Warehouse and Database

Snowflake warehouse and database must already exist.

1.3 Snowflake user

Snowflake technical users must be created with full privileges on the chosen Snowflake DB schema.
The recommended privilege for the Snowflake technical user is OWNERSHIP of the Snowflake database.
In case ownership of the database cannot be assigned to the technical user, minimal requirements are the following:

GRANT ALL ON DATABASE <DATABASE> TO ROLE <ROLE> GRANT USAGE ON WAREHOUSE <WAREHOUSE> TO ROLE <ROLE> GRANT ALL ON SCHEMA <SCHEMA> TO ROLE <ROLE>

There should be a one-to-one relation between Snowflake schemas and every SAP system connected, ensuring replicated data isolation.

2. OS prerequisites (On SAP host)

This group of requirements relates to the operating systems underlying the SAP system with all its application servers.
SNP products (e.g. SNP Glue™, SNP OutBoard™ Data Tiering) have been developed and tested in the SUSE Linux environment and Windows Server 2012.
However, by design, they are not limited by the choice of an operating system, if the requirements listed in this guide are met.

2.1 OS directory with JDBC Driver and/or Snowflake Ingest SDK

JDBC protocol is used to connect to Snowflake. Snowflake JDBC driver needs to be manually copied to the operating system and accessible to the Java connector.
Download Snowflake JDBC driver. The latest version of the Snowflake JDBC driver is recommended, except for the Snowpipe Streaming storage connection - the streaming replication method currently requires JDBC driver version 3.13.30 to 3.14.5 in combination with snowflake-ingest-sdk-2.0.x.jar.
It is recommended to store the drivers in a filesystem shared between the application servers, organized in sub-directories to avoid possible conflicts (some customers connect to multiple different platforms/services). Important information is that while the path to the driver directory is defined in the storage connection (SAP GUI), the drivers themselves are loaded by the Java connector component.

Sample directory with the JDBC driver for standard Batch Load method:

$ ls -l /sapmnt/<SID>/drivers/snowflake -rwxr-xr-x 1 <sid>adm root 69376363 Jan 15 09:45 snowflake-jdbc-3.14.5.jar

Sample directory with both JDBC driver and Ingest SDK for Snowpipe Streaming:

$ ls -l /sapmnt/<SID>/drivers/snowflake_streaming -rw-r--r-- 1 <sid>adm sapsys 52768695 Dec 1 16:31 snowflake-ingest-sdk-2.0.4.jar -rw-r--r-- 1 <sid>adm sapsys 33471655 Jan 31 11:53 snowflake-jdbc-3.13.34.jar

Set its ownership and permissions appropriately to <sid>adm[:sapsys].

For compatibility with the SNP Java connector, the Snowflake JDBC driver filename must begin with the prefix snowflake-jdbc. If there are multiple versions of the Snowflake driver present, the latest version will be utilized. It is advisable to keep only one version of the JDBC driver within the Snowflake driver directory.

The Snowflake Ingest SDK JAR filename must begin with the prefix snowflake-ingest. Similarly, if there are multiple versions of the Snowflake Ingest SDK available, the most recently added version will be used. It is recommended to keep only one version of both the JDBC driver and the Snowflake Ingest SDK within the Snowflake streaming driver directory.

3. SAP configuration

3.1 JAVA connector

Java connector is a critical middle-ware component. Follow the steps in Java Connector Setup to set it up/upgrade to a new version before you continue. 

The Java runtime environment is narrowed due to Snowflake JDBC driver requirement - it must be installed in a 64-bit environment and requires Java 1.8 (or higher), link to the official page: https://docs.snowflake.com/en/user-guide/jdbc.html.

3.2 Storage Management Setup

The final step in SAP & Snowflake connectivity is the creation of appropriate storage connections in transaction /DVD/SM_SETUP

For the Batch Load replication method SNOW_STAGE and SNOWFLAKE type connections need to be configured (the first one for data file upload, the second for SQL communication with Snowflake API):

3.2.1 Snowflake Stage (SNOW_STAGE)

Binary storage points to the Internal Stage of the Snowflake user. It is used for data transfer leveraging the extended functionality of the Snowflake JDBC driver.

Storage ID: Logical name of the storage connection.
Storage type: SNOW_STAGE (Snowflake internal stage).
Java connector RFC: TCP/IP RFC destination used for communication with Java connector.
Account name: Name of the Snowflake account without .snowflakecomputing.com domain, in the format: <account name>.<region>.<platform> (more information can be found in Snowflake account name doc).
User role: The user's role used for data insertion.
JAVA Call Repeat: Number of times failed calls should be retried.
Repeat delay (seconds: Delay between retried calls.
Driver path: Absolute path to the directory containing the Snowflake JDBC driver.
NOTE: In previous versions of Storage Management, SAP logical file of type DIR (directory) was used. This is now automatically converted to a physical path.
Connection pool size: Number of connections that can be kept open in the pool, reducing resource-expensive establishment of JDBC connections.
Username: Snowflake user.
Password: User password.

If Key Pair authentication is established according to https://docs.snowflake.com/en/user-guide/key-pair-auth.html#key-pair-authentication-key-pair-rotation, it can be configured with or without a passphrase as follows:

3.2.2 Snowflake Storage (SNOWFLAKE)

Transparent storage pointing to Snowflake Database. For sending SQL command data use the Snowflake JDBC driver.

Storage ID: Logical name of the storage connection.
Storage type: SNOWFLAKE (Snowflake transparent storage).
Referenced storage: Storage ID of Snowflake stage area for temporary file upload/download.
Java connector RFC: RFC Destination used for communication with JCo.
Account name: <account name>.<region>.<platform> (more information can be found in Snowflake account name doc).
Warehouse: Optional parameter, you can select the existing warehouse in the Snowflake account to be used.
Database: Name of an existing database in Snowflake.
Database schema: Name of an existing database schema in Snowflake.
User role: The user's role used for data insertion.
Driver path: Snowflake driver directory.
Hints: Additional string added to connection string when JDBC driver establishes a connection.
Connection pool size: Number of connections that can be kept open in the pool, reducing resource-expensive establishment of JDBC connections.
Table name prefix: Optional text value for naming prefix of all Glue tables created within this storage.
Use Transient tables: When enabled, only transient tables will be created on the storage.
Use Snowflake App for data merge: When enabled, the SNP Native app will be used.
Snowflake App name: Name of the existing Snowflake native App to be used.
Wrap values in staged CSV files: Depending on table content, it may be necessary to encapsulate field values to avoid errors during INSERT into the Snowflake table from staging CSV files.
Data delivery guarantee: Data transfer behavior. More information can be found in the chapter Data Delivery Guarantee.

 

For the Snowpipe Streaming replication method only one SNOWSTREAM type connection is sufficient:

3.2.3 Snowpipe Streaming Storage (SNOWSTREAM)

Transparent storage pointing to Snowflake Database. For sending SQL commands, data uses the Snowflake JDBC driver. To insert data into the Snowflake database Snowflake Ingest SDK is used. For more information about Snowpipe Streaming, requirements, limitations, and configuration visit the official Snowpipe Streaming documentation.

Storage ID: Logical name of the storage connection.
Storage type: SNOWSTREAM (Snowflake transparent storage - Streaming).
Java connector RFC: RFC Destination used for communication with JCo.
Account name: <account name>.<region>.<platform> (more information can be found in Snowflake account name doc).
Warehouse: Optional parameter, you can select the existing warehouse in the Snowflake account to be used.
Database: Name of an existing database in Snowflake.
Database schema: Name of an existing database schema in Snowflake.
User role: The user's role used for data insertion.
Driver path: Snowflake driver directory. It’s required to include the Snowflake Ingest SDK jar file in this directory. The SDK is available for download from the Snowflake Maven Central Repository.
Hints: Additional string added to connection string when JDBC driver establishes a connection.
Connection pool size: Number of connections that can be kept open in the pool, reducing resource-expensive establishment of JDBC connections.
Username: Snowflake user.
Private Key file location: Path to Private Key file.
Private Key Encrypted: When enabled, a Password input field will be displayed.
Password: Password to Private key.
Use Snowflake App for data merge: When enabled, the SNP Native app will be used.
Snowflake App name: Name of the existing Snowflake native App to be used.
Table name prefix: Optional text value for naming prefix of all tables created within this storage.
Data delivery guarantee: Data transfer behavior. More information can be found in the chapter