(SM-2108) GCS & BigQuery storage setup

This page provides general guidelines for setting up Google Cloud Storage (GCS) and BigQuery to work with Datavard Storage Management.

1. GCP prerequisites

The entry point to GCP is https://console.cloud.google.com/. Log in with the company account.

1.1 Project

Select an existing project or create a new one.

Overview of the project resources is always accessible by clicking on:

 

1.2 Service account

The key component for GCP access from the SAP system is a Service account with a generated key in JSON format. Download the key file; it is used for authentication to GCP later in the configuration (step 2.1).
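
Before uploading the key file, it can be useful to confirm that the download is a complete Service account key. The sketch below is only an illustration: it assumes the usual layout of a Google service account key file (fields such as type, client_email and private_key), and the function name missing_key_fields is hypothetical, not part of any Datavard or Google tool.

```python
import json
from typing import List

# Fields normally present in a Google service account key file (assumption:
# the standard JSON key layout produced by the GCP console).
REQUIRED_KEY_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "token_uri",
)


def missing_key_fields(key_json: str) -> List[str]:
    """Return the required fields that are absent or empty in the key JSON."""
    data = json.loads(key_json)
    missing = [name for name in REQUIRED_KEY_FIELDS if not data.get(name)]
    # A service account key declares its type as "service_account".
    if data.get("type") and data["type"] != "service_account":
        missing.append("type (expected 'service_account')")
    return missing
```

An empty list means that all expected fields are present; it does not prove that the key is still valid or that the Service account has the required roles.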

1.3 Storage bucket

The next step is to create a Storage bucket. At a minimum, the Storage Object Admin role needs to be assigned to the Service account. Note down the bucket name for later configuration.

1.4 BigQuery dataset

Similarly, a BigQuery dataset needs to be created with appropriate (read/write) permissions for the Service account. The dataset name will be used later.
The Datavard Java connector uses the Google Cloud API to run queries on BigQuery, which requires at least the BigQuery Job User and BigQuery Data Owner roles to be assigned to the Service account.
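
The roles named in steps 1.3 and 1.4 can be cross-checked against an exported IAM policy. The following is a minimal sketch under stated assumptions: the role IDs roles/storage.objectAdmin, roles/bigquery.jobUser and roles/bigquery.dataOwner are taken to correspond to the role names above, the input is a list of IAM bindings in the usual "role" plus "members" JSON shape, and the helper missing_roles is hypothetical. Because roles may be granted at project, bucket or dataset level, bindings from all relevant levels would have to be merged before the check.

```python
from typing import Iterable, List

# Role IDs assumed to match Storage Object Admin, BigQuery Job User and
# BigQuery Data Owner.
REQUIRED_ROLES = (
    "roles/storage.objectAdmin",
    "roles/bigquery.jobUser",
    "roles/bigquery.dataOwner",
)


def missing_roles(
    bindings: List[dict],
    service_account_email: str,
    required: Iterable[str] = REQUIRED_ROLES,
) -> List[str]:
    """Return the required roles not granted to the service account."""
    member = "serviceAccount:" + service_account_email
    # Collect every role whose member list contains the service account.
    granted = {b["role"] for b in bindings if member in b.get("members", [])}
    return sorted(set(required) - granted)
```

An empty result indicates that every required role appears in the supplied bindings for that Service account.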

2. SAP configuration

2.1 JSON key

The first step on the SAP side is to upload the GCP Service account key file (from step 1.2) to the application server. The standard path for sensitive files used by the Datavard Java connector is /sapmnt/<SID>/global/security/dvd_conn. The key file contains all information necessary for authentication, so access to it should be limited to authorized personnel (at both the operating system level and in SAP transactions).
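
At the operating system level, the access restriction can be verified with a small check of the file mode. This is a sketch only, assuming a POSIX file system and assuming that the key file is owned by the OS user that runs the Java connector; the function name is_private is hypothetical.

```python
import os
import stat


def is_private(path: str) -> bool:
    """Return True if the file grants no permissions to group or others."""
    mode = os.stat(path).st_mode
    # Only regular files qualify; any group/other permission bit fails the check.
    no_group_or_other = (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
    return stat.S_ISREG(mode) and no_group_or_other
```

A file with mode 600 passes this check, while a file with mode 644 does not. Access through SAP transactions has to be reviewed separately.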

2.2 Datavard Java connector

The Java connector is a critical middleware component. Please follow the steps in the Java Connector Setup guide to set it up before you continue.

2.2.1 Datavard Java connector GCP libraries

In addition to the standard Datavard Java connector configuration, the GCP libraries have to be included as a dependency in Java Connector Setup.

If the library is missing, please contact your Datavard representative for further instructions on how to load the library bundle to the system.

2.3 Storage Management setup

The final step in SAP & GCP connectivity is the creation of two storages in transaction /DVD/SM_SETUP: one for the transfer of flat files, the other for table management and SQL queries.

1. Google Cloud Storage (GCS) - required fields
Storage ID - logical name of the storage connection
Storage type - GCS (stands for Google Cloud Storage)
Bucket name - name of the storage bucket from step 1.3
Path Prefix - path to landing area within GCS bucket
Configuration path - physical path to JSON key file from step 2.1
Java connector RFC - TCP/IP RFC destination used for communication with Datavard Java connector (configured in step 2.2)
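
To illustrate how Bucket name and Path Prefix relate, the sketch below joins them with a file name into a gs:// object URI. It is an assumption made for illustration: how Storage Management actually names the transferred files is not described here, and gcs_object_uri is a hypothetical helper.

```python
def gcs_object_uri(bucket: str, path_prefix: str, file_name: str) -> str:
    """Join bucket, path prefix and file name into a gs:// URI."""
    # Strip slashes so that "landing/", "/landing" and "landing" behave alike,
    # and skip the prefix entirely when it is empty.
    parts = [p.strip("/") for p in (path_prefix, file_name) if p.strip("/")]
    return "gs://" + bucket + "/" + "/".join(parts)
```

For example, gcs_object_uri("my-bucket", "/sap/landing/", "data.csv") returns "gs://my-bucket/sap/landing/data.csv".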


2. Google BigQuery storage (BIGQUERY) - required fields
Storage ID - logical name of the storage connection
Storage Type - BIGQUERY
Dataset - name of the BigQuery dataset from step 1.4
Reference Storage - Google Cloud storage used for the flat-file transfer
Java connector RFC - TCP/IP RFC destination used for communication with Datavard Java connector

Use extended escaping - check to enable extended escaping, which replaces escape characters (such as newline, backspace, and tab) with a space character
Force file cursor reader - option to force usage of files for data transfer when reading data stored in BigQuery
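
The effect of the Use extended escaping option can be pictured with a short sketch. The exact set of characters that Storage Management replaces is not listed here, so the set below (backspace, tab, newline, vertical tab, form feed, carriage return) is an assumption, and extended_escape is a hypothetical name.

```python
# Characters treated as escape characters in this sketch (assumption):
# backspace, tab, newline, vertical tab, form feed, carriage return.
_ESCAPE_CHARS = "\b\t\n\v\f\r"
_TO_SPACE = str.maketrans({ch: " " for ch in _ESCAPE_CHARS})


def extended_escape(value: str) -> str:
    """Replace each escape character in value with a single space."""
    return value.translate(_TO_SPACE)
```

With this sketch, the value "line1\nline2" becomes "line1 line2", so a field value can no longer break the row structure of a flat file.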

Once all configuration steps are completed and there is no problem with permissions, the automatic storage check shows a green tick.