(SM-1911) Azure Databricks

1. Prerequisites on Azure portal

The entry point to Azure Databricks is https://portal.azure.com. From the home page, navigate to the Azure Databricks section.

Your account must have Owner or Contributor privileges on the Databricks workspace to be able to access it.
By default, the Azure Databricks cluster is not running. Either create a new cluster or start an existing one (be sure to change the cluster filter to show all clusters).

In Advanced Options, set up the Spark config for the connection to ADLS storage. The configuration is a set of key-value pairs, with each key separated from its value by a single space; the full set to paste is shown after the list below.

Spark metastore version: spark.sql.hive.metastore.version 1.2.1
Spark metastore JARs: spark.sql.hive.metastore.jars builtin
OAuth access type: spark.hadoop.dfs.adls.oauth2.access.token.provider.type ClientCredential
OAuth login endpoint: spark.hadoop.dfs.adls.oauth2.refresh.url https://login.microsoftonline.com/6fdc3117-ec29-4d73-8b33-028c8c300872/oauth2/token
OAuth secret: spark.hadoop.dfs.adls.oauth2.credential +gv2ThjPc++++++++++++++++++++++++++zV8NrM74=
OAuth client ID: spark.hadoop.dfs.adls.oauth2.client.id 74731c8c-7290-4998-8005-1d0670cbe909
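
Pasted into the Spark config box, the pairs above read as follows (the tenant ID, secret, and client ID are the sample values from this page; substitute your own):

spark.sql.hive.metastore.version 1.2.1
spark.sql.hive.metastore.jars builtin
spark.hadoop.dfs.adls.oauth2.access.token.provider.type ClientCredential
spark.hadoop.dfs.adls.oauth2.refresh.url https://login.microsoftonline.com/6fdc3117-ec29-4d73-8b33-028c8c300872/oauth2/token
spark.hadoop.dfs.adls.oauth2.credential +gv2ThjPc++++++++++++++++++++++++++zV8NrM74=
spark.hadoop.dfs.adls.oauth2.client.id 74731c8c-7290-4998-8005-1d0670cbe909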

Note the server hostname, port, and JDBC URL (shown on the cluster's JDBC/ODBC tab under Advanced Options); these will be used in the SM storage definition.
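
For orientation, the JDBC URL shown there usually takes roughly this shape (the exact scheme and parameter names depend on the driver version Databricks displays, so treat this as an illustrative sketch with placeholder values):

jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<workspace-id>/<cluster-id>;AuthMech=3;UID=token;PWD=<personal-access-token>

The PWD field takes the personal access token created in a later step.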

To create a database, first create a notebook for submitting SQL queries.

Now you can run a CREATE DATABASE SQL query, for example the one below.
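
A minimal query for this step, assuming a database named sm_db (the name is purely illustrative):

CREATE DATABASE IF NOT EXISTS sm_db;

You can confirm the result with SHOW DATABASES; in the same notebook.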

Create a personal access token for remote access in User Settings (user icon at the top right).

Save the token for later use in the SAP configuration.
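
When the SM storage definition asks for credentials, the token is used as the JDBC password together with the literal user name token, which is the usual Databricks convention for token authentication (placeholder shown):

...;AuthMech=3;UID=token;PWD=<personal-access-token>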

2. Configuration on SAP side

Most of the setup is identical to the Hive storage described in the Hive Metastore Storage chapter of Hadoop Storage Setup.