(SM-1911) Azure Databricks
1. Prerequisites on Azure portal
The entry point to Azure Databricks is https://portal.azure.com. From the home page, navigate to the Databricks section.
Your account must have Owner or Contributor privileges on the Databricks workspace to be able to access it.
By default, the Azure Databricks cluster is not running. Either create a new cluster or start an existing one (be sure to change the cluster filter to show all clusters). If you prefer not to use the UI, the cluster can also be started via the REST API, as sketched below.
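A minimal sketch of starting a cluster through the Databricks Clusters REST API, assuming you already have a personal access token (created in the token step later in this guide); the workspace URL, token, and cluster ID are placeholders:

# Sketch: start an existing cluster via the Databricks REST API instead of the UI.
# All values below are placeholders, not part of the original setup.
import requests

WORKSPACE_URL = "https://<your-workspace>.azuredatabricks.net"  # placeholder
TOKEN = "dapi..."                    # personal access token (placeholder)
CLUSTER_ID = "0000-000000-abcd123"   # placeholder cluster ID

resp = requests.post(
    f"{WORKSPACE_URL}/api/2.0/clusters/start",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"cluster_id": CLUSTER_ID},
)
resp.raise_for_status()  # HTTP 200 means the start request was accepted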
In Advanced Options, set up the Spark config for the connection to ADLS storage. The configuration is a set of key-value pairs, with each key separated from its value by a single space.
Spark metastore version
spark.sql.hive.metastore.version 1.2.1

Spark drivers
spark.sql.hive.metastore.jars builtin

OAuth access type
spark.hadoop.dfs.adls.oauth2.access.token.provider.type ClientCredential

OAuth login endpoint
spark.hadoop.dfs.adls.oauth2.refresh.url https://login.microsoftonline.com/6fdc3117-ec29-4d73-8b33-028c8c300872/oauth2/token

OAuth secret
spark.hadoop.dfs.adls.oauth2.credential +gv2ThjPc++++++++++++++++++++++++++zV8NrM74=

OAuth client ID
spark.hadoop.dfs.adls.oauth2.client.id 74731c8c-7290-4998-8005-1d0670cbe909
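Once the cluster has restarted with this config, you can sanity-check the ADLS connection from a notebook. A minimal sketch, assuming a hypothetical ADLS Gen1 store name (the path is a placeholder):

# Run in a Python notebook cell to verify the ADLS OAuth config works.
# "youradlsaccount" is a placeholder for your ADLS Gen1 store name.
display(dbutils.fs.ls("adl://youradlsaccount.azuredatalakestore.net/"))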
Note the server hostname, port, and JDBC URL (shown on the cluster's JDBC/ODBC tab under Advanced Options); these will be used in the SM storage definition.
To create a database, first create a notebook for submitting SQL queries. You can then run a CREATE DATABASE query (see the sketch below).
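A minimal notebook sketch; the database name is hypothetical:

# Python notebook cell: create a database via Spark SQL.
# "sm_db" is a hypothetical name; adjust to your naming convention.
spark.sql("CREATE DATABASE IF NOT EXISTS sm_db")
spark.sql("SHOW DATABASES").show()  # confirm the database exists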
Create a token for remote access in User Settings (user icon, top right).
Save the token for later use in the SAP configuration. A quick way to verify the token and the JDBC endpoint noted earlier is sketched below.
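One option for verifying remote access (not part of the original setup) is the databricks-sql-connector Python package (pip install databricks-sql-connector). The hostname, HTTP path, and token below are placeholders taken from your cluster's JDBC/ODBC tab and the token created above:

# Sketch: verify remote access with the token via databricks-sql-connector.
# All connection values are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="sql/protocolv1/o/1234567890123456/0000-000000-abcd123",
    access_token="dapi...",
) as conn:
    with conn.cursor() as cur:
        cur.execute("SHOW DATABASES")
        print(cur.fetchall())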
2. Configuration on SAP side
Most of the setup is identical to Hive storage, described under the Hive Metastore Storage chapter in Hadoop Storage Setup.