(GLUE-1808) Delta Merge

Introduction

Datavard Glue Delta Merge enables data updates in Hadoop.

Currently, when Delta Merge is set up for a Hadoop table (the original table), a second table (the merge table) is created. The merge table holds the data with unique primary keys and thereby simulates update functionality.
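
The idea behind the merge table can be illustrated with a short conceptual sketch in Python. This is not how Glue implements Delta Merge internally; it only shows what "unique primary keys" means for the result. The function name merge_latest, the sample fields id and status, and the rule that a later record replaces an earlier one with the same key are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def merge_latest(rows: Iterable[Row], key_fields: List[str]) -> List[Row]:
    """Keep one row per primary key (hypothetical helper, not a Glue API).

    Assumption: rows arrive in load order, so a later row with the same
    key represents an update and replaces the earlier one.
    """
    merged: Dict[Tuple[object, ...], Row] = {}
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        merged[key] = row  # a later delta overwrites the earlier record
    return list(merged.values())


# The original table is append-only: the update to id 1 is a new row.
original_table = [
    {"id": 1, "status": "NEW"},
    {"id": 2, "status": "NEW"},
    {"id": 1, "status": "PAID"},
]

# The merge table holds exactly one row per primary key.
merge_table = merge_latest(original_table, ["id"])
```

Here the original table still contains both versions of the record with id 1, while the merge table contains only the most recent one, which is what makes the merge table suitable for reporting.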

Note

Delta Merge is limited to HIVE and IMPALA.

Delta Merge setup

You can access the Delta Merge setup only from the Object Navigator (/DVD/GL80).

Features

After you specify a Glue table (created in Hive or Impala storage), you can perform the following actions:

  • Show configuration - shows the existing Delta Merge configuration
  • Configure Delta merge - creates a Delta Merge configuration
  • Remove Delta merge configuration - removes the existing Delta Merge configuration (also removes the merge table)

Configure Delta Merge

Use the following procedure to set up Delta Merge:

  1. Press Configure Delta merge
  2. Fill in the required information
  • Source GLUE Table - the existing Hadoop table for which Delta Merge will be set up
  • Reporting GLUE Table - the name of the merge table in which the updated data will be stored

Note

The Reporting GLUE Table is generated automatically. If a table with that name already exists, it will be overwritten.

  • GLUE Merge Profile - settings for OOZIE and HDFS RFC connections, profiles can be configured in 
  • Pick reporting partitioning fields - select the fields to be used for partitioning (if no fields or all fields are selected, the Delta Merge configuration will not be created)


If all fields are filled in correctly, click the icon to create the Delta Merge configuration for the source table (the original table).
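
The rule for the partitioning fields (at least one field, but not all of them) can be checked before the configuration is created. The following is a minimal sketch of that rule in Python. The function name partitioning_is_valid and the sample field names are hypothetical, and the additional check that every selected field belongs to the table is an assumption, not something the setup screen is documented to do.

```python
from typing import Iterable, Sequence


def partitioning_is_valid(
    table_fields: Sequence[str], partition_fields: Iterable[str]
) -> bool:
    """Return True if the selection is a non-empty proper subset of the fields.

    Hypothetical helper: mirrors the documented rule that selecting no
    fields or all fields prevents the configuration from being created.
    """
    all_fields = set(table_fields)
    chosen = set(partition_fields)
    # "chosen < all_fields" is a proper-subset test: every chosen field
    # exists in the table (assumption) and at least one field is left out.
    return bool(chosen) and chosen < all_fields


fields = ["id", "year", "amount"]  # sample field names, for illustration only

ok = partitioning_is_valid(fields, ["year"])        # some fields selected
none_selected = partitioning_is_valid(fields, [])   # no fields selected
all_selected = partitioning_is_valid(fields, fields)  # all fields selected
```

Only the first selection passes; the other two correspond to the cases in which the Delta Merge configuration is not created.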