(GLUE-1808) Delta Merge
Introduction
Datavard Glue Delta Merge enables data updates in Hadoop.
Currently, when Delta Merge is set up for a Hadoop table (the original table), a second table (the merge table) is created. The merge table holds the data with unique primary keys and thereby simulates update functionality.
Note
Delta Merge is limited to Hive and Impala.
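The idea of a merge table that holds one row per primary key can be illustrated with a minimal Python sketch. It is not the Glue implementation: the function name `merge_deltas`, the sample rows, and the rule that a later row replaces an earlier row with the same key are assumptions made for illustration only.

```python
def merge_deltas(rows, key_fields):
    """Return one row per primary key; a later row replaces an earlier one.

    Assumption: rows arrive in load order, so the last row seen for a key
    is treated as the current version of that record.
    """
    merged = {}
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        merged[key] = row  # overwrite any earlier row with the same key
    return list(merged.values())


# Hypothetical content of the original table: key 1 was loaded twice.
source_rows = [
    {"id": 1, "status": "NEW"},
    {"id": 2, "status": "NEW"},
    {"id": 1, "status": "PAID"},
]

# The merge table keeps a single, most recent row for each key.
merge_table = merge_deltas(source_rows, ["id"])
```

In this sketch the original table keeps all three rows, while the merge table exposes only the latest state of each key, which is what makes the appended delta look like an update.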
Delta Merge setup
You can access the Delta Merge setup only from the Object Navigator (/DVD/GL80).
Features
After you specify a Glue table (created in Hive or Impala storage), you can perform the following actions:
- Show configuration - show the existing Delta Merge configuration
- Configure Delta merge - create a Delta Merge configuration
- Remove Delta merge configuration - remove the existing Delta Merge configuration (this also removes the merge table)
Configure Delta Merge
Use the following procedure to set up Delta Merge:
- Press Configure Delta merge
- Fill in the required information
- Source GLUE Table - the existing Hadoop table for which Delta Merge will be configured
- Reporting GLUE Table - the name of the merge table in which the updated data will be stored
Note
The reporting Glue table is generated automatically. If a table with that name already exists, it is overwritten.
- GLUE Merge Profile - settings for OOZIE and HDFS RFC connections, profiles can be configured in
- Pick reporting partitioning fields - select the fields to use for partitioning (if no fields or all fields are selected, the Delta Merge configuration will not be created)
If all fields are entered correctly, click the icon to create the Delta Merge configuration for the source table (the original table).
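The partitioning rule from the procedure (at least one field, but not all fields) can be expressed as a small check. This is only a sketch of the rule as stated above, not the validation Glue performs: the function name `partitioning_is_valid` and the sample field names are hypothetical, and rejecting fields that do not exist in the table is an added assumption.

```python
def partitioning_is_valid(table_fields, partition_fields):
    """Check the rule: select at least one field, but not every field.

    Assumption: a selection that names a field missing from the table
    is also treated as invalid.
    """
    selected = set(partition_fields)
    # Non-empty and a proper subset of the table's fields.
    return bool(selected) and selected < set(table_fields)


# Hypothetical fields of a reporting table.
fields = ["id", "year", "status"]

some_selected = partitioning_is_valid(fields, ["year"])
none_selected = partitioning_is_valid(fields, [])
all_selected = partitioning_is_valid(fields, ["id", "year", "status"])
```

Only the first selection satisfies the rule; with no fields or all fields selected, the configuration would not be created.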