(GLUE-1808) Create SAP Table to Storage Extractor

You can use the SAP Table to Storage extractor to load structured data to external storage (such as Hadoop).


To create the extractor:

  1. Go to the Glue Extractor Workbench.
  2. Right-click the SAP Table to Storage extractor type and select Create Extractor.
  3. Specify the following parameters:
    • Extractor name - technical name of the extractor
    • Description
    • Extractor type - this value is predefined, but you can change it using a simple drop-down menu.
    • Source table – the source SAP (DDIC) table or view from which the data is transferred.
    • Target table – the target external (Hadoop) table

4. Specify a package



5. Define a mechanism for data synchronization - Delta type

The following synchronization types are supported:

      • FULL – Standard full load without replication. Data is transferred and appended to the target table.
      • VALUE – Value-based delta load. Only entries with new distinct values in the selected delta field are transferred during the next execution of the extractor.
      • DATE – Date-based delta load. In the first run, all data with a date older than the current date is transferred. In subsequent runs, only data with a date newer than the latest date from the previous extraction and older than the current date is transferred.
      • TMSTMP – Timestamp-based delta. In the first run, all data is transferred. In subsequent runs, only data with a timestamp greater than the latest timestamp from the previous run is transferred.
      • TRIGGER – Trigger-based delta.

See this link for more information.

For extractors of delta type DATE, TMSTMP and VALUE you must select a field for delta extractions.
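
The selection rules for the DATE, TMSTMP and VALUE delta types described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the extractor's actual implementation: the function names, the row layout (dictionaries) and the field name ERDAT are assumptions made for the example.

```python
from datetime import date

def select_date_delta(rows, field, today, highest_date=None):
    # DATE: the first run takes everything dated before today; later runs
    # additionally require a date newer than the highest date already extracted.
    return [
        r for r in rows
        if r[field] < today and (highest_date is None or r[field] > highest_date)
    ]

def select_timestamp_delta(rows, field, last_timestamp=None):
    # TMSTMP: the first run takes everything; later runs take only newer timestamps.
    return [r for r in rows if last_timestamp is None or r[field] > last_timestamp]

def select_value_delta(rows, field, seen_values):
    # VALUE: only rows whose delta-field value has not been extracted before.
    return [r for r in rows if r[field] not in seen_values]

# Hypothetical sample data; ERDAT is used purely as an illustrative field name.
rows = [
    {"ID": 1, "ERDAT": date(2024, 1, 1)},
    {"ID": 2, "ERDAT": date(2024, 1, 2)},
    {"ID": 3, "ERDAT": date(2024, 1, 3)},
]

first_run = select_date_delta(rows, "ERDAT", today=date(2024, 1, 3))
highest = max(r["ERDAT"] for r in first_run)
second_run = select_date_delta(rows, "ERDAT", today=date(2024, 1, 4), highest_date=highest)

print([r["ID"] for r in first_run])   # [1, 2]
print([r["ID"] for r in second_run])  # [3]
```

In the sketch, the row dated on the current day is skipped in the first run and picked up in the next one, which matches the "older than the current date" condition of the DATE delta type.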



6. The Fieldname Mapping is automatically generated


When the target and source tables contain fields with the same names, they are linked automatically, but source table fields can be changed manually when needed. For example, as shown in the figure above, no source table field is assigned to the target table field "CLAZZ" because the source table has no field with that name.
The Selection column in the Fieldname Mapping table marks the fields that can later be used for filtering (selection) of the data when defining a variant.
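
The automatic linking by field name can be pictured with the following minimal sketch. It assumes exact, case-sensitive name matching, which the text above does not confirm. The function name and all field names other than CLAZZ are hypothetical.

```python
def auto_map_fields(source_fields, target_fields):
    # Link each target field to the source field with the same name;
    # target fields without a match stay unassigned (None) for manual mapping.
    source = set(source_fields)
    return {t: (t if t in source else None) for t in target_fields}

# Hypothetical field lists; only CLAZZ is taken from the example in the text.
mapping = auto_map_fields(
    source_fields=["MANDT", "OBJNR", "CLASS"],
    target_fields=["MANDT", "OBJNR", "CLAZZ"],
)
print(mapping)  # {'MANDT': 'MANDT', 'OBJNR': 'OBJNR', 'CLAZZ': None}
```

An unassigned entry such as CLAZZ corresponds to a row in the Fieldname Mapping table where the source field has to be chosen manually.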
Before using the extractor, it must be saved and activated. When activating an unsaved extractor, it is saved automatically before activation.

Delta details

You can use the Delta Tool to see details about the extractor delta.

Delta Tools - Properties tab

Properties

The Properties tab contains general information about the current state of an extractor:

  • Delta type
  • Status - current status of the extractor (Saved or Active)
  • Source - the name of the source view/table
  • Last extraction - date and time of the last extraction
  • Full load - indicates whether the full load was executed

The Properties section shows the delta volume (the number of records that will be extracted in the next load run) and additional information specific to each delta type.

  • Full mode

Truncate target table - if this checkbox is selected, the data in the target table is deleted before a new extraction

  • Trigger

Shadow table - press Display to see the shadow table, which is generated when trigger-based delta is used.

  • Date

Date field - name of the field on which the extraction is based

Highest date - the highest date value in the last extraction

  • TMSTMP

Timestamp field - name of the field on which the extraction is based

Last timestamp - the last timestamp value in the last extraction

  • Value

Value field - name of the field on which the extraction is based
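
The effect of the Truncate target table option in Full mode, compared with the default append behaviour of a FULL load, can be illustrated with a small sketch. The function and its list-based "tables" are assumptions made for the example and do not reflect how the extractor is implemented.

```python
def full_load(source_rows, target_rows, truncate=False):
    # With truncate, existing target data is deleted before the extraction;
    # without it, the extracted data is appended to what is already there.
    if truncate:
        target_rows.clear()
    target_rows.extend(source_rows)
    return target_rows

source = ["row1", "row2"]

appended = full_load(source, ["row1", "row2"])                  # repeated full load, no truncate
truncated = full_load(source, ["row1", "row2"], truncate=True)  # same load with truncate

print(appended)   # ['row1', 'row2', 'row1', 'row2']
print(truncated)  # ['row1', 'row2']
```

Repeating a full load without truncation duplicates the rows in the target, which is why the checkbox matters when a full load is executed more than once.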

Manage

Using the Manage tab, you can trigger actions on the extractor.

First, you must select an existing variant. If only one variant exists, it is selected automatically; if more variants exist, you must choose one.

Hit the RUN button to:

  • Perform a full load
  • Perform an immediate delta run

Delta Tools - Manage tab