(Glue-2308) Mass Execution
The extraction process of Extractor 2.0 supports mass execution, which allows you to execute the process in parallel. Mass execution splits the data into multiple portions, which are scheduled and executed as separate tasks.
Mass execution is currently not supported for Listcube fetcher and ODP fetcher.
For Mass Execution, the only supported load type is a full load. The delta can still be initialized via the checkbox in the Additional Fetcher config section.
To execute Extractor 2.0 in mass execution mode, follow these steps:
Go to the extraction process and press Execute.
As the execution type, choose Mass Execution.
After this action, you should see the Mass Execution parameters on the screen.
Fill in the Job configuration. This section specifies the properties of the background jobs used to execute the tasks that extract each data portion.
Maximum number of jobs: This parameter specifies the number of background jobs used for extraction.
Application server: Optional input specifying the application server that executes the process. If left empty, the current application server is used.
Application server group: Optional input specifying the application server group that executes the process. If left empty, the current application server is used.
Keep alive: When checked, another background job is opened if a previous task ends with an unexpected error. This way, parallelism is kept at the specified level.
We recommend you always keep this flag checked.
After configuring the jobs, you need to select the Data portioning type. This parameter defines the logic used to split the data into portions. Based on the selected value, the parameters relevant to the selected portioning type are displayed. These are described further below.
You can choose from the following data portioning types:
Number of records
Size in MB
Time field
Fiscal period
Fill in the parameters required for the selected data portioning logic.
The relevant parameters are described in the separate sections covering each portioning logic.
Press Execute (F8).
In the case of the SNP Glue™ Table Fetcher, only HANA native storage is supported. As for the data split, only splitting based on the number of records and the size in MB is supported.
If your existing selection conflicts with the portioning parameters, the popup below is displayed.
Split data based on the Number of records
This logic splits the source data into portions by comparing the values of the selected field while trying to match the required row limit; see the sketch after the parameter list below.
In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on the number of records.
You need to fill in these parameters:
Portioning field: Field of the source structure used to split the data into portions.
Limit: Desired number of rows for a single data portion.
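The exact splitting algorithm is internal to the product, but the idea can be illustrated with a minimal, hedged Python sketch; the field name, the sample data, and the assumption that a portion is only closed on a field-value boundary once the row limit is reached are hypothetical:

from itertools import groupby

def build_portions(rows, field, limit):
    # Group rows into portions on boundaries of the portioning field,
    # closing a portion once the row limit is reached. A single field value
    # is never split across two portions, so a portion may exceed the limit.
    portions, current, count = [], [], 0
    keyed = sorted(rows, key=lambda r: r[field])
    for value, group in groupby(keyed, key=lambda r: r[field]):
        bucket = list(group)
        current.append(value)
        count += len(bucket)
        if count >= limit:              # limit reached -> close this portion
            portions.append((current[0], current[-1], count))
            current, count = [], 0
    if current:                         # remainder forms the last portion
        portions.append((current[0], current[-1], count))
    return portions

# Example: hypothetical portioning field DOC_NUMBER, limit of 3 rows per portion.
rows = [{"DOC_NUMBER": n} for n in (1, 1, 2, 3, 3, 3, 4, 5)]
print(build_portions(rows, "DOC_NUMBER", limit=3))
# -> [(1, 2, 3), (3, 3, 3), (4, 5, 2)]   (low value, high value, row count)

The size-in-MB split described in the next section follows the same idea, accumulating data size instead of the row count.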
Split data based on Size in MB
This logic splits the source data based on the data size in MB, comparing the values of the selected field while trying to match the required size limit.
In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on the size in MB.
You need to fill in these parameters:
Portioning field: Field of the source structure used to split the data into portions.
Limit: Desired size in MB for a single data portion.
Split data based on the Time field
This portioning logic splits the source data based on the defined time value. Using a time characteristic field and the range and unit definition, a separate task is created for each value within the given range; see the sketch after the parameter list below.
In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on the time field.
You need to fill in these parameters:
Time field for portioning: Field of the source structure used to split the data into portions.
Timeshift for start: This value represents the number of time units defining the start of the range relative to the current date.
E.g., if you provide 10 as the value and use Year(s) as the unit, you define your range to start 10 years before the current year.
Step size: Number of time units covered by a single range.
Assuming the selection from the image above (step defined as 1 and unit as Year(s)), only values matching a single year are covered by a single extraction task.
Timeshift end: This value represents the number of time units defining the end of the range relative to the current date.
E.g., if you provide 5 as the value and use Year(s) as the unit, you define your range to end 5 years before the current year.
Unit: Unit for range calculation. Day(s), Month(s), and Year(s) are supported units for the time field-based data split. The time unit is not explicitly labeled on the selection screen but is available as a dropdown list next to the previous parameters.
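As a minimal sketch of the range calculation described above (assuming, for illustration only, that both range boundaries are inclusive and that the unit is Year(s)):

from datetime import date

def time_field_portions(timeshift_start, timeshift_end, step, current_year=None):
    # One task per `step` years between (current year - timeshift_start)
    # and (current year - timeshift_end); boundary inclusivity is an assumption.
    current_year = current_year or date.today().year
    start = current_year - timeshift_start
    end = current_year - timeshift_end
    return [(year, min(year + step - 1, end)) for year in range(start, end + 1, step)]

# Timeshift for start = 10, Timeshift end = 5, Step size = 1, Unit = Year(s),
# evaluated for an assumed current year of 2024:
print(time_field_portions(10, 5, 1, current_year=2024))
# -> [(2014, 2014), (2015, 2015), ..., (2019, 2019)]   one task per year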
Split data based on Fiscal period
This portioning logic splits the source data based on the defined fiscal year/period value. Using the Fiscal Year or Fiscal Period field of the source structure, the fiscal year variant, and the range and unit definition, a separate task is created for each fiscal value within the given range; see the sketch after the parameter list below.
In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on fiscal period.
You need to fill in these parameters:
Time field for portioning: Field of the source structure used to split the data into portions.
Timeshift for start: This value represents the number of time units defining the start of the range relative to the current date.
E.g., if you provide 10 as the value and use Posting Period(s) as the unit, you define your range to start 10 posting periods before the current date.
Fiscal year variant: Fiscal year variant used for the fiscal period definition. You can use the provided F4 help to select one of the fiscal year variants available in the system.
Step size: Number of time units covered by a single range.
Assuming the selection from the image above (step defined as 1 and unit as Posting Period(s)), only values matching a single posting period are covered by a single extraction task.
Timeshift end: This value represents the number of time units defining the end of the range relative to the current date.
E.g., if you provide 5 as the value and use Posting Period(s) as the unit, you define your range to end 5 posting periods before the current date.
Unit: Unit for range calculation. Posting Period(s) and Fiscal Year(s) are supported units for the fiscal period-based data split. The unit is not explicitly labeled on the selection screen but is available as a dropdown list next to the previous parameters.
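The actual range calculation depends on the selected fiscal year variant, which defines how many posting periods a fiscal year has. As a minimal sketch, assuming a plain variant with 12 posting periods, no special periods, and inclusive range boundaries:

def shift_period(year, period, offset, periods_per_year=12):
    # Shift a fiscal year/period pair by `offset` posting periods
    # (simplified: 12 posting periods, no special periods).
    index = year * periods_per_year + (period - 1) + offset
    return index // periods_per_year, index % periods_per_year + 1

def fiscal_portions(current, timeshift_start, timeshift_end, step=1):
    # One task per `step` posting periods between the shifted start and end.
    year, period = current
    start = shift_period(year, period, -timeshift_start)
    end = shift_period(year, period, -timeshift_end)
    tasks, cursor = [], start
    while cursor <= end:                 # (year, period) tuples compare correctly
        tasks.append(cursor)
        cursor = shift_period(*cursor, step)
    return tasks

# Timeshift for start = 10, Timeshift end = 5, Step size = 1,
# assumed current fiscal period 008.2024 with a 12-period variant:
print(fiscal_portions((2024, 8), 10, 5))
# -> [(2023, 10), (2023, 11), (2023, 12), (2024, 1), (2024, 2), (2024, 3)]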
Advanced mass execution capabilities
Mass execution also allows you to define composite splitting methods. When an advanced data split type is used, the following button appears on the selection screen.
For a complete guide on how to set up advanced mass execution and understand this functionality, please see the chapter Advanced Mass Execution Capabilities.
Mass execution with truncate option on file storage
Right after the mass execution is scheduled, a new JSON file is created that contains the truncate flag set to X (enabled). The JSON files for the individual extraction jobs contain an empty truncate flag.