(Glue-2211) Mass Execution

The Extraction process of Extractor 2.0 supports mass execution, which lets you run the process in parallel. Mass execution splits the data into multiple portions, which are scheduled and executed as separate tasks.

To execute Extractor 2.0 in mass execution mode, follow these steps:

  1. Go to the extraction process and press Execute.

     

  2. As the execution type, choose Mass Execution.

    After this action, you should see Mass Execution parameters on the screen.

     

  3. Fill in the Job configuration section. This section specifies the properties of the background jobs that execute the tasks extracting each data portion.

     

    1. Maximum number of jobs - This parameter specifies the number of background jobs used for extraction.

    2. Application server - Optional input, specifying an application server executing the process. If left empty, the current application server will be used.

    3. Application server group - Optional input, specifying an application server group executing the process. If left empty, the current application server will be used.

    4. Keep alive - When checked, another background job is opened if a previous task ends with an unexpected error. This keeps the level of parallelism at the specified value.
      We recommend you always keep this flag checked.

  4. After configuring the jobs, select the Data portioning type. This parameter defines the logic used to split the data into portions. Depending on the selected value, different parameters relevant for that portioning type are shown. These are described further below.
    You can choose from the following data portioning types:

    1. Number of records

    2. Size in MB

    3. Time field

    4. Fiscal period

  5. Fill in the parameters required for the selected data portioning logic.
    The relevant parameters are described in the separate section for each portioning logic.

  6. Press Execute (F8).
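
The effect of the Keep alive flag from the job configuration in step 3 can be illustrated with a toy simulation. This is only a sketch under assumptions: the function name, the round-based scheduling, and the decision not to retry a failed portion are hypothetical and do not describe the actual SNP Glue implementation.

```python
from collections import deque


def simulate_mass_execution(portions, max_jobs, keep_alive, failing=frozenset()):
    """Process portions in rounds; return (extracted portions, jobs still open)."""
    queue = deque(portions)
    open_jobs = max_jobs
    extracted = []
    while queue and open_jobs > 0:
        # In every round, each open job takes one portion from the queue.
        for _ in range(min(open_jobs, len(queue))):
            portion = queue.popleft()
            if portion in failing:
                # The job ends with an unexpected error. With keep alive,
                # a replacement job is opened, so the job count stays the same.
                # Assumption: the failed portion itself is not retried.
                if not keep_alive:
                    open_jobs -= 1
            else:
                extracted.append(portion)
    return extracted, open_jobs


portions = [1, 2, 3, 4, 5, 6]
with_keep_alive = simulate_mass_execution(portions, 2, True, failing={1, 2})
without_keep_alive = simulate_mass_execution(portions, 2, False, failing={1, 2})
print(with_keep_alive)     # ([3, 4, 5, 6], 2)
print(without_keep_alive)  # ([], 0)
```

In this simplified model, two early failures use up both jobs when Keep alive is off, so the remaining portions are never extracted. With Keep alive on, the parallelism level stays at two jobs.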

In the case of SNP Glue™ Table Fetcher, only HANA native storage is supported. As for the data split, splitting based on the number of records and on the size in MB is supported.

Depending on the portioning type, splitting the data might be a time-consuming operation. Therefore, you can also choose Background execution in mass execution mode. For the data portioning types Number of records and Size in MB, we recommend using Background execution.

Important note on selection handling:

After the data split, the mass execution functionality creates an additional selection that is designed to select only a particular data portion. However, this additional selection is just added to the existing selection (if specified) and does not remove the original selection.

Therefore, be careful when defining the mass execution, because an existing selection on the portioning field can make the split ineffective.
E.g., if you split data based on the calendar year and also provide a range of years as the selection for the calendar year field, this selection is present in each mass execution task. As a result, each task selects all data that matches either your manually specified range or the range generated by mass execution.

Therefore, we recommend that you leave the selection of the portioning field empty unless you are executing a specific case, e.g., extraction of all data between 2010 and 2020 except the year 2012.

If your existing selection conflicts with the portioning parameters, a popup is displayed.
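
The selection behavior described above can be illustrated with a small Python sketch. It assumes that several inclusive ranges on the same field are combined with OR, as the description implies. The function names, the field name CALYEAR, and the sample data are hypothetical and not part of SNP Glue.

```python
def matches(value, ranges):
    """Return True if value falls into at least one inclusive (low, high) range."""
    return any(low <= value <= high for low, high in ranges)


def select(records, field, ranges):
    """Select the records whose field value matches any of the given ranges."""
    return [r for r in records if matches(r[field], ranges)]


# Hypothetical source data: one record per calendar year from 2008 to 2022.
records = [{"CALYEAR": year} for year in range(2008, 2023)]

user_selection = [(2010, 2020)]  # range entered manually on the portioning field
task_portion = [(2015, 2015)]    # range generated by mass execution for one task

# The generated range is added to the manual one, so the task selects both.
combined = select(records, "CALYEAR", user_selection + task_portion)
portion_only = select(records, "CALYEAR", task_portion)

print(len(combined))      # 11 records: the whole range 2010 to 2020
print(len(portion_only))  # 1 record: only the year 2015
```

With the manual range in place, the task meant for 2015 reads all eleven years, which is why leaving the portioning field empty is usually the safer choice.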

Split data based on the Number of records

This logic splits the source data based on the number of records. It compares the values of the selected field and tries to match the required row limit for each portion.

In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on the number of records.

You need to fill in these parameters:

  • Portioning field - Field of the source structure used to split the data into portions.

  • Limit - Desired number of rows for a single data portion.
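
As an illustration of how a portioning field and a row limit could interact, here is a minimal sketch. It assumes a simple greedy strategy that packs sorted field values into ranges until the limit would be exceeded. The actual algorithm used by Extractor 2.0 is not described here, and the function name and sample values are hypothetical.

```python
from collections import Counter


def split_by_record_count(values, limit):
    """Group sorted distinct field values into (low, high, rows) portions.

    A portion is closed as soon as adding the next value would exceed the
    limit. A single value with more rows than the limit forms its own portion.
    """
    counts = Counter(values)
    portions = []
    low = high = None
    rows = 0
    for value in sorted(counts):
        if rows and rows + counts[value] > limit:
            portions.append((low, high, rows))
            low, rows = None, 0
        if low is None:
            low = value
        high = value
        rows += counts[value]
    if rows:
        portions.append((low, high, rows))
    return portions


# Hypothetical portioning field values, one entry per source row.
field_values = ["A"] * 3 + ["B"] * 4 + ["C"] * 2 + ["D"] * 9 + ["E"] * 1
record_portions = split_by_record_count(field_values, limit=8)
print(record_portions)
# [('A', 'B', 7), ('C', 'C', 2), ('D', 'D', 9), ('E', 'E', 1)]
```

The example also shows why the limit is only a target: value D alone has nine rows, so its portion exceeds the limit of eight because rows with the same field value cannot be separated by a range on that field.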

Split data based on Size in MB

This logic splits the source data based on the data size in MB. It compares the values of the selected field and tries to match the required size limit for each portion.

In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on the size in MB.

You need to fill in these parameters:

  • Portioning field - Field of the source structure used to split the data into portions.

  • Limit - Desired size in MB for a single data portion.
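
The size-based split can be sketched in a similar way. The example below assumes that the size of a portion is estimated as the number of rows multiplied by a fixed row width, and that 1 MB equals 1024 * 1024 bytes. Both assumptions, as well as the function name and the sample figures, are hypothetical and only illustrate the idea.

```python
BYTES_PER_MB = 1024 * 1024


def split_by_size(rows_per_value, row_bytes, limit_mb):
    """Pack sorted field values into (low, high, size_mb) portions."""
    limit_bytes = limit_mb * BYTES_PER_MB
    portions = []
    current = []        # field values in the portion being built
    current_bytes = 0   # estimated size of that portion
    for value in sorted(rows_per_value):
        value_bytes = rows_per_value[value] * row_bytes
        if current and current_bytes + value_bytes > limit_bytes:
            portions.append((current[0], current[-1], current_bytes / BYTES_PER_MB))
            current, current_bytes = [], 0
        current.append(value)
        current_bytes += value_bytes
    if current:
        portions.append((current[0], current[-1], current_bytes / BYTES_PER_MB))
    return portions


# Hypothetical row counts per calendar year, with an assumed row width of 1 KiB.
rows_per_year = {2019: 3072, 2020: 2048, 2021: 4096, 2022: 1024}
size_portions = split_by_size(rows_per_year, row_bytes=1024, limit_mb=5)
print(size_portions)
# [(2019, 2020, 5.0), (2021, 2022, 5.0)]
```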

Split data based on the Time field

This portioning logic splits the source data based on the defined time value. Using a time characteristic field, range, and unit definition, a separate task is created for each value within the given range.

In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on the time field.

You need to fill in these parameters:

  • Time field for portioning - Field of the source structure used to split the data into portions.

  • Timeshift for start - This value represents the number of time units defining the start of the range relative to the current date.
    E.g., if you provide 10 as a value and use Year(s) as a unit, you define your range to start 10 years before the current year.

  • Step size - Number of time units covered by a single range.
    With the selection from the image above (step defined as 1 and unit as Year(s)), each extraction task covers only values matching a single year.

  • Timeshift end - This value represents the number of time units defining the end of the range relative to the current date.
    E.g., if you provide 5 as a value and use Year(s) as a unit, you define your range to end 5 years before the current year.

  • Unit - Unit for range calculation. Day(s), Month(s), and Year(s) are the supported units for the time field-based data split. The time unit is not explicitly labeled on the selection screen but is available as a dropdown list next to the previous parameters.
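
The interplay of the parameters above can be made concrete with a short sketch that derives one date range per task. It assumes that both the start and the end shift are inclusive and that month and year units are aligned to full calendar periods. The function names and the unit codes "D", "M", and "Y" are hypothetical and not taken from the product.

```python
from datetime import date, timedelta


def shift(day, amount, unit):
    """Return the first day of the period that lies `amount` units before `day`."""
    if unit == "D":
        return day - timedelta(days=amount)
    if unit == "M":
        index = day.year * 12 + (day.month - 1) - amount
        return date(index // 12, index % 12 + 1, 1)
    if unit == "Y":
        return date(day.year - amount, 1, 1)
    raise ValueError(f"unsupported unit: {unit}")


def time_portions(today, start_shift, end_shift, step, unit):
    """Return inclusive (low, high) date ranges, oldest first, one per task."""
    portions = []
    for offset in range(start_shift, end_shift - 1, -step):
        # The last portion is clamped so that it never passes the end shift.
        next_offset = max(offset - step, end_shift - 1)
        low = shift(today, offset, unit)
        high = shift(today, next_offset, unit) - timedelta(days=1)
        portions.append((low, high))
    return portions


# Timeshift for start 10, Timeshift end 5, Step size 1, unit Year(s).
yearly = time_portions(date(2024, 6, 15), 10, 5, 1, "Y")
for low, high in yearly:
    print(low, high)
```

With a current date in 2024, this produces six tasks, one for each year from 2014 to 2019. A larger step size reduces the number of tasks, for example a step of 3 with unit "M" yields one range per quarter-sized block.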

Split data based on Fiscal period

This portioning logic splits the source data based on the defined fiscal year/period value. Using the Fiscal Year or Fiscal Period field of the source structure, the fiscal year variant, and the range and unit definition, a separate task is created for each fiscal value in the given range.

In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on fiscal period.

You need to fill in these parameters:

  • Time field for portioning - Field of the source structure used to split the data into portions.

  • Timeshift for start - This value represents the number of time units defining the start of the range relative to the current date.
    E.g., if you provide 10 as a value and use Posting Period(s) as a unit, you define your range to start 10 posting periods before the current date.

  • Fiscal year variant - Fiscal year variant used for fiscal period definition. You can use the provided F4 help to select one of the Fiscal year variants available in the system.

  • Step size - Number of time units covered by a single range.
    With the selection from the image above (step defined as 1 and unit as Posting Period(s)), each extraction task covers only values matching a single posting period.

  • Timeshift end - This value represents the number of time units defining the end of the range relative to the current date.
    E.g., if you provide 5 as a value and use Posting Period(s) as a unit, you define your range to end 5 posting periods before the current date.

  • Unit - Unit for range calculation. Posting Period(s) and Fiscal Year(s) are the supported units for the fiscal period-based data split. The unit is not explicitly labeled on the selection screen but is available as a dropdown list next to the previous parameters.
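
The posting period arithmetic behind these parameters can be sketched as follows. The example assumes a fiscal year variant with 12 regular posting periods and no special periods, and it treats both shifts as inclusive. A real fiscal year variant can differ, and the function names are hypothetical.

```python
def shift_period(year, period, amount, periods_per_year=12):
    """Move `amount` posting periods back from (year, period)."""
    index = year * periods_per_year + (period - 1) - amount
    return index // periods_per_year, index % periods_per_year + 1


def fiscal_portions(current, start_shift, end_shift, step, periods_per_year=12):
    """Return inclusive ((year, period), (year, period)) ranges, oldest first."""
    year, period = current
    portions = []
    for offset in range(start_shift, end_shift - 1, -step):
        # The last portion is clamped so that it never passes the end shift.
        last_offset = max(offset - step + 1, end_shift)
        low = shift_period(year, period, offset, periods_per_year)
        high = shift_period(year, period, last_offset, periods_per_year)
        portions.append((low, high))
    return portions


# Current posting period 3 of fiscal year 2024 (assumed),
# Timeshift for start 10, Timeshift end 5, Step size 1, unit Posting Period(s).
period_portions = fiscal_portions((2024, 3), 10, 5, 1)
for low, high in period_portions:
    print(low, high)
```

Under these assumptions the result is six tasks, covering posting periods 5 to 10 of fiscal year 2023, one period per task.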

Advanced mass execution capabilities

Mass execution functionality also allows you to define composite splitting methods. When an advanced data split type is used, you can see the following button on the selection screen.

For a complete guide on how to set up advanced mass execution and understand this functionality, please see the chapter Advanced Mass Execution Capabilities.

Mass execution with truncate option on file storage

Right after the mass execution is scheduled, a new JSON file is created that contains a truncate flag set to "X" (enabled). The JSON files for the individual extraction jobs contain an empty truncate flag.
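
The following sketch shows how a consumer of these files could tell the two cases apart. The key name "truncate" and the minimal file layout are assumptions made for illustration only. The actual structure of the JSON files is not specified here.

```python
import json


def should_truncate(json_text):
    """Return True when the (hypothetical) 'truncate' key holds the value 'X'."""
    return json.loads(json_text).get("truncate", "") == "X"


# Hypothetical file contents: one file created at scheduling time,
# followed by one file per extraction job.
scheduling_file = '{"truncate": "X"}'
job_files = ['{"truncate": ""}', '{"truncate": ""}']

truncate_once = should_truncate(scheduling_file)
truncate_per_job = [should_truncate(text) for text in job_files]
print(truncate_once)     # True
print(truncate_per_job)  # [False, False]
```

Read this way, the target is truncated once when the mass execution is scheduled, and the individual jobs then add their portions without truncating again.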