
The extraction process of Extractor 2.0 supports mass execution, which lets you execute the process in parallel. Mass execution splits the data into multiple portions, which are scheduled and executed as separate tasks.

Mass execution is currently not supported for Listcube fetcher and ODP fetcher.

Mass execution supports only the full load type. You can, however, initialize the delta by selecting the checkbox in the Additional Fetcher config section.


To execute Extractor 2.0 in mass execution mode, follow these steps:

  1. Go to the extraction process and press Execute.

  2. Choose Mass Execution as the execution type.

    After this action, you should see Mass Execution parameters on the screen.

  3. Fill in the Job configuration. This section specifies the properties of background jobs that will be used for executing the tasks extracting each data portion.

    1. Maximum number of jobs: This parameter specifies the number of background jobs used for extraction.

    2. Application server: Optional input, specifying an application server executing the process. If left empty, the current application server will be used.

    3. Application server group: Optional input, specifying an application server group executing the process. If left empty, the current application server will be used.

    4. Keep alive: When checked, a new background job is started whenever a previous task ends with an unexpected error. This keeps the degree of parallelism at the specified level.
      We recommend that you always keep this flag checked.

  4. After configuring the jobs, select the Data portioning type. This parameter defines the logic used to split the data into portions. The parameters shown on the screen depend on the selected value; they are described in the sections below.
    You can choose from the following data portioning types:

    1. Number of records

    2. Size in MB

    3. Time field

    4. Fiscal period

  5. Fill in the parameters required for the selected data portioning logic.
    The relevant parameters are described in the section for each portioning type below.

  6. Press Execute (F8).

For the SNP Glue™ Table Fetcher, only HANA native storage is supported. As for the data split, splitting by number of records and by size in MB is supported.

Depending on the portioning type, splitting the data can be a time-consuming operation. You can therefore choose Background execution in mass execution mode as well. For the data portioning types Number of records and Size in MB, we recommend using Background execution.

Important note on selection handling:

After the data split, the mass execution functionality creates an additional selection designed to select only one particular data portion. However, this additional selection is simply added to the existing selection (if one is specified); it does not replace the original selection.

Be careful when defining the mass execution, because an existing selection can make the mass execution split ineffective.
For example, if you split the data by calendar year and also provide a range of years as the selection for the calendar year field, this selection is present in every mass execution task. As a result, each task selects all data that matches either your manually specified range or the range generated by mass execution.

We therefore recommend that you leave the selection of the portioning field empty unless you have a specific case, e.g., extracting all data between 2010 and 2020 except the year 2012.
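The effect described above can be illustrated with a small Python sketch. It is not SNP Glue code: the IncludeRange class and the task_selects function are hypothetical names, and the sketch assumes that several ranges on the same field are combined with OR, as the note above describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncludeRange:
    """One inclusive 'between low and high' selection row (hypothetical model)."""
    low: int
    high: int

    def matches(self, value: int) -> bool:
        return self.low <= value <= self.high


def task_selects(selection: list[IncludeRange], value: int) -> bool:
    # Assumption: ranges on the same field are combined with OR.
    return any(r.matches(value) for r in selection)


years = range(2008, 2023)
user_selection = [IncludeRange(2010, 2020)]  # entered manually on the portioning field
portion_2015 = IncludeRange(2015, 2015)      # generated by the split for one task

# Portioning field left empty: the task selects only its own portion.
clean = [y for y in years if task_selects([portion_2015], y)]

# Manual range kept: the generated range is merely appended to it.
mixed = [y for y in years if task_selects(user_selection + [portion_2015], y)]

print(clean)  # [2015]
print(mixed)  # the whole 2010 to 2020 range, so the split has no effect
```

In this model, every task that keeps the manual range reads the full 2010 to 2020 data set, which is why leaving the portioning field empty is the safer default.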

If your existing selection conflicts with the portioning parameters, a popup is displayed.

Split data based on the Number of records

This logic splits the source data by number of records: it compares the values of the selected field and tries to match the required row limit.

In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on the number of records.

You need to fill in these parameters:

  • Portioning field: Field of the source structure used to split the data into portions.

This field also needs to be present in the selection of the Fetcher used by the extraction process.

  • Limit: Desired number of rows for a single data portion.

It is usually not possible to split the data on a single field and match the exact number of records. In this case, the logic tries to get as close as possible to the desired number of rows. The achievable precision depends strongly on the choice of portioning field.

E.g., if a field with only a few distinct values is used and the limit is set to a low value (e.g., 1000 records), you might see a significant difference between the limit and the number of records extracted by each task.
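To see why the choice of portioning field matters, consider the following Python sketch. It is only an illustration under stated assumptions, not the algorithm Extractor 2.0 actually uses: the function split_by_row_limit is hypothetical and greedily groups sorted field values into contiguous ranges, closing a range when the next value would exceed the limit. All rows sharing one field value must stay in the same portion.

```python
from collections import Counter


def split_by_row_limit(values, limit):
    """Group sorted distinct values into (low, high, rows) portions.

    Hypothetical greedy strategy: a portion is closed as soon as adding
    the next distinct value would push it over the row limit.
    """
    counts = sorted(Counter(values).items())
    portions = []
    low = high = None
    rows = 0
    for value, count in counts:
        if rows > 0 and rows + count > limit:
            portions.append((low, high, rows))
            low, rows = None, 0
        if low is None:
            low = value
        high = value
        rows += count
    if rows > 0:
        portions.append((low, high, rows))
    return portions


# Many distinct values: every portion matches the limit exactly.
document_numbers = list(range(10_000))
even = split_by_row_limit(document_numbers, limit=1000)

# Few distinct values: a portion cannot be smaller than one value's row count.
company_codes = ["A"] * 5000 + ["B"] * 3000 + ["C"] * 200
skewed = split_by_row_limit(company_codes, limit=1000)

print(len(even), even[0])
print(skewed)
```

With the three-value field, the first portion holds 5000 rows against a limit of 1000, because a single field value cannot be divided any further.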

Split data based on Size in MB

This logic splits the source data by data size in MB: it compares the values of the selected field and tries to match the required size limit.

In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on the size in MB.

You need to fill in these parameters:

  • Portioning field: Field of the source structure used to split the data into portions.

This field also needs to be present in the selection of the Fetcher used by the extraction process.

  • Limit: Desired size in MB for a single data portion.

It is usually not possible to split the data on a single field and match the exact size in MB. In this case, the logic tries to get as close as possible to the desired data size in MB. The achievable precision depends strongly on the choice of portioning field.

E.g., if a field with only a few distinct values is used and the limit is set to a low value (e.g., 1 MB), you might see a significant difference between the limit and the size extracted by each task.

Split data based on the Time field

This portioning logic splits the source data based on the defined time value. Using a time characteristic field, range, and unit definition, a separate task is created for each value within the given range.

In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on the time field.

For time-field-based split, the fields that match the calendar year, calendar month, and calendar day data types are supported.

You need to fill in these parameters:

  • Time field for portioning: Field of the source structure used to split the data into portions.

This field needs to match the type of calendar year, calendar month, or calendar day. It also needs to be present in the selection of the Fetcher used by the extraction process.
In order to get these fields, you can use the provided F4 help.

Please be aware that type compatibility does not ensure the selected field really represents some time characteristic. It’s up to you to select a valid field from the source structure.

  • Timeshift for start: This value represents the number of time units defining the start of the range relative to the current date.
    E.g., if you provide 10 as a value and use Year(s) as a unit, you define your range to start 10 years before the current year.

  • Step size: Number of time units covered by a single range.
    E.g., with the step defined as 1 and the unit as Year(s), each extraction task covers only the values that match a single year.

  • Timeshift end: This value represents the number of time units defining the end of the range relative to the current date.
    E.g., if you provide 5 as a value and use Year(s) as a unit, you define your range to end 5 years before the current year.

  • Unit: Unit for range calculation. Day(s), Month(s), and Year(s) are supported units for the time field-based data split. The time unit is not explicitly labeled on the selection screen but is present as a dropdown list next to previous parameters.
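The way these four parameters combine into task ranges can be sketched in Python. This is an illustration only: the function names are hypothetical, and the sketch assumes that both ends of the range are inclusive and that a final, shorter range is created when the step does not divide the range evenly.

```python
from datetime import date


def to_index(day, unit):
    """Map a date to a running integer in the chosen unit."""
    if unit == "year":
        return day.year
    if unit == "month":
        return day.year * 12 + day.month - 1
    if unit == "day":
        return day.toordinal()
    raise ValueError(f"unsupported unit: {unit}")


def to_label(index, unit):
    """Format a running integer as YYYY, YYYYMM or YYYYMMDD."""
    if unit == "year":
        return f"{index:04d}"
    if unit == "month":
        year, month0 = divmod(index, 12)
        return f"{year:04d}{month0 + 1:02d}"
    return date.fromordinal(index).strftime("%Y%m%d")


def time_portions(today, shift_start, shift_end, step, unit):
    """Return one (low, high) range per task, relative to 'today'."""
    if step < 1:
        raise ValueError("step must be at least 1")
    current = to_index(today, unit)
    low, last = current - shift_start, current - shift_end
    portions = []
    while low <= last:
        high = min(low + step - 1, last)  # assumption: last range may be shorter
        portions.append((to_label(low, unit), to_label(high, unit)))
        low = high + 1
    return portions


today = date(2024, 3, 15)
yearly = time_portions(today, shift_start=10, shift_end=5, step=1, unit="year")
monthly = time_portions(today, shift_start=3, shift_end=0, step=2, unit="month")
print(yearly)
print(monthly)
```

Under these assumptions, a start shift of 10, an end shift of 5 and a step of 1 year produce six tasks, one for each year from 2014 to 2019 when run in 2024.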

Split data based on Fiscal period

This portioning logic splits the source data based on the defined fiscal year/period value. Using the Fiscal Year or Fiscal Period field of the source structure, the fiscal year variant, and the range and unit definition, a separate task is created for each fiscal value in the given range.

In the next picture, you can see the selection screen of the Extractor 2.0 process showing parameters relevant for split data based on fiscal period.

For the fiscal period-based split, the fields that match the fiscal year and fiscal/posting period data types are supported.

You need to fill in these parameters:

  • Time field for portioning: Field of the source structure used to split the data into portions.

This field needs to match the type of fiscal year or fiscal/posting period. It also needs to be present in the selection of the Fetcher used by the extraction process.
In order to get these fields, you can use the provided F4 help.

Please be aware that type compatibility does not ensure the selected field really represents some time characteristic. It’s up to you to select a valid field from the source structure.

  • Timeshift for start: This value represents the number of time units defining the start of the range relative to the current date.
    E.g., if you provide 10 as a value and use Posting Period(s) as a unit, you define your range to start 10 posting periods before the current date.

  • Fiscal year variant: Fiscal year variant used for fiscal period definition. You can use the provided F4 help to select one of the Fiscal year variants available in the system.

  • Step size: Number of time units covered by a single range.
    E.g., with the step defined as 1 and the unit as Posting Period(s), each extraction task covers only the values that match a single posting period.

  • Timeshift end: This value represents the number of time units defining the end of the range relative to the current date.
    E.g., if you provide 5 as a value and use Posting Period(s) as a unit, you define your range to end 5 posting periods before the current date.

  • Unit: Unit for range calculation. Posting Period(s) and Fiscal Year(s) are supported units for the fiscal period-based data split. The unit is not explicitly labeled on the selection screen but is present as a dropdown list next to previous parameters.
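The following Python sketch shows how a posting-period range can roll over a fiscal year boundary. It rests on several assumptions that are not taken from the product: the fiscal year variant is reduced to a fixed number of posting periods per year (12 by default), the current fiscal year and period are passed in rather than derived from the variant, values are formatted as YYYYPPP, and both ends of the range are inclusive. The function names are hypothetical.

```python
def shift_period(fiscal_year, period, offset, periods_per_year=12):
    """Move a (fiscal_year, period) pair by 'offset' posting periods."""
    index = fiscal_year * periods_per_year + (period - 1) + offset
    year, period0 = divmod(index, periods_per_year)
    return year, period0 + 1


def fiscal_portions(current_year, current_period, shift_start, shift_end,
                    step, periods_per_year=12):
    """Return one (low, high) YYYYPPP range per task."""
    if step < 1:
        raise ValueError("step must be at least 1")
    portions = []
    offset = -shift_start
    while offset <= -shift_end:
        last = min(offset + step - 1, -shift_end)  # final range may be shorter
        low = shift_period(current_year, current_period, offset, periods_per_year)
        high = shift_period(current_year, current_period, last, periods_per_year)
        portions.append((f"{low[0]:04d}{low[1]:03d}", f"{high[0]:04d}{high[1]:03d}"))
        offset = last + 1
    return portions


# Current fiscal period 2024/003, range from 10 to 5 posting periods back.
single = fiscal_portions(2024, 3, shift_start=10, shift_end=5, step=1)
grouped = fiscal_portions(2024, 3, shift_start=10, shift_end=5, step=4)
print(single)
print(grouped)
```

A real fiscal year variant can define special periods or a shifted year start, so the period arithmetic in an actual system may differ from this simplified model.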

Advanced mass execution capabilities

Mass execution also allows you to define composite splitting methods. When an advanced data split type is selected, the Configure button appears on the selection screen.

The following data portioning types are basic split types and can be defined using the standard selection screen:

  • Number of records

  • Size in MB

  • Time field

  • Fiscal period

Other data portioning types represent advanced split types and can be defined using the Configure button. These data portioning types are:

  • Combined data split

  • CSV defined values

  • ABAP code

  • Automated data split

For a complete guide on how to set up advanced mass execution and understand this functionality, please see the chapter Advanced Mass Execution Capabilities.

Mass execution with truncate option on file storage

Right after the mass execution is scheduled, a new JSON file is created that contains a truncate flag set to X (enabled). The JSON files for the individual extraction jobs contain an empty truncate flag.
