(Glue-2011) Advanced Mass Execution Capabilities
Mass execution also allows you to define composite splitting methods. When the advanced data split type is used, the following button is displayed on the selection screen.
To configure advanced mass execution, proceed as follows:
Press the Configure button, which displays the popup shown in the picture.
Define the mass execution types that will be performed during extraction process execution. Use the provided toolbar to specify the configuration. The following operations are available:
Add data split - Adds a new data split to the advanced mass execution configuration.
Remove data split - Removes the selected data split from the advanced mass execution configuration, together with any parameters already defined for it.
Define parameters - Allows you to define and change the parameters of the selected data split.
After data split types are added to the mass execution configuration, you need to define parameters for each split you wish to include. To do so, select the data split type and press the Define parameters button on the toolbar. You can find more information about defining parameters for particular split types in the section Configuring advanced data split types.
After all parameters have been defined, confirm the parameter selection by pressing the Confirm button. If you press the Cancel button, all changes you made are discarded.
We recommend saving this configuration as a standard SAP variant so that you can access the definitions later without configuring them again.
During extraction process execution, all defined split types are executed in the order defined by the split number. If any data split is defined incorrectly, the whole execution ends with an error status.
Advanced mass execution explained
This section explains how advanced mass execution works, using an example with the following definition:
As you can see, two data split types are defined in the configuration. The first one splits the data into portions based on the actual number of records in the source object, and the second one splits the data based on a time characteristic. To learn more about split types, see the page https://datavard.atlassian.net/wiki/spaces/DATAVARD/pages/2564560628 for basic types and the section Configuring advanced data split types for advanced data split types.
Mass execution uses a so-called Composite data split to combine the data portions generated for each specified data split type. In our example, when the extraction process is executed, Split data based on number of rows runs first because it is first in the definition. Suppose it divides the data into 10 portions. The second split, Split data based on time field, then runs and divides the data into, say, 100 portions. Next, the Composite data split logic combines the ranges that define the individual data portions: each portion defined by the first split is split further into the number of portions defined by the second split by adding an additional filter. As a result, the data is split into 1000 portions (10 * 100).
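The combination step can be pictured as a Cartesian product of the filters produced by each split. The following Python sketch is only an illustration under assumptions: the function composite_split, the dictionary representation of a filter, and the names ROW_PORTION and TIME_PORTION are hypothetical and do not come from the product.

```python
from itertools import product


def composite_split(*splits):
    """Combine the portions of several splits into composite portions.

    Each split is a list of filters (dicts). Every composite portion merges
    one filter from each split, in the order the splits were defined.
    """
    portions = []
    for combination in product(*splits):
        merged = {}
        for single_filter in combination:
            merged.update(single_filter)
        portions.append(merged)
    return portions


# Hypothetical filters: 10 row-based portions and 100 time-based portions.
row_split = [{"ROW_PORTION": i} for i in range(1, 11)]
time_split = [{"TIME_PORTION": j} for j in range(1, 101)]

portions = composite_split(row_split, time_split)
print(len(portions))  # 10 * 100 = 1000
print(portions[0])
```

Each composite portion carries the filter of the first split plus the additional filter of the second split, which matches the 10 * 100 example above.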
Be aware that each additional split multiplies the number of data portions, so the total grows geometrically. Therefore, we do not recommend using more than 3 splits.
This recommendation, however, depends on the particular scenario, the number of parallel jobs used, and the available system resources.
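To get a feeling for this growth, the total number of portions can be estimated as the product of the portion counts of all splits. The sketch below is a rough illustration only: the split sizes 12 and 5, the value of 20 parallel jobs, and the simplification that each job processes exactly one portion are assumptions made for the example.

```python
from math import ceil, prod


def total_portions(portions_per_split):
    """Total number of composite portions: the product of all split sizes."""
    return prod(portions_per_split)


def job_rounds(total, parallel_jobs):
    """Rough number of sequential rounds if each job handles one portion."""
    return ceil(total / parallel_jobs)


parallel_jobs = 20  # assumed value, for illustration only
for sizes in ([10], [10, 100], [10, 100, 12], [10, 100, 12, 5]):
    total = total_portions(sizes)
    print(len(sizes), "split(s):", total, "portions,",
          job_rounds(total, parallel_jobs), "rounds")
```

With these assumed sizes, a third split already yields 12000 portions and a fourth yields 60000.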
Configuring advanced data split types
As mentioned in the previous sections, you can define multiple data split types within the advanced configuration. The basic data split types are defined in the same way as in basic mass execution, with a similar user interface. Therefore, they are not described here; you can read about them on the page https://datavard.atlassian.net/wiki/spaces/DATAVARD/pages/2564560628.
Composite data split
Composite data split is a general wrapper that binds the various splits together and is always configured when the advanced data split type is selected. Therefore, it does not appear as a split type in the Advanced Mass Execution popup and is not discussed further here. You can find the details on the page https://datavard.atlassian.net/wiki/spaces/DATAVARD/pages/2564560628.
Split data based on csv defined values
This data split allows you to define data portions in a .csv file and store it on the application server. The data is read from the application server and translated into single-value filters that define the data portions.
The next picture shows the configuration screen:
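The translation from file content to single-value filters can be sketched as follows. This Python example is an illustration under assumptions, not the product's implementation: the function csv_to_filters, the dictionary layout of a filter, the "EQ" operator label, and the sample column COUNTRY are hypothetical. It assumes a comma-separated file with a header in the first line and reads the content from a string instead of the application server.

```python
import csv
import io


def csv_to_filters(csv_text, partitioning_field):
    """Return one single-value filter per non-empty value in the column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if partitioning_field not in (reader.fieldnames or []):
        raise ValueError(f"Column {partitioning_field!r} not found in header")
    filters = []
    for row in reader:
        # Missing trailing values are returned as None by DictReader.
        value = (row[partitioning_field] or "").strip()
        if value:
            filters.append(
                {"field": partitioning_field, "op": "EQ", "value": value}
            )
    return filters


sample = "COUNTRY\nDE\nUS\nSK\n"  # hypothetical file content
for single_filter in csv_to_filters(sample, "COUNTRY"):
    print(single_filter)
```

In this sketch every value in the file becomes its own data portion, so a file with three values yields three portions.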
Partitioning field - Source field based on which the data will be split into portions.
File name - Path to the .csv file on the application server.
To select the file name, press F4 and navigate to your file in the tree that is displayed.
Upload file - Allows you to upload your .csv file to the application server and sets it directly as the value of the File name parameter.
The user needs the proper authorization (SAP standard) to upload a file. Therefore, we suggest setting up a separate folder on the application server that is used for these files only.
During execution, the user who executes the process also needs authorization to read the file from the given path on the application server.
File format
The expected file format is a .csv file with values separated by a comma (","). The file must include a header in the first line, and all values in a given column must be convertible to the data type of the field named in the header. See the following example: