(DI-1902) Define the Data Distribution Analysis

Datavard Insights BW analysis lets you analyze the data distribution and usage of InfoProviders in a particular system based on a characteristic (usually a time characteristic). The characteristic you choose splits the data into groups, which makes it possible to analyze how the data is distributed. You can either create your own Data distribution analysis with a split characteristic or use the predefined Data distribution analysis.

Predefined Data distribution analysis

BW analysis is shipped with one predefined data distribution setting with the technical name "STANDARD". This setting does not use one specific characteristic to split the data in InfoProviders, but a set of standard time characteristics (0CALYEAR, 0CALMONTH, 0CALWEEK, 0FISCYEAR, 0FISCPER, 0CALDAY, 0DATE, 0CALQUARTER). Each InfoProvider is searched for these characteristics, or for time characteristics that reference them, in left-to-right order, and the Data distribution analysis is performed based on the first characteristic that is found. No matter which time characteristic is used as the split characteristic, its value is transformed to a year value.

If Extended query statistics are enabled, the analysis uses as the split characteristic the time characteristic that was used most often in query filters by the InfoProvider and its reporting parents. This rule takes precedence over the logic described above.

Another exception applies if an InfoProvider is a DSO or an InfoCube and a DAP (Data archiving process) has been created for it: the analysis then uses the time characteristic from the DAP.
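The basic left-to-right search of the "STANDARD" setting can be sketched as follows. This is only an illustration in Python, not the product's implementation: the function name and the plain list of characteristic names are assumptions, and the sketch leaves out referenced time characteristics as well as the Extended query statistics and DAP exceptions.

```python
# Standard time characteristics in the order given in the text.
STANDARD_TIME_CHARACTERISTICS = [
    "0CALYEAR", "0CALMONTH", "0CALWEEK", "0FISCYEAR",
    "0FISCPER", "0CALDAY", "0DATE", "0CALQUARTER",
]


def pick_split_characteristic(infoprovider_characteristics):
    """Return the first standard time characteristic found, or None.

    `infoprovider_characteristics` is a hypothetical list of the
    InfoObject names contained in one InfoProvider.
    """
    available = set(infoprovider_characteristics)
    for name in STANDARD_TIME_CHARACTERISTICS:
        if name in available:
            return name
    return None


# 0CALMONTH comes before 0CALDAY in the search order, so it is chosen.
print(pick_split_characteristic(["0MATERIAL", "0CALDAY", "0CALMONTH"]))
```

The order of the list, not the order of the characteristics inside the InfoProvider, decides which characteristic is used.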

The current version of the Data distribution analysis is based on only one specified split characteristic. The only exception is the above mentioned data distribution setting with the technical name "STANDARD".

You can extend the set of time characteristics used by the analysis with additional ones. See the section Additional time characteristics for more information.

Define a custom Data distribution analysis

Define Data distribution analysis

You can define a new Data distribution analysis of InfoProviders in your monitored systems by following the steps described in this section. This creates a new tile in the Insights Fiori front-end on the "Data Management" tab.

Custom Data distribution analysis tile

To define a new Data distribution analysis (technically called a Subgroup):

  1. Run the transaction SE38 and execute the program "/DVD/HS_CENTRAL_SUBGROUP" or the transaction /DVD/HS_SUBG. The following screen appears.
  2. Define the following Data distribution analysis attributes:
    1. Subgroup ID - unique identifier of the Data distribution analysis.
    2. Split Characteristics - defines which InfoObjects are used for the Data distribution analysis. In the current version, only one characteristic can be defined as a split characteristic and used for data distribution. During the Data distribution analysis, each analyzed InfoProvider is searched for the characteristic. If the characteristic is found, all the InfoProvider data is split into groups based on the distinct values of the split characteristic.

      A safety check limits the split of object data to 100 distinct values, to prevent too much data from being generated in the system.

    3. Data groups definition area - in this table you can define data groups. Each data group is identified by its ID (DATA_GROUP), its description, and a set of conditions. These conditions define how the InfoProvider data split by the split characteristic is grouped into results.

  3. Click Save to save the definition of the Data distribution analysis. If you refresh Fiori after this step, you should see a new tile without data.
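The split by distinct values described under Split Characteristics can be sketched as follows. This is a simplified illustration in Python with several assumptions: records are plain dictionaries, the number of records stands in for data size, and the function name is hypothetical. The text does not say what happens when the limit of 100 distinct values is exceeded, so raising an error here is purely an assumption.

```python
from collections import Counter

MAX_DISTINCT_VALUES = 100  # limit stated in the text


def split_by_characteristic(records, characteristic,
                            max_distinct=MAX_DISTINCT_VALUES):
    """Count records per distinct value of the split characteristic.

    Returns None if the characteristic does not occur in the records.
    """
    if not any(characteristic in record for record in records):
        return None  # characteristic not found in this InfoProvider
    counts = Counter(
        record[characteristic]
        for record in records
        if characteristic in record
    )
    if len(counts) > max_distinct:
        # Assumption: the real behavior beyond the limit is not documented.
        raise ValueError(
            f"{len(counts)} distinct values exceed the limit of {max_distinct}"
        )
    return dict(counts)


records = [{"0CURRENCY": "EUR"}, {"0CURRENCY": "USD"}, {"0CURRENCY": "USD"}]
print(split_by_characteristic(records, "0CURRENCY"))
```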

Add Data distribution analysis to a collector

To execute the Data distribution analysis, add it to a collector:

  1. Execute the /DVD/MON transaction on the central system.
  2. Click Monitoring settings under Settings.
  3. In the tree on the left side of the screen, double-click the KPI collectors folder.
  4. In the displayed list of collectors, search for the "/DVD/MON_HM_CL_COL_DU" collector and select its row.
  5. While the collector row is selected, double-click the Collector's inputs folder in the left tree.
  6. In the displayed list of collector input tables, click the input table "/DVD/HS_IT_DU1".


  7. In the table maintenance view, switch to edit mode.
  8. Click Create new record.
  9. Fill in the following fields:
    1. Subgroup ID - the same ID that you defined for the Data distribution analysis in the first step.
    2. Number - the interval, in days, at which the size of InfoProvider data is updated.

The Data distribution collector is executed every day for all defined Data distribution analyses. Updating the size of all InfoProvider data every day is not feasible because of the long runtime of this action. It is therefore recommended to update the size of data in InfoProviders only once per time period; the minimum recommended interval is two weeks. The usage of InfoProvider data is updated every day, as this is much faster.

  10. Click the Save button.
    A new Data distribution analysis is now defined. The analysis is automatically executed on all systems where the Data distribution collector is scheduled. See (DI-1902) Set up BW Analysis on how to schedule a collector.
    You can view the results in the Fiori front-end after the second execution of the collector.

Please note that during its first execution the Data distribution collector only executes the Data distribution analysis and does not gather data. During its second and all following executions, it also gathers the analysis results from previous runs. The first results may therefore become visible in the Fiori front-end only one day after the first analysis run is completed.
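The timing described above (daily usage update, size update only every "Number" days, results gathered from the second execution on) can be sketched as a small simulation. This is an illustration in Python, not the collector's actual logic: the function and field names are hypothetical, and it assumes that the first execution also performs a size update.

```python
from datetime import date, timedelta


def plan_collector_runs(first_run, runs, size_interval_days):
    """List what each daily collector execution would do."""
    plan = []
    last_size_update = None
    for i in range(runs):
        day = first_run + timedelta(days=i)
        # Assumption: a size update is due on the first run and then
        # whenever the configured number of days has passed.
        size_due = (
            last_size_update is None
            or (day - last_size_update).days >= size_interval_days
        )
        if size_due:
            last_size_update = day
        plan.append({
            "day": day,
            "update_usage": True,                 # usage is updated every day
            "update_size": size_due,
            "gathers_previous_results": i > 0,    # not on the first execution
        })
    return plan


for run in plan_collector_runs(date(2024, 1, 1), 16, 14):
    print(run)
```

With an interval of 14 days, the size is updated on the first run and again two weeks later, while usage is updated on every run.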

Example

Here is an example of how InfoProvider data is processed by the Data distribution analysis specified in the previous steps. This example uses the InfoProvider /DVD/IC_CUBE, which contains the characteristic 0CURRENCY.

The Data distribution analysis finds this characteristic in the structure of the InfoProvider, selects all its distinct values, and calculates their respective sizes. In this example, the InfoProvider has a total size of 10 GB. The results of the split would be as follows:

  • 0CURRENCY - EUR - 1 GB 
  • 0CURRENCY - USD - 3 GB
  • 0CURRENCY - NZD - 3 GB
  • 0CURRENCY - JPY - 2 GB
  • 0CURRENCY - CHF - 1 GB

As we don't want to view results for all available distinct currencies separately, we have defined 3 data groups. The previous results are assigned to the specified data groups based on their definition.

  • Data group 1 - data with 0CURRENCY EUR belongs here (1 GB).
  • Data group 2 - data with 0CURRENCY USD belongs here (3 GB).
  • Data group 3 - groups all other data (6 GB).

After the Data distribution collector has been executed, the Fiori front-end shows three columns for Data Group sizes and three for Data Group usages, instead of separate columns for every distinct currency. The grouping makes it possible to focus the usage analysis on the data that is important to you.
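The arithmetic of this example can be retraced with a short Python sketch. The dictionary and function names are hypothetical and only mirror the three data groups defined above; the sizes are the ones from the example.

```python
# Sizes per distinct 0CURRENCY value, in GB, as listed in the example.
sizes_gb = {"EUR": 1, "USD": 3, "NZD": 3, "JPY": 2, "CHF": 1}


def assign_data_group(currency):
    """Map a currency to one of the three data groups of the example."""
    if currency == "EUR":
        return "Data group 1"
    if currency == "USD":
        return "Data group 2"
    return "Data group 3"  # all other data


def group_sizes(sizes):
    """Sum the per-value sizes into the data groups."""
    totals = {}
    for currency, size in sizes.items():
        group = assign_data_group(currency)
        totals[group] = totals.get(group, 0) + size
    return totals


grouped = group_sizes(sizes_gb)
print(grouped)  # {'Data group 1': 1, 'Data group 2': 3, 'Data group 3': 6}
```

The group totals add up to the 10 GB of the whole InfoProvider, so no data is lost by the grouping.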


Selecting InfoProviders for the Data distribution analysis

It is not always necessary to analyze all InfoProviders based on a characteristic. You can restrict which InfoProviders are analyzed by the Data distribution analysis. To restrict a specific analysis, change the input table "/DVD/HS_IT_DU2" of the Data distribution collector in the following way: 

Additional time characteristics

The "STANDARD" Data distribution analysis can be extended on a monitored system so that it takes into account not only the standard time characteristics (already present) but also additional ones. The only requirement for adding a time characteristic is that its values contain a year value in the standard format YYYY at a fixed position.

You can specify additional time characteristics for a monitored system either by using the BW Wizard or manually, by following these steps on the monitored system:

  1. Execute the report "/DVD/HS_SET_ADD_TIME_CHAR" or the transaction /DVD/HS_ADDTCHA. After execution an ALV table is displayed.
  2. In this table you can add rows. Each row defines one additional time characteristic, which may be used in the Data distribution analysis. 
  3. In the first column specify the technical InfoObject name and in the second column the position of the year value in the standard format YYYY. After specification of the additional time characteristic click .


The order in which additional time characteristics are defined in the ALV table is important. During the Data distribution analysis, characteristics defined higher in the table have priority over the ones defined below them.

Examples of settings of additional time characteristics:

InfoObject: 0BIRTHDAY   Year Value Position: 1   - example of characteristic values: 20160101, 20170101, 20171231

InfoObject: ZDVD_MONTH   Year Value Position: 3   - example of characteristic values: 012016, 022016, 032017

InfoObject: ZDVD_FXYEAR   Year Value Position: 4   - example of characteristic values: BCD2016, BCD2017, DGG2017
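How the year value position is applied to these sample values can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the position is counted from 1, which is consistent with the three examples above.

```python
def extract_year(value, year_position):
    """Return the YYYY year found at the 1-based position in the value."""
    start = year_position - 1
    year = value[start:start + 4]
    if len(year) != 4 or not year.isdigit():
        raise ValueError(
            f"no YYYY year at position {year_position} in {value!r}"
        )
    return int(year)


print(extract_year("20160101", 1))  # 0BIRTHDAY   -> 2016
print(extract_year("032017", 3))    # ZDVD_MONTH  -> 2017
print(extract_year("DGG2017", 4))   # ZDVD_FXYEAR -> 2017
```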

Additional time characteristics are used during the next size update of all InfoProvider data. When an additional time characteristic is added or removed, this size update is performed during the next execution of the Data distribution collector.

Customization of data groups

The "STANDARD" Data distribution analysis can be customized according to your requirements. By default, the analysis consists of the following 6 data groups, across which data sizes and usages are distributed in ascending order:

<={current year - 5}, {current year - 4}, {current year - 3}, {current year - 2}, {current year - 1}, >={current year}


e.g. <=2013, 2014, 2015, 2016, 2017, >=2018

You can change this behavior by using the expert settings (transaction /DVD/RL_SETT_EXPERT) with the Tool name 'HM_BW':

In the displayed table you find the following parameters for setting the data groups:

Parameter technical name | Default value | Description
HM_BW_STD_DATA_GROUPS_ASC | X | Set whether data groups are displayed in ascending order.
HM_BW_STD_DATA_GROUPS_NUMBER | 6 | Set how many data groups are created, including the current year. For example, if you enter 7, the following data groups are created (considering 2018 as the current year): <=2012, 2013, 2014, 2015, 2016, 2017, >=2018

Data groups are re-created during the next collector execution (e.g. with the HMBW_1DAY monitoring profile).
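The effect of the two parameters on the "STANDARD" data groups can be sketched in Python as follows. The function name is hypothetical. The group boundaries follow the examples in this section; treating a cleared HM_BW_STD_DATA_GROUPS_ASC flag as a simple reversal of the order is an assumption.

```python
def standard_data_groups(current_year, number=6, ascending=True):
    """Build the data group labels for the STANDARD analysis.

    `number` corresponds to HM_BW_STD_DATA_GROUPS_NUMBER and
    `ascending` to HM_BW_STD_DATA_GROUPS_ASC.
    """
    if number < 2:
        raise ValueError("at least two data groups are needed")
    oldest = current_year - (number - 1)
    groups = [f"<={oldest}"]
    groups += [str(year) for year in range(oldest + 1, current_year)]
    groups.append(f">={current_year}")
    # Assumption: descending order is the ascending list reversed.
    return groups if ascending else groups[::-1]


print(standard_data_groups(2018))     # default of 6 data groups
print(standard_data_groups(2018, 7))  # starts at <=2012
```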