(DI-1808) How to define data distribution analysis
The Datavard Insights BW module enables you to analyze the data distribution and usage of InfoProviders in a particular system based on a characteristic. The current version of the BW module supports analysis based on only one specified characteristic.
There is one exception to this rule. The BW module is shipped with one predefined data distribution setting with the technical name "STANDARD". This data distribution does not use a single specific characteristic to split the data in InfoProviders. Instead, it uses a set of standard time characteristics (0CALYEAR, 0CALMONTH, 0CALWEEK, 0FISCYEAR, 0FISCPER, 0CALDAY, 0DATE, 0CALQUARTER). Each InfoProvider is searched for one of these (or referenced) characteristics in left-to-right order, and the data distribution is done using the first one found, with its values transformed to year level. If Extended query statistics are enabled, the relevant time characteristic that was used most often in query filters for the particular InfoProvider (and its reporting parents) is preferred. However, if the InfoProvider is a DSO or an InfoCube and a DAP (data archiving process) has been created for it, the time characteristic used in the DAP is used. The set of time characteristics can be extended with additional ones. See section Additional Time Characteristics for more information.
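The selection order described above can be sketched as follows. This is only an illustration of the documented priority (DAP characteristic first, then the characteristic most used in query filters, then the first standard characteristic found); the function name and its parameters are hypothetical and are not part of the product, and referenced characteristics are not modeled.

```python
from typing import Iterable, Optional

# Standard time characteristics in the left-to-right search order listed above.
STANDARD_TIME_CHARS = [
    "0CALYEAR", "0CALMONTH", "0CALWEEK", "0FISCYEAR",
    "0FISCPER", "0CALDAY", "0DATE", "0CALQUARTER",
]


def pick_time_characteristic(
    provider_chars: Iterable[str],
    dap_char: Optional[str] = None,
    most_filtered_char: Optional[str] = None,
) -> Optional[str]:
    """Return the time characteristic used to split one InfoProvider.

    dap_char: characteristic of an existing DAP (DSO or InfoCube only).
    most_filtered_char: characteristic most used in query filters, known
        only when Extended query statistics are enabled (assumed input).
    """
    present = set(provider_chars)
    # 1. The time characteristic of an existing DAP takes precedence.
    if dap_char is not None:
        return dap_char
    # 2. Otherwise prefer the characteristic most used in query filters.
    if most_filtered_char in STANDARD_TIME_CHARS and most_filtered_char in present:
        return most_filtered_char
    # 3. Otherwise take the first standard characteristic that is present.
    for char in STANDARD_TIME_CHARS:
        if char in present:
            return char
    return None  # no usable time characteristic in this InfoProvider
```

For example, an InfoProvider containing 0CALDAY and 0CALMONTH, with no DAP and no query statistics, would be split by 0CALMONTH in this sketch, because 0CALMONTH comes first in the search order.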
Data distribution definition (new analysis)
You can define a new data distribution analysis for the InfoProviders of your monitored system by following the steps described in this section. This creates a new tile in the "Data Management" area of the Insights Fiori frontend.
Custom data distribution analysis tile
- To define a new data distribution analysis (technically called a Subgroup), run transaction SE38 and execute program "/DVD/HS_CENTRAL_SUBGROUP", or run transaction /DVD/HS_SUBG. The following screen is displayed.
In this screen you need to define the following data distribution analysis attributes:
- Subgroup ID - unique identifier of the data distribution analysis.
- Split Characteristics - here you define which InfoObjects are used to create the data distribution analysis. In the current version only one characteristic can be defined and used in a data distribution. During the analysis, each analyzed InfoProvider is checked for this characteristic. If the characteristic is found, all InfoProvider data is split into groups based on the distinct values of this characteristic.
A safety check allows object data to be split only up to 100 distinct values, to prevent too much data from being generated in the system.
- Data groups definition area - in this table you define data groups. Each data group is identified by its ID (first column) and has its own description and a set of conditions. These conditions define how the InfoProvider data groups created by the split on the specified characteristic are grouped in the Fiori analysis output.
Click the Save button to save the data distribution analysis definition.
- If you refresh the Insights Fiori frontend now, you should already see the new tile, but without data. To execute the analysis, the new data distribution must be added to the collector.
- Execute transaction /DVD/MON on the central system.
- Click "Monitoring settings" in the Settings menu of the transaction.
- In the tree on the left side of the screen, double-click the "KPI collectors" folder.
- In the displayed list of collectors, search for the "/DVD/MON_HM_CL_COL_DU" collector and select its row.
- While the collector row is selected, double-click the "Collector's inputs" folder in the left tree.
- In the displayed list of collector input tables, click the "Display/Change" button for input table "/DVD/HS_IT_DU1".
- In the table maintenance view, switch to edit mode.
- Click the button for creating a new record.
- Fill in the new row, providing:
- Subgroup ID - the same ID that was defined for the data distribution analysis in step 1.
- Number - the number of days between updates of the size data of data groups.
The data distribution analyses collector is executed every day for all defined data distribution analyses. Updating the size of all InfoProvider data every day would not be feasible because of the long runtime of this action. Therefore it is recommended to update the sizes of data in InfoProviders only once per period. The minimum recommended value is two weeks. The usage of InfoProvider data is updated every day, as this is much faster.
- Click the Save button.
The new data distribution analysis is now defined.
- The analysis will be executed automatically on all systems where the Data Distribution Collector is scheduled. See the BW add-on setup section for how to schedule the collector.
The results can be seen in the Fiori frontend of the Insights application.
Please note that the collector always gathers the analysis results that the monitored system created during its previous run (if that run has already finished). Therefore the first results will be visible in the Fiori frontend the day after the first run of the analysis has finished on the monitored system.
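The effect of the "Number" field on the daily collector run can be sketched as below. This is a minimal illustration under the assumption that a size update becomes due once the configured number of days has passed since the last one; the function name and the way the last update date is tracked are hypothetical.

```python
from datetime import date
from typing import Optional


def size_update_due(last_size_update: Optional[date], today: date,
                    interval_days: int) -> bool:
    """Decide whether today's collector run should also refresh data sizes.

    Usage data is refreshed on every run; sizes only once per interval.
    """
    if last_size_update is None:
        # Assumption: sizes have never been calculated, so calculate them now.
        return True
    return (today - last_size_update).days >= interval_days
```

With an interval of 14 days, a run on 2018-03-10 after a size update on 2018-03-01 would refresh usage only, while a run on 2018-03-15 would refresh sizes as well.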
Example
Here is an example of how the data of one InfoProvider would be processed with the data distribution defined in the previous steps. As an example we use the InfoProvider /DVD/IC_CUBE, which contains the characteristic 0CURRENCY.
The analysis finds this characteristic in the structure of the InfoProvider and selects all its distinct values while calculating their respective sizes. Let's assume that the InfoProvider has a total size of 10 GB. In our example, the result of the split would be the following:
- 0CURRENCY - EUR - 1 GB
- 0CURRENCY - USD - 3 GB
- 0CURRENCY - NZD - 3 GB
- 0CURRENCY - JPY - 2 GB
- 0CURRENCY - CHF - 1 GB
As we do not want to see results for every distinct currency separately, we have defined three data groups (see step 1 above). The previous results are assigned to the appropriate data groups based on the selection defined for them.
- Data group 1 - data with 0CURRENCY EUR will belong here (1 GB).
- Data group 2 - data with 0CURRENCY USD will belong here (3 GB).
- Data group 3 - groups all other data (6 GB).
After the collector has been executed, the Fiori frontend shows three columns for data group sizes and three columns for data group usages, rather than a column for each distinct value. This way we can focus only on the usage of the data that is important to us.
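The arithmetic of this example can be reproduced with a short sketch. The function and variable names are illustrative only; the sizes and the three data groups are taken from the example above.

```python
from typing import Dict


def assign_to_groups(sizes_gb: Dict[str, float],
                     group_of_value: Dict[str, int],
                     default_group: int) -> Dict[int, float]:
    """Sum the per-value sizes into data groups.

    Values without an explicit group fall into default_group.
    """
    totals: Dict[int, float] = {}
    for value, size in sizes_gb.items():
        group = group_of_value.get(value, default_group)
        totals[group] = totals.get(group, 0) + size
    return totals


# Split of /DVD/IC_CUBE by 0CURRENCY, as in the example (sizes in GB).
sizes = {"EUR": 1, "USD": 3, "NZD": 3, "JPY": 2, "CHF": 1}
# Data group 1 = EUR, data group 2 = USD, data group 3 = everything else.
groups = assign_to_groups(sizes, {"EUR": 1, "USD": 2}, default_group=3)
print(groups)  # {1: 1, 2: 3, 3: 6}
```

The three totals add up to the 10 GB of the InfoProvider, so no data is lost by the grouping.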
Restricting InfoProviders to be analyzed
Because it is not always necessary to analyze all InfoProviders based on a given characteristic, you can restrict the InfoProviders for which each data distribution analysis is executed. To restrict a specific analysis, change the input table "/DVD/HS_IT_DU2" of the collector as follows:
- Add a new row to the table with the following content:
- Subgroup ID - ID of the data distribution analysis in which you want to restrict the InfoProviders to be analyzed.
- Number - a unique number for each new line in the table.
- Object type - INFOCUBE, DSO or ADSO - the type of InfoProvider for which the selection should be restricted.
- Select option - standard select option format to filter on the technical name of the InfoProvider.
- Below is an example of new rows that we would add to restrict our CURRENCY data distribution analysis to all DSO and ADSO InfoProviders in the system but only one InfoCube, with the technical name "/DVD/IC_CUBE".
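How such rows could be evaluated is sketched below. The sketch assumes that "standard select option format" means the usual SAP SIGN/OPTION/LOW triple and supports only the EQ and CP options; the row layout, the names, and the rule that an object type without an including row is not analyzed are assumptions for illustration, not the documented behavior of the collector.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple


class Restriction(NamedTuple):
    object_type: str  # INFOCUBE, DSO or ADSO
    sign: str         # "I" = include, "E" = exclude
    option: str       # "EQ" = exact name, "CP" = pattern with * and +
    low: str          # technical name or pattern


def _matches(row: Restriction, name: str) -> bool:
    if row.option == "EQ":
        return name == row.low
    if row.option == "CP":
        # SAP uses + for a single character; fnmatch uses ?.
        return fnmatchcase(name, row.low.replace("+", "?"))
    raise ValueError(f"unsupported option in this sketch: {row.option}")


def is_analyzed(rows: List[Restriction], object_type: str, name: str) -> bool:
    """Return True if the InfoProvider passes the restriction rows."""
    if not rows:
        return True  # no restriction defined: analyze everything
    typed = [r for r in rows if r.object_type == object_type]
    included = any(r.sign == "I" and _matches(r, name) for r in typed)
    excluded = any(r.sign == "E" and _matches(r, name) for r in typed)
    # Assumption: a type without a matching include row is not analyzed.
    return included and not excluded


# Rows mirroring the restriction described above.
rows = [
    Restriction("DSO", "I", "CP", "*"),
    Restriction("ADSO", "I", "CP", "*"),
    Restriction("INFOCUBE", "I", "EQ", "/DVD/IC_CUBE"),
]
```

With these rows, every DSO and ADSO passes the check, while among InfoCubes only /DVD/IC_CUBE does.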
Additional Time Characteristics
The "STANDARD" data distribution analysis can be enhanced on the monitored system to take into account not only the standard time characteristics but also additional ones. The only requirement for an additional time characteristic is that its values contain the year in the standard format YYYY at some position. Additional time characteristics are specified on the monitored system, not on the central system. To specify them, execute report "/DVD/HS_SET_ADD_TIME_CHAR" or transaction /DVD/HS_ADDTCHA. An ALV table is then displayed. In this table you can add rows, where each row defines one additional time characteristic to be used in the analysis.
In the first column you specify the technical name of the InfoObject; the second column contains the position of the year (in the standard format YYYY) within the values of the characteristic. Navigational attributes can also be defined as additional time characteristics.
Examples of viable additional time characteristics are:
InfoObject: 0BIRTHDAY Year Value Position: 1 - example of characteristic values: 20160101, 20170101, 20171231
InfoObject: ZDVD_MONTH Year Value Position: 3 - example of characteristic values: 012016, 022016, 032017
InfoObject: ZDVD_FXYEAR Year Value Position: 4 - example of characteristic values: BCD2016, BCD2017, DGG2017
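The meaning of "Year Value Position" in these examples can be sketched as follows, assuming the position is 1-based and points at the first digit of the four-digit year, which is consistent with all three examples. The function name is illustrative.

```python
def extract_year(value: str, year_position: int) -> int:
    """Return the YYYY year that starts at the 1-based year_position."""
    start = year_position - 1
    year = value[start:start + 4]
    if len(year) != 4 or not year.isdigit():
        raise ValueError(f"no YYYY year at position {year_position} in {value!r}")
    return int(year)


print(extract_year("20171231", 1))  # 0BIRTHDAY   -> 2017
print(extract_year("032017", 3))    # ZDVD_MONTH  -> 2017
print(extract_year("BCD2016", 4))   # ZDVD_FXYEAR -> 2016
```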
The order in which additional time characteristics are defined in the table is important: during data distribution analysis, characteristics defined higher in the table have priority over those defined below them.
After specifying the additional time characteristics, save their definition by clicking the Save button in the toolbar.
Additional time characteristics will be used during the next size recalculation. Size recalculation is not done in every collector run but only once per specific time period; the default is 14 days. When an additional time characteristic is added or removed, the recalculation is done in the next collector run. If the collector was already executed that day, please do not execute it manually again.
Data distribution analysis customization
The "STANDARD" data distribution analysis can be customized to fulfill the user's specific needs. By default, the analysis provides the following 6 data groups, across which data sizes and usages are distributed in ascending order:
<={current year - 5}, {current year - 4}, {current year - 3}, {current year - 2}, {current year - 1}, >={current year}
e.g. <=2013, 2014, 2015, 2016, 2017, >=2018
This behaviour can be changed in the expert settings (transaction /DVD/RL_SETT_EXPERT) with tool name 'HM_BW' using the following parameters:
Parameter technical name | Default value | Description |
---|---|---|
HM_BW_STD_DATA_GROUPS_ASC | X | Sets whether data groups are displayed in ascending order |
HM_BW_STD_DATA_GROUPS_NUMBER | 6 | Sets how many data groups are created, including the current year. For example, with a value of 7 and 2018 as the current year, the following data groups are created: <=2012, 2013, 2014, 2015, 2016, 2017, >=2018 |
Data groups will be re-created during the next collector execution (e.g. with the HMBW_1DAY monitoring profile).
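How the two parameters shape the data groups can be sketched as below. The function is illustrative only; it reproduces the documented examples for 6 and 7 groups, and it assumes that turning off ascending order simply reverses the list.

```python
from typing import List


def standard_data_groups(current_year: int, number: int = 6,
                         ascending: bool = True) -> List[str]:
    """Build the labels of the STANDARD data groups.

    number corresponds to HM_BW_STD_DATA_GROUPS_NUMBER and ascending to
    HM_BW_STD_DATA_GROUPS_ASC.
    """
    if number < 2:
        raise ValueError("this sketch needs at least the two open-ended groups")
    oldest = current_year - (number - 1)
    labels = [f"<={oldest}"]                                      # everything older
    labels += [str(y) for y in range(oldest + 1, current_year)]   # single years
    labels.append(f">={current_year}")                            # current year and later
    # Assumption: descending order is the reverse of the ascending list.
    return labels if ascending else labels[::-1]


print(standard_data_groups(2018))
# ['<=2013', '2014', '2015', '2016', '2017', '>=2018']
print(standard_data_groups(2018, number=7))
# ['<=2012', '2013', '2014', '2015', '2016', '2017', '>=2018']
```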