(DI-2211) Collector for background job monitoring

The collector for background job monitoring is named /DVD/MON_BTC_CL_COL_JOBS. The collector collects KPIs about background jobs.
The user defines the KPIs and sets KPI rules for jobs in the input table /DVD/MON_BTC_CFG.

If the user creates several job name rules for the same KPI in the input table, all of these rules are executed. The exceptions are the delayed and long-running KPI rules, which apply to job runs that reach or exceed the time limit specified by the user. If two such rules apply to the same job, the rule with the greater time limit has higher priority and is the only one applied.

Example:

KPI_1 has a first rule for job names matching Z* with a long-run time limit of 60 minutes (a job must run for at least 60 minutes to be included).

KPI_1 also has a second rule for job names matching Z_LONG_PROC* with a long-run time limit of 180 minutes. (These job names also match the pattern of the first rule, but the second rule has the greater time limit.)

When the collector finds jobs with the following names, it handles them as follows:

  • Z_JOB1, which ran for 59 minutes - not included (the first rule has a time limit of 60 minutes)
  • Z_JOB2, which ran for 60 minutes - included
  • Z_JOB3, which ran for 179 minutes - included
  • Z_JOB4, which ran for 180 minutes - included
  • Z_JOB5, which ran for 181 minutes - included
  • Z_LONG_PROC_JOB6, which ran for 59 minutes - not included
  • Z_LONG_PROC_JOB7, which ran for 60 minutes - not included
  • Z_LONG_PROC_JOB8, which ran for 179 minutes - not included (the second rule has the greater time limit of 180 minutes, so it is applied instead of the first rule)
  • Z_LONG_PROC_JOB9, which ran for 180 minutes - included
  • Z_LONG_PROC_JOB10, which ran for 181 minutes - included

The collector stores the data for the included jobs in the KPI.
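The rule priority from the example can be sketched in Python. This is only an illustration of the described behaviour, not the collector's actual implementation: the names Rule, effective_limit and is_included are hypothetical, and the sketch assumes that * is the only wildcard and that a job is included once its runtime reaches the limit (as in the example, where 60 minutes is included).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """One long-run rule of a KPI (hypothetical structure for illustration)."""
    job_pattern: str  # job name pattern, e.g. "Z*"
    limit_min: int    # time limit in minutes


def effective_limit(job_name: str, rules: Sequence[Rule]) -> Optional[int]:
    """Return the greatest time limit among all rules matching the job name."""
    limits = [r.limit_min for r in rules if fnmatchcase(job_name, r.job_pattern)]
    return max(limits) if limits else None


def is_included(job_name: str, runtime_min: int, rules: Sequence[Rule]) -> bool:
    """A job is included when its runtime reaches the effective time limit."""
    limit = effective_limit(job_name, rules)
    return limit is not None and runtime_min >= limit


# The two rules of KPI_1 from the example above.
RULES = [Rule("Z*", 60), Rule("Z_LONG_PROC*", 180)]
```

With these rules, is_included("Z_JOB2", 60, RULES) is true, while is_included("Z_LONG_PROC_JOB8", 179, RULES) is false, because the 180-minute rule takes priority for that job name.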

The user enters an extended KPI definition with the following fields in the input table /DVD/MON_BTC_CFG:

Field                  Description
KPI name               The same technical name as in the KPI definition
System ID              System ID
Background job name    Background job name
Check type             Background job check type
Time                   Time limit for delayed and long-running BG jobs
Add to details         Add relevant records to details
Active                 Active record
Change by user         Last change by user [automatically filled]
Change date            Last changed date [automatically filled]


The collector also provides a detailed overview of jobs for particular KPIs in the detail table "BG jobs in error, delay or long-run" (/DVD/MON_BTC_DET), which contains the following fields:

  • Timestamp
  • System ID
  • Background job name
  • Job ID
  • Message text
  • BG job status
  • BG Job delay time [s]
  • BG job long runtime [s]


Three preconfigured KPIs are included in the standard delivery:

KPI name           Description                                                    Unit
BTC_NUM_DELAYED    Number of delayed background jobs (>= 60 seconds)              Count
BTC_NUM_FAILED     Number of failed background jobs                               Count
BTC_NUM_LONGRUN    Number of background jobs running too long (>= 10 minutes)     Count

NOTE: The names of all standard background job KPIs start with the prefix BTC_NUM_.
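As a rough illustration of how the three standard KPIs relate to the values in the detail table, the following Python sketch counts jobs against the default thresholds given above (60 seconds of delay, 10 minutes of runtime). The JobRun structure, the status value "FAILED" and the function standard_kpis are assumptions made for this sketch only; they are not part of the product.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

DELAY_LIMIT_S = 60          # BTC_NUM_DELAYED: delay of at least 60 seconds
LONGRUN_LIMIT_S = 10 * 60   # BTC_NUM_LONGRUN: runtime of at least 10 minutes


@dataclass(frozen=True)
class JobRun:
    """One background job run (hypothetical structure for illustration)."""
    name: str
    status: str     # assumed values, e.g. "FINISHED", "RUNNING", "FAILED"
    delay_s: int    # delay time in seconds
    runtime_s: int  # runtime in seconds


def standard_kpis(runs: Iterable[JobRun]) -> Dict[str, int]:
    """Count the jobs that fall under each standard KPI."""
    runs = list(runs)
    return {
        "BTC_NUM_DELAYED": sum(r.delay_s >= DELAY_LIMIT_S for r in runs),
        "BTC_NUM_FAILED": sum(r.status == "FAILED" for r in runs),
        "BTC_NUM_LONGRUN": sum(r.runtime_s >= LONGRUN_LIMIT_S for r in runs),
    }
```

A single job can contribute to more than one KPI in this sketch, for example a job that started late and then also ran too long.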