...

By default, the Occurrence counter is disabled.
To enable it, activate the switch and add one or more counter definitions to your BVQ alert condition.


Figure 1: Use Occurrence counter

...



Consecutive violations

Type: Type of this Occurrence counter
  • Consecutive violations - defines how many violations in a row must occur before the condition is fulfilled
Amount: Number of violations. Maximum number of violation occurrences that are counted until the Alert level is raised.
Alert level: BVQ Alert level. Desired Alert level, one of: OK, INFO, WARN, ERROR, UNKNOWN
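The consecutive-violations counter described above can be sketched in a few lines of Python. This is a minimal illustration, not BVQ's implementation: the function name `consecutive_violation_levels` is hypothetical, and the sketch assumes that any measurement at or below the threshold resets the run and that the level is raised once the run length reaches the configured amount.

```python
from typing import Iterable, List


def consecutive_violation_levels(samples: Iterable[float], threshold: float,
                                 amount: int, alert_level: str = "ERROR") -> List[str]:
    """Return the status after each measurement (hypothetical helper)."""
    run = 0  # length of the current run of violations
    statuses = []
    for value in samples:
        # Assumption: a measurement at or below the threshold resets the run.
        run = run + 1 if value > threshold else 0
        statuses.append(alert_level if run >= amount else "OK")
    return statuses


# Latencies in ms, one per measurement; threshold 3 ms, amount 5.
latencies_ms = [4.0, 4.2, 2.0, 3.5, 3.6, 3.7, 3.8, 3.9, 1.0]
statuses = consecutive_violation_levels(latencies_ms, threshold=3.0, amount=5)
print(statuses)
```

In this sample data only the eighth measurement completes a run of five violations, so it is the only one that carries the ERROR level.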


Violations per timeframe

Type: Type of this Occurrence counter
  • Violations per timeframe - defines how often a condition must match within a certain timeframe before the condition is fulfilled
Time: Minutes. Size of the sliding timeframe, in minutes. Only available for type "Violations per time" if SLA mode is turned off.
Amount: Number of violations. Maximum number of violation occurrences that may be counted until the Alert level is raised.
Alert level: BVQ Alert level. Desired Alert level, one of: OK, INFO, WARN, ERROR, UNKNOWN
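A sliding timeframe of this kind can be sketched as follows. The code is an illustration under stated assumptions, not BVQ's implementation: the function name `sliding_window_levels` is hypothetical, timestamps are plain minutes, and a violation that is exactly one window length old is assumed to have left the window.

```python
from collections import deque
from typing import Deque, List, Tuple


def sliding_window_levels(samples: List[Tuple[int, float]], threshold: float,
                          window_minutes: int, amount: int,
                          alert_level: str = "ERROR") -> List[str]:
    """Return the status after each (minute, value) measurement (hypothetical helper)."""
    recent: Deque[int] = deque()  # timestamps of violations still inside the window
    statuses = []
    for minute, value in samples:
        if value > threshold:
            recent.append(minute)
        # Assumption: violations that are window_minutes old or older drop out.
        while recent and recent[0] <= minute - window_minutes:
            recent.popleft()
        statuses.append(alert_level if len(recent) >= amount else "OK")
    return statuses


# One measurement every 5 minutes; violations at minutes 5, 10 and 15.
samples = [(m, 4.0 if m in (5, 10, 15) else 1.0) for m in range(0, 45, 5)]
window_statuses = sliding_window_levels(samples, threshold=3.0,
                                        window_minutes=30, amount=3)
print(window_statuses)
```

With a 30-minute window and an amount of 3, the level is raised at minute 15 and falls back to OK at minute 35, once the oldest violation has slid out of the window.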


Violations per SLA interval (fixed window)

Type: Type of this Occurrence counter
  • Violations per SLA interval - defines how often a condition must match within a fixed timeframe before the condition is fulfilled
Time: If SLA mode is enabled, the time setting is not available. The number of Alert condition violations within the SLA interval is counted.
Amount: Number of violations. Maximum number of violation occurrences that may be counted until the Alert level is raised.
Alert level: BVQ Alert level. Desired Alert level, one of: OK, INFO, WARN, ERROR, UNKNOWN
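The fixed-window counting can be sketched like this. It is a minimal illustration rather than BVQ's implementation: the function name `sla_interval_levels` is hypothetical, intervals are assumed to be aligned to minute 0, and the counter is assumed to reset to zero at the start of every interval.

```python
from typing import List, Tuple


def sla_interval_levels(samples: List[Tuple[int, float]], threshold: float,
                        interval_minutes: int, amount: int,
                        alert_level: str = "ERROR") -> List[str]:
    """Return the status after each (minute, value) measurement (hypothetical helper)."""
    count = 0
    current_interval = None
    statuses = []
    for minute, value in samples:
        interval = minute // interval_minutes
        if interval != current_interval:
            # Assumption: a new SLA interval starts counting from zero again.
            current_interval = interval
            count = 0
        if value > threshold:
            count += 1
        statuses.append(alert_level if count >= amount else "OK")
    return statuses


# One measurement every 5 minutes; violations at minutes 5, 10 and 15.
sla_samples = [(m, 4.0 if m in (5, 10, 15) else 1.0) for m in range(0, 45, 5)]
sla_statuses = sla_interval_levels(sla_samples, threshold=3.0,
                                   interval_minutes=30, amount=3)
print(sla_statuses)
```

Here the level is raised at minute 15, stays raised for the rest of the 30-minute interval, and is reset at minute 30 when the next interval begins.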


Figure 2: Differences between the options

...

YELLOW: An alarm is triggered only after a certain number of exceedances within a specified timeframe. (Occurrence counter ON - Violations per time) - (delayed trigger mode)


Figure 3: Real Life example and illustration of the different methods

...



Methods of choice | Example | How does it work? | What to fill in

Default (direct trigger mode)

Every exceedance of a predefined threshold triggers an event at the designated alert level (Info, Warning, Error).

If a latency exceeds 3 ms, the threshold is reached and an error message is triggered.

The example with the red border highlights only one of these events for clarity, but the same behavior applies to all further exceedances.

The default raises an Error event as soon as the rule is violated for the first time.

Consecutive violations

An event is triggered at the designated alert level (Info, Warning, Error) only when 5 consecutive errors occur.

This condition is shown inside the blue frame and occurs only once in this performance chart.

The latency for the MDisk is above the 3 ms threshold 5 times in a row, and therefore this rule raises an Error.

Violations per timeframe

The yellow box illustrates the "Violations per time" setting, which specifies a number of violations per timeframe.

As soon as this number is reached, the designated alert level is displayed.

However, once this number is no longer reached, the alert status falls back to the next better level.

In this example, 3 violations per 60-minute timeframe are specified. Once this is exceeded, an alert message is raised.

Figure 4: The different options using a real-life example

...

Step 1: Start of the interval

A rule is evaluated every 5 minutes and leads to the Error status if it is violated once or several times.

In this picture, the state starts in the "OK" status, and another violation occurs every 5 minutes thereafter.

Accordingly, the status changes first to Info, then to Warning, and finally to Error, regardless of whether the sliding or the fixed window is used.

Thus, both options are in the Error status in interval 25.


...

Step 2: Behaviour of the fixed window mode

The SLA interval is set to 30 minutes in this example, so the alert level status is reset to OK after exactly this period.

With the default time mode, the status is re-evaluated every time the warning rule is exceeded.
As step 3 shows, the status changes over time because each evaluation recounts how many violations of the rule have occurred within the time window.




Step 3: Behaviour of the sliding window mode

This example shows the different alarm levels. In SLA mode, the status is reset after exactly 30 minutes, as shown in step 2, and the counting of rule violations starts again.

In sliding window mode, by contrast, the current state and the number of state changes in the time window are taken into account. After the 4th PI interval, i.e. after minute 20, the rule is no longer violated. The 30-minute sliding window then counts how often the rule was violated within the last 30 minutes.

Here you can clearly see the alarm level status "expiring".


...

  • ERROR at 3 ms
  • Violations per time: 5, 10, 20 occurrences per time
  • Consecutive violations: more than 5 occurrences

Figure 6: Real-life example

...

BVQ Alert rule - SLA Mode not enabled (sliding window)

Name: SVC MDisk violated
Performance indicator timing: 5 minutes
SLA interval: NONE (OFF)
AR Condition: Latency > 3 ms
1. AR Condition > 3 ms: ERROR

Explanation

INFO: Each colored box represents a sliding window of 1 hour and indicates the current status by its color.

From 07:15 until 08:55, the 3 ms latency rule was exceeded 5 times. → Status raised to ERROR

The following 5-minute PI measurements also contained more than 5 violations of the 3 ms latency rule. → Status still at ERROR

From 09:15, the status is lowered in severity because the 3 ms latency rule is no longer exceeded 5 times within the 60-minute sliding window. → Status WARN

...

BVQ Alert rule - SLA Mode enabled (fixed window), SLA interval of one day

Name: Daily SVC MDisk SLA violated
Performance indicator timing: 5 minutes
SLA interval: 1 day
AR Condition: Latency > 3 ms
1. AR Condition Occurrence counter: ERROR, 20 times per SLA interval
2. AR Condition Occurrence counter: WARN, 10 times per SLA interval
3. AR Condition Occurrence counter: INFO, 5 times per SLA interval
4. AR Condition Occurrence counter: ERROR, 5 times in a row per SLA interval

Explanation

INFO: Each box shows by its color the status from the beginning of the day up to that moment.

SLA mode sets the status to the default OK at the start of the day.

At 08:55, 5 violations have been counted, which raises the status to INFO.

At 09:10, 10 violations of the 3 ms latency rule have been counted, so the status rises to WARN.

At 09:15, 5 consecutive violations occur. This triggers the 4th alert rule, so the status rises directly to ERROR.

Because of SLA mode, this status persists until the start of the next day. It is then reset to OK.
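How the four conditions of this rule might combine within one SLA interval can be sketched in Python. This is an illustration under assumptions, not BVQ's implementation: the function name `daily_sla_status` is hypothetical, the most severe fulfilled condition is assumed to win, and the status is assumed to persist until the interval ends. The thresholds (5, 10, 20 per interval and 5 in a row) are taken from the rule above.

```python
from typing import Iterable, List

SEVERITY = {"OK": 0, "INFO": 1, "WARN": 2, "ERROR": 3}


def daily_sla_status(violations: Iterable[bool]) -> List[str]:
    """Status after each PI measurement within one SLA interval (hypothetical helper)."""
    total = 0      # violations counted since the start of the interval
    run = 0        # current number of violations in a row
    status = "OK"  # SLA mode starts every interval at OK
    history = []
    for violated in violations:
        total += int(violated)
        run = run + 1 if violated else 0
        candidates = ["OK"]
        if total >= 5:
            candidates.append("INFO")
        if total >= 10:
            candidates.append("WARN")
        if total >= 20 or run >= 5:
            candidates.append("ERROR")
        worst = max(candidates, key=SEVERITY.get)
        # Assumption: within the interval the status never improves again.
        if SEVERITY[worst] > SEVERITY[status]:
            status = worst
        history.append(status)
    return history


pattern = ([True, False] * 5                                 # 5 scattered violations
           + [True, True, False, True, True, False, True]    # 5 more, never 5 in a row
           + [True, True, True, True])                       # completes a run of 5
history = daily_sla_status(pattern)
print(history[7], history[8], history[16], history[20])
```

In this made-up sequence the status reaches INFO with the 5th violation, WARN with the 10th, and jumps to ERROR when five violations occur in a row, well before 20 violations have been counted.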

...

  • Violations per timeframe (sliding window) → Time must be greater than Amount of violations × PI timing

  • Violations per timeframe (fixed window) → SLA interval must be greater than Amount of violations × PI timing
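The two sizing rules above amount to the same arithmetic check, which can be sketched as a small helper. The function name `window_is_large_enough` and the use of minutes as the common unit are assumptions made for this illustration.

```python
def window_is_large_enough(window_minutes: int, amount: int,
                           pi_timing_minutes: int) -> bool:
    """Check that the timeframe (or SLA interval) exceeds amount x PI timing."""
    return window_minutes > amount * pi_timing_minutes


# Sliding window: 3 violations per 60 minutes with a 5-minute PI timing.
print(window_is_large_enough(60, amount=3, pi_timing_minutes=5))

# Fixed window: 20 violations per 1-day SLA interval with a 5-minute PI timing.
print(window_is_large_enough(24 * 60, amount=20, pi_timing_minutes=5))

# Too small: 3 violations can hardly fit into 15 minutes of 5-minute measurements.
print(window_is_large_enough(15, amount=3, pi_timing_minutes=5))
```

The first two configurations satisfy the rule (60 > 15 and 1440 > 100); the third does not, because 15 is not greater than 3 × 5.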

Summary

The new BVQ Occurrence counter lets you specify how often and in what way a rule must be violated before the alerting options raise an alert for your environment. This makes it possible to cover anything from very simple to highly complex scenarios.

...