...

By default, the Occurrence counter is disabled.
To enable it, activate the switch and add one or more counter definitions to your BVQ alert condition.


Figure 1: Use Occurrence counter


After enabling the Occurrence counter, choose one of the two methods: Consecutive violations or Violations per time.

...

INFO: SLA mode and non-SLA mode cannot be mixed within a single Alert Condition.

Consecutive violations

Type (Type of this Occurrence counter)
  • Consecutive violations - define how many violations in a row must occur before the condition is fulfilled
Amount (Number of violations): Number of violation occurrences that are counted before the Alert level is raised.
Alert level (BVQ Alert level): Desired Alert level, one of: OK, INFO, WARN, ERROR, UNKNOWN

Violations per timeframe

Type (Type of this Occurrence counter)
  • Violations per timeframe - define how often a condition must match within a certain timeframe before the condition is fulfilled
Time (Minutes): Size of the sliding timeframe, in minutes. Only available for type "Violations per time" if SLA mode is turned off.
Amount (Number of violations): Number of violation occurrences that may be counted before the Alert level is raised.
Alert level (BVQ Alert level): Desired Alert level, one of: OK, INFO, WARN, ERROR, UNKNOWN

Violations per SLA interval (fixed window)

Type (Type of this Occurrence counter)
  • Violations per SLA interval - define how often a condition must match within a fixed timeframe before the condition is fulfilled
Time: If SLA mode is enabled, the time setting is not available. The number of Alert condition violations within the SLA interval is counted.
Amount (Number of violations): Number of violation occurrences that may be counted before the Alert level is raised.
Alert level (BVQ Alert level): Desired Alert level, one of: OK, INFO, WARN, ERROR, UNKNOWN

Figure 2: Differences between the options
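
The "Consecutive violations" method can be illustrated with a minimal Python sketch. This is not BVQ code; the function name, parameters, and sample values are hypothetical, and it assumes that the condition is fulfilled as soon as the configured amount of violations in a row is reached.

```python
from typing import Iterable, List


def consecutive_violation_flags(
    values: Iterable[float], threshold: float, amount: int
) -> List[bool]:
    """For each measurement, report whether `amount` violations in a row were reached.

    Hypothetical helper for illustration only; not part of BVQ.
    """
    run = 0  # length of the current unbroken run of violations
    flags = []
    for value in values:
        # A value above the threshold extends the run; any other value resets it.
        run = run + 1 if value > threshold else 0
        flags.append(run >= amount)
    return flags


# Made-up latency samples in ms, checked against a 3 ms threshold.
latencies_ms = [2.0, 3.5, 4.0, 2.5, 3.1, 3.2, 3.3, 3.4, 3.6, 2.0]
flags = consecutive_violation_flags(latencies_ms, threshold=3.0, amount=5)
print(flags)
```

In this sample, only the ninth measurement completes a run of five violations; the two earlier violations do not count because the run was interrupted.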





Illustration of the different methods:

...

YELLOW:  An alarm is triggered only after a certain number of exceedances within a specified timeframe. (Occurrence counter ON - Violations per time) - (delayed trigger mode)


Figure 3: Real Life example and illustration of the different methods





This table describes the different options using a real-life example.
More detailed information about each option is given, as well as a description of how to activate one of the options.


Methods of choice | Example | How does it work? | What to fill in

Default (direct trigger mode)

Each exceedance of a predefined threshold triggers an event at the designated warning level (Info, Warning, Error).

If a latency exceeds 3 ms, the threshold is reached and an error message is triggered.

The example with the red border shows only one of these events for clarification, but this behavior applies to all further exceedances.

The default raises an Error event as soon as the rule is violated for the first time.

Consecutive violations

An event is triggered at the designated warning level (Info, Warning, Error) only when 5 consecutive errors occur.

This condition is shown inside the blue frame and occurs only once in this performance chart.

The latency for the MDisk is above the 3 ms threshold 5 times in a row, and therefore this rule raises an Error.

Violations per timeframe

The yellow box shows the "Violations per time" setting. Here, a number of violations per timeframe is specified.

As soon as this number is reached, the designated warning level is displayed.

However, if this value is no longer exceeded, the status of the alert returns to the next better level.

In this example, 3 violations per 60-minute timeframe are specified. When this is exceeded, an alert message is raised.

Figure 4: The different options using a real-life example
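
The "Violations per timeframe" behavior from the example (3 violations per 60 minutes) can be sketched as follows. This is an illustration only, not BVQ's implementation: the function name and the sample data are hypothetical, and the sketch assumes that the condition is fulfilled once the configured amount is reached and that the window covers the last 60 minutes up to and including the current measurement.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple


def sliding_window_flags(
    samples: Iterable[Tuple[int, float]],
    threshold: float,
    window_minutes: int,
    amount: int,
) -> List[bool]:
    """Report per sample whether `amount` violations lie inside the sliding window.

    `samples` holds (minute, value) pairs in ascending time order.
    Hypothetical helper for illustration only; not part of BVQ.
    """
    recent: Deque[int] = deque()  # timestamps of violations still inside the window
    flags = []
    for minute, value in samples:
        if value > threshold:
            recent.append(minute)
        # Drop violations that have left the window (minute - window, minute].
        while recent and recent[0] <= minute - window_minutes:
            recent.popleft()
        flags.append(len(recent) >= amount)
    return flags


# Made-up (minute, latency in ms) samples.
samples = [(0, 3.5), (5, 2.0), (10, 3.2), (15, 3.8), (20, 2.0),
           (60, 2.0), (70, 2.0), (75, 2.0)]
flags = sliding_window_flags(samples, threshold=3.0, window_minutes=60, amount=3)
print(flags)
```

The condition becomes true at minute 15 with the third violation and turns false again at minute 60, when the first violation slides out of the window. This mirrors the status returning to a better level once the count is no longer met.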



When using the non-SLA (sliding window) mode, the status of a rule varies depending on the current value.
If the value falls below the critical value again in the next measurement, this is reflected in the status of the alert rule, which changes, for example, from ERROR back to OK.
The PI timing and the time specified for the sliding window are relevant here.
SLA mode (fixed window) adds the option to maintain this state over the longer term, for a certain SLA time interval.

...

The main difference from the sliding window (used without the SLA option) is that the state of the Alert rule persists.
By default, SLA mode is disabled. To enable it, change the SLA INTERVAL from "SLA mode not enabled" to the desired timeframe.
The following comparison shows the differences between the two modes:


Standard timing mode (sliding window)
  • Intended to be used for normal monitoring purposes.
  • Allows a separate timing to be defined for each Occurrence counter definition.
  • Uses the timing for a sliding window - Time setting within the Occurrence counter definitions.
  • Resets to the default Alert level (typically OK) when the measurements in the sliding window are below the counts of all Occurrence counter definitions.

SLA timing mode (fixed window)
  • Intended to prove the health of service level objectives.
  • Forces all Occurrence counter definitions to use the configured SLA timing.
  • Uses the timing for a fixed window - Time setting for the SLA interval.
  • Resets to the default Alert level (typically OK) at each start of the SLA interval fixed window.

Figure 5: Standard timing mode versus SLA timing mode
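
The fixed-window reset behavior of SLA timing mode can be sketched in a few lines of Python. This is a hedged illustration rather than BVQ's actual logic: the function name, the integer-division interval boundaries, and the sample data are assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def fixed_window_flags(
    samples: Iterable[Tuple[int, float]],
    threshold: float,
    interval_minutes: int,
    amount: int,
) -> List[bool]:
    """Report per sample whether `amount` violations were counted in the current interval.

    `samples` holds (minute, value) pairs in ascending time order.
    Hypothetical helper for illustration only; not part of BVQ.
    """
    flags = []
    current_interval = None
    count = 0
    for minute, value in samples:
        interval = minute // interval_minutes
        if interval != current_interval:
            # A new fixed window starts: the violation count is reset.
            current_interval = interval
            count = 0
        if value > threshold:
            count += 1
        # Within one interval the count never decreases, so the state persists.
        flags.append(count >= amount)
    return flags


# Made-up (minute, latency in ms) samples with a 60-minute fixed window.
samples = [(0, 3.5), (10, 3.6), (20, 2.0), (50, 2.0), (60, 2.0), (65, 3.4)]
flags = fixed_window_flags(samples, threshold=3.0, interval_minutes=60, amount=2)
print(flags)
```

Unlike a sliding window, the fulfilled state remains until minute 60 even though the later measurements are below the threshold; it is cleared only when the next interval begins.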

...

  • ERROR at 3 ms
  • Violations per time: 5, 10, or 20 occurrences per timeframe
  • Consecutive violations: more than 5 occurrences in a row

Figure 6: Real-life example



BVQ Alert rule - SLA mode not enabled (sliding window)

Name: SVC MDisk violated
Performance indicator timing: 5 minutes
SLA interval: NONE (OFF)
AR Condition: Latency > 3 ms
1. AR Condition > 3 ms: ERROR

INFO: Each colored box represents a sliding window of 1 hour and indicates the current status by its color.

Explanation

From 07:15 until 08:55, the 3 ms latency rule was exceeded 5 times. → Status raised to ERROR

The subsequent 5-minute PI measurements also contained more than 5 violations of the 3 ms latency rule. → Status still at ERROR

From 09:15, the status was lowered in severity because the 3 ms latency rule was no longer exceeded 5 times within the 60-minute sliding window. → Status WARN

Figure 7: Disabled SLA mode


BVQ Alert rule - SLA mode enabled (fixed window), SLA interval of one day

Name: Daily SVC MDisk SLA violated
Performance indicator timing: 5 minutes
SLA interval: 1 day
AR Condition: Latency > 3 ms
1. AR Condition Occurrence counter: ERROR, 20 times per SLA interval
2. AR Condition Occurrence counter: WARN, 10 times per SLA interval
3. AR Condition Occurrence counter: INFO, 5 times per SLA interval
4. AR Condition Occurrence counter: ERROR, 5 times in a row per SLA interval

INFO: Each box shows by its color the status from the beginning of the day up to that moment.

Explanation

SLA mode sets the status to the default OK at the start of the day.

At 08:55, 5 violations have been counted; this raises the status to INFO.

At 09:10, 10 violations of the 3 ms latency rule have been counted, so the status rises to WARN.

At 09:15, 5 consecutive violations occur. This triggers the 4th alert rule, and the status therefore rises directly to ERROR.

Because of SLA mode, this status persists until the next start of the day.

It is then reset to OK.

Figure 8: Enabled SLA mode
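
The escalation from OK through INFO and WARN to ERROR within one SLA interval can be sketched as below. This is a simplified illustration, not BVQ's implementation: the function name, the severity ranking, and the violation pattern are hypothetical, and the sketch assumes that the status within an SLA interval only ever moves to a more severe level. The counter definitions mirror the four listed for Figure 8.

```python
from typing import List, Sequence, Tuple

# Assumed severity ranking used to pick the most severe fulfilled level.
SEVERITY = {"OK": 0, "INFO": 1, "WARN": 2, "ERROR": 3}


def sla_status_timeline(
    violations: Sequence[bool],
    per_interval: Sequence[Tuple[int, str]],
    in_a_row: Tuple[int, str],
) -> List[str]:
    """Return the status after each measurement of a single SLA interval.

    Hypothetical helper for illustration only; not part of BVQ.
    """
    status = "OK"  # default level at the start of the SLA interval
    total = 0      # violations counted so far in this interval
    run = 0        # current run of consecutive violations
    timeline = []
    for violated in violations:
        if violated:
            total += 1
            run += 1
        else:
            run = 0
        fulfilled = [level for amount, level in per_interval if total >= amount]
        if run >= in_a_row[0]:
            fulfilled.append(in_a_row[1])
        for level in fulfilled:
            # The status persists; it is only replaced by a more severe level.
            if SEVERITY[level] > SEVERITY[status]:
                status = level
        timeline.append(status)
    return timeline


# Made-up violation pattern: scattered violations, then five in a row.
pattern = [True, True, False]
violations = pattern * 2 + [True, False] + pattern * 2 + [True] * 5 + [False]
timeline = sla_status_timeline(
    violations,
    per_interval=[(20, "ERROR"), (10, "WARN"), (5, "INFO")],
    in_a_row=(5, "ERROR"),
)
print(timeline)
```

With this pattern the status reaches INFO at the 5th violation, WARN at the 10th, and ERROR once five violations occur in a row, well before 20 violations are counted. The final non-violating measurement leaves the status at ERROR, as it would remain until the next interval starts.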

...