Number of Problem Alerts Sent

Alerts are sent according to the assigned Alert Rule. Whenever the Alert Rule condition is violated, the system sends a problem alert, and when the problem recovers, the system sends a recovery alert (see About Alert Rules).

If you do not choose to receive continuous alerts, you will receive only one problem alert.

If you choose to receive continuous alerts, the number of problem alerts you receive is determined by the mechanisms described below.

For Uptime Monitors

Monitis external monitoring is based on special Linux agents (“external” agents) running on hosts situated in different geographical locations. The agents conduct external tests (run external monitors).

By design, the following limit currently applies to the external agents:

–          If tests from a location fail repeatedly, the agent sends a NOK to the Monitis notification service for each failed test, but no more than 5 NOKs in total. This is intentional: it prevents too many NOKs from overloading the system end to end.

Note: This limit applies only to the number of NOKs that an external monitor sends to the Monitis notification service, and its purpose is to limit the number of problem alerts sent to the user (see below). In your external monitors, as well as in the Alerts History, all NOKs are displayed without any limit.
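The per-location limit can be pictured with a minimal Python sketch. This is an illustration only, not Monitis code: the class name LocationAgent, its fields, and the report method are hypothetical, and resetting the counter on a successful test is an assumption that the text above does not confirm.

```python
MAX_NOKS_PER_LOCATION = 5  # limit stated in the text above


class LocationAgent:
    """Hypothetical model of one external agent (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.noks_sent = 0      # NOKs forwarded to the notification service
        self.failures_seen = 0  # every failure is still recorded for display

    def report(self, test_ok):
        """Return True if this test result is forwarded as a NOK."""
        if test_ok:
            # Assumption: a successful test ends the failure streak.
            self.noks_sent = 0
            return False
        self.failures_seen += 1
        if self.noks_sent < MAX_NOKS_PER_LOCATION:
            self.noks_sent += 1
            return True
        return False  # limit reached: the failure is recorded but not forwarded


agent = LocationAgent("A")
forwarded = [agent.report(False) for _ in range(8)]  # 8 failed tests in a row
print(forwarded.count(True), "NOKs forwarded out of", agent.failures_seen, "failures")
```

In this sketch, eight consecutive failures are all recorded, but only the first five are forwarded, which mirrors the distinction the Note draws between what is displayed and what is sent.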

The following example illustrates how the limit works.

Example:

The User has an HTTP monitor M running from 3 locations: A, B and C.

The User has also assigned an Alert Rule (AR) to the monitor, to be alerted if M fails from all 3 locations simultaneously (alert upon 1 failure from 3 locations).

The User has also enabled receiving continuous alerts.

The flow of events is as follows:

–          M fails from location A. NOK is generated by the agent running from location A, but not sent because AR is not met.

–          Test results from locations B and C are OK.

–          M fails from location A. NOK is generated by the agent running from location A, but not sent because AR is not met.

–          M fails from location B. NOK is generated by the agent running from location B, but not sent because AR is not met.

–          Test result from location C is OK.

–          M fails from location A. NOK is generated by the agent running from location A, but not sent because AR is not met.

–          M fails from location B. NOK is generated by the agent running from location B, but not sent because AR is not met.

–          M fails from location C. AR is triggered because M is now failing from all 3 locations simultaneously. The system sends out a problem alert.

–          The next N tests for all 3 locations fail.

–          The number of problem alerts sent to the User is:

o  Location A sends 2 problem alerts with the next 2 failures, and sends no more alerts with subsequent failures, because the total number of NOKs from A (3 before the AR was triggered plus these 2) reaches the limit of 5.

o  Location B sends 3 problem alerts with the next 3 failures, and sends no more alerts with subsequent failures, because the total number of NOKs from B (2 before the AR was triggered plus these 3) reaches the limit of 5.

o  Location C sends 4 more problem alerts with the next 4 failures (5 in total, including the 1st problem alert sent to the User), and sends no more alerts with subsequent failures, because the total number of NOKs from C reaches the limit of 5.

The total number of problem alerts sent is therefore: 2+3+5=10.
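The arithmetic of this example can be reproduced with a short Python simulation. It is a sketch under stated assumptions, not a description of the actual Monitis implementation: the function count_problem_alerts and its data layout are hypothetical, locations are processed in the order A, B, C within each test round, a successful test is assumed to reset a location's NOK count, and N is set to 6 for illustration.

```python
MAX_NOKS_PER_LOCATION = 5  # limit stated in the text


def count_problem_alerts(rounds, locations=("A", "B", "C")):
    """Count problem alerts per location (hypothetical model).

    rounds: list of dicts mapping location -> True (test OK) / False (failed).
    """
    noks = {loc: 0 for loc in locations}         # NOKs generated in the current streak
    failing = {loc: False for loc in locations}  # latest known state per location
    alerts = {loc: 0 for loc in locations}       # problem alerts sent to the user
    for results in rounds:
        for loc in locations:  # assumption: A, B, C report in this order
            ok = results[loc]
            failing[loc] = not ok
            if ok:
                noks[loc] = 0  # assumption: recovery resets the count
                continue
            noks[loc] += 1
            rule_met = all(failing.values())  # AR: failing from all locations
            if rule_met and noks[loc] <= MAX_NOKS_PER_LOCATION:
                alerts[loc] += 1
    return alerts


all_fail = {"A": False, "B": False, "C": False}
rounds = [
    {"A": False, "B": True, "C": True},   # only A fails
    {"A": False, "B": False, "C": True},  # A and B fail
    all_fail,                             # C fails too: AR is triggered
] + [all_fail] * 6                        # the next N = 6 tests all fail

alerts = count_problem_alerts(rounds)
print(alerts, "total:", sum(alerts.values()))
```

Under these assumptions the simulation yields 2 alerts from A, 3 from B and 5 from C, matching the total of 10 given above.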

The table below summarizes the example:

[Table image: Number of Problem Alerts]


Note:

If you have a maintenance period set for the monitor (see Maintenance Period), you will not receive problem alerts during the maintenance. If the problem continues after the maintenance period is over and you have continuous alerting enabled in the alert rule (see Alert Rules), the system will generate fake alerts every 9 minutes, within 1 hour, for every location that has not recovered yet. This ensures that even if the limit of 5 NOKs per location was reached while the monitor was in maintenance, and you therefore received no alert, you will still be alerted if the problem continues beyond the maintenance.

Similarly, you may have set the alert rule so that you receive alerts only during a scheduled period of time (see Alert Rules). If the problem starts before the scheduled alerting period and you have continuous alerting enabled in the alert rule (see Alert Rules), the system will generate fake alerts every 9 minutes during the scheduled alerting period for every location that has not recovered yet, to make sure you are notified about the problem.
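As a rough illustration of the post-maintenance case, the Python sketch below lists when such alerts would fall for one location that has not recovered. The function reminder_times is hypothetical, and two details are assumptions the text does not state: that the first alert comes 9 minutes after the maintenance ends, and that "within 1 hour" means no alert later than 60 minutes after that point.

```python
from datetime import datetime, timedelta

REMINDER_INTERVAL = timedelta(minutes=9)  # interval stated in the text
REMINDER_WINDOW = timedelta(hours=1)      # window stated in the text


def reminder_times(maintenance_end):
    """Return alert times for one unrecovered location (hypothetical model)."""
    times = []
    t = maintenance_end + REMINDER_INTERVAL  # assumption: first alert after 9 min
    while t <= maintenance_end + REMINDER_WINDOW:
        times.append(t)
        t += REMINDER_INTERVAL
    return times


times = reminder_times(datetime(2024, 1, 1, 12, 0))
for t in times:
    print(t.strftime("%H:%M"))
```

With a maintenance period ending at 12:00, this model produces alerts at 12:09, 12:18, 12:27, 12:36, 12:45 and 12:54.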

 

For Other Monitors

For Server-Device, Application, Transaction, FPL, custom and cloud monitors, there is no limit on the number of problem alerts sent. You will receive a problem alert for every failed test until the problem recovers.