4.4 Alert Configuration and Functionality

Validate alert configuration and functionality

Validating alert rules confirms that they trigger as intended under the configured settings and conditions. Work methodically: test each alert condition on its own, then test the behavior of the alert rule as a whole.

Generally, focus on these key aspects when validating alerts:

  • Verify the tests, agents, monitors, and devices associated with the alert rule.
  • Confirm that alert conditions trigger as expected based on configured metric thresholds.
  • Ensure alert notifications are sent to the correct recipients or systems with accurate information.
  • Tune thresholds to reduce alert noise while still capturing important events.
  • Test alerts under different scenarios to validate and optimize the configuration.

Two common questions arise during alert validation:

  • Why didn't my alert trigger?
  • Why is my alert triggering too often?

This section addresses these questions and provides guidance on alert sensitivity and tuning.

Key Concepts

Before diving into the details, let's establish a key for the icons used in this document:

Legend
  • Alert conditions are met
  • Alert conditions are not met
  • Alert is triggered or status unchanged
  • Alert is not triggered or status cleared

Sensitivity

The global setting, configured in the Alert Conditions section of an alert rule, determines the overall sensitivity of the alert. It defines how many test rounds and agents must meet the specified conditions for the alert to trigger.

How Many Rounds to Trigger On?

The "Global" section of the alert conditions lets you specify how many of the most recent test rounds must meet the criteria for the alert to trigger, for example 1 of 1 or 1 of 2.

Any (1 of 1 Test Rounds)

This is the most sensitive setting. The alert triggers whenever the conditions are met, even for a single test round.

Round   Alert Condition Met?   Alert Status
1       Yes                    Alert is triggered
2       Yes                    Alert status unchanged (triggered)
3       No                     Alert is cleared
4       Yes                    Alert is triggered again

This setting can lead to multiple notifications for the same issue, creating unnecessary noise.

Any (1 of 2 Test Rounds)

This setting triggers as soon as one of the last two test rounds meets the conditions, so it fires as quickly as 1 of 1. What it reduces is how readily the alert clears and re-triggers, and therefore how many notifications are sent.

Round Alert Condition Met? Alert Status
1 Alert is triggered
2 Alert status unchanged (triggered)
3 Alert status unchanged (triggered)
4 Alert status unchanged (triggered)

With this setting, the alert remains triggered even if one round doesn't meet the conditions. The alert clears only after two consecutive rounds fail to meet the criteria.
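The difference between the two settings can be sketched in a few lines of Python. This is only an illustration, assuming the alert is active whenever at least X of the last Y rounds met the conditions; the function name `simulate_alert` and the sample round data are hypothetical and are not part of ThousandEyes.

```python
from collections import deque


def simulate_alert(rounds_met, required, window):
    """Return the alert state (True = active) after each test round.

    Assumption: the alert is active whenever at least `required` of the
    last `window` rounds met the alert conditions.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` rounds
    states = []
    for met in rounds_met:
        recent.append(met)
        states.append(sum(recent) >= required)
    return states


# Hypothetical sequence: did each round meet the alert conditions?
rounds = [True, False, True, False, False, True]

print(simulate_alert(rounds, required=1, window=1))
# [True, False, True, False, False, True]
print(simulate_alert(rounds, required=1, window=2))
# [True, True, True, True, False, True]
```

With 1 of 1 the alert flips on every change in the input, which is where the repeated notifications come from. With 1 of 2 it stays active through isolated misses and clears only after two misses in a row.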

Any or The Same Agents?

You can configure the alert to trigger based on any agent meeting the conditions or require the same agent to meet the conditions for a specified number of rounds.

Any (2 of 2 Test Rounds)

Round New York, NY Los Angeles, CA Alert Status
1 Not triggered
2 Not triggered
3 Triggered
4 Status unchanged
5 Cleared

In this example, the alert triggers once the conditions have been met in two consecutive rounds, regardless of which agent met them in each round.

The Same (2 of 2 Test Rounds)

Round New York, NY Los Angeles, CA Alert Status
1 Not triggered
2 Not triggered
3 Not triggered
4 Triggered
5 Cleared

Here, the alert triggers only when the same agent (Los Angeles, CA) meets the conditions for two consecutive rounds.
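The contrast between Any and The Same can be sketched as follows. This is a simplified model, not the ThousandEyes implementation: it covers only the case where every round in the window must match (such as 2 of 2), the function names and the `min_agents` parameter are hypothetical, and the round history is invented for illustration rather than taken from the tables above.

```python
def triggers_any(history, rounds_required, min_agents=1):
    """'Any' mode: each of the last `rounds_required` rounds had at least
    `min_agents` agents meeting the conditions. The agents may differ
    from round to round. Assumes rounds_required >= 1."""
    recent = history[-rounds_required:]
    if len(recent) < rounds_required:
        return False
    return all(len(agents) >= min_agents for agents in recent)


def triggers_same(history, rounds_required, min_agents=1):
    """'The Same' mode: at least `min_agents` agents met the conditions
    in every one of the last `rounds_required` rounds."""
    recent = history[-rounds_required:]
    if len(recent) < rounds_required:
        return False
    common = set.intersection(*recent)  # agents present in every round
    return len(common) >= min_agents


# Hypothetical history: the set of agents that met the conditions per round.
history = [
    {"New York, NY"},
    {"Los Angeles, CA"},
    {"Los Angeles, CA"},
    set(),
]

for n in range(1, len(history) + 1):
    seen = history[:n]
    print(n, triggers_any(seen, 2), triggers_same(seen, 2))
# 1 False False
# 2 True False
# 3 True True
# 4 False False
```

In this invented history, Any fires in round 2 because a different agent met the conditions in each of the two rounds, while The Same waits until round 3, when Los Angeles, CA has met them twice in a row.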

Percent of Agents or Number of Agents?

  • Specifying the exact number of agents is suitable for dedicated test alerts but less scalable as the number of tests increases.
  • Using a percentage of agents is more scalable, but remember that ThousandEyes always uses whole numbers, rounding partial agents up.

Example: With a 10% threshold and 18 agents, the rule triggers only if two agents meet the criteria (not 1.8).
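The arithmetic behind this example can be checked with a short calculation. The sketch below assumes that a fractional agent count is rounded up to the next whole agent, which is what the 18-agent example implies; the function name `agents_required` is hypothetical.

```python
def agents_required(percent, total_agents):
    """Number of agents that must meet the conditions.

    Assumption: a fractional result is rounded up to a whole agent,
    matching the example of 10% of 18 agents requiring 2 agents.
    `percent` is a whole-number percentage (for example 10 for 10%).
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    return (total_agents * percent + 99) // 100


print(agents_required(10, 18))  # 2 (10% of 18 is 1.8)
print(agents_required(10, 20))  # 2 (10% of 20 is exactly 2)
print(agents_required(10, 5))   # 1 (10% of 5 is 0.5)
```

Running the same calculation for the smallest and largest tests that share a rule shows whether a single percentage gives a sensible agent count across all of them.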

Alert Sensitivity

Start with a more sensitive alert configuration and gradually reduce sensitivity to minimize noise while ensuring you capture all critical events.

Alert Rule Validation

Validate Individual Alert Conditions

Understand each alert metric and its information source to build effective alert rules. Knowing when and if a given condition will trigger for any test is crucial for avoiding frustration when creating complex rules.

Validate the Set of Conditions

The Global section lets you choose whether Any or All of the alert conditions must be met for the alert to trigger.

  • Any: The alert triggers if any of the specified local alert conditions are met. This is a logical OR operation. Use this option to create a single alert rule that captures multiple issues. Group similar metrics together (e.g., one rule for network metrics, another for page load and transaction test status).
  • All of: The alert triggers only if all specified local alert conditions are met. This is a logical AND operation. Use this option to reduce alert noise by creating more complex condition sets. The more conditions required, the less likely the alert will trigger.
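The OR and AND behavior described above can be illustrated with a small sketch. The metric names, thresholds, and the `rule_matches` function are hypothetical and chosen only for the example; they do not correspond to a ThousandEyes API.

```python
def rule_matches(metrics, conditions, mode):
    """Evaluate a set of alert conditions against one round of metrics.

    mode == "any": logical OR across conditions.
    mode == "all": logical AND across conditions.
    """
    results = [condition(metrics) for condition in conditions]
    if mode == "any":
        return any(results)
    if mode == "all":
        return all(results)
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical network conditions grouped in a single rule.
conditions = [
    lambda m: m["packet_loss_pct"] >= 5.0,
    lambda m: m["latency_ms"] >= 150.0,
]

# Hypothetical measurements from one test round.
sample = {"packet_loss_pct": 8.0, "latency_ms": 90.0}

print(rule_matches(sample, conditions, "any"))  # True: packet loss alone is enough
print(rule_matches(sample, conditions, "all"))  # False: latency is under its threshold
```

The same measurements trigger the Any rule but not the All of rule, which is why adding conditions under All of makes a rule quieter.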

Testing Alert Rules

To validate individual alert conditions, create a test alert rule with the most sensitive global condition: any one agent, 1 of 1 test rounds. The alert then triggers whenever the specific condition is met, so you can verify its behavior directly.

Disable Notifications for Test Alerts

Consider disabling notifications for test alert rules to avoid spamming your team while validating alert configurations.

Common Questions

1. Why did my alert not trigger?
  • Check the alert rule's enabled status.
  • Verify that the alert rule is associated with the correct tests, agents, monitors, or devices.
  • Review the global alert condition settings to ensure they are appropriately configured.
  • Examine the individual alert condition thresholds and confirm they are set to capture the desired events.
2. Why is my alert triggering so often?
  • Consider increasing the number or percentage of agents required to trigger the alert.
  • Adjust the alert rule to require more rounds of data to exceed the threshold.
  • Use the "The Same" agent setting to require the same agent to meet the conditions for multiple consecutive rounds.
  • Review and adjust the individual alert condition thresholds to reduce sensitivity.

Sample Questions

4.4 Question 1

An alert rule for a Web - HTTP Server test is not triggering when the HTTP response code is 500 Internal Server Error. The alert conditions are configured with "Response Code" set to "any error (>= 400 or no response)". What could be causing the alert to not fire?

  • A) The alert rule is disabled
  • B) The test is not enabled on any Enterprise Agents
  • C) The alert rule's "Settings" section does not have the correct test selected
  • D) The HTTP server is returning a 200 OK response code

4.4 Question 2

A CPU utilization alert for Endpoint Agents is triggering too frequently, creating alert noise. Which of the following steps would help reduce the sensitivity of the alert rule?

  • A) Increase the number of agents that must exceed the CPU threshold to trigger the alert
  • B) Lower the CPU utilization percentage in the alert condition
  • C) Adjust the alert rule to require more rounds of data to exceed the threshold
  • D) Enable the alert rule on more Endpoint Agents