Dynamic Threshold

Introduction

In system monitoring, setting fixed thresholds for alerts is the long-established standard. However, this approach frequently fails to accommodate the dynamic characteristics of contemporary IT environments. Dynamic (thresholdless) alerting offers several notable advantages over traditional fixed thresholds.

Dynamic alerting leverages advanced analytics, machine learning, and other sophisticated techniques to detect anomalies and potential issues in real time without the need for predefined thresholds. The system can therefore adapt to changing conditions, user behavior, and workloads that vary over the course of the day, providing a more accurate and responsive monitoring framework.

One of the primary advantages is the reduction in false positives and negatives. Fixed thresholds can often be either too strict, resulting in a flood of alerts, or too permissive, leading to missed critical issues. Dynamic thresholding, by learning the normal behavior of a system, can better distinguish between normal fluctuations and genuine anomalies, improving the signal-to-noise ratio and ensuring that only actionable alerts are generated.

Another key benefit is the ability to anticipate and predict issues before they escalate. By understanding the patterns and trends in system behavior, dynamic alerting can provide proactive insights that allow for early intervention and preventative maintenance. This not only helps in maintaining system uptime but also in optimizing resource allocation and performance.

In addition, dynamic alerting can enhance operational efficiency. It reduces the need for constant manual adjustments to thresholds, freeing up IT personnel to focus on more strategic tasks. It also provides a more comprehensive view of system health, enabling faster and more effective decision-making.

Beginning with SAP Focused Run 5.0 SP00, the dynamic threshold approach has been integrated into the System Monitoring scenario, offering an alternative to the static threshold method. This new approach leverages additive model time series analysis, which is a feature of the SAP HANA Predictive Analysis Library (PAL). The model is commonly known as Prophet forecasting.
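To make the idea of an additive model more concrete, the following is a minimal sketch in plain Python. It is not the PAL implementation and not Prophet itself: it only decomposes a metric into an overall level, a repeating seasonal offset, and a residual, and then places threshold lines above the expected value. The function name, the period of 24 samples, and the multipliers yellow_k and red_k are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def dynamic_thresholds(values, period=24, yellow_k=2.0, red_k=3.0):
    """Return (yellow, red) upper threshold lists, one entry per slot of the period.

    Illustrative additive model: value = level + seasonal offset + residual.
    All parameter names and defaults are hypothetical.
    """
    if len(values) < 2 * period:
        raise ValueError("need at least two full periods of history")

    # Overall level of the metric.
    level = mean(values)
    # Seasonal offset per slot (for example, hour of day), relative to the level.
    seasonal = [mean(values[slot::period]) - level for slot in range(period)]
    # Residuals are what the level and the seasonal component do not explain.
    residuals = [v - level - seasonal[i % period] for i, v in enumerate(values)]
    spread = pstdev(residuals)

    # Thresholds sit a multiple of the residual spread above the expected value.
    expected = [level + s for s in seasonal]
    yellow = [e + yellow_k * spread for e in expected]
    red = [e + red_k * spread for e in expected]
    return yellow, red
```

Because the thresholds follow the seasonal component, a value that is normal during peak hours can still be rated as an anomaly when it occurs at night, which a single fixed threshold cannot express.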

Usage

The results can be seen in the System Monitoring application. By default, the Metric Monitor displays the calculated dynamic threshold lines in addition to the monitored values.

In general, any numeric metric, whether standard or custom, can be set up for dynamic thresholding if the necessary conditions are fulfilled. However, the usefulness of the threshold values for a specific metric depends on the data being reported. For instance, metrics that mostly report zero as the monitored value, such as the number of errors or long-running processes, will not yield meaningful dynamic thresholds. It is therefore important to select suitable metrics.
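A quick way to screen out metrics that mostly report zero is to look at the share of zero samples in the collected values. The sketch below is only an illustration; the function name and the 50 % cut-off are assumptions, not values used by the product.

```python
def mostly_zero(values, max_zero_share=0.5):
    """Return True if more than max_zero_share of the samples are zero.

    Hypothetical helper; the cut-off is an arbitrary illustrative choice.
    """
    if not values:
        raise ValueError("no samples to evaluate")
    zero_share = sum(1 for v in values if v == 0) / len(values)
    return zero_share > max_zero_share


# An error counter that is usually zero is a poor candidate.
error_count = [0, 0, 0, 2, 0, 0, 0, 0]
# A response time in ms that varies with load is a better one.
response_time = [850, 920, 1400, 1100, 780, 990, 1600, 1250]
print(mostly_zero(error_count), mostly_zero(response_time))
```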

Below is an example of calculated dynamic thresholds for the ABAP Dialog Response Time metric, where the load is continuously measured.

Remark: After a reconfiguration, it can take up to one hour until the switch from the static to the dynamic approach becomes visible.

Under "Metric Details" you can find additional information on whether a dynamic threshold is used and which method is applied:

Threshold Type: Numeric Threshold(Green/Yellow/Red)
Green to Yellow: 2000 ms
Yellow to Red: 3000 ms
Use Best rating of Last N: false
Use Dynamic Threshold: true
Dynamic Threshold Method: Use dynamic threshold only
Direction: Exceeds
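The following sketch illustrates how a numeric Green/Yellow/Red threshold with direction "Exceeds" could be evaluated. It is an assumption for illustration, not the product's rating logic: the function name is hypothetical, and whether a value exactly on a boundary keeps the lower rating is a guess. The same comparison is assumed to apply whether the two boundaries come from the static configuration or from the dynamic calculation.

```python
def rate_exceeds(value, green_to_yellow, yellow_to_red):
    """Rate a value against two upper boundaries (direction "Exceeds").

    Hypothetical helper: a value is assumed to change rating only when it is
    strictly greater than the boundary.
    """
    if green_to_yellow > yellow_to_red:
        raise ValueError("green_to_yellow must not exceed yellow_to_red")
    if value > yellow_to_red:
        return "Red"
    if value > green_to_yellow:
        return "Yellow"
    return "Green"


# Boundaries taken from the Metric Details example above (values in ms).
for response_time_ms in (1500, 2500, 3500):
    print(response_time_ms, rate_exceeds(response_time_ms, 2000, 3000))
```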

Best Practice

It is advisable to enable dynamic thresholding initially for just a few metrics and to apply it to a limited number of systems. Metrics that represent performance with clear numerical values, such as response time metrics, make excellent starting points. If the standard thresholds are currently set as counter types, they should be converted to numeric types before applying dynamic thresholding options.

The prerequisites for determining which metrics are suitable for dynamic thresholding generally include:

Data Variability: The metric should exhibit variability over time, rather than being consistently static or zero.
Sufficient Data: There should be enough historical data to build a meaningful model of the metric's behavior.
Relevant Range: The metric should have values within a range that reflects normal and abnormal operating conditions.
Predictable Patterns: The metric should display identifiable patterns or trends, such as daily, weekly, or seasonal fluctuations.
Impactful Metrics: The metric should be relevant to the performance or health of the system being monitored, so that dynamic thresholds provide meaningful alerts.
Normalization: The metric should be normalized or consistently measured under similar conditions to avoid skewing the thresholding algorithm.
Response to Workload: Metrics that reflect system load or performance under varying conditions often make good candidates for dynamic thresholding.

These prerequisites help ensure that the dynamic thresholds generated are meaningful and useful for monitoring and alerting purposes.
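Some of these prerequisites can be checked on exported metric values before enabling dynamic thresholding. The sketch below covers three of them (sufficient data, data variability, predictable patterns) and uses the autocorrelation at one period of lag as a rough indicator of a repeating pattern. The function names, the 14-period history requirement, and the 0.5 autocorrelation cut-off are assumptions for illustration, not values documented for the product.

```python
from statistics import mean, pstdev


def lag_autocorrelation(values, lag):
    """Autocorrelation of the series with itself shifted by `lag` samples."""
    m = mean(values)
    denom = sum((v - m) ** 2 for v in values)
    if denom == 0:
        return 0.0  # constant series: no variability, hence no pattern
    num = sum((values[i] - m) * (values[i + lag] - m)
              for i in range(len(values) - lag))
    return num / denom


def check_prerequisites(values, period=24, min_periods=14, min_autocorr=0.5):
    """Return a dict of simple pass/fail checks (all limits are hypothetical)."""
    has_pattern = (len(values) > period
                   and lag_autocorrelation(values, period) >= min_autocorr)
    return {
        # Enough history to cover several repetitions of the period.
        "sufficient_data": len(values) >= min_periods * period,
        # The metric must not be constant.
        "variability": pstdev(values) > 0,
        # Values one period apart should resemble each other.
        "predictable_pattern": has_pattern,
    }
```

A metric that passes all three checks is not guaranteed to produce useful thresholds, but one that fails them is unlikely to.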

FAQ

Which data is considered for dynamic threshold calculation?

Is it possible to select the best-case option for the combined approach?

Can I define a threshold sensitivity/offset?

Can I use the dynamic thresholding in combination with best of last N?

Why do I see the static threshold lines on a chart, although the dynamic thresholds are enabled?