Health Monitoring

The Health Monitoring app (formerly Open Component Monitoring) in SAP Focused Run supports Application & System Monitoring by providing additional monitoring metrics that go beyond standard system monitoring. Use Health Monitoring when the configuration differs widely between managed objects. In addition, Health Monitoring provides low-barrier monitoring for managed objects that are not included in the Landscape Management Database (LMDB).

Health Monitoring provides the following types of metrics for monitoring managed objects:

For all metric types except Cloud Service metrics, the monitoring data of a metric is collected by a Simple Diagnostics Agent (SDA). Depending on the metric type, the SDA used is installed either on a host of the monitored object or on a host in the collection group in which the monitored object is located.

A collection group corresponds either to a customer network or to a subnetwork of a customer network. Collection groups of the first kind are available in Health Monitoring automatically if the relevant customer networks are defined in SAP Focused Run. You can create, change, and delete collection groups of the second kind in the Health Monitoring app.

For Cloud Service metrics, how monitoring is performed depends on the type of data collection. With pull data collection, monitoring is performed through integration with the Expert Scheduling Management Cockpit and an ABAP collection job (frequency: five minutes). With push data collection, the managed service sends the metrics directly to the Health Monitoring app.

Configuration: General

Activate Collection Group

Many Health Monitoring metrics are unmodeled, which means they are not assigned to a specific technical system in the landscape. Nevertheless, they need to be executed by a Simple Diagnostics Agent, which must be located inside the collection group. It is therefore necessary, as a first step, to define a central Simple Diagnostics Agent in the collection group which should execute the Health Monitoring metrics.

To activate a collection group, proceed as follows:

  • Choose the Configuration button in the top right corner of the Health Monitoring app.
  • Expand the Collection Group area.
  • You can see all the collection groups in scope and their status. Choose the switch next to the collection group to activate it.
  • Select the Simple Diagnostics Agent that should be used by Health Monitoring and choose Save.
  • The collection group is now active and can be used by Health Monitoring.
  • Note: You can choose the Edit button to assign another Simple Diagnostics Agent, if necessary.


Create New Metrics

Perform the following steps to create a new Health Monitoring metric:

  • Choose the corresponding tab on the left of the screen (for example, Availability to create an Availability metric).
  • Choose the + button.
  • Select a metric type (only relevant for Availability and Application Check metrics).
  • Maintain metric and alert attributes (see the table below for descriptions of parameters shared by all metrics).
  • Choose Save.
  • Your metric is now active and is visible in the Health Monitoring application.

If you have to create or change multiple Availability metrics simultaneously, you can use the mass maintenance functionality in Health Monitoring (available since SAP Focused Run 3.0 SP00).

For further details, see the guide Mass Maintenance of Availability Metrics in Health Monitoring.

For details of how to create OS Script metrics, see the document Creating OS Script Metrics with Health Monitoring.

Here is a list of the parameters used by all metrics except Cloud Service metrics in the Health Monitoring app. For information about Cloud Service metrics, see here.

Field | Description
Metric Name | A descriptive name for the metric. It is advisable to choose a name that can be easily understood by others.
Collection Group | The collection group in which the metric is created.
Collection Interval / Collection Frequency | How often the metric is collected.
Threshold | The metric threshold. For each metric, you specify a threshold and what happens when this threshold is reached – for example, set the status of the metric to red (error).
Threshold Delay | Optional configuration setting for the threshold. When a threshold delay is defined, an average is calculated from the latest collected values that are no older than the threshold delay. If, for example, the threshold delay is 30 minutes, the values collected in the last 30 minutes are included in the average. The metric status is then determined by comparing the average value with the threshold value.
Metric Documentation | Additional information about the metric.
Alert Active | Parameter that allows an alert to be triggered when the metric fails.
Alert Name | The name of the alert. It is advisable to choose a descriptive name that can be easily understood by others.
Severity | Severity of the alert in Alert Management. Maintain a value between 0 (very low) and 9 (critical).
Notification Variant (optional) | Selection of a notification variant for sending an additional e-mail for the alert, if required.
Additional Attributes | An attribute and an attribute value (optional) add additional information to a metric. You can also add multiple attribute/value pairs to a metric. For example, you could specify an attribute SID and an attribute value FRN for a metric to specify that this metric refers to an SAP system with the system ID FRN. You can use additional attributes to group the data displayed (for example, in the Attribute Overview tab or in a table).
Outbound Connector Variant (optional) | Selection of an outbound connector for forwarding the alert via a BAdI implementation, if required.
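
The effect of the Threshold Delay parameter can be illustrated with a small Python sketch. It is not part of SAP Focused Run. The function name, the sample data, and the rule that a value at or above the threshold is rated red are assumptions made for illustration only; the actual comparison depends on the metric and its threshold settings.

```python
from datetime import datetime, timedelta

def rate_with_threshold_delay(samples, now, delay_minutes, threshold):
    """Average the values collected within the threshold delay and rate the result.

    samples: list of (timestamp, value) tuples.
    Returns "red" if the average reaches the threshold (assumed rule),
    "green" otherwise, and None if no value lies inside the window.
    """
    window_start = now - timedelta(minutes=delay_minutes)
    # Only values that are no older than the threshold delay are considered.
    recent = [value for ts, value in samples if window_start <= ts <= now]
    if not recent:
        return None
    average = sum(recent) / len(recent)
    return "red" if average >= threshold else "green"

now = datetime(2024, 1, 1, 12, 0)
samples = [
    (now - timedelta(minutes=45), 99.0),  # older than 30 minutes, ignored
    (now - timedelta(minutes=20), 70.0),
    (now - timedelta(minutes=10), 80.0),
    (now, 90.0),
]
print(rate_with_threshold_delay(samples, now, 30, 85.0))
```

In this sample, the latest single value (90) lies above the assumed threshold of 85, but the average of the values from the last 30 minutes does not, so a short spike does not change the rating.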

For details of individual metrics and their additional parameters, see here.

Configuration: Cloud Services

To find out more about configuring cloud services, see Configuration of Cloud Services for Health Monitoring.

Configuration: Examples

Groupware Connector

You can create the following metrics to monitor the availability of a Groupware Connector Server:

Groupware Connector

Create a new Windows Services metric with the following parameters:

  • Metric Name: Groupware Connector
  • Service Name: MsxGwConnector7.0_0
  • Host Name: <host name of groupware server>
  • Additional Attributes: Groupware
  • Collection Interval: 1 Minute
  • Alert Name: Groupware Connector not running

Groupware MS Proxy

Create a new Windows Services metric with the following parameters:

  • Metric Name: Groupware MS Proxy
  • Service Name: MsxGwProxy7.0_0
  • Host Name: <host name of groupware server>
  • Additional Attributes: Groupware
  • Collection Interval: 1 Minute
  • Alert Name: Groupware MS Proxy not running

Other configuration examples are available in our SAP Focused Run - Internet Demo System.

Data Quality Indicators

You can use data quality indicators to call up detailed information about the success of data collection for Simple Diagnostics Agents and Cloud Services. To display this information, choose the red status icon at the top right of the UI.


In the Agents area, you can see the status of all Simple Diagnostics Agents configured in a collection group. In addition, the Cloud Services area displays the following information:  

  • Timestamp of last successful data collection and (if applicable) technical message and return code: To display this information in a pop-up, choose the rating icon in the Status column.
  • The data collection log in the Expert Scheduling Management Cockpit (services with pull data collection only): To display the cockpit, choose the Data Collection Log link. 

You can also view data quality indicators for cloud services in the following parts of the app:

  • Cloud Services section of the configuration (data collection status only) 
  • Details of configured cloud services 
  • Cloud Services monitoring page 

Background Jobs in Health Monitoring

The background jobs used in the Health Monitoring app are displayed on the Infrastructure panel:

  • Load Balancing
    This job distributes the load involved in collecting monitoring data between the agents assigned to a collection group. If an agent becomes unavailable, the job detects this and moves the metric configurations from the failed agent to the other agents assigned to the same collection group. In addition, it checks whether one agent is collecting considerably more metrics than the others assigned to a collection group. If that is the case, the job redistributes the metric configurations among these agents so that each of them collects roughly the same number of metrics.
    The name of the load balancing job is SAP_FRN_OCM_AGENTLOADBALANCE. It runs every 5 minutes.
  • Event Calculation  
    For a detailed explanation, see the Event Calculation section below.
    The name of the Event Calculation job is SAP_FRN_OCM_EVENT_CALC. It runs every minute.
  • Housekeeping
    For a detailed explanation, see the Housekeeping section below.
    The name of the housekeeping job is SAP_FRN_OCM_HOUSEKEEPING. It runs once per day. The run writes messages directly into the job log. To view these messages, go to transaction SM37 (Job Overview), enter the job name SAP_FRN_OCM_HOUSEKEEPING, and then choose Execute. Click on the job execution and then choose Display job log.
  • Aggregation 
    For a detailed explanation, see the Housekeeping section below.
    The name of the aggregation job is SAP_FRN_OCM_AGGREGATION. It runs hourly. The run writes messages directly into the job log. To view these messages, go to transaction SM37 (Job Overview), enter the job name SAP_FRN_OCM_AGGREGATION, and then choose Execute. Click on the job execution and then choose Display job log. 
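
The behavior described for the Load Balancing job can be sketched in Python as follows. This is an illustration under assumptions, not the implementation of SAP_FRN_OCM_AGENTLOADBALANCE: the function name and data structures are hypothetical, and "considerably more" is interpreted here as a difference of more than one metric configuration.

```python
def rebalance(assignments, available_agents):
    """Return a new agent -> metric list mapping for one collection group.

    assignments: dict mapping agent name -> list of metric configurations.
    available_agents: set of agent names that are currently available.
    """
    agents = sorted(available_agents)
    if not agents:
        raise ValueError("no available agent in the collection group")

    # Keep the current assignments of the agents that are still available.
    balanced = {agent: list(assignments.get(agent, [])) for agent in agents}

    # Move metric configurations from failed agents to the least-loaded agent.
    orphaned = [
        metric
        for agent, metrics in assignments.items()
        if agent not in available_agents
        for metric in metrics
    ]
    for metric in orphaned:
        target = min(agents, key=lambda a: len(balanced[a]))
        balanced[target].append(metric)

    # Even out the load until no agent holds noticeably more than another
    # (assumed tolerance: a difference of at most one configuration).
    while True:
        busiest = max(agents, key=lambda a: len(balanced[a]))
        idlest = min(agents, key=lambda a: len(balanced[a]))
        if len(balanced[busiest]) - len(balanced[idlest]) <= 1:
            break
        balanced[idlest].append(balanced[busiest].pop())
    return balanced

current = {
    "agent_a": ["m1", "m2", "m3", "m4"],
    "agent_b": ["m5"],
    "agent_c": ["m6"],  # this agent has become unavailable
}
result = rebalance(current, {"agent_a", "agent_b"})
print({agent: len(metrics) for agent, metrics in result.items()})
```

In the sample, the configuration of the unavailable agent is taken over first, and the remaining imbalance between the two available agents is then evened out.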

Monitor Execution Status of Jobs and Data Collection Status of Cloud Services and Agents

As of SAP Focused Run 4.0 FP03, the Load Balancing job sends the execution status of the Event Calculation and Housekeeping jobs to the Self-Monitoring Dashboard. 

As of SAP Focused Run 5.0 SP00, the Load Balancing job also sends the execution status of Cloud Services and the status of Agents assigned to Collection Groups to the Self-Monitoring Dashboard.

To display the status, start the app Self-Monitoring Dashboard from the launchpad, select the page Central Components, and expand the section Health Monitoring.


If any Cloud Service status is red, the status of the parent node Cloud Services also becomes red, and an alert is generated. For pull Cloud Services, choose the Data Collection Log icon to navigate to the Expert Scheduling Management Cockpit and see details of previous data collections.


For Collection Groups, note that the agent status is collected and displayed in the Self-Monitoring Dashboard only for active Collection Groups. If any agent status of the displayed Collection Groups is red, or if no agent status was collected at all, the status of the parent node Collection Group: Agent Status Information also becomes red, and an alert is generated.
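
The rule for the parent node of the Collection Groups can be expressed as a short Python sketch. The function name and the data layout are hypothetical, and the sketch assumes that a missing agent status is evaluated per active Collection Group; it only restates the rule above and does not reflect the actual dashboard implementation.

```python
def parent_node_is_red(collection_groups):
    """Return True if the parent node should be rated red.

    collection_groups: list of dicts with the keys
    "name", "active" (bool), and "agent_statuses" (list of ratings).
    """
    # Inactive collection groups are not displayed and therefore ignored.
    active_groups = [group for group in collection_groups if group["active"]]
    for group in active_groups:
        statuses = group["agent_statuses"]
        # Red if no agent status was collected or any agent is red.
        if not statuses or "red" in statuses:
            return True
    return False

groups = [
    {"name": "Network A", "active": True, "agent_statuses": ["green", "green"]},
    {"name": "Network B", "active": False, "agent_statuses": ["red"]},
    {"name": "Network C", "active": True, "agent_statuses": []},
]
print(parent_node_is_red(groups))
```

In the sample, the red agent in the inactive group has no effect, but the active group without any collected agent status turns the parent node red.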

Event Calculation

The Event Calculation job calculates the ratings (green, yellow, red) for Availability, Application Check, and Cloud Service metrics. If metrics are rated yellow or red, the job creates alerts for them based on the threshold and alert settings for the metrics.
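
As a rough illustration of this behavior, the following Python sketch selects the metrics for which an alert would be created. It is a simplified model with hypothetical field names, not the logic of the Event Calculation job itself; in particular, it assumes that an alert is only created if the Alert Active parameter is set for the metric.

```python
def alerts_to_create(metrics):
    """Return alert entries for all metrics rated yellow or red.

    metrics: list of dicts with the keys "metric_name", "rating",
    "alert_active", "alert_name", and "severity".
    """
    return [
        {
            "alert_name": metric["alert_name"],
            "severity": metric["severity"],
            "rating": metric["rating"],
        }
        for metric in metrics
        # Green metrics and metrics without an active alert are skipped.
        if metric["alert_active"] and metric["rating"] in ("yellow", "red")
    ]

rated_metrics = [
    {"metric_name": "URL check", "rating": "green", "alert_active": True,
     "alert_name": "URL not reachable", "severity": 7},
    {"metric_name": "Port check", "rating": "red", "alert_active": True,
     "alert_name": "Port not reachable", "severity": 9},
    {"metric_name": "Ping", "rating": "yellow", "alert_active": False,
     "alert_name": "Host slow", "severity": 3},
]
for alert in alerts_to_create(rated_metrics):
    print(alert)
```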

The Event Calculation job runs every minute. 

To see the status of the last run and the scheduled date/time of the next run, do the following:

In the Health Monitoring app, choose Configuration (gear icon) at the top right of the screen and then open the Infrastructure panel.


Here, you can see the last and next run of the Event Calculation job. 
The line Event Calculation Last Run displays the last execution date and time of the job together with its status (red, yellow, green). 

To avoid alerts being created too late, SAP recommends a maximum job runtime of one minute. By default, the job uses 10 percent of the free work processes of a server (instance) of an SAP Focused Run system. You can reduce the job runtime by increasing the percentage value. To set the percentage value, run the report OCM_EVENT_CALC_SET_PARAMETER.  

Bear in mind, however, that the percentage value can influence the response time of the SAP Focused Run system. This is because the more work processes a job uses, the fewer work processes will be available for other users and other jobs on a server (instance) of the system.
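
The trade-off can be illustrated with simple arithmetic. The Python sketch below is illustrative only: the function name is hypothetical, and rounding down to whole work processes is an assumption, because the exact rounding used by the job is not described here.

```python
def split_work_processes(free_work_processes, percentage=10):
    """Return (used_by_job, left_for_others) for a given percentage.

    Assumption: the result is rounded down to whole work processes.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    used = free_work_processes * percentage // 100
    return used, free_work_processes - used

for pct in (10, 25, 50):
    used, left = split_work_processes(40, pct)
    print(f"{pct}% of 40 free work processes: job uses {used}, {left} remain")
```

A higher percentage shortens the job runtime at the cost of fewer work processes for other users and jobs on the same server (instance).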

If your SAP Focused Run system consists of multiple servers (instances), you can specify the instance on which the job is executed. The job uses the server group FRN_RFC_HM to determine the server (instance) on which it is executed. To change the server (instance) of server group FRN_RFC_HM, go to transaction RZ12, double-click on server group FRN_RFC_HM, and then enter a different server (instance).

Housekeeping

The SAP_FRN_OCM_HOUSEKEEPING job deletes data that is no longer required from the Health Monitoring database tables (see the Master Guide for SAP Focused Run). The housekeeping job runs once a day.

To view the status of the last run and the scheduled date/time of the next run, proceed as follows:

In the Health Monitoring app, choose the Configuration button (gear icon) at the top right of the screen and then open the Infrastructure panel.



The Configuration area of the UI, containing the Infrastructure panel, now also includes a Housekeeping section. 

 


The configuration tables are divided into two sections: Raw Data and Aggregated Data.

The value in the Raw Data field specifies how long monitoring data is kept in the Health Monitoring database tables – for example, 180 days. In this case, any monitoring data required for detailed or collector charts that is older than today minus 180 days is aggregated and deleted. Other raw data is simply deleted.

The value in the Aggregated Data field specifies how long aggregated monitoring data is kept in the Health Monitoring database tables – for example, 720 days. If the aggregated monitoring data is older than today minus 720 days, it is deleted.
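
The two lifetimes translate into cutoff dates as shown in the following Python sketch. The function names are hypothetical, and the sketch only mirrors the rules described above; it is not the housekeeping implementation.

```python
from datetime import date, timedelta

def retention_cutoffs(today, raw_days=180, aggregated_days=720):
    """Return the dates before which raw and aggregated data are expired."""
    return {
        "raw": today - timedelta(days=raw_days),
        "aggregated": today - timedelta(days=aggregated_days),
    }

def action_for_raw_record(record_date, today, raw_days=180, needed_for_charts=False):
    """Decide what happens to one raw data record."""
    cutoff = today - timedelta(days=raw_days)
    if record_date >= cutoff:
        return "keep"
    # Expired data needed for detailed or collector charts is aggregated
    # before deletion; other expired raw data is simply deleted.
    return "aggregate and delete" if needed_for_charts else "delete"

today = date(2024, 7, 1)
print(retention_cutoffs(today))
print(action_for_raw_record(date(2024, 1, 2), today, needed_for_charts=True))
```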

If you've already customized raw data and aggregated values in an earlier feature pack, your legacy values are still available. The UI always displays the lowest configured value. If you make any changes to values on the UI, all values that depend on the relevant type (raw or aggregate) are updated.

To maintain housekeeping, change the values in the Housekeeping section of the Configuration panel and save them.

Note: When you first log on to a newly installed SAP Focused Run system and open the Housekeeping section, the Aggregated Data and Raw Data fields are empty. These fields are automatically populated with the default values when the job SAP_FRN_OCM_HOUSEKEEPING is executed for the first time. This job runs daily (see the Master Guide for SAP Focused Run).

You can maintain the housekeeping settings for each of your SAP Focused Run systems, or you can transport the settings – for example, from a test system to a production system. To transport settings, go to transaction SM30 (Extended Table Maintenance), enter OCM_HKCONFIG in the Table/View field and choose the Maintain button. Next, select the table rows and choose Table View > Transport to store the settings in a Customizing request. You can now transport the Customizing request from your test system to your production system, for example.

Caution

If you adjust the housekeeping configuration, it's advisable not to make large changes (for example, double-digit reductions to the number of days).

If changes of this kind are unavoidable, be aware that housekeeping will be done step by step to avoid performance issues.

This means that aggregation will be done over a period of days until the change has been taken into consideration.

For example, if you change the housekeeping value from 180 to 100 days, housekeeping will run over a number of days until the difference between the oldest database entry and the newly configured value no longer affects performance.

Separate Aggregation job as of SAP Focused Run 5.0 SP00

Aggregation of monitoring data was moved from the daily Housekeeping job to a separate job, SAP_FRN_OCM_AGGREGATION, which is executed hourly (see the Master Guide for SAP Focused Run).
Data aggregation is performed every hour according to the raw data lifetime setting in the Configuration area of the UI. For performance reasons, the aggregation job processes a maximum of 12 hours of raw data per run by default.
You can change this setting using the report OCM_AGGREGATION_SET_PARAMETER. To change the maximum number of hours to be aggregated, execute this report in transaction SE38 and enter a new number of hours, for example, 24. Note that increasing the maximum number of hours to be aggregated might lead to a longer runtime of the job SAP_FRN_OCM_AGGREGATION.
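
To get a feeling for the effect of the maximum number of hours per run, the following Python sketch estimates how many hourly runs are needed to work through a backlog of raw data. It is a simplified model with a hypothetical function name: it assumes that each run processes up to the configured maximum and that one further hour of raw data becomes due before the next hourly run. Actual runtimes depend on the system and data volume.

```python
def runs_to_catch_up(backlog_hours, max_hours_per_run=12):
    """Estimate the number of hourly aggregation runs needed for a backlog."""
    if max_hours_per_run < 2:
        raise ValueError("max_hours_per_run must be at least 2 in this model")
    runs = 0
    remaining = backlog_hours
    while remaining > 0:
        # One run processes at most max_hours_per_run hours of raw data.
        remaining -= min(remaining, max_hours_per_run)
        runs += 1
        if remaining > 0:
            # Before the next hourly run, one more hour of raw data is due.
            remaining += 1
    return runs

# Example backlog: raw data lifetime reduced from 180 to 100 days.
backlog = (180 - 100) * 24
print(runs_to_catch_up(backlog, 12))
print(runs_to_catch_up(backlog, 24))
```

Under these assumptions, doubling the maximum number of hours roughly halves the number of runs needed, while each individual run has more data to process.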


Deprecated Functionalities

While we strongly recommend using the new Configuration panel on the UI of the Health Monitoring app, you can still configure housekeeping settings via the database, as in earlier feature packs.

To maintain housekeeping settings in database table ocm_hkconfig, do the following:

  • Go to transaction SM30 (Extended Table Maintenance) and enter OCM_HKCONFIG in the Table/View field. Then choose the Maintain button.
  • In the Store Table Name field, enter the Health Monitoring database table from which data is to be deleted (for example, ocm_mon_raw). In the field Partition Field Name, specify a date field of the database table. The lifespan is calculated based on the value in this field. In the Lifespan field, enter a lifespan in days. Then choose Save.

Application Logging

Different components of the Health Monitoring application write messages to the application log. To display the log entries, go to transaction SLG1 (Application Log).
In the Object field, select the value FRUN_OCM. Under Time Restriction, define a time frame using the fields From (Date/Time) and To (Date/Time). Then select one of the following values in the Subobject field and choose Execute:

  • CONFIGURATION
    When a metric configuration is deleted, messages with this subobject are written to the application log.
  • CONF_CLOUDSRV
    When Cloud Services are changed, the application log contains messages about a Cloud Service being created, deleted, activated, or deactivated.
  • CONF_COLLGROUP
    When changes are made to a Collection Group or a Namespace, the application log contains messages about the change, such as:
    • A Namespace or a Collection Group is created or deleted.
    • A Namespace or a Collection Group is activated or deactivated.
    • Agents are assigned to or unassigned from a Namespace or a Collection Group.
  • CONF_CONTENTUPDATE
    When you import SAP content into Health Monitoring, messages with this subobject are written to the application log. Examples are "Import SAP content version <A>, FP <B>" for RCD content updates and "Import XALM content file <A>" for XALM content updates.
  • CONF_HOUSEKEEPING
    When Housekeeping settings are changed, the application log contains messages about the change, for example, whether the lifetime of the aggregated data or the lifetime of the raw data has been changed.
  • CONSISTENCY_CHECK
    The consistency check analyzes references from Health Monitoring configurations to objects outside the Health Monitoring application. For example, it checks whether a metric configuration contains an additional context that no longer exists in the Landscape Management Database (LMDB). The consistency check is executed by the housekeeping job, which runs daily, and writes messages with this subobject to the application log.
  • EVENT_CALC
    The event calculation job writes messages with this subobject to the application log.
  • LOADBALANCE
    The load balancing job writes messages with this subobject to the application log.
  • MON_DATA_INBOUND
    When monitoring data is received in Health Monitoring, messages with this subobject are written to the application log, for example, when monitoring data is received from a Simple Diagnostics Agent.