Service Availability Management

Service Availability Management in SAP Solution Manager provides availability reporting for business-critical systems, databases, or services. It calculates availability based on outages detected by system monitoring and compares it to the defined availability service levels. Unplanned outages are automatically imported from system monitoring. Planned downtimes are automatically imported from work mode management. The imported outages must be reviewed and adjusted by system administrators and confirmed by IT service managers or other supervisors before they are taken into account in availability reporting. The adjusted downtime data is called service outages. Service Availability Management can be the single source of truth for availability Service Level Agreement (SLA) reporting.

The high level process is as follows:


Service Availability Management consists of the following pages:

  • The Overview page shows the calculated availability for the selected entities.
  • The Outages page shows the outages for the selected entities and lets you create or maintain outages.
  • The Service Availability Definitions page shows the Service Availability Definitions for the selected entities and lets you maintain additional service definitions.
  • The Analytics page provides uptime reporting.

Maintain a Service Definition for each system, database or service that has to be managed by Service Availability Management.

How to open Service Availability Management

  1. Open the Launchpad
  2. Select Technical Administration
  3. Select Service Availability 

Overview/Service Reporting

The Service Reporting page shows the calculated availability for the selected systems, databases, and services in the selected reporting periods. It gives you a quick overview of whether the availability service level agreement was met or breached.


From the Service Reporting page you have the following options:

  • Switch between a monthly or yearly display, depending on the defined reporting period.
  • Select different reporting periods.
  • Select whether the displayed availability is calculated based on confirmed outages only or based on all outages.
  • Select a reporting period to open the Availability Charts for that reporting period and compare the availability of multiple systems or services.
  • Select an entity to open the Availability Charts for this entity and compare current and previous reporting periods.

Availability Charts

From the Availability Charts you have the following options:

  • Compare the availability of multiple systems in one reporting period, or the availability of one system across multiple reporting periods.
  • Select one system to drill down to the availability for months or days and to identify the months or days where outages occurred and the availability dropped.
    (Whether you can drill down to days or drill down to months depends on the reporting period defined in the Service Availability Definition.)


Outage Summary

The Outage Summary view provides an overview of open and confirmed outages and of Service Level Agreement (SLA) compliance.

The following data is shown:

Entity and Entity type

Identifier and type of the system, database or service

Reporting Period

The reporting period for which outages were recorded

SLA Breached

The SLA is breached when the calculated availability based on confirmed outages is below the availability SLA for the reporting period.

Open outages

Outages in status New, In process, or To Be Reviewed. They need to be processed and then either confirmed or hidden.

Confirmed outages

Outages that have been processed and are confirmed

Remaining Downtime Confirmed

Shows how many minutes of downtime are left in the current reporting period before the SLA is breached. If the value is negative, the SLA is already breached. It considers only the confirmed outages.

Remaining Downtime All

Shows how many minutes of downtime are left in the current reporting period before the SLA is breached. If the value is negative, the SLA is already breached. It considers both confirmed and unconfirmed outages.

Availability Confirmed (%)

Shows the calculated availability in the current reporting period. It considers only confirmed outages for the calculations.

Availability All (%)

Shows the calculated availability in the current reporting period. It considers both confirmed and unconfirmed outages.

Availability Threshold (%)

Shows the SLA threshold defined in the Service Definition
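
The Remaining Downtime values can be reasoned about from the availability formula: the SLA is breached once the outage time exceeds the share of the agreed service time that the threshold leaves free. The following Python sketch illustrates this relationship. The function name and the sample outage minutes are assumptions made for illustration; it is not the tool's actual implementation.

```python
def remaining_downtime_minutes(agreed_service_minutes: float,
                               outage_minutes: float,
                               sla_threshold_percent: float) -> float:
    """Minutes of outage still tolerable before availability drops below the threshold.

    Hypothetical helper derived from Availability (%) = (1 - OT / AST) * 100.
    A negative result means the SLA is already breached.
    """
    allowed = agreed_service_minutes * (1 - sla_threshold_percent / 100)
    return allowed - outage_minutes


ast_minutes = 21 * 8 * 60            # agreed service time of the reporting period
confirmed, unconfirmed = 30, 45      # made-up outage minutes for illustration

# "Remaining Downtime Confirmed" counts confirmed outages only
print(round(remaining_downtime_minutes(ast_minutes, confirmed, 99.5), 1))
# "Remaining Downtime All" counts confirmed and unconfirmed outages
print(round(remaining_downtime_minutes(ast_minutes, confirmed + unconfirmed, 99.5), 1))
```

With these sample numbers the first value is still positive while the second is negative, which is the situation where the SLA would be breached if all open outages were confirmed.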

The Outage Summary provides the following options:

  • Use the filter button to filter on systems with open outages and on entities where the SLA is breached.
  • Select one or multiple entities and press the mass maintenance button. This opens the Outage Overview view showing all outages for the selected entities in the selected reporting period. From here you can do mass or single maintenance of outages.
  • Click on one system. This opens the Outage Overview view showing all outages for the selected system in the selected reporting period. From here you can do mass or single maintenance of outages.

Outage Overview

The Outage Overview shows the list of outages for the selected entities in the selected reporting periods. It lets you edit existing outages or create new ones.

The following data is shown:

Data Content

Entity and Entity type

Identifier and type of the system, database or service

Type

  • Planned - The entry is a planned downtime and therefore normally not SLA relevant
  • Unplanned - It is an unplanned outage and therefore normally SLA relevant

Status

  • New - Initial status
  • In process - The outage is being processed by the system administrator
  • To Be Reviewed - The outage has been processed by the system administrator and is waiting to be reviewed and confirmed or rejected by service managers or other supervisors
  • Confirmed - The outage has been confirmed by service managers or other supervisors and is used in SLA calculations and availability reporting

Category

The category of the outage as maintained in the Outage Details

SLA relevant 

Whether the outage is SLA relevant or not. Only the duration of SLA-relevant outages is considered by availability reporting. By default, unplanned outages are SLA relevant and planned downtimes are not, but this can be changed in the outage details.

Start and End time

Start and End time of the outage

Source

  • MAI - The outage has been created from an availability alert for the selected entity after the alert was confirmed
  • Work Mode - The outage has been created from a planned downtime in IT Calendar and Work Mode Management for the selected entity and has been transferred to SAM after the planned downtime was completed
  • Manual - The outage has been created manually in Service Availability Management

Hidden

Whether the outage has been "hidden" or not. Hidden outages are outages that the system administrator wants to exclude from reporting, for example because they are based on false alerts. Hidden outages are only shown if "Show Hidden Outages" is set to "Yes".
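
The SLA relevant, Status, and Hidden fields together determine which outages contribute to the calculated availability. The following Python sketch illustrates that selection logic under stated assumptions: the `Outage` class, its field names, and `counted_outage_minutes` are hypothetical and only mirror the rules described above (SLA-relevant only, hidden excluded, optionally confirmed only).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outage:
    kind: str                            # "Planned" or "Unplanned"
    status: str                          # "New", "In process", "To Be Reviewed", "Confirmed"
    minutes: float                       # duration inside the agreed service time
    sla_relevant: Optional[bool] = None  # None means: use the default for the type
    hidden: bool = False

    def __post_init__(self):
        # Default: unplanned outages are SLA relevant, planned downtimes are not
        if self.sla_relevant is None:
            self.sla_relevant = (self.kind == "Unplanned")


def counted_outage_minutes(outages, confirmed_only=True):
    """Sum the outage minutes that would enter the availability calculation."""
    total = 0.0
    for outage in outages:
        if outage.hidden or not outage.sla_relevant:
            continue  # hidden or non-SLA-relevant outages are never counted
        if confirmed_only and outage.status != "Confirmed":
            continue  # skip open outages when reporting on confirmed outages only
        total += outage.minutes
    return total


outages = [
    Outage("Unplanned", "Confirmed", 240),
    Outage("Unplanned", "New", 60),
    Outage("Planned", "Confirmed", 240),                # not SLA relevant by default
    Outage("Unplanned", "Confirmed", 15, hidden=True),  # e.g. based on a false alert
]
print(counted_outage_minutes(outages, confirmed_only=True))
print(counted_outage_minutes(outages, confirmed_only=False))
```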

Mass Maintenance of Outages

You can select one or several outages and perform the following mass maintenance actions for them together:

  • Hide Outage: Set the selected outages to hidden. This hides them from the outage list and excludes them from availability reporting. Hidden outages are outages that the system administrator wants to exclude from reporting, for example because they are based on false alerts. They are shown only if "Show Hidden Outages" is set to "Yes".
  • Unhide Outage: Remove the hidden flag from the selected hidden outages. Afterwards they are shown again in the outage list and can be processed. Hidden outages can only be selected if "Show Hidden Outages" is set to "Yes".
  • Approve Outage: Set all selected outages to status Confirmed. Confirmed outages are used in SLA calculations and availability reporting.
  • Reject Outage: Set all selected outages to status In process so that they must be maintained again by the system administrator.
  • Review: Set all selected outages to status To Be Reviewed.
  • Modify Date Time: Maintain a common start and end time for all selected outages together.
  • Set reason: Maintain a common reason for all selected outages together.
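
The effect of the status-related and hide-related mass maintenance actions can be summarized as a small mapping from action to resulting field value. The following Python sketch is illustrative only: the dictionary-based outage records and the function `apply_mass_action` are assumptions, not part of Service Availability Management.

```python
# Resulting status for each status-changing action described above
ACTION_STATUS = {
    "Approve Outage": "Confirmed",
    "Reject Outage": "In process",
    "Review": "To Be Reviewed",
}
# Resulting hidden flag for the hide/unhide actions
ACTION_HIDDEN = {"Hide Outage": True, "Unhide Outage": False}


def apply_mass_action(outages, action):
    """Apply one action to every selected outage (hypothetical record layout)."""
    if action not in ACTION_STATUS and action not in ACTION_HIDDEN:
        raise ValueError(f"unsupported action: {action}")
    for outage in outages:
        if action in ACTION_STATUS:
            outage["status"] = ACTION_STATUS[action]
        else:
            outage["hidden"] = ACTION_HIDDEN[action]
    return outages


selected = [
    {"id": 1, "status": "To Be Reviewed", "hidden": False},
    {"id": 2, "status": "To Be Reviewed", "hidden": False},
]
apply_mass_action(selected, "Approve Outage")
print([o["status"] for o in selected])
```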

Create Outage

Normally, unplanned outages are detected automatically by system monitoring and transferred to Service Availability Management. Planned downtimes are imported from work mode management or system monitoring. So, the manual creation of outages is only necessary in special cases.

Proceed as follows to create a new outage:

  1. Select the Create Outage button
  2. Maintain the following data:

Data Content

Entity

  • The system, database or service for which the outage shall be created

Type

  • Planned -  The entry is a planned downtime and therefore normally not SLA relevant
  • Unplanned - It is an unplanned outage and therefore normally SLA relevant

Category

The category of the outage

SLA relevant 

Whether the outage is SLA relevant or not. Only the duration of SLA-relevant outages is considered by availability reporting. By default, unplanned outages are SLA relevant and planned downtimes are not, but this can be changed in the outage details.

Start and End time

Start and End time of the outage

Reason

Textual description of downtime reason

Business Impact

Textual description of business impact

Other Comments

Other Comments

Click on "Save" to create the new outage. Click on Email to send a notification email about the new outage.

Please note: New outages have the status "New" by default. They need to be reviewed and set to Confirmed before they are taken into account for availability calculations.


Edit Outage

  1. Select one Outage. This opens the Outage Details screen. 
  2. Maintain the following data:

Data Content

Type

  • Planned -  The entry is a planned downtime and therefore normally not SLA relevant
  • Unplanned - It is an unplanned outage and therefore normally SLA relevant

Category

The category of the outage

SLA relevant 

Whether the outage is SLA relevant or not. Only the duration of SLA-relevant outages is considered by availability reporting. By default, unplanned outages are SLA relevant and planned downtimes are not, but this can be changed in the outage details.

Start and End time

Start and End time of the outage

Reason

Textual description of downtime reason

Business Impact

Textual description of business impact

Other Comments

Other Comments

Status

  • New - Initial status
  • In process - The outage is being processed by the system administrator
  • To Be Reviewed - The outage has been processed by the system administrator and is waiting to be reviewed and confirmed or rejected by service managers or other supervisors
  • Confirmed - The outage has been confirmed by service managers or other supervisors and is used in SLA calculations and availability reporting

If the outage is already set to completed, you can only edit the reason, business impact, and comments, and revert the status.


Service Availability Definitions

Each system, database or service that has to be managed by Service Availability Management needs to have an active service definition.

Select the Service Availability Definition page to see the service definitions for the selected systems.

The Service Availability Definitions Overview shows the following data for each service definition:

Data Content

Status

  • Completed - Service definitions whose end date has already passed (today > end date)
  • Active - Service definitions that are currently valid (start date < today < end date). For active service definitions, you can only change the end date.
  • Inactive - Service definitions that start in the future (today < start date). Only inactive service definitions can be deleted.

Title

Title

Entity and type

Identifier and type of the selected system, database or service

Start and end date

First and last validity date of the service definition
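
The status of a service definition follows from comparing today's date with its start and end dates. The following Python sketch illustrates the three cases. The function name is hypothetical, and treating the start and end dates themselves as part of the active period is an assumption, because the text above only gives strict inequalities.

```python
from datetime import date


def definition_status(start: date, end: date, today: date) -> str:
    """Classify a service definition by its validity dates (illustrative only)."""
    if today > end:
        return "Completed"   # end date has already passed
    if today < start:
        return "Inactive"    # starts in the future, may still be deleted
    return "Active"          # assumption: boundary days count as active


today = date(2024, 9, 10)
print(definition_status(date(2024, 1, 1), date(2099, 12, 31), today))
print(definition_status(date(2025, 1, 1), date(2099, 12, 31), today))
print(definition_status(date(2023, 1, 1), date(2023, 12, 31), today))
```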

Edit Existing Service Availability Definition

Select a service availability definition to see its details. For existing service availability definitions, you can change the end date and add new Contractual Maintenance Periods. The other settings cannot be changed.


Add New Service Availability Definition

Select the "Add new service availability definition" button to create a new service availability definition.

To create a new service availability definition, you need to maintain the following data:

General Data:

Data Content

Title

Title

Start and end date

First and last validity date of the service definition

Time Zone

The time zone in which availability patterns and contractual maintenance periods are defined

In the Entities tab:

Entity / Entity type: Select the system, database or service for which the new service definition is valid.

In the Availability tab:

Data Content

SLA Threshold (%)

The minimum allowed availability in %.

For example: 99.5 %, 95 %

Reporting Period

The period for which the availability data is calculated. Possible values are monthly or yearly.

Pattern

Define the daily or weekly pattern for the agreed service time during which the entity must be available per SLA.

Examples:

  • The agreed service time is 7x24 (7 days, 24 hours). Enter a daily pattern with start time 00:00 and a duration of 24 hours 00 minutes.
  • The agreed service time is 5x8 (from 8 am to 4 pm on work days). Enter a weekly pattern with start time 08:00 am and a duration of 08 hours 00 minutes.
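
To see how a pattern translates into agreed service time for a monthly reporting period, the following Python sketch counts the matching days of a month and multiplies by the daily duration. It is a simplified illustration: the function name and parameters are assumptions, September 2024 is an arbitrary sample month, and contractual maintenance periods are not modeled.

```python
from datetime import date, timedelta


def agreed_service_minutes(year, month, duration_minutes,
                           weekdays=(0, 1, 2, 3, 4, 5, 6)):
    """Agreed service time of one month in minutes (illustrative only).

    weekdays uses Python's convention: 0 = Monday ... 6 = Sunday.
    """
    day = date(year, month, 1)
    total = 0
    while day.month == month:
        if day.weekday() in weekdays:
            total += duration_minutes
        day += timedelta(days=1)
    return total


# 7x24: every day, 24 hours
print(agreed_service_minutes(2024, 9, 24 * 60))
# 5x8: Monday to Friday, 8 hours per day
print(agreed_service_minutes(2024, 9, 8 * 60, weekdays=(0, 1, 2, 3, 4)))
```

September 2024 has 30 days, 21 of which are Monday to Friday, so the two calls yield 43200 and 10080 minutes respectively.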

In the Contractual Maintenance tab, you can define recurring periods and specific dates during which maintenance is allowed without affecting the SLA. If a contractual maintenance period overlaps with the agreed service time, the agreed service time is shortened on that day.

Best Practices

  • Define service definitions with an end date far in the future (e.g. 31-Dec-2099) so that service definitions do not expire unnoticed and availability reporting does not stop.
  • The SLA threshold and the agreed service times cannot be changed in active service definitions. Proceed as follows to change the SLA threshold or the agreed service time for a system with an active service definition:
    1. Change the end date of the active service definition to the end of the current reporting period
    2. Create a new service definition with a start date in the next reporting period

Required Roles

  • Technical Administration composite roles: SAP_TECHNICAL_ADMIN_COMP, SAP_TECHNICAL_ADMIN_DISP_COMP
  • Service Availability Management roles:
    • SAP_SM_SAM_ALL SAM (Service Availability Management) - full authorization
    • SAP_SM_SAM_DIS SAM (Service Availability Management) - display authorization
    • SAP_SM_SAM_EDIT SAM (Service Availability Management) - execution authorization
    • SAP_SM_SAM_REVIEW SAM (Service Availability Management) - review authorization

How the Availability is Calculated

The availability is calculated as follows: Availability (%) = (1 - OT / AST) * 100 %

  • AST (Agreed Service Time) is the duration of the agreed service time per reporting period
  • OT (Outage Time) is the duration of all system outages that occurred during the Agreed Service Time
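
As a minimal illustration, the formula can be written as a small Python function. The function and parameter names are chosen here for readability and are not taken from the product.

```python
def availability_percent(outage_minutes: float,
                         agreed_service_minutes: float) -> float:
    """Availability (%) = (1 - OT / AST) * 100."""
    if agreed_service_minutes <= 0:
        raise ValueError("agreed service time must be positive")
    return (1 - outage_minutes / agreed_service_minutes) * 100


# 240 minutes of outage within 10080 minutes of agreed service time
print(round(availability_percent(240, 10080), 2))
```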

 

Example

  • The agreed service time is on work days from 9 am to 5 pm.
  • The reporting period is monthly. The current month has 21 work days.
  • A Contractual Maintenance Period is scheduled every 1st Friday of the month from 4 pm to 9 pm.
  • The customer has requested an additional planned downtime on the 2nd Friday from 2 pm to 6 pm for a release upgrade. The planned downtime is outside the contractual maintenance period. Despite this, it is not SLA-relevant because it was requested by the customer.
  • A system outage occurred on the 2nd Tuesday from 1 pm – 6 pm.

Service Availability Management will calculate the availability as follows:

The duration of the agreed service time is AST = 21 d * 8 h * 60 min = 10080 min.

The system outage lasted for 5 hours (300 minutes), but only 4 hours (240 minutes) fall within the agreed service time, so OT = 240 min.

The system availability is calculated as follows:

Availability (%) = (1 - 240 min / 10080 min) * 100 % = 97.62 %
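
The step that reduces the 5-hour outage to 240 minutes can be sketched in Python by clipping the outage to the daily service window. This is an illustration under stated assumptions: the function name and the concrete date (an arbitrary Tuesday) are not from the original example, and contractual maintenance periods are not modeled.

```python
from datetime import datetime, time, timedelta


def outage_minutes_in_service_time(start, end,
                                   service_start=time(9, 0),
                                   service_end=time(17, 0),
                                   workdays=(0, 1, 2, 3, 4)):
    """Minutes of an outage that fall inside the agreed service time."""
    total = 0.0
    day = start.date()
    while day <= end.date():
        if day.weekday() in workdays:
            window_start = datetime.combine(day, service_start)
            window_end = datetime.combine(day, service_end)
            overlap_start = max(start, window_start)
            overlap_end = min(end, window_end)
            if overlap_end > overlap_start:
                total += (overlap_end - overlap_start).total_seconds() / 60
        day += timedelta(days=1)
    return total


# Outage from 1 pm to 6 pm on a Tuesday; service time is 9 am to 5 pm
ot = outage_minutes_in_service_time(datetime(2024, 9, 10, 13, 0),
                                    datetime(2024, 9, 10, 18, 0))
ast_minutes = 21 * 8 * 60
print(ot, round((1 - ot / ast_minutes) * 100, 2))
```

Only the part of the outage between 1 pm and 5 pm overlaps the service window, which reproduces the 240 minutes and the 97.62 % from the example.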