Extractor Framework Overview

Data Flow & Data Collection

In this section we describe the data flow and the data collection in the extractor framework.

The following terms are used repeatedly throughout this section, so they are explained here first.

Work List Item

A work list item defines which metric has to be retrieved by which extractor, including all required configuration parameters (Main Extractor, Extractor, PPMS Modeling, RFC destination to call the extractor, frequency).
All work list items can be found in the E2E_ACTIVE_WLI table. The flag active is set if the extractor is actually running.
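
As a rough illustration, a work list item can be pictured as a small record. The sketch below is not the real layout of E2E_ACTIVE_WLI: all field names, the sample values and the unit of the frequency are assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkListItem:
    """Illustrative model of one work list item (field names are assumed)."""
    main_extractor: str
    extractor: str
    ppms_modeling: str
    rfc_destination: str      # destination used to call the extractor
    frequency_minutes: int    # assumed unit; the text only says "frequency"
    active: bool = False      # set while the extractor is actually running


def running_items(work_list):
    """Return the items whose active flag is set."""
    return [wli for wli in work_list if wli.active]


# Hypothetical sample entries, not real extractor names.
work_list = [
    WorkListItem("ME_A", "EXTR_A", "MODEL_A", "NONE", 5, active=True),
    WorkListItem("ME_B", "EXTR_B", "MODEL_B", "RFC_DEST_B", 60),
]
print([wli.extractor for wli in running_items(work_list)])
```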

Extractor

The smallest technical entity collecting the data, the actual data collector
Retrieves data either directly from the managed system (e.g. via RFC call) or indirectly from an intermediate layer (e.g. Wily Introscope Enterprise Manager)

Main Extractor

The entity in SAP Solution Manager responsible for execution of the extractor
Calls the extractor locally or in the managed system
Communicates with the EFWK API (e.g. resource release, status update)

Resource Manager

Responsible for managing the execution of all extractors in pull mode
Handling of Resource Management
Starts main extractors asynchronously

The following picture shows the data collection infrastructure:

The picture above shows the data and the request flow. Red dots mark push interfaces, where the data is pushed autonomously from the source to the target of the connection. Blue dots are pull interfaces, where the request for data is triggered from the target of the data flow. Yellow dots mark interfaces where a request for configuration information can take place, for example between the Data Provider Connector and the MAI Config. Repository to verify which of the received metrics are valid. Green dots are triggers from the EFWK Resource Manager when starting the Main Extractors or the Data Provider Connector for MAI.

PUSH Extractors

If an extractor is a PUSH extractor, the data provider in which the extractor is running triggers the data extraction autonomously, based on its configuration, and sends the data to the EFWK via the Data Provider Connector web service API. PUSH extractors are the extractors for MAI located in the diagnostics agents or in the DPC extension for Wily Introscope EM for MAI. These MAI data sources collect and send data based on the monitoring configuration, which specifies which data to collect and at which frequency. This configuration is stored centrally in the MAI Config Repository and is replicated to the remote data source/collector. The diagnostics agent or Wily Introscope therefore doesn't have to request the configuration every time, but only when it deems it necessary, for example when an agent loses connectivity for some time and then reconnects. Usually this configuration is pushed to the diagnostics agent and to Wily EM during the System Monitoring setup. The Data Provider Connector uses this configuration information to decide which of the metrics it receives via Agent or Introscope PUSH are valid and are processed further.
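
The validity check performed by the Data Provider Connector can be pictured as a simple filter of pushed metrics against the configured ones. This is only a sketch: the function name, the dictionary layout and the metric IDs are hypothetical and do not reflect the actual web service API.

```python
def filter_valid_metrics(received, configured_ids):
    """Split pushed metrics into those known to the configuration and the rest."""
    configured = set(configured_ids)
    valid = [m for m in received if m["metric_id"] in configured]
    rejected = [m for m in received if m["metric_id"] not in configured]
    return valid, rejected


# Hypothetical metric IDs standing in for the replicated monitoring configuration.
configured_ids = ["CPU_UTIL", "MEM_USED"]
pushed = [
    {"metric_id": "CPU_UTIL", "value": 37.5},
    {"metric_id": "OLD_METRIC", "value": 1.0},
]
valid, rejected = filter_valid_metrics(pushed, configured_ids)
print(len(valid), len(rejected))
```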

PULL Extractors

PULL extractors are extractors that have to be triggered to actually deliver data. They are triggered by the EFWK Resource Manager, based on the EFWK configuration.

The job EFWK RESOURCE MANAGER calls the Resource Manager every minute. The main extractors for the work list items that are due according to table E2E_ACTIVE_WLI are started asynchronously. Because the Resource Manager calls the main extractor asynchronously, a new Resource Manager run can start before the main extractor called in the previous run has finished.
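
A minimal sketch of one Resource Manager run, assuming that an item is due when its last run plus its frequency lies in the past. The field names, the due rule and the use of a thread pool to stand in for the asynchronous start are assumptions for illustration; the real framework starts the main extractors via RFC.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta, timezone


def due_items(work_list, now):
    """Select items whose next run time has been reached (assumed due rule)."""
    return [
        wli for wli in work_list
        if wli["last_run"] + timedelta(minutes=wli["frequency_minutes"]) <= now
    ]


def resource_manager_run(work_list, now, start_main_extractor, pool):
    """Submit one main extractor per due item without waiting for it to finish."""
    return [pool.submit(start_main_extractor, wli) for wli in due_items(work_list, now)]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
work_list = [
    {"id": "WLI_1", "frequency_minutes": 5, "last_run": now - timedelta(minutes=5)},
    {"id": "WLI_2", "frequency_minutes": 60, "last_run": now - timedelta(minutes=10)},
]

with ThreadPoolExecutor(max_workers=2) as pool:
    # The lambda is a stand-in for a real main extractor.
    futures = resource_manager_run(work_list, now, lambda wli: wli["id"], pool)
    started = [f.result() for f in futures]
print(started)
```

Because the submit call returns immediately, the next run could begin while earlier main extractors are still working, which mirrors the overlap described above.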

The main extractor then calls the actual extractor in the managed system or locally in SAP Solution Manager, based on the target RFC destination maintained in table E2E_ACTIVE_WLI. If the extractor runs locally, the RFC destination is NONE. The extractor is usually a function module and runs the extractor class to collect the data. When the main extractor receives the data from the extractor, it calls the automatic data enrichment.

When the data has been enriched, the main extractor calls a data loader. The data loader is a function module in the BW system of SAP Solution Manager that writes the data into data targets such as BW cubes or DB tables.

When the data loader has finished, the main extractor can call the post processor, e.g. to clean up the data source system by deleting temporary extractor data. When the post processing has finished, the main extractor updates the status record for the extractor and ends.
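
The sequence of steps performed by the main extractor can be summarized in a short sketch. The callables and the status dictionary are hypothetical placeholders; in the real framework these steps are function modules and RFC calls.

```python
def run_main_extractor(extractor, enrich, data_loader, post_processor=None):
    """Run extract, enrich, load and optional post-processing in order."""
    status = {"steps": [], "state": "running"}

    records = extractor()              # call the actual extractor
    status["steps"].append("extract")

    records = enrich(records)          # automatic data enrichment
    status["steps"].append("enrich")

    data_loader(records)               # write to the data targets
    status["steps"].append("load")

    if post_processor is not None:     # e.g. clean up temporary data
        post_processor()
        status["steps"].append("post_process")

    status["state"] = "finished"       # update the status record and end
    return status


target = []  # stands in for a BW cube or DB table
status = run_main_extractor(
    extractor=lambda: [{"metric": "cpu", "value": 42}],
    enrich=lambda rows: [dict(row, system="SID_X") for row in rows],
    data_loader=target.extend,
    post_processor=lambda: None,
)
print(status["steps"], target)
```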

Note: All data transported in the EFWK is in the UTC time zone. The same applies to logging and the required record algorithm. Only when the data is written to the data storage is it converted to the time zone of SAP Solution Manager.
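
To illustrate the time zone handling, the following sketch keeps a timestamp in UTC and converts it only at the point of storage. The fixed UTC+1 offset is a hypothetical system time zone chosen for the example, and the function name is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical time zone of the SAP Solution Manager system (UTC+1).
SYSTEM_TZ = timezone(timedelta(hours=1))


def to_storage_time(utc_timestamp, system_tz=SYSTEM_TZ):
    """Convert a timezone-aware UTC timestamp to the system time zone."""
    if utc_timestamp.tzinfo is None:
        raise ValueError("expected a timezone-aware UTC timestamp")
    return utc_timestamp.astimezone(system_tz)


collected_at = datetime(2024, 1, 1, 23, 30, tzinfo=timezone.utc)
stored_at = to_storage_time(collected_at)
print(stored_at.isoformat())
```

Note that the conversion can move a record to the next calendar day in the storage layer, as in this example.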

Resource Management

To make sure the EFWK doesn't overload the managed system or SAP Solution Manager, resource management was put in place. It restricts the RFC destinations and work processes used by the EFWK.

The Resource Manager starts the main extractor via a local RFC call (RFC destination NONE). This local RFC call blocks one dialog work process. The main extractor then makes the appropriate call to the managed system, provided the resource cap for this RFC destination has not been reached yet.

While the extractor in the managed system collects the data, the local dialog RFC might be rolled out to free the local resource for other extractors. When the extractor delivers the required data, the dialog RFC is rolled in again and the data is processed by the extractor framework.

We differentiate between local and remote resources.

Local Resource: Number of Dialog Work Processes to be used by EFWK
Remote Resource: RFCs per system or per Introscope to be used

The resource cap defines the maximum number of extractors that can run in parallel using the same RFC destination.
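
A resource cap per RFC destination behaves like a counter with an upper limit. The class below is a minimal sketch of that idea; its name, methods and the sample destination are assumptions and not part of the EFWK API.

```python
class ResourceCaps:
    """Track how many extractors currently use each RFC destination."""

    def __init__(self, caps):
        self._caps = dict(caps)
        self._in_use = {dest: 0 for dest in self._caps}

    def try_acquire(self, destination):
        """Reserve one slot, or return False if the cap is already reached."""
        if self._in_use[destination] >= self._caps[destination]:
            return False
        self._in_use[destination] += 1
        return True

    def release(self, destination):
        """Give a slot back once the extractor has finished."""
        if self._in_use[destination] == 0:
            raise RuntimeError("no resource in use for " + destination)
        self._in_use[destination] -= 1


caps = ResourceCaps({"RFC_DEST_A": 2})  # hypothetical destination with cap 2
attempts = [caps.try_acquire("RFC_DEST_A") for _ in range(3)]
caps.release("RFC_DEST_A")
after_release = caps.try_acquire("RFC_DEST_A")
print(attempts, after_release)
```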

Resource Manager Injection Cycle

All work list items are stored on a global work list in table E2E_ACTIVE_WLI. In each run of the Resource Manager the due work list items are added to the current work list.

From the current work list, the work list items are injected if the resource cap allows it. If a work list item couldn't be injected, it stays on the current work list. The resource pool injection runs through several injection passes. Between two injection passes there is a wait state, which should allow resources to become available again. The resource pool injection tries to inject a work list item for a defined number of injection passes. The number of injection passes is configurable.

If a work list item could not be injected, it is sent back to the global work list and the priority for this work list item is raised by one.
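
The injection cycle can be sketched as a loop over a configurable number of passes. The function and parameter names are hypothetical, and the slot counter only simulates resources becoming free during the wait state.

```python
def injection_cycle(current_work_list, try_inject, max_passes, wait=lambda: None):
    """Try to inject each item for up to max_passes passes.

    Returns the items that go back to the global work list, each with its
    priority raised by one.
    """
    pending = list(current_work_list)
    for pass_number in range(max_passes):
        # Items that cannot be injected stay on the current work list.
        pending = [wli for wli in pending if not try_inject(wli)]
        if not pending:
            break
        if pass_number < max_passes - 1:
            wait()  # wait state between two injection passes
    for wli in pending:
        wli["priority"] += 1
    return pending


slots = {"free": 1}  # simulated resource pool


def try_inject(wli):
    if slots["free"] == 0:
        return False
    slots["free"] -= 1
    return True


def wait():
    slots["free"] += 1  # pretend one resource became available again


items = [{"id": name, "priority": 0} for name in ("A", "B", "C")]
sent_back = injection_cycle(items, try_inject, max_passes=2, wait=wait)
print(sent_back)
```

With one free slot per pass and two passes, items A and B are injected and C returns to the global work list with a higher priority.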

Extractors Running in LUW Mode

Extractors in LUW mode are complex extractors that optimize the data collection and the data distribution in SAP Solution Manager.

The LUW ME Controller is the extractor that is triggered by the EFWK Resource Manager on a regular basis. The ME Controller then starts the Primary extractor, which usually also performs the data collection in the managed system based on defined filter criteria. The Primary extractor stores the data in memory and returns to the ME Controller, which then distributes the collected data sequentially to several Secondary extractors. These secondary extractors then store the data in different info cubes. It can happen that the same data record is needed in different info cubes.

The LUW concept makes sure that data is collected only once and that several different extractors are not needed for the data collection. This improves the infrastructure and also reduces the load put on the managed system.
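
The collect-once, distribute-many pattern of LUW mode can be sketched as follows. The function name, the record layout and the two cube names are assumptions for illustration only.

```python
def luw_me_controller(primary, secondaries):
    """Collect once via the primary, then hand the data to each secondary in turn."""
    records = primary()              # single data collection, held in memory
    for secondary in secondaries:    # sequential distribution
        secondary(records)
    return len(records)


calls = {"primary": 0}
cubes = {"CUBE_ALL": [], "CUBE_HIGH": []}  # hypothetical info cubes


def primary():
    calls["primary"] += 1
    return [{"metric": "resp_time", "value": 5}, {"metric": "resp_time", "value": 50}]


def store_all(records):
    cubes["CUBE_ALL"].extend(records)


def store_high(records):
    cubes["CUBE_HIGH"].extend(r for r in records if r["value"] > 10)


collected = luw_me_controller(primary, [store_all, store_high])
print(calls["primary"], len(cubes["CUBE_ALL"]), len(cubes["CUBE_HIGH"]))
```

The record with value 50 ends up in both cubes although the managed system was queried only once.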