TrueSight Capacity Optimization for Storage: Assessing the Requirements

This article helps BMC TrueSight CO administrators assess the size of the storage infrastructure to be analyzed in TrueSight Capacity Optimization and the performance it requires, and understand the risks associated with each component of the solution.

Introduction

Managing the capacity of a storage environment is a technical challenge because of the amount of data to be collected and processed. In a typical datacenter, compute, network and storage each represent a third of the total complexity. This means that the amount of resources required to monitor the storage infrastructure is of the same order of magnitude as for compute or network.

EMC

BMC TrueSight Capacity Optimization for Storage relies on the EMC SMI-S Agent ETL for TrueSight Capacity Optimization to collect the storage metrics, which itself relies on the EMC SMI-S Provider (also known as ECOM).

Architecture - EMC SMI-S Agent ETL for TrueSight Capacity Optimization

When collecting capacity and performance metrics on a large EMC storage environment, the following components come into play:

  • EMC SMI-S Provider (ECOM)
  • Sentry’s EMC SMI-S Agent ETL for TrueSight Capacity Optimization
  • BMC TrueSight Capacity Optimization
  • Database Server (Oracle or PostgreSQL)

EMC SMI-S Provider (ECOM)

Prerequisites

Note: Always install the latest supported version of the EMC SMI-S Provider, which can be downloaded from [Dell EMC Support](//support.emc.com).

EMC NAS

| EMC Hardware | Supported version of the embedded EMC SMI-S Provider |
| --- | --- |
| EMC Celerra | v8.1.0 |
| EMC VNX | v7.1.76-4 |

EMC Disk Arrays

Supported versions of the EMC SMI-S Provider (v4.6.2.30 or v8.3.0.3, depending on the hardware):

  • CLARiiON (CX, CX3, AX series)*
  • CLARiiON (CX4, AX4 series)
  • VNX series
  • Symmetrix DMX (DMX-2, DMX-3, DMX-4 series)
  • VMAX (10K, 20K, 40K series)
  • VMAX (100K, 200K, 400K series)

* Older versions of EMC CLARiiON have been validated with EMC SMI-S Provider v4.4.

To manage EMC CLARiiON and VNX systems, the EMC SMI-S Provider needs to have network access to the Storage Processors of the CLARiiON and VNX systems. Administrator credentials are required to connect to the arrays.

To manage EMC VMAX and Symmetrix DMX systems, the EMC SMI-S Provider requires at least one LUN to be mounted from each managed array. These special LUNs are called “gatekeepers” and are used for the communication between the SMI-S Provider and the Symmetrix array (which does not have an IP address). EMC recommends having between 4 and 6 gatekeeper LUNs for each managed array. Increasing the number of gatekeepers dramatically improves the performance of the EMC SMI-S Provider.

The system hosting the EMC SMI-S Provider requires:

  • 2 x64 CPUs (or 2 virtual CPUs)
  • 8 GB of memory
  • 50 GB of disk, 10K rpm class
  • A 1 Gb/s network adapter
  • A 4 Gb/s dual port HBA
  • A supported 64-bit version of Windows Server or Linux

The SMI-S Provider needs to be configured with a “Heap Size” of 4 GB.

Scalability

EMC states that the EMC SMI-S Provider can manage up to 5 arrays, with up to 10,000 volumes each.

Real-life experience shows that the scalability of the EMC SMI-S Provider depends on:

  • The number of arrays
  • The number of ports in each array
  • The number of disks in each array
  • The number of volumes
  • The number of hosts these volumes are mapped to

The most important factor is the number of volumes. Combining the limits stated by EMC (5 arrays with up to 10,000 volumes each), an EMC SMI-S Provider can handle up to 50,000 volumes.

Important: The performance of the EMC SMI-S Provider is affected by the number of client applications performing concurrent data requests. For example, if both BMC TrueSight Operations Management and BMC TrueSight Capacity Optimization are to extract metrics for EMC, the SMI-S Provider has to be sized accordingly, i.e. to handle twice the workload described in this article.
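The sizing rule above can be turned into quick arithmetic. The sketch below is only an illustration: the function name `providers_needed` is hypothetical, the limits (5 arrays, 50,000 volumes) are the figures quoted in this article, and treating each concurrent client application as a multiplier on the volume workload is an assumption drawn from the "twice the workload" remark. The result is a lower bound, since the volumes of one array cannot be split across several providers.

```python
import math

# Limits quoted in this article for a single EMC SMI-S Provider.
MAX_ARRAYS_PER_PROVIDER = 5
MAX_VOLUMES_PER_PROVIDER = 50_000


def providers_needed(arrays: int, volumes: int, clients: int = 1) -> int:
    """Estimate a lower bound on the number of SMI-S Providers (hypothetical helper).

    Assumption: each concurrent client application (for example TrueSight
    Operations Management plus TrueSight Capacity Optimization) multiplies
    the volume workload seen by the provider.
    """
    if arrays < 1 or volumes < 0 or clients < 1:
        raise ValueError("arrays and clients must be >= 1, volumes must be >= 0")
    by_arrays = math.ceil(arrays / MAX_ARRAYS_PER_PROVIDER)
    by_volumes = math.ceil(volumes * clients / MAX_VOLUMES_PER_PROVIDER)
    # The stricter of the two constraints drives the estimate.
    return max(by_arrays, by_volumes)
```

For example, 4 arrays totaling 40,000 volumes fit on one provider with a single client application, but call for two providers under this model when two applications query it concurrently.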

Sentry’s EMC SMI-S Agent ETL for TrueSight CO

Prerequisites per version

Hardware

| Hardware | EMC SMI-S ETL Agent for TrueSight Capacity Optimization |
| --- | --- |
| CPU | 2 x64 CPUs (or 2 virtual CPUs) |
| Memory | 8 GB |
| Disk Space | 50 GB, 10K rpm class |
| Network | 1 Gb/s network adapter |

Software

Versions of the EMC SMI-S ETL Agent for TrueSight Capacity Optimization:

  • v4.0.00 / v4.0.01 / v4.0.10
  • v4.1.00
  • v5.0.00
  • v6.0.00
  • v7.0.00 / v7.1.xx
  • v10.5.00
  • v10.7.00
  • v10.7.01
  • v11.0.00
  • v11.3.01

Supported versions of BMC TrueSight Capacity Optimization (depending on the version of the ETL): v9.5.00, v10.0.00.01 with CHF5, v10.3, v10.5, v10.7, v10.7.01, v11.0.00, v11.3.01

Scheduler configuration

| EMC SMI-S ETL Agent for TrueSight Capacity Optimization | Java Virtual Machine Heap Size |
| --- | --- |
| v4.0.00 / v4.0.01 / v4.0.10 / v4.1.00 / v5.0.00 | 4096m |
| v6.0.00 / v7.0.00 / v7.1.xx / v10.5.00 / v10.7.00 / v10.7.01 / v11.0.00 / v11.3.01 | 2048m |

The BMC TrueSight CO Scheduler needs to be configured with the above “Heap Size” (set the SCHEDULER_HEAP_SIZE environment variable to the corresponding value).
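As a quick reference, the heap size table can be expressed as a lookup. This is a minimal sketch: the helper names `scheduler_heap_size` and `env_assignment` are hypothetical, the values are copied from the table above, and where the variable is actually set (script, profile, or service definition) is not covered by this article, so the sketch only produces the assignment text.

```python
# Heap sizes copied from the "Scheduler configuration" table in this article.
HEAP_4096M = {"4.0.00", "4.0.01", "4.0.10", "4.1.00", "5.0.00"}
HEAP_2048M = {"6.0.00", "7.0.00", "10.5.00", "10.7.00", "10.7.01",
              "11.0.00", "11.3.01"}


def scheduler_heap_size(etl_version: str) -> str:
    """Return the heap size listed for a given ETL version (hypothetical helper)."""
    version = etl_version.strip().lstrip("vV")
    if version in HEAP_4096M:
        return "4096m"
    # The table lists "v7.1.xx", so any 7.1.* release maps to 2048m.
    if version in HEAP_2048M or version.startswith("7.1."):
        return "2048m"
    raise ValueError(f"ETL version not listed in the table: {etl_version}")


def env_assignment(etl_version: str) -> str:
    """Build the SCHEDULER_HEAP_SIZE assignment for the given ETL version."""
    return f"SCHEDULER_HEAP_SIZE={scheduler_heap_size(etl_version)}"
```

Calling `env_assignment("v10.7.01")` yields `SCHEDULER_HEAP_SIZE=2048m`; versions that do not appear in the table raise an error rather than guessing a value.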

Sentry’s EMC SMI-S Agent ETL for BCO version 4.0.00

Version Note

BMC Capacity Optimization 9.5.00 is required. The code will not run with BMC Capacity Optimization 9.0.00.

Scalability

The ETL runs in the BCO scheduler context. It is the performance of this system that needs to be assessed, based on the size of the targeted storage environment.

The EMC ETL collects metrics for various types of objects. The objects listed below are the most important factors:

  • Number of volumes
  • Number of hosts (mapped to the volumes)
  • Number of physical disks

In general, the number of volumes is the highest of the three and is the one driving the complexity of the processing.

The EMC ETL has 2 main processes:

  1. The “service”, which collects the data in 2 main stages:
    • The “discovery”, which retrieves the characteristics and all of the configuration information about the entire storage infrastructure, and which is run every hour by default
    • The “collect”, which retrieves the performance and capacity metrics for all objects, and which is run every 15 minutes by default
  2. The “saver”, which loads the data collected by the “service” into the data warehouse (BCO database). It runs once an hour by default, but can be set to 2 hours through the Saver period property when configuring the ETL task.

The “discovery” can take a very long time to complete and is very resource intensive. The “collect” cannot run while the discovery is running, which can cause collection gaps.
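The effect of a long discovery on collection gaps can be estimated with a simplified model. The sketch below is an assumption, not the ETL's documented scheduling logic: it supposes that collects are scheduled on a fixed grid starting when the discovery starts, and that every collect whose start time falls before the discovery ends is lost. The function name `skipped_collects` is hypothetical.

```python
import math


def skipped_collects(discovery_minutes: float, collect_period_minutes: float) -> int:
    """Count scheduled collects that fall inside one discovery run.

    Simplified model (assumption): the discovery starts at t=0, collects
    are scheduled at t=0, p, 2p, ... and any collect whose start time is
    earlier than the end of the discovery cannot run.
    """
    if discovery_minutes < 0 or collect_period_minutes <= 0:
        raise ValueError("discovery must be >= 0 and collect period must be > 0")
    return math.ceil(discovery_minutes / collect_period_minutes)
```

With the default 15-minute collect cycle, a discovery lasting 40 minutes would cost 3 collects under this model, which is why lengthening the discovery cycle reduces the number of gaps per day.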

The table below summarizes the maximum number of volumes that can be handled by version 4.0 of the EMC ETL, depending on the discovery and collect polling cycles.

| Discovery Cycle | Collect Cycle | Maximum Number of Volumes |
| --- | --- | --- |
| 1h (default) | 15 min (default) | 1,000 |
| 24h | 15 min | 2,000 |
| 24h | 1h | 4,000 |

Stability

When the discovery or the collect takes too much time to complete, the ETL process is killed by the BCO scheduler. It is therefore important not to exceed the above values to avoid random crashes of the solution.

Sentry’s EMC SMI-S Agent ETL for BCO version 4.0.01

Version Note

Version 4.0.01 was built specifically to optimize the performance of the ETL code.

Scalability

Version 4.0.01 collects the same metrics as version 4.0.00 and is affected by the same factors, in terms of scalability.

However, the scheduling of the collection has been made more flexible, as described below:

As in version 4.0.00, version 4.0.01 of the EMC ETL has 2 processes: the “service” and the “saver”. The “service” has been improved and now has 4 main stages:

  • The “discovery”, which retrieves the characteristics and all of the configuration information about the entire storage infrastructure, and which is run every 24 hours by default
  • The “system collect”, which collects the metrics for the storage arrays, the controllers and ports, and which is run every 15 minutes by default
  • The “storage collect”, which collects the metrics for the disks and storage pools, and which is run every hour by default
  • The “volume collect”, which collects the metrics for the volumes and hosts, and which is run once a day by default

As in version 4.0.00, the “discovery” can take a very long time to complete and is very resource intensive. The “collects” cannot run while the discovery is running, which can cause collection gaps.
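To see how the four stages spread the load, the default cycles listed above can be converted into a number of runs per day. This is a small illustrative calculation; the dictionary and function names are hypothetical, and only the default periods come from this article.

```python
from datetime import timedelta

# Default cycles of the version 4.0.01 "service" stages, as listed above.
DEFAULT_CYCLES = {
    "discovery": timedelta(hours=24),
    "system collect": timedelta(minutes=15),
    "storage collect": timedelta(hours=1),
    "volume collect": timedelta(hours=24),
}


def runs_per_day(cycles=None):
    """Return how many times each stage is scheduled in 24 hours."""
    cycles = DEFAULT_CYCLES if cycles is None else cycles
    day = timedelta(days=1)
    # Dividing two timedeltas with // gives an integer count.
    return {stage: day // period for stage, period in cycles.items()}
```

With the defaults, the lightweight “system collect” is scheduled 96 times a day and the “storage collect” 24 times, while the expensive “discovery” and “volume collect” are each scheduled only once.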

Real-life experience shows that the “saver”, which relies on SQLite, is the performance bottleneck of the solution.

The table below summarizes the maximum number of volumes that can be handled by version 4.0.01 of the EMC ETL, depending on the discovery and collect polling cycles.

| Discovery Cycle | System Collect Cycle | Storage Collect Cycle | Volume Collect Cycle | Maximum Number of Volumes |
| --- | --- | --- | --- | --- |
| 24h (default) | 15 minutes (default) | 1h (default) | 24h (default) | 10,000 |
| 24h | 1h | 6h | 24h | 25,000 |
| 24h | 2h | 6h | 24h | 40,000 |

Stability

When the ETL takes too much time to complete its operation, it automatically interrupts the processing. This causes collection gaps, but it no longer crashes the entire solution as in version 4.0.00.

Sentry’s EMC SMI-S Agent ETL for BCO version 4.0.10

Version Note

Version 4.0.10 improves the performance of the “saver” by removing the SQLite bottleneck of version 4.0.01.

Scalability

Version 4.0.10 collects the same metrics as versions 4.0.00 and 4.0.01 and is affected by the same factors, in terms of scalability.

The table below summarizes the maximum number of volumes that can be handled by version 4.0.10 of the EMC ETL, depending on the discovery and collect polling cycles.

| Discovery Cycle | System Collect Cycle (service.period) | Storage Collect Cycle | Volume Collect Cycle | Maximum Number of Volumes |
| --- | --- | --- | --- | --- |
| 24h (default) | 15 minutes (default) | 1h (default) | 24h (default) | 30,000 |
| 24h | 1h | 6h | 24h | 80,000 |
| 24h | 2h | 6h | 24h | 120,000 |

Note: EMC does not recommend managing more than 50,000 volumes per SMI-S Provider.
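The table and the note can be combined into a simple selection helper. The sketch below is illustrative only: `pick_cycles` and `exceeds_provider_recommendation` are hypothetical names, the rows reproduce the version 4.0.10 table above, and the choice of the first row that fits is an assumption (it favors the most frequent collection the table allows).

```python
from typing import NamedTuple, Optional


class Cycles(NamedTuple):
    discovery: str
    system_collect: str
    storage_collect: str
    volume_collect: str
    max_volumes: int


# Rows of the version 4.0.10 table, ordered from most to least frequent collection.
V4_0_10_LIMITS = [
    Cycles("24h", "15min", "1h", "24h", 30_000),
    Cycles("24h", "1h", "6h", "24h", 80_000),
    Cycles("24h", "2h", "6h", "24h", 120_000),
]

# Recommendation quoted in the note above.
PROVIDER_RECOMMENDED_MAX = 50_000


def pick_cycles(volumes: int) -> Optional[Cycles]:
    """Return the first table row able to handle `volumes`, or None if none can."""
    for row in V4_0_10_LIMITS:
        if volumes <= row.max_volumes:
            return row
    return None


def exceeds_provider_recommendation(volumes_on_one_provider: int) -> bool:
    """True when a single SMI-S Provider would manage more volumes than EMC recommends."""
    return volumes_on_one_provider > PROVIDER_RECOMMENDED_MAX
```

For 60,000 volumes, `pick_cycles` returns the second row (1h system collect), and `exceeds_provider_recommendation(60_000)` is true if all of them sit behind one provider, which suggests spreading the arrays over several providers even though the ETL itself could cope.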

Sentry’s EMC SMI-S Agent ETL for TrueSight CO versions 5.0.00, 6.0.00, and 7.1.xx

| Version | Version Note | Scalability |
| --- | --- | --- |
| v5.0.00 | Supports Remote ETL Engine and BMC TrueSight CO 10. | Collects the same metrics as versions 4.0.00, 4.0.01, and 4.0.10 and is affected by the same factors, in terms of scalability. |
| v6.0.00 | Introduces an improvement in memory management to prevent OutOfMemory exceptions when requesting the SMI-S provider. The memory requirement has been adjusted accordingly. | Collects the same metrics as version 5.0.00 and is affected by the same factors, in terms of scalability. |
| v7.1.00 | Supports EMC SMI-S provider version 8 (starting at 8.0.3). Stability has been improved when interrupting the collect and the loading has been fixed to prevent failures. | Collects the same metrics as version 6.0.00 and is affected by the same factors, in terms of scalability. |

Sentry’s EMC SMI-S Agent ETL for TrueSight CO version 10.5.00

Version Note

Version 10.5.00 supports EMC SMI-S provider version 8 (starting at 8.0.3) and EMC Celerra 8.1.0.

Scalability

Version 10.5.00 collects the same metrics as version 7.1.xx and is affected by the same factors, in terms of scalability.

The table below summarizes the maximum number of volumes that can be handled by version 10.5.00 of the EMC ETL, depending on the service.period parameter.

| Discovery Cycle | System Collect Cycle (service.period) | Storage Collect Cycle | Volume Collect Cycle | Maximum Number of Volumes |
| --- | --- | --- | --- | --- |
| 24h (default) | 15 minutes (default) | 1h (default) | 24h (default) | 30,000 |
| 24h | 1h | 6h | 24h | 80,000 |
| 24h | 2h | 6h | 24h | 120,000 |

Note: EMC does not recommend managing more than 50,000 volumes per SMI-S Provider.

Refer to the documentation for the procedure to configure the service.period parameter.

Sentry’s EMC SMI-S Agent ETL for TrueSight CO version 10.7.00 and higher

| Version | Version Note | Scalability |
| --- | --- | --- |
| v10.7.00 / 10.7.01 | Support EMC SMI-S provider version 8.3.0.1 and EMC Celerra 8.1.0. | Collect the same metrics as version 10.5.00 and are affected by the same factors, in terms of scalability. |
| v11.0.00 / 11.3.01 | Support EMC SMI-S provider version 8.3.0.3 and EMC Celerra 8.1.0. EMC SMI-S provider version 8.3.0.3 is a minimum requirement for VMAX storage systems. | Collect the same metrics as version 10.5.00 and are affected by the same factors, in terms of scalability. |