Capacity Management Process - HS2016


Capacity Management


Contents
1. PURPOSE
2. STRUCTURE OF THE DOCUMENT
3. SCOPE
4. GENERAL ASSUMPTIONS
5. CAPACITY MANAGEMENT FRAMEWORK
5.1 Capacity Management Interactions
5.2 Capacity Management Framework
5.3 Capacity Planning
5.4 Resource Planning
5.5 Monitoring and threshold requirements
5.6 Capacity threshold matrix
5.7 Capacity database
5.8 Reporting Performance
5.8.1 Analyze Performance
5.8.2 Tuning and implementation
6. CAPACITY MANAGEMENT COMPONENTS
6.1 Process Model
6.2 Roles and responsibilities
7. REFERENCE
7.1 Business Rules
7.2 Risk
7.3 Quality Attribute
7.4 Data Quality Dimension
7.5 Operation Policy
7.6 KPI
7.7 Critical To Quality CTQ
7.8 Abstract Time-Scale
7.9 SLA Terms
8. GLOSSARY/ACRONYMS

Hartono Subirto - 2010

1. PURPOSE

The purpose of this document is to establish a Capacity Management process for NOC
Operations to ensure that all the services are supported by adequate and proper resources
and storage capacity. Furthermore, the purpose is to provide a point of focus and
management for all capacity and performance issues related to both services and resources.

The objectives of capacity management include:


• Provide advice and guidance to the VENDOR network Operations on all capacity and
performance related issues
• Ensure that service performance achievements meet or exceed the agreed
performance targets, by managing the performance and capacity of both services
and resources
• Assist with the diagnosis and resolution of performance and capacity related
incidents and problems
• Assess the impact of all changes on performance and capacity of all services and
resources
• Ensure that proactive measures are implemented to improve the performance of
services.


2. STRUCTURE OF THE DOCUMENT

The document comprises the following chapters:

Chapter–3: Scope: This chapter presents the scope of the document and the
process.

Chapter–4: General Assumptions: This chapter presents the underlying
assumptions for both the document and the process.

Chapter–5: Capacity Management Framework: This chapter presents the tailored
ITIL framework used to engineer the process.

Chapter–6: Capacity Management Components: This chapter depicts and specifies
the process rigorously, using BPMN and process specification templates.

Chapter–7: Reference: This chapter presents the details supporting the Capacity
Management process in tabular format. It covers Business Rules, Risk, Quality
Attribute, Data Quality Dimension, Operation Policy, KPI, CTQ, Abstract Time-Scale and SLA
Terms.


3. SCOPE

The scope of the Capacity Management process is to proactively monitor the services
provided to OPERATOR and the resources used in VENDOR network Operations to deliver
those services. It applies to the NOC Operations phase and includes:
• Monitoring the utilization of resources dedicated to OPERATOR
• Monitoring the utilization and status of physical and logical links between equipment
• Monitoring the utilization of resources at NOC facilities

Activities in the process include:


• Monitoring, analyzing, tuning, and implementing necessary changes in resource
utilization
• Producing a capacity plan that documents current utilization and the requirements,
as well as support costs for new applications or releases
• Application sizing to ensure required service levels are met
• Storage capacity management


4. GENERAL ASSUMPTIONS

The following are the General Assumptions included in this process:


• This is a level 1 document that represents the Capacity Management at a high level
• Information marked as TBD in the reference section is not mandatory at this stage
and will be completed in the next release of the document
• An event ticket is generated if the defined capacity threshold is violated
• NOC will notify Operations management of any threshold violation where there is a
gradual increase in capacity utilization
• Operations responds to the threshold notification, initiates corrective actions,
and provides NOC with adequate capacity or resources
• An RFC is generated for any capacity or network resources upgrade to NOC


5. CAPACITY MANAGEMENT FRAMEWORK

5.1 Capacity Management Interactions

Interactions with other processes and the information exchanged:

• Availability Management - Availability management determines the causes of
unacceptably slow performance. Unacceptably slow performance for a business
service can have almost the same impact as if the service was unavailable. As
such, availability management needs to understand and use the data reported on
performance degradation.
• Configuration Management - Configuration management requires a CMDB. Because
the CDB is a subset of the CMDB, the CDB stores resource and capacity usage
information about the configuration items in the CMDB. This linkage is key,
since information provided by capacity management is shared with other
processes via the CMDB.
• Change Management - Change management needs capacity information to assess
the “before and after” impact of changes on capacity. The CDB provides a
"before" picture for any configuration changes or upgrades. Prior to making
actual changes, the CDB is a reference source for determining the effects of
the proposed changes. Once changes are made, the CDB also assists with
monitoring results after the configuration change, to ensure it had the
desired effect.
• Service Level Management - Service level management benefits from a reliable
source of performance data that can be used to set reasonable goals and to
measure the achievement of those goals in production. For all performance
related SLAs, the CDB becomes the database of record, tracking compliance with
the agreements.
• Incident Management - Incident management needs to monitor data stored in the
CDB to generate incidents when exceptions are noted. Incident management and
capacity management are reciprocally related: incident management tells
capacity management about incidents related to capacity and performance, while
capacity management resolves and documents capacity related incidents.
• Problem Management - Problem management needs the ability to review
utilization and service data to quickly identify the root cause of performance
problems. This data is needed on both a real-time and a historical basis, to
see the current situation and the history building up to the event. The CDB is
a key source of this data.
• Service Continuity Management - Service continuity management uses
information in the CDB to model the impact of various failures, disaster
recovery, or business demand change scenarios.


5.2 Capacity Management Framework


The diagram below depicts the framework of Capacity Management in VENDOR network
Operations.

5.3 Capacity Planning

The main task of Capacity Management is to draw up the Capacity Plan. The Capacity Plan
includes:

• All the information about the capacity of the VENDOR network Operations
• Forecasts of future needs based on trends, business forecasts and existing SLAs
• The changes needed to adapt the VENDOR network capacity to technological
changes and the emerging needs of users and customers
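
The trend-based part of the forecast listed above can be sketched as a least-squares line fitted to past utilization samples and extrapolated to a threshold. This is only an illustrative sketch: the function names, the sample values and the 80% threshold are hypothetical, and an actual Capacity Plan would also weigh business forecasts and existing SLAs.

```python
from typing import Optional, Sequence, Tuple


def linear_trend(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for samples taken at t = 0, 1, 2, ..."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_t = (n - 1) / 2
    mean_y = sum(samples) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(samples))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    return slope, mean_y - slope * mean_t


def periods_until_threshold(samples: Sequence[float],
                            threshold: float) -> Optional[float]:
    """Periods after the last sample until the fitted line reaches threshold.

    Returns None when utilization is flat or falling, and 0.0 when the
    fitted line has already reached the threshold.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    t_cross = (threshold - intercept) / slope
    return max(0.0, t_cross - (len(samples) - 1))


# Hypothetical monthly utilization (%) of one link, with an assumed 80% threshold.
monthly_utilization = [50.0, 55.0, 60.0, 65.0]
remaining = periods_until_threshold(monthly_utilization, 80.0)
```

With these assumed figures utilization grows by five points per month, so the fitted line reaches the threshold three months after the last sample.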

The information for producing the Capacity Plan comes from various sources. Inputs
required for this are:

• Service Level Agreements
• Requests from Problem Management to look into capacity issues for the analysis of
problems
• Impact assessment of changes on the capacity of infrastructure
• Business strategy and plans that provide information on future business growth
• External suppliers of new technology
• Deployment/development plans and programs
• Change Management process

5.4 Resource Planning

An essential element in Capacity Management is to assign appropriate personnel,
hardware and software resources to each service and application.

To do so, Capacity Management needs reliable information about:

• Agreed and/or envisaged levels of service
• Expected levels of performance
• Impact of the application or service on the customer's business processes
• Margins of security and availability
• Associated costs of hardware and other IT resources needed

5.5 Monitoring and threshold requirements


Monitors and thresholds are defined to ensure that the performance of the IT infrastructure
matches the OPERATOR requirements as well as the operational needs.

The activities involved include:


• Identify the critical IT network resources related to each of the business processes
that need adequate capacity. These include:
a. Databases
b. Hardware
c. Network components
d. Data files
e. Network Connectivity
• Define detailed monitors and thresholds - after the requirements are identified,
actual thresholds per service and component are defined
• Adjust or implement monitor and threshold - when the monitors and thresholds are
defined they can be implemented. If the monitors already exist, a modification is
made to the existing monitor. Implementing monitors and thresholds is part of
adjusting systems and should therefore always follow change management
procedures
• Watch monitor and threshold - after monitors and thresholds are defined they are
monitored against future requirements. The requirements can change based on
business requirements. Therefore, the requirements for monitoring are matched
against the actual monitors and thresholds on a regular basis.
• Management should ensure that the performance of information technology
resources is continuously monitored and exceptions are reported in a timely and
comprehensive manner.
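
The "define" and "watch" activities above can be illustrated by matching monitored readings against defined thresholds and raising an event ticket for each violation. This is a minimal sketch under stated assumptions: the class names, the function name and the numeric limits are hypothetical, since this document does not specify numeric thresholds.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Threshold:
    object_name: str  # monitored object, e.g. "Logical interface"
    counter: str      # counter name, e.g. "Total traffic utilization"
    limit: float      # hypothetical limit, in percent


@dataclass(frozen=True)
class EventTicket:
    object_name: str
    counter: str
    observed: float
    limit: float


def check_thresholds(thresholds: List[Threshold],
                     readings: Dict[Tuple[str, str], float]) -> List[EventTicket]:
    """Return one event ticket per reading that exceeds its defined limit."""
    tickets = []
    for th in thresholds:
        observed = readings.get((th.object_name, th.counter))
        # A missing reading is skipped here; a real monitor might flag it instead.
        if observed is not None and observed > th.limit:
            tickets.append(EventTicket(th.object_name, th.counter, observed, th.limit))
    return tickets


# Hypothetical thresholds and readings for two monitored objects.
defined = [
    Threshold("Logical interface", "Total traffic utilization", 80.0),
    Threshold("Diesel Generator", "Diesel consumption", 90.0),
]
latest = {
    ("Logical interface", "Total traffic utilization"): 86.5,
    ("Diesel Generator", "Diesel consumption"): 40.0,
}
violations = check_thresholds(defined, latest)
```

Keeping thresholds as data rather than code means that adjusting a monitor is a data change, which fits the requirement that such adjustments follow change management procedures.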
5.6 Capacity threshold matrix
The table below lists the components under monitoring and their threshold
levels:

Object | Category | Counter | Instance | Threshold
BSS | Component | Total processes running | per device | Real Time
RF | Component | utilized | |
TX | Component | | | Not applicable
Diesel Generator | Infrastructure | Diesel level | | Real Time
Physical Interface | Component | Nr of ports/slots available; Total traffic utilization | per device | Real Time
Logical interface | Component | Total traffic utilization | per device | Real Time


5.7 Capacity database


The CDB covers all the business, technical and service information received and generated
by Capacity Management in relation to the capacity of the infrastructure and its elements.
Ideally, the CDB is interrelated with the CMDB so that the CMDB is able to give a complete
image of the systems and applications and includes all the information about their capacity.
However, the two databases are "physically independent".
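
The relationship described above, two physically independent stores linked through the configuration item, can be sketched as two separate collections joined on a shared identifier. This is an illustrative sketch only: the record fields, the `ci_id` key and the function name are assumptions, not part of the documented design.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ConfigurationItem:
    """Hypothetical CMDB record."""
    ci_id: str
    name: str


@dataclass
class CapacityRecord:
    """Hypothetical CDB record, linked to the CMDB only through ci_id."""
    ci_id: str
    counter: str
    utilization_pct: float


def capacity_view(cmdb: Dict[str, ConfigurationItem],
                  cdb: List[CapacityRecord],
                  ci_id: str) -> Optional[dict]:
    """Combine a CMDB item with its CDB capacity data into one picture."""
    ci = cmdb.get(ci_id)
    if ci is None:
        return None
    usage = {r.counter: r.utilization_pct for r in cdb if r.ci_id == ci_id}
    return {"ci_id": ci.ci_id, "name": ci.name, "capacity": usage}


# Two independent stores with hypothetical content.
cmdb_store = {"CI-1": ConfigurationItem("CI-1", "core-router-01")}
cdb_store = [
    CapacityRecord("CI-1", "Total traffic utilization", 72.0),
    CapacityRecord("CI-2", "Total traffic utilization", 15.0),
]
view = capacity_view(cmdb_store, cdb_store, "CI-1")
```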


5.8 Reporting Performance


It is essential to prepare reports allowing the performance of Capacity Management to be
evaluated. The documentation drawn up includes information about:
• The utilization of resources
• Deviations of the real capacity utilization from the planned thresholds
• Analysis of trends in the use of capacity
• Metrics established for capacity analysis and performance monitoring
• Impact on quality of service, availability and other IT processes
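
As an illustration of the second report item, the deviation of real utilization from the planned threshold can be computed per resource and sorted so that the largest overruns appear first. The resource names and figures below are hypothetical, and the row layout is an assumption rather than a prescribed report format.

```python
from typing import Dict, List, Tuple

# One report row: (resource, actual utilization, planned threshold, deviation)
ReportRow = Tuple[str, float, float, float]


def deviation_report(planned: Dict[str, float],
                     actual: Dict[str, float]) -> List[ReportRow]:
    """List deviations of actual utilization from planned thresholds."""
    rows = []
    for resource, threshold in planned.items():
        if resource not in actual:
            continue  # no measurement available for this resource
        used = actual[resource]
        rows.append((resource, used, threshold, round(used - threshold, 2)))
    # Largest positive deviation (worst overrun) first.
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Hypothetical planned thresholds and measured utilization, in percent.
planned_pct = {"link-a": 70.0, "link-b": 80.0}
actual_pct = {"link-a": 82.5, "link-b": 64.0}
report = deviation_report(planned_pct, actual_pct)
```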

5.8.1 Analyze Performance


The information that is gathered is analyzed. Analysis of data results in identification of
issues like:
• Inappropriate distribution of workload across available resources (e.g., CPU load
is not balanced across several CPUs)
• Unexpected increase in the use of a service or resource
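
Both issues can be expressed as simple checks on the gathered data. The sketch below is one possible way to do so: the function names, the 30-point spread and the 1.5 factor are hypothetical tuning values, not figures taken from this process.

```python
from statistics import mean
from typing import Sequence


def is_unbalanced(loads: Sequence[float], max_spread: float = 30.0) -> bool:
    """True when the gap between the busiest and idlest resource exceeds max_spread."""
    return (max(loads) - min(loads)) > max_spread


def is_unexpected_increase(history: Sequence[float], latest: float,
                           factor: float = 1.5) -> bool:
    """True when the latest value exceeds factor times the historical mean."""
    return latest > factor * mean(history)


# Hypothetical per-CPU load (%) and a short utilization history for one service.
cpu_loads = [95.0, 20.0, 15.0, 10.0]
service_history = [40.0, 42.0, 38.0]

unbalanced = is_unbalanced(cpu_loads)
spike = is_unexpected_increase(service_history, 70.0)
```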

5.8.2 Tuning and implementation


Tuning activities are small ‘configuration’ changes that are pre-approved by the Change
Management process. Operational staff perform them on a day-to-day basis to ensure that
agreed service levels are met or that resources are used more effectively. When an action
is needed that is not on the pre-approved list, a Change request is initiated to make sure
future capacity requirements are met.

• Tuning activities are activities that result in zero downtime for the users during
the agreed service hours. Some techniques that are applied are balancing disk
load or processor usage across platforms.
• Two types of Change request are initiated:
o A Change request to make an adjustment to the pre-approved list of
tuning activities
 o For major platform changes like deploying new systems or network
 components, the situation is reported to operations for necessary
 corrective actions. After business approval is received, the change is
 implemented as a Non-PAC order from the project management.
• Define action - determine the action that is required to meet service level
agreements or predefined thresholds. The information for this activity comes
from matching actual use against thresholds set.

• Implement Change - when the implementation is done, tests are conducted to
check if all operational services and systems are working as they did before the
modification
• Log modification - the modification is logged in the service management tool to
ensure that processes, like incident and problem management, know which
modifications are executed. These processes need this information in case
incidents arise from the implemented modification
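
The routing described in this section (pre-approved tuning, a Change request to extend the pre-approved list, or a Non-PAC order for major platform changes) can be sketched as a small decision function. The contents of the pre-approved list and all names below are hypothetical; only the three routes come from the text above.

```python
from enum import Enum


class Route(Enum):
    TUNING = "pre-approved tuning activity"
    CHANGE_REQUEST = "Change request to adjust the pre-approved list"
    NON_PAC_ORDER = "Non-PAC order after business approval"


# Hypothetical pre-approved list; the real list is owned by Change Management.
PRE_APPROVED = {"balance disk load", "balance processor usage"}


def route_action(action: str, major_platform_change: bool = False) -> Route:
    """Decide how a proposed capacity action is handled."""
    if major_platform_change:
        # e.g. deploying new systems or network components
        return Route.NON_PAC_ORDER
    if action in PRE_APPROVED:
        return Route.TUNING
    return Route.CHANGE_REQUEST


chosen = route_action("balance disk load")
```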


6. CAPACITY MANAGEMENT COMPONENTS

6.1 Process Model


Process Specification

Specification: Description

Summary/Purpose: To establish a Capacity Management process for NOC
operations to ensure that all the services are backed by adequate and proper
resources and storage capacity. Furthermore, the purpose is to provide a point
of focus and management for all capacity- and performance-related issues,
relating to both services and resources.

Scope: This is a level 1 Process Specification

Primary ITIL Reference: Capacity Management

Related ITIL Practices: Incident Management, Change Management, Configuration
Management, Problem Management, IT Service Continuity Management, Availability
Management, Service Level Management.

Related Business Driver:
• Ensure adequate resource capacity at all levels
• Utilization of resources is at optimum within threshold

Related Operational Policies: OP-001, OP-002 (Ref: 7.5)

Assumptions:
• An event ticket is generated if the defined capacity threshold is violated
• NOC will notify operations management of threshold violations with a gradual
increase in capacity utilization
• Operations will respond to the threshold notification and will initiate
corrective actions
• An RFC is generated for any capacity or network resources upgrade

Trigger:
• Business or capacity plan
• RFC for building capacity or providing new services

Basic Course of Event:
1. Obtain necessary information
2. Define capacity threshold limits
3. Monitor the performance
4. Generate the reports
5. Review the performance
6. Continue monitoring and close

Alternative Path: Threshold Violation found
1. Generate Event ticket
2. Perform Diagnosis
3. Gradual increasing trend found
4. Notify operations with necessary information
5. Initiate corrective action
6. End

Exception Path: Abrupt Event observed
1. Diagnose the event
2. Update the ticket
3. Assign to Problem Management
4. Close the ticket

Extension points: Configuration Management, Service Level Management,
Incident Management

Preconditions:
1. CDB (Capacity Management Database)
2. Identity and Access Management systems

Post-conditions:
1. Utilization within threshold
2. Updated capacity
3. Utilization reports

Related Business Rules*: BR-001, BR-002, BR-003 (Ref: 7.1)
Related Risks*: RSK-001 (Ref: 7.2)
Related Quality Attributes: QA-001, QA-002 (Ref: 7.3)
Related Data Quality Dimensions: DQ-001, DQ-002 (Ref: 7.4)
Related Primary SLA Terms: SLA-001 (Ref: 7.9)
Related KPIs: KPI-001 (Ref: 7.6)
Related CTQs*: CTQ-001 (Ref: 7.7)

Actors/Agents: OSS (NMS), Operations, NOC (Ref: Table 6.3)

Delegation:
Delegation Rule -1: Agent Not Available
1. Delegate the Issue to additional Agent with same Role
2. Update the Issue
3. Log the Delegation
Delegation Rule -2: Agent Overloaded
1. Delegate the Issue to additional Agent with same Role
2. Update the Issue
3. Log the Delegation

Escalation:
Escalation Rule: Capacity utilization found beyond threshold, notify
1. Operations
Escalation Rule: Corrective action is delayed, escalate to
2. Management

Process Map: Section 5.1, Section 5.2
Process Model: Section 6.1

Other References:
• Timescale
• Escalation

6.2 Roles and responsibilities


Roles: Responsibilities

Operations:
• Review the reports submitted on utilization
• Initiate corrective actions when threshold violation is notified
• Guide NOC for maintaining adequate capacity
• Provide adequate resources to NOC for operations
• Review and process the recommendation provided by NOC

NOC:
• Define utilization levels or thresholds
• Monitor the utilization
• Handle the events related to threshold violation
• Notify operations of capacity issues
• Generate reports and review with the operations
• Analyse the performance and recommend actions
• Perform trend analysis on utilization
• Facilitate operations in forecasting the capacity needs

OSS (NMS):
• Monitor the resources utilization
• Generate event tickets if threshold violation observed


7. REFERENCE

7.1 Business Rules


BR ID | Description | Context | Rule Source
BR-001 | Event ticket is generated if the defined capacity threshold is violated | Operations | NOC OSS
BR-002 | An RFC is generated for any capacity or network resources upgrade | Operations | NOC
BR-003 | Operations will respond to the threshold notification and will initiate corrective actions | Operations | Operations


7.2 Risk
Risk ID | Description | Source | Severity Level | Status | Resolution
RSK-001 | Threshold levels not defined and monitored for a resource | Operations | 4 | |

7.3 Quality Attribute


QA ID | Description | Threshold
QA-001 | Authenticity |
QA-002 | Non Repudiation |


7.4 Data Quality Dimension


DQ ID | Description | Threshold
DQ-001 | Accuracy |
DQ-002 | Timeliness |

7.5 Operation Policy


Policy ID | Description | Context | Importance (1-5)
OP-001 | One service desk agent per shift | Shift | 5
OP-002 | Service Desk and Incident Management are available 24x7 | Operations | 5

7.6 KPI
Name | Acronym | Description | Context | Importance | Soft Threshold | Hard Threshold
KPI-001 | ANDPR | Number of cases wherein appropriate action is not defined to achieve SLA or proposed recommendations | Operations | 5 | TBD | TBD


7.7 Critical To Quality CTQ


Name | Acronym | Description | Context | Importance | Soft Threshold | Hard Threshold
CTQ-001 | TLV | No of cases wherein threshold level is violated | Operations | 5 | |

7.8 Abstract Time-Scale


Name | Acronym | Description | Quantification
ATS-001 | DCAATN | Delay in corrective actions after threshold notifications |

7.9 SLA Terms


SLA ID | Description | Context | KPI | OPI | CTQ
SLA-001 | Capacity utilization reporting to the customer | Operations | | |


8. GLOSSARY/ACRONYMS

Terminology Description
Abrupt Event A sudden change in the capacity level due to an abnormal action
leading to an event or incident
BPMN Business Process Modelling Notation
CDB Capacity Database
CMD Capacity Management Database
CMDB Configuration Management Database: a database used to store configuration
records throughout their lifecycle
CPU Central Processing Unit
CTQ Critical to Quality
IPS Intrusion Prevention System
IT Information Technology
ITIL Information Technology Infrastructure Library
KPI Key Performance Indicator
OPERATOR
VENDOR
NMS Network Management System
NOC Network Operations Centre
OSS Operations Support Systems
PAC Pre Approved Changes
PC Personal Computers
RFC Request for Change
SLA Service Level Agreement
VLAN Virtual Local Area Network
VRF Virtual Routing & Forwarding
ACR Added Capacity Request
GATS GSM Automatics Tracking System
AMR Antenna Modification Request
PCR Parameter Change Request
SCO Site Change Order
SSO Site Search Order
