Oil Refinery 2


Journal of Loss Prevention in the Process Industries 22 (2009) 244–253


Development of Risk-Based Inspection and Maintenance procedures for an oil refinery

M. Bertolini a,1, M. Bevilacqua b,2, F.E. Ciarapica b,*, G. Giacchetta b,3

a Department of Industrial Engineering, University of Parma, Viale delle Scienze, 181/A 43100 Parma, Italy
b Energy Department, University Politecnica delle Marche, via Brecce Bianche, 60100 Ancona, Italy

Article info

Article history:
Received 16 January 2007
Received in revised form 10 January 2009
Accepted 19 January 2009

Keywords:
Risk management
Safety management
Maintenance
Work order management

Abstract

The management of failure analysis is strategically important within a refinery from the organizational, engineering and economic points of view. An algorithm that allows a methodical and, as far as possible, automatic approach to the management of failure data can substantially improve the organization of work and the decision-making processes.
A panel of experts, made up of academics and refinery operators, was formed in order to develop a Risk-Based Inspection and Maintenance (RBI&M) procedure. The RBI&M procedure developed comprises six modules: identification of the scope, functional analysis, risk assessment, risk evaluation, operation selection and planning, J-factor computation and operation realization. Taking into consideration historical data regarding Near Accidents, Operating Drawbacks, and Occupational and Environmental Accidents that occurred in the refinery over recent years, the panel of experts defined a risk matrix in order to evaluate the risk associated with critical events and maintenance activities. Five probability classes and five severity categories, which take into account four impact categories (Health and Safety, Environmental, Economic and Reputation), have been defined.
This paper reports the application of the RBI&M method to two specific stages in the maintenance activities of the refinery, i.e. the oil refinery turnaround and work orders management.
The panel of experts developed heuristic methods in order to apply the RBI&M procedure to the two cases, allowing the refinery to minimize the overall risk while taking into consideration the limits in terms of time and budget (in the turnaround case) and of human resources (in the management of work orders). The results have highlighted a clear improvement in the indices which measure the quality of maintenance.
© 2009 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +39 071 2204435; fax: +39 071 2204770.
E-mail addresses: [email protected] (M. Bertolini), m.bevilacqua@univpm.it (M. Bevilacqua), [email protected] (F.E. Ciarapica), g.giacchetta@univpm.it (G. Giacchetta).
1 Tel.: +39 0521 905861; fax: +39 0521 905705.
2 Tel.: +39 071 2204874; fax: +39 071 2204770.
3 Tel.: +39 071 2204763; fax: +39 071 2204770.

0950-4230/$ – see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jlp.2009.01.003

1. Introduction

Safety in a refinery relies, among other things, on the adopted management criteria. These affect the whole plant life cycle: from plant design and construction, through production, to possible dismantling. Safety management strategies for critical systems involve multiple dimensions including design philosophy, maintenance policies, and procedures of personnel hiring, training, and evaluation (Cowing, Cornell, & Glynn, 2004). At one end of the spectrum, the most conservative approaches rely on a robust system design, frequent preventive maintenance, and early response to warnings. At the other end, aggressive strategies are driven by demanding production schedules, single-string system designs, and minimal inspection and maintenance to obtain maximum production with minimum interruptions. The difference, of course, lies in the immediate costs and in the resulting level of system failure risk (Baron & Cornell, 1999). According to Krishnasamy, Khan, and Haddara (2005), a Risk-based maintenance (RBM) approach helps in designing an alternative strategy to minimize the risk resulting from breakdowns or failures. Adopting a risk-based maintenance strategy is essential in developing cost-effective maintenance policies. Critical equipment can be identified based on the level of risk and a pre-selected acceptable level of risk. Maintenance of equipment is prioritised based on the risk, which helps in reducing the overall risk of the plant.
In this paper we propose a Risk-Based Inspection and Maintenance (RBI&M) methodology that, using the synergies provided by the simultaneous adoption of risk analysis and reliability
management methods, enables considerable changes to be made with a view to the production of servicing plans that ensure greater reliability at the lowest possible cost. In particular, two applications of the RBI&M method to two specific stages in the maintenance activities of a refinery, i.e. the oil refinery turnaround and work orders management, are reported in this work. The refinery analysed in this work is sited in Ancona (Italy) and currently has a processing capacity of 3,900,000 ton/year of crude oil, a storage capacity of more than 1,500,000 m³ and the ability to receive tankers and super-tankers up to a tonnage of 400,000 tons. Processing at the refinery is based essentially on a topping, catalytic reforming, isomerisation, vacuum, visbreaking, and thermal cracking cycle, which is organized in a series of operational sections that form interconnected functional units (Fig. 1). The plant has a closed loop water system which is able to deliver up to 7000 m³/h, and a fire fighting system which is able to supply up to 3000 m³/h of sea water. An integrated gasification combined cycle plant of 287 MW power is currently starting up its operations. This plant burns a synthesis gas obtained from a gasification plant for heavy oil refining products, whose production capacity is equal to 1250 ton/day. This plant has auxiliary oxygen production, gas washing, sulphur recovery, effluent treatment and heavy metals recovery utilities. The monthly feed of 340,000 ton to the primary distillation process (Topping) comprises 300,000 ton of oil and 40,000 ton of primary residuum. The main oil refinery processing plants are formed by two recently introduced distillation units: an atmospheric one, with a production capacity of 10,500 ton/day, and a vacuum one, whose distillation capacity is equal to 2500 ton/day.
The light distillation fractions, mainly formed by liquid petroleum gas (LPG) and petrol, feed a hydrogenation process (Unifining), which is used to stabilise some components and to remove undesired elements such as sulphur. After the hydrogenation process, LPG is ready for use while petrol undergoes further processing to enhance its octane number (isomerisation, to produce light petrol without aromatics, and platforming, to obtain a very high octane number). The middle distillation fraction (kerosene, light and heavy gas oil) is subjected to a desulphurisation process (HDS1 and HDS2 plants), while the heavy fraction and the distillation residuum are processed through cracking plants (thermal cracking and visbreaking), to improve the oil conversion by increasing the production of light products.

Fig. 1. Oil refinery simplified cycle.

The rest of the paper is organized as follows. Section 2 is divided into three parts: in the first, the organization of the team that developed the methodology proposed in this paper is described; in Section 2.1 a bibliographical analysis of existing work on risk analysis and maintenance prioritization is presented; in Section 2.2 the RBI&M procedures are discussed. In Section 3, two case studies on the API refinery are used to illustrate the application of the proposed method. Finally, the conclusions are presented in Section 4.

2. Materials and methods

The development of RBI&M procedures and their application to two specific stages in the maintenance activities of a medium-sized refinery were carried out by a panel of experts.
The panel of experts was formed in order to encourage communication and meetings where the operators could contribute their knowledge and information about the processes. The panel was made up of 15 participants, and included 3 academics, whose research studies are mainly focused on risk analysis and maintenance management, 4 technical operators and 5 managerial operators involved in the maintenance processes, and 3 ApiSoi operators. The re-engineering of inspection and maintenance processes was also prompted by the arrival of a third party at the refinery (ApiSoi), called in to manage the maintenance activities on the basis of a global service contract. This number of participants, which at first sight may seem rather large, derives from the Delphi technique (Linstone & Turoff, 1975) adopted for working with panels. The Delphi technique is a structured process which investigates a complex or ill-defined issue by means of a panel of experts. This methodology proves to be an appropriate research design for this type of research and permits individual opinions to be obtained within a structured group and using a communicative process. The panel worked for a period of about two weeks, and the sessions were planned on a three-round Delphi process. At first a series of statements concerning the requirements of the turnaround process were generated individually and anonymously by the experts. All the statements were then collected and delivered to the members
of the panel, who were required to indicate their level of agreement; answers were finally fed back to the panel.

2.1. General characteristics of Risk-Based Inspection and Maintenance systems

Over the past few decades, maintenance strategies have progressed from primitive breakdown maintenance to more sophisticated strategies like condition monitoring and reliability-centred maintenance (Khan & Haddara, 2004). Another link in this chain of progress has recently been added by the introduction of a risk-based approach to maintenance. This approach has been suggested as a new vision for asset integrity management (ASME, 2000). Some authors (Krishnasamy et al., 2005; Kumar, 1998; Van Heel, Knegtering, & Brombacher, 1999) developed Risk-based maintenance strategies to provide a basis not only for taking the reliability of a system into consideration when making decisions regarding the type and the time for maintenance actions, but also for taking into consideration the risk that would result as a consequence of an unexpected failure. Most of the previous studies focused on a particular system and were either quantitative or semi-quantitative. In this work a Risk-Based Inspection and Maintenance methodology is proposed which integrates classic Reliability-Centred Maintenance (RCM) analysis with models of risk analysis and develops a system which takes into consideration not only reliability and economic aspects, but also the company's reputation and environmental impact.
There are different risk-based approaches reported in the literature, ranging from the purely qualitative to the highly quantitative. A brief review of some of the most frequently used approaches is presented here. Many authors used probabilistic risk assessment (PRA) as a tool for maintenance prioritization (Vesely, Belhadj, & Rezos, 1993). Balkey and Art (1998) developed a methodology, which includes risk-based ranking methods, beginning with the use of plant PRA, for the determination of risk-significant and less risk-significant components for inspection and the determination of similar populations for pump and valve in-service testing. This methodology integrates non-destructive examination data, structural reliability/risk assessment results, PRA results, failure data and expert opinions. Cowing et al. (2004) presented and illustrated a dynamic probabilistic model designed to describe the long-term evolution of such a system through the different phases of operation, shutdown, and possible accident. In addition to PRA, they used a Markov model to track the evolution of the system and its components through different performance phases.
Apeland and Scarf (2003) presented alternative probabilistic frameworks for optimisation of risk-based inspection using a Bayesian approach. The fully Bayesian approach discussed in that paper provides a straightforward means of presenting to decision makers the uncertainty related to future events. This approach is described in the context of an inspection maintenance decision problem, and is contrasted with the classic probabilistic approach (Baker & Wang, 1992; Christer, Wang, Baker, & Sharp, 1995) that assumes the existence of true probabilities and probability distributions which have to be estimated. Wang and Christer (1998) proposed a model of the safety inspection process, for the expected consequence of inspections over a finite time horizon. A single dominant failure mode is modelled, which has considerable safety or risk consequences assumed to be measurable either in cost terms or in terms of the probability of failure over the time horizon. The paper establishes a pragmatic procedure for formulating objective functions which may be optimised to determine the optimal inspection intervals.
There has been an increased focus on risk-based maintenance optimisation in the offshore industry prompted by the recent functional regulations on risk. Bevilacqua, Braglia, and Gabbrielli (2000) presented a new tool for failure mode and effect analysis developed for a new Integrated Gasification and Combined Cycle plant in an important Italian oil refinery. The methodology is based on the integration between a modified Failure Mode Effect and Criticality Analysis (FMECA) and a Monte Carlo simulation as a method for testing the weights assigned for the measurement of risk priority numbers (RPNs). Harnly (1998) developed a risk ranked inspection recommendation procedure that is used in one of Exxon's chemical plants to prioritise repairs that are identified during equipment inspection. The equipment is prioritised based on the severity index, which is failure potential combined with consequences of failure. Dey (2001) presented a risk-based model for inspection and maintenance of a cross-country petroleum pipeline that reduces the amount of time spent on inspection.
Risk-based maintenance strategies can also be used to improve the existing maintenance policies through optimal decision procedures in different phases of the risk cycle of a system. Studies by Khan and Abbasi (1998), Kumar (1998) and Todinov (2003) show a strong relationship between maintenance practices and the occurrence of serious accidents. An assessment of the impact of preventive maintenance on the failure characteristics of a piece of equipment in the field of competing risks was discussed by Bedford (2004). Profitability is closely related to the availability and reliability of the equipment. Khan and Haddara (2004) introduced a risk-based maintenance methodology. This methodology aims at reducing the overall risk that may result as a consequence of unexpected failures of operating facilities. By assessing the level of risk caused by the failure of each component, one can prioritise the maintenance tasks for the components of the system. The same authors (Khan, Sadiq, & Haddara, 2004) proposed a risk-based maintenance and inspection approach to improve the maintenance operations schedule and minimize failure risk by developing an optimal inspection strategy. They present a structured approach for equipment risk analysis using a fuzzy method. Four case studies are developed and they show that the robustness of the results depends firstly on human expertise. They also point out the impact of the risk formalism on the assessment of occurrence, consequences and then the judgement of risk level.
Arunraj and Maiti (2007) made a synthesis of 25 RBM methods, presenting the different steps and describing their main drawbacks. The review of these methods showed that there is no unique way to perform risk analysis and risk-based maintenance. The application of these methodologies highly depends on the depth of the analysis, area of application and quality of results. According to these authors, the experience of the analysts in using these methodologies is a further important factor to consider.
More recently Meel et al. (2007) presented dynamic analyses of incidents that have occurred in chemical plants in the USA. Probability density distributions were formulated for their causes (e.g., equipment failures, operator errors, etc.), and associated equipment items utilized within a particular industry. Bayesian techniques provided posterior estimates of the cause and equipment-failure probabilities. Other authors (Mili, Bassetto, Siadat, & Tollenaere, 2009) have proposed a new approach to the use of FMECA as an operational tool which unveils productivity improvement areas. They demonstrated that it is possible to use the FMECA method in a more dynamic environment, continuously updated by operational events. The article proposes a risk-based maintenance method, which relies on the regular and automatic update of equipment risk analyses including equipment-failure history.
The more recent papers point out that efforts need to be focused on data retrieval and update – as automatic as possible – in order to prevent the obsolescence of risk analyses. In particular, source of information, update frequency and risk estimation method must be
redefined continuously. This literature also reveals that risk analyses are considered through the perspective of their sustaining tools and methods. Particular attention has also been paid to connecting events and preliminary risk analyses as smoothly as possible.
In this work, taking into consideration historical data regarding critical events that occurred in the refinery over recent years, the panel of experts defined a procedure in order to evaluate the risk associated with critical items and maintenance activities. The most interesting part is the application of the RBI&M procedure to two specific cases in the maintenance activities of the refinery, i.e. the oil refinery turnaround and work orders management. The panel of experts developed heuristic methods in order to apply the RBI&M procedure to the two cases, allowing the refinery to minimize the overall risk while taking into consideration the limits in terms of time and budget (in the turnaround case) and of human resources (in the management of work orders).

2.2. Risk-Based Inspection and Maintenance procedure

RBI&M analysis is carried out by means of a sequence of steps, as shown in Fig. 2.

Fig. 2. Steps in RBI&M methodology. [Flowchart: identification of the scope; functional analysis (standard functions and performance, functional failures, failure mode, effects of failure); risk assessment (probability, exposure, and consequences for Health & Safety, environmental impact, reputation and economic loss); risk evaluation (acceptable risk? yes/no); choice of action (set task, repair task, substitution task, on condition task; default action: failure tracing, redesign); J-factor definition; action planning; feedback.]

2.2.1. Identification of the scope
The aims and scope of the analysis must be defined and classified: environmental aspects connected with the operational activities carried out, safety risk, investment, work orders management, etc. For this reason the requirements, regulations and adequacy criteria of the plant concerning safety and environmental protection must be made clear, since they are considered "boundary conditions" in RBI&M analysis. It is at this stage that the systems to which the RBI&M plan is to be applied are described: the respective "hierarchical models" of these systems are developed, identifying the plant as a whole and the role of each system within it, together with the sub-systems, single items and their components.

2.2.2. Functional analysis
Standard functions and performance: the functions of each piece of equipment are defined and their standard performance is identified so as to be able to recognise immediately if there is deviation from this performance. Prior to the real analysis of functional failure of the identified items it is necessary to determine their respective functions, distinguishing between:
- essential function: a function required of a given element and for which it has been specifically installed;
- auxiliary function: a support function for the essential function;
- protective function: a function aimed at protecting people, the plant and the environment when the item is working;
- information function: a function of monitoring and reporting operational parameters;
- interface function: an interface function between different items;
- superfluous function: a function which the item is able to perform but which is not relevant in the specific context.

The functions can also be classified as:

- on-line function: a function operating continuously or so frequently that the user is constantly aware of its state (this leads to evident failure);
- off-line function: a function which is used intermittently or infrequently and whose state cannot be known except by means of specific assessments and tests (this leads to hidden failure).

Functional failures: in general, for each function it is possible to determine a certain number of functional failures, which means that the equipment is unable to perform its standard operations.
Failure Mode: a functional failure may occur for various reasons. Knowing the cause means that the way to avoid the failure can be identified. In doing this, however, it is important not to enter into too much unnecessary detail so as to avoid time-wasting.
Effects of failure: Several effects can be associated with a failure mode, that is to say, what could happen if the failure mode being examined should occur.

Table 1
Assigning probability.

Class | Key word | Description of probability | 1 time every x years | Absolute value | Examples
A | Very rare | Virtually impossible | >20 | 0.001 | Fatal accident on the workplace; double emergency; Aircraft accident; Accident caused by lightning.
B | Rare | Unlikely | 3–20 | 0.05 | Malfunctioning of a control valve; Electric motor out of order; Accident on a plant with serious economic loss; Permanent injury; Environmental damage.
C | Occasional | May happen a few times, at least once | 1–3 | 0.3 | Unexpected breakdown of a pump; Black out; Leakage from a heat exchanger; Instrument out of order; Spills; Injury to a person on the workplace.
D | Probable | May happen several times | Once every six months | 0.5 | Unexpected shutdown (see Appendix 1); Limited losses; Opening of safety valve; Unpleasant smells.
E | Frequent | May happen often | >Once every six months | 1 | Slow down; Violation of traffic regulations; Samples in unsuitable or polluted containers.

Table 2
Indices of severity for Health and Safety, Environmental and Reputation categories.

Severity | Impact category | Description | Examples
1 | Health and Safety | Medication or injury with possible absence from work of between 1 and 3 days/discomfort in carrying out work | Grazes, first degree burns of limited extent
1 | Environmental | No perceptible environmental impact outside the workplace | Slight discharge of vapour from a coupling – smelly emissions within the refinery area
1 | Reputation | Minimal impact – no external impact on reputation | Slight discharge of vapour from a coupling visible outside the workplace
2 | Health and Safety | Injury or poor health with loss of ability to work and possible absence from work for up to 10 days | Sprains, dislocation, skin irritation
2 | Environmental | Brief duration/intensity above the self-imposed limits, but reversible | Slight but totally recoverable spill of product – slight release of smelly but non-toxic and harmless product noticed outside the workplace
2 | Reputation | Impact of brief duration and/or programmable entity | Limited flaring caused by a planned plant shutdown
3 | Health and Safety | Injury or occupational disease with loss of ability to work for an extended period of up to 30 days/occupational disease with reversible consequences | Torn muscle, second degree burns, fracture
3 | Environmental | Brief duration/intensity above the self-imposed limits, but reversible within the legal limits | Partially recoverable spill of product, value of daily SOx/NOx emission note above the limits, but recoverable in the monthly emission note
3 | Reputation | Local impact – impact of brief duration and/or unexpected entity | Limited flaring caused by an unexpected plant shutdown
4 | Health and Safety | Injury to more than one person/Injury with loss of ability to work for a period of more than 30 days and with possible multiple injuries caused by the same initial event; it includes any permanent physical or health impediments | Multiple fracture, third degree burns
4 | Environmental | Exceeding self-imposed environmental limits | Prolonged emission of bad smells, smoke, vapour, flame or loud noises noticeable outside the workplace
4 | Reputation | Regional impact – impact of considerable duration and/or programmable entity | Cleaning of a heater causing noise which can be heard outside the workplace
5 | Health and Safety | It includes the extreme possibility of death to individuals or groups due to the same initial event/Lethal exposure | Deaths
5 | Environmental | Exceeding legal environmental limits | Release of toxic-harmful products
5 | Reputation | National impact – impact of considerable duration and/or unexpected entity | Fire in a heater

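As an illustration only, the probability classes of Table 1 can be encoded as a small lookup that maps an observed return period to a class and its absolute value. This is a minimal sketch, not the authors' implementation: the names `ProbabilityClass`, `TABLE_1` and `classify_return_period` are hypothetical, and the handling of boundaries (3 and 20 years assigned to class B, 1 year to class C) and of intervals between six months and one year (assigned here to class D) are assumptions, since Table 1 does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbabilityClass:
    label: str             # class letter from Table 1
    key_word: str          # key word from Table 1
    absolute_value: float  # "Absolute value" column of Table 1


# Values transcribed from Table 1.
TABLE_1 = {
    "A": ProbabilityClass("A", "Very rare", 0.001),
    "B": ProbabilityClass("B", "Rare", 0.05),
    "C": ProbabilityClass("C", "Occasional", 0.3),
    "D": ProbabilityClass("D", "Probable", 0.5),
    "E": ProbabilityClass("E", "Frequent", 1.0),
}


def classify_return_period(years_between_events: float) -> ProbabilityClass:
    """Map an average interval between events (in years) to a Table 1 class.

    Boundary choices are assumptions made for this sketch.
    """
    if years_between_events <= 0:
        raise ValueError("the interval between events must be positive")
    if years_between_events > 20:
        return TABLE_1["A"]  # less than once every 20 years
    if years_between_events >= 3:
        return TABLE_1["B"]  # once every 3-20 years
    if years_between_events >= 1:
        return TABLE_1["C"]  # once every 1-3 years
    if years_between_events >= 0.5:
        return TABLE_1["D"]  # about once every six months (assumed up to 1 year)
    return TABLE_1["E"]      # more than once every six months


if __name__ == "__main__":
    for interval in (25, 10, 2, 0.5, 0.1):
        cls = classify_return_period(interval)
        print(interval, cls.label, cls.key_word, cls.absolute_value)
```

In the paper the class is assigned by the panel on the basis of historical data and the examples in Table 1; a rule of this kind could at most serve as a first proposal for the experts to confirm.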
2.2.3. Risk analysis
In order to analyse how critical a given functional failure is, it is necessary to carry out an initial evaluation of probability and an initial evaluation of consequences:

- Initial evaluation of probability: In the case of the refinery a team of experts defined some criteria, which are illustrated in Table 1, in order to create five probability classes, quantifying the probability that an event will occur, assigning a level, from A to E, to the possible scenario. Several examples were defined for every probability class by analysing refinery historical data. The same work was carried out in order to create and to explain the categories of consequences.
- Initial evaluation of consequences: 4 categories of possible consequences have been considered by the experts: effects in terms of potential injuries (Health and Safety), environmental impact, loss of reputation and economic loss. A level of severity, from 1 to 5, was then assigned to each impact category. The severity of the event can be decided with reference to Table 2.

The economic impact of failure is considered in the annual balance and in relation to production:

- damage to property results in costs for the repair and substitution of the equipment;
- the costs associated with the value of the product are loss of production, start-up costs, etc.

The decision-making criteria are illustrated in the "Economic impact" column of Table 3.

Table 3
Decision matrix.

Probability columns: VR = Very rare (virtually impossible); R = Rare (unlikely); O = Occasional (may happen a few times (1)); P = Probable (may happen several times); F = Frequent (may happen often). The multiplication factor, where one applies, is given in parentheses after the score.

Severity | Key word | Health and Safety | Economic | Environmental impact | Reputation impact | VR | R | O | P | F
1 | Minor | Discomfort: Medication/Accidents 1–3 days | <10,000 € Area/Dept | Minimal impact | Within company confines | 1 | 2 | 3 | 4 | 10 (2)
2 | Moderate | Poor health 3–10 days | 10,000–100,000 € Dept | Short-term impact | Surrounding areas | 2 | 4 | 6 | 16 (2) | 30 (3)
3 | Severe | Occupational disease: Reversible in 10–30 days | 100,000 to 1 million € Refinery | Short-term impact noted outside | Local territory | 3 | 6 | 18 (2) | 36 (3) | 60 (4)
4 | Very severe | Permanent damage to health: >30 days/Accident involving several people | 1–10 million € Branches | Transient reversible damage | Regional | 4 | 16 (2) | 36 (3) | 64 (4) | 80 (4)
5 | Catastrophic | Lethal exposures: fatal accident | >10 million € Group | Permanent environmental damage | National | 10 (2) | 30 (3) | 60 (4) | 80 (4) | 100 (4)

2.2.4. Risk evaluation
Risk evaluation is based on the decision-making matrix illustrated in Table 3, which summarises all the previously defined items of risk analysis.
The risk score which should be considered in the case of several competing items is the highest score found among the items considered. There is also a multiplication factor f in the matrix which allows the risk to be indicated numerically on a scale from 1 to 100: factor × 2 on the main diagonal (Medium to Low Risk); factor × 3 (Medium to High Risk); factor × 4 (High Risk).
In this context a risk is the product of 3 factors, i.e.:

Risk = Probability × Exposure × Consequence

Probability: the likelihood of the considered event occurring;
Exposure: the possible duration of the event;
Consequence: the outcome of the event.

Some guidelines, shown in Table 4, have been defined in order to evaluate exposure. This is a factor which may lessen the risk and it allows us to bear in mind how long people are subjected to a possible event.

2.2.5. Choice of action and calculation of the J-factor
If the team of experts considers that the level of risk evaluated is too high, either preventive action (action which reduces the probability of an event) or mitigating action (action which reduces the consequences of an event) is necessary. When possible, using a logical decision-making process based on the findings and on the limits of efficiency and functionality, the action to be taken should be identified (repair task, replacement task and on condition task).
If this is not possible, a "default" action should be decided, which may involve the re-designing of components, unscheduled maintenance or modifications to maintenance procedures.
For each improvement choice identified it is necessary to reconfirm the conditions of probability, consequence and exposure in order to determine the efficacy of preventive and mitigating action. In addition to the previously defined conditions, the team must also assess the most profitable alternative (that is to say, the one which gives the greatest risk reduction per invested euro) as opposed to taking no action at all. An index called Justificator Factor (or J-factor) has been used in order to help this type of analysis (footnote 4):

J-factor = (original risk − new risk) / cost

in which original risk is the initial risk calculated for the threat (Alternative 0), new risk is the risk of the alternative when completely implemented, and cost is the cost of implementing the alternative. The J-factor indicates the amount of risk reduction per euro invested for each alternative. It can therefore be used as a criterion for choosing between the various alternatives. The team must look for the alternative that produces the maximum reduction of risk per euro, that is to say, a high J-factor value. Finally, on the basis of the indications provided by the J-factor calculation, the chosen action can be scheduled.

4 This type of J-factor was defined in accordance with an internal procedure applied by the Shell Company (S-RCM, 2000) which was specifically created for the refinery sector.

Table 4
Assessment of Exposure.

Key word                     Exposure   Examples
Constant                     1          Emissions from a chimney
Frequent – Daily             0.6        Loading operations
Occasional – Weekly          0.3        Machine alternation
Unusual – Monthly            0.2        Transfer of fluid
Rare – Sometimes             0.1        A non-routine operation
Very Rare – Annual or less   0.05       Replacement of a catalyst

3. Two case studies

This work reports on the application of the RBI&M method to two specific stages in the maintenance activities of the API refinery: the annual turnaround and the work orders management.

3.1. Oil refinery turnaround (T/A)

Routine maintenance on the plant at the refinery, called a turnaround, consists in changing and/or restoring the plant's working conditions by taking action on its component parts in order to improve their efficiency. This servicing process affects both the equipment that cannot be isolated while the plant is in normal operation (Items in T/A) and any equipment that does not need to be emptied and cleaned, but that has posed problems in operation or needs to be inspected regularly (Items not in T/A). The aim of the process is to restore or improve energy efficiency, guarantee smooth operation, ensure the integrity of the safety systems, and contain wear and tear in order to prolong the equipment's working life as far as possible and ensure a clean working environment.
In practical terms, depending on the objectives, these measures become necessary to: increase production by means of overhauls and improvements, modernize the plant with the aid of more advanced technologies, remove load limitations, change the catalysts, enable essential inspections, improve plant performance by means of large-scale changes, overhaul some parts (e.g. pumps and/or compressors), inspect critical containers, evaluate the residual life of components involved in the turnaround, clean the equipment, and deal with any leaks that have developed during operation of the machinery. In many ways, managing a turnaround is like managing a design project, but it is even more demanding because all the measures have to be planned to fit into a very tight schedule (generally 3–4 weeks), during which time all the resources (human, material, technical and economic) must be efficiently amalgamated and adequately supported.
The RBI&M methodology has been used to overcome the habit of considering all the items that cannot be isolated while the plant is running as T/A items and all the items "conventionally" included in the work schedule as non-T/A items.
Using methods to facilitate the cost-risk-benefit assessment enables us to eliminate several items from the list of T/A equipment. The RBI&M method used for the Risk-Based Investigation helps us to evaluate the failure risk and the benefits deriving from any preventive measures (the product of the cost of the measures multiplied by the new probability of the failure's occurrence) and thus compare the failure risks. Fig. 3 shows a summary of the decision-making method applied for an optimal definition of the list of T/A equipment.

[Figure 3: diagram of the risk reduction decision. Panels: list of works ranked by risk score from High to Low (risk of NOT doing the work); alternatives to reduce the risk and residual risk scores (J-factor calculation); premises matrix of Musts and Wants for alternatives A, B, C, D. Outcomes: exclude from T/A; best alternative included in the T/A.]
Step 1: Risk definition
Step 2: List of alternatives
Step 3: Calculation of residual risk (J factor)
Step 4: Definition of Musts (conditions that are essential/consistent with premises)
        Definition of Wants (desirable conditions associated with desirability indexes [Low, Medium, High])
Step 5: Evaluation and choice of the best alternative using weight factors

Fig. 3. Decision-making method

3.1.1. Discussion

To give an idea of the significance of operations it is important to remember that in 2003 the API refinery turnaround lasted 18 days in all; maintenance activities involved a daily average of 500 workers, with peaks of up to 700, for a total of 100,000 h worked and about ten million euro of investment. That is why we implemented an RBI&M method that, starting from the way the turnaround had been done until 2003, led to the identification of the action needed to reorganize the turnaround and highlighted measures that could be taken to achieve better and better results in terms of performance and flexibility.
The performance control of the newly proposed turnaround management process is essential for its effective implementation. The validation of the new model focused on two main aspects: the attainment of the set targets and the absence of negative influence of the new procedure on the previous turnaround management, from an economic point of view.
Such controls were facilitated by the use of the Activity-Based Costing tool, which allowed us to estimate the turnaround process indicators. Table 5 reports the outcome of the analysis, relative to the 2003 / 2004 and 2004 / 2005 oil refinery turnaround.

Table 5
Percentage variation in working hours and costs through RBI&M.

Description                  2003 / 2004   2004 / 2005
Variation in working hours   20%           14%
Variation in costs           18%           9%

It is also important to point out that the re-definition of T/A items allowed the company to decrease the number of items included in the turnaround by 23% in 2004 and 9.5% in 2005, with respect to the 2003 situation. These results were accomplished by shifting to current maintenance the "non-core activities", i.e., the ones that from the point of view of technical difficulty and associated risk do not require a plant shutdown.

3.2. Work orders management

This work proposes a use of RBI&M methodology which aims at defining a strategy for the management of "work orders" which are received by the Maintenance Engineering Department. The term "work orders" refers to a whole set of activities such as replacement of components, plant servicing, orders to purchase new components, failure analysis, etc. The responsibilities identified for the correct functioning of the procedure involve both the workers and the Maintenance Engineering (ME) Department of the refinery. As far as the workers are concerned, each head of a shift, when issuing a work order (WO) in the light of a Near Accident, Operational Accident, Injury or Environmental Accident (see Appendix 1), must properly fill in all the items required by the Computerised Maintenance Management System (CMMS) present in the refinery. The ME Department must compare all the internal reports of non-conformity drawn up for each function of the refinery with the work orders on the CMMS.
This procedure, which must be repeated weekly, allows the identification of those critical items to which RBI&M methodology must be applied (Fig. 4). The procedure is based on the concept that, by identifying the causes and determining corrective action, the analysis of undesired events leads to the introduction of a continuous improvement process, which is typical of management systems. On the basis of these analyses the ME Department will be able to develop maintenance plans (predictive and preventive maintenance) and optimise the procedures and maintenance techniques to be adopted.
The procedure for work order management starts with a comparison of all work orders inserted in the Computerised Maintenance Management System (CMMS) of the refinery. The CMMS also collects the design and features of all items (equipment, plants, etc.) to manage (see the top left-hand corner of Fig. 4).
In order to decide the priorities for thorough investigation the work orders must go through a comparison process which is articulated in several steps:

- The work orders provided by the CMMS must first be filtered by comparing them with the list of all the items present in the refinery, so as to concentrate only on those types of maintenance which refer to a specific and unambiguous item. Some examples of work orders which do not concern the ME Department are those which refer to work which must be carried out by administration, the warehouse, road maintenance, insulation, repair of motor vehicles, etc.
- Further to this preliminary step of selecting only those work orders which concern a well-identified item, there is a second skimming to select only those tasks which satisfy at least one of the following conditions (see Appendix 1):
  a) Associated with a Critical Item
  b) Associated with a Bad Actor
  c) Maintenance costs of over €10,000
  d) Cause of plant shutdown or slowdown
  e) Issued in an emergency or urgent situation
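The two screening steps described above can be sketched in Python as follows. This is only an illustrative reading of the procedure: the `WorkOrder` fields, the function name and the sample identifiers are hypothetical and do not correspond to the actual CMMS data model of the refinery.

```python
from dataclasses import dataclass
from typing import Optional, Set

COST_THRESHOLD_EUR = 10_000  # threshold stated in condition c)


@dataclass
class WorkOrder:
    item_id: Optional[str]  # None when the order names no specific item
    cost_eur: float = 0.0
    caused_shutdown_or_slowdown: bool = False
    emergency_or_urgent: bool = False


def is_candidate(wo: WorkOrder, known_items: Set[str],
                 critical_items: Set[str], bad_actors: Set[str]) -> bool:
    """Return True if the work order passes both screening steps."""
    # Step 1: keep only orders that refer to a well-identified item.
    if wo.item_id is None or wo.item_id not in known_items:
        return False
    # Step 2: keep the order if at least one of conditions a) to e) holds.
    return (
        wo.item_id in critical_items          # a) Critical Item
        or wo.item_id in bad_actors           # b) Bad Actor
        or wo.cost_eur > COST_THRESHOLD_EUR   # c) cost over 10,000 euro
        or wo.caused_shutdown_or_slowdown     # d) shutdown or slowdown
        or wo.emergency_or_urgent             # e) emergency or urgency
    )


# Hypothetical sample data.
known = {"P-101", "C-201", "E-301"}
critical = {"C-201"}
bad = {"P-101"}
orders = [
    WorkOrder("P-101", cost_eur=500.0),           # Bad Actor
    WorkOrder("E-301", cost_eur=12_000.0),        # cost above threshold
    WorkOrder("E-301", cost_eur=800.0),           # no condition satisfied
    WorkOrder(None, emergency_or_urgent=True),    # no item: filtered in step 1
]
candidates = [wo for wo in orders if is_candidate(wo, known, critical, bad)]
```

Orders that fail either step simply follow their normal path; only the remaining candidates go on to be prioritised with the decision-making matrix.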

[Figure 4: flowchart. The work orders and the list of items held in the CMMS are compared. A W.O. not associated with equipment follows its path. A W.O. associated with equipment is checked against five conditions (critical item in processes; unplanned shutdown and slowdown; emergency and urgency; cost > 10000 euro; Bad Actors). If it satisfies none of them it follows its path; if it satisfies at least one condition, the event is prioritised with the decision matrix. Remaining boxes: low risk, maintenance plan; critical failure to be investigated, failure analysis; ME.]

Fig. 4. Algorithm for work orders management.


The task requests identified in this way converge with the indications provided by the non-conformity reports to form a list of cases to analyse using RBI&M methodology.
As well as identifying the phenomena of critical failure to be analysed, the investigation carried out using the decision-making matrix determines the priorities, i.e. the deadline for starting analysis of the event and maintenance plan according to the criteria shown in Table 6. This table defines the time-scale for carrying out work order and failure analysis.

Table 6
Time-scale for carrying out work order analysis.

Risk values               Priority
Risk between 81 and 100   Analysis must be carried out immediately
Risk between 36 and 80    Analysis must be started within 24 h
Risk between 17 and 36    Analysis must be started within 48 h
Risk between 1 and 16     Analysis must be started within one week

3.2.1. Discussion

Although apparently extremely simple, the algorithm illustrated in the previous section is the result of a whole series of attempts and improvements. In fact it was also necessary to bear in mind the organizational aspects of the work, including the number of technicians employed in the Reliability Department, the number of daily working hours, the average time taken to perform a single failure analysis, etc. The work organization in the refinery studied can be described as follows: in the Reliability Department 5 people generally work on drawing up failure analyses, each working day is of 8 h, and the weeks worked during the year are 52 minus 5 working weeks (26 days of holiday), minus another two weeks, considering the average value of national holidays and days of sick leave. Therefore the number of work orders (WO) computed with the algorithm must be matched to forty-five working weeks. Since resources are limited, it is necessary to determine how they should be distributed, so that no important works remain neglected while more resources are concentrated on the most critical work orders.
By applying the algorithm to data regarding WO issued in 2003, 2004 and 2005, we obtained the results shown in Table 7.
To explain the procedure it is possible to take into consideration the data of 2003. Simulation using data from 2003 results in 2988 candidates for Critical Failure Analysis (CFA) to be controlled with the decision-making matrix, out of a total of 7128 examined, which is equal to about 42%. Of these, 767 must be attributed to Bad Actors, 213 to Critical Items in processes, 199 to unplanned shutdown and slowdown, 1166 items have a non-significant impact on the environment (env), safety (sfty), process shutdown (sd), reduction in quality (qlty) or loss of quantity of the product (qnty), 600 items have a cost greater than 10,000 euro, and 25 near accidents are reported. The same simulation was repeated using the data for 2004 and 2005.
Experience has shown that the application of the decision-making matrix for drawing up CFA is a very selective method, and on average only 50% of possible candidates pass through this filter. Consequently the weekly number of failure analyses to be performed by the Reliability Department was 33.2 in 2003, 28 in 2004 and 27.4 in 2005, i.e. an average of 6.64, 5.6 and 5.48 failure analyses per employee, respectively, for the years 2003, 2004 and 2005. On discussing the results obtained with the head of the Reliability Department, we found that this average number corresponds with expectations and with the true capacity expected of the staff employed (5 people), and is also in line with the current work load.
The decreasing values of the work orders and of all the indexes analysed demonstrate that the proposed procedure has not had adverse effects from the reliability point of view, while it has given the reliability department important advantages from the economic and management point of view.

4. Conclusions

The RBI&M method proposed in this work was developed in order to solve two important problems of the reliability department of the refinery analysed:

1) the personnel available for the analysis of critical items and events is limited and is not able to assess in detail all the events that occurred;
2) once the critical level of an item or event has been defined, the reliability department has to decide the best maintenance actions and work orders to carry out.

The target reached was that of identifying the really critical events, items and work orders in terms of safety, environment, plant availability, quality of the product and maintenance costs, so as to be able to proceed, in a systematic way, with failure analysis and thereby with the performance of subsequent corrective action.
The application of the procedure to two specific stages in the maintenance activities of the refinery has shown how the use of risk-based inspection techniques leads to an improvement in the indices which measure maintenance quality (Tables 5 and 7). The scheduling of necessary operations is made on the basis of risk estimation and cost of the alternative, but first of all on the basis of resources available. Tests carried out by the panel of experts have made it possible to improve and standardize the method for the specific application field of petrochemical plants.
The decision-making matrix has proved to be a tool which quantifies the risk, can lead to the fixing of priorities and can justify choices in terms of "action/event risk – associated risks – possible improvements – resources used – result".
Decision-making assessment using the risk matrix must be carried out by a special "team" whose experience must guarantee thorough competency concerning all aspects of the specific operation being examined. The work procedure which was developed has also led to a common and easily comprehensible technical language which is used by all the people who interact on the plants. It has improved the understanding of the way in which the plant works and has made improvements possible both in the maintenance and the running processes of the plant.

Table 7
Results of the simulations performed.

                                                        Year 2003           Year 2004          Year 2005
Total work orders examined (TE)                         7128                4849               4807
Total candidates for Critical Failure Analysis (CFA)    2988                2525               2469
Percentage of total (%tot)                              42%                 52%                51%
Work orders involving env, sd, qnty, qlty, sfty
(1 failure per year)                                    1166                1165               1104
n. Bad Actors (BA)                                      767                 568                573
n. Action on Critical Items (CI)                        231                 104                99
Unplanned shut/slowdown (SD)                            199                 101                91
n. WO with estimated cost > 10 thousand € (>10,000)     600                 552                568
n. Near Accidents (NA)                                  25                  35                 34
CFA candidates/week                                     2988/45 ≈ 66.4      2525/45 ≈ 56.1     2469/45 ≈ 54.8
CFA/week                                                66.4 × 0.5 ≈ 33.2   56.1 × 0.5 ≈ 28    54.8 × 0.5 ≈ 27.4
CFA/(week × employee)                                   33.2/5 = 6.64       28/5 = 5.6         27.4/5 = 5.48
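The last three rows of Table 7 follow from simple arithmetic on the yearly number of CFA candidates. The Python sketch below reproduces that calculation under the assumptions stated in the text: 45 working weeks, about 50% of candidates passing the decision-making matrix, and 5 people in the Reliability Department. The constant and function names are illustrative only.

```python
WEEKS_PER_YEAR = 52
HOLIDAY_WEEKS = 5          # 26 days of holiday
OTHER_ABSENCE_WEEKS = 2    # national holidays and sick leave, on average
PASS_RATE = 0.5            # share of candidates passing the decision matrix
ANALYSTS = 5               # staff of the Reliability Department

WORKING_WEEKS = WEEKS_PER_YEAR - HOLIDAY_WEEKS - OTHER_ABSENCE_WEEKS


def weekly_workload(cfa_candidates):
    """Return (candidates/week, analyses/week, analyses/week per analyst)."""
    candidates_per_week = cfa_candidates / WORKING_WEEKS
    analyses_per_week = candidates_per_week * PASS_RATE
    per_analyst = analyses_per_week / ANALYSTS
    return candidates_per_week, analyses_per_week, per_analyst


# Yearly CFA candidates as reported in Table 7.
cfa_by_year = {2003: 2988, 2004: 2525, 2005: 2469}
workload = {year: weekly_workload(n) for year, n in cfa_by_year.items()}

for year, (cand, analyses, per_analyst) in workload.items():
    print(f"{year}: {cand:.1f} candidates/week, "
          f"{analyses:.1f} analyses/week, {per_analyst:.2f} per analyst")
```

Small differences from the table (for example 5.49 rather than 5.48 per analyst for 2005) arise because the table rounds the intermediate values at each step, whereas the sketch does not.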
The failure of a system is rarely the result of a single cause, but rather the result of a combination or a series of interacting events. As a result, Risk-Based Inspection & Maintenance must not be perceived as a static exercise to be performed only once. It is a dynamic process, which must be continuously updated as additional information becomes available.
The RBI&M methodology proposed in this paper can be adapted and used in many situations which may arise in a refinery. Using this methodology it is possible not only to make a failure analysis but also to estimate the environmental aspects connected with operational activities and the safety risk, to make an investment analysis, etc.

Appendix 1

Definitions

Item: equipment or device characterized by a conventional progressive number, subject to maintenance.
Critical Item: item identified with specific operational functions. Its malfunctioning compromises the operations of the plant in which it is used; in other words the unavailability of this equipment or device has implications for safety or may cause plant shutdown or may prevent the loading of products via land or sea.
Bad actor: an item which has had two or more failures in a year or which is defined as such because of its technical functions.
Critical Failure Analysis: a real or potential event, subject to failure analysis.
Near Accidents: Near accidents are those events which have been a source of risk or danger, potentially provoking injuries and accidents; damage to the health of the workers or the population; damage to the environment; damage to company or third party property.
Operational Accidents: Operational accidents are those events which have resulted in operational targets not being reached following "upsets" or "mal operation".
Environmental Accidents: Environmental accidents are those events which have led to a lack of conditions which respect the environment.
Injuries: Injuries according to Italian Law (Art.2 DPR 1124 of 20/6/65) "occur because of violent events in the workplace and cause death, complete or partial permanent work disablement leading to absence from work or reduced performance". Accidents are defined as those events which cause damage to people, the environment or property.
Shutdown: total shutdown of operations in a plant due to any type of anomaly.
Slowdown: reduced working capacity of the plant, less than 75% of the expected value.

References

American Society of Mechanical Engineers Code Committee SC6000. (2000). Hazardous release protection. New York, NY: ASME.
Apeland, S., & Scarf, P. A. (2003). A fully subjective approach to modelling inspection maintenance. European Journal of Operational Research, 148, 410–425.
Arunraj, N. S., & Maiti, J. (2007). Risk-based maintenance – techniques and applications. Journal of Hazardous Materials, 142(3), 653–661.
Baker, R. D., & Wang, W. (1992). Estimating the delay-time distribution of faults in repairable machinery from failure data. IMA Journal of Mathematics Applied in Business and Industry, 3, 259–281.
Balkey, R. K., & Art, J. R. (1998). ASME risk-based in service inspection and testing: an outlook to the future. Society for Risk Analysis, 18(4).
Baron, M. M., & Cornell, M. E. P. (1999). Designing risk-management strategies for critical engineering systems. IEEE Transactions on Engineering Management, 46, 87–100.
Bedford, T. (2004). Assessing the impact of preventive maintenance based on censored data. Quality and Reliability Engineering International, 20, 247–254.
Bevilacqua, M., Braglia, M., & Gabbrielli, R. (2000). Monte Carlo simulation approach for a modified FMECA in a power plant. Quality and Reliability Engineering International, 16, 313–324.
Christer, A. H., Wang, W., Baker, R. D., & Sharp, J. (1995). Modelling maintenance practice of production plant using the delay-time concept. IMA Journal of Mathematics Applied in Business and Industry, 6, 67–83.
Cowing, M. M., Cornell, M. E. P., & Glynn, P. W. (2004). Dynamic modeling of the tradeoff between productivity and safety in critical engineering systems. Reliability Engineering and System Safety, 86, 269–284.
Dey, P. M. (2001). A risk-based model for inspection and maintenance of cross-country petroleum pipeline. Journal of Quality in Maintenance Engineering, 40(4), 24–31.
Harnly, A. J. (1998). Risk based prioritization of maintenance repair work. Process Safety Progress, 17(1), 32–38.
Khan, F. I., & Abbasi, S. A. (1998). Risk assessment in chemical process industries: Advance techniques. New Delhi: Discovery Publishing House. XC376.
Khan, F. I., & Haddara, M. R. (2004). Risk-based maintenance of ethylene oxide production facilities. Journal of Hazardous Materials, A108, 147–159.
Khan, K., Sadiq, R., & Haddara, M. (2004). Risk-based inspection and maintenance (RBIM): multi-attribute decision making with aggregative risk analysis. Process Safety and Environmental Protection, 82, 398–411.
Krishnasamy, L., Khan, F., & Haddara, M. (2005). Development of a risk-based maintenance (RBM) strategy for a power-generating plant. Journal of Loss Prevention in the Process Industries, 18, 69–81.
Kumar, U. (1998). Maintenance strategies for mechanized and automated mining systems: a reliability and risk analysis based approach. Journal of Mines, Metals and Fuels, 46(11–12), 343–347.
Linstone, H. A., & Turoff, M. (1975). The Delphi method techniques and application. London: Addison-Wesley.
Meel, A., O'Neill, L. M., Levin, J. H., Seider, W. D., Oktem, U., & Keren, N. (2007). Operational risk assessment of chemical industries by exploiting accident databases. Journal of Loss Prevention in the Process Industries, 20, 113–127.
Mili, A., Bassetto, S., Siadat, A., & Tollenaere, M. (2009). Dynamic risk management unveil productivity improvements. Journal of Loss Prevention in the Process Industries, 22, 25–34.
S-RCM Training Guide. (2000). Shell-reliability centered maintenance. Shell Global Solution International.
Todinov, M. T. (2003). Setting reliability requirements based on minimum failure-free operating periods. Quality and Reliability Engineering International, 20, 273–287.
Van Heel, K. A. L., Knegtering, B., & Brombacher, A. C. (1999). Safety lifecycle management. A flowchart presentation of the IEC 61508 overall safety lifecycle model. Quality and Reliability Engineering International, 15, 493–500.
Vesely, W. E., Belhadj, M., & Rezos, J. T. (1993). PRA importance measures for maintenance prioritization applications. Reliability Engineering and System Safety, 43, 307–318.
Wang, W., & Christer, A. H. (1998). A modelling procedure to optimize component safety inspection over a finite time horizon. Quality and Reliability Engineering International, 13, 217–224.