Clearing The Stop Fault Flag in Sun Cluster 3.x


KEDB: ITIL Compliance

Wipro Infotech
Enterprise Services Page 1 of 4

Description
Clearing the STOP_FAILED error flag on Sun Cluster 3.x resources and resource groups

Platform

SUN

Category

Troubleshooting Sun Cluster 3.x

Problem statement

When the Failover_mode resource property is NONE or SOFT and the STOP method of a resource fails, the individual
resource goes into the STOP_FAILED state and the resource group goes into the ERROR_STOP_FAILED state. You cannot
bring a resource group in this state online on any node, nor can you edit it (create or delete resources, or change
resource-group or resource properties).

RCA Summary

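The STOP method of a resource failed while its Failover_mode property was NONE or SOFT, which left the resource in
the STOP_FAILED state and its resource group in the ERROR_STOP_FAILED state. A resource group in this state cannot be
brought online on any node, nor can it be edited (create or delete resources, or change resource-group or resource
properties).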

Solution

This procedure describes how to clear the STOP_FAILED error flag on resources.

During this procedure, you must supply the following information: the name of the node where the resource is
STOP_FAILED, and the names of the resource and resource group that are in the STOP_FAILED state.

Note: Perform this procedure from any cluster node.

1. Become superuser on a cluster member.


2. Identify the resources that have gone into the STOP_FAILED state and their nodes.

# /usr/cluster/bin/scstat -g

Example:

...

-- Resource Groups and Resources --

            Group Name     Resources
            ----------     ---------
 Resources: dvl1rg         dvl1-lh hasp-dvl1-res dvl1-ora-res dvl1-lsnr-res
...
-- Resource Groups --

            Group Name     Node Name     State
            ----------     ---------     -----
     Group: dvl1rg         quid          Error--stop failed
     Group: dvl1rg         rofe          Offline
...
-- Resources --

            Resource Name     Node Name     State                        Status Message
            -------------     ---------     -----                        --------------
  Resource: dvl1-lh           quid          Online but not monitored     Online - LogicalHostname online.
  Resource: dvl1-lh           rofe          Offline                      Offline
  Resource: hasp-dvl1-res     quid          Stopping                     Unknown - Stopping
  Resource: hasp-dvl1-res     rofe          Offline                      Offline
  Resource: dvl1-ora-res      quid          Offline                      Offline
  Resource: dvl1-ora-res      rofe          Offline                      Offline
  Resource: dvl1-lsnr-res     quid          Stop failed                  Offline
  Resource: dvl1-lsnr-res     rofe          Offline                      Offline
...

3. On each node where a resource is in the STOP_FAILED state, manually stop the resource and its monitor.

# /usr/cluster/bin/scswitch -n -j resource

Confirm that the application encapsulated by the resource has stopped. If not, shut it down.

Note: This step might require killing processes or running resource type-specific commands or other commands.

4. On all nodes where you manually stopped the resources, set the state of the resources to OFFLINE and clear the
STOP_FAILED flag.

# /usr/cluster/bin/scswitch -c -h nodelist -j resource -f STOP_FAILED


5. Check the resource-group state on the nodes where the STOP_FAILED flag was cleared in Step 4.

The resource-group state should now be OFFLINE or ONLINE.

# /usr/cluster/bin/scstat -g

6. If scstat -g shows that the resource group remains in the ERROR_STOP_FAILED state, run the following scswitch
command to take the resource group offline on the nodes where it is still in that state.

# /usr/cluster/bin/scswitch -F -g resource-group

This situation can occur if the resource group was being switched offline when the STOP method failure occurred and
the resource that failed to stop had a dependency on other resources in the resource group. Otherwise, the resource
group reverts to the ONLINE or OFFLINE state automatically after you have run the command in Step 4 on all
STOP_FAILED resources.
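
The checks in Steps 2 through 6 can be sketched as a small POSIX shell filter over scstat -g output. This is only an illustration under stated assumptions: the function names are hypothetical, the column layout is assumed to match the example in Step 2, and the script only prints the scswitch commands from Steps 3 and 4 so that an operator can review them. It never runs them.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Sun Cluster). They only parse text and
# print commands; nothing here calls scswitch.

# Read scstat -g output on stdin; print "resource node" for every resource
# whose State column reads "Stop failed".
stop_failed_resources() {
    awk '$1 == "Resource:" && $4 == "Stop" && $5 == "failed" { print $2, $3 }'
}

# Read scstat -g output on stdin; print "group node" for every resource group
# whose State column reads "Error--stop failed".
stop_failed_groups() {
    awk '$1 == "Group:" && $4 == "Error--stop" && $5 == "failed" { print $2, $3 }'
}

# Read "resource node" pairs on stdin; print (do not execute) the Step 3 and
# Step 4 commands for each pair.
print_clear_plan() {
    while read -r resource node; do
        echo "/usr/cluster/bin/scswitch -n -j $resource"
        echo "/usr/cluster/bin/scswitch -c -h $node -j $resource -f STOP_FAILED"
    done
}

# Sample input: a few lines copied from the scstat -g example in Step 2.
sample_scstat_output() {
    cat <<'EOF'
Group: dvl1rg quid Error--stop failed
Group: dvl1rg rofe Offline
Resource: hasp-dvl1-res quid Stopping Unknown - Stopping
Resource: dvl1-lsnr-res quid Stop failed Offline
Resource: dvl1-lsnr-res rofe Offline Offline
EOF
}

sample_scstat_output | stop_failed_resources | print_clear_plan
```

With the sample input, the plan contains the two commands for dvl1-lsnr-res on node quid. On a live cluster the same filters could read from /usr/cluster/bin/scstat -g instead of the sample. After Step 4, an empty result from stop_failed_groups would suggest that Step 6 is not needed.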

Now you can switch the resource group to the ONLINE state.

Use the following command to bring the resource group and all of its resources online on the preferred host:

# /usr/cluster/bin/scswitch -Z -g resource-group

or

Use the following commands to bring the resource group online on a host that you choose, and then to bring the
resources online one by one:

# /usr/cluster/bin/scswitch -z -g resource-group -h <node>


# /usr/cluster/bin/scswitch -e -j resource
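
The two alternatives above can be sketched as one dry-run helper. The function name print_online_plan and its argument order are hypothetical. It only echoes the scswitch commands shown above for review and does not execute them.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sun Cluster): prints the commands only.
# Usage: print_online_plan resource-group [node [resource ...]]
print_online_plan() {
    if [ $# -lt 1 ]; then
        echo "usage: print_online_plan resource-group [node [resource ...]]" >&2
        return 1
    fi
    group=$1
    if [ $# -lt 2 ]; then
        # No node given: bring the group and all resources online on the
        # preferred host.
        echo "/usr/cluster/bin/scswitch -Z -g $group"
        return 0
    fi
    node=$2
    shift 2
    # Node given: switch the group to that node, then enable each listed
    # resource one by one.
    echo "/usr/cluster/bin/scswitch -z -g $group -h $node"
    for resource in "$@"; do
        echo "/usr/cluster/bin/scswitch -e -j $resource"
    done
}

# Example calls using the names from the scstat -g example.
print_online_plan dvl1rg
print_online_plan dvl1rg rofe dvl1-lh hasp-dvl1-res
```

The first call prints the single -Z command. The second prints the -z command for node rofe followed by one -e command per resource.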

References

NA
