HBA Troubleshooting Guide
Copyright 2006-2009 Brocade Communications Systems, Inc. All Rights Reserved. Brocade, Fabric OS, File Lifecycle Manager, MyView, and StorageX are registered trademarks and the Brocade B-wing symbol, DCX, and SAN Health are trademarks of Brocade Communications Systems, Inc., in the United States and/or in other countries. All other brands, products, or service names are or may be trademarks or service marks of, and are used to identify, products or services of their respective owners. Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied, concerning any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes to this document at any time, without notice, and assumes no responsibility for its use. This informational document describes features that may not be currently available. Contact a Brocade sales office for information on feature and product availability. Export of technical data contained in this document may require an export license from the United States government. The authors and Brocade Communications Systems, Inc. shall have no liability or responsibility to any person or entity with respect to any loss, cost, liability, or damages arising from the information contained in this book or the computer programs that accompany it. The product described by this document may contain open source software covered by the GNU General Public License or other open source license agreements. To find out which open source software is included in Brocade products, view the licensing terms applicable to the open source software, and obtain a copy of the programming source code, please visit http://www.brocade.com/support/oscd.
European Headquarters
Brocade Communications Switzerland Sàrl
Centre Swissair
Tour B - 4ème étage
29, Route de l'Aéroport
Case Postale 105
CH-1215 Genève 15
Switzerland
Tel: +41 22 799 5640
Fax: +41 22 799 5641
E-mail: [email protected]
Document History

Title                                     Publication number   Summary of changes         Date
Fibre Channel HBA Troubleshooting Guide   53-1000885-01        New document               December 2008
Fibre Channel HBA Troubleshooting Guide   53-1000885-02        Revised with corrections.  January 2009
Contents

Chapter 1  Introduction to troubleshooting
  In this chapter
  How to use this manual for troubleshooting
  Gathering problem information

Chapter 2  Isolating Problems
  In this chapter
  General problems
  Resolving installation problems
    Verifying installation
    Errors when installing driver
    Installer program does not autorun from CD (Windows only)
    Files needed for bfad.sys message appears when removing driver
    Cannot roll back driver on all HBA instances using Device Manager
    Host not booting from remote LUN
    Confirming driver package installation
  Host system freezes or crashes
  HCM GUI fails to connect with HCM agent
  Verifying Fibre Channel links

Chapter 3  Tools for Collecting Data
  For detailed information
  Data to provide support
  Collecting data using host system commands
  Collecting data using BCU commands and HCM
    Using Support Save
    HBA data collection using HCM
    Collecting data using BCU commands
  Collecting data using Fabric OS commands
  Event logs
    Host system logs
    HCM logs
    Syslog support
    Windows Event Log support
  Statistics
    Port statistics
    IOC statistics
    Fabric statistics
    Remote port statistics
    FCIP initiator mode statistics
    Logical port statistics
    Virtual port statistics
    Quality of service (QoS) statistics
  Diagnostics
    Beaconing
    Loopback tests
    PCI loopback test
    Memory test
    HBA temperature
    Ping end points
    Trace route
    Echo test
    SCSI test
  Collecting SFP data
    SFP diagnostics
    Port power on management (POM)
  Collecting port data
    Base port properties
    Remote port properties
    Logical port properties
    Virtual port properties
    Port log
    Port list
    Port query
    Port speed
  Authentication settings
    Displaying settings through HCM
    Displaying settings through BCU
  QoS and target rate limiting settings
    BCU commands
    HCM
  Persistent binding

Chapter 4  Performance optimization
  In this chapter
  Linux tuning
  Solaris tuning
  Windows tuning
    Driver tunable parameters
    OS tunable parameters
  VMware tuning

Index
In this chapter
- How this document is organized
- Supported hardware and software
- What's new in this document
- Document conventions
- Notice to the reader
- Additional information
- Getting technical help
- Document feedback
Chapter 3, Tools for Collecting Data, provides a summary of diagnostic and monitoring tools available through the HCM, Brocade Command Line Utility (BCU), fabric switch operating system, and host system to help you isolate and resolve HBA-related problems.
HBA support
The following Fibre Channel host bus adapters (HBAs) are supported in this release.
- Brocade 815. Single-port HBA with a per-port maximum of 8 Gbps using an 8 Gbps SFP+.
- Brocade 825. Dual-port HBA with a per-port maximum of 8 Gbps using an 8 Gbps SFP+.
- Brocade 415. Single-port HBA with a per-port maximum of 4 Gbps using a 4 Gbps SFP.
- Brocade 425. Dual-port HBA with a per-port maximum of 4 Gbps using a 4 Gbps SFP.
Notes:
This publication only supports the HBA models listed above and does not provide information
about the Brocade 410 and 420 Fibre Channel HBAs, also known as the Brocade 400 Fibre Channel HBAs.
Although you can install an 8 Gbps SFP+ into a Brocade 415 or 425 HBA, only 4 Gbps
maximum port speed is possible.
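The speed limits listed above can be captured in a small lookup. The following Python sketch is illustrative only: the table values come from the supported-HBA list, while the function name and the rule that the slower of the HBA and the SFP sets the ceiling are assumptions (this guide only states the case of an 8 Gbps SFP+ in a Brocade 415 or 425).

```python
# Per-port maximum speeds in Gbps, taken from the supported-HBA list.
HBA_MAX_GBPS = {"815": 8, "825": 8, "415": 4, "425": 4}


def effective_max_speed(model: str, sfp_gbps: int) -> int:
    """Return the highest port speed (Gbps) for an HBA model and installed SFP.

    Assumption: the slower of the two components limits the port speed.
    The guide confirms this only for an 8 Gbps SFP+ in a 415 or 425.
    """
    if model not in HBA_MAX_GBPS:
        raise KeyError(f"HBA model {model!r} is not covered by this guide")
    return min(HBA_MAX_GBPS[model], sfp_gbps)


if __name__ == "__main__":
    # An 8 Gbps SFP+ installed in a Brocade 425 is still limited by the HBA.
    print(effective_max_speed("425", 8))
```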
- Windows Server 2003, version R2 with SP2
- Windows Server 2008
- Windows NT (HCM support only)
- Windows 2000 (HCM support only)
- Linux RHEL4, RHEL5, SLES9, and SLES10
- Solaris 10 (x86 and SPARC)
- VMware ESX Server 3.5
NOTE
Drivers, BCU, and HCM Agent are supported only on the VMware console Operating System. HCM is supported only on the guest operating system on VMware.
Specific operating system service pack levels and other patch requirements are detailed in the current HBA release notes.
NOTE
Document conventions
This section describes text formatting conventions and important notice formats used in this document.
Text formatting
The narrative-text formatting conventions that are used are as follows:

bold text
- Identifies command names
- Identifies the names of user-manipulated GUI elements
- Identifies keywords and operands
- Identifies text to enter at the GUI or CLI

italic text
- Provides emphasis
- Identifies variables
- Identifies paths and Internet addresses
- Identifies document titles

code text
- Identifies CLI output
- Identifies command syntax examples
For readability, command names in the narrative portions of this guide are presented in mixed lettercase: for example, switchShow. In actual examples, command lettercase is often all lowercase. Otherwise, this manual specifically notes those cases in which a command is case sensitive.
command           Commands are printed in bold.
--option, option  Command options are printed in bold.
-argument, arg    Arguments.
[ ]               Optional element.
variable          Variables are printed in italics. In the help pages, values are underlined or enclosed in angled brackets < >.
...               Repeat the previous element, for example member[;member...]
value             Fixed values following arguments are printed in plain font. For example, --show WWN
|                 Boolean. Elements are exclusive. Example: --show -mode egress | ingress
NOTE
ATTENTION
An Attention statement indicates potential damage to hardware or data.
CAUTION A Caution statement alerts you to situations that can be potentially hazardous to you or cause damage to hardware, firmware, software, or data.
DANGER A Danger statement indicates conditions or situations that can be potentially lethal or extremely hazardous to you. Safety labels are also attached directly to products to warn of these conditions or situations.
Key terms
For definitions specific to Brocade and Fibre Channel, see the technical glossaries on Brocade Connect. See Brocade resources on page xi for instructions on accessing Brocade Connect. For definitions specific to this document, see the Brocade Fibre Channel HBA Installation and Reference Manual.
For definitions of SAN-specific terms, visit the Storage Networking Industry Association online dictionary at: http://www.snia.org/education/dictionary
Sun Microsystems, Inc. Red Hat Inc. Novell, Inc VMware Inc. SPARC International, Inc
Additional information
This section lists additional Brocade and industry-specific documentation that you might find helpful.
Brocade resources
For HBA resources, such as product information, software, firmware, and documentation, visit the Brocade HBA web site at www.brocade.com/hba.
To get up-to-the-minute product information, join Brocade Connect. Go to http://www.brocadeconnect.com to register at no cost for a user ID and password.
For practical discussions about SAN design, implementation, and maintenance, you can obtain Building SANs with Brocade Fabric Switches through: http://www.amazon.com
For additional Brocade documentation, visit the Brocade Web site: http://www.brocade.com
- Brocade HBA model number
- Host operating system version
- Software name and software version, if applicable
- syslog message logs
- bfa_supportsave output. To expedite your support call, use the bfa_supportsave feature to collect debug information from the driver, internal libraries, and firmware. You can save valuable information to your local file system and send it to support personnel for further investigation. For details on using this feature, refer to Using Support Save on page 25.
- Detailed description of the problem, including the switch or fabric behavior immediately following the problem, and specific questions.
NOTE
For details on using HCM and BCU commands, refer to the Brocade Fibre Channel HBA Administrator's Guide.
3. Port World-Wide Port Name (PWWN). Determine this through the following resources:
Label located on the end of the HBA opposite the SFP receiver slots. This label provides
the WWPN for each port.
This command displays HBA information. The <ad_id> parameter is the HBA's serial number.
port --list <ad_id>
This command lists all the physical ports on the HBA along with their basic attributes. The <ad_id> parameter is the HBA's serial number.
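If you collect port data from a script, the port --list <ad_id> command can be assembled as an argument list. This is a minimal Python sketch: the executable name bcu and the helper name are assumptions, not taken from this guide, so adjust them to match how BCU is invoked on your host.

```python
from typing import List

# Assumption: the BCU utility is started as "bcu" on your host system.
BCU_EXECUTABLE = "bcu"


def bcu_port_list(ad_id: str) -> List[str]:
    """Build the argument list for the BCU `port --list <ad_id>` command.

    ad_id is the adapter identifier described in the text above.
    """
    if not ad_id:
        raise ValueError("ad_id must not be empty")
    return [BCU_EXECUTABLE, "port", "--list", ad_id]


if __name__ == "__main__":
    # Print the command line instead of running it, so the sketch is
    # safe to try on a system without an HBA installed.
    print(" ".join(bcu_port_list("<ad_id>")))
```

The returned list can be passed to subprocess.run(..., capture_output=True, text=True) to save the command output for your support provider.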
Document feedback
Quality is our first concern at Brocade and we have made every effort to ensure the accuracy and completeness of this document. However, if you find an error or an omission, or you think that a topic needs further development, we want to hear from you. Forward your feedback to: [email protected] Provide the title and version number of the document and as much detail as possible about your comment, including the topic heading and page number and your suggestions for improvement.
Chapter 1
Introduction to troubleshooting
In this chapter
- How to use this manual for troubleshooting
- Gathering problem information
1. First study information in the general HBA problem symptoms, causes, and fixes or actions in Table 2 on page 6. This table should provide help on many general problems that you may encounter with HBA operation. Fixes and actions often reference the BCU commands, HCM features, and host operating system commands described in Chapter 3, Tools for Collecting Data, to gather data for problem isolation or resolution.
2. If you still require more information to isolate problems, use the following sections in Chapter 2. Note that these sections are referenced from Table 2 (Troubleshooting General Problems) when appropriate to further isolate problems.
   - Resolving installation problems on page 12.
   - Host system freezes or crashes on page 16.
   - Verifying Fibre Channel links on page 20.
3. Use the BCU commands, HCM features, and host operating system commands described in Chapter 3, Tools for Collecting Data, to gather data to help you isolate problems. Although many of these tools are specifically referenced as actions for problems described in Table 2 (Troubleshooting General Problems) in Chapter 2, many more are included that can provide helpful data, such as event logs, operating statistics, and diagnostics. Note that Table 5 on page 24 in Chapter 3 provides a list of useful host system commands for each supported operating system that you can use to gather data.
4. Consider these factors when isolating and resolving the problem:
   - Can the issue be resolved by using the latest supported combination of host system BIOS, operating system, operating system updates, or HBA drivers?
   - Does the issue persist when the HBA is installed in a different platform or is connected using a different switch port, SFP, and cable?
   - Can this problem be reproduced on one or more HBAs, ports, or host systems? Can you identify specific steps that consistently reproduce this problem on one or more hosts?
   - Is the problem documented in release notes for the HBA, operating system, or host system BIOS?
   - Is the problem documented in release notes for the switch and target storage system?
   - Is unexpected behavior intermittent or always present?
   If the problem is in a Fibre Channel switch, cabling, storage device, or in connectivity between these components, refer to documentation, help systems, or service providers of that equipment.
5. If you cannot resolve the problem, gather and provide problem information to your HBA support provider for resolution.
- Describe the symptoms that you are observing. Be specific. Here are some examples:
  - User experiences, such as slow performance or file access.
  - LEDs not functioning on an HBA port that is connected to the fabric.
  - All LEDs on HBA port flashing amber.
  - Expected storage devices not visible from the HCM or host system's storage management application.
  - HBA not recognized by host system BIOS.
  - HBA not recognized as PCI device by host system operating system.
- What happened prior to the observed symptoms?
- Describe all observed behavior that is unexpected and compare against expected behavior.
- Gather information for support.
  - Use appropriate tools on storage targets to gather information such as disk, tape, and controller model and firmware levels.
  - Run the bfa_supportsave BCU command on the host system and save output to a file on your system. This command captures all driver, internal libraries, firmware, and other information needed to diagnose suspected system issues. You can save captured information to the local file system and send it to support personnel for further investigation.
  - Run the Fabric OS supportSave command on any Brocade switch and save output. This command collects RASLOG, TRACE, supportShow, core file, FFDC data and other support information.
  For details on using the Support Save feature, refer to Using Support Save on page 25.
- Draw a topology map of the SAN from the HBAs to the storage targets. Include the following:

TABLE 1
Component
HBA
Fibre Channel switches.
Fiber optic links between HBA, switches, and storage ports.
Host hardware
The bfa_supportsave and Fabric OS supportSave commands can provide current information for the topology map. Also, consider using the Brocade SAN Health products to provide information on your SAN environment, including an inventory of devices, switches, firmware versions, and SAN fabrics, historical performance data, zoning and switch configurations, and other data. Click the Support tab on www.brocade.com for more information on these products.
- Run appropriate diagnostic tools for storage targets.
- Use additional HCM, BCU, host system, and Fabric OS commands summarized in Chapter 3, Tools for Collecting Data to gather statistics and problem data on the HBA, host, Fibre Channel links, and connected devices.
- Determine what has changed in the SAN. For example, if the SAN functioned without problems before installing the HBA, then the problem is most likely in the HBA installation or configuration, HBA hardware, or HBA driver package. Other examples to investigate could be changes in the switch or storage system firmware, an offline switch, or a disconnected or faulty cable between the HBA, switch, or storage controller fiber optic ports.
- Record the time and frequency of symptoms and the period of time symptoms have been observed.
- Determine if unexpected behavior is intermittent or always present.
- List steps that have been taken to troubleshoot the problem, including changes attempted to isolate the problem.
Chapter 2
Isolating Problems
In this chapter
- General problems
- Resolving installation problems
- Host system freezes or crashes
- HCM GUI fails to connect with HCM agent
- Verifying Fibre Channel links
General problems
Table 2 on page 6 describes general problems related to HBA operation, possible causes, and recommended actions that may fix the problem. Recommended actions may refer you to information in the following locations as appropriate to gather information to further isolate and resolve the problem.
TABLE 2
Symptom
General problems
Fix or Action
1. Execute host operating system command to list PCI devices. Refer to the List PCI Devices row in Table 5 on page 24. If the HBA is not listed, perform the following steps.
2. Reseat the HBA.
3. Replace the HBA with an HBA in known working condition to determine whether there is a slot malfunction.

Verify compatibility by reviewing the Brocade Server Connectivity Compatibility Matrix. To find this document, log into Brocade Connect on www.brocade.com, then select the Compatibility Information quick link under Documentation Library.

1. Execute your host's operating system command to list PCI devices. Refer to the List PCI Devices row in Table 5 on page 24. If the HBA is not listed in the output from this command, go on to the next step.
2. Refer to HBA not reported under server's PCI sub-system under the Symptom column in this table.

The HBA driver may not be loaded. Refer to Confirming driver package installation on page 14 for methods to verify driver installation.

HBA not reported under server's PCI sub-system.
HBA driver not loaded.
- Ensure that the SFPs and cables are connected properly on both HBA and switch side. Check for any cable damage.
- Verify HBA side link status by executing the BCU port --list command. Check the FC Addr field for an address and the State field for Linkup. For details on using this command, refer to Port list on page 47.
- Execute either the Fabric OS switchShow or portShow commands on the attached switch to ensure that the switch or individual port is not disabled or offline.
- Check the port topology setting on the switch using the Fabric OS portCfgShow command to ensure that Locked L_Port is OFF. Use the portCfgLport command to change the setting to OFF if required.
- Check the switch port speed using the Fabric OS portCfgShow command to verify that Speed is either AUTO or matches the speed of the attached HBA port (for example, the speed setting for both ports is 4 Gbps).
- Check port speed on the HBA with the BCU port --list or port --query commands to display the current and configured speed. Refer to Port speed on page 48 and Port query on page 47 for details on using these commands.
- If non-Brocade branded SFPs are inserted on the HBA side or 8 Gbps switch or director, the port link will not come up. On the switch, execute the Fabric OS switchShow command to verify that Mod_Inv (invalid module) does not display for the port state.
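The port speed comparison described above reduces to a simple rule: the switch Speed setting must be AUTO or equal to the HBA port speed. The Python sketch below shows that rule only. It assumes you have already read both values from the command output by hand; the function name and the accepted input spellings ("AUTO", "4", "4G") are illustrative assumptions, not the documented output format of portCfgShow.

```python
def speeds_compatible(switch_speed: str, hba_speed_gbps: int) -> bool:
    """Return True if the switch port speed allows a link with the HBA port.

    switch_speed is the Speed value you read from the switch, entered as
    "AUTO" or as a number of Gbps such as "4" or "4G" (assumed spellings).
    hba_speed_gbps is the configured HBA port speed in Gbps.
    """
    value = switch_speed.strip().upper()
    if value == "AUTO":
        return True
    # Drop an optional trailing "G" so that "4G" and "4" compare equal.
    if value.endswith("G"):
        value = value[:-1]
    return value == str(hba_speed_gbps)


if __name__ == "__main__":
    for setting in ("AUTO", "4G", "8"):
        print(setting, speeds_compatible(setting, 4))
```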
Loss of sync and loss of signal errors in port statistics (refer to Port statistics on page 34).
Ensure that the SFPs and cables are connected properly on both HBA and switch side. Check for any cable damage. Refer to Verifying Fibre Channel links on page 20.

Check authentication settings on the switch and HBA. For the switch, execute the authutil --show Fabric OS command. For the HBA, execute the BCU auth --show <port> command (refer to Authentication settings on page 48).

Check the shared secret configuration on the attached switch and HBA. For the switch, execute the secAuthSecret Fabric OS command. For the HBA, execute the auth --secret BCU command (refer to Authentication settings on page 48 for details on using this command).
- Execute the Fabric OS nsAllShow command on the attached switch to verify that the target and the host are online in the fabric and registered in the name server.
- Execute the Fabric OS cfgActvShow command on the attached switch and verify that the host and target are in the same zone (either using domain area members, port area members, or port or node WWNs).
- The HBA driver may not be loaded. Refer to Confirming driver package installation on page 14 for methods to verify driver installation.
- Verify that the remote target port (rport) is reporting itself online by comparing rport online and rport offline statistics (refer to Remote port statistics on page 36). The rport online counter should be one greater than the rport offline counter. If not, clear the counters and try connecting to the remote port again. Verify the rport online and rport offline statistics again.
- Check LUN mapping and masking using storage array configuration tools.
- The HBA driver may not be loaded. Refer to Confirming driver package installation on page 14 for methods to verify driver installation.
Missing or improper storage array LUN masking setting. HBA driver not loaded.
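The rport counter comparison in the actions above can be written as a one-line rule. This Python sketch assumes you have read the rport online and rport offline counters from the remote port statistics yourself; the function name is hypothetical.

```python
def rport_reports_online(online_count: int, offline_count: int) -> bool:
    """Return True if the remote port counters indicate the rport is online.

    Per the guidance above, the rport online counter should be exactly
    one greater than the rport offline counter.
    """
    return online_count == offline_count + 1


if __name__ == "__main__":
    # The port came online three times and went offline twice.
    print(rport_reports_online(3, 2))
```

If the check fails, clear the counters, reconnect to the remote port, and compare the statistics again as described in the table.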
I/Os are not failing over immediately on a path failure in MPIO setup.
Execute the port --query <port_id> BCU command and ensure fcpim MPIO mode is enabled (which implies zero Path TOV values) or that fcpim MPIO mode is disabled with the expected Path TOV settings (default is 10 seconds).

Execute the Fabric OS configure command on the attached switch and change the Maximum logins per port parameter under the F_Port login parameters menu to increase the maximum NPIV IDs allowed per port.
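The MPIO and Path TOV check described above has two valid states, which the following Python sketch encodes. It assumes you have read the fcpim MPIO mode and Path TOV value from the port --query <port_id> output yourself; the function and parameter names are illustrative.

```python
# Default Path TOV in seconds when fcpim MPIO mode is disabled, per this guide.
DEFAULT_PATH_TOV_SECONDS = 10


def path_tov_consistent(mpio_enabled: bool,
                        path_tov_seconds: int,
                        expected_tov: int = DEFAULT_PATH_TOV_SECONDS) -> bool:
    """Check that the Path TOV value matches the fcpim MPIO mode.

    MPIO mode enabled implies a Path TOV of zero. MPIO mode disabled
    should show the Path TOV you expect (10 seconds by default).
    """
    if mpio_enabled:
        return path_tov_seconds == 0
    return path_tov_seconds == expected_tov


if __name__ == "__main__":
    print(path_tov_consistent(mpio_enabled=True, path_tov_seconds=0))
    print(path_tov_consistent(mpio_enabled=False, path_tov_seconds=30))
```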
Symptom: On Linux, the maximum IOPS numbers are very low.
Possible cause: The number of disk I/O requests is causing low throughput and high latency.
Fix or Action: Refer to Linux tuning on page 51 for suggestions to optimize HBA performance in Linux systems.

Symptom: On VMware, the maximum IOPS numbers are very low.
Possible cause: The number of disk I/O requests is causing low throughput and high latency.
Fix or Action: Refer to VMware tuning on page 53 for suggestions to optimize HBA performance in VMware systems.
- Verify if QoS is enabled for an HBA port using the qos --query <port_id> BCU command.
- Verify if it is enabled on the switch using the islShow command.
- Verify zones on the switch using the cfgActvShow command.
- Verify that QoS is configured on the switch using instructions in the Brocade Fabric OS Administrator's Guide.
There is a problem in the fabric or a protocol issue between the HBA and fabric.
Check fabric statistics. Refer to Fabric statistics on page 36 for methods to display fabric statistics for the HBA.
- If counts for FLOGI sent and FLOGI accept fabric statistics do not match, suspect fabric problem or protocol issue between HBA and fabric.
- If fabric offline counts increase and fabric maintenance is not occurring, this may indicate a serious fabric problem. Refer to your switch troubleshooting guide.

Refer to Errors when installing driver on page 13 for more information to isolate this problem.
Refer to Installer program does not autorun from CD (Windows only) on page 13 for more information to isolate this problem.
Refer to Host not booting from remote LUN on page 14 for more information to isolate this problem.
Refer to Verifying Fibre Channel links on page 20 for more information to isolate this problem.
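The two fabric statistics checks described above can be combined into one helper. This Python sketch assumes you have noted the FLOGI sent and FLOGI accept counts and the change in the fabric offline count from the fabric statistics display; the function and parameter names are hypothetical.

```python
from typing import List


def fabric_warnings(flogi_sent: int,
                    flogi_accepted: int,
                    fabric_offline_increase: int,
                    maintenance_in_progress: bool) -> List[str]:
    """Return warnings that follow from the fabric statistics guidance.

    fabric_offline_increase is how much the fabric offline count grew
    between two readings of the statistics.
    """
    warnings = []
    if flogi_sent != flogi_accepted:
        warnings.append(
            "FLOGI sent and FLOGI accept do not match: suspect fabric "
            "problem or protocol issue between HBA and fabric")
    if fabric_offline_increase > 0 and not maintenance_in_progress:
        warnings.append(
            "Fabric offline count is increasing without fabric "
            "maintenance: possible serious fabric problem")
    return warnings


if __name__ == "__main__":
    for line in fabric_warnings(3, 2, 1, maintenance_in_progress=False):
        print(line)
```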
Errors when installing bfa_driver_linux--<version>.noarch.rpm driver package.
Installer program does not autorun (Windows only).

Appropriate distribution kernel development packages are not installed on your host system for the currently running kernel.
Autorun is not enabled on your system.

1. Faulty fiber optic cabling and connections.
2. Faulty or unseated SFPs or unsupported SFPs.
3. Conflicts with port operating speed or topology of attached devices.
4. HBA not compatible with host system.

Problem in the fabric or a protocol issue between the HBA and fabric.
NPIV is not supported or is disabled on the switch.

Check virtual port statistics, such as FDISC sent, FDISC accept, and No NPIV support statistics. Refer to Virtual port statistics on page 38 for methods to display virtual port statistics.
Refer to Confirming driver package installation on page 14 for methods to verify driver installation.
The HBA not registering with the name server or cannot access storage.
Follow recommended action in message. Resolve critical-level messages and multiple major or minor-level messages relating to the same issue as soon as possible. For details on event messages, refer to Event logs on page 30.
Do not uninstall the driver using the Device Manager if you have used the Brocade installer programs to install driver instances. Always use the Brocade installer programs to remove the driver. Refer to Files needed for bfad.sys message appears when removing driver on page 13 for more information.
Cannot roll back driver on all HBA instances using Device Manager
Installing the driver using the Brocade driver installer program (bfa_installer.exe) or Software Installer (GUI or command-based application), then rolling back driver HBA instances using the Device Manager.
- Install the driver for each HBA instance using the Device Manager, then roll back the driver using Device Manager.
- Use the driver installer program (bfa_installer.exe) or Brocade Software Installer (GUI or command-based application) to install or upgrade the driver, then use the Brocade Software Uninstaller to roll back drivers on all HBA instances in one step.
Refer to Cannot roll back driver on all HBA instances using Device Manager on page 14 for more information.
If troubleshooting actions in Table 2 do not resolve problems, check the installed version of the HBA (chip revision) and driver (fw version) using the adapter --query BCU command. To use this command, refer to Collecting data using BCU commands on page 28. Refer to release notes posted on the Brocade HBA web site (www.brocade.com/hba) for known problems relating to the HBA and driver versions.
NOTE
Verifying installation
Problems with HBA operation may be due to improper hardware or software installation, incompatibility between the HBA and your host system, unsupported SFPs installed on the HBA, improper fiber optic cable connected to the fabric, or the HBA not operating within specifications. Determine if problems may exist because of these factors by reviewing your installation with information in the Brocade Fibre Channel HBA Installation and Reference Manual listed in Table 3.
TABLE 3
Information
Hardware and software compatibility information. Software installation packages supported by host operating system and platforms. Hardware and software installation instructions. Product specifications.
Cannot roll back driver on all HBA instances using Device Manager
When using the Windows Device Manager, you can only roll back the driver for the first HBA instance. This occurs if you perform the following sequence of steps:
1. Install the driver using the Brocade driver installer program (bfa_installer.exe) or Software Installer (GUI or command-based application).
2. Roll back the driver for HBA instances using Device Manager.
To avoid this problem, use one of the following methods:
- Install the driver for each HBA instance using the Device Manager, then roll back the driver using Device Manager.
- Use the driver installer program (bfa_installer.exe) or Brocade Software Installer (GUI or command-based application) to install or upgrade the driver, then use the Brocade Software Uninstaller to roll back drivers on all HBA instances in one step.
- A zone is created on the attached switch that contains only the PWWN of the storage system port where the boot LUN is located and the PWWN of the HBA port.
- BIOS or EFI is enabled to support boot over SAN from a specific HBA port.
- BIOS or EFI is configured to boot from a specific LUN.
- The host's operating system, HBA driver, and other necessary files are installed on the boot LUN.
- Storage devices and targets not being discovered by the device manager or appearing incorrectly in the host's device manager.
- Improper or erratic behavior of HCM (installed driver package may not support HCM version).
- Host operating system not recognizing HBA installation.
- Operating system errors (blue screen).
If the driver is not installed, try re-installing the driver or re-installing the HBA hardware and then the driver. You can use HCM and tools available through your host's operating system to obtain information such as driver name, driver version, and HBA port WWNs.
NOTE
Windows
Use the Device Manager to determine driver installation. Verify if the driver is installed and Windows is recognizing the HBA using the following steps.
1. Open the Device Manager.
2. Expand the list of SCSI and RAID controllers.
3. Right-click the Brocade FC HBA model where you are installing the driver. If you do not see this entry or Fibre Channel Controller displays with a yellow question mark under Other Devices, the driver is not installed.
4. Select Properties to display the Properties dialog box.
5. Click the Driver tab to display the driver date and version. Click Driver Details for more information.
NOTE
If the driver is not installed, try re-installing the driver or re-installing the HBA hardware and then the driver.
Linux
Verify if the HBA driver installed successfully using the following commands:
# lspci
This is a utility that displays information about all PCI buses in the system and all devices connected to them.
# lsmod
This command displays information about all loaded modules. If bfa appears in the list, the HBA driver is loaded to the system.
# dmesg
This command prints kernel boot messages. For the bfa entry, the HBA model and driver version should display if the hardware and driver are installed successfully.
# modprobe -l bfa
This lists the bfa module if it is available to the system. If bfa displays, the driver module is installed; use lsmod to confirm that it is currently loaded.
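The lsmod check above can be scripted when many hosts need to be verified. The following is a minimal Python sketch, not part of any Brocade tooling; the function name and the sample text (including the module sizes) are assumptions made for illustration.

```python
def module_is_loaded(lsmod_output: str, module: str = "bfa") -> bool:
    """Return True if `module` appears in the first column of lsmod output."""
    for line in lsmod_output.splitlines():
        fields = line.split()
        # The first column of each row is the module name; the header row
        # starts with "Module", so it never matches a real module name.
        if fields and fields[0] == module:
            return True
    return False


# Illustrative lsmod text with made-up sizes, not captured from a real host.
SAMPLE_LSMOD = """Module                  Size  Used by
bfa                   412345  0
scsi_transport_fc      45678  1 bfa
"""

if __name__ == "__main__":
    print(module_is_loaded(SAMPLE_LSMOD))
```

Matching on the first column only avoids a false positive when bfa appears solely in the "Used by" column of another module.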
Solaris
Verify if the HBA driver installed successfully using the following commands.
pkginfo -l bfa
This displays details about installed Brocade HBA (bfa) drivers. Look for information as in the following example. Note that the VERSION may be different, depending on the driver version you installed. The ARCH and DESC information may also be different, depending on your host system platform. If the HBA driver package is installed, bfa_pkg should display with a "completely installed" status.
PKGINST:  bfa
NAME:     Brocade Fibre Channel Adapter Driver
CATEGORY: system
ARCH:     sparc&i386
VERSION:  alpha_bld31_20080502_1205
BASEDIR:  /
VENDOR:   Brocade
DESC:     32 bit & 64 bit Device driver for Brocade Fibre Channel adapters
PSTAMP:   20080115150824
INSTDATE: May 02 2008 18:22
HOTLINE:  Please contact your local service provider
STATUS:   completely installed
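As a rough illustration of checking the pkginfo output programmatically, the Python sketch below parses the field and value pairs and tests the STATUS field. The function names are hypothetical, and the sketch assumes each field sits on one line in FIELD: value form, as in the example above.

```python
def parse_pkginfo(text: str) -> dict:
    """Parse `pkginfo -l` style output into a {FIELD: value} dictionary."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as "18:22" stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def driver_package_installed(text: str, package: str = "bfa") -> bool:
    """Return True if the package is listed with a completely installed status."""
    fields = parse_pkginfo(text)
    return (fields.get("PKGINST") == package
            and fields.get("STATUS") == "completely installed")


# Sample text taken from the example output shown in this section.
SAMPLE_PKGINFO = """PKGINST:  bfa
NAME:     Brocade Fibre Channel Adapter Driver
VERSION:  alpha_bld31_20080502_1205
INSTDATE: May 02 2008 18:22
STATUS:   completely installed
"""

if __name__ == "__main__":
    print(driver_package_installed(SAMPLE_PKGINFO))
```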
VMware
Verify if the HBA driver installed successfully using the following commands:
vmkload_mod -l
This lists installed driver names, R/O and R/W addresses, and whether the ID is loaded. Verify that an entry for bfa exists and that the ID is loaded.
cat /proc/vmware/version
This displays the latest versions of installed drivers. Look for a bfa entry and related build number.
If the system does not freeze when rebooted and operates correctly, use the following information to resolve the problem:
- Try rebooting the system without any connectivity to the switch. This will help isolate any hang caused by switch and device interactions.
- Reseat SFPs in the HBA. Determine whether the installed SFPs are faulty by observing LED operation by HBA ports. If all LEDs are flashing amber, the SFP is invalid and may not be a required Brocade model. You can also verify SFP operation by replacing them with SFPs in known operating condition. If the problem is resolved after replacement, the original SFP is faulty.
- Check for conflicts with attached devices. Verify that data speed (1-8 Gbps) and connection topology (for example, point-to-point) for devices attached to the HBA are compatible with settings on the HBA port. Although auto may be set, configuring settings manually on the HBA port and devices may allow connection. Also, note that the HBA only supports point-to-point connection topology. Refer to the Brocade Fibre Channel HBA Administrator's Guide for procedures to configure HBA ports.
NOTE
Observe the LEDs by HBA ports. Illuminated LEDs indicate connection, link activity, and connection speed negotiated with the attached device. Refer to LED Operation in the Specifications chapter of the Brocade Fibre Channel HBA Installation and Reference Manual.
If the system freezes, perform the following tasks:
- Verify if the host system firmware supports PCIe specifications listed in the Hardware and Software Compatibility section, Introduction chapter, of the Brocade Fibre Channel HBA Installation and Reference Manual. If not, download a firmware update to support the HBA.
- On Windows systems, determine when the system freezes during the boot process. If it freezes as the driver loads, uninstall and reinstall the driver. If it freezes during hardware recognition, uninstall both the driver and HBA, then reinstall both.
- Remove the HBA and reboot the system. If the system boots, reinstall the HBA.
- Reseat the HBA.
- Uninstall and reinstall the driver.
- Try installing the HBA into another host system. If the problem does not occur, the HBA may not be compatible with the original host system. If the problem occurs in the new system, replace the HBA.
- The agent is not running.
- The agent is not accepting connections on the expected port.
- The agent is not listening on the expected port.
- Communication between the client and agent is blocked by a firewall preventing access to the port (usually only a consideration for remote HCM management).
NOTE
This command is a single line. The localhost can be replaced with a different IP address.
wget --no-check-certificate https://admin:password@localhost:34568/JSONRPCServiceApp/SupportSaveController.do
If successful, the file SupportSaveController.do (actually a zip format file) will contain the data from the HCM agent.
4. If you are managing a VMware host system through HCM from a remote system, the host's firewall may be blocking TCP/IP port 34568, which allows agent communication with HCM. Use the following command to open port 34568:
/usr/sbin/esxcfg-firewall -o 34568,tcp,out,https
Use Windows Firewall with Advanced Security (WFAS) to open port 34568.
NOTE
You can change the default communication port (34568) for the agent using procedures in the Installation chapter of the Brocade Fibre Channel HBA Installation and Reference Manual. Refer to the section on modifying HCM agent operation.
5. If HCM is still unable to connect to the HCM agent after using the preceding steps, collect the following data and send to your Support representative for analysis:
- Data collected from the previous step in SupportSaveController.do.
- Data from the HCM application SupportSave feature. Select Tools > SupportSave to generate a supportsave file. The data file name and location display when the SupportSave feature runs.
- HBA agent files on the HBA host (where the HCM agent is installed). Collect these files using the following command:
tar cvfz hbaagentfiles.tgz /opt/hbaagent
- Data collected on the HBA host from the bfa_supportsave feature using the following command:
bfa_supportsave
Output is collected to the file and location specified when the SupportSave feature runs.
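Before collecting data, it can help to confirm from the HCM client whether the agent port is reachable at all, since a refused or timed-out connection points to the agent or a firewall rather than to HCM. The following Python sketch uses only the standard library; the function name is an assumption, and 34568 is the default agent port described above.

```python
import socket


def agent_port_reachable(host: str = "localhost", port: int = 34568,
                         timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (including timeouts and refusals)
        # when the port cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(agent_port_reachable())
```

A successful connection only shows that something is listening on the port; it does not confirm that the listener is the HCM agent.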
Windows systems
Perform the following tasks to isolate and resolve the problem.
1. Verify that the agent is running by executing the appropriate status command for your operating system described in the Installation chapter of the Brocade Fibre Channel HBA Installation and Reference Manual. Refer to the section on modifying HCM agent operation.
2. If you receive a message that the hcmagent is stopped, restarting the agent should resolve the problem. To restart, use the appropriate start command for your operating system, which is also described in the Brocade Fibre Channel HBA Installation and Reference Manual. Note that one command described in the manual restarts the agent, but the agent will not restart if the system reboots or the agent stops unexpectedly. Another command restarts the agent, and the agent will also restart if the system reboots.
3. If the HCM agent starts, verify which TCP port the agent is listening on by executing the following command at the Windows command prompt:
netstat -nao | findstr 34568
1960 in the last column is the process identifier for the Windows process listening on the TCP port. Note that this identifier may be different on your system.
4. Enter the following command to confirm that the process identifier bound to TCP port 34568 is for the hcmagent.exe process:
tasklist /svc | findstr 1960
The following should display if the identifier from step 3 is bound to TCP port 34568:
hcmagent.exe 1960 hcmagent
5. If you are managing a Windows 2008 host system through HCM from a remote system, the host's firewall may be blocking TCP/IP port 34568. Use Windows Firewall with Advanced Security (WFAS) to open port 34568.
NOTE
You can change the default communication port (34568) for the agent using procedures in the Installation chapter of the Brocade Fibre Channel HBA Installation and Reference Manual. Refer to the section on modifying HCM agent operation.
6. If the hcmagent is running and listening on port 34568 and there are no firewall issues (as explained in step 5), but you get the same "Failed to connect to agent on host..." error when using HCM, collect the following data. Send this data to your Support representative for analysis:
- Copies of output from the commands in step 3 and step 4.
- Files from the output directory created after you execute the bfa_supportsave feature. To collect these files, execute the following command:
bfa_supportsave
Data collected by this command is saved to a subdirectory named bfa_ss_out. In Windows Explorer, right-click this directory and select Send To > Compressed (zipped). This creates a zip file that you can send to your Support representative.
- Build information for the HCM application. Select Help > About in HCM to display the version, build identification, and build date.
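Steps 3 and 4 above can be combined in a small script that extracts the process identifier of whatever is listening on the agent port. The Python sketch below parses text in the usual five-column netstat -nao layout (Proto, Local Address, Foreign Address, State, PID); the function name and the sample rows are assumptions for illustration, not captured output.

```python
def listening_pids(netstat_output: str, port: int = 34568) -> set:
    """Return the PIDs of processes listening on `port` in netstat -nao text."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected TCP row: Proto, Local Address, Foreign Address, State, PID.
        if len(fields) == 5 and fields[0] == "TCP" and fields[3] == "LISTENING":
            # Compare only the port part of the local address column.
            if fields[1].rsplit(":", 1)[-1] == str(port):
                pids.add(int(fields[4]))
    return pids


# Illustrative rows only; addresses and the last two PIDs are made up.
SAMPLE_NETSTAT = """
  TCP    0.0.0.0:34568      0.0.0.0:0          LISTENING       1960
  TCP    0.0.0.0:135        0.0.0.0:0          LISTENING       888
  TCP    10.0.0.5:50123     10.0.0.9:34568     ESTABLISHED     2044
"""

if __name__ == "__main__":
    print(listening_pids(SAMPLE_NETSTAT))
```

Checking the local address column and the LISTENING state filters out rows that a plain findstr 34568 would also match, such as outbound connections to port 34568 on another host.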
NOTE
- Damaged fiber optic cables. (Note that damaged cables can also cause errors and invalid data on links.)
- Fiber optic cables may not be rated or compatible with HBA port speeds. Refer to Fibre Optic Cable specifications in the Brocade Fibre Channel HBA Installation and Reference Manual.
- Faulty switch or HBA SFPs. Verify if an SFP is the problem by connecting a different link to the HBA port or, if convenient, replacing the cable with a cable of known quality. If the errors or invalid data on the link still indicate a cable problem, the SFP may be faulty. Try replacing the SFP.
- SCSI retries and timeouts determine communication between HBA and storage. Dropped packets cause timeouts. Packets can drop because of SFP issues on the HBA or switch; possibly the SFP is not compatible with the HBA but is compatible with the switch, or vice versa. You can run the BCU port --stats command to display port statistics, such as error and dropped frames.
Table 4 lists HCM options and BCU commands, as well as Fabric OS commands that you can use to determine link status.
TABLE 4
Application
HCM
Chapter 3, Tools for Collecting Data Fabric OS Administrators Guide Fabric OS Troubleshooting and Diagnostics Guide
Chapter
In this chapter
For detailed information
Data to provide support
Collecting data using host system commands
Collecting data using BCU commands and HCM
Collecting data using Fabric OS commands
Event logs
Statistics
Diagnostics
Collecting SFP data
Collecting port data
Authentication settings
QoS and target rate limiting settings
Persistent binding
- Support Save
- Diagnostics
- Port logs
- Port statistics and properties
- HBA properties
- Host operating system commands
NOTE
Output from all of these commands is captured using the Support Save feature.
TABLE 5
Task
Windows
In Windows registry location HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\PCI devcon find pci\* msinfo32.exe Click the plus sign (+) next to Components to view hardware details. Windows Task Manager, tasklist.exe Windows Task Manager, tasklist.exe Windows Task Manager, perfmon.exe
VMware
lspci -vv, esxcfg-info -w
Solaris
prtdiag -v, prtconf -pv
lsdev
esxcfg-info -a
ps -efl, top top, vmstat -m vmstat, VM Performance: esxtop [first type 'v', 'e' then enter vm# in the list down], Disk Performance: esxtop [type 'v' then 'd']. vmkload_mod -l vmkload_mod -l
List for driver modules To check for Brocade Fibre Channel adapter (BFA) driver module
TABLE 5
Task
Windows
System Category in Windows Event Viewer (eventvwr.exe) systeminfo.exe
VMware
/var/log/message*, /var/log/vmkernel*, /var/log/vmkwarning*, /proc/vmware/log
cat /etc/vmware-release
Solaris
dmesg, /var/adm/message*
System log message location NOTE: For more information, refer to Host system logs on page 30. Show OS distribution info
/etc/bfa.conf Windows Registry (HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\bfad\Parameters\Device), HBA Flash Windows Registry (HKEY_LOCAL_MACHINE\HARDWARE\DEVICEMAP\Scsi\Scsi Port x) /dev/bfa*
/kernel/drv/bfa.conf
/dev/bfa*
(Release 1.0) /devices/pci*/pci*/fibre-channel@0:devctl, (Release 1.1 and later) /devices/pci*/pci*/bfa@0:devctl
- In HCM, launch Support Save through the Tools menu.
- Through BCU, enter the bfa_supportsave command.
- Through your internet browser (Internet Explorer 6 or later or Firefox 2.0 or later), you can collect bfa_supportsave output if you do not have root access, do not have access to file transfer methods such as FTP and SCP, or do not have access to the Host Connectivity Manager (HCM).
A bfa_supportsave collection can also occur automatically for a port crash event.
The Support Save feature saves the following information:
NOTE
- HBA model and serial number.
- HBA firmware version.
- Host model and hardware revision.
- All support information.
- HBA configuration data.
- All operating system and HBA information needed to diagnose field issues.
- Information about all HBAs in the system.
- Firmware and driver traces.
- Syslog message logs.
- Windows System Event log .evt file.
- HCM GUI-related engineering logs
- Events
- HBA configuration data
- Environment information
- Data.xml file
- Vital CPU, memory, network resources
- HCM Agent (logs, configuration)
- Driver logs (bfa_supportsave output)
- Core files
Master and Application logs are saved when Support Save is initiated through HCM, but not through BCU.
NOTE
where:
[output_dir]  An optional parameter that specifies the directory where you want output saved. If not specified, output is saved as a directory in the current working directory as bfa_ss_out.
Messages display as the system gathers information. When complete, an output file and directory display. The directory name specifies the date when the file was saved. For more information on the bfa_supportsave feature, refer to the Host Connectivity Manager (HCM) Administrator's Guide.
where localhost is the IP address of the server from which you want to collect the bfa_supportsave information.
2. Log in using the factory default user name (admin) and password (password). Use the current user name and password if they have changed from the default. The File Download dialog box displays, prompting you to save the SupportSaveController.do file.
3. Click Save and navigate to the location where you want to save the bfa_supportsave file.
Port crash events have a CRITICAL severity and you can view the details in the Master Log and Application Log tables in HCM. For more information on these logs, refer to HCM logs on page 32.
BCU
Collects only driver-related logs and configuration files.
Browser
Collects driver-related and HCM Agent logs and configuration files.
HCM
Collects HCM, driver-related, and HCM Agent logs and configuration files.
where:
list  Lists all adapters in the system. For each adapter in the system, a brief summary line is displayed.
The adapter --query command displays adapter information, such as the current version of the HBA (chip revision) and driver (fw version), maximum port speed, model information, serial number, number of ports, PCI information, pwwn, nwwn, hardware path, and flash information (such as firmware version).
adapter --query <ad_id>
where:
ad_id  ID of the adapter (HBA) that you want to query.
authUtil
Use this command to display and set local switch authentication parameters.
cfgShow
Displays zone configuration information for the switch. You can use command output to verify target ports (by port WWN) and LUNs that are intended to be accessible from the HBA.
fcpProbeShow
Use this command to display the Fibre Channel Protocol daemon (FCPd) device probing information for the devices attached to a specified F_Port or FL_Port. This information includes the number of successful logins and SCSI INQUIRY commands sent over this port and a list of the attached devices.
nsShow
Use this command to display local NS information about all devices connected to a specific switch. This includes information such as the device PID, device type, and the port and node WWN.
zoneshow
Use this command without parameters to display all zone configuration information (both defined and enabled).
portErrShow
Use this command to display an error summary for all switch ports.
portLogShow
Use this command to display the port log for ports on a switch.
portLogShowPort
Use this command to display the port log for a specified switch port.
portPerfShow
Use this command to display throughput information for all ports on the switch.
portStatsShow
Use this command to display hardware statistics counters for a specific switch port.
portShow
Use this command to display information and status of a specified switch port, including the speed, ID, operating state, type, and WWN.
secAuthSecret
Use this command to manage the DH-CHAP shared secret key database used for authentication. This command displays, sets, and removes shared secret key information from the databases.
sfpShow
Use this command to display detailed information about specific SFPs installed in a switch.
switchShow
Use this command to display switch and port information. Output may vary depending on the switch model. Use this information to determine the fabric port WWN and PID connected to an HBA port. The output also shows the topology, speed, and state of each port on the switch.
Event logs
Event messages that occur during HBA and driver operation are important tools for isolating and resolving problems. Messages provide descriptions of the event, severity, time and date of the event, and in some cases, cause and recommended actions. These messages are captured in logs. Monitoring events in these logs allows early fault detection and isolation on a specific HBA. The following types of logs are available:
- Message ID
- Message text
- Message arguments
- Severity level
- Cause
- Recommended action
Brocade HBA event message files are installed in the HBA driver installation directory for each supported operating system. Table 8 provides the location of the message files for each system.
TABLE 6
Linux VMware Solaris Windows
Operating System
Table 7 describes the logs for each supported operating system, where the logs are stored, and how to view them.
TABLE 7
Operating System    Location
Solaris             /var/adm/messages
Windows             Not applicable
Linux               /var/log/message
VMware (1)          /var/log/message*, /var/log/vmkernel*, /var/log/vmkwarning*, /proc/vmware/log
1. For ESX Server Console operating system. For Guest system, refer to information in Windows or Linux.
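When a system log is large, it may be quicker to pull out only the lines that mention the HBA driver. The Python sketch below filters log lines for the bfa driver name; the function name and the sample lines are hypothetical, and the exact wording of real driver messages is not assumed.

```python
import re

# Match "bfa" as a whole word so unrelated words containing it are skipped.
BFA_PATTERN = re.compile(r"\bbfa\b", re.IGNORECASE)


def bfa_messages(log_lines):
    """Return the log lines that mention the bfa driver, without newlines."""
    return [line.rstrip("\n") for line in log_lines if BFA_PATTERN.search(line)]


# Made-up log lines for illustration only.
SAMPLE_LOG = [
    "May  2 18:22:01 host1 kernel: bfa: example driver message\n",
    "May  2 18:22:05 host1 sshd[812]: session opened\n",
    "May  2 18:23:10 host1 kernel: scsi host3: bfa\n",
]

if __name__ == "__main__":
    for message in bfa_messages(SAMPLE_LOG):
        print(message)
```

To use it on a real host, pass an open file object for the log location listed for your operating system.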
You can view all event messages that can display for a Brocade HBA by viewing HTML files that are loaded to your system as the driver package installs. These files contain all message information that can display on system logs for the Brocade HBA. View these files through your internet browser. Table 8 provides the location of the Brocade HBA message files for each supported system.
TABLE 8
Linux VMware Solaris Windows
Operating System
TABLE 9
bfal_aen_itnim.html
bfal_aen_lport.html
bfal_aen_port.html
TABLE 9
hba_error_codes.doc
Adjust the logging level, or the types of messages logged to your system log that relate to HBA driver operation, using the following HCM options and BCU commands.
NOTE
HCM logs
You can view data about HBA operation through the following HCM logs. Both of these logs display on the bottom of the HCM main window. Click the Master Log or Application Log to toggle between logs.
- The Master Log displays informational and error messages during HBA operation. This log contains the severity level, event description, date and time of event, the function that reported the event (such as a specific HBA port or remote target port), WWN of device where event occurs, and other information.
- The Application Log displays informational and error messages related to HBA discovery or HCM application issues.
Master Log
The Master Log displays event information in seven fields:
Sr No.
Sequence number assigned to the event, in ascending order of occurrence.
Severity
Event severity level (informational, minor, major, or critical).
- Critical-level messages indicate that the software has detected serious problems that will eventually cause a partial or complete failure of a subsystem if not corrected immediately. Examples of these could be a power supply failure or rise in temperature.
- Major messages represent conditions that do not impact overall system functionality significantly. Examples of these could be timeouts on certain operations, failures of certain operations after retries, invalid parameters, or failure to perform a requested operation.
- Minor messages highlight a current operating condition that should be checked or it might lead to a failure.
- Information-level messages report the current non-error status of the system components; for example, the online and offline status of a fabric port.
WWN
World Wide Name of HBA where event occurred.
Category
The category or type of event. Categories define the component where events occur:
Adapter - Events relating to the HBA (Adapter).
Port - Events relating to a specific port on the HBA.
LPORT - Events relating to a specific logical port.
RPORT - Events relating to a specific remote initiator or target port.
ItNIM - Events relating to an initiator-target nexus. Examples of these include end to end target discovery, initiator target connectivity, and loss of connectivity.
Audit - Audit events.
IOC - Driver and firmware events involving the I/O controller on the HBA.
Subcategory
Subcategory of main category.
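The severity levels described above lend themselves to a simple triage rule: act on every critical event, and on major or minor events that repeat for the same issue. The Python sketch below illustrates one way to express that rule. The names are hypothetical, and grouping events by an "issue" string is an assumption, since the Master Log does not define such a field.

```python
from collections import Counter
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels in ascending order of urgency."""
    INFORMATION = 0
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


def needs_prompt_attention(events):
    """Return critical events plus repeated major/minor events, worst first.

    `events` is a list of (Severity, issue) pairs, where `issue` is any label
    the operator uses to group related messages (an assumption of this sketch).
    """
    repeatable = (Severity.MAJOR, Severity.MINOR)
    repeats = Counter(issue for sev, issue in events if sev in repeatable)
    flagged = [
        (sev, issue) for sev, issue in events
        if sev is Severity.CRITICAL or (sev in repeatable and repeats[issue] > 1)
    ]
    # Sorting is stable, so events of equal severity keep their original order.
    return sorted(flagged, key=lambda event: event[0], reverse=True)


if __name__ == "__main__":
    sample = [
        (Severity.MINOR, "port 0 link"),
        (Severity.INFORMATION, "port 1 online"),
        (Severity.CRITICAL, "ioc heartbeat"),
        (Severity.MINOR, "port 0 link"),
        (Severity.MAJOR, "rport timeout"),
    ]
    for severity, issue in needs_prompt_attention(sample):
        print(severity.name, issue)
```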
Application Log
The Application Log displays all application-related informational and error messages, as well as the following attributes.
Date and time the message occurred. Severity of the message. Description of the message. The agent IP address.
Syslog support
You can configure the HCM agent to forward events to a maximum of three syslog destinations using the Syslog option on the HCM Configure menu. These events will display in the operating system logs for systems such as Solaris and Linux. For procedures to configure syslog destinations, refer to the Brocade Fibre Channel HBA Administrator's Guide.
Statistics
You can access a variety of statistics using BCU commands and HCM. Use these statistics to monitor HBA performance and traffic between the HBA and LUNs and isolate areas that impact performance and device login. You can display statistics for the following:
- HBA ports
- IO controller
- Virtual ports (vport)
- Logical ports (lport)
- Remote ports (rport)
- FCP initiator mode
- Fabric (BCU only)
- Targets
- Security authentication
This section provides an overview of these statistics and how to access them. For more detail, refer to the Brocade Fibre Channel HBA Administrator's Guide.
Port statistics
Use BCU and HCM to display a variety of port statistics, such as transmitted and received frames and words, received loop initialization primitive (LIP) event counts, error frames received, loss of synchronization, link failure and invalid CRC counts, and end of frame (EOF) errors. Use these statistics to isolate link and frame errors. For example, loss of sync and loss of signal errors indicate a physical link problem. To resolve these problems, check cables, SFPs on the HBA or switch, and patch panel connections.
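Because port counters are cumulative, it is usually the change between two readings that shows whether a link problem is still active. The Python sketch below compares two snapshots of counters and reports the physical-link counters that increased. The counter names are hypothetical placeholders and do not reproduce the labels that BCU or HCM display.

```python
# Hypothetical counter names for physical-link errors; substitute the labels
# shown by your own port statistics output.
PHYSICAL_LINK_COUNTERS = ("loss_of_sync", "loss_of_signal",
                          "link_failure", "invalid_crc")


def incrementing_link_errors(before: dict, after: dict) -> dict:
    """Return {counter: increase} for link-error counters that grew."""
    increases = {}
    for name in PHYSICAL_LINK_COUNTERS:
        delta = after.get(name, 0) - before.get(name, 0)
        if delta > 0:
            increases[name] = delta
    return increases


if __name__ == "__main__":
    first = {"loss_of_sync": 4, "loss_of_signal": 0, "invalid_crc": 10}
    second = {"loss_of_sync": 9, "loss_of_signal": 0, "invalid_crc": 10}
    print(incrementing_link_errors(first, second))
```

Any counter reported here suggests checking cables, SFPs, and patch panel connections, as described above.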
where: port_id ID of the port for which you want to display statistics.
IOC statistics
Use BCU and HCM to display port-level statistics for the I/O controller (IOC). The I/O controller refers to the firmware entity controlling the port. The following types of IOC statistics are displayed:
- IOC driver
- IOC firmware
- Firmware IO
- Firmware port FPG
- Firmware port PHYSM
- Firmware port LKSM
- Firmware port SNSM
where:
ioc_id  ID of the I/O controller for which you want to display statistics.
Fabric statistics
Use BCU and HCM to display statistics for fabric login (FLOGI) activity and fabric offlines and onlines detected by the port. Use these statistics to help isolate fabric login problems. Following are two examples of how to use these statistics for troubleshooting:
- If the HBA is not showing in the fabric, check the FLOGI sent and FLOGI accept statistics. If the counts do not match, the switch or fabric may not be ready to respond. This is normal as long as it does not persist. If the problem persists, this could indicate a problem in the fabric or a protocol issue between the HBA and fabric.
- If fabric offline counts increase and fabric maintenance is not being done, this may indicate a serious fabric problem. Slow fabric performance or hosts unable to address storage could also be seen.
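One way to judge whether a FLOGI mismatch persists is to sample the sent and accept counters several times and see whether the gap between them keeps growing. The Python sketch below encodes that interpretation; treating "persists" as "the gap grew in each of the last few intervals" is an assumption of this sketch, as are the function and parameter names.

```python
def flogi_gap_growing(samples, min_intervals: int = 2) -> bool:
    """Return True if (sent - accepted) grew in each of the last intervals.

    `samples` is a sequence of (flogi_sent, flogi_accept) counter pairs read
    at regular intervals, oldest first.
    """
    gaps = [sent - accepted for sent, accepted in samples]
    # One boolean per interval: did the gap widen since the previous reading?
    growth = [later > earlier for earlier, later in zip(gaps, gaps[1:])]
    return len(growth) >= min_intervals and all(growth[-min_intervals:])


if __name__ == "__main__":
    # The gap widens at every reading, so the mismatch is persisting.
    print(flogi_gap_growing([(1, 1), (2, 1), (3, 1), (4, 1)]))
    # One early miss, then every login accepted: the gap stays constant.
    print(flogi_gap_growing([(1, 0), (2, 1), (3, 2)]))
```

A constant gap reflects an earlier, transient mismatch, which the text above describes as normal.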
where: port_id ID of the HBA port for which you want to display statistics.
- Port login (PLOGI) activity
- Address discovery (ADISC) activity
- Logout (LOGO) activity
- RSCNs received
- Process logins (PRLI) received
- Hardware abstraction layer (HAL) activity
As an example of using these statistics for troubleshooting, if the host cannot see the target, you can verify that the remote port (rport) is reporting itself online by comparing the rport online and rport offline statistics. The rport online counter should be one greater than the rport offline counter. If not, clear the counters and retry connecting to the remote port. Verify the rport online and rport offline statistics again.
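The comparison described above reduces to a single check, shown here as a minimal Python sketch. The function name is an assumption; the two arguments are the rport online and rport offline counter values read from the statistics.

```python
def rport_reports_online(online_count: int, offline_count: int) -> bool:
    """Return True if the online counter is exactly one above the offline counter."""
    return online_count == offline_count + 1


if __name__ == "__main__":
    print(rport_reports_online(online_count=3, offline_count=2))
    print(rport_reports_online(online_count=2, offline_count=2))
```

If the check fails, clear the counters, retry the connection, and read the statistics again, as the text describes.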
where:
port_id  ID of the port for which you want to display rport statistics.
lpwwn  Displays the logical port world wide name. This is an optional argument. If the -l lpwwn argument is not specified, the base port is used.
rpwwn  Displays the remote port's port world wide name.
where:
port_id  ID of the port for which you want to display statistics.
lpwwn  Logical port world wide name. This is an optional argument. If the -l lpwwn argument is not specified, the base port is used.
rpwwn  Remote port world wide name.
- Name server (NS) port logins (plogin) activity
- Register symbolic port name (RSPN_ID) identifier activity
- Register FC4 type identifier (RFT_ID) activity
- Get all port ID requests for a given FC4 type (NS_GID_FT) activity
- Retries
- Timeouts
Use these statistics to help determine if the HBA is not registering with the name server or cannot access storage. Following are examples of how these statistics indicate these problems:
- If name server port login (NS PLOGI) error rejects and unknown name server port login response (NS login unknown rsp) errors increase, then the HBA most likely cannot log in to the name server.
- If name server register symbolic port name identifier (NS RSPN_ID) or name server register FC4 type identifier response (NS RFT_ID rsp) errors or rejects (NS RFT_ID rejects) are increasing, the HBA has a problem registering with the name server.
- If name server get all port ID response (NS GID_FT rsp), rejects (NS GID_FT rejects), or unknown responses (NS GID_FT unknown rsp) are increasing, the HBA has a problem querying the name server for available storage.
where:
port_id  ID of the port for which you want to display statistics.
lpwwn  Logical port world wide name for which you want to display statistics. This is an optional argument. If the -l lpwwn argument is not specified, the base port is used.
rpwwn  Remote port world wide name for which you want to display statistics.
- If FDISC sent and FDISC accept statistics do not match, the fabric or switch may not be ready. This is normal as long as it does not persist. If it does persist, there may be a problem in the fabric or a protocol issue between the HBA and fabric. Note that in this case FDISC retries also increase.
- Check the No NPIV support statistics to verify that NPIV is supported and enabled on the switch.
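The two checks above can be sketched as follows. This is a hedged illustration, assuming you have already read the sent counter, the accept counter, and the No NPIV support counter from the statistics display; the function and parameter names are hypothetical.

```python
def npiv_warnings(sent: int, accepted: int, no_npiv_support: int) -> list:
    """Return a list of warnings for one snapshot of the counters."""
    warnings = []
    if sent != accepted:
        # A temporary mismatch is normal; only a persistent one is a concern.
        warnings.append(
            "sent and accept counters differ: the fabric or switch may not "
            "be ready; investigate if the mismatch persists"
        )
    if no_npiv_support > 0:
        warnings.append(
            "No NPIV support is nonzero: verify that NPIV is supported and "
            "enabled on the switch"
        )
    return warnings
```

Because a brief mismatch is expected, run the check on several snapshots before treating the first warning as a fault.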
where:
port_id    ID of the port for which you want to display statistics.
vpwwn      Displays the statistics for the virtual port by its WWN. If no port WWN is specified, the information provided is for the base vport.
where:
port_id    ID of the port for which you want to display rport statistics.
Diagnostics
Diagnostics, available through BCU commands and HCM, evaluate the integrity of HBA hardware and end-to-end connectivity in the fabric. All of these diagnostics can be used while the system is running.
Beaconing
Initiate beaconing on a specific HBA port to flash the port LEDs and make it easier to locate the HBA in an equipment room.

Initiate link beaconing to flash the LEDs on a specific HBA port and the LEDs on a connected switch port to verify the connection between HBA and switch. When you initiate link beaconing, commands are sent to the other side of the link. When the remote port receives these commands, that port's LEDs flash. The remote port sends a command back to the originating port. When that port receives this command, the port's LEDs flash.
NOTE
To initiate link beaconing, this feature must be available on the connected switch. Toggle beaconing on and off and set beaconing duration using the BCU or HCM.
where:
port_id    ID of the port for which you want to enable beaconing.
duration   Length of time between blinks.
where:
port_id    ID of the port for which you want to run a link beacon test.
on | off   Toggle on or off. If turned on, you can specify duration.
duration   Length of time between blinks.
Loopback tests
Use the BCU or the HCM to perform a loopback test for a specific port. Loopback tests require that you disable the port. The following loopback tests are available:
Internal
Random data is sent to the HBA port, then returned without transmitting through the port. The returned data is validated to determine port operation. Errors may indicate a failed port.
External
For this test, a loopback connector is required for the port. Random data is sent to the HBA port. The data transmits from the port then returns. The returned data is validated to determine port operation. Errors may indicate a failed port.
where:
port_id         ID of the port that you want to run the test.
loopback type   internal, external, serdes
speed           For 8 Gbps HBA, this is 2, 4, or 8. For 4 Gbps HBA, this is 1, 2, or 4.
duration        Length of time between blinks.
frame count     Integer from 0 to 4,294,967,295. Default is 8192.
-p pattern      Hex number. Default value is A5A5A5A5.
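The parameter limits listed above can be checked before a loopback test is started. The following is a minimal sketch under stated assumptions: the function, constant names, and returned dictionary are hypothetical helpers, not part of BCU, and only the ranges and defaults are taken from the parameter descriptions.

```python
import string

# HBA type (Gbps) mapped to the link speeds allowed for the test.
ALLOWED_SPEEDS = {8: (2, 4, 8), 4: (1, 2, 4)}
LOOPBACK_TYPES = ("internal", "external", "serdes")
MAX_FRAME_COUNT = 4_294_967_295
DEFAULT_FRAME_COUNT = 8192
DEFAULT_PATTERN = "A5A5A5A5"


def validate_loopback(hba_gbps, loopback_type, speed,
                      frame_count=DEFAULT_FRAME_COUNT,
                      pattern=DEFAULT_PATTERN):
    """Return the checked parameters, or raise ValueError if one is invalid."""
    if hba_gbps not in ALLOWED_SPEEDS:
        raise ValueError(f"unsupported HBA type: {hba_gbps} Gbps")
    if loopback_type not in LOOPBACK_TYPES:
        raise ValueError(f"unknown loopback type: {loopback_type}")
    if speed not in ALLOWED_SPEEDS[hba_gbps]:
        raise ValueError(f"speed {speed} is not valid for a {hba_gbps} Gbps HBA")
    if not 0 <= frame_count <= MAX_FRAME_COUNT:
        raise ValueError(f"frame count out of range: {frame_count}")
    if not pattern or any(c not in string.hexdigits for c in pattern):
        raise ValueError(f"pattern is not a hex number: {pattern!r}")
    return {"type": loopback_type, "speed": speed,
            "frame_count": frame_count, "pattern": pattern}
```

For example, validate_loopback(8, "internal", 4) returns the defaults for frame count and pattern, while validate_loopback(4, "external", 8) raises ValueError because 8 is not a valid speed for a 4 Gbps HBA.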
- Subtest: Three options: Internal, Serdes, and External.
- Link Speed: For 8G HBA: 2G, 4G, and 8G. For 4G HBA: 1G, 2G, and 4G.
- Frame Count: Integer from 0 to 4,294,967,295. Default value is 8192.
- Data Pattern: 8-digit hex number. Default value is A5A5A5A5.
5. Click Start.
where:
port_id       ID of the port from which you want to run the test.
pattern       Hex number.
frame count   Integer.
- Frame count: Integer.
- Data pattern: Hex number.
- Test cycle: Positive number. The default is 1.
5. Click Start.
Memory test
Use the BCU or the HCM to perform a memory test for the HBA.
NOTE
Performing the Memory test disables the HBA.
HBA temperature
Use the BCU diag --tempshow command to read the adapter's temperature sensor registers.
diag --tempshow <ad_id>
where:
port_id    ID of the HBA port from which you want to ping the remote port.
rpwwn      Remote port WWN that you want to ping.
lpwwn      Logical port WWN. 0 indicates the base port.
7. Click Start.
Trace route
Use the BCU and HCM to report the SAN path between the HBA and remote end point.
where:
port_id    ID of the port from which you want to trace the route.
rpwwn      Remote port WWN that you want to ping.
lpwwn      Logical port WWN. 0 indicates the base port.
Echo test
Use the BCU and HCM to initiate an echo test between the HBA port and a Fibre Channel end point. This sends an ECHO command and response sequence between the HBA port and target port to verify connection with the target.
where:
port_id    ID of the port for which you want to perform the test.
rpwwn      Remote port WWN that you want to ping.
lpwwn      Logical port WWN. 0 indicates the base port.
SCSI test
Use the fcdiag --scsitest BCU command to test the SCSI link between the HBA and remote port.
fcdiag --scsitest <port_id> <rpwwn> [-l lpwwn]
where:
port_id    ID of the port for which you want to test the SCSI link.
rpwwn      Remote port WWN that you want to ping.
lpwwn      Logical port WWN. 0 indicates the base port.
SFP diagnostics
SFP diagnostics provide detailed information on the SFP transceiver for a selected port, such as its health status, port speed, connector type, minimum and maximum distance, as well as details on the extended link.
where:
port_id    ID of the port for which you want to display SFP attributes.
Port log
Use the debug --portlog BCU command to display a log of Fibre Channel frames and other main control messages that were sent out and received on a specific port. You can use this information to isolate HBA and Fibre Channel protocol problems.
debug --portlog <port_id>
where:
port_id    The ID of the port for which you want to display the port log.
NOTE
If the port log is disabled, a warning message displays. Use the debug --portlogctl command to enable and disable the port log.
Port list
Use the port --list BCU command to list all physical ports on the HBA along with their physical attributes, such as PWWN, Fibre Channel address, port type, speed, and state.
port --list <port_id>
where:
port_id    ID of the port for which you want to display information.
Port query
Use the port --query BCU command to display port information, such as WWN, NWWN, state, current and configured speed, topology, received and transmitted BB_Credits, and beacon status.
port --query <port_id>
Port speed
Use the port --speed BCU command to display the current port speed setting, such as 1, 4, or 8 Gbps.
port --speed <port_id> <1|2|4|8|auto>
where:
port_id          ID of the port for which you want to display port speed.
<1|2|4|8|auto>   The speed settings, with auto being autosensing mode.
Authentication settings
Use the Brocade CLI utility (BCU) or the HCM GUI to display the HBA authentication settings and status.
where:
port_id    ID of the port for which you want to display authentication settings.
BCU commands
Use the following BCU command to determine Target Rate Limiting speed and enabled status.
ratelim --query <port_id>
where:
port_id    ID of the port for which you want to display target rate limiting settings.
Use the following BCU command to display QoS and target rate limiting enabled status and target rate limiting default speed.
port --query <port_id>
where:
port_id    ID of the port for which you want to display port information.
Use the following command to display QoS status and information for a port.
bcu qos --query <port_id>
where:
port_id    ID of the port for which you want to display target rate limiting settings.
Use the following command to determine operating speed of the remote port, QoS priority, and target rate limiting enforcement:
bcu rport --query
where:
port_id    Specifies the ID of the port for which you want to query attributes of a remote port.
HCM
Use the Port Properties panel in HCM to display configured QoS parameters.
To open the Port Properties panel:
1. Select a port in the device tree.
2. Click the Properties tab in the right pane.
Use the Remote Port Properties panel in HCM to display information on target rate limiting and QoS for the remote port.
To open the Remote Port Properties panel:
1. From the device tree, select a remote port (target or initiator).
2. Click the Remote Port Properties tab in the right pane.
Persistent binding
Persistent binding is a feature of Fibre Channel (FC) host bus adapters that enables you to permanently assign a system SCSI target ID to a specific FC device, even though the device's ID on the FC loop may be different each time the FC loop initializes. Persistent binding is available in the Windows and VMware environments only.

Use the HCM or BCU to display target ID mapping for an HBA port.

BCU
Use the pbind --list BCU command to query the list of mappings for persistent binding on a specific port.
pbind --list <port_id> <pwwn>
where:
port_id    ID of the port for which you want to query mappings.
pwwn       Port World Wide Name.
HCM
To determine SCSI target ID mappings using the Persistent Binding dialog box, perform the following steps:
1. Launch the HCM.
2. Select an HBA or port from the device tree.
3. Select Configure > Persistent Binding.
You can also right-click an HBA or port in the device tree and select Persistent Binding from the list.
Performance optimization
This chapter provides information and tools for optimizing your HBA performance.
In this chapter
- Linux tuning
- Solaris tuning
- Windows tuning
- VMware tuning
Linux tuning
Linux disk I/O scheduling reorders, delays, and merges requests for disk I/O to achieve better throughput and lower latency than would happen if all the requests were sent straight to the disk. Linux 2.6 has four different disk I/O schedulers: noop, deadline, anticipatory and completely fair queuing. Enabling the noop scheduler avoids any delays in queuing of I/O commands. This helps in achieving higher I/O rates by queuing multiple outstanding I/O requests to each disk. To enable the noop scheduler, run the following commands on your system.
for i in /sys/block/sd[b-z]/queue/scheduler
do
    echo noop > "$i"
done
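To confirm that the change took effect, you can read each scheduler file back. On Linux 2.6 the file lists the available schedulers and marks the active one with square brackets, for example "noop anticipatory deadline [cfq]". The sketch below parses that format; the function names and the sample paths are hypothetical, and the file contents are passed in as strings so the logic can be checked without touching /sys.

```python
import re


def active_scheduler(contents: str) -> str:
    """Return the scheduler name shown in square brackets."""
    match = re.search(r"\[([^\]]+)\]", contents)
    if match is None:
        raise ValueError(f"no active scheduler marked in: {contents!r}")
    return match.group(1)


def disks_not_using_noop(scheduler_files: dict) -> list:
    """Given {path: file contents}, list paths whose scheduler is not noop."""
    return sorted(
        path for path, text in scheduler_files.items()
        if active_scheduler(text) != "noop"
    )
```

On a live system, the dictionary could be built by reading each /sys/block/sd*/queue/scheduler file. An empty result means every listed disk is using noop.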
NOTE
You must disable the default scheduler because it is not tuned for achieving the maximum I/O performance.

For performance tuning on Linux, refer to the following publications
Solaris tuning
To increase I/O transfer performance, set the following parameters on your system:
- Set the maximum device read/write directive (maxphys).
- Set the Fibre disk maximum transfer parameter (ssd_max_xfer_size).
Refer to the Sun StorageTek SAM File System Configuration and Administration Guide for details on these two parameters.
Windows tuning
Windows tuning involves configuring the driver and operating system tunable parameters.
Interrupt coalescing
Default: ON
Interrupt delay
Default: 1 microsecond
Valid Range: 0 to 1125 microseconds (Note that the value of 0 disables the delay timeout interrupt.)
Interrupt latency
Default: 1 microsecond
Valid Range: 0 to 225 microseconds (Note that the value of 0 disables the latency monitor timeout interrupt.)
Interrupt Coalescing
When this feature is turned off, I/O completion requests are not coalesced by the firmware. While this helps reduce I/O latency, the host CPU is interrupted frequently, leading to a slower system response under heavy I/O load (more than 7000 I/Os per second).

When this feature is turned on, the HBA does not interrupt the host until the Interrupt delay duration elapses. Interrupt delay, together with Interrupt latency, helps to reduce the number of interrupts that the host CPU processes per second, leading to improved overall CPU utilization. However, if the number of interrupts handled within the Interrupt delay period is relatively small, performance degrades because I/O completion processing slows down.

Use the BCU ioc --intr command to configure these interrupt attributes for the desired port.
bcu ioc --intr <ioc_id> <--coalesce|-c> {on | off} [<Latency> <Delay>]
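The syntax above can be assembled and range-checked before it is run. The following is a minimal sketch, assuming the valid ranges are 0 to 1125 microseconds for Interrupt delay and 0 to 225 microseconds for Interrupt latency; the helper function and constants are hypothetical and only build the argument list, they do not execute BCU.

```python
MAX_LATENCY_US = 225   # assumed upper bound for Interrupt latency
MAX_DELAY_US = 1125    # assumed upper bound for Interrupt delay


def ioc_intr_args(ioc_id, coalesce, latency_us=None, delay_us=None):
    """Build the argument list for the ioc --intr syntax shown above."""
    args = ["bcu", "ioc", "--intr", str(ioc_id),
            "--coalesce", "on" if coalesce else "off"]
    if latency_us is None and delay_us is None:
        return args
    if latency_us is None or delay_us is None:
        raise ValueError("latency and delay must be given together")
    if not 0 <= latency_us <= MAX_LATENCY_US:
        raise ValueError(f"latency out of range: {latency_us}")
    if not 0 <= delay_us <= MAX_DELAY_US:
        raise ValueError(f"delay out of range: {delay_us}")
    # The syntax lists Latency before Delay.
    return args + [str(latency_us), str(delay_us)]
```

Returning a list rather than a single string keeps each argument separate, which suits process-launching calls such as subprocess.run if you choose to execute the command.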
OS tunable parameters
For Windows Server 2003 optimizations, see the section Storage Stack Drivers in Disk Subsystem Performance Analysis for Windows Server 2003, located on the following website:
http://download.microsoft.com

For Windows Server 2008, see the sections Performance Tuning for Storage Subsystem and I/O Priorities in Performance Tuning Guidelines for Windows Server 2008, located on the following website:
http://www.microsoft.com
VMware tuning
For performance tuning on VMware, refer to the following publications on the VMware website at www.vmware.com:
- Performance Tuning Best Practices for ESX Server 3. Refer to the following sections:
  - Storage Performance Best Practices
  - Related Publications
- Fibre Channel SAN Configuration Guide. Refer to Using ESX Server with SAN: Concepts.