Cisco LAN Switching (Clark, Hamilton, ISBN 1578700949)
Cisco Press
800 East 96th Street
Indianapolis, IN 46240 USA
Trademark Acknowledgments
All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capital-
ized. Cisco Press or Cisco Systems, Inc. cannot attest to the accuracy of this information. Use of a term in this book
should not be regarded as affecting the validity of any trademark or service mark.
Feedback Information
At Cisco Press, our goal is to create in-depth technical books of the highest quality and value. Each book is crafted
with care and precision, undergoing rigorous development that involves the unique expertise of members from the
professional technical community.
Readers' feedback is a natural continuation of this process. If you have any comments regarding how we could
improve the quality of this book, or otherwise alter it to better suit your needs, you can contact us through e-mail at
[email protected]. Please make sure to include the book title and ISBN in your message.
We greatly appreciate your assistance.
Dedications
To my wife, Debbie, for being the most supportive, understanding, patient, and loving partner a person could ever
ask for. And, to God, for giving me the ability, gifts, and privilege to work in such an exciting and fulfilling career.
-Kennedy
To my wife, Emily, the true author in our family, who taught me the joy of communication through the printed page
and who is now owed many romantic evenings in appreciation for the ones neglected. And to my four boys, Jay,
Scott, Alex and Caleb, who endured with exceeding patience the hours dad locked himself in a quiet room instead
of playing ball and camping.
-Kevin
Acknowledgments
Kennedy Clark: An avid reader of all things nerdy, I have always taken acknowledgements and dedications fairly
lightly. Having now been through the book-writing process myself, I can assure you that this will never be the case
again. Writing a book (especially one on technology that is as fast-moving as switching) is an incredibly demanding
process that warrants a huge number of thank yous. In the brief space I have here, I would like to express appreci-
ation to a small number of the people involved in this project. First, I would like to thank Kevin Hamilton, my co-
author. Kevin was willing to jump into a project that had almost been left for dead because I was feeling completely
overwhelmed by the staggering amount of work it involved. I would like to thank Radia Perlman for reading the e-
mails and Spanning Tree chapters of an unknown author. Also, the people at Cisco Press have been wonderful to
work with (I would encourage other authors to check them out). Chris Cleveland and Brett Bartow deserve special
mention. There are many people at Cisco to thank... Jon Crawfurd for giving a young NetWare guy a chance with
router technology. Stuart Hamilton for taking this project under his wing. Merwyn Andrade for being the switching
genius I someday hope to be. Tom Nosella for sticking with the project through its entirety. I owe many thanks to
the people at Chesapeake Computer Consultants. I would especially like to thank Tim Brown for teaching me one
of my first network courses and remaining a faithful friend and mentor. Also, Tom Van Meter for showing me the
ropes with ATM. Finally, a very special thanks to my wife for her never-ending love and encouragement.
And, to God, for giving me the ability, gifts, and privilege to work in such an exciting and fulfilling career.
Kevin Hamilton: A project of this magnitude reflects the hard work of many individuals beyond myself. Most
notably, Kennedy. He repeatedly amazes me with his ability to not only understand minute details for a vast array of
subjects (many of which are Catalyst related), but to reiterate them without reference to written materials months
and even years after he was exposed to the material. His keen insights into networking and unique methods
of communicating them consistently challenge me to greater professional depths. I, therefore, thank Kennedy for
the opportunity to join him in this endeavor, and for the knowledge I gained as a result of sharing ink with him. I
also must thank the staff and instructors at Chesapeake Computer Consultants for their continuous inspiration and
support as we at times felt discouraged thinking we would never write the last page. And Tim Brown, who taught
me that technology can be funny. And lastly, the staff at Cisco Press. Brett Bartow and Chris Cleveland must espe-
cially be commended for their direction and vision in this project. They worked hard at keeping us focused and
motivated. I truly believe that without their guidance, we could never have produced this book on our own.
Contents at a Glance
Part I Foundational Issues 3
Chapter 1 Desktop Technologies 5
Chapter 2 Segmenting LANs 35
Chapter 3 Bridging Technologies 55
Chapter 4 Configuring the Catalyst 83
Chapter 5 VLANs 113
Contents
Part I Foundational Issues 3
Chapter 1 Desktop Technologies 5
Legacy Ethernet 5
Carrier Sense Multiple Access with Collision Detection (CSMA/CD) 6
Addressing in Ethernet 7
LAN Frames 10
Ethernet Slot Times 11
Ethernet Frame Rates/Performance 13
Fast Ethernet 14
Full-Duplex and Half-Duplex Support 15
Autonegotiation 16
100BaseTX 17
100BaseT4 17
100BaseT2 17
100BaseFX 18
Media-Independent Interface (MII) 19
Network Diameter (Designing with Repeaters in a 100BaseX Network) 19
Practical Considerations 23
Gigabit Ethernet 24
Gigabit Architecture 25
Full-Duplex and Half-Duplex Support 27
Gigabit Media Options 27
Token Ring 29
Token Ring Operations 29
Token Ring Components 31
Summary 32
Review Questions 32
Chapter 2 Segmenting LANs 35
Why Segment LANs? 35
Segmenting LANs with Repeaters 37
Shared Bandwidth 39
Number of Stations per Segment 39
End-to-End Distance 40
Segmenting LANs with Bridges 42
LANE V2 426
NHRP 427
MPOA Configuration 429
Configuring the LECS Database with elan-id 431
Configuring the MPS 432
Configuring the MPC 435
Sample MPOA Configuration 438
Troubleshooting an MPOA Network 441
Ensuring the ELANs Are Functional between the Source and the Destination 442
Determining That All MPCs and MPSs Are Functional 442
Determining That MPCs and MPSs Discovered Each Other 442
Determining If the Threshold Is Crossed at the MPC to Initiate an MPOA
Resolution Request 442
Ensuring That the MPS and NHS Components Are Functional 445
Other Causes of MPOA Failures 445
Review Questions 446
In addition, you will see the usual battery of network device, peripheral, topology, and connection icons associated
with Cisco Systems documentation.
Foreword
With the advent of switching technology and specifically the enormously successful Catalyst Switching products
from Cisco Systems, corporations all over the world are upgrading their infrastructures to enable their networks for
high bandwidth applications. Although the original goal of most switched network design was primarily increased
bandwidth, the networks of today require much more with the advent of mission critical applications and IP Voice
emerging as mainstream networking requirements. It is therefore important not only to reap the bandwidth benefits
of Catalyst switching but also to learn sound network design principles, leveraging all of the features in the
Catalyst software suite.
One thing network designers have learned over the years is that things never get any easier when it comes to under-
standing and evaluating all of the available technologies that appear in standards bodies and are written about in
trade magazines. We read about MPOA, LANE, Gigabit Ethernet, 802.1Q, 802.1p, Layer 3 switching, OSPF, BGP,
VPN, MPLS, and many others. The key, however, to building and operating a successful network is understanding
the basic fundamentals of the relevant technologies, knowing where and how to apply them most effectively in a
network, and most importantly leveraging the successes of others to streamline the deployment of the network.
Internetworking design is part art and part science mostly due to the fact that the applications that ride on top of the
network have widely varying traffic characteristics. This represents another challenge when designing a network,
because you might well optimize it to perform for a certain application only to find that a few months later a brand
new application places entirely different demands on the network.
The science part of campus network design relies on a few basic principles. First, every user connects to a port on a
switch and so wiring closets are provisioned with Catalyst switches such as the Catalyst 5000 family to connect end
users either at 10 megabit Ethernet or increasingly 100 megabit Ethernet. The base level of switching capability
here is called Layer 2 switching.
There are typically tens to hundreds of wiring closets that need to be connected somehow. Although there are many
ways to do this, experience has taught us that a structured approach with some hierarchy is the best technique for a
stable and easily expandable network. Wiring closets then are typically consolidated into a network layer called the
distribution layer that is characterized by a combination of Layer 2 and Layer 3 switching.
If the network is large in size, there can still be a large number of distribution layer switches, and so in keeping with
the structured methodology, another layer is used to network the distribution layer together. Often called the core of
the network, a number of technologies can be used, typified by ATM, Gigabit Ethernet, and Layer 3 switching.
This probably sounds rather simple at this point; however, as you can see from the thickness of this book, there is
plenty of art (and a lot more science) involved in making your design a highly available, easy-to-manage, expandable,
easy-to-troubleshoot network and preparing you with a solid foundation for new emerging applications.
This book not only covers the science part of networking in great detail in the early chapters, but more importantly
deals with real-world experience in the implementation of networks using Catalyst products. The book's authors
not only teach this material in training classes but also have to prove that they can make the network work at
customer sites. This invaluable experience is captured throughout the book. Reading these tips carefully can save you
countless hours of time experimenting on finding the best way to fine-tune your particular network. In addition, as
part of the CCIE Professional Development series of Cisco Press, you can use the experience gained from reading
and understanding this book to prepare for one of the most sought-after professional certifications in the industry.
Stuart Hamilton, CCIE #1282
Senior Manager, Enterprise Network Design
Cisco Systems Inc.
Introduction
Driven by a myriad of factors, LAN switching technology has literally taken the world by storm. The Internet, Web
technology, new applications, and the convergence of voice, video, and data have all placed unprecedented levels of
traffic on campus networks. In response, network engineers have had to look past traditional network solutions and
rapidly embrace switching. Cisco, the router company, has jumped heavily into the LAN switching arena and
quickly established a leadership position. The Catalyst series of switches has set a new standard for performance
and features, not to mention sales.
Despite the popularity of campus switching equipment, it has been very difficult to obtain detailed and clear infor-
mation on how it should be designed, utilized, and deployed. Although many books have been published in the last
several years on routing technology, virtually no books have been published on LAN switching. The few that have
been published are vague, out-of-date, and absent of real-world advice. Important topics such as the Spanning-Tree
Protocol and Layer 3 switching have either been ignored or received inadequate coverage. Furthermore, most have
contained virtually no useful information on the subject of campus design.
This book was written to change that. It has the most in-depth coverage of LAN switching technology in print to
date. Not only does it have expansive coverage of foundational issues, but it is also full of practical suggestions.
Proven design models, technologies, and strategies are thoroughly discussed and analyzed.
Both authors have drawn on their extensive experience with campus switching technology. As two of the first
certified Catalyst instructors, they have first-hand knowledge of how to effectively communicate switching concepts.
Through design and implementation experience, they have a detailed understanding of what works, as well as what
doesn't work.
Objectives
Cisco LAN Switching is designed to help people move forward with their knowledge of the exciting field of campus
switching. CCIE candidates will receive broad and comprehensive instruction on a wide variety of switching-
related technologies. Other network professionals will also benefit from hard-to-find information on subjects such
as Layer 3 switching and campus design best practices.
Audience
Cisco LAN Switching should appeal to a wide variety of people working in the network field. It is designed for any
network administrator, engineer, designer, or manager who requires a detailed knowledge of LAN switching
technology.
Obviously, the book is designed to be an authoritative source for network engineers preparing for the switching por-
tion of the CCIE exams and Cisco Career Certifications. Cisco LAN Switching is not a quick-fix guide that helps
you cram (such books are virtually worthless when it comes to taking the CCIE practical exams). Instead, it focuses
extensively on theory and building practical knowledge. When allied with hands-on experience, this can be a potent
combination.
However, this book is designed to go far beyond test preparation. It is designed to be both a tutorial and a reference
tool for a wide range of network professionals, including the following:
People with less switching experience will benefit extensively from the foundational material
discussed in Part I. This material then transitions smoothly into the more advanced subject
matter discussed in later chapters.
Network professionals with a detailed understanding of routing but new to campus switching
will find that Cisco LAN Switching can open up a whole new world of technology.
Network engineers with extensive switching experience will find Cisco LAN Switching taking
them farther into the field. For example, much of the Spanning-Tree Protocol information in
Part II and the real-world design information in Part V has never been published before. The
Catalyst 6000 material discussed in Part VI is also completely new.
Network designers will benefit from the state-of-the-art coverage of campus design models and
the detailed discussions of opposing design strategies.
Engineers who have already obtained their CCIE will value Cisco LAN Switching as a refer-
ence tool and for design information.
Organization
The eighteen chapters and one appendix of this book fall into seven parts:
Part I: Foundational Issues. This section takes you through technologies that underlie the
material covered in the rest of the book. Important issues such as Fast Ethernet, Gigabit Ether-
net, routing versus switching, the types of Layer 2 switching, the Catalyst command-line envi-
ronment, and VLANs are discussed. Although advanced readers might want to skip some of
this material, they are encouraged to at least skim the sections on Gigabit Ethernet and VLANs.
Part II: Spanning Tree. The Spanning-Tree Protocol can make or break a campus network.
Despite the ubiquitous deployment of this protocol, very little detailed information about its
internals has been published. This section is designed to be the most comprehensive source
available on this important protocol. It presents a detailed analysis of common problems and
Spanning Tree troubleshooting. This chapter also discusses important enhancements such as
PortFast, UplinkFast, BackboneFast, and PVST+.
Part III: Trunking. Part III examines the critical issue of trunk connections, the links used to
carry multiple VLANs throughout a campus network. Chapter 8 begins with a detailed discus-
sion of trunking concepts and covers Ethernet-based forms of trunking, ISL, and 802.1Q.
Chapters 9 and 10 look at LAN Emulation (LANE) and MPOA (Multiprotocol over ATM), two
forms of trunking that utilize Asynchronous Transfer Mode (ATM).
Part IV: Advanced Features. This section begins with an in-depth discussion of the important
topic of Layer 3 switching, a technology that has created a whole new switching paradigm.
Both MLS (routing switch) and hardware-based (switching router) routing are examined. The
next two chapters examine the VLAN Trunking Protocol (VTP) and multicast-related topics
such as Cisco Group Management Protocol (CGMP) and Internet Group Management Protocol
(IGMP) Snooping.
Part V: Real-World Campus Design and Implementation. Part V focuses on real-world
issues such as design, implementation, and troubleshooting. These chapters are oriented
toward helping you benet from the collective advice of many LAN switching experts.
Part VI: Catalyst 6000 Technology. This section includes a chapter that analyzes the Cata-
lyst 6000 and 6500 models. Focusing primarily on Layer 3 switching, it discusses the impor-
tant Native IOS Mode of operation.
Part VII: Appendix. The single appendix in this section provides answers and solutions to
the Review Questions and Hands-On Labs from the book.
NOTE Notes are used for sidebar information related to the main text.
Various elements of Catalyst and Cisco router command syntax are presented in the course of each chapter. This
book uses the same conventions as the Cisco documentation:
Vertical bars (|) separate alternative, mutually exclusive elements.
Square brackets [] indicate optional elements.
Braces {} indicate a required choice.
Braces within square brackets [{}] indicate a required choice within an optional element.
Boldface indicates commands and keywords that are entered literally as shown.
Italics indicate arguments for which you supply values.
Feedback
If you have questions, comments, or feedback, please contact the authors at the following e-mail addresses. If you
let us know of any errors, we can fix them for the benefit of future generations. Moreover, being technical geeks in
the true sense of the word, we are always up for a challenging technical discussion.
Kennedy Clark [email protected]
Kevin Hamilton [email protected]
Part I: Foundational Issues
Chapter 1 Desktop Technologies
Chapter 5 VLANs
This chapter covers the following key topics:
Legacy Ethernet: This section explains the operations and implementation rules of
legacy 10 Mbps CSMA/CD systems.
LAN Frames: This section presents various common formats for transporting
packets over Ethernet.
Fast Ethernet: A now-popular desktop Ethernet migration, this uses 100 Mbps
technology. This section describes its characteristics and some of the common media
options.
Gigabit Ethernet: As the highest-speed Ethernet available today, this technology
finds immediate utility for trunking Catalysts and connecting high-performance
servers. This section describes media options and characteristics.
Token Ring: Token Ring, the other popular LAN alternative, operates very
differently from Ethernet. This section provides a brief overview of Token Ring.
Chapter 1: Desktop Technologies
Since the inception of local-area networks (LANs) in the 1970s, numerous LAN
technologies have graced the planet at one point or another. Some technologies became legends:
ArcNet and StarLAN, for example. Others became legacies: Ethernet, Token Ring, and
FDDI. ArcNet was the basis for some of the earliest office networks in the 1980s because
Radio Shack sold it for its Model II personal computer line. A simple coaxial-based
network, it was easy for office administrators to deploy for a few workstations. StarLAN,
one of the earliest twisted-pair network technologies, became the basis for the Institute of
Electrical and Electronic Engineers (IEEE) 10BaseT network. Running at 1 Mbps,
StarLAN demonstrated that networking over twisted pair was feasible. Both ArcNet and
StarLAN enjoyed limited success in the market because higher speed technologies such as
10 Mbps Ethernet and 4 Mbps Token Ring were introduced soon afterwards. With the
higher bandwidth capacity of newer network technologies and the rapid development of
higher speed workstations demanding more network bandwidth, ArcNet (now fondly
referred to as ArchaicNet) and StarLAN were doomed to limited market presence.
The legacy networks continue to find utility as distribution and backbone technologies for
both manufacturing and office environments. But like ArcNet and StarLAN, even these
technologies see higher speed networks such as Fast Ethernet, High Speed Token Ring, and
ATM crowding into the network arena. However, the legacy systems will remain for many
more years due to the existence of such a large installed base. Users will replace Ethernet
and Token Ring in phases as applications demand more bandwidth.
This chapter discusses the legacy network technologies, Ethernet and Token Ring, as well
as Fast Ethernet and Gigabit Ethernet. Although Gigabit Ethernet is not yet a popular
desktop technology, it is discussed here because of its relationship to Ethernet and its use
in Catalyst networks for trunking Catalysts together. This chapter also describes how the
access methods operate, some of the physical characteristics of each, and various frame
formats and address types.
Legacy Ethernet
When mainframe computers dominated the industry, user terminals attached either directly to
ports on the computer or to a controller that gave the appearance of a direct connection. Each
wire connection was dedicated to an individual terminal. Users entered data, and the terminal
immediately transmitted signals to the host. Performance was driven by the horsepower in the
hosts. If the host became overworked, users experienced delays in responses. Note, though,
that the connection between the host and terminal was not the cause of the delay. The users
had full media bandwidth on the link regardless of the workload of the host device.
Facility managers installing the connections between the terminal and the host experienced
distance constraints imposed by the host's terminal line technology. The technology limited
users to locations that were a relatively short radius from the host. Further, labor to install the
cables created inflated installation and maintenance expenses. Local-area networks (LANs)
mitigated these issues to a large degree. One of the immediate benets of a LAN was to reduce
the installation and maintenance costs by eliminating the need to install dedicated wires to
each user. Instead, a single cable pulled from user to user allowed users to share a common
infrastructure instead of having dedicated infrastructures for each station.
A technology problem arises when users share a cable, though. Specically, how does the
network control who uses the cable and when? Broadband technologies like cable
television (CATV) support multiple users by multiplexing data on different channels
(frequencies). For example, think of each video signal on a CATV system as a data stream.
Each data stream is transported over its own channel. A CATV system carries multiple
channels on a single cable and can, therefore, carry multiple data streams concurrently. This
is an example of frequency-division multiplexing (FDM). The initial LANs were conceived
as baseband technologies, however, which do not have multiple channels. Baseband
technologies do not transmit using FDM. Rather, they use bandwidth-sharing, which
simply means that users take turns transmitting.
Ethernet and Token Ring define sets of rules known as access methods for sharing the cable.
The access methods approach media sharing differently, but have essentially the same end
goal in mind.
Carrier Sense Multiple Access with Collision Detection (CSMA/CD)
Carrier sense refers to the process of listening before speaking. The Ethernet device wishing
to communicate looks for energy on the media (an electrical carrier). If a carrier exists, the
cable is in use and the device must wait to transmit. Many Ethernet devices maintain a counter
of how often they need to wait before they can transmit. Some devices call the counter a
deferral or back-off counter. If the deferral counter exceeds a threshold value of 15 retries, the
device attempting to transmit assumes that it will never get access to the cable to transmit the
packet. In this situation, the source device discards the frame. This might happen if there are
too many devices on the network, implying that there is not enough bandwidth available.
When this situation becomes chronic, you should segment the network into smaller segments.
Chapter 2, "Segmenting LANs," discusses various approaches to segmentation. Collision detection works by
monitoring the power level on the media during transmission: if the power level exceeds a certain threshold, that
implies to the system that a collision occurred. When stations detect that a collision occurred, the participants
generate a collision enforcement signal.
The enforcement signal lasts as long as the smallest frame size. In the case of Ethernet, that
equates to 64 bytes. This ensures that all stations know about the collision and that no other
station attempts to transmit during the collision event. If a station experiences too many
consecutive collisions, the station stops transmitting the frame. Some workstations display an
error message stating "Media not available." The exact message differs from implementation
to implementation, but every workstation attempts to convey to the user that it was unable to
send data for one reason or another.
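The carrier-sense and deferral behavior just described can be sketched as a short simulation. This is an illustrative model only, not real NIC firmware: the function and parameter names are ours, and the 15-retry threshold follows the text.

```python
MAX_RETRIES = 15   # the deferral threshold described in the text

def try_transmit(carrier_sensed, backoff=lambda n: None):
    """Model the CSMA/CD listen-before-talk rule and deferral counter.

    carrier_sensed() returns True while energy (a carrier) is present on
    the media. backoff(n) is called before listen attempt n; a real NIC
    waits a random interval here. Returns True if the frame is sent,
    False if the deferral threshold is exceeded and the frame discarded.
    """
    deferrals = 0
    while carrier_sensed():
        deferrals += 1
        if deferrals > MAX_RETRIES:
            return False        # give up: source device discards the frame
        backoff(deferrals)      # wait, then listen again
    return True                 # media idle: transmit

print(try_transmit(lambda: False))  # True  (idle media, immediate send)
print(try_transmit(lambda: True))   # False (media never frees up)
```

Real adapters also randomize the wait (truncated binary exponential backoff, with the window doubling after each collision); the sketch leaves that as a pluggable backoff function.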
Addressing in Ethernet
How do stations identify each other? In a meeting, you identify the intended recipient by
name. You can choose to address the entire group, a set of individuals, or a specific person.
Speaking to the group equates to a broadcast; a set of individuals is a multicast; and
addressing one person by name is a unicast. Most traffic in a network is unicast in nature,
characterized as traffic from a specific station to another specific device. Some applications
generate multicast traffic. Examples include multimedia services over LANs. These
applications intend for more than one station to receive the traffic, but not necessarily all
stations. Video conferencing applications frequently implement multicast addressing
to specify a group of recipients. Networking protocols also create broadcast traffic: IP
creates broadcast packets for ARP and other processes, routers often transmit routing
updates as broadcast frames, and AppleTalk, DECnet, Novell IPX, and many other
protocols create broadcasts for various reasons.
Figure 1-1 shows a simple legacy Ethernet system with several devices attached. Each
device's Ethernet adapter card has a 48-bit (6 octet) address built in to the module that
uniquely identies the station. This is called the Media Access Control (MAC) address, or
the hardware address. All of the devices in a LAN must have a unique MAC address.
Devices express MAC addresses as hexadecimal values. Sometimes MAC address octets
are separated by hyphens (-), sometimes by colons (:), and sometimes by periods (.). The three
formats 00-60-97-8F-4F-86, 00:60:97:8F:4F:86, and 0060.978F.4F86 all specify the
same host. This book usually uses the first format because most of the Catalyst displays use
this convention; however, there are a couple of exceptions where you might see the second
or third format. Do not let this confuse you. They all represent MAC addresses.
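Because the three notations are interchangeable, a small helper can normalize any of them to the hyphenated form most Catalyst displays use. A minimal sketch (the function name is our own invention):

```python
def normalize_mac(mac):
    """Accept 00-60-97-8F-4F-86, 00:60:97:8F:4F:86, or 0060.978F.4F86
    and return the hyphen-separated form used in most Catalyst displays."""
    digits = mac.replace("-", "").replace(":", "").replace(".", "").upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError("not a 48-bit MAC address: %r" % mac)
    # re-group the 12 hex digits into 6 hyphen-separated octets
    return "-".join(digits[i:i + 2] for i in range(0, 12, 2))

print(normalize_mac("0060.978F.4F86"))     # 00-60-97-8F-4F-86
print(normalize_mac("00:60:97:8f:4f:86"))  # 00-60-97-8F-4F-86
```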
Figure 1-1 A simple legacy Ethernet network. Stations 1 through 3 share the media; the Ethernet frame header
carries destination address 00-60-08-93-AB-12 and source address 00-60-08-93-DB-C1.
To help ensure uniqueness, the first three octets indicate the vendor who manufactured the
interface card. This is known as the Organizationally Unique Identifier (OUI). Each
manufacturer has a unique OUI value that it acquired from IEEE, the global administrator
for OUI values. Cisco has several OUI values: 00000C, 00067C, 0006C1, 001007,
00100B, 00100D, 001011, 001014, 00101F, 001029, 00102F, 001054, 001079, 00107B,
0010A6, 0010F6, 0010FF, 00400B (formerly Crescendo), 00500F, 005014, 00502A,
00503E, 005050, 005053, 005054, 005073, 005080, 0050A2, 0050A7, 0050BD, 0050E2,
006009, 00602F, 00603E, 006047, 00605C, 006070, 006083, 00900C, 009021, 00902B,
00905F, 00906D, 00906F, 009086, 009092, 0090A6, 0090AB, 0090B1, 0090BF, 0090D9,
0090F2, 00D006, 00D058, 00D0BB, 00D0C0, 00E014, 00E01E, 00E034, 00E04F,
00E08F, 00E0A3, 00E0B0, 00E0F7, 00E0F9, and 00E0FE.
The last three octets of the MAC address equate to a host identier for the device. They are
locally assigned by the vendor. The combination of OUI and host number creates a unique
address for that device. Each vendor is responsible for ensuring that the devices it
manufactures have a unique combination of 6 octets.
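The OUI/host split can be shown directly: the first three octets identify the manufacturer and the last three identify the device. A sketch (the function name is ours; 00-00-0C is one of the Cisco OUIs listed above, while the host portion 12-34-56 is invented purely for illustration):

```python
def split_mac(mac):
    """Split a hyphenated 48-bit MAC into (OUI, host identifier)."""
    octets = mac.upper().split("-")
    # first three octets: vendor OUI; last three: vendor-assigned host ID
    return "".join(octets[:3]), "".join(octets[3:])

oui, host = split_mac("00-00-0C-12-34-56")
print(oui)   # 00000C  (a Cisco OUI)
print(host)  # 123456  (vendor-assigned host identifier)
```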
Unicast Frames
In a LAN, stations must use the MAC address for the Layer 2 address in a frame to identify
the source and destination. When Station 1 transmits to Station 2 in Figure 1-1, Station 1
generates a frame that includes Station 2's MAC address (00-60-08-93-AB-12) for the
destination and Station 1's address (00-60-08-93-DB-C1) for the source. This is a unicast
frame. Because the LAN is a shared media, all stations on the network receive a copy of the
frame. Only Station 2 performs any processing on the frame, though. All stations compare
the destination MAC address with their own MAC address. If they do not match, the
station's interface module discards (ignores) the frame. This prevents the packet from
consuming CPU cycles in the device. Station 2, however, sees a match and sends the packet
to the CPU for further analysis. The CPU examines the network protocol and the intended
application and decides whether to drop or use the packet.
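The compare-and-discard behavior in this paragraph reduces to a single equality test per frame. A minimal sketch of that interface decision (the function name and string-based MACs are our own illustration, not a real driver API):

```python
def unicast_frame_to_cpu(frame_dest, burned_in_addr):
    """A shared-media Ethernet interface passes a unicast frame up to the
    CPU only when the destination MAC matches its own burned-in address;
    otherwise the interface silently discards it, saving CPU cycles."""
    return frame_dest.upper() == burned_in_addr.upper()

# Station 2 accepts the frame from Figure 1-1; a third station on the
# same shared media sees a mismatch and drops it at the interface.
print(unicast_frame_to_cpu("00-60-08-93-AB-12", "00-60-08-93-AB-12"))  # True
print(unicast_frame_to_cpu("00-60-08-93-AB-12", "00-60-08-93-DB-C1"))  # False
```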
Broadcast Frames
Not all frames contain unicast destination addresses. Some have broadcast or multicast
destination addresses. Stations treat broadcast and multicast frames differently than they do
unicast frames. Stations view broadcast frames as public service announcements. When a
station receives a broadcast, it means, "Pay attention! I might have an important message for
you!" A broadcast frame has a destination MAC address of FF-FF-FF-FF-FF-FF (all binary
1s). Like unicast frames, all stations receive a frame with a broadcast destination address. When
the interface compares its own MAC address against the destination address, they don't match.
Normally, a station discards the frame because the destination address does not match its own
hardware address. But broadcast frames are treated differently. Even though the destination and
built-in address don't match, the interface module is designed so that it still passes the broadcast
frame to the processor. This is intentional because designers and users want to receive the
broadcast frame as it might have an important request or information. Unfortunately, probably
only one or at most a few stations really need to receive the broadcast message. For example,
an IP ARP request creates a broadcast frame even though it intends for only one station to
respond. The source sends the request as a broadcast because it does not know the destination
MAC address and is attempting to acquire it. The only thing the source knows for sure when it
creates the ARP request is the destination's IP address. That is not enough, however, to address
the station on a LAN. The frame must also contain the MAC address.
Routing protocols sometimes use broadcast MAC addresses when they announce their
routing tables. For example, by default, routers send IP RIP updates every 30 seconds. The
router transmits the update in a broadcast frame. The router does not necessarily know all
of the routers on the network. By sending a broadcast message, the router is sure that all
routers attached to the network will receive the message. There is a downside to this,
however. All devices on the LAN receive and process the broadcast frame, even though only
a few devices really need the updates. This consumes CPU cycles in every device. If the
number of broadcasts in the network becomes excessive, workstations cannot do the things
they need to do, such as run word processors or flight simulators. The station is too busy
processing useless (to them) broadcast frames.
Multicast Frames
Multicast frames differ from broadcast frames in a subtle way. Multicast frames address a
group of devices with a common interest and allow the source to send only one copy of the
frame on the network, even though it intends for several stations to receive it. When a station
receives a multicast frame, it compares the multicast address with its own address. Unless the
card is previously configured to accept multicast frames, the multicast is discarded at the
interface and does not consume CPU cycles. (In this respect, it behaves just like a unicast
frame addressed to another station.)
For example, Cisco devices running the Cisco Discovery Protocol (CDP) make periodic
announcements to other locally attached Cisco devices. The information contained in the
announcement is only interesting to other Cisco devices (and the network administrator).
To transfer the announcement, the Cisco source could send a unicast to each and every
Cisco device. That, however, means multiple transmissions on the segment and consumes
network bandwidth with redundant information. Further, the source might not know about
all of the local Cisco devices and could, therefore, choose to send one broadcast frame. All
Cisco devices would receive the frame. Unfortunately, so would all non-Cisco devices. The
last alternative is a multicast address. Cisco has a special multicast address reserved, 01-00-
0C-CC-CC-CC, which enables Cisco devices to transmit to all other Cisco devices on the
segment. All non-Cisco devices ignore this multicast message.
Open Shortest Path First (OSPF), an IP routing protocol, makes routing update announcements
on a specially reserved multicast address. The reserved multicast OSPF IP addresses 224.0.0.5
and 224.0.0.6 translate to MAC multicast addresses of 01-00-5E-00-00-05 and 01-00-5E-00-
00-06. Chapter 13, Multicast and Broadcast Services, discusses how these MAC addresses
are derived. Only routers interested in receiving the OSPF announcement configure their
interface to receive the message. All other devices filter the frame.
LAN Frames
When stations transmit to each other on a LAN, they format the data in a structured manner
so that devices know what octets signify what information. Various frame formats are
available. When you configure a device, you must define what format your station will use,
realizing that more than one format might be configured, as is true for a router.
Figure 1-2 illustrates four common frame formats for Ethernet. Some users interchange the
terms packets and frames rather loosely. According to RFC 1122, a subtle difference exists.
Frames refer to the entire message, from the data link layer (Layer 2) header information
through and including the user data. Packets exclude Layer 2 headers and only include the
IP header (Layer 3 protocol header) through and including user data.
Figure 1-2 Four Ethernet Frame Formats
Frame: Layer 2 header (14 octets) | Data field (up to 1500 octets)
Packet: Layer 3 header through user data
Ethernet v2 (ARPA): MAC DA (6 octets) | MAC SA (6 octets) | Type (2 octets) | Data | FCS
802.3: MAC DA (6 octets) | MAC SA (6 octets) | Length (2 octets) | Data | FCS
The frame formats developed as the LAN industry evolved and differing requirements arose
for protocols. When XEROX developed the original Ethernet (which was later adopted by the
industry), a frame format like the Ethernet frame in Figure 1-2 was defined. The first 6 octets
contain the destination's MAC address, whereas the next field of 6 octets contains the source's
MAC address. Two bytes follow that indicate to the receiver the correct Layer 3 protocol to
which the packet belongs. For example, if the packet belongs to IP, the type field value
is 0x0800. Table 1-1 lists several common protocols and their associated type values.
Table 1-1 Common Routed Protocols and Their Hex Type Values
Protocol Hex Type Value
IP 0800
ARP 0806
Novell IPX 8137
AppleTalk 809B
Banyan Vines 0BAD
802.3 0000-05DC
Following the type value, the receiver expects to see additional protocol headers. For
example, if the type value indicates that the packet is IP, the receiver expects to decode IP
headers next. If the value is 8137, the receiver tries to decode the packet as a Novell packet.
IEEE defined an alternative frame format. In the IEEE 802.3 formats, the source and
destination MAC addresses remain, but instead of a type field value, the packet length is
indicated. Three derivatives of this format are used in the industry: raw 802.3, 802.3 with
802.2 LLC, and 802.3 with 802.2 and SNAP. A receiver recognizes that a packet follows
802.3 formats rather than Ethernet formats by the value of the two-byte field following the
source MAC address. If the value falls within the range of 0x0000 and 0x05DC (1500
decimal), the value indicates length; protocol type values begin after 0x05DC.
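A receiver's decision can be sketched as a simple threshold test on that two-byte field, using the type values from Table 1-1 (the function name is our own):

```python
ETHERTYPES = {0x0800: "IP", 0x0806: "ARP", 0x8137: "Novell IPX",
              0x809B: "AppleTalk", 0x0BAD: "Banyan Vines"}

def classify_frame(type_or_length: int) -> str:
    """Interpret the two bytes that follow the source MAC address."""
    if type_or_length <= 0x05DC:            # 0x0000-0x05DC (1500): a length
        return f"802.3 frame, payload length {type_or_length}"
    name = ETHERTYPES.get(type_or_length, hex(type_or_length))
    return f"Ethernet v2 frame, protocol {name}"

print(classify_frame(0x0800))  # Ethernet v2 frame, protocol IP
print(classify_frame(64))      # 802.3 frame, payload length 64
```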
Ethernet SlotTimes
Ethernet's rules govern how stations operate in a CSMA/CD environment, and they are
built around the need to detect collisions and report them to the participants.
Ethernet defines a slotTime wherein a frame travels from one network extreme to the other.
In Figure 1-3, assume that Station 1, located at one extreme of the network, transmits a
frame. Just before the frame reaches Station 2, located at the other extreme of the network,
Station 2 transmits. Station 2 transmits because it has something to send, and because
Station 1's frame hasn't arrived yet, Station 2 detects silence on the line. This is a prime
example of a collision event between devices at opposite extremes of the network.
Because they are at opposite ends of the network, the timing involves worst-case values for
detecting and reporting collisions.
Figure 1-3 Collision Between Stations at Opposite Extremes of the Network (one-way delay: 25.6 µs)
Ethernet rules state that a station must detect and report collisions between the furthest
points in the network before the source completes its frame transmission. Specifically, for
a legacy 10 Mbps Ethernet, this must all occur within 51.2 microseconds. Why 51.2
microseconds? The time is based on the smallest frame size for Ethernet, which
corresponds to the smallest time window to detect and report collisions. The minimum
frame size for Ethernet is 64 bytes, which has 512 bits. Each bit time is 0.1 microseconds
in length, calculated as one over Ethernet's data rate (1/10^7 seconds). Therefore, the slot
time for Ethernet is 0.1 microseconds/bit × 512 bits, or 51.2 microseconds.
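The arithmetic generalizes to any data rate; a quick sketch:

```python
def slot_time_us(data_rate_bps: float, min_frame_bits: int = 512) -> float:
    """slotTime = minimum frame length in bits x one bit time."""
    bit_time_us = 1e6 / data_rate_bps   # microseconds per bit
    return min_frame_bits * bit_time_us

print(round(slot_time_us(10e6), 2))    # 51.2 -- legacy 10 Mbps Ethernet
print(round(slot_time_us(100e6), 2))   # 5.12 -- Fast Ethernet (discussed later)
```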
Next, the Ethernet specication translates the slotTime into distance. As the Ethernet signal
propagates through the various components of the collision domain, time delays are
introduced. Time delay values are calculated for copper cables, optical fibers, and repeaters.
The amount of delay contributed by each component varies based upon the media
characteristics. A correctly designed network topology totals the delay contribution for
each component between the network extremes and ensures that the total is less than one
half of 51.2 microseconds. This guarantees that Station 2 can detect the collision and report
it to Station 1 before Station 1 completes the transmission of the smallest frame.
A network that violates the slotTime rules by extending the network to distances that
require more than 51.2 microseconds experiences late collisions, which can cause the
network to malfunction. When a station transmits, it retains the frame in a local buffer until
it either transmits the frame successfully (that is, without a collision) or the deferral counter
threshold is exceeded. We previously discussed the deferral counter situation. Assume that
a network administrator overextends the network in Figure 1-3 by inserting too many
repeaters or by deploying segments that are too long. When Station 1 transmits, it assumes
that the frame transmitted successfully if it experiences no collision by the time that it
transmits 64 octets. Once the station believes that the frame was successfully transmitted,
it flushes the frame from its buffers, leaving no opportunity to retry. When the network
overextends the slotTime, the source might learn of a collision after it transmits the first 64
octets. But no frame is in the buffer at this point to resend, because the source thought that
the transmission was successful!
Fast Ethernet
When Ethernet technology availed itself to users, the 10 Mbps bandwidth seemed like an
unlimited resource. (Almost like when we had 640 KB of PC RAM; it seemed we would
never need more!) Yet workstations have developed rapidly since then, and applications
demand more data in shorter amounts of time. When the data comes from remote sources
rather than from a local storage device, this amounts to the application needing more
network bandwidth. New applications find 10 Mbps to be too slow. Consider a surgeon
downloading an image from a server over a 10 Mbps shared media network. He needs to
wait for the image to download so that he can begin or continue the surgery. If the image is
a high-resolution image, often on the order of 100 MB, it could take a while to receive.
What if the shared network makes the available user bandwidth about 500 kbps (a generous
number for most networks) on average? It could take the physician about 26 minutes to
download the image:
100 MB × 8 bits/byte ÷ 500 kbps ≈ 1,600 seconds ≈ 26.7 minutes
If that were you on the operating table waiting for the image to download, you would not
be very happy! If you are the hospital administration, you are exposing yourself to surgical
complications at worst and idle physician time at best. Obviously, this is not a good
situation. Sadly, many hospital networks function like this and consider it normal. Clearly,
more bandwidth is needed to support this application.
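The 26-minute figure comes straight from the transfer-time arithmetic, which is worth keeping handy when sizing links:

```python
def transfer_minutes(size_bytes: float, rate_bps: float) -> float:
    """Time to move size_bytes at rate_bps, in minutes."""
    return size_bytes * 8 / rate_bps / 60

# The surgeon's 100 MB image over ~500 kbps of usable shared bandwidth
print(round(transfer_minutes(100e6, 500e3), 1))       # 26.7 minutes
# The same image over a dedicated 100 Mbps link drops to seconds
print(round(transfer_minutes(100e6, 100e6) * 60, 1))  # 8.0 seconds
```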
Recognizing the growing demand for higher speed networks, the IEEE formed the 802.3u
committee to begin work on a 100 Mbps technology that works over twisted-pair cables. In
June of 1995, IEEE approved the 802.3u specification defining a system that offered vendor
interoperability at 100 Mbps.
Like 10 Mbps systems such as 10BaseT, the 100 Mbps systems use CSMA/CD, but provide
a tenfold improvement over legacy 10 Mbps networks. Because they operate at 10 times the
speed of 10 Mbps Ethernet, all timing factors reduce by a factor of 10. For example, the
slotTime for 100 Mbps Ethernet is 5.12 microseconds rather than 51.2 microseconds, and
the IFG shrinks to 0.96 microseconds. Because timing is one tenth that of 10 Mbps Ethernet,
the network diameter must also shrink to avoid late collisions.
An objective of the 100BaseX standard was to maintain a common frame format with
legacy Ethernet. Therefore, 100BaseX uses the same frame sizes and formats as 10BaseX.
Everything else in the timing scales by one tenth due to the higher data rate. When passing
frames from a 10BaseX system to a 100BaseX system, the interconnecting device does not
need to re-create the frame's Layer 2 header because the headers are identical on the two
systems.
10BaseT, the original Ethernet over twisted-pair cable standard, supports Category 3, 4, and
5 cables up to 100 meters in length. 10BaseT uses a single encoding technique, Manchester,
and signals at 20 MHz, well within the bandwidth capability of all three cable types.
Because of the higher signaling rate of 100BaseT, creating a single method to work over all
cable types was not likely. The encoding technologies that were available at the time forced
IEEE to create variants of the standard to support Category 3 and 5 cables. A fiber optic
version was created as well.
NOTE Be aware, however: Just because an interface card runs 100BaseX full duplex, you cannot
assume that the device where you install it supports full-duplex mode. In fact, some devices
might actually experience worse throughput when in full-duplex mode than when in half-duplex
mode. For example, Windows NT 4.0 does not support full-duplex operations because of driver
limitations. Some SUN workstations can also experience this, especially with Gigabit Ethernet.
The IEEE 802.3x committee designed a standard for full-duplex operations that covers
10BaseT, 100BaseX, and 1000BaseX. (1000BaseX is Gigabit Ethernet, discussed in the later
section, Gigabit Ethernet.) 802.3x also defined a flow control mechanism. This allows a
receiver to send a special frame back to the source whenever the receiver's buffers overflow.
The receiver sends a special frame called a pause frame. In the frame, the receiver can request
the source to stop sending for a specified period of time. If the receiver can handle incoming
traffic again before the timer value in the pause frame expires, the receiver can send another
pause frame with the timer set to zero. This tells the source that it can start sending again.
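The 802.3x pause timer counts quanta of 512 bit times, so the real-time length of a pause depends on the link rate. A small sketch (the function name is our own):

```python
def pause_duration_us(pause_quanta: int, link_rate_bps: float) -> float:
    """An 802.3x pause_time value counts quanta of 512 bit times."""
    bit_time_us = 1e6 / link_rate_bps
    return pause_quanta * 512 * bit_time_us

# Maximum pause (0xFFFF quanta) on a 100 Mbps link: about a third of a second
print(round(pause_duration_us(0xFFFF, 100e6), 1))  # 335539.2 microseconds
# Timer of zero: resume transmission immediately
print(pause_duration_us(0, 100e6))                 # 0.0
```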
Although 100BaseX supports both full- and half-duplex modes, you can deploy 100 Mbps
hubs that operate in half-duplex mode. That means the devices attached to the hub share the
bandwidth just like the legacy Ethernet systems. In this case, the station must run in half-
duplex mode. To run in full-duplex mode, the device and the hub (switch) must both support
and be configured for full duplex. Note that a shared hub cannot run full duplex; if the hub
is shared, it must operate in half-duplex mode.
Autonegotiation
With the multiple combinations of network modes available, configuring devices gets
confusing. You need to determine whether the device needs to operate at 10 or 100 Mbps,
whether it needs to run in half- or full-duplex mode, and what media type to use. The device
configuration must match the configuration of the hub to which it attaches.
Autonegotiation attempts to simplify manual configuration requirements by enabling the
device and hub to automatically agree upon the highest common operational level. The
802.3u committee defined Fast Link Pulse (FLP) to support the autonegotiation process.
FLP, an enhanced version of 10BaseT's Link Integrity Test, sends a series of pulses on the
link announcing its capabilities. The other end also transmits FLP announcements, and the
two ends settle on whatever method has the highest priority in common between them. Table
1-2 illustrates the priority scheme.
According to Table 1-2, 100BaseT2 full-duplex mode has highest priority, whereas the
slowest method, 10BaseT half-duplex, has lowest priority. Priority is determined by speed,
cable types supported, and duplex mode. A system always prefers 100 Mbps over 10 Mbps,
and always prefers full duplex over half duplex. Note that 100BaseT2 has higher priority
than 100BaseTX. This is not a direct result of 100BaseT2 being a more recent medium.
Rather, 100BaseT2 has higher priority because it supports more cable types than does
100BaseTX. 100BaseTX only supports Category 5 type cable, whereas 100BaseT2
supports Category 3, 4, and 5 cables.
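The settle-on-the-highest-common-mode step can be sketched as a walk down a priority list. The list below mirrors the scheme the text describes for Table 1-2, but the exact entries are our assumption:

```python
# Highest to lowest priority, per the scheme described for Table 1-2
PRIORITY = [
    "100BaseT2 full duplex",
    "100BaseTX full duplex",
    "100BaseT2 half duplex",
    "100BaseT4",
    "100BaseTX half duplex",
    "10BaseT full duplex",
    "10BaseT half duplex",
]

def negotiate(local: set, peer: set) -> str:
    """Settle on the highest-priority mode both ends advertise via FLP."""
    for mode in PRIORITY:
        if mode in local and mode in peer:
            return mode
    raise ValueError("no common operational mode")

nic = {"100BaseTX full duplex", "100BaseTX half duplex", "10BaseT half duplex"}
hub = {"100BaseTX half duplex", "10BaseT full duplex", "10BaseT half duplex"}
print(negotiate(nic, hub))   # 100BaseTX half duplex
```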
TIP Not all devices perform autonegotiation. We have observed at several customer locations
failure of the autonegotiation process, either because of equipment not supporting the
feature or because of poor implementations. We recommend that critical devices such as routers,
switches, bridges, and servers be manually configured at both ends of the link to ensure that,
upon reboot, the equipment operates in a common mode with its hub/switch port.
100BaseTX
Many existing 10 Mbps twisted-pair systems use a cabling infrastructure based upon
Category 5 unshielded twisted-pair (UTP) and shielded twisted-pair (STP). The devices use
two pairs of the cable: one pair on pins 1 and 2 for transmit and one pair on pins 3 and 6 for
receive and collision detection. 100BaseTX also uses this infrastructure. Your existing
Category 5 cabling for 10BaseT should support 100BaseTX, which also implies that
100BaseTX works up to 100 meters, the same as 10BaseT.
100BaseTX uses the same 4B/5B encoding scheme as Fiber Distributed Data Interface
(FDDI). This encoding scheme adds a fifth bit for every four bits of user data, which means
there is a 25 percent overhead in the transmission to support the encoding. Although 100BaseTX
carries 100 Mbps of user data, it actually operates at 125 megabaud. (We try not to tell this
to marketing folks so that they do not put 125 Mbps throughput on their data sheets!)
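The relationship between user data rate and line signaling rate falls out of the encoding ratio; the same arithmetic applies to the 8B/10B scheme Gigabit Ethernet uses (covered later in this chapter):

```python
def signaling_rate_mbaud(data_rate_mbps: float, data_bits: int, code_bits: int) -> float:
    """Line rate once every data_bits of user data become code_bits on the wire."""
    return data_rate_mbps * code_bits / data_bits

print(signaling_rate_mbaud(100, 4, 5))    # 125.0 -- 100BaseTX with 4B/5B
print(signaling_rate_mbaud(1000, 8, 10))  # 1250.0 -- Gigabit Ethernet with 8B/10B
```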
100BaseT4
Not all building infrastructures use Category 5 cable. Some use Category 3. Category 3
cable was installed in many locations to support voice transmission and is frequently
referred to as voice-grade cable. It is tested for voice and low-speed data applications up to
16 MHz. Category 5 cable, on the other hand, is intended for data applications and is tested
at 100 MHz. Because Category 3 cable exists in so many installations, and because many
10BaseT installations are on Category 3 cable, the IEEE 802.3u committee included this as
an option. As with 10BaseT, 100BaseT4 links work up to 100 meters. To support the higher
data rates, though, 100BaseT4 uses more cable pairs: three pairs support transmission and
one pair supports collision detection. 100BaseT4 also supports the high data rates over a
lower-bandwidth cable through its encoding technique, 8B/6T (8 bits/6 ternary signals),
which significantly lowers the signaling frequency, making it suitable for voice-grade wire.
100BaseT2
Although 100BaseT4 provides a solution for Category 3 cable, it needs four pairs to support
operations. Most Category 3 cable installations intend for the cable to support voice
communications. By consuming all the pairs in the cable for data transmissions, no pairs
remain for voice.
100BaseFX
802.3u specifies a variant for single-mode and multimode fiber optic cables. 100BaseFX
uses two strands (one pair) of fiber optic cable: one for transmitting and one for receiving.
Like 100BaseTX, 100BaseFX uses 4B/5B encoding, signaling at 125 MHz on the optical
fiber. When should you use the fiber optic version? One clear situation arises when you
need to support distances greater than 100 meters. Multimode supports up to 2,000 meters
in full-duplex mode and 412 meters in half-duplex mode. Single-mode works up to 10 km,
a significant distance advantage. Other advantages of fiber include its electrical isolation
properties. For example, if you need to install the cable in areas where there are high levels
of radiated electrical noise (near high-voltage power lines or transformers), fiber optic cable
is best. The cable's immunity to electrical noise makes it ideal for this environment. If you
are installing the system in an environment where lightning frequently damages equipment,
or where you suffer from ground loops between buildings on a campus, use fiber. Fiber
optic cable carries no electrical signals to damage your equipment.
Note that the multimode fiber form of 100BaseFX specifies two distances. If you run the
equipment in half-duplex mode, you can only transmit 412 meters. Full-duplex mode
reaches up to 2 km.
Because a Class II repeater performs no encoding translation, its ports must share a
common signaling scheme; if one port runs 100BaseT2, every port on the Class II
repeater must be 100BaseT2. The only exception to mixing is for 100BaseTX and
100BaseFX, because these both use 4B/5B and no encoding translation is necessary.
The lower latency value for a Class II repeater enables it to support a slightly larger network
diameter than a Class I-based network. Converting the signal from analog to digital and
performing line-encoding translation consumes bit times. A Class I repeater therefore
introduces more latency than a Class II repeater, reducing the network diameter.
Figure 1-4 illustrates interconnecting stations directly together without the use of a repeater.
Each station is referred to as a DTE (data terminal equipment) device. Transceivers and
hubs are DCE (data communication equipment) devices. Use a straight-through cable when
connecting a DTE to a DCE device. Use a cross-over cable when connecting a DTE to a
DTE or a DCE to a DCE. Either copper or fiber can be used. Be sure, however, that you use
a cross-over cable in this configuration. A cross-over cable attaches the transmitter pins at
one end to the receiver pins at the other end. If you use a straight-through cable, you connect
transmit at one end to transmit at the other end and fail to communicate. (The Link
Status light does not illuminate!)
Figure 1-4 Interconnecting DTE to DTE
(UTP: up to 100 meters; fiber: up to 412 meters)
NOTE There is an exception to this where you can, in fact, connect two DTE or two DCE devices
directly together with a straight-through cable. Some devices have MDI (media interface)
and MDIX ports. The MDIX is a media interface cross-over port. Most ports on devices are
MDI. You can use a straight-through cable when connecting from an MDI to an MDIX port.
Using a Class I repeater as in Figure 1-5 enables you to extend the distance between
workstations. Note that with a Class I repeater you can mix the types of media attaching to
the repeater. Any mix of 100BaseTX, 100BaseT4, 100BaseT2, or 100BaseFX works. Only
one Class I repeater is allowed in the network. To connect Class I repeaters together, a
bridge, switch, or router must connect between them.
Figure 1-5 Networking with One Class I Repeater (100BaseF to 100BaseF: 272 meters;
100BaseT to 100BaseF: 100 meters of UTP plus fiber, for a total diameter of 261 meters (TX)
or 231 meters (T2/T4))
Figure 1-6 Networking with One Class II Repeater (all-fiber diameter: 320 meters; mixed
copper/fiber diameter: 308 meters; copper segments: 100 meters)
Unlike Class I repeaters, two Class II repeaters are permitted, as in Figure 1-7. The
connection between the repeaters must be less than or equal to five meters. Why daisy-chain
the repeaters if it only gains five meters of distance? Simply because it increases the number
of ports available in the system.
Figure 1-7 Networking with Two Class II Repeaters
(homogeneous 100BaseF diameter: 228 meters; mixed-media diameter: 216 meters; copper
segments: 100 meters; inter-repeater link: 5 meters)
The networks in Figure 1-5 through Figure 1-7 illustrate networks with repeaters operating
in half-duplex mode. The network diameter constraints arise from a need to honor the
slotTime window for 100BaseX half-duplex networks. Extending the network beyond this
diameter without using bridges, switches, or routers violates the maximum extent of the
network and makes the network susceptible to late collisions. This is a bad situation. The
network in Figure 1-8 demonstrates a proper use of Catalyst switches to extend a network.
Figure 1-8 Extending the Network with Catalyst Switches (2 km full-duplex fiber links
between switches; 100-meter copper links to stations)
Practical Considerations
100BaseX networks offer at least a tenfold increase in network bandwidth over shared
legacy Ethernet systems. In a full-duplex network, the bandwidth increases by twentyfold.
Is all this bandwidth really needed? After all, many desktop systems cannot generate
anywhere near 100 Mbps of traffic. Most network systems are best served by a hybrid of
network technologies. Some users are content with a shared 10 Mbps system. These users
normally do little more than e-mail, Telnet, and simple Web browsing. The interactive
applications they use demand little network bandwidth, so the user rarely notices delays.
Of the applications mentioned for this user, Web browsing is most susceptible
because many pages incorporate graphic images that can take some time to download if the
available network bandwidth is low.
If the user does experience delays that affect work performance (as opposed to non-work-related
activities), you can increase the user's bandwidth by doing the following:
Upgrading the user to 10BaseT full duplex, which immediately doubles the bandwidth.
Upgrading the user to 100BaseX half duplex.
Upgrading the user to 100BaseX full duplex.
Which of these is most reasonable? It depends upon the user's application needs and the
workstation capability. If the user's applications are mostly interactive in nature, either of
the first two options can suffice to create bandwidth.
However, if the user transfers large files, as in the case of a physician retrieving medical
images, or if the user frequently needs to access a file server, 100BaseX full duplex might
be most appropriate. Option 3 should normally be reserved for specific user needs, file
servers, and routers.
Another appropriate use of Fast Ethernet is for backbone segments. A corporate network
often has an invisible hierarchy where distribution networks to the users are lower speed
systems, whereas the networks interconnecting the distribution systems operate at higher
rates. This is where Fast Ethernet might fit in well as part of the infrastructure. The decision
to deploy Fast Ethernet as part of the infrastructure is driven by corporate network needs as
opposed to individual user needs, as previously considered. Chapter 8, Trunking
Technologies and Applications, considers the use of Fast Ethernet to interconnect Catalyst
switches as a backbone.
Gigabit Ethernet
As if 100 Mbps is not enough, yet another higher bandwidth technology was unleashed on
the industry in June of 1998. Gigabit Ethernet (IEEE 802.3z) species operations at 1000
Mbps, another tenfold bandwidth improvement. We discussed earlier how stations are
hard-pressed to fully utilize 100 Mbps Ethernet. Why then do we need a gigabit bandwidth
technology? Gigabit Ethernet proponents expect to find it used as either a backbone technology
or as a pipe into very high speed file servers. This contrasts with Fast Ethernet in that
network administrators can deploy Fast Ethernet to clients, to servers, or as a
backbone technology. Gigabit Ethernet will not be used to connect directly to clients any
time soon. Some initial studies of Gigabit Ethernet indicate that installing 1000 Mbps
interfaces in a Pentium-class workstation actually slows down its performance due to
software interrupts. On the other hand, high-performance UNIX stations functioning as file
servers can indeed benefit from a larger pipe to the network.
In a Catalyst network, Gigabit Ethernet interconnects Catalysts to form a high-speed
backbone. The Catalysts in Figure 1-9 have low-speed stations connecting to them (10 and
100 Mbps), but have 1000 Mbps available to pass traffic between workstations. A file server
in the network also benefits from a 1000 Mbps connection, supporting more concurrent
client accesses.
Figure 1-9 A Gigabit Ethernet Backbone (stations attach at 10 and 100 Mbps;
Catalyst-to-Catalyst and file server links run at 1000 Mbps)
Gigabit Architecture
Gigabit Ethernet merges aspects of 802.3 Ethernet and Fiber Channel, a gigabit technology
intended for high-speed interconnections between file servers as a LAN replacement. The
Fiber Channel standard details a layered network model capable of scaling to bandwidths
of 4 Gbps and extending to distances of 10 km. Gigabit Ethernet borrows the bottom two
layers of the standard: FC-1 for encoding/decoding and FC-0, the interface and media layer.
FC-0 and FC-1 replace the physical layer of the legacy 802.3 model. The 802.3 MAC and
LLC layers contribute to the higher levels of Gigabit Ethernet. Figure 1-10 illustrates the
merger of the standards to form Gigabit Ethernet.
Figure 1-10 Gigabit Ethernet Merges 802.3 and Fiber Channel
802.3: 802.2 LLC | CSMA/CD MAC | physical layer
Fiber Channel: FC-2 through FC-4 | FC-1 (encoder/decoder) | FC-0 (interface and media)
802.3z Gigabit Ethernet: 802.2 LLC | MAC (CSMA/CD half duplex or full duplex) |
8B/10B encoder/decoder | serializer/deserializer | connector
The Fiber Channel standard incorporated by Gigabit Ethernet transmits at 1.0625 GHz over
fiber optics and supports 800 Mbps of data throughput. Gigabit Ethernet increases the
signaling rate to 1.25 GHz. Further, Gigabit Ethernet uses 8B/10B encoding, which means
that 1 Gbps is available for data. 8B/10B is similar to the 4B/5B discussed for 100BaseTX,
except that for every 8 bits of data, 2 bits are added, creating a 10-bit symbol. This encoding
technique simplifies fiber optic designs at this high data rate. The optical connector used by
Fiber Channel, and therefore by Gigabit Ethernet, is the SC-style connector. This is the
push-in/pull-out, or snap and click, connector used by manufacturers to overcome
deficiencies with the ST-style connector. The ST, or snap and twist, style connectors
previously preferred were a bayonet-type connector and required finger space on the front
panel to twist the connector into place. The finger-space requirement reduced the number
of ports that could be built into a module.
NOTE A new connector type, the MT-RJ, is now finding popularity in the fiber industry. The
MT-RJ uses a form factor and latch like the RJ-45 connector's, supports full duplex, has lower
cost than ST or SC connectors, and is easier to terminate and install than ST or SC. Further,
its smaller size allows twice the port density on a face plate that ST or SC connectors allow.
(Frame with carrier extension: Preamble | DA | SA | Length/Type | Data | FCS | Carrier Extension)
The addition of the carrier extension bits does not change the actual Gigabit Ethernet frame
size. The receiving station still expects to see no fewer than 64 octets and no more than 1518
octets.
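Assuming the 512-byte slot time that Gigabit half-duplex operation implies, the amount of carrier extension a short frame needs can be sketched as follows (the constants and function name are our own):

```python
GIGABIT_SLOT_BYTES = 512   # assumed Gigabit half-duplex slot time (4096 bit times)
MIN_FRAME = 64             # minimum Ethernet frame, unchanged from 10/100 Mbps

def carrier_extension_bytes(frame_len: int) -> int:
    """Extension symbols appended after the FCS of a short frame."""
    if frame_len < MIN_FRAME:
        raise ValueError("below the 64-octet Ethernet minimum")
    return max(0, GIGABIT_SLOT_BYTES - frame_len)

print(carrier_extension_bytes(64))    # 448 -- a minimum frame needs heavy padding
print(carrier_extension_bytes(1518))  # 0 -- long frames need no extension
```

The receiver strips the extension symbols, which is why the frame proper still arrives as 64 to 1518 octets.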
1000BaseSX
1000BaseSX uses the short wavelength of 850 nm. Although this is a LASER-based
system, the distances supported are generally shorter than for 1000BaseLX. This results
from the interaction of the light with the fiber cable at this wavelength. Why use
1000BaseSX then? Because the components are less expensive than for 1000BaseLX. Use
this less expensive method for short link distances (for example, within an equipment rack).
1000BaseLX
In fiber optic systems, light sources differ in the type of device (LED or LASER) generating
the optical signal and in the wavelength they generate. Wavelength correlates to the frequency
of RF systems; in the case of optics, we specify the wavelength rather than the frequency. In
practical terms, this corresponds to the color of the light. Typical wavelengths are 850
nanometers (nm) and 1300 nm. 850 nm light sits just beyond the red end of the visible
spectrum, and 1300 nm light is entirely invisible. 1000BaseLX uses 1300 nm optical sources;
in fact, the L of LX stands for long wavelength. 1000BaseLX uses LASER sources. Be careful
when using fiber optic systems. Do not look into the port or the end of a fiber! It can be
hazardous to the health of your eye.
1000BaseCX
Not included in Table 1-4 is a copper media option. 1000BaseCX uses a 150-ohm balanced
shielded copper cable. This new cable type is not well-known in the industry, but is
necessary to support the high-bandwidth data over copper. 1000BaseCX supports
transmissions up to 25 meters. It is intended to interconnect devices collocated within an
equipment rack, very short distances apart. This is appropriate when Catalysts are stacked
in a rack and you want a high-speed link between them, but you do not want to spend the
money for fiber optic interfaces.
1000BaseT
One final copper version is the 1000BaseT standard, which uses Category 5 twisted-pair cable.
It supports up to 100 meters, but uses all four pairs in the cable. This offers another low-cost
alternative to 1000BaseSX and 1000BaseLX and does not depend upon the special cable used
by 1000BaseCX. This standard is under the purview of the IEEE 802.3ab committee.
Token Ring
This chapter began with an overview of LAN access methods. To this point, you should be
familiar with the various options using the CSMA/CD method. This section briefly
examines Token Ring, the other popular form of LAN access.
Token Ring systems, like Ethernet, use a shared media technology. Multiple stations attach
to a network and share the bandwidth. Token Ring supports two bandwidth options: 4 Mbps
and 16 Mbps. The 4 Mbps version represents the original technology released by IBM. 16
Mbps, a version released after 4 Mbps, essentially works the same as 4 Mbps Token Ring
and introduces a couple of optional new features to further improve the system.
Each station in the network creates a break in the ring. A token passes around the ring from
station to station. If a station desires to send information, it holds onto the token and
starts to transmit onto the cable. Assume Station 1 wants to transmit to Station 3. When
Station 1 receives a token, it keeps the token and transmits the frame with Station 3's MAC
address as the destination and Station 1's MAC address as the source. The frame circulates
around the ring from station to station. Each station locally copies the frame and passes it
to the next station, comparing the destination MAC address against its own hardware
address; it either discards the copy if they don't match or sends it to the processor. When
Station 2 receives the frame, it too copies the frame and sends it on to the next station.
All stations receive a copy of the frame because, just like Ethernet, Token Ring is a
broadcast network. The frame eventually returns to the source, which is responsible for
removing the frame and introducing a new token onto the network.
In this model, only one station at a time transmits because only one station can possess the
token at a time. Some network inefficiencies result, however, when a station retains the
token until it removes the frame it transmitted from the ring. Depending upon the length of
the ring, a station can complete transmission of a frame before the frame returns to the
source. During the time between the completion of transmission and the removal of the
frame, the network remains idle; no other station can transmit. This amounts to wasted
bandwidth on the network. Early token release, an optional feature introduced with 16
Mbps Token Ring, permits the source to create a new token after it completes transmission
and before it removes its frame from the network. This raises Token Ring utilization
well above that of systems without early token release.
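The token-passing behavior just described can be sketched as a toy simulation. The four-station ring and the station numbering are illustrative only; real Token Ring also involves priority, monitor, and addressing bits that are not modeled here.

```python
# Sketch: one frame's trip around a four-station Token Ring.
stations = [1, 2, 3, 4]

def circulate(src, dst):
    """Pass a frame around the ring from src; return (stations that repeated it, delivered?)."""
    n = len(stations)
    path = []
    i = (stations.index(src) + 1) % n
    while stations[i] != src:
        path.append(stations[i])      # each station copies the frame and repeats it downstream
        i = (i + 1) % n
    # The frame returned to src, which strips it and releases a new token.
    return path, dst in path          # dst's local copy went to its processor

# Station 1 sends to Station 3: every station sees the frame, only Station 3 accepts it.
path, delivered = circulate(1, 3)
assert path == [2, 3, 4] and delivered
```

Note how every station appears in `path`: just as the text says, Token Ring is a broadcast network, and only the destination-address comparison decides which station hands the frame to its processor.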
Occasionally, a source might not be online when the frame it transmitted returns to it.
This prevents the source from removing the frame and causes it to circulate around the
network, possibly indefinitely. This consumes bandwidth on the network and prevents
other stations from generating traffic. To prevent this, one of the stations on the ring is
elected to be the ring monitor. Whenever a packet circulates around the ring, the ring
monitor marks a particular bit in the frame indicating, "I already saw this frame once." If
the ring monitor sees any frame with this bit set, the ring monitor assumes that the source
cannot remove the frame and removes it on the source's behalf.
[Figure: a multistation access unit (MAU) with one detached station]
Internal to the MAU, the transmit from one station connects to the receive of another
station. This continues between all attached stations until the ring is completed. What
happens if a user detaches a station? When this occurs, the MAU bypasses the unused port
to maintain ring integrity.
A network administrator can daisy-chain MAUs together to extend the distance and to
introduce more ports in the network. Figure 1-14 illustrates how MAUs usually have ring-
in (RI) and ring-out (RO) ports to attach to other MAUs.
[Figure 1-14: two MAUs daisy-chained by connecting the RO port of one to the RI port of the other]
Summary
Although many of you use a number of different LAN technologies, the market still has a
preponderance of legacy Ethernet deployed. A lot of 10 Mbps systems still exist with varied
media options such as copper and fiber. You should expect to encounter this type of
connection method for at least another few years. This chapter covered the basics of how
legacy Ethernet functions.
Because of the limitations that legacy Ethernet imposes on some applications, higher-speed
network technologies had to be developed. The IEEE created Fast Ethernet to meet this
need. With the capability to run in full-duplex mode, Fast Ethernet offers significant
bandwidth leaps to meet the needs of many users. This chapter discussed the media options
available for Fast Ethernet and some of its operational characteristics.
And for real bandwidth consumers, Gigabit Ethernet offers even more capacity to meet the
needs of trunking switches together and of feeding high-performance file servers. This
chapter covered some of the attributes of Gigabit Ethernet and the media choices available
to you.
Review Questions
1 What is the pps rate for a 100BaseX network? Calculate it for the minimum and maximum
frame sizes.
2 What are the implications of mixing half-duplex and full-duplex devices? How do you do it?
3 In the opening section on Fast Ethernet, we discussed the download time for a typical medical
image over a shared legacy Ethernet system. What is an approximate download time for the
image over a half-duplex 100BaseX system? Over a full-duplex 100BaseX system?
4 What disadvantages are there in having an entire network running in 100BaseX
full-duplex mode?
5 Can a Class II repeater ever attach to a Class I repeater? Why or why not?
6 What is the smallest Gigabit Ethernet frame size that does not need carrier extension?
This chapter covers the following key topics:
Why Segment LANs?: Discusses motivations for segmenting LANs and the
disadvantages of not segmenting.
Segmenting LANs with Repeaters: Discusses the purpose, benefits, and
limitations of repeaters in LANs.
Segmenting LANs with Bridges: Discusses how bridges create collision domains
and extend networks. Because bridging is the foundational technology for LAN switches,
this section describes the benefits and limitations of bridges.
Segmenting LANs with Routers: Discusses how routers create broadcast domains
by limiting the distribution of broadcast frames.
Segmenting LANs with Switches: Discusses the differences between bridges and
switches, and how switches create broadcast domains differently from routers.
CHAPTER 2
Segmenting LANs
As corporations grow, network administrators find themselves deep in frustration.
Management wants more users on the network, whereas users want more bandwidth. To
further confuse the issue, finances often conflict with the two objectives, effectively
limiting options. Although this book cannot help with the last issue, it can help clarify
what technology options exist to increase the number of users served while enhancing the
available bandwidth in the system. Network engineers building LAN infrastructures can
choose from many internetworking devices to extend networks: repeaters, bridges, routers,
and switches. Each component serves specific roles and has utility when properly deployed.
Engineers often exhibit some confusion about which component to use for various network
configurations. A good understanding of how these devices manipulate collision and
broadcast domains helps the network engineer to make intelligent choices. Further, by
understanding these elements, discussions in later chapters about collision and broadcast
domains have a clearer context.
This chapter, therefore, defines broadcast and collision domains and discusses the role of
repeaters, bridges, routers, and switches in manipulating the domains. It also describes why
network administrators segment LANs, and how these devices facilitate segmentation.
[Figure: a network before and after segmentation. Before: five segments of 100 users each,
interconnected with repeaters; 10 Mbps of total system bandwidth; 20 Kbps per user.
After: 100 users per segment with 10 Mbps per segment; 50 Mbps of total system bandwidth;
100 Kbps per user.]
Before segmentation, all 500 users share the network's 10 Mbps bandwidth because the
segments interconnect with repeaters. (The next section in this chapter describes how
repeaters work and why this is true.) The "after" network replaces the repeaters with
bridges and routers, isolating segments and providing more bandwidth for users. Bridges
and routers generate bandwidth by creating new collision and broadcast domains, as
summarized in Table 2-1. (The sections on LAN segmentation with bridges and routers
later in this chapter define collision and broadcast domains and describe why this is so.)
Each segment can be divided further with additional bridges, routers, and switches,
providing even more user bandwidth. By reducing the number of users on each segment, more
bandwidth avails itself to users. The extreme case dedicates one user to each segment,
providing full media bandwidth to each user. This is exactly what switches allow the
administrator to build.
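The per-user numbers above follow from simple division; a quick check, using the values from the example:

```python
# Per-user bandwidth before and after segmentation (numbers from the example above).
users = 500
segment_bw = 10_000_000      # 10 Mbps shared legacy Ethernet
segments = 5                 # five 100-user segments after segmentation

before_per_user = segment_bw / users              # one big shared collision domain
after_per_user = segment_bw / (users // segments) # each 100-user segment keeps its own 10 Mbps
total_after = segment_bw * segments

assert before_per_user == 20_000        # 20 Kbps per user
assert after_per_user == 100_000        # 100 Kbps per user
assert total_after == 50_000_000        # 50 Mbps aggregate system bandwidth
```

The aggregate bandwidth grows only because each bridge- or router-isolated segment gets its own copy of the media rate; the media itself is still 10 Mbps everywhere.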
Segmenting LANs with Repeaters
The question remains, though: what should you use to segment the network? Should you
use a repeater, bridge, router, or LAN switch? Repeaters do not really segment a network
and do not create more bandwidth. They simply allow you to extend the network distance
to some degree. Bridges, routers, and switches are more suitable for LAN segmentation.
The sections that follow describe the various options. The repeater is included in the
discussion because you might attach a repeater-based network to your segmented network.
Therefore, you need to know how repeaters interact with segmentation devices.
[Figure 2-2: a repeater connecting Wire A (Stations 1 and 2) to Wire B (Stations 3 and 4)]
Repeaters regenerate the signal from one wire onto the other. When Station 1 transmits to
Station 2, the frame also appears on Wire B, even though the source and destination devices
coexist on Wire A. Repeaters are unintelligent devices and have no insight into the data
content. They blindly perform their responsibility of forwarding signals from one wire to
all other wires. If the frame contains errors, the repeater forwards it. If the frame violates
the minimum or maximum frame sizes specified by Ethernet, the repeater forwards it. If a
collision occurs on Wire A, Wire B also sees it. Repeaters truly act like an extension of the
cable.
Although Figure 2-2 shows the interconnection of two segments, repeaters can have many
ports to attach multiple segments as shown in Figure 2-3.
[Figure 2-3: a multiport repeater (hub) interconnecting several stations]
A 10BaseT network consists of hubs and twisted-pair cables that interconnect
workstations. Hubs are multiport repeaters and forward signals from one interface to all
other interfaces. As in Figure 2-2, all stations attached to the hub in Figure 2-3 see all
traffic, both the good and the bad.
Repeaters perform several duties associated with signal propagation. For example,
repeaters regenerate and retime the signal and create a new preamble. Preamble bits
precede the frame's destination MAC address and help receivers synchronize. The 8-byte
preamble has an alternating binary 1010 pattern except for the last byte. The last byte of
the preamble, which ends in a binary pattern of 10101011, is called the start of frame
delimiter (SFD). The last two bits indicate to the receiver that data follows. Repeaters
strip all eight preamble bytes from the incoming frame, then generate and prepend a new
preamble onto the frame before transmission through the outbound interface.
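As an illustrative sketch, the preamble and SFD bytes can be written out as follows. (This uses the conventional left-to-right bit notation of the text above; on the wire, Ethernet actually transmits each byte least-significant bit first.)

```python
# The 8 bytes a repeater regenerates ahead of each frame:
# seven bytes of the alternating 1010... pattern, then the SFD.
preamble = bytes([0b10101010] * 7)   # 0xAA repeated: alternating 1s and 0s
sfd = bytes([0b10101011])            # 0xAB: same pattern, but ending in "11"
header = preamble + sfd

assert len(header) == 8
assert header[-1] & 0b11 == 0b11     # the trailing "11" tells the receiver data follows
```

The only difference between the last preamble byte and the others is that final pair of 1 bits, which is exactly what lets a receiver distinguish synchronization bits from the start of the frame proper.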
Repeaters also ensure that collisions are signaled on all ports. If Stations 1 and 2 in Figure
2-2 participate in a collision, the collision is enforced through the repeater so that the
stations on Wire B also know of the collision. Stations on Wire B must wait for the
collision to clear before transmitting. If Stations 3 and 4 do not know of the collision,
they might attempt a transmission during Stations 1 and 2's collision event and become
additional participants in the collision.
Limitations exist in a repeater-based network. They arise from different causes and must be
considered when extending a network with repeaters. The limitations include the following:
Shared bandwidth between devices
Specification constraints on the number of stations per segment
End-to-end distance capability
Shared Bandwidth
A repeater extends not just the distance of the cable, but it also extends the collision domain.
Collisions on one segment affect stations on another repeater-connected segment.
Collisions extend through a repeater and consume bandwidth on all interconnected
segments. Another side effect of a collision domain is the propagation of frames through
the network. If the network uses shared network technology, all stations in the repeater-
based network share the bandwidth. This is true whether the source frame is unicast,
multicast, or broadcast. All stations see all frames. Adding more stations to the repeater
network potentially divides the bandwidth even further. Legacy Ethernet systems have a
shared 10 Mbps bandwidth. The stations take turns using the bandwidth. As the number of
transmitting workstations increases, the amount of available bandwidth decreases.
NOTE Bandwidth is actually divided by the number of transmitting stations. Simply attaching a
station does not consume bandwidth until the device transmits. As a theoretical extreme, a
network can be constructed of 1,000 devices with only one device transmitting and the
other 999 only listening. In this case, the bandwidth is dedicated to the single transmitting
station by virtue of the fact that no other device is transmitting. Therefore, the transmitter
never experiences collisions and can transmit whenever it desires at full media rates.
End-to-End Distance
Another limitation on extending networks with repeaters focuses on distance. An Ethernet
link can extend only so far before the media slotTime specified by Ethernet standards is
violated. As described in Chapter 1, the slotTime is a function of the network data rate. A
10 Mbps network such as 10BaseT has a slotTime of 51.2 microseconds. A 100 Mbps
network slotTime is one tenth that of 10BaseT. The calculated network extent takes into
account the slotTime size, the latency through various media such as copper and fiber, and
the number of repeaters in the network. In a 10 Mbps Ethernet, the number of repeaters in a
network must follow the 5/3/1 rule illustrated in Figure 2-4. This rule states that up to
five segments can be interconnected with repeaters, but only three of the segments can have
devices attached. The other two segments interconnect the populated segments and allow
only repeaters to attach at the ends. When following the 5/3/1 rule, an administrator
creates one collision domain. A collision in the network propagates through all repeaters
to all other segments.
[Figure 2-4: the 5/3/1 rule: five segments interconnected by four repeaters, with
workstations on only three of the segments]
Repeaters, when correctly used, extend the collision domain by interconnecting segments
at OSI Layer 1. Any transmission in the collision domain propagates to all other stations in
the network. A network administrator must, however, take into account the 5/3/1 rule. If the
network needs to extend beyond these limits, other internetworking device types must be
used; for example, the administrator could use a bridge or a router.
Repeaters extend the bounds of broadcast and collision domains, but only to the extent
allowed by media repeater rules. The maximum geographical extent, constrained by the
media slotTime value, defines the collision domain extent. If you extend the collision
domain beyond the bounds defined by the media, the network cannot function correctly. In
the case of Ethernet, it experiences late collisions if the network extends too far. Late
collision events occur whenever a station experiences a collision outside of the 51.2 µs
slotTime.
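The quoted slotTimes follow directly from the 512-bit slot Ethernet defines (the 64-byte minimum frame); a quick check:

```python
def slot_time_us(rate_mbps, slot_bits=512):
    """slotTime in microseconds: 512 bit times (one minimum 64-byte frame) at the given rate."""
    return slot_bits / rate_mbps   # bits divided by Mbit/s yields microseconds

assert slot_time_us(10) == 51.2    # 10BaseT: a collision after 51.2 us is "late"
assert slot_time_us(100) == 5.12   # 100 Mbps: one tenth of 10BaseT
```

Because the slot shrinks tenfold at 100 Mbps while signals propagate at the same speed, the permissible collision-domain diameter shrinks accordingly, which is why Fast Ethernet repeater rules are so much tighter than the 10 Mbps 5/3/1 rule.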
Figure 2-5 illustrates the boundaries of a collision domain defined by the media slotTime.
All segments connected together by repeaters belong to the same collision domain. Figure
2-5 also illustrates the boundaries of a broadcast domain in a repeater-based network.
Broadcast domains define the extent to which a broadcast propagates throughout a network.
[Figure 2-5: four repeater-connected segments (Stations 1 through 4) forming one collision
domain and one broadcast domain]
[Figure 2-6: frame addressing: DA MAC 00-60-97-8F-4F-86, SA MAC 00-60-97-8F-5B-12,
source IP 172.16.1.2, destination IP 172.16.1.1]
[Figure 2-7: a bridge interconnecting Wire A and Wire B]
The filter process differs for access methods such as Ethernet and Token Ring. For
example, Ethernet employs a process called transparent bridging that examines the
destination MAC address and determines whether a frame should be forwarded, filtered, or
flooded. Bridges operate at Layer 2 of the OSI model, the data link layer. By functioning at
this layer, bridges have the capability to examine the MAC headers of frames. They can,
therefore, make forwarding decisions based on information in the header, such as the MAC
address. Token Ring can also use source-route bridging, which determines frame flow
differently from transparent bridging. These methods, and others, are discussed in more
detail in Chapter 3, "Bridging Technologies."
More importantly, though, bridges interconnect collision domains allowing independent
collision domains to appear as if they were connected, without propagating collisions between
them. Figure 2-8 shows the same network as in Figure 2-5, but with bridges interconnecting the
segments. In the repeater-based network, all the segments belong to the same collision domain.
The network bandwidth was divided between the four segments. In Figure 2-8, however, each
segment belongs to a different collision domain. If this were a 10 Mbps legacy network, each
segment would have its own 10 Mbps of bandwidth for a collective bandwidth of 40 Mbps.
Figure 2-8 Bridges Create Multiple Collision Domains and One Broadcast Domain
Although discussed in more detail later in the chapter, it is worth commenting now that the
ultimate bandwidth distribution occurs when you dedicate one user to each bridge
interface. Each user then has all of the local bandwidth to himself; only one station and
the bridge port belong to the collision domain. This is, in effect, what switching
technology does.
Another advantage of bridges stems from their Layer 2 operation. In the repeater-based
network, an end-to-end distance limitation prevents the network from extending
indefinitely. Bridges allow each segment to extend a full distance. Each segment has its own
slotTime value. Bridges do not forward collisions between segments. Rather, bridges
isolate collision domains and reestablish slotTimes. Bridges can, in theory, extend networks
indefinitely. Practical considerations prevent this, however.
Bridges filter traffic when the source and destination reside on the same interface.
Broadcast and multicast frames are the exception to this. Whenever a bridge receives a
broadcast or multicast, it floods the message out all interfaces. Again, consider
ARP as in the repeater-based network. When a station in a bridged network wants to
communicate with another IP station in the same bridged network, the source sends a
broadcast ARP request. The request, a broadcast frame, passes through all bridges and out
all bridge interfaces. All segments attached to a bridge belong to the same broadcast
domain. Because they belong to the same broadcast domain, all stations should also belong
to the same IP subnetwork.
A bridged network can easily become overwhelmed with broadcast and multicast traffic if
applications generate this kind of traffic. For example, multimedia applications such as
video conferencing over IP networks create multicast traffic. Frames from all participants
propagate to every segment. In effect, this reduces the network to appear as one giant
shared network. The bandwidth becomes shared bandwidth.
In most networks, the majority of frames are not broadcast frames. Some protocols generate
more than others, but the bandwidth consumed by these protocol broadcast frames is a
relatively small percentage of the LAN media bandwidth.
When should you use bridges? Are there any advantages of bridges over repeaters? What
about stations communicating with unicast frames? How do bridges treat this traffic?
When a source and destination device are on the same interface, the bridge filters the
frame and does not forward the traffic to any other interface (unless the frame is a
broadcast or multicast). If the source and destination reside on different ports relative to
the bridge, the bridge forwards the frame to the appropriate interface to reach the
destination. The processes of filtering and selective forwarding preserve bandwidth on
other segments. This is a significant advantage of bridges over repeaters, which offer no
frame discrimination capabilities.
When a bridge forwards trafc, it does not change the frame. Like a repeater, a bridge does
nothing more to the frame than to clean up the signal before it sends it to another port. Layer
2 and Layer 3 addresses remain unchanged as frames transit a bridge. In contrast, routers
change the Layer 2 address. (This is shown in the following section on routers.)
A rule of thumb when designing networks with bridges is the 80/20 rule. This rule states
that bridges are most efficient when 80 percent of the segment traffic is local and only 20
percent needs to cross a bridge to another segment. This rule originated from traditional
network design, where server resources resided on the same segments as the client
devices they served, as in Figure 2-9.
[Figure 2-9: the 80/20 rule: 80 percent of each segment's traffic stays local; 20 percent
crosses the bridge]
The clients only infrequently needed to access devices on the other side of a bridge.
Bridged networks are considered to be well designed when the 80/20 rule is observed. As
long as this traffic balance is maintained, each segment in the network appears to have
full media bandwidth. If, however, the flow balance shifts such that more traffic gets
forwarded through the bridge than filtered, the network behaves as if all segments operate
on the same shared network. The bridge in this case provides nothing more than the
capability to daisy-chain collision domains to extend distance, but without any bandwidth
improvements.
Consider the worst case for traffic flow in a bridged network: 0/100, where none of the
traffic remains local and all sources transmit to destinations on other segments. In the
case of a two-port bridge, the entire system has shared bandwidth rather than isolated
bandwidth. The bridge only extends the geographical extent of the network and offers no
bandwidth gains. Unfortunately, many intranets see similar traffic patterns, with typical
ratios closer to 20/80 than 80/20. This results from many users attempting to communicate
with and through the Internet. Much of the traffic flows from a local segment to the WAN
connection and crosses broadcast domain boundaries. Chapter 14, "Campus Design Models,"
discusses current traffic trends and the demise of the 80/20 rule of thumb in modern
networks.
One other advantage of bridges is that they prevent errored frames from transiting to
another segment. If the bridge sees that a frame has errors or that it violates the media
access method size rules, the bridge drops the frame. This protects the destination network
from bad frames that do nothing more than consume bandwidth, for the destination device
discards the frame anyway. Collisions on a shared legacy network often create frame
fragments that are sometimes called runt frames. These frames violate the Ethernet minimum
frame size rule of 64 bytes. Chapter 3, "Bridging Technologies," shows the frame size rules
in Table 3-5. Whereas a repeater forwards runts to the other segments, a bridge blocks them.
[Figure 2-10: a router-based network in which each segment forms its own collision domain
and its own broadcast domain]
A side effect of separate broadcast domains demonstrates itself in the behavior of routers.
In a repeater- or bridge-based network, all stations belong to the same subnetwork because
they all belong to the same broadcast domain. In a router-based network, however, which
creates multiple broadcast domains, each segment belongs to a different subnetwork. This
forces workstations to behave differently than they did in the bridged network. Refer to
Figure 2-11 and Table 2-2 for a description of the ARP process in a routed network.
Although the world does not need another description of ARP, it does in this case serve to
illustrate how frames flow through a router in contrast to bridges and repeaters. Further,
it serves as an example of how workstations must behave differently in the presence of a
router. In a bridge- or repeater-based network, the workstations transmit as if the source
and destination are in the same collision domain, even though it is possible in a bridged
network for them to be in different domains. The aspect that allows them to behave this way
in the bridged network is that they are in the same broadcast domain. However, when they
are in different broadcast domains, as with the introduction of a router, the source and
destination must be aware of the router and must address their traffic to the router.
[Figure 2-11: ARP in a routed network. Station 1 (MAC 00-60-97-8F-4F-86, IP 172.16.1.1)
sits on Wire A; Station 2 (MAC 00-60-97-8F-5B-12, IP 10.0.0.1) sits on Wire B. Frames 1
and 2 are the ARP request and reply between Station 1 and the router; frame 3 is the user
data frame sent to the router; frames 4 and 5 are the ARP request and reply between the
router and Station 2; frame 6 is the user data frame delivered to Station 2.]
When Station 1 wants to talk to Station 2, Station 1 realizes that the destination is on a
different network by comparing the destination's logical address to its own. Knowing that
they are on different networks forces the source to communicate through a router. The
router is identified through the default router or default gateway setting on the
workstation.
To communicate with the router, the source must address the router at Layer 2 using the
router's MAC address. To obtain the router's MAC address, the source first ARPs the router
(see frames 1 and 2 in Figure 2-11). The source then creates a frame with the router's MAC
address as the destination MAC address and with Station 2's logical address as the
destination Layer 3 address (see frame 3 in Figure 2-11). When the frame enters the router,
the router determines how to reach the destination network. In this example, the
destination directly attaches to the router. The router ARPs for Station 2 (frames 4 and 5
in Figure 2-11) and creates a frame with Station 2's MAC address as the L2 destination and
the router's MAC address as the L2 source (see frame 6 in Figure 2-11). The router retains
the L3 addresses of Stations 1 and 2. The data link layer header changes as the frame moves
through a router, while the L3 header remains the same.
In contrast, remember that as a frame transits a repeater or bridge, the frame remains the
same. Neither repeaters nor bridges modify the frame. Like a bridge, routers prevent
errored frames from entering the destination network.
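The header rewriting just described can be sketched as follows. The station MACs and IPs come from the Figure 2-11 example; the router's two interface MACs are hypothetical, invented here for illustration.

```python
# Sketch of how headers change as a frame crosses a router.
STATION1 = {"mac": "00-60-97-8F-4F-86", "ip": "172.16.1.1"}   # Wire A
STATION2 = {"mac": "00-60-97-8F-5B-12", "ip": "10.0.0.1"}     # Wire B
ROUTER = {"wire_a_mac": "00-00-0C-AA-AA-AA",                  # hypothetical router MACs
          "wire_b_mac": "00-00-0C-BB-BB-BB"}

# Frame 3: Station 1 addresses the router at L2, Station 2 at L3.
frame_in = {"da": ROUTER["wire_a_mac"], "sa": STATION1["mac"],
            "src_ip": STATION1["ip"], "dst_ip": STATION2["ip"]}

def route(frame, out_sa, dest_mac):
    """Routers rewrite the Layer 2 header; the Layer 3 header passes through untouched."""
    return {**frame, "da": dest_mac, "sa": out_sa}

# Frame 6: the router forwards onto Wire B with a fresh L2 header.
frame_out = route(frame_in, ROUTER["wire_b_mac"], STATION2["mac"])
assert frame_out["src_ip"] == frame_in["src_ip"]   # L3 addresses unchanged
assert frame_out["sa"] != frame_in["sa"]           # L2 addresses rewritten
```

Contrast this with a bridge or repeater, where `frame_out` would be byte-for-byte identical to `frame_in`.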
[Figure 2-12: a switch supporting two simultaneous sessions, Session 1 and Session 2]
Because a switch is nothing more than a complex bridge with multiple interfaces, all of the
ports on a switch belong to one broadcast domain. If Station 1 sends a broadcast frame, all
devices attached to the switch receive it. The switch floods broadcast transmissions to all
other ports. Unfortunately, this makes the switch no more efficient than a shared media
network interconnected with repeaters or bridges when dealing with broadcast or multicast
frames.
It is possible to design the switch so that ports can belong to different broadcast domains
as assigned by a network administrator, thus providing broadcast isolation. In Figure 2-13,
some ports belong to Broadcast Domain 1 (BD1), some ports to Broadcast Domain 2
(BD2), and still others to Broadcast Domain 3 (BD3). If a station attached to an interface
in BD1 transmits a broadcast frame, the switch forwards the broadcast only to the
interfaces belonging to the same domain. The other broadcast domains do not experience any
bandwidth consumption resulting from BD1's broadcast. In fact, it is impossible for any
frame to cross from one broadcast domain to another without the introduction of an
external device, such as a router, to interconnect the domains.
[Figure 2-13: switch ports assigned to broadcast domains BD1, BD2, and BD3]
Switches capable of defining multiple broadcast domains actually define virtual LANs
(VLANs). Each broadcast domain equates to a VLAN. Chapter 5, "VLANs," discusses
VLANs in more detail. For now, think of a VLAN-capable switch as a device that creates
multiple isolated bridges, as shown in Figure 2-14.
[Figure 2-14: a VLAN-capable switch modeled as three isolated virtual bridges, one each
for BD1, BD2, and BD3]
If you create five VLANs, you create five virtual bridge functions within the switch. Each
bridge function is logically isolated from the others.
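The per-domain flooding behavior can be sketched with a hypothetical port-to-domain assignment (the port numbers and mapping below are illustrative, not from the figure):

```python
# Hypothetical port -> broadcast domain assignment on a VLAN-capable switch.
vlan_of = {1: "BD1", 2: "BD1", 3: "BD2", 4: "BD2", 5: "BD3"}

def flood_ports(in_port):
    """A broadcast received on in_port is flooded only within its own broadcast domain."""
    bd = vlan_of[in_port]
    return {p for p, v in vlan_of.items() if v == bd and p != in_port}

assert flood_ports(1) == {2}      # BD1's broadcast never reaches BD2 or BD3
assert flood_ports(5) == set()    # a one-port domain floods nowhere
```

Without an external router interconnecting the domains, no lookup in `vlan_of` ever lets a frame leave its own broadcast domain, which mirrors the isolation described above.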
Summary
What is the difference between a bridge and a switch? Marketing. A switch uses bridge
technology but positions itself as a device to interconnect individual devices rather than
networks. Both devices create collision domains on each port. Both have the potential to
create multiple broadcast domains, depending upon the vendor implementation and the user
configuration.
Review Questions
Refer to the network setup in Figure 2-15 to answer Questions 1 and 2.
[Figure 2-15: network for Review Questions 1 and 2]
1 Examine Figure 2-15. How many broadcast and collision domains are there?
2 In Figure 2-15, how many Layer 2 and Layer 3 address pairs are used to transmit
between Stations 1 and 2?
[Figure 2-16: a switched network with ports 5, 6, and 7 in VLAN2]
CHAPTER 3
Bridging Technologies
Although various internetworking devices exist for segmenting networks, Layer 2 LAN
switches use bridge internetworking technology to create smaller collision domains.
Chapter 2, "Segmenting LANs," discussed how bridges segment collision domains. But
bridges do far more than segment collision domains: they protect networks from unwanted
unicast traffic and eliminate active loops, which otherwise inhibit network operations. How
they do this differs for Ethernet and Token Ring networks. Ethernet employs transparent
bridging to forward traffic and Spanning Tree to control loops. Token Ring typically uses a
process called source-route bridging. This chapter describes transparent bridging, source-
route bridging (along with some variations), and Layer 2 LAN switching. Chapter 6 covers
Spanning Tree for Ethernet.
Transparent Bridging
As discussed in Chapter 2, networks are segmented to provide more bandwidth per user.
Bridges provide more user bandwidth by reducing the number of devices contending for
the segment bandwidth. But bridges also provide additional bandwidth by controlling data
flow in a network. Bridges forward traffic only to the interface(s) that need to receive the
traffic. In the case of known unicast traffic, bridges forward the traffic to a single port
rather than to all ports. Why consume bandwidth on a segment where the intended
destination does not exist?
Transparent bridging, defined in IEEE 802.1d documents, describes five bridging processes
for determining what to do with a frame. The processes are as follows:
1 Learning
2 Flooding
3 Filtering
4 Forwarding
5 Aging
[Figure 3-1: the transparent bridging decision process. Receive a packet; learn the source
address or refresh the aging timer. If the destination is a broadcast, multicast, or unknown
unicast, flood the packet. Otherwise, if the source and destination are on the same
interface, filter the frame; if not, forward the unicast to the correct port.]
When a frame enters the transparent bridge, the bridge adds the source Ethernet MAC
address (SA) and source port to its bridging table. If the source address already exists in
the table, the bridge updates the aging timer. The bridge then examines the destination MAC
address (DA). If the DA is a broadcast, multicast, or unknown unicast, the bridge floods
the frame out all bridge ports in the Spanning Tree forwarding state, except for the source
port. If the destination address and source address are on the same interface, the bridge
discards (filters) the frame. Otherwise, the bridge forwards the frame out the interface
where the destination is known in its bridging table. The sections that follow address each
of the five transparent bridging processes in greater detail.
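This decision process can be sketched as a minimal bridge model. This is a toy sketch: aging and Spanning Tree port states are omitted, multicast detection is reduced to a broadcast-address check, and the MAC strings and two-port layout are invented for the example.

```python
# Minimal sketch of transparent-bridging decisions: learning, flooding,
# filtering, and forwarding (aging omitted for brevity).
BROADCAST = "ff-ff-ff-ff-ff-ff"

class Bridge:
    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}                       # learned MAC -> port

    def receive(self, frame, in_port):
        """Return the set of ports the frame goes out of."""
        self.table[frame["sa"]] = in_port     # learning (or timer refresh)
        da = frame["da"]
        if da == BROADCAST or da not in self.table:
            return self.ports - {in_port}     # flood broadcast/unknown unicast
        if self.table[da] == in_port:
            return set()                      # filter: same interface
        return {self.table[da]}               # forward known unicast

b = Bridge(["A.1", "A.2"])
# Station 1 (on port A.1) sends to the still-unknown Station 2:
# the bridge learns mac-1 and floods the unknown unicast.
out = b.receive({"sa": "mac-1", "da": "mac-2"}, "A.1")
assert out == {"A.2"}
```

If Station 2 later replies from the same port A.1, the bridge learns its address and then filters Station 1's subsequent frames to it, since source and destination now share an interface.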
Learning
Each bridge has a table that records all of the workstations that the bridge knows about on
every interface. Specifically, the bridge records the source MAC address and the source
port in the table whenever the bridge sees a frame from a device. This is the bridge
learning process. Bridges learn only unicast source addresses; a station never generates a
frame with a broadcast or multicast source address. Bridges learn source MAC addresses in
order to intelligently send data to the appropriate destination segments. When the bridge
receives a frame, it references the table to determine on what port the destination MAC
address exists. The bridge uses the information in the table either to filter the traffic
(if the source and destination are on the same interface) or to send the frame out the
appropriate interface(s).
But when a bridge is first turned on, the table contains no entries. Assume that the bridges
in Figure 3-2 were all recently powered on, and no station had yet transmitted.
Therefore, the tables in all four bridges are empty. Now assume that Station 1 transmits a
unicast frame to Station 2. All the stations on that segment, including the bridge, receive the
frame because of the shared media nature of the segment. Bridge A learns that Station 1
exists off of port A.1 by looking at the source address in the data link frame header. Bridge
A enters the source MAC address and bridge port in the table.
[Figure 3-2: A transparently bridged network. Bridges A, B, C, and D interconnect shared segments; Stations 1 and 2 attach off port A.1, and Station 6 attaches off Bridge D.]
58 Chapter 3: Bridging Technologies
Flooding
Continuing with Figure 3-2, when Station 1 transmits, Bridge A also looks at the destination
address in the data link header to see if it has an entry in the table. At this point, Bridge A only
knows about Station 1. When a bridge receives a unicast frame (a frame targeting a single
destination) and no table entry exists for the DA, the frame is an unknown unicast frame. The
bridging rules state that a bridge must send an unknown unicast frame out all forwarding
interfaces except for the source interface. This is known as flooding. Therefore, Bridge A floods
the frame out all interfaces, even though Stations 1 and 2 are on the same interface. Bridge B
receives the frame and goes through the same process as Bridge A of learning and flooding.
Bridge B floods the frame to Bridges C and D, and they learn and flood. Now the bridging tables
look like Table 3-1. The bridges do not know about Station 2 because it has not yet transmitted.
Still considering Figure 3-2, all the bridges in the network have an entry for Station 1
associated with an interface, pointing toward Station 1. The bridge tables indicate the
relative location of a station to the port. Examining Bridge C's table, an entry for Station 1
is associated with port C.1. This does not mean Station 1 directly attaches to C.1. It merely
reflects that Bridge C heard from Station 1 on this port.
In addition to flooding unknown unicast frames, legacy bridges flood two other frame types:
broadcast and multicast. Many multimedia network applications generate broadcast or
multicast frames that propagate throughout a bridged network (broadcast domain). As the
number of participants in multimedia services increases, more broadcast/multicast frames
consume network bandwidth. Chapter 13, "Multicast and Broadcast Services," discusses
ways of controlling multicast and broadcast traffic flows in a Catalyst-based network.
Filtering
What happens when Station 2 in Figure 3-2 responds to Station 1? All stations on the segment
off port A.1, including Bridge A, receive the frame. Bridge A learns about the presence of
Station 2 and adds its MAC address to the bridge table along with the port identifier (A.1).
Bridge A also looks at the destination MAC address to determine where to send the frame.
Bridge A knows Station 1 and Station 2 exist on the same port. It concludes that it does not
need to send the frame anywhere. Therefore, Bridge A filters the frame. Filtering occurs when
the source and destination reside on the same interface. Bridge A could send the frame out
other interfaces, but because this wastes bandwidth on the other segments, the bridging
algorithm specifies to discard the frame. Note that only Bridge A knows about the existence
of Station 2 because no frame from this station ever crossed the bridge.
Forwarding
If in Figure 3-2, Station 2 sends a frame to Station 6, the bridges flood the frame because no
entry exists for Station 6. All the bridges learn Station 2's MAC address and relative location.
When Station 6 responds to Station 2, Bridge D examines its bridging table and sees that to
reach Station 2, it must forward the frame out interface D.1. A bridge forwards a frame when
the destination address is a known unicast address (it has an entry in the bridging table) and
the source and destination are on different interfaces. The frame reaches Bridge B, which
forwards it out interface B.1. Bridge A receives the frame and forwards it out A.1. Only
Bridges A, B, and D learn about Station 6. Table 3-2 shows the current bridge tables.
Aging
When a bridge learns a source address, it time stamps the entry. Every time the bridge sees
a frame from that source, the bridge updates the timestamp. If the bridge does not hear from
that source before an aging timer expires, the bridge removes the entry from the table. The
network administrator can modify the aging timer from the default of ve minutes.
Why remove an entry? Bridges have a finite amount of memory, limiting the number of
addresses they can remember in their bridging tables. For example, higher-end bridges can
remember upwards of 16,000 addresses, while some of the lower-end units may remember as
few as 4,096. But what happens if all 16,000 spaces are full in a bridge, but there are 16,001
devices? The bridge floods all frames from station 16,001 until an opening in the bridge table
allows the bridge to learn about the station. Entries become available whenever the aging timer
expires for an address. The aging timer helps to limit flooding by remembering the most active
stations in the network. If you have fewer devices than the bridge table size, you could increase
the aging timer. This causes the bridge to remember the stations longer and reduces flooding.
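The timestamp-and-prune behavior can be sketched as follows. The helper names and the explicit `now` parameter (used so the example is deterministic) are invented for this illustration.

```python
# A small sketch of bridge-table aging: entries are timestamped when
# learned, refreshed on every frame from that source, and pruned once
# the aging timer (default 300 seconds) expires.
import time

DEFAULT_AGING = 300.0  # five minutes

def learn(table, mac, port, now=None):
    """Add or refresh an entry; the timestamp is the aging reference."""
    table[mac] = (port, now if now is not None else time.time())

def expire(table, aging=DEFAULT_AGING, now=None):
    """Remove every entry not refreshed within the aging window."""
    now = now if now is not None else time.time()
    stale = [mac for mac, (_, t) in table.items() if now - t >= aging]
    for mac in stale:
        del table[mac]
    return stale

table = {}
learn(table, "00-00-00-01-02-06", "B.3", now=0.0)    # Station 6 learned
learn(table, "00-00-00-01-02-01", "B.1", now=250.0)  # Station 1 refreshed later
expire(table, now=320.0)  # Station 6 silent past the timer: removed
# Frames to Station 6 are now flooded until it transmits again
# and is relearned, possibly on a new port.
```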
Bridges also use aging to accommodate station moves. In Table 3-2, the bridges know the
location of Stations 1, 2, and 6. If you move Station 6 to another location, devices may not
be able to reach Station 6. For example, if Station 6 relocates to C.2 and Station 1 transmits
to Station 6, the frame never reaches Station 6. Bridge A forwards the frame to Bridge B,
but Bridge B still thinks Station 6 is located on port B.3. Aging allows the bridges to
forget Station 6's entry. After Bridge B ages the Station 6 entry, Bridge B floods the
frames destined to Station 6 until Bridge B learns the new location. On the other hand, if
Station 6 initiates the transmission to Station 1, then the bridges immediately learn the new
location of Station 6. If you set the aging timer to a high value, stations within the network
may experience reachability issues until the timer expires.
The Catalyst screen capture in Example 3-1 shows a bridge table example. This Catalyst
knows about nine devices (see bolded line) on nine interfaces. Each Catalyst learns about
each device on one and only one interface.
The bridge tables discussed so far contain two columns: the MAC address and the relative
port location. These are seen in columns two and three in Example 3-1, respectively. But
this table has an additional column. The first column indicates the VLAN to which the
MAC address belongs. A MAC address belongs to only one VLAN at a time. Chapter 5,
"VLANs," describes VLANs and why this is so.
Switching Modes
Chapter 2 discusses the differences between a bridge and a switch. Cisco identifies the
Catalyst as a LAN switch; a switch is a more complex bridge. The switch can be configured
to behave as multiple bridges by defining internal virtual bridges (i.e., VLANs). Each
virtual bridge defines a new broadcast domain because no internal connection exists
between them. Broadcasts for one virtual bridge are not seen by any other. Only routers
(either external or internal) should connect broadcast domains together. Using a bridge to
interconnect broadcast domains merges the domains and creates one giant domain. This
defeats the reason for having individual broadcast domains in the first place.
Switches make forwarding decisions the same way as a transparent bridge does. But vendors
have different switching modes available to determine when to switch a frame. Three modes in
particular dominate the industry: store-and-forward, cut-through, and fragment-free.
Figure 3-3 illustrates the trigger point for the three methods.
[Figure 3-3: Switching mode trigger points within a frame. Cut-through triggers after the DA, fragment-free after the first 64 bytes, and store-and-forward only after all bytes are received.]
Each has advantages and trade-offs, discussed in the sections that follow. As a result of the
different trigger points, the effective differences between the modes are in error handling
and latency. Table 3-3 compares the approaches and shows which members of the Catalyst
family use the available modes. The table summarizes how each mode handles frames
containing errors, and the associated latency characteristics.
*. Note that when a model supports more than one switching mode, adaptive cut-through may be available.
Check model specifics to confirm.
One of the objectives of switching is to provide more bandwidth to the user. Each port on
a switch defines a new collision domain that offers full media bandwidth. If only one station
attaches to an interface, that station has full dedicated bandwidth and does not need to share
it with any other device. All the switching modes defined in the sections that follow support
the dedicated bandwidth aspect of switching.
TIP To determine the best mode for your network, consider the latency requirements for your
applications and your network reliability. Do your network components or cabling
infrastructure generate errors? If so, fix your network problems and use store-and-forward.
Can your applications tolerate the additional latency of store-and-forward switching? If not,
use cut-through switching. Note that you must use store-and-forward with the Catalyst 5000
and 6000 family of switches. This is acceptable because latency is rarely an issue, especially with
high-speed links and processors and modern windowing protocols. Finally, if the source and
destination segments are different media types, you must use store-and-forward mode.
Store-and-Forward Switching
The store-and-forward switching mode receives the entire frame before beginning the
switching process. When it receives the complete frame, the switch examines it for the
source and destination addresses and any errors it may contain, and then it possibly applies
any special filters created by the network administrator to modify the default forwarding
behavior. If the switch observes any errors in the frame, the switch discards it, preventing
errored frames from consuming bandwidth on the destination segment. If your network
experiences a high rate of frame alignment or FCS errors, the store-and-forward switching
mode may be best. The absolute best solution, though, is to fix the cause of the errors. Using
store-and-forward in this case is simply a bandage; it should not be the fix.
If your source and destination segments use different media, then you must use this mode.
Different media often have issues when transferring data. The section "Source-Route
Translation Bridging" discusses some of these issues. Store-and-forward mode is necessary
to resolve this problem in a bridged environment.
Because the switch must receive the entire frame before it can start to forward, transfer
latency varies based on frame size. In a 10BaseT network, for example, the minimum
frame, 64 octets, takes 51.2 microseconds to receive. At the other extreme, a 1518-octet
frame requires at least 1.2 milliseconds. Latency for 100BaseX (Fast Ethernet) networks is
one-tenth the 10BaseT numbers.
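These numbers follow directly from serialization delay: frame bits divided by link rate. A quick check, as an illustrative sketch:

```python
# Store-and-forward latency is dominated by serialization delay:
# the switch must clock in every bit before it can forward.
def serialization_delay_us(octets, mbps):
    """Microseconds to receive a frame of `octets` bytes at `mbps`;
    1 Mbps moves exactly 1 bit per microsecond."""
    return octets * 8 / mbps

minimum = serialization_delay_us(64, 10)     # 51.2 us on 10BaseT
maximum = serialization_delay_us(1518, 10)   # 1214.4 us, about 1.2 ms
fast    = serialization_delay_us(1518, 100)  # one-tenth the 10BaseT figure
```

The same arithmetic yields the 4.8-microsecond figure quoted in the next section for cut-through, which waits only for the 6-octet destination address.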
Cut-Through Switching
Cut-through mode enables a switch to start the forwarding process as soon as it receives the
destination address. This reduces latency to the time necessary to receive the six-octet
destination address: 4.8 microseconds. But cut-through cannot check for errored frames
before it forwards them. Errored frames pass through the switch, consequently wasting
bandwidth; the receiving device discards errored frames.
As network and internal processor speeds increase, the latency issues become less relevant.
In high-speed environments, the time to receive and process a frame shrinks significantly,
minimizing the advantages of cut-through mode. Store-and-forward, therefore, is an attractive
choice for most networks.
Some switches support both cut-through and store-and-forward modes. Such switches
usually offer a third mode called adaptive cut-through. These multimodal switches use
cut-through as the default switching mode and selectively activate store-and-forward. The
switches monitor each frame as it passes through, looking for errors. Although the switch
cannot stop an errored frame, it counts how many it sees. If the switch observes that too
many frames contain errors, the switch automatically activates the store-and-forward mode.
Adaptive cut-through has the advantage of providing low latency while the network operates
well, while providing automatic protection for the outbound segment if the inbound segment
experiences problems.
Fragment-Free Switching
Another alternative offers some of the advantages of both cut-through and store-and-forward
switching. Fragment-free switching behaves like cut-through in that it does not wait for an
entire frame before forwarding. Rather, fragment-free forwards a frame after it receives the
first 64 octets of the frame (this is longer than the six bytes for cut-through and therefore
has higher latency). Fragment-free switching protects the destination segment from
fragments, an artifact of half-duplex Ethernet collisions. In a correctly designed Ethernet
system, devices detect a collision before the source finishes its transmission of the 64-octet
frame (this is driven by the slotTime described in Chapter 1). When a collision occurs, a
fragment (a frame less than 64 octets long) is created. This is a useless Ethernet frame, and
in the store-and-forward mode, it is discarded by the switch. In contrast, a cut-through
switch forwards the fragment if at least a destination address exists. Because collisions
must occur during the first 64 octets, and because most frame errors show up in these
octets, the fragment-free mode can detect most bad frames and discard them rather than
forward them. Fragment-free has a higher latency than cut-through, however, because it
must wait for an additional 58 octets before forwarding the frame. As described in the
section on cut-through switching, the advantages of fragment-free switching are minimal
given the higher network speeds and faster switch processors.
Source-Route Bridging
In a Token Ring environment, rings interconnect with bridges. Each ring and bridge has a
numeric identifier. The network administrator assigns the values and must follow several rules.
Typically, each ring is uniquely identified within the bridged network with a value between 1 and
4095. (It is possible to have duplicate ring numbers, as long as the rings do not attach to the same
bridge.) Valid bridge identifiers include 1 through 15 and must be unique to the local and target
rings. A ring cannot have two bridges with the same bridge number. Source devices use ring and
bridge numbers to specify the path that the frame travels through the bridged network. Figure
3-4 illustrates a source-route bridging (SRB) network with several attached workstations.
Figure 3-4 A Source-Route Bridged Network
[Rings 100, 200, 300, and 400 are interconnected by Bridges 1, 2, and 3, with Stations A through E attached to the rings.]
When Station A wants to communicate with Station B, Station A first sends a test frame to
determine whether the destination is on the same ring as the source. If Station B responds
to the test frame, the source knows that they are both on the same ring. The two stations
communicate without involving any Token Ring bridges.
If, however, the source receives no response to the test frame, the source attempts to reach
the destination on other rings. But the frame must now traverse a bridge. In order to pass
through a bridge, the frame includes a routing information field (RIF). One bit in the frame
header signals bridges that a RIF is present and needs to be examined by the bridge. This
bit, called the routing information indicator (RII), is set to zero when the source and
destination are on the same ring; otherwise, it is set to one.
Most importantly, the RIF tells the bridge how to send the frame toward the destination.
When the source first attempts to contact the destination, the RIF is empty because the
source does not know any path to the destination. To complete the RIF, the source sends an
all routes explorer (ARE) frame. (It is also possible to use something called a Spanning Tree
Explorer [STE].) The ARE passes through all bridges and all rings. As it passes through a
bridge, the bridge inserts the local ring and bridge number into the RIF. If in Figure 3-4,
Station A sends an ARE to find the best path to reach Station D, Station D receives two
AREs. The RIFs look like the following:
Ring100 - Bridge1 - Ring200 - Bridge2 - Ring300
Ring100 - Bridge1 - Ring400 - Bridge3 - Ring300
Each ring in the network, except for ring 100, sees two AREs. For example, the stations on
ring 200 receive two AREs that look like the following:
Ring100-Bridge1-Ring200
Ring100-Bridge1-Ring400-Bridge3-Ring300-Bridge2-Ring200
The AREs on ring 200 are useless for this session and unnecessarily consume bandwidth.
As the Token Ring network gets more complex, with many rings interconnected in a mesh
design, the quantity of AREs in the network increases dramatically.
NOTE A Catalyst feature, all routes explorer reduction, ensures that AREs don't overwhelm the
network. It conserves bandwidth by reducing the number of explorer frames in the network.
Station D returns every ARE it receives to the source. The source uses the responses to
determine the best path to the destination. What is the best path? The SRB standard does
not specify which response to use, but it does provide some recommendations. The source
could do any of the following:
Use the first response it receives
Use the path with the fewest hops
Use the path with the largest MTU
Use a combination of criteria
Most Token Ring implementations use the first option.
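The ARE flood described above can be sketched as a graph walk, from which the two RIFs reaching ring 300 fall out directly. This is an illustrative model only: bridge 1 is treated as two two-port bridges sharing bridge number 1 (matching the RIFs quoted in the text), and the string form is not the on-the-wire RIF encoding.

```python
# Each entry is (bridge number, ring, ring): the Figure 3-4 topology.
BRIDGES = [(1, 100, 200), (1, 100, 400), (2, 200, 300), (3, 400, 300)]

def explore(ring, target, visited, rif, results):
    """Flood an ARE: a bridge forwards onto any attached ring not
    already recorded in the RIF, appending its bridge/ring numbers."""
    if ring == target:
        results.append(rif)
        return
    for number, a, b in BRIDGES:
        if ring in (a, b):
            nxt = b if ring == a else a
            if nxt not in visited:
                explore(nxt, target, visited | {nxt},
                        rif + f"-Bridge{number}-Ring{nxt}", results)

paths = []
explore(100, 300, {100}, "Ring100", paths)
# Station D on ring 300 receives two AREs, matching the text.
# Station D replies with the RIF reversed, so both directions
# use the same path:
reverse = "-".join(reversed(paths[0].split("-")))
```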
Now that Station A knows how to reach Station D, Station A transmits each frame as a
specifically routed frame where the RIF specifies the ring/bridge hops to the destination.
When a bridge receives the frame, the bridge examines the RIF to determine if it has any
responsibility to forward the frame. If more than one bridge attaches to ring 100, only one
of them forwards the specifically routed frame. The other bridge(s) discard it. Station D
uses the information in the RIF when it transmits back to Station A. Station D creates a
frame with the RIF completed in reverse. The source and destination use the same path in
both directions.
Note that transparent bridging differs from SRB in significant ways. First, in SRB, the
source device determines what path the frame must follow to reach the destination. In
transparent bridging, the bridge determines the path. Second, the information used to
determine the path differs. SRB uses bridge/ring identifiers, and transparent bridging uses
destination MAC addresses.
NOTE Source-route translational bridging (SR/TLB), described in the next section, can overcome
some of the limitations of source-route transparent bridging (SRT). The best solution,
though, is to use a router to interconnect routed protocols residing on mixed media.
This behavior causes problems in some IBM environments. Whenever an IBM Token Ring
attached device wants to connect to another, it first issues a test frame to see whether the
destination resides on the same ring as the source. If the source receives no response, it
sends an SRB explorer frame.
The SRT deficiency occurs with the test frame. The source intends for the test frame to
remain local to its ring and sets the RII to zero. An RII set to zero, however, signals the
SRT bridge to transparently bridge the frame. The bridge floods the frame to all rings. After
the test frame reaches the destination, the source and destination workstations
communicate using transparent bridging methods as if they both reside on the same ring.
Although this is functional, transparent bridging does not take advantage of parallel paths
like source-route bridging can. Administrators often create parallel Token Ring backbones
to distribute traffic and not overburden any single link. But transparent bridging selects a
single path and does not use another link unless the primary link fails. (This is an aspect of
the Spanning-Tree Protocol described in Chapter 6, "Understanding Spanning Tree," and
Chapter 7, "Advanced Spanning Tree.") Therefore, all the traffic passes through the same
links, increasing the load on one while another remains unused. This defeats the intent of
the parallel Token Rings.
Another IBM operational aspect makes SRT unsuitable. To achieve high levels of service
availability, some administrators install redundant devices, such as a 3745 controller, as
illustrated in Figure 3-5.
[Figure 3-5: A mainframe with redundant 3745 controllers A and B, both using MAC address 00-00-00-01-02-03, attached to Token Rings 100 and 200, which interconnect with ring 300.]
The redundant controllers use the same MAC address (00-00-00-01-02-03) to simplify
workstation configuration; otherwise, multiple entries need to be entered. If the primary
unit fails, the workstation needs to resolve the logical address to the new MAC address of
the backup unit. By having duplicate MAC addresses, fully automatic recovery is available
without needing to resolve a new address.
Duplicate MAC addresses within a transparently bridged network confuse bridging tables,
however. A bridge table can have only one entry for a MAC address; a station cannot appear
on two interfaces. Otherwise, if a device sends a frame to the MAC address for Controller A
in Figure 3-5, how can the transparent bridge know whether to send the frame to Controller
A or Controller B? Both have the same MAC address, but only one can exist in the bridging
table. Therefore, the intended resiliency feature does not work in the source-route transparent
bridge mode. When using the resiliency feature, configure the Token Ring Concentrator Relay
Function (TrCRF) for source-route bridging. The TrCRF defines ports to a common ring. The
TrCRF is discussed later in the chapter in the "Token Ring Switching" section.
[Figure: a source-route bridged (SRB) Token Ring network and a transparently bridged (TB) network, with the interconnection between them in question.]
Several obstacles prevent devices on the two networks from communicating with each
other. Some of the obstacles include the following:
MAC address format
MAC address representation in the protocol field
LLC and Ethernet framing
Routing information field translation
MTU size mismatch
Translational bridging helps to resolve these issues, allowing the devices to communicate.
But no standard exists for translational bridging, leaving a number of implementation
details to the vendor. The following sections discuss in more detail each listed limitation.
The sections provide a technical and a practical explanation of why transparent bridging
might not be the best option.
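One concrete aspect of the MAC address format obstacle is bit ordering: Ethernet transmits each octet least-significant bit first (canonical format), while Token Ring transmits most-significant bit first, so a translational bridge must bit-swap every octet of the source and destination MAC addresses. A sketch of that swap (the function names are invented for illustration):

```python
def bit_swap(octet):
    """Reverse the bit order of one octet (0-255)."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (octet & 1)  # shift in the low bit
        octet >>= 1
    return result

def translate_mac(mac):
    """Bit-swap each octet of a xx-xx-... formatted MAC address."""
    return "-".join(f"{bit_swap(int(o, 16)):02x}" for o in mac.split("-"))
```

For example, the canonical address 00-00-00-01-02-03 appears on the Token Ring side as 00-00-00-80-40-c0, which is why the "same" address can look entirely different on the two media.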
[Figure 3-7: Frame formats. The Ethernet frame carries DA, SA, Type, and Data fields.]
The frame fields illustrated in Figure 3-7 are as follows:
AC (access control)
FC (frame control)
DA (destination MAC address)
SA (source MAC address)
RIF (routing information field)
DSAP (destination service access point)
SSAP (source service access point)
Control (LLC control)
OUI (Organizationally Unique Identifier, the vendor ID)
Type (protocol type value)
RIF Interpretation
When connecting source-route devices to transparent devices, another issue involves the
routing information field. The RIF is absent from transparent devices but is vital to the
Token Ring bridging process. How then does a source-route bridged device specify a path
to a transparently bridged device?
A translational bridge assigns a ring number to the transparent segment. To an SRB device, it
appears that the destination device resides on a source-routed segment. To the transparent
device, the SRB device appears to attach to a transparent segment. The translational bridge
keeps a source-routing table to reach each Token Ring MAC address. When a transparent
device transmits a frame to the Token Ring device, the bridge looks at the destination MAC
address, finds a source-route entry for that address, and creates a frame with a completed RIF.
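The lookup described above can be sketched as follows. The pseudo ring number, cache contents, and string RIF form are all invented for this illustration, not the on-the-wire encoding.

```python
# A sketch of the translational bridge's RIF cache: the bridge assigns
# a pseudo ring number to the transparent segment and stores a RIF per
# Token Ring destination, learned from returning explorers.
PSEUDO_RING = 500            # ring number assigned to the Ethernet side

rif_cache = {                # Token Ring MAC -> cached RIF
    "00-00-00-01-02-03": f"Ring{PSEUDO_RING}-Bridge1-Ring100",
}

def frame_toward_token_ring(dst_mac):
    """Build the source-routed header fields for a frame arriving
    from the transparent segment."""
    rif = rif_cache.get(dst_mac)
    if rif is None:
        # No cached path yet: send an explorer to discover one.
        return {"rii": 1, "rif": "", "explorer": True}
    return {"rii": 1, "rif": rif, "explorer": False}
```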
MTU Size
Ethernet, Token Ring, and FDDI support different frame sizes. Table 3-5 lists the minimum
and maximum frame sizes for these media access methods.
A frame from one network type cannot violate the frame size constraints of the destination
network. If an FDDI device transmits to an Ethernet device, it must not create a frame over
1518 octets or under 64 octets. Otherwise, the bridge must drop the frame. Translational
bridges attempt to adjust the frame size to accommodate the maximum transmission unit
(MTU) mismatch. Specifically, a translational bridge may fragment an IP frame if the
incoming frame exceeds the MTU of the outbound segment. Routers normally perform
fragmentation because it is a Layer 3 process. Translational bridges that perform
fragmentation actually take on part of the router's responsibilities.
Note, however, that fragmentation is an IP process. Other protocols do not support
fragmentation, so the source must create frames appropriately sized for the segment with
the smallest MTU. In order for these protocols to work correctly, they exercise MTU
discovery, which allows the stations to determine the largest allowed frame for the path(s)
between the source and destination devices. This option exists in IP, too, and is preferred
over fragmentation.
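The outbound-size decision described above can be sketched as follows. Header overhead is ignored for brevity, and the function name and sizes are illustrative only.

```python
# A frame that fits the outbound MTU is forwarded, an oversized IP
# frame may be fragmented, and an oversized non-IP frame must drop.
ETHERNET_MAX = 1518          # maximum Ethernet frame, in octets

def handle_frame(size, out_max, is_ip):
    """Return the list of frame sizes sent on the outbound segment."""
    if size <= out_max:
        return [size]        # fits the outbound MTU: forward as-is
    if not is_ip:
        return []            # no fragmentation outside IP: drop
    pieces = []
    remaining = size
    while remaining > 0:     # IP fragmentation, normally a router's job
        chunk = min(remaining, out_max)
        pieces.append(chunk)
        remaining -= chunk
    return pieces
```

For example, a large frame arriving from an FDDI segment bound for Ethernet either fragments (IP) or disappears (other protocols), which is why path MTU discovery is the preferred approach.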
[Figure: Catalyst Token Ring switching components. A Bridge Relay Function in Cat-A connects source-route switches and Concentrator Relay Functions, each grouping Token Ring ports into rings numbered 100 through 400.]
In general, a TrCRF can reside in only one Catalyst and cannot span outside of a Catalyst.
This is called an undistributed TrCRF. An exception to this, the default TrCRF, spans
Catalysts and is referred to as a distributed TrCRF. (A backup TrCRF may also span
Catalysts.) The default TrCRF enables all Token Ring ports to belong to a common TrCRF
without any administrator intervention. Users can attach to any Token Ring port and
communicate with any other station in the distributed TrCRF network. The default TrCRF
behaves like a giant ring extending across all Catalysts and provides the plug and play
capability of the Catalyst in Token Ring networks.
When configuring a TrCRF, you must define the ring and VLAN numbers, and you must
associate it with an existing parent TrBRF (TrBRFs are discussed in the next section). The
parent TrBRF is assigned a VLAN number, which is the identifier used to associate the TrCRF
with the TrBRF. In addition, you may define whether the TrCRF operates in SRB or
SRT mode. If left unspecified, the TrCRF operates in SRB mode.
[Figure 3-9: A TrBRF spanning two Catalysts (CAT 1 and CAT 2), with child TrCRFs beneath it.]
Enabling a TrBRF requires a bridge and VLAN number. The TrBRF's VLAN number is
paired with the TrCRF VLAN number to create the parent-to-child relationship. Because a
TrCRF must associate with a parent TrBRF, the default TrCRF belongs to a default TrBRF.
When enabling Token Ring switching with a non-default TrCRF and TrBRF, you must first
configure the TrBRF, then the TrCRF, and finally, group ports to TrCRFs. Referring to
Figure 3-9, you would do the following:
1 Configure the TrBRF at the top of the drawing. To do this, you define the bridge
number and the VLAN number as in the following:
Console> (enable) set vlan 100 type trbrf bridge
2 Work down the figure to the TrCRF(s). Create the TrCRF, specifying the VLAN
number, the ring number, the bridge type, and parent TrBRF. Here is an example:
Console> (enable) set vlan 110 type trcrf parent 100 ring 10
Source-Route Switching
Source-route switching describes the mechanism of bridging Token Ring traffic. The
bridging mode is determined by the location of the source and destination devices relative to the
bridging function. The source-route switch (SRS) decides whether to transparently bridge a
frame within a TrCRF or to source-route bridge it to another TrCRF. When a station on a switch
port transmits to another station residing on a different port but belonging to the same TrCRF,
the SRS forwards the frame based on the destination MAC address. The SRS learns source
MAC addresses and makes forwarding decisions the same way as a transparent bridge. However,
if the source and destination are on different rings, the source creates a frame with a RIF. The
SRS examines the RIF and passes the frame to the bridge relay function for forwarding.
Although this sounds like a source-route bridge, a significant difference distinguishes an SRS
from an SRB. When a station transmits an ARE, an SRB modifies the RIF to indicate ring/bridge
numbers. An SRS never modifies a RIF; it simply examines it. When a source sends an all-routes
explorer, for example, it sets the RII bit to one, indicating the presence of a RIF. Examination
of the initial explorer frame, however, reveals that the RIF is empty. The SRS notices that the
RII bit value is one and forwards the explorer to the TrBRF unmodified. The SRS simply says,
"This is a source-routed frame; I better send it to the TrBRF and let the bridge worry about it."
In contrast, an SRB or TrBRF modifies the explorer RIF by inserting ring and bridge numbers.
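The SRS decision above keys entirely on the RII bit, which can be sketched as follows. The return values and table shapes are invented for illustration, not Catalyst internals.

```python
# The source-route switch (SRS) never modifies the RIF: it either
# transparently bridges within the TrCRF or hands the frame to the
# bridge relay function (TrBRF).
def srs_decide(rii_bit, dst_mac, mac_table):
    """Decide how the SRS handles a frame."""
    if rii_bit == 0:
        # Transparent path within the TrCRF: forward on the learned
        # MAC table, flooding if the destination is unknown.
        if dst_mac in mac_table:
            return ("forward", mac_table[dst_mac])
        return ("flood-crf", None)
    # RII = 1: a source-routed frame; pass it, RIF untouched,
    # to the TrBRF and let the bridge worry about it.
    return ("to-brf", None)
```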
In Chapter 2, a broadcast domain was defined and equated to a VLAN; a broadcast domain
describes the extent to which broadcast frames are forwarded or flooded throughout the
network. The domain boundaries terminate at a router interface and include an
interconnected set of virtual bridges. But Token Ring complexities present ambiguities
when defining a VLAN. In a source-route bridged environment, Token Ring actually
creates two kinds of broadcasts: the intra-ring and the inter-ring broadcast.
A device generates an intra-ring broadcast whenever it produces a broadcast frame without
a RIF and with the explorer bit set to zero. A station may do this whenever it wants to
determine whether the destination is on the same ring as the source. The SRS function
floods this frame type to all ports within the TrCRF. The frame does not cross the TrBRF.
In contrast, an inter-ring broadcast frame sets the explorer bit to one, enabling the frame
to cross ring boundaries. The TrBRF floods the inter-ring frame to all attached rings; all
rings receive a copy of the frame. Figure 3-10 illustrates Token Ring VLAN boundaries.
[Figure 3-10: Token Ring VLAN boundaries. The TrBRF carries inter-ring broadcasts, while each TrCRF contains intra-ring broadcasts.]
A network may see both the intra- and inter-ring broadcasts. Which one actually describes
a VLAN? A VLAN in the Token Ring network includes the TrCRF and the TrBRF. A
VLAN exists whenever Token Ring networks must interconnect through a router.
Figure 3-11 Do not attempt this. Duplicate ring numbers are not allowed on multiple switches.
To mitigate the effects of duplicate ring numbers in a switched network, Cisco developed a
proprietary protocol to detect them and react accordingly. The Duplicate Ring Protocol
(DRiP) sends advertisements with the multicast address 01-00-0C-CC-CC-CC to its
neighbors, announcing VLAN information for the source device only. By default, DRiP
announcements occur every 30 seconds or whenever a signicant conguration change
occurs in the network. DRiP announcements only traverse ISL links and are constrained to
VLAN1, the default VLAN. The receiving Catalyst then compares the information with its
local configuration. If a user attempts to create a Token Ring VLAN that already exists on
another Catalyst, the local unit denies the configuration.
promises a user the token at a predictable rate, thereby allowing access to the network to send
critical commands. Ethernet could make no such claim; therefore, it was frequently dismissed
as a viable networking alternative on the manufacturing floor.
Another factor that caused users to select Token Ring was their mainframe computer type.
If the user had IBM equipment, they tended to use Token Ring because much of the IBM
equipment had Token Ring LAN interfaces. Therefore, IBM shops were practically forced
to use Token Ring to easily attach equipment to their networks.
In an office environment where no guarantees of predictability were necessary, Ethernet
found popularity. Ethernet cabling schemes were simpler than Token Ring, and
minicomputer manufacturers included Ethernet interfaces in their workstations. As users
became more comfortable with Ethernet, network administrators selected it more
frequently.
With the introduction of LAN switching technologies, Ethernet now finds application in the
manufacturing environment where it previously could not. Switching reduces the size of a
collision domain and provides a higher throughput potential that makes it possible to use in
manufacturing. Switched Ethernet with ports dedicated to single devices performs as well
as or better than switched Token Ring networks in a similar configuration.
Summary
Bridging alternatives exist for Ethernet and Token Ring media types. Traditionally, Ethernet
uses transparent bridging with the following operations: learning (adding a source address/
interface to a bridge table), forwarding out a single interface (known unicast traffic),
flooding out all interfaces (unknown unicast, multicast, and broadcast), filtering (unicast
traffic where the source and destination are on the same interface side), and aging
(removing an entry from the bridge table).
Token Ring usually implements source-route bridging, in which the source specifies the path
for the frame to the destination. This means the source identifies a sequence of ring/bridge
hops. SRT (source-route transparent bridging), another Token Ring bridge method, does both
source-route bridging and transparent bridging. Source-route bridging is employed if the
frame includes a routing information field. Otherwise, transparent bridging is used.
When connecting transparently bridged segments (Ethernet) to source-routed segments
(Token Ring), use routing or translational bridging. If you must use bridging, translational
bridging resolves a number of issues in the way that frames are constructed for the different
access methods. However, translational bridges must be aware of the protocols to be
translated. The best solution, though, is to use routing to interconnect mixed media
networks. Chapter 11, "Layer 3 Switching," discusses using routers in a switched
environment.
Token Ring switching creates two functions to segment Token Rings: the concentrator relay
function and the bridge relay function. In the Catalyst, the Token Ring Concentrator Relay
Function (TrCRF) defines which ports belong to a ring. A TrCRF may operate either in SRB
or SRT mode. A TrCRF cannot span across Catalysts. The Token Ring Bridge Relay
Function (TrBRF) provides Token Ring bridging between TrCRFs. A TrBRF can span
across Catalysts to allow ports on different units to communicate through a common bridge.
Source-route switching determines whether a frame may be forwarded within the TrCRF
or whether the frame needs to be sent to the TrBRF. The SRS looks for the RII to make this
decision. If the RII indicates the presence of a routing information field, it forwards the
frame to the TrBRF. Otherwise, the SRS keeps the frame within the TrCRF and uses
transparent bridging.
Review Questions
1 If a RIF is present in a source-routed frame, does a source-route switch ever examine
the MAC address of a frame?
2 To how many VLANs can a TrBRF belong?
3 The transparent-bridge learning process adds entries to the bridging table based on the
source address. Source addresses are never multicast addresses. Yet when examining
the Catalyst bridge table, it is possible to see multicast addresses. Why?
80 Chapter 3: Bridging Technologies
4 Two Cisco routers attach to a Catalyst. On one router you type show cdp neighbor.
You expect to see the other router listed because CDP announces on a multicast
address 01-00-0C-CC-CC-CC, which is flooded by bridges. But you see only the
Catalyst as a neighbor. Suspecting that the other router isn't generating CDP
announcements, you enter show cdp neighbor on the Catalyst. You see both routers
listed, verifying that both routers are generating announcements. Why didn't you see
the second router from the first router? (Hint: neither router can see the other.)
This chapter covers the following key topics:
Catalyst 5000/6000 CLI Syntax Conventions: Provides the standard Cisco
representation for interpreting commands administered on Catalyst switches.
Catalyst 5000 Configuration Methods: Provides information on how to operate
under the Console, Telnet, and TFTP configuration modes for Catalyst configuration.
Using the Catalyst 5000/6000 Command-Line Interface: Describes command-line
recall, editing, and help for the Catalyst 5000 series.
Passwords: Provides documentation on how to set, change, and recover passwords
for the Catalyst 5000/6000 series of switches.
Configuration File Management: Discusses how to store and restore configuration
files on flash and TFTP servers for Supervisor I, II, and III modules.
Image File Management: Describes how to transfer Supervisor I, II, and III module
software images.
Redundant Supervisor Modules: Discusses how to implement redundant
Supervisor modules to ensure system operation in the event of a module failover.
Configuring Other Catalysts: Provides a quick overview of the configuration
methods for the 1900/2800 and the 3000 series of Catalyst switches.
CHAPTER
4
NOTE Cisco folklore has it that XDI is the name of a UNIX-like kernel purchased for use in
equipment that evolved into the Catalyst 4000, 5000, and 6000 products of today. The XDI
CLI is often referred to as CatOS.
The Catalyst product family evolution does not have the same roots as the Cisco router
products. Cisco's history begins with the development of routers to interconnect networks. As
the router family increased, a number of differences between the early models and the later
became evident. Particularly with the release of 9.1x, the IOS command-line interface
differed vastly from earlier releases. But the IOS essentially retained the same look and feel
after that point across all of the router family. Users of the Catalyst, on the other hand, may
encounter multiple CLIs depending upon the model used. This occurs not because Cisco
changed its mind on how to present the CLI, but because some of the products were acquired
technologies with a previously installed user base. For example, some of the Catalysts, such
as the 1900 and 2800, came from Grand Junction and have their own configuration methods.
Some come from Kalpana, such as the Catalyst 3000, and use a different menu structure.
Some, such as the 8500 and the 2900XL, were developed by Cisco and use IOS-type
configurations. The Catalyst 5000 family originated with Crescendo. When Cisco acquired
Crescendo, a significant user base already familiar with the XDI/CatOS configuration modes
existed. The Catalyst 5000 and 6000 series use a CLI that differs from all of the others.
84 Chapter 4: Configuring the Catalyst
This chapter provides an overview for configuring the Catalyst 4000/5000/6000 series
products. The CLI syntax and conventions are covered, along with command recall and
editing methods. Methods for storing and retrieving configuration files and images are also
explained. Finally, configuring and managing redundant Supervisor modules in a Catalyst
5500/6000/6500 are discussed.
From the Supervisor console, or via Telnet, you can clear the Catalyst configuration with
the clear config all command. clear config all in Example 4-1 resets the Supervisor module
to its defaults. Note that this command does not clear the files for the ATM LANE module,
nor for the RSM (or MSM in a Catalyst 6000). This only affects the modules directly
configured from the Supervisor module. To clear the configurations on the ATM or router
modules, you need to access the modules with the session module_number command. This
command performs the equivalent of an internal Telnet to the module so that you can make
configuration changes. The ATM and router modules use IOS commands to change, save,
and clear configurations.
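Example 4-1 is referenced above but not reproduced in this excerpt; a sketch of the sequence, with the confirmation prompt approximated and the slot number hypothetical, looks like this:

```
Console> (enable) clear config all
This command will clear all configuration in NVRAM.
Do you want to continue (y/n) [n]? y
Console> (enable) session 5
```

The session 5 command then drops you into the CLI of the module in slot 5 (here assuming a router module such as an RSM), where you use ordinary IOS commands, for example write erase, to clear that module's own configuration.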
TIP Configuring the Catalyst through the console and through Telnet allows you to enter
commands in real time, but only one at a time. Unlike Cisco routers, the Catalyst
immediately stores commands in nonvolatile random-access memory (NVRAM) and does
not require you to perform a copy run start like a router. Any command you type in a
Catalyst is immediately remembered, even through a power cycle. This presents a challenge
when reversing a series of commands. On a router, you can reverse a series of commands
with reload, as long as you didn't write the running conguration into NVRAM.
Before making serious changes to a Catalyst, copy the configuration to an electronic notepad.
On the Catalyst, use the command set length 0 to terminate the more function, enable screen
capture on your device, and enter show config to capture the current configuration. Then, if
you do not like the changes you made and cannot easily reverse them, clear config all and
replay the captured configuration file to locally restore the starting configuration.
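The capture-before-change procedure described above amounts to three commands on the Catalyst (screen capture itself is enabled in your terminal emulator, not on the switch; restoring a screen length of 24 lines afterward assumes the default):

```
Console> (enable) set length 0
Console> (enable) show config
Console> (enable) set length 24
```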
Console Configuration
The Catalyst 5000 series Supervisor module has one physical console connection. For a
Supervisor I or a Supervisor II, the connection is an EIA-232 25-pin connection. For a
Supervisor III module, the connection is an RJ-45 connector. Make sure that you know which
kind of Supervisor module you are working with to ensure that you can attach to the console.
The console has an interesting feature in that it can operate in one of two modes: either as
a console or slip interface. When used as a console, you can attach a terminal or terminal
emulation device such as a PC with appropriate software to the interface. This provides
direct access to the CLI regardless of the configuration. You use this access method when
you have no IP addresses configured in the Catalyst; without an IP address, you cannot
Telnet to the Catalyst over the network. You also use this method whenever you need to do
password recovery. (Password recovery is discussed in a later section.) And, you will
probably elect to access the Catalyst with this method whenever you are local to the
Catalyst with an available terminal.
You can enable the console port as a SLIP interface. (SLIP [Serial Line Internet Protocol] is
the precursor to PPP.) When used in the SLIP mode, you can Telnet directly to the console port.
In a likely setup, you attach a modem to the console port, enabling you to Telnet directly to the
Catalyst without having to traverse the network. This can be useful when troubleshooting the
Catalyst from a remote location when you cannot access it over the network. When used as a
SLIP interface, the interface designator is sl0. You can use the interface as a direct
console attachment or a SLIP interface, but not both. It can only operate as one or the other. By
default, it operates as a console interface. To configure the console as a SLIP interface, you need
to assign an IP address to sl0 using the set interface command.
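Following the usage line shown later in Example 4-4 (set interface sl0 <slip_addr> <dest_addr>), a sketch with hypothetical addresses:

```
Console> (enable) set interface sl0 144.254.100.2 144.254.100.3
```

The first address becomes the Catalyst's sl0 address; the second is the address of the far-end device dialing in.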
Lastly, you can access the CLI through Telnet over the network. The Catalyst has an
internal logical interface, sc0, that you can assign an IP address to. This address becomes
the source address when generating trafc in the Catalyst, or the destination address when
you attempt to reach the Catalyst. Assigning an address to this logical interface causes the
Catalyst to act like an IP end station on the network. You can use the address to perform
Telnet, TFTP, BOOTP, RARP, ICMP, trace, and a host of other end station functions.
By default, the sc0 has no IP address and belongs to VLAN 1. If you want to change any of
these parameters, use the set interface command. You can modify sc0's IP address and
VLAN assignment in one statement. For example, set int sc0 10 144.254.100.1
255.255.255.0 assigns sc0 to VLAN 10 and configures an IP address of 144.254.100.1 with
a Class C IP mask.
Telnet Configuration
Before you can Telnet to a Catalyst, you need to assign an IP address to the sc0 interface on
the Supervisor module. The previous section, "Console Configuration," demonstrated how to
do this. You can Telnet to a Catalyst as long as your Telnet device can reach the VLAN and IP
network that the sc0 interface belongs to. Telnetting to the Catalyst allows you to perform any
command as if you were directly attached to the Catalyst console. You do, however, need to
know the normal mode and privileged EXEC passwords to gain access.
It was also mentioned earlier that if you enter clear config all from a remote location, you
effectively cut yourself off from communicating with the Catalyst through the network.
Changing the IP address or VLAN assignment on sc0 can do the same thing. Therefore, be
sure to thoroughly review the results of changing the Catalyst IP address or VLAN
assignment remotely.
A Catalyst security feature allows you to specify an access list of authorized stations that
can access the Catalyst through Telnet or Simple Network Management Protocol (SNMP).
You can specify up to 10 entries through the set ip permit command. To enable the access
list, you need to specify the set of authorized stations and then turn on the IP permit filter.
To specify the list of allowed stations, use the command syntax set ip permit ip_address
[mask]. The optional mask allows you to specify wildcards. For example, if you type set ip
permit 144.254.100.0 255.255.255.0, you authorize all stations in subnet 144.254.100.0 to
access the console interface. If you enter the command set ip permit 144.254.100.10 with
no mask, the implied mask 255.255.255.255 is used, which specifies a specific host.
You can specify up to 10 entries this way. To activate the permit list, use the command set
ip permit enable.
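Putting the commands above together, a minimal permit list authorizing one subnet and one host (the addresses echo the examples in the text), then activating the filter:

```
Console> (enable) set ip permit 144.254.100.0 255.255.255.0
Console> (enable) set ip permit 144.254.100.10
Console> (enable) set ip permit enable
```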
Activating the list does not affect any other transit or locally originated IP processes such
as trace route and ICMP Echo Requests/Replies. The IP permit list only controls inbound
Telnet and SNMP traffic addressed to the Catalyst. If the source IP address does not fall
within the permitted range, the Catalyst refuses the connection.
TIP If you apply a permit list from a remote Telnet connection, ensure that you include yourself
in the permit list. Otherwise, you disconnect yourself from the Catalyst you are configuring.
You can also secure the Catalyst using Terminal Access Controller Access Control System
Plus (TACACS+) authentication. TACACS+ enables a communication protocol between
your Catalyst and a TACACS+ server. The server authenticates a user based upon the
username and password for each individual you want to access the Catalyst console.
Normally, the Catalyst authenticates using local parameters, the exec and enable
passwords. If the user accessing the Catalyst knows these passwords, the user is
authenticated to enter the corresponding mode.
TACACS+ allows you to demand not just a password, but also a username. The user
attempting to log in must have a valid username/password combination to gain access.
If a user attempts to log in to the Catalyst when TACACS+ is enabled, the Catalyst sends a
request to the TACACS+ server for authentication information. The server replies and the
user is either authenticated or rejected.
To enable TACACS+, you need to have a functional and accessible TACACS+ server in
your network. The specifics of configuring TACACS+ are beyond the scope of this book. See
the Catalyst documentation for configuration details.
TIP If you configure TACACS+ on a Catalyst and the TACACS+ server becomes unavailable
for some reason, the locally configured normal and privileged passwords can be used as a
backdoor (be sure to set these passwords to something other than the default of no
password).
TFTP Configuration
The Catalyst has a TFTP client allowing you to retrieve and send configuration files from/
to a TFTP server. The actual syntax to do TFTP configuration file transfers depends upon
the version of Supervisor module installed in the Catalyst.
If you plan for the Catalyst to obtain the new configuration over the network after a clear
config all, you either need to restore a valid IP address and default gateway setting, or you
need to have the configuration file on an accessible TFTP server. Details for using TFTP are
described in the section, "Catalyst Configuration File Management."
Table 4-2 compares various access methods for configuring the Catalyst.
Using the Catalyst 5000/6000 Command-Line Interface 89
TIP You might occasionally see Cisco refer to the Catalyst 5000/6000 interface as an XDI
interface. This is Cisco's internal identication of the interface. Another name is CatOS.
Command-line recall and editing vastly differed prior to Catalyst code version 4.4. With
system codes prior to 4.4, command-line recall and editing consist of using UNIX shell-
like commands. To recall a previous command, you need to specify how many commands
back in the history le you want. Or, you can recall a command through pattern matching.
To edit a command line, you need to use UNIX-like commands that specify a pattern and
what to substitute in the pattern's place. Doing command-line editing in this manner is not
intuitive for many users, unless the user is a UNIX guru.
With Catalyst Supervisor engine software release 4.4, Cisco introduced IOS-type command-
line recall and editing where up and down arrows on your terminal keypad scroll you through
the Catalyst's command history buffer. If you are familiar with command-line recall and
editing with a Cisco router, you will be comfortable with the Catalyst CLI. If, however, you
still have code levels prior to 4.4, regrettably, you must continue to use the UNIX structure.
The manner in which the Catalyst displays help differs from the router displays. The router
uses a parameter-by-parameter method of displaying help, whereas the Catalyst displays a
complete command syntax.
The following sections describe command-line recall, editing, and help for the Catalyst
5000 series with the XDI/CatOS style interface.
Command-Line Recall
When you enter a command in the Catalyst, it retains the command in a buffer called the
history buffer. The history buffer can store up to 20 commands for you to recall and edit.
Various devices have methods of recalling commands. The Catalyst uses abbreviated key
sequences to recall commands. These sequences resemble what a UNIX c-shell user might
use. UNIX users often live with awkward methods of recalling and editing commands.
Therefore, their comfort level with the legacy Catalyst editing system is probably fairly
high, but might be low for the rest of us.
In UNIX, you often perform commands with a bang included in the command line. A bang
is nothing more than an exclamation point (!) on a keyboard, but "exclamation" is too
difficult to say when dictating commands. Therefore, bang is used in its place. Table 4-3
summarizes the key sequence for recalling previous commands in the history buffer.
Sometimes you not only want to recall a command, but also edit it. Table 4-4 shows the
sequences to recall and edit previous commands.
Suppose, for example, that you enter a command set vlan 3 2/1-10,4/12-21,6/1,5/7.
This command string assigns a set of ports to VLAN 3. However, you realize after entering
the command that you really mean for them to be in VLAN 4 rather than VLAN 3. You could
retype the whole command a second time and move the ports to VLAN 4, or you could
simply type ^3^4. This forces the Catalyst not only to use the previous command, but to
change the number 3 to a number 4, which in this case, corrects the VLAN assignment.
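For instance, with a hypothetical port list, the exchange looks like this; in the UNIX c-shell style, the substituted command is echoed and then executed:

```
Console> (enable) set vlan 3 2/1-10,4/12-21,6/1,5/7
Console> (enable) ^3^4
set vlan 4 2/1-10,4/12-21,6/1,5/7
```

The caret substitution replaces the first occurrence of 3 on the recalled line, which here is the VLAN number.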
One frustration when mentally recalling commands can be that you have a hard time
remembering what command you entered seven lines previously. This can become
particularly challenging because the Catalyst history buffer can store up to 20 commands.
Use the history command to see your history buffer. Example 4-2 shows output from a
history command. Notice that the commands are numbered, allowing you to reference a
specic entry for command recall. For example, the output recalls command 2 from the
history buffer. This caused the Catalyst to recall the history command. Note also that new
commands add to the bottom of the list. Newer commands have higher numbers.
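Example 4-2 is not reproduced in this excerpt; a hypothetical listing, with the numbering format approximated, would resemble the following, where !2 recalls entry 2 (the history command itself):

```
Console> (enable) history
   1 show config
   2 history
Console> (enable) !2
history
   1 show config
   2 history
   3 history
```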
Using Help
In a Cisco router, you access help by entering ? on a command line. The router then prompts
you with all possible choices for the next parameter. If you type in the next parameter and
type ? again, the router displays the next set of command-line choices. In fact, the router
displays help on a parameter-by-parameter basis. Additionally, when the router displays
help options, it also ends by displaying the portion of the command that you entered so far.
This enables you to continue to append commands to the line without needing to reenter
the previous portion of the command.
The Catalyst help system functions differently from the router, though. You access help in the
same manner as you do in a router, but the results differ. For example, where a router prompts
you for the next parameter, the Catalyst displays the entire usage options for the command, if
your command string is unique so that the Catalyst knows what command you want. Example
4-3 shows the help result for a partial command string. However, the string does not uniquely
identify what parameter should be modified, so it lists all set system commands.
On the other hand, if you have enough of the command on the line that the Catalyst
recognizes what command you intend to implement, it displays the options for that
command. This time, in Example 4-4, the string identies a specic command and the
Catalyst displays help appropriate for that command. The user wants to modify the console
interface in some way, but is unsure of the syntax to enter the command.
Example 4-4 Another Catalyst Help Example
Console> (enable) set interface ?
Usage: set interface <sc0|sl0> <up|down>
set interface sc0 [vlan] [ip_addr [netmask [broadcast]]]
set interface sl0 <slip_addr> <dest_addr>
Console> (enable)
Notice that when the console displays help, it returns the command line with a blank line. The
command string you entered so far is not displayed for you as it is on a router. You can now
elect to use command recall. Suppose you want to disable the logical interface, sc0. So you
want to enter the command set int sc0 down. Being a clever network administrator, you elect
to use command recall and complete the command. What happens if you type !! sc0 down?
You see the command usage screen again, without the console changing state to down (see
Example 4-5). This happens because the command recall executes the previous statement that
was set int ? with the help question mark and your appended parameters. When you add the
additional parameters, the Catalyst interprets the string as set int ? sc0 down, sees the
question mark, and displays help.
Example 4-5 Command Recall after Help
CAT1> (enable) set int ?
Usage: set interface <sc0|sl0> <up|down>
set interface sc0 [vlan] [ip_addr [netmask [broadcast]]]
set interface sl0 <slip_addr> <dest_addr>
CAT1> (enable) !! sc0 down
set int ? sc0 down
Usage: set interface <sc0|sl0> <up|down>
set interface sc0 [vlan] [ip_addr [netmask [broadcast]]]
set interface sl0 <slip_addr> <dest_addr>
CAT1> (enable)
If you have system code 4.4 or later, you can use the up/down arrow to perform command
recall after help, but the recall includes the question mark. The advantage here, though, over
the !! recall is that you can edit out the question mark on the recalled command line using
router editing commands. Therefore, you can perform command recall, remove the
question mark, and enter the rest of the command. The Catalyst then correctly interprets the
command, assuming that you subsequently enter correct and meaningful parameters.
A Catalyst invokes help when you enter a question mark on the command line. It also
provides help if you enter a partial command terminated with <ENTER>. For example, the
command in Example 4-4 displays the same screen if the user enters set interface
<ENTER>. The Catalyst uniquely recognizes set int, but also observes that the command
is not complete enough to execute. Therefore, the Catalyst displays the command usage
screen. If you intend to modify the sc0 VLAN membership to VLAN 5 and change the IP
address in the same line, you can enter the command set int sc0 5 144.254.100.1
255.255.255.0. Suppose that as you enter the command you enter the VLAN number, but
forget the rest of the command line. You might be tempted to hit <ENTER> to get a
command usage screen. But you do not see the usage screen. Instead, the Catalyst sees the
current command line and says, "There is enough on this line to execute, so I will." You just
successfully changed the sc0 VLAN membership without changing the IP address. If you
do this through a Telnet session in a production network, you probably just completely
removed Telnet access to the Catalyst. It is now time to walk, drive, or fly to the Catalyst to
restore connectivity. (Or call someone who can do it for you and confess your mistake!)
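The pitfall described above can be sketched with the VLAN and address from the text:

```
Console> (enable) set int sc0 5
Console> (enable)
```

Because set int sc0 5 is already a complete, executable command, no usage screen appears: sc0 silently moves to VLAN 5 with its IP address unchanged, and a remote Telnet session is cut off. The intended command was set int sc0 5 144.254.100.1 255.255.255.0.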
TIP In many cases, you can get usage help with a partial command and <ENTER>. However,
it is best to use the question mark to ensure that you do not prematurely execute a command
that might prove to be catastrophic to your network and career.
Note in Example 4-6 that the file collates in logical sections. First, the Catalyst writes any
globally applicable configuration items such as passwords, SNMP parameters, system
variables, and so forth. Then, it displays configurations for each Catalyst module installed.
Note that the module configuration files refer to Spanning Tree and VLAN assignments.
Further, it does not display any details about other functions within the module. For
example, an RSM is installed in module 5 of this Catalyst. Although this is a router module,
it attaches to a virtual bridge port internally. The Catalyst displays the bridge attachment
parameters, but not the Route Switch Module (RSM) or ATM LANE configuration lines.
To see these module-specific configurations, you need to access the module with the session
module_number command and view its own configuration file.
Other show commands display item-specific details. For example, to look at the current
console configuration, you can use the show interface (sh int) command as demonstrated
in Example 4-7.
Another useful show command displays the modules loaded in your Catalyst (see Example 4-8).
Mod MAC-Address(es) Hw Fw Sw
--- ---------------------------------------- ------ ------- ----------------
1 00-90-92-bf-70-00 thru 00-90-92-bf-73-ff 1.5 3.1(2) 3.1(1)
3 00-10-7b-4e-8d-d0 thru 00-10-7b-4e-8d-e7 1.1 2.3(1) 3.1(1)
4 00-10-7b-42-b0-59 2.1 1.3 3.2(6)
5 00-e0-1e-91-da-e0 thru 00-e0-1e-91-da-e1 5.0 20.7 11.2(12a.P1)P1
Mod Sub-Type Sub-Model Sub-Serial Sub-Hw
--- -------- --------- ---------- ------
1 EARL 1+ WS-F5520 0008700721 1.1
1 uplink WS-U5531 0007617579 1.1
Console> (enable)
The output in Example 4-8 displays details about the model number and description of the
modules in each slot. The second block of the output tells you what MAC addresses are
associated with each module. Notice that the Supervisor module reserves 1024 MAC
addresses. Many of these addresses support Spanning Tree operations, but other processes
are involved too. Module 3, the 24-port Ethernet module, reserves 24 MAC addresses, one
for each port. These also support Spanning Tree in that they are the values used for the port
ID in the Spanning Tree convergence algorithm. The third block of the display offers details
regarding the Supervisor module.
Other key show statements are demonstrated throughout the rest of the book.
Clearly, there are several system variables that you can modify. Example 4-9 modifies the
system location object.
Some commands provide a warning if your action might cause connectivity problems for
you or the users. For example, in Example 4-10, the user intends to change the IP address
of the console interface. If the user is making the change remotely (that is, the user is
logged in to the Catalyst through a Telnet session), the user loses connectivity and needs to
re-establish the Telnet session.
Use a clear command to restore a parameter to a default value. Suppose you have a VLAN 4
configured on the Catalyst and want to remove it. You use the command clear vlan 4. This
eliminates references to VLAN 4 in the Catalyst. However, some things associated with VLAN
4 are not eliminated. For example, if you have ports assigned to VLAN 4 and you clear vlan 4,
the ports assigned to VLAN 4 move into a disabled state. They do not move to VLAN 1. You
need to manually reassign the ports to VLAN 1. The clear config command demonstrated in
Example 4-1 returns the whole Catalyst to its default out-of-the-box configuration. The Catalyst
warns you about potential connectivity issues with this command before executing it.
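Assuming hypothetical ports 2/1-4 had been assigned to VLAN 4, the cleanup described above takes two steps:

```
Console> (enable) clear vlan 4
Console> (enable) set vlan 1 2/1-4
```

The first command removes the VLAN, leaving its former ports in the disabled state; the second manually returns them to VLAN 1.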
The user then changes the normal mode password with the set password command. As
with the enable password, the user has to know the existing password before the Catalyst
allows any changes to the password. Upon entering the correct password, the Catalyst asks
for the new password and a confirmation.
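The dialog follows this general pattern (prompt wording approximate; the bracketed values are placeholders):

```
Console> (enable) set password
Enter old password: <old_password>
Enter new password: <new_password>
Retype new password: <new_password>
```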
TIP When the Catalyst asks during the password recovery process what to use for the new
password, simply respond with <ENTER> too. Otherwise, trying to type in new passwords
sometimes leads to a need to reboot again, particularly if you are a poor typist. By initially
setting the password to the default value, you minimize your probability of entering a bad
value. After setting the enable and EXEC passwords to the default, you can at your leisure
go back and change the values without the pressure of completing the process during the
30 seconds provided for password recovery.
TIP As with many security situations, it is imperative that you consider physical security of your
boxes. As demonstrated in the password recovery process, an attacker simply needs the
ability to reboot the Catalyst and access to the console to get into the privileged mode. When
in the privileged mode, the attacker can make any changes that he desires. Keep your
Catalyst closets secured and minimize access to consoles.
TIP A security configuration issue of which you should be aware: change the SNMP
community strings from their default values. Example 4-6 shows the output from a Catalyst
configuration file where the SNMP community strings are still at their default value. A
system attacker might use SNMP to change your system. He starts with these common
default values. Make them difficult to guess, but remember that they are transmitted over
the network in clear text and are, therefore, snoopable.
TIP TFTP servers are inherently weak, security-wise. It is strongly recommended that you do
not keep your configuration files in a TFTP directory space until you actually need to
retrieve a file. System attackers who compromise your TFTP server can modify the
configuration files without your knowledge to provide a security opening the next time a
device downloads the configuration file from the server. Move your configuration files to
secure directory spaces and copy them back to the TFTP directory space when you are
ready to use them.
Although this adds another step to your recovery process, the security benefits frequently
outweigh the procedural disadvantages.
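On a UNIX-style TFTP server, the workflow might look like the following sketch. The directories and filename are placeholders, created here with mktemp purely for illustration:

```shell
SECURE_DIR=$(mktemp -d)   # stands in for a secured directory outside TFTP space
TFTP_DIR=$(mktemp -d)     # stands in for the TFTP server's directory space
printf 'set prompt Console>\n' > "$SECURE_DIR/cat.cfg"
chmod 700 "$SECURE_DIR"                 # keep the holding area restricted
cp "$SECURE_DIR/cat.cfg" "$TFTP_DIR/"   # stage the file only when ready to use it
# ... the Catalyst performs its 'configure network' download here ...
rm "$TFTP_DIR/cat.cfg"                  # remove the staged copy when finished
```

The master copy never leaves the restricted directory; only a short-lived copy is exposed in TFTP space.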
You retrieve a file from the server with the configure network command. When retrieving a
file, you need to specify the source filename on the TFTP server (see Example 4-13).
Example 4-13 Retrieving a Configuration File
Console> (enable) configure ?
Usage: configure <host> <file>
Console> (enable) configure network
IP address or name of remote host? 144.254.100.50
Name of configuration file? cat
Configure using cat from 144.254.100.50 (y/n) [n]? y
/
Finished network download. (6193 bytes)
[Truncated output would show the configuration lines]
Note that in the command usage output, the configure network option is not displayed.
However, it is a valid option to use.
If you store your configuration file in your Flash, you can recover it with the command copy
flash {tftp | file-id | config}. Again, any of the three destinations is possible.
If the file is on any other Flash device, use the command copy file-id {tftp | flash | file-id |
config}.
If you have the console port configured as a SLIP interface rather than a console, you can use
TFTP to transfer the image through the port.
TIP When using redundant uplink ports on a standby Supervisor, be sure that you do not configure
more than 20 VLANs in current versions of the software. Doing so can potentially create
Spanning Tree loops, increase convergence time, and decrease network stability.
There are a few things that must be true for the failover to function correctly:
1 The Catalyst 5000 Supervisor module must be a Supervisor Engine II or later. The
legacy Supervisor module that is associated with the Catalyst 5000 series does not
support the redundant features. In fact, the Supervisor I module does not even work
in a 5500 chassis. All Catalyst 6000 Supervisors support redundancy.
2 The active and the standby Supervisor engines must be the same model. For Catalyst
5500s, they must both be Supervisor II or both be Supervisor III. You cannot mix a
Supervisor II and a Supervisor III in the same unit. The reason for this will become
clear shortly.
3 If you use two Catalyst 5000 Supervisor III modules, the feature cards on the two
cards must be the same. If you have the NetFlow Feature Card (NFFC) card on one,
they must both have NFFC capability.
Redundant Supervisor Modules
The first three items you must administratively ensure. You must select the appropriate
hardware to support the redundancy feature. The Catalyst cannot do this for you.
However, the Catalyst can greatly help you regarding the last two items. The system code
automatically synchronizes software between the two Supervisor modules to ensure that
they are running the same files. This helps to ensure that, in the event of a failover, the
failover unit can support all configured features that were running when the first unit
was active. Most network engineers can deal with a failover situation and replace
a failed module. But when the network operational mode changes as a result of the failover
("oh no, everything now is in VLAN 1!"), they become very unhappy. So do users.
TIP Remember that any redundant Supervisor module you insert into the Catalyst acquires the
configuration of the operating active Supervisor module. Do not insert a module with an
"updated" configuration file and expect it to modify the active unit. You lose your updated file.
1 The active unit checks to see if the boot images have the same time stamp. If they
differ, the active Supervisor module updates the standby Supervisor module.
2 If you change the BOOT environment variable, the active Supervisor module copies
the new target boot image to the standby, if necessary, and modifies the environment
variable on the standby so that they start with the same image.
3 If you upgrade the boot image on the active unit, the active Supervisor module updates
the standby.
Notice that images and configurations flow from the active to the standby units, never the other
direction. You cannot update the image on the standby module first and then synchronize the
active module with the standby. The standby always synchronizes to the active.
When using Catalyst 5000 Supervisor Engine III modules and all Catalyst 6000 modules,
additional elements are compared between the active and standby engines. Specifically, the
active module compares not just the boot image, but also the run-time image. The run-time
image is the image used by the ROM monitor to boot the Supervisor module. If the run-
time image successfully loads, and there is a boot image and a BOOT environment variable
pointing to the boot image, the Catalyst also loads the boot image, which is your desired
operational image.
The Supervisor ensures that the run-time images are the same on the two modules. As with
the boot image, if they differ, the active unit synchronizes the two by downloading a new
run-time image. Therefore, the active Supervisor module performs the following to
determine if there is a need to synchronize:
1 The active unit checks the timestamp on the run-time image. If they differ, it initiates
the synchronization process.
2 If you overwrite the run-time image on the active unit, the active module synchronizes
the standby unit.
Review Questions
1 What happens if you replace the active Supervisor module?
2 If your redundant Supervisor engines are running software version 4.1, the uplink
ports on the standby engine are disabled until it becomes the active Supervisor. What
strategies might you need to employ to ensure that failover works for the uplinks?
3 Table 4-4 shows how to recall and edit a command from the history buffer. How would
you recall and edit the following command so that you move the ports from VLAN 3
to VLAN 4?
set vlan 3 2/1-10,3/12-21,6/1,5,7
4 What happens if you configure the Supervisor console port as sl0, and then you
directly attach a PC with a terminal emulator through the PC serial port?
5 The Catalyst 5500 supports LS 1010 ATM modules in the last 5 slots of the chassis.
Slot 13 of the 5500 is reserved for the LS1010 ATM Switch Processor (ASP). Can you
use the session command to configure the ASP?
6 The command-line interface has a default line length of 24 lines. How can you
confirm this?
This chapter covers the following key topics:
What Is a VLAN? - Provides a practical and technical definition of virtual LANs.
VLAN Types - Describes how Layer 2, 3, and 4 switching operate under a VLAN.
802.1Q: VLAN Interoperability - Describes the IEEE 802.1Q committee's effort to
develop a vendor-independent method to create virtual bridged local area networks
via shared VLAN learning (SVL).
Justifying the Need for VLANs - Describes how network security, broadcast
distribution, bandwidth utilization, network latency from routers, and complex access
lists justify the need for configuring VLANs. This section also details improper
motivations for setting up a VLAN.
Catalyst VLAN Configuration - Describes how to plan for, create, and view VLAN
configurations.
Moving Users in VLANs - Describes how VLANs simplify moving a user from one
location to another.
Protocol Filtering - Describes how to control flooding of unneeded protocols.
CHAPTER 5
VLANs
When the industry started to talk about virtual LANs (VLANs) in the trade journals and the
workplace, a lot of confusion arose. What exactly did they mean by VLAN? Authors
interpreted the new network terminology in ways that were not always consistent with one
another, much less in agreement. Vendors took varied approaches to creating VLANs,
which further muddled the understanding. This chapter presents definitions for VLANs as
used in the Catalyst world and explains how to configure VLANs. It also discusses reasons to
use and not use VLANs and attempts to clarify misinformation about them.
What Is a VLAN?
With many definitions for VLAN floating around, what exactly is it? The answer to this
question can be treated in two ways because there is a technical answer and a practical
answer. Technically, as set forth by the IEEE, VLANs define broadcast domains in a Layer 2
network. As demonstrated in Chapter 2, "Segmenting LANs," a broadcast domain is the
extent that a broadcast frame propagates through a network.
Legacy networks use router interfaces to define broadcast domain boundaries. The inherent
behavior of routers prevents broadcasts from propagating through the routed interface.
Hence, routers automatically create broadcast domains. Layer 2 switches, on the other hand,
create broadcast domains based upon the configuration of the switch. When you define the
broadcast domain in the switch, you tell the switch how far it can propagate the broadcast.
If the switch receives a broadcast on a port, what other ports are allowed to receive it?
Should it flood the broadcast to all ports or to only some ports?
Unlike legacy network drawings, you cannot look at a switched network diagram and know
where broadcast domains terminate. Figure 5-1 illustrates a legacy network where you can
clearly determine the termination points for broadcast domains. They exist at each router
interface. Two routers define three domains in this network. The bridge in the network
extends Broadcast Domain 2, but does not create a new broadcast domain.
Figure 5-1 [Diagram: a legacy network with Broadcast Domains 1, 2, and 3 bounded by router interfaces]
In the switched network of Figure 5-2, you cannot determine the broadcast domains by
simple examination. The stations might belong to the same or multiple broadcast domains.
You must examine configuration files in a VLAN environment to determine where
broadcast domains terminate. Without access to configuration files, you can determine the
broadcast domain extent with network analysis equipment, but it is a tedious process. How
to do this is left as a review question at the end of this chapter.
Figure 5-2 [Diagram: a switched network where broadcast domain boundaries are not apparent]
Just because you cannot easily see the broadcast domains in a switched network does not
mean that they do not exist. They exist where you define and enable them. Chapter 2
presented a discussion on switches and compared them to bridges. A switch is a multi-port
bridge that allows you to create multiple broadcast domains. Each broadcast domain is like
a distinct virtual bridge within the switch. You can define one or many virtual bridges within
the switch. Each virtual bridge you create in the switch defines a new broadcast domain
(VLAN). Traffic from one VLAN cannot pass directly to another VLAN (between
broadcast domains) within the switch. Layer 3 internetworking devices must interconnect
the VLANs. You should not interconnect the VLANs with a bridge. Using a bridge merges
the two VLANs into one giant VLAN. Rather, you must use routers or Layer 3 switches to
interconnect the VLANs. Figure 5-3 shows a logical representation of a switched network.
Each of the four switches belongs to two VLANs, and a total of three broadcast domains
are distributed across the switches.
Figure 5-3 [Diagram: virtual bridges within each switch; Broadcast Domains 1, 2, and 3 distributed across the four switches]
Rather than representing VLANs by designating the membership of each port, each switch
has an internal representation for each virtual bridge. This is not a common way of
illustrating a VLAN network, but it serves to highlight the internal configuration of a LAN
switch, where each bridge within the switch corresponds to a single VLAN.
VLAN Types
The IEEE defines VLANs as a group of devices participating in the same Layer 2 domain.
All devices that can communicate with each other without going through a router (using
only hubs/repeaters and bridges, real or virtual) share the broadcast domain.
The Layer 2 internetworking devices move frames through the broadcast domain by
examining the destination MAC address. Then, by comparing the destination address to a
table, the device can determine how to forward the frame towards the destination.
Some devices use other header information to determine how to move the frame. For example,
Layer 3 switches examine the destination and source IP address and forward frames between
broadcast domains when needed. Traditionally, routers perform Layer 3 switching. Frames
enter the router, the router chooses the best path to get to the destination, and the router then
forwards the frame to the next router hop as shown in Figure 5-4. The routing protocol that
you activate in the router determines the best path. A best path might be the fewest hops. Or,
it might be the set of highest bandwidth segments. Or, it might be a combination of metrics.
In Figure 5-4, only one choice exists to get from Station A to Station B.
Figure 5-4 [Diagram: Routers 1, 2, and 3 interconnecting shared segments between Station A and Station B]
When the frame enters Router 2, the router not only determines the next hop to move the
frame toward the destination, but it also performs a new Layer 2 encapsulation with a new
destination/source MAC address pair, performs some Layer 3 activities such as
decrementing the TTL value in an IP header, and calculates a new frame check sequence
(FCS) value. Router 3 performs a similar set of actions before sending the frame to Station
B. This is often called packet-by-packet switching.
The same process still occurs if you replace the shared wire segments in Figure 5-4 with a
Layer 2 switched network. Figure 5-5 illustrates a similar network using Layer 2 switches and
Layer 3 routers to interconnect the broadcast domains (VLANs). To get from Station A to B
in the switched network, the frame must pass through three routers. Further, the frame must
transit the link between Cat-C and Cat-D twice. Although this might be an exaggeration for
such a small network, this can frequently happen in larger scale networks. In an extreme case,
the frame can travel through the Layer 2 switched network multiple times as it passes from
router to router on its way to the destination.
Figure 5-5 [Diagram: Catalysts Cat-A through Cat-D and Routers 1, 2, and 3, with subnets 172.16.1.0 through 172.16.4.0 between Station A and Station B]
Layer 3 switching, on the other hand, circumvents the multiple entries and exits of the frame
through routers. By adding a NetFlow Feature Card and enabling Multilayer Switching (MLS)
in a Catalyst 5000 Supervisor module, a Catalyst 5000/5500 can rewrite a frame header like a
router does. This gives the appearance of the frame passing through a router, yet it eliminates
the need for a frame to actually pass in and out of a router interface. The Catalyst learns what
to do with the frame header by watching a locally attached router. MLS is discussed in more
detail in Chapter 11, "Layer 3 Switching." MLS creates a shortcut around each router as
shown in Figure 5-6. When multiple routers are in the system, multiple MLS shortcuts exist
between the source and destination devices. These shortcuts do not violate any Layer 3
routing rules because the NFFC does not perform any rewrites until the frames initially pass
through a router. Further, when it does create the shortcut, the NFFC rewrites the frame header
just as the router does.
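As a preview of Chapter 11, enabling MLS involves both the Catalyst (the switching engine) and the locally attached router (the route processor). The following sketch assumes a Catalyst 5000 with an NFFC and an MLS-capable router; the interface number and VTP domain name are illustrative:

```
! On the Catalyst (the MLS Switching Engine):
Console> (enable) set mls enable
! On the locally attached router (the MLS Route Processor):
Router(config)# mls rp ip
Router(config)# interface fastethernet 2/0.1
Router(config-subif)# mls rp vtp-domain Campus
Router(config-subif)# mls rp management-interface
Router(config-subif)# mls rp ip
```

Chapter 11 covers the full procedure and the conditions under which the NFFC creates shortcuts.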
Figure 5-6 [Diagram: MLS shortcuts (steps 1 through 5) bypassing Routers 1, 2, and 3 between Station A and Station B]
Another type of Layer 3 switching, Multiprotocol over ATM (MPOA), even eliminates the
need to repeatedly pass a frame through the switched cloud. Functionally, MPOA in ATM
equates to MLS in a frame network in that they both bypass routers. The routers in Figure
5-7 attach directly to an ATM cloud. Normally, when Station A wants to communicate with
Station B, frames must pass in and out of the routers just as they do in the basic routed
example of Figure 5-4. In Figure 5-7, the frames normally pass through the ATM cloud four
times to reach Station B. However, MPOA creates a shortcut between two devices residing
in different broadcast domains as shown in Figure 5-7. See Chapter 10, "Trunking with
Multiprotocol Over ATM," for more details.
Figure 5-7 [Diagram: an MPOA shortcut across the ATM cloud between Station A (172.16.1.1) and Station B, bypassing Cat-A through Cat-D]
Other VLAN types use combinations of Layer 2, Layer 3, or even Layer 4 to create
shortcuts in a system. Layer 4 switching creates shortcuts based upon the Layer 3 addresses
and upon the Layer 4 port values. This is sometimes called application switching and
provides a higher level of granularity for switching. Chapter 11 provides a more thorough
discussion on this subject in the context of MLS.
Table 5-1 summarizes the various switch types found in the industry.
802.1Q: VLANs and Vendor Interoperability
Figure 5-8 An SVL Capable Switch and a Router that Routes and Bridges
[Diagram: Switch-A with Station A on Port 1 in VLAN 1 (IP 172.16.1.0, NetBEUI), router interfaces R1 on Port 2 and R2 on Port 3, and Station B on Port 4 in VLAN 2 (IP 172.16.2.0, NetBEUI)]
When Station A transmits an IP frame to Station B, Station A transmits a frame with the router's
R1 MAC address as the destination and Station A's address as the source. The router routes the
frame to Station B using the router's R2 MAC address for the source and Station B's MAC
address for the destination. When Station B responds to Station A, the switch learns about
Station B on Port 4. Switch-A's bridge table looks like that shown in Table 5-2, Event 1. The
table shows the four ports on Switch-A (1, 2, 3, and 4) and the learned MAC addresses on each
of the ports. Ports 1 and 2 belong to VLAN 1, and Ports 3 and 4 belong to VLAN 2. The MAC
addresses are represented as A and B for the two workstations and R1 and R2 for the two router
interfaces. Switch-A knows about Station A's MAC address on Port 1 and the router interfaces'
R1 and R2 MAC addresses on Ports 2 and 3, respectively. When Station A transmits a
NetBEUI frame, the switch relearns Station A's MAC address on Port 1. When a router routes
a frame, it replaces the source MAC address with its own MAC address. But, when the router
bridges the frame out interface R2, the router does not replace the MAC address header and
passes the original source MAC address through. Therefore, Switch-A now sees a frame from
Station A on Port 3. This causes the switch to believe that Station A moved. The bridge table
now looks like Event 2 in Table 5-2. When Station B attempts to respond to Station A, the
switch forwards the frame to router interface R2. But when the router sends the frame out
interface R1, the switch does not forward the frame to Port 1. Rather, the switch filters the
frame because the switch believes that Station A is on Port 3, a different VLAN.
One final deficiency with 802.1Q concerns Spanning Tree. With 802.1Q, there is currently
only one instance of Spanning Tree. This forces all VLANs into the same Spanning Tree
topology, which might not be optimal for all VLANs. The Catalyst, for example, uses
multiple instances of Spanning Tree: one for each VLAN. This allows you to optimize the
topology for each VLAN. Part II of this book, "Spanning Tree," provides details on the
multiple instances of Spanning Tree.
The Catalyst does not use SVL tables. Rather, it uses Independent VLAN Learning (IVL),
which allows the same MAC address to appear in different broadcast domains. An IVL-
capable device maintains independent bridge tables for each VLAN, allowing devices to
reuse a MAC address in different VLANs. All of the Catalyst bridging examples in this
book illustrate IVL methods.
Figure 5-9 [Diagram: departmental segments, including Engineering]
One way to eliminate the problem is to move all accounting users onto the same segment. This
is not always possible because there might be space limitations preventing all accountants from
sharing a common part of the building. Another reason might be due to geography. Users on one
segment might be a considerable distance from users on the other segment. To move the users
to a common location might mean moving the employee's household from one city to another.
A second method is to replace accounting with marketing. Who really wants to look at
marketing data anyway, except for a good laugh? But accounting cannot distribute
paychecks, and marketing tries to get our money. Clearly, this is not a good solution.
A third approach is through the use of VLANs. VLANs enable you to place all process-related
users in the same broadcast domain and isolate them from users in other broadcast domains.
You can assign all accounting users to the same VLAN regardless of their physical location
in the facility. You no longer have to place them in a network based upon their location. You
can assign users to a VLAN based upon their function. Keep all of the accounting users on
one VLAN, the marketing users on another VLAN, and engineering in yet a third.
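On a CatOS Catalyst, this function-based assignment might be sketched as follows; the VLAN numbers and module/port ranges are illustrative:

```
Console> (enable) set vlan 10 name Accounting
Console> (enable) set vlan 20 name Marketing
Console> (enable) set vlan 30 name Engineering
Console> (enable) set vlan 10 2/1-8
Console> (enable) set vlan 20 2/9-16
Console> (enable) set vlan 30 3/1-8
```

The first three commands create the VLANs; the last three assign ports to them regardless of where the attached users sit.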
By creating VLANs with switched network devices, you create another level of protection.
Switches bridge trafc within a VLAN. When a station transmits, the frame goes to the
intended destination. As long as it is a known unicast frame, the switch does not distribute
the frame to all users in the VLAN (see Figure 5-10).
Figure 5-10 A Known Unicast Frame in a Switched Network
TIP If broadcasts are a problem in your network, you might mitigate the effect by creating
smaller broadcast domains. This was described in Chapter 2. In VLANs, this means
creating additional VLANs and attaching fewer devices to each. The effectiveness of this
action depends upon the source of the broadcast. If your broadcasts come from a localized
server, you might simply need to isolate the server in another domain. If your broadcasts
come from stations, creating multiple domains might help to reduce the number of
broadcasts in each domain.
NOTE Although each port of a Catalyst behaves like a port on a bridge, there is an exception. The
Catalyst family has group switch modules where ports on the module behave like a shared
hub. When devices attach to ports on this module, they share bandwidth like a legacy
network. Use this module when you have high density requirements, and where the devices
have low bandwidth requirements, yet need connectivity to a VLAN.
In most normal situations, then, a station only sees traffic destined specifically for it. The
switch filters most other background traffic in the network. This allows the workstation to
have full dedicated bandwidth for sending and receiving interesting traffic. Unlike a shared
hub system, where only one station can transmit at a time, the switched network in
Figure 5-11 allows many concurrent transmissions within a broadcast domain without
directly affecting other stations inside or outside of the broadcast domain. Station pairs A/B,
C/D, and E/F can all communicate with each other without affecting the other station pairs.
Figure 5-11 Concurrent Transmissions in a Catalyst
[Diagram: station pairs A/B, C/D, and E/F transmitting concurrently within VLAN 1]
Figure 5-12 [Diagram: legacy internetwork with Network A and Network B connected through an FDDI backbone and two Token Rings]
Figure 5-12 shows Station A attached to Network A in a legacy network. The user's highly
educated, capable, and rational manager decides that the user needs to move to another
location. As the network administrator, you inquire about the motivation for the move and
learn that, "It's none of your business," which is shorthand for, "I don't have anything better
to do than to reassign employee locations." Being a diligent network administrator, you
quickly recognize an important task, so you set about to plan the move.
prove to be feasible, the motivations still apply in smaller areas. That is why this section
covers the initial benefits of end-to-end VLANs.
What issues do you need to consider to support the move? Many issues, ranging from Layer
1 through Layer 7 of the OSI model. Ignore Layers 8 (financial), 9 (political), and 10
(religious) for now because these are not officially in the OSI definition.
Table 5-3 summarizes a number of issues that should concern you.
You must deal with all of these issues when you move users in a legacy network environment.
Layer 1 and Layer 2 issues can create some undesirable situations like forcing you to change
a user from Ethernet to Token Ring because the new location uses that access method. This
should cause you to worry about compatibility with the user's upper layer protocols. If you
need to attach to a different network type, you need to change the workstation's NIC and
associated drivers. You might think that the drivers are compatible with the upper layers, but
you might discover at 5:00 PM on a Friday evening that they are not.
Maybe you need to use fiber optics to reach from the new location to a hub because the
distance is too long. Or, you might use fiber because the cable runs through an electrically
noisy environment.
Possibly, new switches or hubs need to be installed to support the relocated user because all
of the other interfaces might be occupied. If you install a new repeater/hub, make sure that
you do not exceed the collision domain extent. If you install a new switch, you need to
configure the correct VLAN setting and any other appropriate parameters.
Although all layers create headaches at one time or another for network administrators,
Layer 3 creates irksome migraines. Layer 3 issues are even more complex because they
frequently involve changes in equipment configuration files. When you move the user, he
might attach to an entirely different logical network than the one where he was originally. This
creates a large set of potential actions on your part. For example, because the user now
attaches to a different network, you need to modify his host address. Some of this pain is
lessened through the use of Dynamic Host Configuration Protocol (DHCP) to automatically
acquire an IP address. This works even when moving a user from one VLAN to another.
Even more annoying, you might need to modify any policy-based devices to allow the new
address to reach the same services as prior to the move. For example, you might need to
modify access lists in routers to enable the station's frames to transit the network to reach a
file server. Remember that routers evaluate access lists from the top of the list to the bottom
and stop whenever a match occurs. This means that you need to be sure to place the new
entry in the correct location in the access list so that it is correctly evaluated. If any firewalls
exist between the station and its resources, you need to ensure that the firewall's settings
permit access to all desired resources.
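For example, with a Cisco IOS extended access list (all addresses here are hypothetical), the relocated station's new entry must appear before any broader statement that would otherwise match first and stop the evaluation:

```
! Permit the relocated station (now 172.16.4.25) to reach the file server
! before the broader deny for its new subnet is evaluated.
access-list 101 permit ip host 172.16.4.25 host 172.16.10.5
access-list 101 deny   ip 172.16.4.0 0.0.0.255 host 172.16.10.5
access-list 101 permit ip any any
```

If the permit line were placed after the deny, the station's traffic would match the deny first and never reach the server.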
Yet another item you must consider involves a combination of higher and lower layers.
What bandwidth do the user's applications require? Can the network provide the same
bandwidth for the paths that the frames must now transit? If not, you might have some
serious network redesign in front of you.
VLANs do not eliminate Layer 1 or Layer 2 issues. You still need to worry about port
availability, media and access types, and the distance from the station to the switch.
You still need to worry about higher layer issues such as bandwidth to resources. The
switched network cannot implicitly guarantee bandwidth. It does, however, offer you
flexible alternatives to install more bandwidth between switches without redesigning a
whole network infrastructure. For example, you can install more links between Catalysts,
or you can move to higher speed links. (Inter-Catalyst connection options are reviewed in
Chapter 8, "Trunking Technologies and Applications.") Upgrading to a higher speed link
does not force you to install a new access method. You can upgrade from a 10 Mbps to a
Fast Ethernet or Gigabit Ethernet solution fairly easily and transparently to users.
Obviously, similar solutions are available in routers too, but you might not be able to obtain
the port density that you want to service many stations.
VLANs do not directly help mitigate lower layer or higher layer difficulties in a legacy
LAN. Other than the possibility of user stations experiencing more bandwidth with
switched VLAN equipment, why use VLANs? Here is the good news: in a VLAN
environment, Layer 3 issues no longer need to be a concern as they were in legacy network
designs. When moving the user in Figure 5-13, you can configure the switch port at the new
location to belong to the same VLAN as at the old location. This allows the user to remain
in the same broadcast domain. Because the user belongs to the same broadcast domain, the
routers and firewalls view the user as belonging to the same network even though a new
physical location is involved. This eliminates the need to perform any Layer 3 tasks such as
changing host addresses for the new location and leaves access list and firewall
configurations intact.
The VLAN approach just described is sometimes called end-to-end VLANs, VLAN
everywhere, or the distributed VLAN design method. It has the clear advantage of allowing
you to keep a user in the same broadcast domain regardless of the physical location. As
good as it seems to take this approach, it does have disadvantages. (Alas, nothing is ever as
good as it seems.) Issues arise whenever the network grows in extent. As you add more
Catalysts to the system, you add more bridges, which increases the Spanning Tree topology
complexity. This was mentioned in the previous section.
Figure 5-14 [Diagram: backbone, distribution, and access layers, with access networks 172.16.1.0 and 172.16.2.0]
Part V of this book describes VLAN design philosophies. One approach, the Layer 3
distribution design, minimizes the Spanning Tree extent and topology because the
Spanning Tree is constrained to the pockets of access devices. Access pockets can be placed
on floors as in Figure 5-15. Each floor has its own access network. Users on the floor share
the access network regardless of their community of interest. Engineering and accounting
might share the VLAN. If necessary, the access network can be divided into a couple of
VLANs to provide additional isolation between users or departments. Further, this approach
enables load balancing, which is not easily obtainable in a Layer 2 design. These advantages
lead many network engineers to avoid the end-to-end VLAN approach in favor of the
Layer 3 design approach.
Figure 5-15 [Diagram: per-floor access networks on Floors 3 through 6]
Historically, network approaches swayed from Layer 2 to Layer 3, back to Layer 2, and now
back to Layer 3. The earliest networks were by default Layer 2. At some point in history,
someone realized that these networks didn't scale very well and wanted to connect Layer 2
segments together. So routers were invented. Soon, the whole world deployed routers. But
because routers were slow, designers started to look at high-performance bridges to
interconnect the networks on a large scale. This was the advent of the Layer 2 switching
products during the late 1980s to early 1990s. Until recently, Layer 2 switching plans
dominated new network designs. Then came the realization that large scale Layer 2 networks
created other problems, and router speeds have increased dramatically since the early 1990s.
Engineers reexamined Layer 3 approaches for the backbone and distribution networks and
now tend to consider Layer 3 designs a more desirable approach. A poorly implemented
Layer 3 design can, however, restore the disadvantages of Layer 3 complexity found in
legacy networks.
Planning VLANs
Before you enable new VLANs, make sure that you know what you really want to do and how
your actions can affect other VLANs or stations already present in your system. The planning
at this stage primarily focuses on Layer 3 issues. What networks need to be supported
in the VLAN? Is there more than one protocol that you want in the VLAN? Because each
VLAN corresponds to a broadcast domain, you can support multiple protocols in the VLAN.
However, you should only have one network for each protocol in the VLAN.
A multiswitch system like that shown in Figure 5-16 can have several VLANs.
Figure 5-16 A Typical VLAN Network Address Deployment
[Diagram: switches carrying VLANs 100, 200, and 300 and VLANs 200 and 300, respectively]
Each VLAN in Figure 5-16 supports multiple protocols. For the networks to communicate with
each other, they must pass through a router. The router on a stick in Figure 5-16 interconnects
several networks together. Example 5-1 shows a configuration file for this router.
Catalyst VLAN Configuration 137
Example 5-1 sets up a trunk between a device and the router. Trunks and Inter-Switch Link
(ISL) encapsulation are discussed in more detail in Chapter 8. Trunks allow traffic from
more than one VLAN to communicate over one physical link. The encapsulation isl
command in Example 5-1 instructs the router to use ISL trunking to communicate with the
broadcast domain for each subinterface. Note that the router configuration uses
subinterfaces. Normally, in a router, you assign a single address per protocol on an
interface. However, when you want to use a single physical interface in a way that, to the
routing protocols, appears as multiple interfaces, you can use subinterfaces; for example,
when creating a trunk between a Catalyst and a router, as in Example 5-1. The router needs
to identify a separate broadcast domain for each VLAN on the trunk.
Cisco routers use a concept of subinterfaces to spoof the router into thinking that a physical
interface is actually more than one physical interface. Each subinterface identifies a new
broadcast domain on the physical interface and can belong to its own IP network, even
though they all actually belong to the same major interface. The router configured in
Example 5-1 uses three subinterfaces, making the router think that the one physical interface
(the major interface), interface fastethernet 2/0, is actually three physical interfaces, and
therefore, three broadcast domains. Each belongs to a different IP subnet. You can
recognize a subinterface on a Cisco router because the major interface designator has a .x
after it. For example, subinterface 3 is identified in Example 5-1 as int fastethernet 2/0.3,
where .3 designates the specific subinterface for the major interface.
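To make the subinterface idea concrete, the following sketch shows what a router-on-a-stick configuration of this type might look like. This is not the book's actual Example 5-1; the VLAN-to-subnet mapping is an assumption based on the addresses shown in Figure 5-16. Note that IPX network 300 is deliberately absent from every subinterface, which is what isolates it.

```
! Major interface carries the ISL trunk; no address of its own
interface fastethernet 2/0
 no ip address
!
! One subinterface (broadcast domain) per VLAN on the trunk
interface fastethernet 2/0.1
 encapsulation isl 100
 ip address 172.16.10.1 255.255.255.0
 ipx network 100
!
interface fastethernet 2/0.2
 encapsulation isl 200
 ip address 172.16.20.1 255.255.255.0
 ipx network 200
!
interface fastethernet 2/0.3
 encapsulation isl 300
 ip address 172.16.30.1 255.255.255.0
 ! no ipx network statement here, so IPX network 300 remains isolated
```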
The subinterface concept arises again when configuring LANE and MPOA on routers and
on the Catalyst ATM modules.
In Example 5-1, which network is isolated from the others? IPX network 300 is isolated
because the router does not have this network defined on any of its interfaces.
At times, a physical network configuration can confuse you. A common question we hear in
class or consulting situations is, "Can I do this with a VLAN?" Frequently, an answer can be
devised by representing the VLANs in a logical configuration. Figure 5-16 shows a physical
network; Figure 5-17 shows the same network, but redrawn to show the logical connectivity.
The drawing replaces each VLAN with a wire representation labeled with the networks assigned
to the VLAN. This more conventional representation helps when trying to design and deploy a
VLAN, because it places networks and components into their logical relationship.
Figure 5-17 (figure labels: 172.16.10.1, 172.16.20.1, 172.16.30.1; IPX NET 100, IPX NET 200)
Figure 5-18 shows another network configuration where two VLANs carry the same IP
network. Nothing prevents this configuration, and it is valid. It is not, however, recommended
for most networks. This configuration represents the same subnet on two logical wires.
Figure 5-18 Overlapping IP Networks
The network could be redrawn as in Figure 5-19, where clearly there are two isolated broadcast
domains. As long as you do not attempt to interconnect them with a router, this configuration is
completely valid. Why can they not be connected with a router? Because this forces a router to
have two interfaces that belong to the same IP subnetwork. The router does not let you do this.
Figure 5-19 Logical Representation of Figure 5-18
You might also believe the same regarding VLAN 200. However, the physical representation
of Figure 5-18 makes it clear that the two VLANs share a link between the switches. This
must be a shared bandwidth link, which is not obvious from the logical representation.
TIP Use the logical representation to plan and troubleshoot Layer 3 issues and use the physical
drawings to determine Layer 2-related issues.
Creating VLANs
Creating a VLAN involves the following steps:
Step 1 Assign the Catalyst to a VTP domain
Step 2 Create the VLAN
Step 3 Associate ports to a VLAN
To facilitate creation, deletion, and management of VLANs in Catalysts, Cisco developed
a protocol called VLAN Trunking Protocol (VTP). Chapter 12, "VLAN Trunking
Protocol," covers VTP in more detail. However, a brief introduction is necessary here. You
can divide a large Catalyst network into VTP management domains to ease some
configuration and management tasks. Management domains are loosely analogous to
autonomous systems in a routed network where a group of devices share some common
attributes. Catalysts share VLAN information with each other within a VTP domain. A
Catalyst must belong to a VTP domain before you can create a VLAN. You cannot create a
VLAN on just any Catalyst: the Catalyst must be configured in either the server or transparent
mode to create the VLAN. By default, the Catalyst operates in the server mode. These
modes and the command details to set them are described in Chapter 12.
You can configure a Catalyst's VTP domain membership with the set vtp domain name
command. Each domain is uniquely identified by a text string. Note that the name is case
sensitive. Therefore, a domain name of Cisco is not the same as cisco. Other rules about
VTP domains that you need to consider are also detailed in Chapter 12.
Whenever you create or delete a VLAN, VTP transmits the VLAN status to the other
Catalysts in the VTP domain. If the receiving Catalyst is configured as
a server or client, it uses this information to automatically modify its VLAN list. This saves
you the task of repeating the command to create the same VLAN in all participating
Catalysts within the domain. Create the VLAN in one Catalyst, and all Catalysts in the
domain automatically learn about the new VLAN. The exception to the rule occurs if the
receiving Catalyst is in transparent mode. In this case, the receiving Catalyst ignores the
VTP announcement. Transparent Catalysts only use locally configured information.
After the Catalyst belongs to a named VTP domain, you can create a VLAN. Use the set
vlan command to create a VLAN in a Catalyst.
Example 5-2 shows three attempts to create VLAN 2. Note that in the second attempt the
Catalyst fails to create the VLAN as indicated in the bolded line of the output. It fails
because the Catalyst was not assigned to a VTP management domain. Only after the
Catalyst is assigned to a VTP domain is the Catalyst able to create the VLAN. What is the
domain name that the Catalyst belongs to? The Catalyst belongs to the VTP domain wally.
Example 5-2 set vlan Screen Example
Console> (enable) set vlan 2 willitwork
Usage: set vlan <vlan_num> [name <name>] [type <type>] [state <state>]
[said <said>] [mtu <mtu>] [ring <ring_number>]
[bridge <bridge_number>] [parent <vlan_num>]
[mode <bridge_mode>] [stp <stp_type>]
[translation <vlan_num>] [backupcrf <off|on>]
[aremaxhop <hopcount>] [stemaxhop <hopcount>]
(name = 1..32 characters, state = (active, suspend)
type = (ethernet, fddi, fddinet, trcrf, trbrf)
said = 1..4294967294, mtu = 576..18190, ring_number = 0x1..0xfff
bridge_number = 0x1..0xf, parent = 2..1005, mode = (srt, srb)
stp = (ieee, ibm, auto), translation = 1..1005
hopcount = 1..13)
Console> (enable) set vlan 2 name willitwork
Cannot add/modify VLANs on a VTP server without a domain name.
Console> (enable)
Console> (enable) set vtp domain wally
VTP domain wally modified
Console> (enable) set vlan 2 willitwork
Vlan 2 configuration successful
Console> (enable)
Note that the usage information indicates that the minimum input necessary to create a
VLAN is the VLAN number. Optionally, you can specify a VLAN name, type, and other
parameters. Many of the other parameters configure the Catalyst for Token Ring or FDDI
VLANs. If you do not specify a VLAN name, the Catalyst assigns the name VLAN#. If you
do not specify a VLAN type, the Catalyst assumes that you are configuring an Ethernet
VLAN. Assigning a name does not change the performance of the Catalyst or VLAN. If
used well, it enables you to document the VLAN by reminding yourself what the VLAN is
for. Use meaningful names to document the VLAN. This helps you with troubleshooting
and configuration tasks.
After you create a VLAN, you can assign ports to the VLAN. Assigning ports to a VLAN
uses the same command as for creating the VLAN. Example 5-3 shows an attempt to assign
a block of ports to VLAN 2. Unfortunately, the command is entered incorrectly the first
time. What is wrong with the command? The set vlan command fails in the first case
because the range specifies a non-existent interface on the Supervisor module: 1/8 indicates
the eighth port on the Supervisor.
After the port designation is corrected, the Catalyst successfully reassigns the block of ports to
VLAN 2. When designating ports, remember that you can assign a block by using hyphens and
commas. Do not insert any spaces between the ports on the line; otherwise, the Catalyst stops
parsing the command at the space, leaving you with only some of the ports assigned to the VLAN.
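As a sketch of the syntax (the module and port numbers here are arbitrary assumptions, and the output is approximate):

```
Console> (enable) set vlan 2 2/1-8,2/12
VLAN 2 modified.
VLAN 1 modified.
VLAN  Mod/Ports
----  -----------------------
2     2/1-8,2/12
```

The hyphen designates a contiguous range, and the comma adds individual ports to the list.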
NOTE In many instances where administrators install Catalysts, legacy hubs already exist. You might
have network areas where stations do not need the dedicated bandwidth of a switch port and
can easily share bandwidth with other devices. To provide more bandwidth, you can elect to
not attach as many devices as were originally attached, and then attach the hub to a Catalyst
interface. Be sure to remember, though, that all of the devices on that hub belong to the same
Layer 2 VLAN because they all ultimately attach to the same Catalyst port.
Deleting VLANs
You can remove VLANs from the management domain using the clear vlan vlan_number
command. For example, if you want to remove VLAN 5 from your VTP management
domain, you can type the command clear vlan 5 on a Catalyst configured as a VTP server.
You cannot delete VLANs from a VTP client Catalyst. If the Catalyst is configured in
transparent mode, you can delete the VLAN. However, the VLAN is removed only from the
one Catalyst and is not deleted throughout the management domain. All VLAN creations
and deletions are only locally significant on a transparent Catalyst.
When you attempt to delete the VLAN, the Catalyst warns you that all ports belonging to
the VLAN in the management domain will move into a disabled state. If you have 50
devices as members of the VLAN when you delete it, all 50 stations become isolated
because their local Catalyst port becomes disabled. If you recreate the VLAN, the ports
become active again because the Catalyst remembers what VLAN you want the port to
belong to. If the VLAN exists, the ports become active. If the VLAN does not exist, the
ports become inactive. This could be catastrophic if you accidentally eliminate a VLAN
that still has active users on it.
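As a hedged illustration (the exact prompt wording may differ by software release), the deletion warning looks something like this:

```
Console> (enable) clear vlan 5
This command will deactivate all ports on vlan 5
in the entire management domain
Do you want to continue(y/n) [n]? y
Vlan 5 deleted
```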
Also, realize that if you have a VTP management domain where you have most of your
Catalysts configured as VTP servers and clients with a few Catalysts configured in
transparent mode, you can inadvertently cause another situation when you delete a VLAN
in the transparent device while the VLAN exists throughout the management domain. For
example, suppose you have three Catalysts in a row with Cat-A configured in server mode,
Cat-B configured in transparent mode, and Cat-C configured in client or server mode. Each
Catalyst has a member of VLAN 10, so you create the VLAN on Cat-B, and you create it
on Cat-A (Cat-C acquires the VLAN information from Cat-A as a result of VTP). From a
Spanning Tree point of view, you have one Spanning Tree domain, and therefore, one
Spanning Tree Root Bridge. But suppose that you decide you no longer need VLAN 10 on
Cat-B, because there are no longer members of the VLAN. So, you delete the VLAN with
the clear vlan 10 command. From a VLAN point of view, this is perfectly acceptable.
However, from a Spanning Tree point of view, you have now created two Spanning Tree
domains. Because Cat-B no longer participates in the VLAN, it no longer contributes to the
Spanning Tree for that VLAN. Therefore, Cat-A and Cat-C each become a Root Bridge for
VLAN 10 in each of their Spanning Tree domains.
Although Spanning Tree reconverges as a result of the apparent topology change, users in
VLAN 10 cannot communicate with each other until the Spanning Tree topology finally
places ports into the forwarding state.
TIP When deleting VLANs from a management domain, whether it is on a Catalyst configured
in server or transparent mode, be sure that you consider how you can affect the network.
You have the possibility of isolating a lot of users and of disrupting Spanning Tree in a
network.
VLAN Type SAID MTU Parent RingNo BrdgNo Stp BrdgMode Trans1 Trans2
---- ----- ---------- ----- ------ ------ ------ ---- -------- ------ ------
1 enet 100001 1500 - - - - - 0 0
2 enet 100002 1500 - - - - - 0 0
1002 fddi 101002 1500 - 0x0 - - - 0 0
1003 trcrf 101003 1500 0 0x0 - - - 0 0
1004 fdnet 101004 1500 - - 0x0 ieee - 0 0
1005 trbrf 101005 1500 - - 0x0 ibm - 0 0
VLAN 2, created in the previous section, is highlighted in the output of Example 5-4. The
show vlan output is divided into three portions. The first portion shows the VLAN number,
name, status, and ports assigned to the VLAN. This provides a quick evaluation of the
condition of a VLAN within the Catalyst. The second portion displays the VLAN type and
other parameters relevant to the VLAN, for example, the MTU size. The other columns
display information for Token Ring and FDDI VLANs. The third portion of the output
displays further information for source-routed VLANs.
Note that there are several VLANs present in the output. All of the entries in the display,
except for VLAN 2, show default VLANs, which are always present in the Catalyst. These
default VLANs cannot be removed from the Catalyst. When you first power a new Catalyst,
all Ethernet interfaces belong to VLAN 1. Also, the Supervisor module sc0 or sl0 interface
belongs to this VLAN by default. If you interconnect several Catalysts, each populated with
Ethernet modules and with only default configurations, all of the Catalyst interfaces belong
to the same VLAN. You have one giant broadcast domain.
The first two steps globally affect Catalysts. When you create a VLAN, VTP announces the
addition or deletion of the VLAN throughout the VTP domain. Assigning ports to a VLAN,
however, is a local event. VTP does not announce what ports belong to which VLAN. You
must log in to the Catalyst where you want to assign ports. After you assign the port to a
VLAN, any device attached to the port belongs to the assigned VLAN. (The exception to
this is the port security feature that allows one and only one MAC address on the port to
belong to the VLAN.) When you attach a station to a port on the Catalyst, you need to
ensure that the port belongs to the correct VLAN. Unfortunately, you might not always have
access to the CLI to make a change. Or, you might have users who frequently relocate
within their facilities environment. But you do not want them to bother you every time they
relocate a station, especially when it happens after midnight or during a weekend.
Cisco built a feature into the Catalyst to facilitate dynamic port configurations. The
dynamic VLAN feature automatically configures a port to a VLAN based upon the MAC
address of the device attached to the port, as shown in the following sequence:
Step 1 When you attach a device to the port and the device transmits a frame,
the Catalyst learns the source MAC address.
Step 2 The Catalyst then interrogates a VLAN membership policy server
(VMPS). The VMPS server has a database of MAC addresses and the
authorized VLAN for each MAC address.
Step 3 The VMPS responds to the client Catalyst with the authorized VLAN.
Step 4 The VMPS client Catalyst configures the port to the correct VLAN based
upon the information received from the VMPS.
The bulk of your work as the network administrator is to initially build the database. After
you build the database, you (or your users) do not have to statically configure a Catalyst
every time that a device moves from one port to another.
This feature also provides a level of security, because the MAC address of the user's device must
be in the database before the Catalyst assigns the port to a VLAN. If the MAC address is not in
the database, the Catalyst can refuse the connection or assign the user to a default VLAN.
Three components enable a dynamic VLAN environment. First, you must have a TFTP
server. The VMPS database resides as a text file on the TFTP server. The second
component, the VMPS server, reads the database from the TFTP server and locally
remembers all of the data. Dynamic VLAN clients interrogate the VMPS whenever a device
attaches to a port on the Catalyst. You can configure up to two backup VMPS servers. The
third component, the VMPS client, communicates with the VMPS server using UDP
transport and a socket value of 1589. This is a well-known protocol value registered with
the Internet Assigned Numbers Authority (IANA) as VQP (VMPS Query Protocol).
Figure 5-20 illustrates the relationship between the components. Cat-A serves as the
primary VMPS server, with two other Catalysts also enabled as backup VMPS servers. The
section on configuring the VMPS client details how to identify primary and backup VMPS
servers. The VMPS server (Cat-A) accesses the TFTP server when you initially enable the
VMPS server, or whenever you manually force the VMPS to download a new configuration
table. The VMPS server must have an IP address and it might need a default route to the
TFTP server for the VMPS server to initialize. The VMPS server needs a default route if
the VMPS and TFTP servers reside on different subnets/VLANs.
Cat-B and Cat-C are each configured as VMPS clients and get port-to-VLAN authorizations
from the VMPS server. Therefore, they need to be able to communicate with the VMPS server.
Figure 5-20 Dynamic VLAN Architecture (figure labels: VMPS Client Cat-B at 172.16.1.2; VMPS Client Cat-C at 172.16.1.3)
The following list outlines the steps for configuring dynamic VLANs:
Step 1 Build the VLAN database and load it into a TFTP server.
The sections that follow provide more detail on this seven-step sequence for configuring
dynamic VLANs.
fallback, the port remains unassigned. If you set the mode to secure, the VMPS server
instructs the VMPS client to shut down the interface instead of leaving it unassigned. An
unassigned port can keep trying to obtain an assignment through repeated requests. A shutdown
port stays that way until you enable it.
The fallback VLAN is like a miscellaneous VLAN. If the database does not have an entry for the
MAC address, the VMPS server assigns the device to the fallback VLAN, if one is configured.
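The database itself is a plain text file. A minimal sketch of its layout follows; the domain name, fallback VLAN name, and MAC-to-VLAN mappings here are invented for illustration, so verify the full syntax against your Catalyst documentation:

```
! VMPS database sketch (illustrative values only)
vmps domain wally
vmps mode open
vmps fallback misc_vlan
!
!MAC address to VLAN-name mappings
vmps-mac-addrs
address 0010.4b21.7d2e vlan-name willitwork
address 0060.08a4.1c5b vlan-name willitwork
```

Note that the domain name in the file must match the VTP domain name, and the mappings refer to VLANs by name rather than by number.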
Two commands configure the VMPS server: set vmps tftpserver ip_addr [filename] and
set vmps state enable. The first command points the VMPS server to the TFTP server and
optionally specifies the database filename. If you do not specify a filename, the VMPS tries
the filename vmps-config-database.1. Use the command set vmps tftpserver ip_addr
[filename] to inform the VMPS server of the TFTP server's IP address and the VMPS
database filename to request.
After you configure the TFTP server information, you can enable the VMPS server with the
command set vmps state enable. At this point, the VMPS server attempts to download the
VMPS database from the TFTP server.
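Taken together, the server-side sequence looks like this sketch (the TFTP server address is an assumption; output omitted):

```
Console> (enable) set vmps tftpserver 172.16.1.4 vmps-config-database.1
Console> (enable) set vmps state enable
```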
If at some point after you enable the server you modify the VMPS database on the TFTP server,
you can force the VMPS server to acquire the new database with the command download vmps.
You can check the status of the VMPS server with the command show vmps. This command
reports all of the current configuration information for the server, as shown in Example 5-6.
The show vmps command works for both the VMPS server and client. The top half of the
display shows the server configuration information, and the bottom half displays client values.
If you have trouble getting the VMPS server operational, use this command to view a summary
of the parameters. In particular, check that the VMPS domain name matches the VTP domain
name. State is either enabled or disabled. You should see enabled if you entered the set vmps
state enable command. Check the operational status. This displays either active, inactive, or
downloading. The downloading status implies that the VMPS server is retrieving the VMPS
database from the TFTP server. The inactive status means that the VMPS server tried to get the
database, but failed and became inactive. Finally, check the database filename and ensure that
the Catalyst can reach the server, that the file exists, and that it is a VMPS file.
Cisco has two optional tools for the VMPS database: the User Registration Tool (URT)
and the User Tracker for CiscoWorks for Switched Internetworks (CWSI). The tools help
with the creation of the database and allow you to place the VMPS server in a non-Catalyst
device. The sections that follow provide additional information on these two tools.
URT
Cisco's User Registration Tool (URT) allows you to have a VLAN membership database
built based upon a user's Windows NT login information rather than based upon a MAC
address. You can only use URT with Windows 95/98 and Windows NT 4 clients running
Microsoft Networking (NetBIOS or Client for Microsoft Networks) over TCP/IP
using the Dynamic Host Configuration Protocol (DHCP). URT does not support other operating
systems or network layer protocols. You must manually load a URT client package on the
NT 4 clients/servers so it can interact with the URT server. However, Windows 95/98 clients
automatically install the URT client service from their NT domain controller.
URT sets up an NT 4 database and behaves like a VMPS server. You still need to enable
Catalysts as VMPS clients pointing to the NT server with the URT database.
Managing the URT server requires CWSI 2.1, as it interacts with the CWSI 2.1 ANI server
to define workstation relationships to VLANs.
To configure ports as dynamic, use the command set port membership mod_num/
port_num dynamic. You cannot make a trunk port a dynamic port. You must first turn off
trunking before you set port membership to dynamic. Nor can you set a secure port to
dynamic. If you have port security enabled, you must disable it before you set the port to dynamic.
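On the client side, the sequence might look like the following sketch. The module/port range and server addresses are assumptions; the set vmps server command identifies the primary and backup VMPS servers to the client:

```
Console> (enable) set vmps server 172.16.1.1 primary
Console> (enable) set vmps server 172.16.1.2
Console> (enable) set port membership 3/1-24 dynamic
```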
After you enter the set port membership command, the Catalyst attempts to communicate
with the VMPS server using VQP when the attached device initially transmits. If the client
successfully communicates with the server, the server responds in one of four ways:
Assigns the port to an authorized VLAN
Assigns the port to a fallback VLAN
Denies access
Disables the port
If the VMPS server finds an entry for the MAC address in the VMPS database, the server
responds with the authorized VLAN for that device. The VMPS client enables the port and
configures the port to the correct VLAN. If the VMPS server does not find the MAC address
in the database, it assigns the device to the fallback VLAN if you set one up in the database.
If you do not have a fallback specified, the VMPS server responds with instructions to deny
access or shut down the interface, depending upon the VMPS security setting. Deny access
differs from shutdown in that deny allows devices to try again (the behavior if the security
option is disabled), whereas shutdown literally shuts down the port and prevents any further
attempts to dynamically assign the port (the default behavior if the security option is
enabled).
You can have multiple hosts on the dynamic port; however, all hosts must be authorized for
the same VLAN, and you cannot have more than 50 hosts on the port.
Note that a Catalyst does not initiate a VQP to the server until the device attached to the
port transmits. When the local Catalyst sees the source MAC address, it can generate a
request to the VMPS server. If you use the show port command, you can determine what
VLAN a port is assigned to. Dynamic ports have a VLAN nomenclature of dyn- as shown
in Example 5-7.
Note the entry for Port 1/1. It has a dynamic VLAN assignment. But the highlighted Port
3/1 is a dynamic port without a VLAN assignment. The Catalyst does not forward any
frames from the host attached to this port. When you first attach a host to the port, the
Catalyst does not know the source MAC address and automatically configures the port in
this mode.
After the host transmits and the VMPS client receives a valid response from the VMPS
server, the VMPS client Catalyst enables the interface in the correct VLAN. If the client sits
idle for a while, causing the bridge aging timer to expire for the entry, the Catalyst returns
the port to an unassigned state. The VMPS client issues a new query to the VMPS server
when the host transmits again.
Confirm the VMPS client configuration with the show vmps command, as was shown in
Example 5-6. The bottom half of this output shows the client settings. The reconfirm
interval defines how often the client interrogates the VMPS server to see if a policy changed
for locally attached hosts. In Example 5-6, the interval is every 20 minutes. The Server
Retry Count, in this case three, specifies how many times the VMPS client should try to
reach the VMPS server. If it fails to receive a response from the server after three attempts,
the client attempts to reach one of the backup servers. Finally, the output shows the IP
address of the VMPS server that the client is attempting to use, 172.16.1.1.
Protocol Filtering
A switch forwards traffic within a broadcast domain based upon the destination MAC
address. The switch filters, forwards, or floods the frame depending upon whether or not
the switch knows about the destination in its address table. The switch normally does not
look at any Layer 3 information (or Layer 2 protocol type) to decide how to treat the frame
(MLS and MPOA are exceptions). Refer to Figure 5-21 for an example of the Catalyst
blocking traffic based upon the protocol.
Figure 5-21 (figure labels: stations A and B; protocols IP and IPX)
If Station A in Figure 5-21 sends a frame to Station B, the switch forwards the frame, even
if Station B does not share the same Layer 3 protocol as Station A. This is an unusual
situation. Suppose, however, that the VLAN contains stations with a mix of protocols in
use. Some stations use IP, some use IPX, and others might even have a mix of protocols. If
a switch needs to flood an IP frame, it floods it out all ports in the VLAN, even if the
attached station does not support the frame's protocol. This is the nature of a broadcast
domain.
A Catalyst 5000 equipped with a NetFlow Feature Card and a Supervisor III engine, as well
as many other Catalysts, can override this behavior with protocol filtering. Protocol filtering
works on Ethernet, Fast Ethernet, or Gigabit Ethernet non-trunking interfaces. Protocol
filtering prevents the Catalyst from flooding frames from a protocol if there are no stations
on the destination port that use that protocol. For example, if you have a VLAN with a mix
of IP and IPX protocols, any flooded traffic appears on all ports in the VLAN. Protocol
filtering prevents the Catalyst from flooding traffic from a protocol if the destination port
does not use that protocol. The Catalyst listens for active protocols on an interface. Only
when it sees an active protocol does it flood traffic from that protocol. In Figure 5-21, there
is a mix of protocols in the VLAN. Some of the stations in the network support only one
protocol, either IP or IPX. Some of the stations support both. The Catalyst learns that
Station A uses IP, Station B uses IPX, and Station C uses both by examining the Layer 2
protocol type value. When Station A creates an IP broadcast, Station B does not see the
frame, only Station C does. Likewise, if Station B creates a frame for the switch to flood, the
frame does not appear on Station A's interface because this is an IP-only interface.
The Catalyst enables and disables protocols in groups. They are the following:
IP
IPX
AppleTalk, DECnet, and Vines
All others
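A hedged sketch of the CatOS commands involved follows; treat the exact syntax as an assumption and verify it against your software release before relying on it:

```
! Enable protocol filtering globally, then tune per-port behavior
Console> (enable) set protocolfilter enable
Console> (enable) set port protocol 3/1 ip on
Console> (enable) set port protocol 3/1 ipx auto
Console> (enable) set port protocol 3/1 group off
```

Here on always floods the protocol's traffic to the port, off never does, and auto lets the Catalyst decide based on the protocols it hears from the attached station.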
Review Questions
This section includes a variety of questions on the topic of campus design implementation.
By completing these, you can test your mastery of the material included in this chapter as
well as help prepare yourself for the CCIE written test.
1 Early in this chapter, it was mentioned that you can determine the extent of a broadcast
domain in a switched network without configuration files. How do you do it?
2 Two Catalysts interconnect stations as shown in Figure 5-22. Station A cannot
communicate with Station B. Why not? Example 5-8 provides additional information
for the system.
172.16.1.1 172.16.1.2
A B
Cat-A Cat-B
2/5 2/4
2/7 2/15
2/18 2/8
C 2/11 2/11 D
172.16.2.1 172.16.2.2
Example 5-8 Cat-A and Cat-B Configurations
Cat-A > (enable) show vlan
VLAN Name Status Mod/Ports, Vlans
---- -------------------------------- --------- ----------------------------
1 default active 1/1-2
2/1-8
2 vlan2 active 2/9-24
1002 fddi-default active
1003 token-ring-default active
1004 fddinet-default active
1005 trnet-default active
3 Again referring to Figure 5-22 and Example 5-8, can Station C communicate with
Station D?
4 Are there any Spanning Tree issues in Figure 5-22?
5 Draw a logical representation of Figure 5-22 showing the way the network actually exists,
as opposed to what was probably intended.
6 Is there ever a time when you would bridge between VLANs?
7 List the three components of dynamic VLANs using VMPS.
PART
II
Spanning Tree
Chapter 6 Understanding Spanning Tree
CAUTION Please note that the examples used in this chapter (and Chapter 7) are designed to illustrate
the operation of the Spanning-Tree Protocol, not necessarily good design practices. Design
issues are addressed in Chapter 11, "Layer 3 Switching," Chapter 14, "Campus Design
Models," Chapter 15, "Campus Design Implementation," and Chapter 17, "Case Studies:
Implementing Switches."
The catch is that loops are potentially disastrous in a bridged network for two primary
reasons: broadcast loops and bridge table corruption.
What Is Spanning Tree and Why Use Spanning Tree? 161
Broadcast Loops
Broadcasts and Layer 2 loops can be a dangerous combination. Consider Figure 6-2.
Figure 6-2 (figure labels: Host-A, Host-B, Cat-1, Cat-2, ports 1/1 and 1/2, numbered steps tracing the looping broadcast)
Assume that neither switch is running STP. Host-A begins by sending out a frame to the
broadcast MAC address (FF-FF-FF-FF-FF-FF) in Step 1. Because Ethernet is a bus
medium, this frame travels to both Cat-1 and Cat-2 (Step 2).
When the frame arrives at Cat-1:Port-1/1, Cat-1 will follow the standard bridging algorithm
discussed in Chapter 3, "Bridging Technologies," and flood the frame out Port 1/2 (Step 3).
Again, this frame will travel to all nodes on the lower Ethernet segment, including
Cat-2:Port-1/2 (Step 4). Cat-2 will flood the broadcast frame out Port 1/1 (Step 5) and, once
again, the frame will show up at Cat-1:Port-1/1 (Step 6). Cat-1, being a good little switch,
will follow orders and send the frame out Port 1/2 for the second time (Step 7). By now I
think you can see the pattern: there is a pretty good loop going on here.
Additionally, notice that Figure 6-2 quietly ignored the broadcast that arrived at Cat-2:Port-1/1
back in Step 2. This frame would have also been flooded onto the bottom Ethernet
segment and created a loop in the reverse direction. In other words, don't forget that this
feedback loop would occur in both directions.
Notice an important conclusion that can be drawn from Figure 6-2: bridging loops are
much more dangerous than routing loops. To understand this, refer back to the discussion
of Ethernet frame formats in Chapter 1, "Desktop Technologies." For example, Figure 6-3
illustrates the layout of a DIX V2 Ethernet frame.
162 Chapter 6: Understanding Spanning Tree
(Figure 6-3: DIX V2 Ethernet frame; field widths in bytes are Destination MAC 6, Source MAC 6, Type 2, Data 46-1500, CRC 4.)
Notice that the DIX V2 Ethernet frame only contains two MAC addresses, a Type field, and
a CRC (plus the next layer as Data). By way of contrast, an IP header contains a time to live
(TTL) field that gets set by the original host and is then decremented at every router. By
discarding packets that reach TTL=0, routers prevent run-away datagrams.
Unlike IP, Ethernet (or, for that matter, any other common data link implementation)
doesn't have a TTL field. Therefore, after a frame starts to loop in the network above, it
continues forever until one of the following happens:
Someone shuts off one of the bridges or breaks a link.
The sun novas.
As if that is not frightening enough, networks that are more complex than the one illustrated
in Figure 6-2 (such as Figure 6-1) can actually cause the feedback loop to grow at an
exponential rate! As each frame is flooded out multiple switch ports, the total number of
frames multiplies quickly. I have witnessed a single ARP filling two OC-12 ATM links for 45
minutes (for non-ATM wizards, each OC-12 sends 622 Mbps in each direction; this is a total
of 2.4 Gbps of traffic)! For those who have a hard time recognizing the obvious, this is bad.
As a final note, consider the impact of this broadcast storm on the poor users of Host-A and
Host-B in Figure 6-2. Not only can these users not play Doom (a popular game on campus
networks) with each other, they can't do anything (other than go home for the day)! Recall
from Chapter 2, "Segmenting LANs," that broadcasts must be processed by the CPU in all
devices on the segment. In this case, both PCs lock up trying to process the broadcast storm
that has been created. Even the mouse cursor freezes on most PCs that connect to this
network. If you disconnect one of the hosts from the LAN, it generally returns to normal
operation. However, as soon as you reconnect it to the LAN, the broadcasts again consume
100 percent of the CPU. If you have never witnessed this, some night when only your worst
enemy is still using the network, feel free to create a physical loop in some VLAN (VLAN
2, for example) and then type set spantree 2 disable into your Catalyst 4000s, 5000s, and
6000s to test this theory. Of course, don't do this if your worst enemy is your boss!
Figure 6-4 Without STP, Even Unicast Frames Can Loop and Corrupt Bridging Tables
(Figure: Host-A and Host-B sit on two shared segments joined by Cat-1 and Cat-2 through Ports 1/1 and 1/2; numbered steps trace the looping unicast frame, and Cat-2's bridge table shows its entry for AA-AA-AA-AA-AA-AA flipping between ports.)
For example, suppose that Host-A, possessing a prior ARP entry for Host-B, wants to send
a unicast Ping packet to Host-B. However, Host-B has been temporarily removed from the
network, and the corresponding bridge-table entries in the switches have been flushed for
Host-B. Assume that neither switch is running STP. As with the previous example, the
frame travels to Port 1/1 on both switches (Step 2), but the text only considers things from
Cat-1's point of view. Because Host-B is down, Cat-1 does not have an entry for the MAC
address BB-BB-BB-BB-BB-BB in its bridging table, and it floods the frame (Step 3). In
Step 4, Cat-2 receives the frame on Port 1/2. Two things (both bad) happen at this point:
1 Cat-2 floods the frame because it has never learned MAC address BB-BB-BB-BB-
BB-BB (Step 5). This creates a feedback loop and brings down the network.
2 Cat-2 notices that it just received a frame on Port 1/2 with a source MAC of AA-AA-
AA-AA-AA-AA. It changes its bridging entry for Host-A's MAC address to the
wrong port!
As frames loop in the reverse direction (recall that the feedback loop exists in both
directions), you actually see Host-A's MAC address flipping between Port 1/1 and Port 1/2.
In short, not only does this permanently saturate the network with the unicast ping packet,
but it corrupts the bridging tables. Remember that it's not just broadcasts that can ruin your
network.
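The table-corruption half of this failure can be sketched in a few lines. The fragment below is illustrative Python (the Bridge class is hypothetical); its learn method mimics the source-learning step from Chapter 3, and it shows Host-A's entry flipping as looped copies of the same frame arrive on both of Cat-2's ports:

```python
# Transparent-bridge source learning: the bridge records source MAC ->
# arrival port. Looped copies of Host-A's frame arrive on BOTH of
# Cat-2's ports, so the entry for AA-AA-AA-AA-AA-AA keeps flipping.

class Bridge:
    def __init__(self, name):
        self.name = name
        self.table = {}          # MAC address -> port last seen on

    def learn(self, src_mac, port):
        self.table[src_mac] = port

cat2 = Bridge("Cat-2")
host_a = "AA-AA-AA-AA-AA-AA"

seen = []
# The feedback loop delivers copies of the frame alternately on
# Port 1/1 (from Host-A's segment) and Port 1/2 (the looped copy).
for arrival_port in ["1/1", "1/2", "1/1", "1/2"]:
    cat2.learn(host_a, arrival_port)
    seen.append(cat2.table[host_a])

print(seen)   # the entry flips: ['1/1', '1/2', '1/1', '1/2']
```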
Bridge IDs
A Bridge ID (BID) is a single, 8-byte field that is composed of two subfields as illustrated
in Figure 6-5.
Figure 6-5 The Bridge ID (BID) Is Composed of Bridge Priority and a MAC Address
(Figure: the 8-byte BID consists of a 2-byte Bridge Priority, range 0-65,535, default 32,768, followed by a 6-byte MAC address taken from the backplane or supervisor.)
The low-order subfield consists of a 6-byte MAC address assigned to the switch. The
Catalyst 5000 and 6000 use one of the MAC addresses from the pool of 1024 addresses
assigned to every supervisor or backplane. This is a hard-coded number that is not designed
to be changed by the user. The MAC address in the BID is expressed in the usual
hexadecimal (base 16) format.
NOTE Some Catalysts pull the MAC addresses from the supervisor module (for example, the
Catalyst 5000), whereas others pull the addresses from the backplane (such as the Catalyst
5500 and 6000).
The high-order BID subfield is referred to as the Bridge Priority. Do not confuse Bridge
Priority with the various versions of Port Priority that are discussed in Chapter 7,
"Advanced Spanning Tree." The Bridge Priority field is a 2-byte (16-bit) value. The C
programmers in the crowd might recall that an unsigned 16-bit integer can have 2^16
possible values that range from 0 to 65,535. The default Bridge Priority is the mid-point
value, 32,768. Bridge Priorities are typically expressed in a decimal (base 10) format.
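Because the priority occupies the high-order bytes, a plain numeric comparison of the full 8-byte BID automatically prefers the lower priority first and falls back to the MAC address only on a tie. A quick sketch (illustrative Python; the MAC values are the fictitious ones used later in this chapter):

```python
# A BID is a 2-byte priority followed by a 6-byte MAC address. Packing
# the priority into the high-order bits means simple integer comparison
# matches the STP rule "lowest BID wins".

def make_bid(priority, mac):
    mac_int = int(mac.replace("-", ""), 16)
    return (priority << 48) | mac_int      # priority in the top 16 bits

a = make_bid(32768, "AA-AA-AA-AA-AA-AA")
b = make_bid(32768, "BB-BB-BB-BB-BB-BB")

# Equal priorities: the lower MAC address breaks the tie.
assert a < b

# A lower priority beats any MAC address.
c = make_bid(100, "FF-FF-FF-FF-FF-FF")
assert c < a

print(hex(a))   # priority and MAC packed into one comparable value
```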
NOTE This book only covers the IEEE version of the Spanning-Tree Protocol. Although the basic
mechanics of both are identical, there are some differences between IEEE STP and DEC
STP (the original implementation of the Spanning-Tree Protocol). For example, DEC STP
uses an 8-bit Bridge Priority. Layer 2 Catalysts (such as the 4000s, 5000s, and 6000s) only
support IEEE STP. Cisco routers support both varieties. A third variety, the VLAN-Bridge
Spanning Tree, is being introduced in 12.0 IOS code for the routers. This version can be
useful in environments that mix routing and bridging and is discussed in Chapter 11.
Path Cost
Bridges use the concept of cost to evaluate how close they are to other bridges. 802.1D
originally defined cost as 1000 Mbps divided by the bandwidth of the link in Mbps. For
example, a 10BaseT link has a cost of 100 (1000/10), whereas Fast Ethernet and FDDI use
a cost of 10 (1000/100). This scheme has served the world well since Radia Perlman first began
working on the protocol in 1983. However, with the rise of Gigabit Ethernet and OC-48
ATM (2.4 Gbps), a problem has come up because the cost is stored as an integer value that
cannot carry fractional costs. For example, OC-48 ATM results in 1000 Mbps/2400
Mbps = .41667, an invalid cost value. One option is to use a cost of 1 for all links equal to or
greater than 1 Gbps; however, this prevents STP from accurately choosing the "best path"
in Gigabit networks.
As a solution to this dilemma, the IEEE has decided to modify cost to use a non-linear scale.
Table 6-1 lists the new cost values.
The values in Table 6-1 were carefully chosen so that the old and new schemes interoperate
for the link speeds in common use today.
The key point to remember concerning STP cost values is that lower costs are better. Also
keep in mind that Versions 1.X through 2.4 of the Catalyst 5000 NMP use the old, linear
values, whereas version 3.1 and later use the newer values. All Catalyst 4000s and 6000s
utilize the new values.
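The arithmetic behind the change is easy to verify. In the sketch below (plain Python), the linear formula is the original 802.1D rule; the non-linear values are the commonly cited revised 802.1D costs, listed here from memory as an assumption rather than reproduced from Table 6-1:

```python
# Old 802.1D rule: cost = 1000 / bandwidth-in-Mbps, stored as an
# integer, so anything over 1000 Mbps collapses to zero.
def old_cost(mbps):
    return 1000 // mbps          # integer division: fractions are lost

# Revised non-linear scale (commonly cited 802.1D values, keyed by
# link speed in Mbps; assumed to match the book's Table 6-1).
NEW_COST = {
    4: 250, 10: 100, 16: 62, 45: 39, 100: 19,
    155: 14, 622: 6, 1000: 4, 10000: 2,
}

assert old_cost(10) == 100       # 10BaseT
assert old_cost(100) == 10       # Fast Ethernet / FDDI
assert old_cost(2400) == 0       # OC-48: the linear formula breaks down

# The non-linear table still distinguishes Gigabit-class links.
assert NEW_COST[1000] == 4 and NEW_COST[10000] == 2
print("old FE cost:", old_cost(100), " new FE cost:", NEW_COST[100])
```

Note that the old and new Fast Ethernet values differ (10 versus 19), which is why mixing pre-3.1 and post-3.1 Catalyst code in one network deserves care.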
Bridges pass Spanning Tree information between themselves using special frames known
as bridge protocol data units (BPDUs). A bridge uses this four-step decision sequence to
save a copy of the best BPDU seen on every port. When making this evaluation, it considers
all of the BPDUs received on the port as well as the BPDU that would be sent on that port.
As every BPDU arrives, it is checked against this four-step sequence to see if it is more
attractive (that is, lower in value) than the existing BPDU saved for that port. If the new
BPDU (or the locally generated BPDU) is more attractive, the old value is replaced.
TIP Bridges send Configuration BPDUs until a more attractive BPDU is received.
In addition, this "saving the best BPDU" process also controls the sending of BPDUs.
When a bridge first becomes active, all of its ports are sending BPDUs every 2 seconds
(when using the default timer values). However, if a port hears a BPDU from another bridge
that is more attractive than the BPDU it has been sending, the local port stops sending
BPDUs. If the more attractive BPDU stops arriving from a neighbor for a period of time
(20 seconds by default), the local port can once again resume the sending of BPDUs.
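Because each comparison criterion outranks the ones after it, the whole four-step decision sequence collapses to an ordered-tuple comparison. A hedged sketch (Python; the field names are illustrative, not the standard's encoding):

```python
# The four-step STP decision sequence as a tuple: Python compares
# tuples element by element, which is exactly the rule "first
# criterion that differs wins, and the lowest value is best".

def bpdu_key(root_bid, root_path_cost, sender_bid, port_id):
    return (root_bid, root_path_cost, sender_bid, port_id)

def more_attractive(a, b):
    """True if BPDU a is better (lower) than BPDU b."""
    return a < b

# Same Root Bridge and same cost: the lower Sender BID wins (step 3).
cat_b = bpdu_key(root_bid=1, root_path_cost=19, sender_bid=0xBB, port_id=2)
cat_c = bpdu_key(root_bid=1, root_path_cost=19, sender_bid=0xCC, port_id=2)
assert more_attractive(cat_b, cat_c)

# A lower Root Path Cost outranks any Sender BID (step 2 beats step 3).
closer = bpdu_key(root_bid=1, root_path_cost=0, sender_bid=0xFF, port_id=9)
assert more_attractive(closer, cat_b)

print("saved BPDU:", min(cat_b, cat_c, closer))
```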
NOTE There are actually two types of BPDUs: Configuration BPDUs and Topology Change
Notification (TCN) BPDUs. The first half of this chapter only discusses Configuration
BPDUs. The second half discusses TCNs and the differences between the two.
Figure 6-6 Model Network Layout for Discussion of Basic STP Operations
(Figure: Cat-A, Cat-B, and Cat-C are connected in a triangle through Ports 1/1 and 1/2; their MAC addresses are AA-AA-AA-AA-AA-AA, BB-BB-BB-BB-BB-BB, and CC-CC-CC-CC-CC-CC, respectively.)
This network consists of three bridges connected in a looped configuration. Each bridge has
been assigned a fictitious MAC address that corresponds to the device's name (for example,
Cat-A uses MAC address AA-AA-AA-AA-AA-AA).
TIP Many texts use the term "highest priority" when discussing the results of the Root War.
However, notice that the bridge with the highest priority actually has the lowest value. To
avoid confusion, this text always refers to the values.
(Figure 6-7: Cat-A, BID 32,768.AA-AA-AA-AA-AA-AA, announces "I am the Root! All hail me!" over Ports 1/1 and 1/2; Cat-B and Cat-C carry the higher BIDs 32,768.BB-BB-BB-BB-BB-BB and 32,768.CC-CC-CC-CC-CC-CC.)
Three Steps of Initial STP Convergence 169
Okay, but how did the bridges learn that Cat-A had the lowest BID? This is accomplished
through the exchange of BPDUs. As discussed earlier, BPDUs are special packets that bridges
use to exchange topology and Spanning Tree information with each other. By default, BPDUs
are sent out every two seconds. BPDUs are bridge-to-bridge traffic; they do not carry any end-
user traffic (such as Doom or, if you are boring, e-mail traffic). Figure 6-8 illustrates the basic
layout of a BPDU. (BPDU formats are covered in detail in the "Two Types of BPDUs" section.)
Figure 6-8 Basic BPDU Layout
For the purposes of the Root War, the discussion is only concerned with the Root BID and
Sender BID fields (again, the real names come later). When a bridge generates a BPDU
every 2 seconds, it places who it thinks is the Root Bridge at that instant in time in the Root
BID field. The bridge always places its own BID in the Sender BID field.
TIP Remember that the Root BID is the bridge ID of the current Root Bridge, while the Sender
BID is the bridge ID of the local bridge or switch.
It turns out that a bridge is a lot like a human in that it starts out assuming that the world
revolves around itself. In other words, when a bridge first boots, it always places its BID in both
the Root BID and the Sender BID fields. Suppose that Cat-B boots first and starts sending out
BPDUs announcing itself as the Root Bridge every 2 seconds. A few minutes later, Cat-C boots
and boldly announces itself as the Root Bridge. When Cat-C's BPDU arrives at Cat-B, Cat-B
discards the BPDU because it has a lower BID saved on its ports (its own BID). As soon as
Cat-B transmits a BPDU, Cat-C learns that it is not quite as important as it initially assumed. At this
point, Cat-C starts sending BPDUs that list Cat-B as the Root BID and Cat-C as the Sender
BID. The network is now in agreement that Cat-B is the Root Bridge.
Five minutes later Cat-A boots. As you saw with Cat-B earlier, Cat-A initially assumes that
it is the Root Bridge and starts advertising this fact in BPDUs. As soon as these BPDUs
arrive at Cat-B and Cat-C, these switches abdicate the Root Bridge position to Cat-A. All
three switches are now sending out BPDUs that announce Cat-A as the Root Bridge and
themselves as the Sender BID.
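The Root War converges the same way regardless of boot order: each bridge advertises the lowest Root BID it has heard so far (initially its own), and repeated exchange drives every bridge to the global minimum. A compact sketch of that exchange (illustrative Python, using the chapter's fictitious BIDs):

```python
# Each bridge believes the lowest BID it has heard so far is the Root.
# Repeatedly exchanging BPDUs over the links drives every bridge to the
# global minimum: Cat-A's BID.

BIDS = {"Cat-A": (32768, 0xAA), "Cat-B": (32768, 0xBB), "Cat-C": (32768, 0xCC)}
LINKS = [("Cat-A", "Cat-B"), ("Cat-A", "Cat-C"), ("Cat-B", "Cat-C")]

# Every bridge boots believing it is the Root (Root BID = its own BID).
believed_root = dict(BIDS)

for _ in range(len(BIDS)):          # enough rounds to flood the triangle
    for x, y in LINKS:
        # Each side keeps the more attractive (lower) Root BID it hears.
        best = min(believed_root[x], believed_root[y])
        believed_root[x] = believed_root[y] = best

assert all(root == BIDS["Cat-A"] for root in believed_root.values())
print("Root Bridge BID:", believed_root["Cat-B"])
```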
As discussed earlier, bridges use the concept of cost to judge "closeness." Specifically,
bridges track something called Root Path Cost, the cumulative cost of all links to the Root
Bridge. Figure 6-9 illustrates how this value is calculated across multiple bridges and the
resulting Root Port election process.
Figure 6-9 Every Non-Root Bridge Must Select One Root Port
(Figure: Cat-A, the Root Bridge, connects to Cat-B and Cat-C over links of cost 19; numbered steps trace the cost field of the BPDUs as it accumulates toward each downstream switch.)
When Cat-A (the Root Bridge) sends out BPDUs, they contain a Root Path Cost of 0 (Step
1). When Cat-B receives these BPDUs, it adds the Path Cost of Port 1/1 to the Root Path Cost
contained in the received BPDU. Assume the network is running Catalyst 5000 switch code
greater than version 2.4 and that all three links in Figure 6-9 are Fast Ethernet. Cat-B receives
a Root Path Cost of 0 and adds in Port 1/1s cost of 19 (Step 2). Cat-B then uses the value of
19 internally and sends BPDUs with a Root Path Cost of 19 out Port 1/2 (Step 3).
Three Steps of Initial STP Convergence 171
When Cat-C receives these BPDUs from Cat-B (Step 4), it increases the Root Path Cost to 38
(19+19). However, Cat-C is also receiving BPDUs from the Root Bridge on Port 1/1. These
enter Cat-C:Port-1/1 with a cost of 0, and Cat-C increases the cost to 19 internally (Step 5).
Cat-C has a decision to make: it must select a single Root Port, the port that is closest to the
Root Bridge. Cat-C sees a Root Path Cost of 19 on Port 1/1 and 38 on Port 1/2, so
Cat-C:Port-1/1 becomes the Root Port (Step 6). Cat-C then begins advertising this Root Path Cost of 19
to downstream switches (Step 7).
Although not detailed in Figure 6-9, Cat-B goes through a similar set of calculations:
Cat-B:Port-1/1 can reach the Root Bridge at a cost of 19, whereas Cat-B:Port-1/2 calculates a
cost of 38, so Port 1/1 becomes the Root Port for Cat-B. Notice that costs are incremented as
BPDUs are received on a port.
TIP Remember that STP costs are incremented as BPDUs are received on a port, not as they are
sent out a port.
For example, BPDUs arrive on Cat-B:Port-1/1 with a cost of 0 and get increased to 19
inside Cat-B. This point is discussed in more detail in the section "Mastering the show
spantree Command."
TIP Remember the difference between Path Cost and Root Path Cost.
Path Cost is a value assigned to each port. It is added to BPDUs received on that port to
calculate the Root Path Cost.
Root Path Cost is defined as the cumulative cost to the Root Bridge. In a BPDU, this is the
value transmitted in the cost field. In a bridge, this value is calculated by adding the
receiving port's Path Cost to the value contained in the BPDU.
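The bookkeeping in Figure 6-9 reduces to "add the receiving port's Path Cost on arrival, then pick the minimum." A sketch of Cat-C's calculation (illustrative Python; all links Fast Ethernet at the new cost of 19):

```python
# Root Port selection on Cat-C (Figure 6-9): add the receiving port's
# Path Cost to the cost carried in each arriving BPDU, then pick the
# port with the lowest resulting Root Path Cost.

PATH_COST = {"1/1": 19, "1/2": 19}     # Fast Ethernet on both ports

# Cost field of the BPDU arriving on each port:
#   1/1 hears the Root Bridge directly (cost 0);
#   1/2 hears Cat-B, which advertises its own Root Path Cost of 19.
arriving_bpdu_cost = {"1/1": 0, "1/2": 19}

root_path_cost = {
    port: arriving_bpdu_cost[port] + PATH_COST[port]
    for port in PATH_COST
}
root_port = min(root_path_cost, key=root_path_cost.get)

assert root_path_cost == {"1/1": 19, "1/2": 38}
assert root_port == "1/1"
# Cat-C then advertises its own Root Path Cost (19) downstream.
print("Root Port:", root_port, " advertised cost:", root_path_cost[root_port])
```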
Figure 6-10 Every Segment Elects One Designated Port Based on the Lowest Cost
(Figure: Segment 1 joins Cat-A:Port-1/1 and Cat-B:Port-1/1; Segment 2 joins Cat-A:Port-1/2 and Cat-C:Port-1/1; Segment 3 joins Cat-B:Port-1/2 and Cat-C:Port-1/2. Cat-A's two ports and Cat-B:Port-1/2 are the Designated Ports; Cat-B and Cat-C each advertise a Root Path Cost of 19.)
To locate the Designated Ports, take a look at each segment in turn. First look at Segment
1, the link between Cat-A and Cat-B. There are two bridge ports on the segment: Cat-A:Port-
1/1 and Cat-B:Port-1/1. Cat-A:Port-1/1 has a Root Path Cost of 0 (after all, it is the Root
Bridge), whereas Cat-B:Port-1/1 has a Root Path Cost of 19 (the value 0 received in BPDUs
from Cat-A plus the Path Cost of 19 assigned to Cat-B:Port-1/1). Because Cat-A:Port-1/1
has the lower Root Path Cost, it becomes the Designated Port for this link.
For Segment 2 (Cat-A to Cat-C link), a similar election takes place. Cat-A:Port-1/2 has a
Root Path Cost of 0, whereas Cat-C:Port-1/1 has a Root Path Cost of 19. Cat-A:Port-1/2
has the lower cost and becomes the Designated Port. Notice that every active port on the
Root Bridge becomes a Designated Port. The only exception to this rule is a Layer 1
physical loop to the Root Bridge (for example, you connected two ports on the Root Bridge
to the same hub or you connected the two ports together with a crossover cable).
Now look at Segment 3 (Cat-B to Cat-C): both Cat-B:Port-1/2 and Cat-C:Port-1/2 have a
Root Path Cost of 19. There is a tie! When faced with a tie (or any other determination),
STP always uses the four-step decision sequence discussed earlier in the section "Four-Step
STP Decision Sequence." Recall that the four steps are as follows:
Step 1 Lowest Root BID
Step 2 Lowest Root Path Cost
Step 3 Lowest Sender BID
Step 4 Lowest Port ID
In the example shown in Figure 6-10, all three bridges are in agreement that Cat-A is the
Root Bridge, causing Root Path Cost to be evaluated next. However, as pointed out in the
previous paragraph, both Cat-B and Cat-C have a cost of 19. This causes BID, the third
decision criterion, to be the deciding factor. Because Cat-B's BID (32,768.BB-BB-BB-BB-
BB-BB) is lower than Cat-C's BID (32,768.CC-CC-CC-CC-CC-CC), Cat-B:Port-1/2
becomes the Designated Port for Segment 3. Cat-C:Port-1/2 therefore becomes a non-
Designated Port.
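The Segment 3 tie-break can be expressed directly with the decision sequence: compare (Root Path Cost, Sender BID) for each port on the segment and take the minimum. A sketch (illustrative Python; the BIDs are abbreviated to a single MAC byte):

```python
# Designated Port election for Segment 3 (Figure 6-10): both candidate
# ports advertise a Root Path Cost of 19, so the lower Sender BID
# (Cat-B's) breaks the tie.

candidates = {
    # port name: (Root Path Cost, Sender BID as (priority, MAC byte))
    "Cat-B:1/2": (19, (32768, 0xBB)),
    "Cat-C:1/2": (19, (32768, 0xCC)),
}
# Tuples compare cost first, then BID: lowest wins the election.
designated = min(candidates, key=candidates.get)
assert designated == "Cat-B:1/2"

# The loser becomes the non-Designated Port and ends up Blocking.
non_designated = [p for p in candidates if p != designated]
print("DP:", designated, " NDP:", non_designated)
```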
First, the bridged network elects a single Root Bridge. Second, every non-Root Bridge
elects a single Root Port, the port that is the closest to the Root Bridge. Third, the bridges
elect a single Designated Port for every segment.
For example, in a network that contains 15 switches and 146 segments (remember every
switch port is a unique segment), the number of STP components that exist corresponds to
the values documented in Table 6-2.
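Although Table 6-2 itself is not reproduced here, the three rules make the counts mechanical: one Root Bridge per network, one Root Port per non-Root Bridge, and one Designated Port per segment. In code (a trivial sketch; the function name is illustrative):

```python
# One Root Bridge per bridged network, one Root Port on every
# non-Root Bridge, and one Designated Port per segment.
def stp_component_counts(switches, segments):
    return {
        "root_bridges": 1,
        "root_ports": switches - 1,
        "designated_ports": segments,
    }

counts = stp_component_counts(switches=15, segments=146)
print(counts)   # 1 Root Bridge, 14 Root Ports, 146 Designated Ports
```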
Every BPDU received on a port is compared against the other BPDUs received (as well as the
BPDU that is sent on that port). Only the best BPDU (or "superior BPDU") is saved. Notice in
all cases that "best" is determined by the lowest value (for example, the lowest BID becomes
the Root Bridge, and the lowest cost is used to elect the Root and Designated Ports). A port
stops transmitting BPDUs if it hears a better BPDU than it would transmit itself.
You can view this list as a hierarchy in that bridge ports start at the bottom (Disabled or
Blocking) and work their way up to Forwarding. The Disabled state allows network
administrators to manually shut down a port. It is not part of the normal, dynamic port
processing. After initialization, ports start in the Blocking state where they listen for BPDUs.
A variety of events (such as a bridge thinking it is the Root Bridge immediately after
booting or an absence of BPDUs for a certain period of time) can cause the bridge to
transition into the Listening state. At this point, no user data is being passed; the port is
sending and receiving BPDUs in an effort to determine the active topology. It is during the
Listening state that the three initial convergence steps discussed in the previous section take
place. Ports that lose the Designated Port election become non-Designated Ports and drop
back to the Blocking state.
Ports that remain Designated or Root Ports after 15 seconds (the default timer value) progress
into the Learning state. This is another 15-second period where the bridge is still not passing
user data frames. Instead, the bridge is quietly building its bridging table as discussed in Chapter
3. As the bridge receives frames, it places the source MAC address and port into the bridging
table. The Learning state reduces the amount of flooding required when data forwarding begins.
NOTE In addition to storing source MAC address and port information, Catalysts learn
information such as the source VLAN.
If a port is still a Designated or Root Port at the end of the Learning state period, the port
transitions into the Forwarding state. At this stage, it finally starts sending and receiving
user data frames. Figure 6-11 illustrates the port states and possible transitions.
(Figure 6-11: the STP port state diagram. States: Disabled or Down, Blocking, Listening, Learning, Forwarding. Standard transition events: (1) port enabled or initialized; (2) port disabled or fails; (3) port selected as a Root or Designated Port; (4) port ceases to be a Root or Designated Port; (5) forwarding timer expires. Cisco-specific transitions: (6) PortFast; (7) UplinkFast.)
Figure 6-12 shows the sample network with the port classifications and states listed. Notice
that all ports are forwarding except Cat-C:Port-1/2.
(Figure 6-12: Cat-A is the Root Bridge; its ports and the Root Ports on Cat-B and Cat-C are Forwarding (F). Cat-B:Port-1/2 is the Designated Port (DP) for Segment 3, whereas Cat-C:Port-1/2 is a non-Designated Port (NDP) in the Blocking (B) state.)
Table 6-4 documents the symbols used throughout the book to represent Spanning Tree states.
The Hello Time controls the time interval between the sending of Configuration BPDUs.
802.1D specifies a default value of two seconds. Note that this value really only controls
Configuration BPDUs as they are generated at the Root Bridge; other bridges propagate
BPDUs from the Root Bridge as they are received. In other words, if BPDUs stop arriving
for 2 to 20 seconds because of a network disturbance, non-Root Bridges stop sending
periodic BPDUs during this time. (If the outage lasts more than 20 seconds, the default Max
Age time, the bridge invalidates the saved BPDUs and begins looking for a new Root Port.)
However, as discussed in the "Topology Change Notification BPDUs" section later, all
bridges use their locally configured Hello Time value as a TCN retransmit timer.
Forward Delay is the time that the bridge spends in the Listening and Learning states. This
is a single value that controls both states. The default value of 15 seconds was originally
derived assuming a maximum network size of seven bridge hops, a maximum of three lost
BPDUs, and a Hello Time interval of two seconds (see the section "Tuning Forward Delay"
in Chapter 7 for more detail on how Forward Delay is calculated). As discussed in the
"Topology Change Notification BPDUs" section, the Forward Delay timer also controls the
bridge table age-out period after a change in the active topology.
Max Age is the time that a bridge stores a BPDU before discarding it. Recall from the earlier
discussions that each port saves a copy of the best BPDU it has seen. As long as the bridge
receives a continuous stream of BPDUs every 2 seconds, the receiving bridge maintains a
continuous copy of the BPDU's values. However, if the device sending this best BPDU
fails, some mechanism must exist to allow other bridges to take over.
For example, assume that the Segment 3 link in Figure 6-12 uses a hub and Cat-B:Port-1/2's
transceiver falls out. Cat-C has no immediate notification of the failure because it's still
receiving Ethernet link from the hub. The only thing Cat-C notices is that BPDUs stop
arriving. Twenty seconds (Max Age) after the failure, Cat-C:Port-1/2 ages out the stale
BPDU information that lists Cat-B as having the best Designated Port for Segment 3. This
causes Cat-C:Port-1/2 to transition into the Listening state in an effort to become the
Designated Port. Because Cat-C:Port-1/2 now offers the most attractive access from the
Root Bridge to this link, it eventually transitions all the way into Forwarding mode. In
practice, it takes 50 seconds (20 Max Age + 15 Listening + 15 Learning) for Cat-C to take
over after the failure of Port 1/2 on Cat-B.
In some situations, bridges can detect topology changes on directly connected links and
immediately transition into the Listening state without waiting Max Age seconds. For
example, consider Figure 6-13.
178 Chapter 6: Understanding Spanning Tree
Figure 6-13 Failure of a Link Directly Connected to the Root Port of Cat-C
(Figure: Cat-A remains the Root Bridge with both of its Designated Ports Forwarding; the link to Cat-C's Root Port, Port 1/1, has failed, leaving Cat-C:Port-1/2 as its only path toward the Root.)
In this case, Cat-C:Port-1/1 failed. Because the failure results in a loss of link on the Root
Port, there is no need to wait 20 seconds for the old information to age out. Instead,
Cat-C:Port-1/2 immediately goes into the Listening state in an attempt to become the new Root
Port. This has the effect of reducing the STP convergence time from 50 seconds to 30
seconds (15 Listening + 15 Learning).
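Both convergence figures fall straight out of the default timers. A sketch of the arithmetic (Python; the default 802.1D values):

```python
# Default 802.1D timers (seconds).
HELLO, FORWARD_DELAY, MAX_AGE = 2, 15, 20

# Indirect failure: wait Max Age for the stale BPDU to expire, then
# spend Forward Delay in Listening and Forward Delay again in Learning.
indirect = MAX_AGE + FORWARD_DELAY + FORWARD_DELAY

# Direct failure (loss of link on the Root Port): skip the Max Age wait.
direct = FORWARD_DELAY + FORWARD_DELAY

assert (direct, indirect) == (30, 50)
print(f"STP convergence: {direct} to {indirect} seconds")
```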
TIP The default STP convergence time is 30 to 50 seconds. The section "Fast STP
Convergence" in Chapter 7 discusses ways to improve this.
There are two key points to remember about using the STP timers. First, don't change the
default timer values without some careful consideration. This is discussed in more detail in
Chapter 7. Second, assuming that you are brave enough to attempt timer tuning, you should
only modify the STP timers from the Root Bridge. As you will see in the "Two Types of
BPDUs" section, the BPDUs contain three fields where the timer values can be passed from
the Root Bridge to all other bridges in the network. Consider the alternative: if every bridge
was locally configured, some bridges could work their way up to the Forwarding state
before other bridges ever leave the Listening state. This chaotic approach could obviously
destabilize the entire network. By providing timer fields in the BPDUs, the single bridge
acting as the Root Bridge can dictate the timing parameters for the entire bridged network.
TIP You can only modify the timer values from the Root Bridge. Modifying the values on other
bridges has no effect. However, don't forget to update any backup Root Bridges.
Example 6-1 show spantree Output from Cat-B in the Network Shown in Figure 6-6
Cat-B (enable) show spantree
VLAN 1
Spanning tree enabled Global Statistics
Spanning tree type ieee
This show spantree output in Example 6-1 can be broken down into four sections as follows:
1 Global statistics for the current switch/bridge (lines 2-4)
2 Root Bridge statistics (lines 5-9)
3 Local bridge statistics (lines 10-12)
4 Port statistics (lines 13-16)
The global statistics appear at the top of the screen. The first line of this section (VLAN 1)
indicates that the output only contains information for VLAN 1. The second line indicates that
STP is enabled on this Catalyst for this VLAN. The final line of this section shows that the IEEE
version of STP is being utilized (this cannot be changed on most Catalyst switches). Additional
details about these values are discussed in the "All of This Per VLAN!" section at the end of the
chapter.
The first two lines of the Root Bridge statistics display the BID of the current Root Bridge.
The BID subfields are displayed separately: Designated Root shows the MAC address
contained in the low-order six bytes, whereas Designated Root Priority holds the high-
order two bytes. The cumulative Root Path Cost to the Root Bridge is displayed in the
Designated Root Cost field. The fourth line of this section (Designated Root Port) shows
the current Root Port of the local device. The last line of the Root Bridge statistics section
shows the timer values currently set on the Root Bridge. As the previous section discussed,
these values are used throughout the entire network (at least for VLAN 1) to provide
consistency. The term "designated" is used here to signify that these values pertain to the
bridge that this device currently believes is the Root Bridge. However, because of topology
changes and propagation delays during network convergence, this information might not
reflect the characteristics of the true Root Bridge.
The local bridge statistics section displays the BID of the current bridge in the first two
lines. The locally configured timer values are shown in the third line of this section.
TIP The timer values shown in the local bridge statistics section are not utilized unless the
current bridge becomes the Root Bridge at some point.
The port statistics section is displayed at the bottom of the screen. Depending on the
number of ports present in the Catalyst, this display can continue for many screens using
the "more" prompt. This information displays the Path Cost value associated with each port.
This value is the cost that is added to the Root Path Cost field contained in BPDUs received
on this port. In other words, Cat-B receives BPDUs on Port 1/1 with a cost of 0 because
they are sent by the Root Bridge. Port 1/1's cost of 19 is added to this zero-cost value to
yield a Designated Root Cost of 19. In the outbound direction, Cat-B sends BPDUs
downstream with a cost of 19; Port 1/2's Path Cost of 19 is not added to transmitted
BPDUs.
TIP The cost values displayed in the port statistics section of show spantree are added to BPDUs
that are received (not sent) on that port.
The information displayed by show spantree can be critical to learning how Spanning Tree
is working in your network. For example, it can be extremely useful when you need to
locate the Root Bridge. Consider the network shown in Figure 6-14.
(Figure 6-14: four switches in a partial mesh. Cat-1:Port-1/1 links to Cat-2:Port-2/1 at cost 19, and Cat-1:Port-1/2 links to Cat-4:Port-1/1 at cost 100; Cat-2:Port-2/2 links to Cat-4:Port-1/2 at cost 19, and Cat-2:Port-2/3 links to Cat-3:Port-1/1 at cost 100; Cat-4:Port-2/1 links to Cat-3:Port-1/2 at cost 19.)
Example 6-2 shows the output of show spantree on Cat-1 for VLAN 1.
Example 6-2 Locating the Root Bridge with show spantree on Cat-1 for VLAN 1
Cat-1 (enable) show spantree
VLAN 1
Spanning tree enabled
Spanning tree type ieee
Although this information indicates that the Root Bridge has a BID of 100.00-E0-F9-16-
28-00, locating the specific MAC address 00-E0-F9-16-28-00 in a large network can be
difficult. One approach is to maintain a list of all MAC addresses assigned to all Catalysts,
a tedious and error-prone activity. A more effective approach is to simply use the output of
show spantree to walk the network until you locate the Root Bridge. By looking at the
Designated Root Port field, you can easily determine that the Root Bridge is located
somewhere out Port 1/1. By consulting our topology diagram (or using the show cdp
neighbor command), you can determine that Cat-2 is the next-hop switch on Port 1/1.
Then, Telnet to Cat-2 and issue the show spantree command as in Example 6-3.
Example 6-3 Locating the Root Bridge with show spantree on Cat-2 for VLAN 1
Cat-2 (enable) show spantree
VLAN 1
Spanning tree enabled
Spanning tree type ieee
Cat-2's Root Port is Port 2/2. After determining Port 2/2's neighboring bridge (Cat-4),
Telnet to Cat-4 and issue the show spantree command as in Example 6-4.
Example 6-4 Locating the Root Bridge with show spantree on Cat-4 for VLAN 1
Cat-4 (enable) show spantree
VLAN 1
Spanning tree enabled
Spanning tree type ieee
Because Cat-4's Root Port is 2/1, you will next look at Cat-3 (see Example 6-5).
Example 6-5 Locating the Root Bridge with show spantree on Cat-3 for VLAN 1
Cat-3 (enable) show spantree
VLAN 1
Spanning tree enabled
Spanning tree type ieee
Several fields highlight the fact that Cat-3 is the Root Bridge:
The Root Port is Port 1/0. Note that Catalyst 4000s, 5000s, and 6000s do not have a
physical port labeled 1/0. Instead, the NMP software uses a reference to the logical
console port, SC0, as a logical Root Port.
The local BID matches the Root Bridge BID.
The Root Path Cost is zero.
The timer values match.
All ports are in the Forwarding state.
The search is over: you have found the Root Bridge located at Cat-3.
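The walk just performed is a pointer chase: follow each switch's Root Port to its neighbor until you reach a switch whose Root Port is itself (shown as the logical port 1/0). A sketch (illustrative Python; the neighbor map encodes the topology of Figure 6-14 as read from the show spantree outputs):

```python
# Walking toward the Root Bridge: each switch's Root Port points at a
# neighbor; the Root Bridge points at nobody (logical port 1/0), which
# this model represents as the switch pointing at itself.

ROOT_PORT_NEIGHBOR = {
    "Cat-1": "Cat-2",     # Root Port 1/1 faces Cat-2
    "Cat-2": "Cat-4",     # Root Port 2/2 faces Cat-4
    "Cat-4": "Cat-3",     # Root Port 2/1 faces Cat-3
    "Cat-3": "Cat-3",     # Root Bridge: Root Port is the logical 1/0
}

def find_root(start):
    hops = [start]
    while ROOT_PORT_NEIGHBOR[hops[-1]] != hops[-1]:
        hops.append(ROOT_PORT_NEIGHBOR[hops[-1]])
    return hops

path = find_root("Cat-1")
assert path == ["Cat-1", "Cat-2", "Cat-4", "Cat-3"]
print("Root Bridge:", path[-1])
```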
Configuration BPDUs
All of the BPDUs discussed so far (and the vast majority of BPDUs on a healthy network)
are Configuration BPDUs. Figure 6-15 illustrates a BPDU's protocol format.
Figure 6-15 Configuration BPDU Decode
NOTE For simplicity, the chapter has so far ignored the fact that there are two types of BPDUs and
simply has used the term BPDU. However, recognize that all of these cases were referring
to Configuration BPDUs. The second type of BPDU, the Topology Change BPDU, is
discussed in the next section.
The decode in Figure 6-15 was captured and displayed by the NetXRay software from Network
Associates (formerly Network General). Although considerably newer versions are available for
sale, the version shown in Figure 6-15 is useful because it provides a very easy-to-read
representation and decode of the Spanning-Tree Protocol. At the top of the screen you can
observe the Ethernet 802.3 header. The source address is the MAC address of the individual port
sending the BPDU. Every port on a Catalyst uses a unique Source MAC Address value for
BPDUs sent out that port. Note the difference between this MAC address and the MAC address
used to create the BID. The source MAC address is different on every Catalyst port. The BID is
a global, box-wide value (within a single VLAN) that is formed from a MAC address located
on the supervisor card or backplane. The source MAC is used to build the frame that carries the
BPDU, whereas the BID's MAC is contained within the actual Configuration BPDU.
The Destination MAC Address uses the well-known STP multicast address of 01-80-C2-00-00-00. The Length field contains the length of the 802.2 LLC (Logical Link Control) header, BPDU, and pad that follows. Note that the CRC shown at the bottom of the screen is also part of the 802.3 encapsulation (specifically, the 802.3 trailer).
Two Types of BPDUs 185
Below the 802.3 header lies the 802.2 LLC header. This 3-byte header consists of three fields that essentially identify the payload (in this case, a BPDU). The IEEE has reserved the DSAP (destination service access point) and SSAP (source service access point) value 0x42 to signify STP. This value has the unique advantage of being the same regardless of bit ordering (0x42 equals 0100 0010 in binary), avoiding confusion in environments that use translational bridging. Don't worry about the next byte, the control byte. It turns out that every non-SNA protocol you can name (including STP) always uses the value 0x03 to represent an Unnumbered Information (UI) frame.
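The claim that 0x42 survives bit reordering is easy to check for yourself. The following Python sketch (purely an illustration; nothing here comes from the Catalyst software) builds the 3-byte LLC header and reverses the bit order of a byte the way translational bridging can:

```python
# 802.2 LLC header that precedes every BPDU: DSAP = SSAP = 0x42
# identifies STP, and control byte 0x03 marks a UI frame.
llc_header = bytes([0x42, 0x42, 0x03])

def reverse_bits(byte: int) -> int:
    """Reverse the bit ordering of a single byte."""
    return int(f"{byte:08b}"[::-1], 2)

# 0x42 = 0100 0010 is a palindrome in binary, so the SAP value
# reads the same under either bit-ordering convention.
same_either_way = reverse_bits(0x42) == 0x42
```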
The lower two-thirds of the output contains the actual BPDU. Configuration BPDUs consist of the following 12 fields (although many displays break the two BIDs out into separate subfields as shown in Figure 6-15):
- Protocol ID: Always 0. Future enhancements to the protocol might cause the Protocol ID value to increase.
- Version: Always 0. Future enhancements to the protocol might cause the Version value to increase.
- Type: Determines which of the two BPDU formats this frame contains (Configuration BPDU or TCN BPDU). See the next section, "Topology Change Notification BPDUs," for more detail.
- Flags: Used to handle changes in the active topology; covered in the next section on Topology Change Notifications.
- Root BID (Root ID in Figure 6-15): Contains the Bridge ID of the Root Bridge. After convergence, all Configuration BPDUs in the bridged network should contain the same value for this field (for a single VLAN). NetXRay breaks out the two BID subfields: Bridge Priority and bridge MAC address. See the "Step One: Elect One Root Bridge" section for more detail.
- Root Path Cost: The cumulative cost of all links leading to the Root Bridge. See the earlier "Path Cost" section for more detail.
- Sender BID (Bridge ID in Figure 6-15): The BID of the bridge that created the current BPDU. This field is the same for all BPDUs sent by a single switch (for a single VLAN), but it differs between switches. See the "Step Three: Elect Designated Ports" section for more detail.
- Port ID: Contains a unique value for every port. Port 1/1 contains the value 0x8001, whereas Port 1/2 contains 0x8002 (although the numbers are grouped into blocks based on slot numbers and are not consecutive). See the "Load Balancing" section of Chapter 7 for more detail.
- Message Age: Records the time since the Root Bridge originally generated the information that the current BPDU is derived from. If a bridge loses connectivity to the Root Bridge (and hence stops receiving BPDU refreshes), it needs to increment this counter in any BPDUs it sends to signify that the data is old. Encoded in 256ths of a second.
- Max Age: Maximum time that a BPDU is saved. Also influences the bridge table aging timer during the Topology Change Notification process (discussed later). See the "Three STP Timers" section for more detail. Encoded in 256ths of a second.
- Hello Time: Time between periodic Configuration BPDUs. The Root Bridge sends a Configuration BPDU on every active port every Hello Time seconds. This causes the other bridges to propagate BPDUs throughout the bridged network. See the "Three STP Timers" section for more detail. Encoded in 256ths of a second.
- Forward Delay: The time spent in the Listening and Learning states. Also influences timers during the Topology Change Notification process (discussed later). See the "Three STP Timers" section for more detail. Encoded in 256ths of a second.
Table 6-6 summarizes the Configuration BPDU fields.
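To make the field layout concrete, here is a minimal Python sketch that unpacks the 35-byte body of a Configuration BPDU according to the 12-field layout just described. The function name and dictionary keys are illustrative, not part of any Catalyst or 802.1D API:

```python
import struct

def parse_config_bpdu(data: bytes) -> dict:
    """Decode the 35-byte body of an 802.1D Configuration BPDU."""
    (proto, version, bpdu_type, flags,
     root_bid, root_cost, sender_bid, port_id,
     msg_age, max_age, hello, fwd_delay) = struct.unpack("!HBBBQIQHHHHH", data)
    return {
        "protocol_id": proto,             # always 0
        "version": version,               # always 0
        "type": bpdu_type,                # 0x00 = Configuration, 0x80 = TCN
        "tca": bool(flags & 0x80),        # bit 7: Topology Change Ack
        "tc": bool(flags & 0x01),         # bit 0: Topology Change
        "root_priority": root_bid >> 48,  # high-order 16 bits of the BID
        "root_mac": root_bid & 0xFFFFFFFFFFFF,
        "root_path_cost": root_cost,
        "sender_priority": sender_bid >> 48,
        "sender_mac": sender_bid & 0xFFFFFFFFFFFF,
        "port_id": port_id,
        # The four timers are encoded in 256ths of a second
        "message_age": msg_age / 256,
        "max_age": max_age / 256,
        "hello_time": hello / 256,
        "forward_delay": fwd_delay / 256,
    }
```

Feeding this a BPDU built with the default timers (Max Age 20, Hello 2, Forward Delay 15) returns those values in seconds, since each is carried on the wire as a count of 1/256-second units.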
The TCN BPDU is much simpler than the Configuration BPDU illustrated in Figure 6-15 and consists of only three fields. A TCN BPDU is identical to the first three fields of a Configuration BPDU with the exception of a single bit in the Type field. After all, at least one bit is needed to say "this is a TCN BPDU, not a Configuration BPDU." Therefore, the Type field can contain one of two values:
- 0x00 (Binary: 0000 0000): Configuration BPDU
- 0x80 (Binary: 1000 0000): Topology Change Notification (TCN) BPDU
That's it. TCN BPDUs don't carry any additional information.
Figure 6-17 TCN BPDUs are Required to Update Bridge Tables More Quickly
[Figure 6-17: Cat-A (the Root Bridge) connects via Segments 1 and 2 to Cat-B and Cat-C, which share Segment 3 through a hub; Host-D (MAC DD-DD-DD-DD-DD-DD) and Host-E (MAC EE-EE-EE-EE-EE-EE) attach to the network, and each switch's bridge table entries for these two MAC addresses are shown.]
Suppose that Host-D is playing Doom with Host-E. As discussed earlier in Figure 6-12, the traffic from Host-D flows directly through Cat-B to reach Host-E (Step 1). Assume that the Ethernet transceiver on Cat-B:Port-1/2 falls out (Step 2). As discussed earlier, Cat-C:Port-1/2 takes over as the Designated Port in 50 seconds. However, without TCN BPDUs, the game continues to be interrupted for another 250 seconds (4 minutes, 10 seconds). Why is this the case? Prior to the failure, the bridging table entries for MAC address EE-EE-EE-EE-EE-EE on all three switches appeared as documented in Table 6-7.
Topology Change Process 189
In other words, all frames destined for Host-E before the failure had to travel counterclockwise around the network because Cat-C:Port-1/2 was Blocking. When Cat-B:Port-1/2 fails, Cat-C:Port-1/2 takes over as the Designated Port. This allows traffic to start flowing in a clockwise direction and reach Host-E. However, the bridging tables in all three switches still point in the wrong direction. In other words, it appears to the network as if Host-E has moved, and the bridging tables still require updating. One option is to wait for the natural timeout of entries in the bridging table. However, because the default address timeout is 300 seconds, this unfortunately results in the 5-minute outage calculated previously.
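The arithmetic behind these numbers can be captured in a few lines of Python. This is a back-of-the-envelope sketch using the default 802.1D timer values, not a simulation of real bridge behavior:

```python
MAX_AGE = 20        # seconds
FORWARD_DELAY = 15  # seconds, spent once in Listening and again in Learning
BRIDGE_AGING = 300  # default bridge table address timeout

# Designated Port failover: Max Age to expire the stale BPDU,
# then Forward Delay twice (Listening + Learning).
stp_failover = MAX_AGE + 2 * FORWARD_DELAY     # 50 seconds

# Without TCNs, the stale bridge table entries live out the full
# 300-second aging period, so the outage lasts even longer:
extra_outage = BRIDGE_AGING - stp_failover     # 250 more seconds
total_outage = stp_failover + extra_outage     # 300 seconds (5 minutes)
```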
TCN BPDUs are a fairly simple way to improve this convergence time (and allow us to continue playing Doom sooner). TCN BPDUs work closely with Configuration BPDUs as follows:
1 A bridge originates a TCN BPDU in two conditions:
- It transitions a port into the Forwarding state and it has at least one Designated Port.
- It transitions a port from either the Forwarding or Learning states to the Blocking state.
These situations constitute a change in the active topology and require that notification be sent to the Root Bridge. Assuming that the current bridge is not the Root Bridge, it begins this notification process by sending a TCN BPDU out its Root Port. It continues sending the TCN BPDU every Hello Time interval until the TCN message is acknowledged (note: this is the locally configured Hello Time, not the Hello Time distributed by the Root Bridge in Configuration BPDUs).
2 The upstream bridge receives the TCN BPDU. Although several bridges might hear the TCN BPDU (because they are directly connected to the Root Port's segment), only the Designated Port accepts and processes the TCN BPDU.
3 The upstream bridge sets the Topology Change Acknowledgment flag in the next Configuration BPDU that it sends downstream (out the Designated Port). This acknowledges the TCN BPDU received in the previous step and causes the originating bridge to cease generating TCN BPDUs.
4 The upstream bridge propagates the TCN BPDU out its Root Port (the TCN BPDU is now one hop closer to the Root Bridge).
5 Steps 2 through 4 continue until the Root Bridge receives the TCN BPDU.
6 The Root Bridge then sets the Topology Change Acknowledgment flag (to acknowledge the TCN BPDU sent by the previous bridge) and the Topology Change flag in the next Configuration BPDU that it sends out.
7 The Root Bridge continues to set the Topology Change flag in all Configuration BPDUs that it sends out for a total of Forward Delay + Max Age seconds (default = 35 seconds). This flag instructs all bridges to shorten their bridge table aging process from the default value of 300 seconds to the current Forward Delay value (default = 15 seconds).
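The hop-by-hop relay in Steps 2 through 5 amounts to a walk up the tree of Root Ports. The Bridge class below is purely illustrative (no such class exists in the Catalyst software); it just shows a TCN climbing one upstream bridge at a time until the Root Bridge answers with the TC flag:

```python
class Bridge:
    """Toy model: each non-Root Bridge knows the upstream bridge
    reachable through its Root Port."""
    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream  # None means this is the Root Bridge

    def notify_topology_change(self, path=None):
        """Relay a TCN toward the Root Bridge (Steps 1-5); the Root
        then sets the TC flag for Max Age + Forward Delay seconds
        (Steps 6-7)."""
        path = (path or []) + [self.name]
        if self.upstream is None:
            max_age, forward_delay = 20, 15
            return path, max_age + forward_delay
        return self.upstream.notify_topology_change(path)

root = Bridge("Cat-A")
mdf = Bridge("Cat-B", upstream=root)
idf = Bridge("Cat-D", upstream=mdf)
tcn_path, tc_flag_seconds = idf.notify_topology_change()
```

With the default timers, the Root Bridge advertises the TC flag for 35 seconds, regardless of how many hops the TCN climbed.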
Figure 6-18 summarizes the use of these bits during the seven-step TCN procedure (the step numbers are circled). With non-Root Bridge A downstream of non-Root Bridge B, which is in turn downstream of the Root Bridge, the flags carried in each Configuration BPDU are:

Step  BPDU Type  TCA  TC
3     Config     1    0
6     Config     1    1
7     Config     0    1
      Config     0    1
      Config     0    1

The TC flag continues for a total of Max Age + Forward Delay seconds (default = 35).
Applying these steps to the topology in Figure 6-17 (for simplicity, the steps are not shown in Figure 6-17), Cat-B and Cat-C send TCN BPDUs out their Port 1/1 (Step 1). Because the upstream bridge is also the Root Bridge, Steps 2 and 5 occur simultaneously (and allow Steps 3 and 4 to be skipped). In the next Configuration BPDU that it sends, the Root Bridge sets the TCN ACK flag to acknowledge receipt of the TCN from both downstream Catalysts. Cat-A also sets the Topology Change flag for 35 seconds (assume the default Forward Delay and Max Age) to cause the bridging tables to update more quickly (Steps 6 and 7). All three switches receive the Topology Change flag and age out their bridging tables in 15 seconds.
Notice that shortening the aging time to 15 seconds does not flush the entire table; it just accelerates the aging process. Devices that continue to speak during the 15-second age-out period never leave the bridging table. However, if Host-D tries to send a frame to Host-E in 20 seconds (assume that Host-E has been silent), it is flooded to all segments by the switches because the EE-EE-EE-EE-EE-EE MAC address is no longer in any of the bridging tables. As soon as this frame reaches Host-E and Host-E responds, the switches learn the new bridge table values that are appropriate for the new topology.
Table 6-8 shows the bridge table entries for MAC address EE-EE-EE-EE-EE-EE on all three bridges after the new topology has converged and traffic has resumed.
At this point, connectivity between Host-D and Host-E is reestablished and our Doom
Deathmatch can resume. Notice that the TCN BPDU reduced the failover time from 5
minutes to 50 seconds.
As previously mentioned in the "Configuration BPDUs" section, both Flag fields are stored in the same octet of a Configuration BPDU. This octet is laid out as illustrated in Figure 6-19.
Bit 7: TCA | Bits 6-1: Reserved | Bit 0: TC
As discussed in the previous section, the TCA flag is set by the upstream bridge to tell the downstream bridge to stop sending TCN BPDUs. The TC flag is set by the Root Bridge to shorten the bridge table age-out period from 300 seconds to Forward Delay seconds.
[Figure 6-20: Cat-1 through Cat-7 in a looped topology; links are labeled with their costs (19 everywhere except the 10BaseT link at 100), and Cat-4 is marked as the Root Bridge.]
Figure 6-20 illustrates a network of seven switches connected in a highly redundant (that is, looped) configuration. Link costs are indicated: all are Fast Ethernet (cost of 19) except for the vertical link on the far left, which is 10BaseT (cost of 100).
Assuming that Cat-4 wins the Root War, Figure 6-21 shows the active topology that results.
Using Spanning Tree in Real-World Networks 193
Figure 6-21 Complex Network with Central Root Bridge and Active Topology
[Figure 6-21: Cat-4 (the Root Bridge) sits at the center with four branches (Branch-A through Branch-D) radiating outward through Cat-1, Cat-2, Cat-3, Cat-5, Cat-6, and Cat-7; Host-A attaches to Cat-7 and Host-B to Cat-6.]
The setup in Figure 6-21 clearly illustrates the basic objective of the Spanning-Tree Protocol: make one bridge the center of the universe and then have all other bridges locate the shortest path to that location (all roads lead to Rome). This results in an active topology consisting of spoke-like branches that radiate out from the Root Bridge.
Notice that the Root Bridge is acting as the central switching station for all traffic between the four branches and must be capable of carrying this increased load. For example, Cat-7 and Cat-5 on Branch-D must send all traffic through the Root Bridge (Cat-4) to reach any of the other switches. In other words, don't use your slowest bridge in place of Cat-4!
Figure 6-21 also illustrates the importance of a centrally located Root Bridge. Consider the traffic between Host-A on Cat-7 and Host-B on Cat-6. When these two users want to fire up
a game of Doom, the traffic must cross four bridges despite the fact that Cat-7 and Cat-6 are directly connected. Although this might seem inefficient at first, it could be much worse! For example, suppose Cat-1 happened to win the Root War as illustrated in Figure 6-22.
Figure 6-22 Complex Network with Inefficient Root Bridge and Active Topology
[Figure 6-22: Cat-1, now the Root Bridge, sits at the top; the network converges into two long branches (Branch-A and Branch-B) through Cat-2, Cat-3, Cat-4, Cat-5, and Cat-6, with Host-B and Host-A at the far ends on Cat-6 and Cat-7.]
In this scenario, the network has converged into two branches with all traffic flowing through the Root Bridge. However, notice how suboptimal the flows are: Doom traffic between Host-A and Host-B must now flow through all seven bridges!
Deterministic Root Bridge Placement 195
In the event that I haven't convinced you to avoid a randomly chosen Root Bridge, let me
point out that the deck is stacked against you. Assume that Cat-1 is a vintage Cisco MGS
or AGS router doing software bridging (Layer 2 forwarding capacity equals about 10,000
packets per second) and the remaining six devices are Catalyst 5500 or 6000 switches
(Layer 2 forwarding capacity equals millions of packets per second). Without even
thinking, I can guarantee you that the MGS becomes the Root Bridge every time by default!
How can I be so sure? Well, what determines who wins the Root War? The lowest BID. And
as you saw earlier, BIDs are composed of two subfields: Bridge Priority and a MAC address. Because all bridges default to a Bridge Priority of 32,768, the bridge with the lowest MAC address becomes the Root Bridge by default. Catalysts use MAC addresses that begin with OUIs
like 00-10-FF and 00-E0-F9 (for example, MAC address 00-10-FF-9F-85-00). All Cisco
MGSs use Ciscos traditional OUI of 00-00-0C (for example, MAC address 00-00-0C-58-
AF-C1). 00-00-0C is about as low an OUI as you can have; there are only twelve numbers mathematically lower. Therefore, your MGS always has a lower MAC address than any
Catalyst you could buy, and it always wins the Root War by default in a Cisco network (and
almost any other network on the planet).
In other words, by ignoring Root Bridge placement, you can lower the throughput on your
network by a factor of 1,000! Obviously, manually controlling your Root Bridge is critical
to good Layer 2 performance.
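Because a BID compares first on Bridge Priority and then on MAC address, the Root War amounts to a simple tuple comparison. This hypothetical sketch (not anything in a real bridge) shows why the old router wins:

```python
def bid(priority, mac):
    """Model a Bridge ID as (priority, MAC); the numerically
    lowest BID wins the Root War."""
    return (priority, int(mac.replace("-", ""), 16))

# Both devices use the default priority of 32,768, so the
# MAC address (and therefore the OUI) decides the election.
mgs = bid(32768, "00-00-0C-58-AF-C1")       # Cisco's traditional OUI
catalyst = bid(32768, "00-10-FF-9F-85-00")  # a Catalyst OUI
root_bridge = min(mgs, catalyst)            # the MGS wins
```

Lowering the priority on any other bridge, even by one, overrides the MAC comparison entirely, which is exactly the fix described next.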
Because the high-order 16 bits of the BID consist of the Bridge Priority, lowering this number by even one (from 32,768 to 32,767) allows a bridge to always win the Root War against other bridges that are using the default value.
Bridge Priority is controlled through the set spantree priority command. The syntax for
this command is:
set spantree priority priority [vlan]
Although the vlan parameter is optional, I suggest that you get in the habit of always typing
it (otherwise, someday you will accidentally modify VLAN 1 when you intended to modify
some other VLAN). The text revisits the vlan parameter in more detail later. For now, just
assume VLAN 1 in all cases.
TIP Almost all of the Spanning Tree set and show commands support an optional vlan
parameter. If you omit this parameter, VLAN 1 is implied. Get in the habit of always
explicitly entering this parameter, even if you are only interested in VLAN 1. That way you
don't accidentally modify or view the wrong VLAN.
Suppose that you want to force Cat-4 to win the Root War. You could Telnet to the switch
and enter:
set spantree priority 100 1
This lowers the priority to 100 for VLAN 1, causing Cat-4 to always win against other
switches with the default of 32,768 (including the MGS with its lower MAC address).
But what happens if Cat-4 fails? Do you really want to fail back to the MGS? Probably not.
Make Cat-2 the secondary Root Bridge by entering the following on Cat-2:
set spantree priority 200 1
As long as Cat-4 is active, Cat-2 never wins the Root War. However, as soon as Cat-4 dies, Cat-2 always takes over.
TIP Notice that I used 100 for the primary Root Bridge and 200 for the secondary. I have found
this numbering convention to work well in the real world and recommend that you adopt it.
It is easy to understand and, more importantly, easy to remember. For example, if you ever
look at a show command and see that your current Root Bridge has a Bridge Priority of 200,
you instantly know that something has gone wrong with the primary Root Bridge. This
scheme also comes in handy when the subject of load balancing is discussed later.
To make a particular Catalyst the Root Bridge for VLAN 1, simply Telnet to that device and
enter the following:
set spantree root 1
This causes the Catalyst to look at the Bridge Priority of the existing Root Bridge. If this value
is higher than 8,192, the set spantree root macro programs the local Bridge Priority to 8,192.
If the existing Root Bridge is less than 8,192, the macro sets the local bridge priority to one
less than the value used by the existing Root Bridge. For example, if the existing Root Bridge
is using a Bridge Priority of 100, set spantree root sets the local Bridge Priority to 99.
NOTE The current documentation claims that set spantree root sets the value to 100 less than the
current value if 8,192 is not low enough to win the Root War. However, I have always
observed it to reduce the value only by 1.
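The macro's priority calculation, as observed above (reduce by 1 rather than the documented 100), can be summarized in a few lines. This is an illustrative model of the behavior described, not the actual Catalyst code:

```python
def spantree_root_priority(existing_root_priority):
    """Priority that the set spantree root macro programs locally:
    8,192 if that is low enough to win; otherwise, one less than
    the existing Root Bridge's priority (observed behavior)."""
    if existing_root_priority > 8192:
        return 8192
    return existing_root_priority - 1
```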
To make another Catalyst function as a backup Root Bridge, Telnet to that device and enter
the following:
set spantree root 1 secondary
This lowers the current Catalyst's Bridge Priority to 16,384. Because this value is higher than the value used by the primary, but lower than the default of 32,768, it is a simple but effective way to provide deterministic Root Bridge failover.
The dia and hello parameters can be used to automatically adjust the STP timer values according to the recommendations spelled out in the 802.1D specification. Tuning STP timers is discussed in detail in the "Fast STP Convergence" section of Chapter 7.
This might seem like a subtle point, but set spantree root is not a normal command; it is a macro that programs other commands. In other words, set spantree root never appears in a show config listing. However, the results of entering set spantree root do appear in the configuration. For example, suppose that you run the macro with set spantree root 1. Assuming that the existing Root Bridge is using a Bridge Priority higher than 8,192, the macro automatically issues a set spantree priority 8192 1 command. After the set spantree priority command is written to NVRAM, there is no evidence that the macro was ever used.
However, just because set spantree root is a macro, don't let that fool you into thinking that it is unnecessary fluff. On the contrary, using the set spantree root macro has several benefits over manually entering the commands yourself:
- It can be easier to use.
- It doesn't require you to remember lots of syntax.
- If you must resort to timer tuning, set spantree root is safer than manually setting the timers because it calculates the values recommended in the 802.1D spec. See the "Fast STP Convergence" section of Chapter 7 for more detail on timer tuning.
Figure 6-23 PVST Allows You to Create Different Active Topologies for Each VLAN
[Figure 6-23: the same physical network shown twice, with a different Root Bridge, and therefore a different active topology, for each VLAN.]
In fact, this is why there is a vlan parameter on the set spantree command that you saw in
the previous section. Every VLAN could have a different set of active paths. Every VLAN
could have different timer values. You get the idea.
The fact that Cisco uses one instance of STP per VLAN has two important implications:
- It can allow you to tap into many wonderful features such as load balancing and per-VLAN flows.
- It can make your life miserable (if you don't know what you are doing).
The goal here is to show you how to maximize the first point and avoid the second.
On the whole, having multiple instances of Spanning Tree gives you a phenomenal tool for
controlling your network. Be aware that many vendors only use one instance of STP for all
VLANs. Not only does this reduce your control, but it can lead to broken networks. For
example, suppose that the single instance of Spanning Tree in VLAN 1 determines the active topology for all VLANs. What happens if you then remove VLAN 5 from a link used by VLAN 1? If that link is selected as part of the active topology for all VLANs, VLAN 5 becomes partitioned. The VLAN 5 users are probably pretty upset about now.
Chapter 7 explores how to take advantage of the multiple instances of STP to accomplish
advanced tasks such as STP load balancing.
Exercises
This section includes a variety of questions and hands-on lab exercises. By completing
these, you can test your mastery of the material included in this chapter as well as help
prepare yourself for the CCIE written and lab tests. You can find the answers to the Review Questions and the Hands-On Lab Exercises in Appendix A, "Answers to End of Chapter Exercises."
Review Questions
1 Summarize the three-step process that STP uses to initially converge on an active
topology.
2 How many of the following items does the network shown in Figure 6-24 contain:
Root Bridges, Root Ports, Designated Ports? Assume all devices are operational.
[Figure 6-24: two switches, Cat-1 and Cat-2, each with 100 attached PCs.]
3 When running the Spanning-Tree Protocol, every bridge port saves a copy of the best
information it has heard. How do bridges decide what constitutes the best
information?
4 Why are Topology Change Notication BPDUs important? Describe the TCN
process.
5 How are Root Path Cost values calculated?
6 Assume that you install a new bridge and it contains the lowest BID in the network. Further assume that this device is running experimental Beta code that contains a severe memory leak and, as a result, reboots every 10 minutes. What effect does this have on the network?
7 When using the show spantree command, why might the timer values shown on the
line that begins with Root Max Age differ from the values shown on the Bridge Max
Age line?
8 Label the port types (RP=Root Port, DP=Designated Port, NDP=non-Designated
Port) and the STP states (F=Forwarding, B=Blocking) in Figure 6-25. The Bridge IDs
are labeled. All links are Fast Ethernet.
[Figure 6-25: IDF-Cat-5 (BID = 32768.00-90-92-55-55-55) and IDF-Cat-4 (BID = 32768.00-90-92-44-44-44) connect to MDF-Cat-2 (BID = 32768.00-90-92-22-22-22) and MDF-Cat-3 (BID = 32768.00-90-92-33-33-33), which connect to Server-Farm-Cat-1 (BID = 32768.00-90-92-11-11-11).]
9 What happens to the network in Figure 6-26 if Cat-4 fails?
[Figure 6-26: a topology including Cat-1, Cat-2, Cat-4, Cat-6, and Cat-7.]
Hands-On Lab
Build a network that resembles Figure 6-27.
[Figure 6-27: PC-1 (10.1.1.1) and PC-2 (10.1.1.2), plus PC-3, attached to Cat-1 and Cat-2, which are interconnected by two links on ports 1/1 and 1/2.]
Using only VLAN 1, complete the following steps:
1 Start a continuous ping (tip: under Microsoft Windows, use the ping -t ip_address
command) between PC-1 and PC-3. Break the link connecting PC-3 to Cat-2. After
reconnecting the link, how long does it take for the pings to resume?
2 Start a continuous ping between PC-1 and PC-2. As in Step 1, break the link between
PC-3 and Cat-2. Does this affect the trafc between PC-1 and PC-2?
3 Use the show spantree command on Cat-1 and Cat-2. What bridge is acting as the
Root Bridge? Make a note of the state of all ports.
4 Why is the 1/2 port on the non-Root Bridge Blocking? How did the Catalyst know to
block this port?
5 Start a continuous ping between PC-1 and PC-3. Break the 1/1 link connecting Cat-1
and Cat-2. How long before the trafc starts using the 1/2 link?
6 Reconnect the 1/1 link from Step 2. What happens? Why?
7 With the continuous ping from Step 3 still running, break the 1/2 link connecting Cat-
1 and Cat-2. What effect does this have?
The authors would like to thank Radia Perlman for graciously contributing her time
to review the material in this chapter.
This chapter covers the following key topics:
- Typical Campus Design: A Baseline Network: Introduces a baseline network for use throughout most of the chapter.
- STP Behavior in the Baseline Network: A Spanning Tree Review: Reviews the concepts introduced in Chapter 6, "Understanding Spanning Tree," while also introducing some advanced STP theory.
- Spanning Tree Load Balancing: Master this potentially confusing and poorly documented feature that can double available bandwidth for free. Four Spanning Tree load balancing techniques are discussed in detail.
- Fast STP Convergence: Seven techniques that can be used to improve on the default STP failover behavior of 30 to 50 seconds.
- Useful STP Display Commands: Several commands that can be extremely useful for understanding and troubleshooting STP's behavior in your network.
- Per-VLAN Spanning Tree Plus (PVST+): An important feature introduced by Cisco, PVST+ allows interoperability between traditional per-VLAN Spanning Tree (PVST) Catalysts and devices that support 802.1Q.
- Disabling STP: Explains how to disable Spanning Tree on Catalyst devices. Also considers reasons why this might be done and why it shouldn't be done.
- Tips and Tricks: Mastering STP: A condensed listing of Spanning Tree advice to help you avoid STP problems in your network.
Chapter 7: Advanced Spanning Tree
CAUTION As with Chapter 6, the examples in this chapter are designed to illustrate the intricacies of the
Spanning-Tree Protocol, not good design practices. For more information on Spanning Tree
and campus design principles, please refer to Chapter 11, Layer 3 Switching, Chapter 14,
Campus Design Models, Chapter 15, Campus Design Implementation, and Chapter 17,
Case Studies: Implementing Switches.
Figure 7-1 Typical Design for a Single Building in a Switched Campus Network
[Figure 7-1: Building 1 contains IDF switches C-1A and C-1B uplinked to a redundant MDF pair, C-1C and C-1D, which connect through the campus backbone to a Server Farm; Buildings 2 and 3 repeat the same pattern with their own IDF and MDF switches (C-2x and C-3x).]
Figure 7-1 shows a network that might be contained within several buildings of a large campus environment. The PCs and workstations directly connect to one of the Intermediate Distribution Frame (IDF) switches located on every floor. These IDF switches also link to Main Distribution Frame (MDF) switches that are often located in the building's basement or ground floor. Because the MDF is carrying traffic for the entire building, it is usually provisioned with a pair of redundant switches for increased bandwidth and reliability. The MDF switches then connect to the rest of the campus backbone, where a server farm is generally located.
By looking at this network from the point of view of a single IDF wiring closet, much of
the complexity can be eliminated as shown in Figure 7-2.
Typical Campus Design: A Baseline Network 207
[Figure 7-2: IDF switch Cat-D (BID 32,768.DD-DD-DD-DD-DD-DD) connects through Port 1/1 to Segment 3 and through Port 1/2 to Segment 4. MDF switches Cat-B (BID 32,768.BB-BB-BB-BB-BB-BB) and Cat-C (BID 32,768.CC-CC-CC-CC-CC-CC) terminate those segments on their 1/1 ports and join each other over Segment 5 between their 1/2 ports. Each MDF switch's 2/1 port connects over Segment 1 or Segment 2 to ports 1/1 and 1/2 of server-farm switch Cat-A (BID 32,768.AA-AA-AA-AA-AA-AA). Every link has a cost of 19.]
By reducing the clutter of multiple IDFs and MDFs, Figure 7-2 allows for a much clearer
picture of what is happening at the Spanning Tree level and focuses on the fact that most
campus designs utilize triangle-shaped STP patterns. (These patterns and their impact on
campus design are discussed in detail during Chapters 14, 15, and 17.) Furthermore,
because modern traffic patterns tend to be highly centralized, it eliminates the negligible level of IDF-to-IDF traffic present in most networks.
NOTE Chapter 14 discusses the differences between two common campus design models: the campus-wide VLANs model and the multilayer model. In campus-wide VLANs, the network consists of many large and overlapping triangle- and diamond-shaped patterns that can be simplified in the manner shown in Figure 7-2. Because these triangles and diamonds overlap and interconnect, this often leads to significant scaling problems. The multilayer model avoids these issues by using Layer 3 switching (routing) to break the network into many small triangles of Layer 2 connectivity. Because these triangles are isolated by the Layer 3 switching component, the network is invariably more scalable.
A detailed discussion of the differences between these two models is beyond the scope of
this chapter. Obviously, consult Chapter 14 for more information. However, it is worth
noting that this chapter often uses examples from the campus-wide VLANs model because
they present more Spanning Tree challenges and opportunities. Nevertheless, you should
generally consider using Layer 3 switching for reasons of scalability.
208 Chapter 7: Advanced Spanning Tree
Cat-D in Figure 7-2 is an IDF switch that connects to a pair of MDF switches, Cat-B and Cat-C.
The MDF switches connect through the campus backbone to a server farm switch, Cat-A.
TIP Utilize this technique of simplifying the network when trying to understand STP in your
network. When doing this, be sure to include paths where unusually heavy trafc loads are
common. Having a modular design where a consistent layout is used throughout the
network can make this much easier (see Chapters 14 and 15 for more information).
Step 2 The bridges consider Root Path Cost, the cumulative cost of the path to
the Root Bridge. Every non-Root Bridge uses this to locate a single,
least-cost path to the Root Bridge.
Step 3 If the cost values are equal, the bridges consider the BID of the
sending device.
Step 4 Port ID (a unique index value for every port in a bridge or switch) is
evaluated if all three of the previous criteria tie.
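The four-step evaluation above can be modeled as ordinary tuple comparison, with each BPDU reduced to its four decision criteria in order. This is a conceptual sketch, with small integers standing in for full BIDs:

```python
def bpdu_key(root_bid, root_path_cost, sender_bid, port_id):
    """Tuples compare element by element, which matches STP's
    decision sequence: lowest value at the first difference wins."""
    return (root_bid, root_path_cost, sender_bid, port_id)

# Same claimed Root Bridge, same cost: Step 3 (lower Sender BID)
# breaks the tie.
from_cat_b = bpdu_key(root_bid=1, root_path_cost=19, sender_bid=2, port_id=0x8001)
from_cat_c = bpdu_key(root_bid=1, root_path_cost=19, sender_bid=3, port_id=0x8001)
best = min(from_cat_b, from_cat_c)
```

Because the comparison stops at the first differing element, a lower Root Path Cost (Step 2) always outranks a lower Sender BID (Step 3), and so on down the list.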
As long as a port sees its own Configuration BPDU as the most attractive, it continues sending Configuration BPDUs. A port begins this process of sending Configuration BPDUs in what is called the Listening state. Although BPDU processing is occurring during the Listening state, no user traffic is being passed. After waiting for a period of time defined by the Forward Delay parameter (default = 15 seconds), the port moves into the Learning state. At this point, the port starts adding source MAC addresses to the bridging table, but all incoming data frames are still dropped. After another period equal to the Forward Delay, the port finally moves into the Forwarding state and begins passing end-user data traffic. However, if at any point during this process the port hears a more attractive BPDU, it immediately transitions into the Blocking state and stops sending Configuration BPDUs.
First, the bridges elect a single Root Bridge by looking for the device with the lowest Bridge
ID (BID). By default, all bridges use a Bridge Priority of 32,768, causing the lowest MAC
address to win this Root War. In the case of Figure 7-2, Cat-A becomes the Root Bridge.
TIP As with all Spanning Tree parameters, the lowest numeric Bridge ID value represents the
highest priority. To avoid the potential confusion of lowest value and highest priority, this
text always refers to values (in other words, the lower amount is preferred by STP).
Second, every non-Root Bridge elects a single Root Port, its port that is closest to the Root Bridge. Cat-B has to choose among three ports: Port 1/1 with a Root Path Cost of 57, Port 1/2 with a cost of 38, or Port 2/1 with a cost of 19. Obviously, Port 2/1 is the most attractive and becomes the Root Port. Similarly, Cat-C chooses Port 2/1. However, Cat-D calculates a Root Path Cost of 38 on both ports, a tie. This causes Cat-D to evaluate the third decision criterion, the Sending BID. Because Cat-B has a lower Sending BID than Cat-C, Cat-D:Port-1/1 becomes the Root Port.
Finally, a Designated Port is elected for every LAN segment (the device containing the
Designated Port is referred to as the Designated Bridge). By functioning as the only port
that both sends and receives traffic to/from that segment and the Root Bridge, Designated
Ports are the mechanism that actually implements a loop-free topology. It is best to analyze
Designated Port elections on a per-segment basis. In Figure 7-2, there are five segments.
Segment 1 is touched by two bridge ports: Cat-A:Port-1/1 at a cost of zero and Cat-B:Port-2/1
at a cost of 19. Because the directly-connected Root Bridge has a cost of zero,
Cat-A:Port-1/1 obviously becomes the Designated Port. A similar process elects Cat-A:Port-1/2
as the Designated Port for Segment 2. Segment 3 also has two bridge ports: Cat-B:Port-1/1
at a cost of 19 and Cat-D:Port-1/1 at a cost of 38. Because it has the lower cost,
Cat-B:Port-1/1 becomes the Designated Port. Using the same logic, Cat-C:Port-1/1 becomes the
Designated Port for Segment 4. In the case of Segment 5, there are once again two options
(Cat-B:Port-1/2 and Cat-C:Port-1/2); however, both are a cost of 19 away from the Root
Bridge. By applying the third decision criterion, both bridges determine that Cat-B:Port-1/2
should become the Designated Port because it has the lower Sending BID.
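Because the same comparison runs independently on every segment, the per-segment election is easy to sketch in code (again using placeholder BIDs rather than real addresses):

```python
# Sketch of Designated Port election, run separately for each LAN segment:
# the attached port advertising the lowest (Root Path Cost, Sending BID)
# wins for that segment.
def designated_port(segment_ports):
    """segment_ports: {(bridge, port): (root_path_cost, bridge_bid)}"""
    return min(segment_ports, key=segment_ports.get)

# Segment 5 from the text: both candidates advertise a cost of 19, so
# Cat-B's lower BID decides the election.
segment5 = {
    ("Cat-B", "1/2"): (19, (32768, "bb-bb")),
    ("Cat-C", "1/2"): (19, (32768, "cc-cc")),
}
print(designated_port(segment5))  # ('Cat-B', '1/2')
```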
Figure 7-3 shows the resulting active topology and port states.
STP Behavior in the Baseline Network: A Spanning Tree Review 211
Figure 7-3 Active Topology and Port States in the Baseline Network
[Diagram: Cat-A (Root Bridge) forwards on Designated Ports 1/1 and 1/2; Cat-B and Cat-C forward on Root Ports 2/1; Cat-B:Port-1/1, Cat-B:Port-1/2, and Cat-C:Port-1/1 are Forwarding Designated Ports; Cat-D:Port-1/1 is its Root Port; Cat-C:Port-1/2 and Cat-D:Port-1/2 are Blocking. Every link has a cost of 19.]
Two ports remain in the Blocking state: Cat-C:Port-1/2 and Cat-D:Port-1/2. These ports are
often referred to as non-Designated Ports. With these two ports Blocked, the remaining
ports provide a loop-free path from every segment to every other segment. The Designated
Ports are used to send traffic away from the Root Bridge, whereas Root Ports are used to
send traffic toward the Root Bridge.
NOTE From a technical perspective, it is possible to debate the correct ordering of Steps 2 and 3
in what I have called the 3-Step Spanning Tree Convergence Process. Because 802.1D (the
Spanning Tree standards document) specifically excludes Designated Ports from the Root
Port election process, the implication is that Designated Ports must be determined first.
However, 802.1D also lists the Root Port selection process before the Designated Port
selection process in its detailed pseudo-code listing of the complete STP algorithm. This
text avoids such nerdy debates. The fact of the matter is that both occur constantly and at
approximately the same time. Therefore, from the perspective of learning how the protocol
operates, the order is irrelevant.
In addition to determining the path and direction of data forwarding, Root and Designated
Ports also play a key role in the sending of BPDUs. In short, Designated Ports send
Configuration BPDUs, whereas Root Ports send TCN BPDUs. The following sections
explore the two types of BPDUs in detail.
[Figure 7-4 diagram: Cat-A (Root Bridge) originates Configuration BPDUs out its Designated Port 1/1; they cross a hub to Cat-B's Root Port 1/1 and are relayed out Cat-B's Designated Port 1/2, across a second hub, to Cat-D's Root Port 1/1.]
Figure 7-4 shows Cat-A, the Root Bridge, originating Configuration BPDUs every two
seconds. As these arrive on Cat-B:Port-1/1 (the Root Port for Cat-B), Configuration BPDUs
are sent out Cat-B's Designated Ports, in this case Port 1/2.
Several observations can be made about the normal processing of Configuration BPDUs:
Configuration BPDUs flow away from the Root Bridge.
Root Ports receive Configuration BPDUs.
Figure 7-5 The Root Bridge Failed Just Before Cat-C Was Connected
[Figure 7-5 diagram: Cat-A, the Root Bridge, fails (step 1); Cat-C is then connected (step 2) to the segment it shares with Cat-B and Cat-D.]
Figure 7-6 illustrates the conversation that ensues between Cat-C and Cat-B.
[Figure 7-6 diagram: Cat-C:Port-1/1 sends a BPDU announcing "I am the Root!" (step 1); Cat-B replies on the shared segment with a BPDU refuting it, "No! Cat-A is the Root!" (step 2).]
As discussed in Chapter 6, Cat-C initially assumes it is the Root Bridge and immediately
starts sending BPDUs to announce itself as such. Because the Root Bridge is currently
down, Cat-B:Port-1/2 has stopped sending Configuration BPDUs as a part of the normal
processing. However, because Cat-B:Port-1/2 is the Designated Port for this segment, it
immediately responds with a Configuration BPDU announcing Cat-A as the Root Bridge.
By doing so, Cat-B prevents Cat-C from accidentally trying to become the Root Bridge or
creating loops in the active topology.
The sequence illustrated in Figure 7-6 raises the following points about Configuration
BPDU exception processing:
Designated Ports can respond to inferior Configuration BPDUs at any time.
As long as Cat-B saves a copy of Cat-A's information, Cat-B continues to refute any
inferior Configuration BPDUs.
Cat-A's information ages out on Cat-B in Max Age seconds (default=20 seconds). In
the case of Figure 7-5, Cat-B begins announcing itself as the Root Bridge at that time.
By immediately refuting less attractive information, the network converges more
quickly. Consider what might happen if Cat-B only used the normal conditions to send
a Configuration BPDU: Cat-C would have 20 seconds to incorrectly assume that it
was functioning as the Root Bridge and might inadvertently open up a bridging loop.
Even if this did not result in the formation of a bridging loop, it could lead to
unnecessary Root and Designated Port elections that could interrupt traffic and
destabilize the network.
Because Cat-D:Port-1/1 is not the Designated Port for this segment, it does not send
a Configuration BPDU to refute Cat-C.
TIP The TCN process is discussed in considerably more detail in Chapter 6 (see the section
Topology Change Notification BPDUs).
STP Timers
The Spanning-Tree Protocol provides three user-configurable timers: Hello Time, Forward
Delay, and Max Age. To avoid situations where each bridge is using a different set of timer
values, all bridges adopt the values specified by the Root Bridge. The current Root Bridge
places its timer values in the last three fields of every Configuration BPDU it sends. Other
bridges do not alter these values as the BPDUs propagate throughout the network.
Therefore, timer values can only be adjusted on Root Bridges.
TIP Avoid the frustration of trying to modify timer values from non-Root Bridges; they can
only be changed from the Root Bridge. However, do not forget to modify the timer values
on any backup Root Bridges so that you can keep a consistent set of timers even after a
primary Root Bridge failure.
The Hello Time timer is used to control the sending of BPDUs. Its main duty is to control
how often the Root Bridge originates Configuration BPDUs; however, it also controls how
often TCN BPDUs are sent. By repeating TCN BPDUs every Hello Time seconds until a
Topology Change Acknowledgement (TCA) flag is received from the upstream bridge,
TCN BPDUs are propagated using a reliable mechanism. 802.1D, the Spanning Tree
standard document, specifies an allowable range of 1 to 10 seconds for Hello Time. The syntax
for changing the Hello Time is as follows (see the section Lowering Hello Time to One
Second for more information):
set spantree hello interval [vlan]
The Forward Delay timer primarily controls the length of time a port spends in the
Listening and Learning states. It is also used in other situations related to the topology
change process. First, when the Root Bridge receives a TCN BPDU, it sets the Topology
Change (TC) flag for Forward Delay+Max Age seconds. Second, this action causes all
bridges in the network to shorten their bridge table aging periods from 300 seconds to
Forward Delay seconds. Valid Forward Delay values range from 4 to 30 seconds with the
default being 15 seconds. To change the Forward Delay, use the following command (see
the section Tuning Forward Delay for more information):
set spantree fwddelay delay [vlan]
The Max Age timer controls the maximum length of time that a bridge port saves
Configuration BPDU information. This allows the network to revert to a less attractive
topology when the more attractive topology fails. As discussed in the previous paragraph,
Max Age also plays a role in controlling how long the TC flag remains set after the Root
Bridge receives a TCN BPDU. Valid Max Age values are 6 to 40 seconds with the default
being 20 seconds. The command for changing the Max Age time is shown in the following
(see the section Tuning Max Age for more information):
set spantree maxage agingtime [vlan]
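The three timers cannot be tuned entirely independently: the 802.1D standard constrains how they may be combined, requiring 2 x (Forward Delay - 1.0) >= Max Age >= 2 x (Hello Time + 1.0). A quick sanity check of the defaults (a sketch based on the 802.1D formula, not a Catalyst validation routine):

```python
# 802.1D relationship among the three timers (all values in seconds):
#   2 * (Forward_Delay - 1.0) >= Max_Age >= 2 * (Hello_Time + 1.0)
def timers_valid(hello=2, forward_delay=15, max_age=20):
    return 2 * (forward_delay - 1.0) >= max_age >= 2 * (hello + 1.0)

print(timers_valid())           # True: 28 >= 20 >= 6 holds for the defaults
print(timers_valid(max_age=40)) # False: Max Age of 40 needs a larger Forward Delay
```

Keeping the timers inside this relationship ensures that stale BPDU information ages out before ports finish transitioning, which is why you should not tune one timer without checking the other two.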
Configuration BPDUs also pass a fourth time-related value, the Message Age field (don't
confuse this with Max Age). The Message Age field is not a periodic timer value; it
contains the length of time since a BPDU's information was first originated at the Root
Bridge. When Configuration BPDUs are originated by the Root Bridge, the Message Age
field contains the value zero. As other bridges propagate these BPDUs through the network,
the Message Age field is incremented by one at every bridge hop. Although 802.1D allows
for more precise timer control, in practice, bridges simply add one to the existing value,
resulting in something akin to a reverse TTL. If connectivity to the Root Bridge fails and
all normal Configuration BPDU processing stops, this field can be used to track the age of
any information that is sent during this outage as a part of the Configuration BPDU
exception processing discussed earlier.
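The reverse-TTL behavior of Message Age is easy to model (an illustrative sketch; the function names are invented here):

```python
# Sketch of the Message Age "reverse TTL": each bridge hop adds one, and
# stored BPDU information becomes stale once Message Age reaches Max Age.
MAX_AGE = 20  # seconds, the default

def relay(message_age):
    """In practice a bridge simply adds one before propagating the BPDU."""
    return message_age + 1

def is_stale(message_age, max_age=MAX_AGE):
    return message_age >= max_age

age = 0             # originated at the Root Bridge with Message Age zero
for _ in range(3):  # propagated across three bridge hops
    age = relay(age)
print(age, is_stale(age))  # 3 False: still well inside the Max Age window
```

Note how this interacts with Max Age: a BPDU relayed across too many hops, or information replayed long after an outage begins, simply ages out instead of circulating forever.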
The Spanning-Tree Protocol also uses a separate Hold Time value to prevent excessive
BPDU traffic. The Hold Time determines the minimum time between the sending of any
two back-to-back Configuration BPDUs on a given port. It prevents a cascade effect
where BPDUs spawn more and more BPDUs. This parameter is fixed at a non-configurable
value of one second.
Each half in this network has a completely independent Spanning Tree. For example, each
half elects a single Root Bridge. When a topology change occurs on the left side of the
network, it has no effect on the right side of the network (at least from a Spanning Tree
perspective). As Chapter 14, Campus Design Models, discusses, you should strive to use
Layer 3 routers and Layer 2 switches together to maximize the benefit that this sort of
separation can have on the stability and scalability of your network.
On the other hand, routers do not always break a network into separate Spanning Trees. For
example, there could be a backdoor connection that allows the bridged traffic to bypass the
router as illustrated in Figure 7-8.
Figure 7-8 Although This Network Contains a Router, It Represents a Single Bridge Domain
In this case, there is a single, contiguous, Layer 2 domain that is used to bypass the router.
The network only elects a single Root Bridge, and topology changes can create network-
wide disturbances.
TIP By default, Route Switch Modules (RSMs) (and other router-on-a-stick implementations) do
not partition the network as shown in Figure 7-7. This can significantly limit the scalability
and stability of the network. See Chapter 11, Layer 3 Switching, Chapter 14, Campus
Design Models, and Chapter 15, Campus Design Implementation, for more information.
As Chapter 6 explained, Cisco uses a separate instance of Spanning Tree for each VLAN
(per-VLAN Spanning Tree, or PVST). This provides two main benefits: control and
isolation.
Spanning Tree control is critical to network design. It allows each VLAN to have a
completely independent STP configuration. For example, each VLAN can have the Root
Bridge located in a different place. Cost and Priority values can be tuned on a per-VLAN
basis. Per-VLAN control allows the network designer total flexibility when it comes to
optimizing data flows within each VLAN. It also makes possible Spanning Tree load
balancing, the subject of the next section.
Spanning Tree isolation is critical to the troubleshooting and day-to-day management of
your network. It prevents Spanning Tree topology changes in one VLAN from disturbing
other VLANs. If a single VLAN loses its Root Bridge, connectivity should not be
interrupted in other VLANs.
NOTE Although Spanning Tree processing is isolated between VLANs, don't forget that a loop in
a single VLAN can saturate your trunk links and starve out resources in other VLANs. This
can quickly lead to a death spiral that brings down the entire network. See Chapter 15 for
more detail and specific recommendations for solving this nasty problem.
However, there are several technologies that negate the control and isolation benefits of
PVST. First, standards-based 802.1Q specifies a single instance of Spanning Tree for all
VLANs. Therefore, if you are using vanilla 802.1Q devices, you lose all of the advantages
offered by PVST. To address this limitation, Cisco developed a feature called PVST+ that
is discussed later in this chapter. Second, bridging between VLANs defeats the advantages
of PVST by merging the multiple instances of Spanning Tree into a single tree.
NOTE Although the initial version of 802.1Q only specified a single instance of the Spanning-Tree
Protocol, future enhancements will likely add support for multiple instances. At press time,
this issue was being explored in the IEEE 802.1s committee.
To avoid the confusion introduced by these issues and exceptions, it is often useful to
employ the term Spanning Tree domain. Each Spanning Tree domain contains its own set
of STP calculations, parameters, and BPDUs. Each Spanning Tree domain elects a single
Root Bridge and converges on one active topology. Topology changes in one domain do not
affect other domains (other than the obvious case of shared equipment failures). The term
Spanning Tree domain provides a consistent nomenclature that can be used to describe
STP's behavior regardless of whether the network is using any particular technology.
It is important to realize that the term load balancing is a bit of a euphemism. In reality,
Spanning Tree load balancing almost never achieves an even distribution of traffic across the
available links. However, by allowing the use of more than one path, it can have a significant
impact on your network's overall performance. Although some people prefer to use terms such
as load distribution or load sharing, this text uses the more common term of load balancing.
There are four primary techniques for implementing Spanning Tree load balancing on Catalyst
gear. Table 7-1 lists all four along with the primary command used to implement each.
Each of these is discussed in the separate sections that follow. Note that the discussion only
addresses load balancing techniques based on the Spanning-Tree Protocol. For a discussion
of non-STP load balancing techniques, see Chapter 11 and Chapter 15.
can utilize multiple redundant paths. Within a single VLAN, the traffic is loop free and only
utilizes a single path to reach each destination. But two VLANs can use both redundant
links that you have installed to a wiring closet switch.
For example, consider Figure 7-9, a variation of the simplied campus network used in
Figure 7-2.
Figure 7-9 A Simplied Campus Network
[Figure 7-9 diagram: one IDF switch (Cat-C) with redundant riser links down to two MDF switches (Cat-A and Cat-B), each with locally attached servers.]
In this version of the simplified campus network, the campus backbone has been eliminated
so that the entire network is contained within a single building. Instead of locating the
server farm in a separate building, the servers have been moved to the MDF closets.
Redundant links have been provided to the IDF switches (only one IDF is shown). Assume
that the wiring closet switch only supports 20 users that rarely generate more than 100
Mbps of traffic. From a bandwidth perspective, a single riser link could easily handle the
load. However, from a redundancy perspective, this creates a single point of failure that
most organizations are not willing to accept. Given the assumption that corporate policy
dictates multiple links to every wiring closet, wouldn't it be nice to have all the links
available for carrying traffic? After all, after both links have been installed, Spanning Tree
load balancing can potentially double the available bandwidth for free! Having three paths
to every closet is indicative of two things: You work for a really paranoid organization and
you can possibly triple your bandwidth for free!
Even when redundant links are available, don't overlook the second STP requirement for
Spanning Tree load balancing: multiple VLANs. Many organizations prefer to place a
single VLAN on each switch to ease VLAN administration. However, this is in conflict with
the goal of maximizing bandwidth through load balancing. Just remember: one
VLAN = one active path. This is not meant to suggest that it's wrong to place a single VLAN
on a switch, just that it prevents you from implementing Spanning Tree load balancing.
TIP If you only place a single VLAN on a switch, you cannot implement Spanning Tree load
balancing. Either add VLANs or consider other load balancing techniques such as
EtherChannel or MHSRP. See Chapter 11 and Chapter 15 for more information.
[Diagram: Cat-C (IDF) uplinks on Ports 1/1 and 1/2, each at cost 19; Cat-A is the Root Bridge for VLAN 2 and Cat-B is the Root Bridge for VLAN 3, each with locally attached servers.]
This network contains two VLANs. Cat-A is the Root Bridge for VLAN 2, and Cat-B is the
Root Bridge for VLAN 3. From Cat-C's perspective, the available bandwidth to the servers
has been doubled. First, examine VLAN 2. Cat-C has two possible paths to the Root Bridge:
Cat-C:Port-1/1 can reach the Root Bridge with a cost of 19, whereas Cat-C:Port-1/2 can get
there at a cost of 38. Obviously, Port 1/1 is chosen as Cat-C's Root Port for VLAN 2. VLAN
3 also has two paths to the Root Bridge, but this time the costs are reversed: 38 through Port
1/1 and 19 through Port 1/2. Therefore, VLAN 3's traffic uses Cat-C:Port-1/2. Both links
are active and carrying traffic. However, if either link fails, Spanning Tree places all of the
traffic on the remaining link to maintain connectivity throughout the network.
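Because each VLAN runs its own STP instance, the same two uplinks can yield different Root Ports per VLAN. A minimal sketch of Cat-C's per-VLAN decision (costs taken from the example above):

```python
# Sketch of per-VLAN Root Port selection from Cat-C's perspective. Each
# VLAN is an independent Spanning Tree instance, so the same physical
# uplinks produce different Root Ports.
def root_port_for_vlan(costs):
    """costs: {port: Root Path Cost for this VLAN}; lowest cost wins."""
    return min(costs, key=costs.get)

vlan2 = {"1/1": 19, "1/2": 38}  # Cat-A (Root for VLAN 2) is behind Port 1/1
vlan3 = {"1/1": 38, "1/2": 19}  # Cat-B (Root for VLAN 3) is behind Port 1/2
print(root_port_for_vlan(vlan2), root_port_for_vlan(vlan3))  # 1/1 1/2
```

The split is entirely a consequence of Root Bridge placement: moving both Root Bridges to the same MDF switch would collapse both VLANs onto one uplink.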
[Diagram: the IDF switch uplinks to MDF switches Cat-A and Cat-B, which connect to Router A and Router B, respectively.]
Not only has the load been distributed between the two routers (Router A is handling VLAN
2, whereas Router B is handling VLAN 3), the Layer 2 load has also been spread across
both IDF uplinks. By placing the Root Bridge for VLAN 2 on Cat-A, the Spanning-Tree
Protocol automatically creates the best Layer 2 path to the default gateway, the first-hop
destination of most traffic in modern campus networks. The same has been done with
Cat-B and VLAN 3.
More information on this technique (including how to use HSRP for redundancy) is
discussed in Chapter 11.
Figure 7-12 Root Bridge Placement Load Balancing Requires Well-Defined Traffic Patterns
[Figure 7-12, Parts A-C: six switches, A through F. Part A shows the physical topology with the Sales servers on Cat-A and the Human Resources servers on Cat-F; Part B shows the Sales VLAN's active topology with the Root Bridge at Cat-A; Part C shows the Human Resources VLAN's active topology with the Root Bridge at Cat-F.]
Part A of Figure 7-12 illustrates the physical topology and the location of servers. The Sales
department has its servers attached to Cat-A, whereas the Human Resources department has
connected their servers to Cat-F. Part B of Figure 7-12 shows the active topology for the Sales
VLAN. By placing the Root Bridge at the servers, the Spanning Tree topology automatically
mirrors the traffic flow. Part C of Figure 7-12 shows the active topology for the Human
Resources VLAN. Again, the paths are optimal for traffic destined to the servers in that
VLAN. Consider what happens if the Root Bridges for both VLANs are placed on Cat-F. This
forces a large percentage of the Sales VLAN's traffic to take an inefficient path through Cat-F.
A potential problem with using this technique is that the traffic between the VLANs might
be too similar. For example, what if a single server farm handles the entire network? You
are left with two unappealing options. First, you could optimize the traffic flow by placing
all of the Root Bridges at the server farm, but this eliminates all load balancing. Second,
you could optimize for load balancing by distributing the Root Bridges, but this creates
unnecessary bridge hops for traffic that is trying to reach the servers.
The first command lowers the Bridge Priority on Cat-A to 100 for VLAN 2 (the Sales
VLAN) so that it wins the Root Bridge election. In a similar fashion, the second command
configures Cat-F to be the Root Bridge for VLAN 3 (the Human Resources VLAN).
Figure 7-13 Back-to-Back Switches Cannot Use Root Bridge Placement Load Balancing
[Figure 7-13, Parts A and B: Cat-A (BID 100.AA-AA-AA-AA-AA-AA) and Cat-B (BID 100.BB-BB-BB-BB-BB-BB) are connected back-to-back by two parallel links between Ports 1/1 and 1/2 on each switch; the port roles and states (DP, RP, Forwarding, Blocking) are shown for VLAN 2 and VLAN 3.]
First, examine VLAN 2 as shown in Part A. Cat-A needs to pick a single Root Port to reach
Cat-B, the Root Bridge for VLAN 2. As soon as Cat-A recognizes Cat-B as the Root Bridge,
Cat-A begins evaluating Root Path Cost. Because both Cat-A:Port-1/1 and Cat-A:Port-1/2
have a cost of 19 to the Root Bridge, there is a tie. In an effort to break the tie, Cat-A considers
the Sending BID that it is receiving over both links. However, both ports are connected to the
same bridge, causing Cat-A to receive the same Sending BID (100.BB-BB-BB-BB-BB-BB)
on both links. This results in another tie. Finally, Cat-A evaluates the Port ID values received
on each link; because the value received on Port 1/1 (0x8001) is lower than the value
received on Port 1/2 (0x8002), Port 1/1 becomes the Root Port and Port 1/2 is placed in the
Blocking state.
NOTE Note that it is possible to implement load balancing in Figure 7-13 by crossing the cables
such that Cat-B:Port-1/1 connects to Cat-A:Port-1/2 and Cat-B:Port-1/2 connects to
Cat-A:Port-1/1. However, this approach is not very scalable and can be difficult to implement
in large networks. Exercise 1 at the end of this chapter explores this load balancing
technique.
Although the network in Figure 7-13 fails to implement load balancing, it does raise two
interesting points. First, notice that it is the non-Root Bridge that must implement load
balancing. Recall that all ports on the Root Bridge become Designated Ports and enter the
Forwarding state. Therefore, it is the non-Root Bridge that must select a single Root Port
and place the other port in a Blocking state. It is precisely this decision process that must
be influenced to implement load balancing.
Second, it is the received values that are being used here. Cat-A is not evaluating its own
BID and Port ID; it is looking at the values contained in the BPDUs being received from
Cat-B.
NOTE There is one case where the local Port ID is used. As shown in Figure 7-14, imagine two
ports on Cat-B connecting to a hub that also connects to Cat-A, the Root Bridge. In this
case, the received Port ID is the same on both ports of Cat-B. To resolve the tie, Cat-B needs
to evaluate its own local Port ID values (the lower Port ID becomes the Root Port). This
topology is obviously fairly rare in modern networks and not useful for load balancing
purposes.
[Figure 7-14 diagram: Cat-A (Root Bridge) connects through a single hub to two ports on Cat-B; Cat-B:Port-1/1 becomes the Root Port (Forwarding) and Cat-B:Port-1/2 is Blocking.]
How can the load balancing be fixed in Figure 7-13? Given that Port ID is being used as the
decision criterion to determine which path to use, one strategy is to focus on influencing
these Port ID values. On Catalysts using the XDI/CatOS interface (such as the Catalyst
4000s, 5000s, and 6000s), this can be done by applying the set spantree portvlanpri
command. The full syntax for this command is
set spantree portvlanpri mod_num/port_num priority [vlans]
where mod_num is the slot number that a line card is using and port_num is the port on an
individual line card.
As with the other set spantree and show spantree commands already discussed, the vlans
parameter is optional. However, I suggest that you always code it so that you don't
accidentally modify VLAN 1 (the default) one day.
The priority parameter can have values ranging from 0 to 63, with 32 being the default.
The priority parameter adjusts the Port ID field contained in every Configuration BPDU.
Although the Port ID field is 16 bits in length, set spantree portvlanpri just modifies the
high-order 6 bits of this field. In other words, Port ID consists of the two subfields shown
in Figure 7-15.
[Figure 7-15: the 16-bit Port ID field, split into a 6-bit Port Priority subfield (high-order bits) and a 10-bit Port Number subfield (low-order bits).]
Port Number is a unique value statically assigned to every port: 1 for Port 1/1, 2 for Port
1/2, and so on (because number ranges are assigned to each slot, not all of the numbers are
consecutive). Being 10 bits in length, the Port Number subfield can uniquely identify 2^10,
or 1,024, different ports. The high-order 6 bits of the Port ID field hold the Port Priority
subfield. As a 6-bit number, this subfield can mathematically hold 2^6=64 values (0 to 63,
the same values that can be used with the set spantree portvlanpri command). Because
the Port Priority subfield is contained in the high-order bits of Port ID, lowering this value
by even one (to 31 from the default of 32) causes that port to be preferred.
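The subfield layout is easy to verify with a few lines of bit arithmetic (an illustrative sketch of the Catalyst encoding described above):

```python
# Sketch of the Catalyst Port ID layout: a 16-bit value with the 6-bit
# Port Priority in the high-order bits and a 10-bit Port Number below it.
def port_id(priority, port_number):
    assert 0 <= priority <= 63 and 0 <= port_number < 1024
    return (priority << 10) | port_number

print(hex(port_id(32, 1)))  # 0x8001, the default value for Port 1/1
print(hex(port_id(32, 2)))  # 0x8002, the default value for Port 1/2
print(hex(port_id(31, 2)))  # 0x7c02: lowering priority by one wins any tie
```

Because the priority occupies the high-order bits, a priority of 31 on any port always compares lower than a priority of 32 on any other port, regardless of the Port Number.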
The overly observant folks in the crowd might notice that Cisco routers use a different range
of Port Priority values than do Catalyst switches. Whereas Catalysts accept Port Priority
values between 0 and 63, the routers accept any value between 0 and 255. This difference
comes from the fact that the routers are actually using the values specified in the 802.1D
standard. Unfortunately, the 802.1D scheme only uses 8 bits for the Port Number field,
limiting devices to 256 ports (2^8). Although this is more than adequate for traditional
routers, it is a significant issue for high-density switches such as the Catalyst 5500. By
shifting the subfield boundary two bits, the Catalysts can accommodate the 1,024 ports
calculated in the previous paragraph (2^10). Best of all, this difference is totally harmless;
as long as every port has a unique Port ID, the Spanning-Tree Protocol is perfectly happy.
In fact, as combined Layer 2/Layer 3 devices continue to grow in popularity and port
densities continue to increase, this sort of modification to the 802.1D specification should
become more common.
NOTE Starting in 12.0 IOS, the routers started using a subfield boundary that allows the Port
Number subfield to be between 9 and 10 bits in length. This was done to support high-
density bridging routers such as the 8540. The new scheme still allows Port Priority to be
specified as an 8-bit value (0 to 255), and then the value is divided by either two (9-bit Port
Number) or four (10-bit Port Number) to scale the value to the appropriate size.
How does all this bit-twiddling cause traffic to flow across multiple paths? Figure 7-16
redraws the VLANs originally presented in Figure 7-13 to locate the Root Bridge for both
VLANs on Cat-B.
Figure 7-16 Back-to-Back Switches: The Root Bridge for Both VLANs Is on Cat-B
[Figure 7-16 diagram: Cat-B, now the Root Bridge for both VLANs, has all ports Designated and Forwarding. By default it advertises Port ID 0x8001 on Port 1/1 and 0x8002 on Port 1/2; with portvlanpri lowered for VLAN 3, Port 1/2 advertises 0x7C02.]
As was the case with Part A of Figure 7-13, the default Configuration BPDUs received on
Port 1/1 of Cat-A contain a Port ID of 0x8001, but Port 1/2 receives the value 0x8002.
Because 0x8001 is lower, Port 1/1 becomes the Root Port for all VLANs by default.
However, if you lower VLAN 3's Port Priority to 31 on Cat-B:Port-1/2, it lowers the Port
ID that Cat-A:Port-1/2 receives for VLAN 3 to 0x7C02. Because 0x7C02 is less than
0x8001, Cat-A now elects Port 1/2 as the Root Port for VLAN 3 and sends traffic over this
link. The syntax to implement this change is as follows:
Cat-B (enable) set spantree portvlanpri 1/2 31 3
Voilà, you have load balancing: VLAN 2 is using the left link and VLAN 3 is using the
right link.
TIP Note that the portvlanpri value must be less than the value specied by portpri.
By default, Cat-A is already sending traffic over the 1/1 link, so it is not necessary to add
any commands to influence this behavior. However, it is probably a good idea to explicitly
put in the command so that you can document your intentions and avoid surprises later:
Cat-B (enable) set spantree portvlanpri 1/1 31 2
This command lowers Cat-B's Port Priority on Port 1/1 to 31 for VLAN 2 and reinforces
Cat-A's default behavior of sending traffic over this link for VLAN 2.
As you might expect on a device performing load balancing, one port is forwarding and one
is blocking (see the last two lines from Example 7-1). You can also observe that the load
balancing is having the desired effect because Port 1/2 is the forwarding port. However,
notice how the Port Priority is still set at 32. At first this might appear to be a bug. On the
contrary, the Port Priority field of show spantree only shows the outbound Port Priority. To
see the values Cat-A is receiving, you must look at Cat-B's outbound values as shown in
Example 7-2.
There it is: Port 1/2's Priority has been set to 31. If you think about the theory behind using
set spantree portvlanpri, this makes perfect sense. However, it's very easy to look for the
value on Cat-A, the device actually doing the load balancing.
TIP The received Port ID can be viewed on Cat-A with the show spantree statistics command. This
feature is discussed in the Useful STP Display Commands section toward the end of the chapter.
Keeping all of this straight is especially fun when you are trying to track down some
network outage and the pressure is on! In fact, it is this counterintuitive nature of port/
VLAN priority that makes it such a hassle to use.
As if the confusing nature of port/VLAN priority wasn't bad enough, set spantree
portvlanpri can only be used in back-to-back situations. Recall that Port ID is evaluated last
in the STP decision sequence. Therefore, for it to have any effect on STP's path selections, both
Root Path Cost and Sending BID must be identical. Although identical Root Path Costs are
fairly common, identical Sending BIDs only occur in one topology: back-to-back bridges. For
example, look back at Figure 7-3. Cat-D never receives the same Sending BID on both links
because they are connected to completely different switches, Cat-B and Cat-C. Modifying
port/VLAN priority in this case has no effect on load balancing.
TIP Don't use set spantree portvlanpri in the typical model where each IDF switch is
connected to redundant MDF switches; it does not work.
TIP Don't use set spantree portvlanpri with back-to-back configurations. Instead, use Fast or
Gigabit EtherChannel.
Unfortunately, many documents simply present set spantree portvlanpri as the way to do
load balancing in a switched network. By failing to mention its limitations and proper use
(for example, that it should be coded on the upstream bridge), many users are led on a
frustrating journey and never get load balancing to work.
Given all the downsides, why would anyone use port/VLAN priority load balancing? In
general, Root Bridge placement and port/VLAN cost provide much more intuitive,
flexible, and easy-to-understand options. However, there are two cases where set spantree
portvlanpri might be useful. First, if you are running pre-3.1 NMP code, the port/VLAN
cost feature is not available. Second, you might be using back-to-back switches in a
configuration where EtherChannel is not an option; for example, if you are using
non-EtherChannel-capable line cards in your Catalyst or the device at the other end of the link
is from another vendor (God forbid!).
NOTE The version numbers given in this chapter are for Catalyst 5000 NMP images. Currently,
this is the same numbering scheme that the Catalyst 4000s and 6000s use. Other Cisco
products might use different numbering schemes.
Finally, don't confuse set spantree portvlanpri with the set spantree portpri command.
The set spantree portpri command allows you to modify the high-order 6 bits of the Port
ID field; however, it modifies these bits for all VLANs on a particular port. This obviously
does not support the sort of per-VLAN load balancing being discussed in this section. On
the other hand, set spantree portvlanpri allows Port Priority to be adjusted on a per-port
and per-VLAN basis.
Spanning Tree Load Balancing 235
NOTE To save NVRAM storage space, set spantree portvlanpri only allows you to set a total of
two priority values for each port. One is the actual value specified on the set spantree
portvlanpri command line. The other is controlled through the set spantree portpri
command. This behavior can appear very strange to uninformed users. See the section
"Load Balancing with Port/VLAN Cost" for more detail (port/VLAN cost also suffers from
the same confusing limitation). However, this limitation rarely poses a problem in real-world
networks (in general, it is only required for back-to-back switches linked by more
than two links, a situation much better handled by EtherChannel).
Figure 7-17 A Campus Network where Root Bridge Placement and Port Priority Load Balancing Are Not Effective
[Figure: Buildings 1 and 2, each with a pair of MDF switches (Cat-1A/Cat-1B and
Cat-2A/Cat-2B) and an IDF switch (Cat-1C and Cat-2C). Cat-1A is the Root Bridge for
VLAN 2 and Cat-1B is the Root Bridge for VLAN 3. IDF riser links have a cost of 19;
the Gigabit Ethernet links between MDF switches have a cost of 4.]
Figure 7-17 is a simplified diagram of a typical (but not necessarily recommended) campus
design. In this design, VLANs 2 and 3 span all switches in both buildings. Each building consists
of a pair of redundant MDF (Main Distribution Frame) switches in the main wiring closet. Both
MDF switches connect to the IDF (Intermediate Distribution Frame) switches sitting on every
floor (only one IDF switch in each building is shown to simplify the diagram). Link costs are
also indicated (the IDF links are Fast Ethernet and MDF links are Gigabit Ethernet).
To implement load balancing, you cannot use the Port Priority load balancing technique
because the switches are not back-to-back. How about using Root Bridge placement? To
load balance in Building 1, you could place the Root Bridge for VLAN 2 on Cat-1A while
placing the Root Bridge for VLAN 3 on Cat-1B. This causes VLAN 2's traffic to use the
left riser link and VLAN 3's traffic to use the riser on the right. So far so good.
But what does this do to the traffic in Building 2? The IDF switch in Building 2 (Cat-2C) has
several paths that it can use to reach the Root Bridge for VLAN 2 (Cat-1A). Which of these
paths does it use? Well, refer back to the four-step STP decision criteria covered earlier. The
first criterion evaluated is always the Root Bridge. Because everyone is in agreement that
Cat-1A is the Root Bridge for VLAN 2, Cat-2C proceeds to the second criterion: Root Path Cost.
One possibility is to follow the path Cat-2C to Cat-2B to Cat-2A to Cat-1A at a Root Path
Cost of 27 (19+4+4). A better option is Cat-2C to Cat-2B to Cat-1A at a cost of 23 (19+4).
However, path Cat-2C to Cat-2A to Cat-1A also has a Root Path Cost of 23 (19+4).
Because there are two paths tied for the lowest cost, Cat-2C proceeds to the third decision
factor: Sending BID. Assume that both Cat-2A and Cat-2B are using the default Bridge
Priority (32,768). Also assume that Cat-2A has a MAC address of AA-AA-AA-AA-AA-AA
and Cat-2B has a MAC address of BB-BB-BB-BB-BB-BB. Because Cat-2A has the
lower BID (32,768.AA-AA-AA-AA-AA-AA), all traffic for VLAN 2 uses the path Cat-2C
to Cat-2A to Cat-1A.
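The decision sequence just described can be sketched in a few lines. This is a simplified illustration (not Catalyst code); the costs and MAC addresses are the example values from the text:

```python
# Sketch of the STP decision sequence described above: lower Root Path
# Cost wins; on a tie, the lower sender Bridge ID (priority, then MAC)
# breaks it. Python compares the tuples element by element.

def best_path(paths):
    """Each path is (root_path_cost, sender_bid); the lowest tuple wins."""
    return min(paths)

# Cat-2C's candidate paths to the VLAN 2 Root Bridge (Cat-1A):
paths = [
    (27, (32768, "BB-BB-BB-BB-BB-BB")),  # Cat-2C -> Cat-2B -> Cat-2A -> Cat-1A
    (23, (32768, "BB-BB-BB-BB-BB-BB")),  # Cat-2C -> Cat-2B -> Cat-1A
    (23, (32768, "AA-AA-AA-AA-AA-AA")),  # Cat-2C -> Cat-2A -> Cat-1A
]
print(best_path(paths))  # the cost-23 path through Cat-2A (lower BID)
```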
OK, how about VLAN 3? Because all switches are in agreement that Cat-1B is the Root Bridge
of VLAN 3, Root Path Cost is considered next. One option is to follow the path Cat-2C to
Cat-2A to Cat-1A to Cat-1B at a cost of 27 (19+4+4). A better option is Cat-2C to Cat-2A to
Cat-1B at a cost of 23 (19+4). However, again, there is an equal-cost path along Cat-2C to
Cat-2B to Cat-1B (cost = 19+4 = 23). Cat-2C then evaluates the BIDs of Cat-2A and Cat-2B,
choosing Cat-2A. VLAN 3 traffic therefore follows the path Cat-2C to Cat-2A to Cat-1B. This does provide
load balancing across the campus core, but now both VLANs are using the same IDF riser cable.
In other words, the load balancing in Building 1 destroyed the load balancing in Building 2.
Clearly, a new technique is required. Assuming that you want to maintain both VLANs across
both buildings (I am using this assumption because it is a common design technique; however,
in general, I recommend against it; see Chapter 14 for more details), there are two options:
Bridge Priority
Port/VLAN cost
Of these, port/VLAN cost is almost always the better option. However, because set
spantree portvlancost was not available until 3.1 Catalyst images, the Bridge Priority
technique is discussed first.
For consistency, you should also enter the following command on Cat-2A (although this
command merely reinforces the default behavior, it helps document your intentions):
Cat-2A (enable) set spantree priority 1000 2
[Figure: the same campus network with Bridge Priority load balancing applied in
Building 2. Cat-2A is configured with set spantree priority 1000 2, and Cat-2B with
set spantree priority 1000 3; riser links cost 19 and inter-MDF links cost 4.]
Note that these commands are adjusting Bridge Priority, not Port Priority. set spantree
portvlanpri, the previous technique, was used to adjust Port Priority (the high-order 6 bits
of the Port ID field) on a per-port and per-VLAN basis. By contrast, Bridge Priority is
a parameter that is global across all ports on a given switch, but it can be individually set
for each VLAN on that switch.
"Wait a minute!" you exclaim. "Isn't this the same value I used to set the Root Bridge?"
Yes, it is! The Bridge Priority must be low enough to influence the load balancing of traffic
flows, but if it is too low, the bridge wins the Root War and disturbs the entire topology.
Picking the right balance of Bridge Priorities can be a tricky job in large, complex, and overly
flat networks. In fact, it is one of the most significant problems associated with Bridge
Priority load balancing. To maintain a consistent pattern, you should use a scheme such as
100, 200, 300, and so on to position your Root Bridges and 1000, 2000, 3000, and so on to
influence load balancing. This numbering scheme helps you make your network self-documenting while
also keeping the load balancing adjustments safely out of the range of the Root Bridge
adjustments. This approach assumes that you don't have more than eight backup Root
Bridges (if that's not the case, you probably have bigger problems than worrying about load
balancing!).
TIP Use a Bridge Priority numbering scheme that clearly delineates Root Bridge placement
from load balancing. For example, use multiples of 100 for your Root Bridges and
multiples of 1000 for load balancing.
An additional caveat of using Bridge Priorities to implement load balancing is that it can
scale poorly. As your network grows, it is very difficult to keep track of which device has
had its Bridge Priority adjusted for each VLAN for which reason (Root Bridge or load
balancing).
Furthermore, the technique shares much of the confusion of set spantree portvlanpri.
Namely, you must modify MDF switches to implement load balancing on IDF switches.
Because it is the received BPDUs that are being evaluated on Cat-2C, changing the BID on
Cat-2C has no effect. In short, you must modify the upstream switch when load balancing
with either Bridge Priorities or port/VLAN priority.
To make matters even worse, this load balancing scheme creates an awkward situation
where one technique is used in Building 1 (Root Bridge placement) and another technique
is used in Building 2 (Bridge Priority). It is much more straightforward to use a simple
technique that is flexible enough to implement load balancing in both buildings.
Why, then, does anyone choose to use Bridge Priorities to implement load balancing? For
one simple reason: prior to 3.1 Catalyst 5000 code, it was your only choice in certain
topologies. However, don't get discouraged; starting in 3.1, Cisco offered a feature called
port/VLAN cost load balancing that addresses these downsides.
NOTE Somewhat ironically, the Catalyst 3000s (3000, 3100, and 3200) supported per-port and
per-VLAN cost configuration well before the feature was moved into the high-end products
such as the Catalyst 5000.
Figure 7-19 Load Balancing Using the set spantree portvlancost Command
[Figure: two views of the campus network, before and after tuning. On the IDF switches
(Cat-1C and Cat-2C), set spantree portvlancost raises the Path Cost of one riser link
to 1000 on a per-VLAN basis so that VLAN 2 and VLAN 3 each prefer a different riser;
riser links otherwise cost 19 and inter-MDF links cost 4. Cat-1A is the Root Bridge for
VLAN 2 and Cat-1B for VLAN 3.]
First, consider load balancing in Building 1. Cat-1C, the IDF switch in Building 1, has two
potential paths to the Root Bridge. It can go directly to Cat-1A over the 1/1 link at a cost of
19, or it can use the 1/2 link to go through Cat-1B at a cost of 23 (19+4). The 1/1 link is fine
for VLAN 2, but load balancing requires that VLAN 3 use the 1/2 link. This can be
accomplished by increasing the Root Path Cost on the 1/1 link to anything greater than 23
for VLANs that should take the 1/2 link. For example, enter the command from Example
7-3 on Cat-1C to increase the Root Path Cost for VLAN 3 on Port 1/1 to 1000.
Although a comparable entry is not required for VLAN 2 on the 1/2 link, it is generally a
good idea to add it for consistency as illustrated in Example 7-4.
The previous two commands increase the Root Path Cost on the undesirable link high enough
to force traffic across the other IDF-to-MDF link. For example, the command in Example 7-3
discourages VLAN 3's traffic from using the 1/1 link, causing it to use the 1/2 link.
However, if either riser link fails, all of the traffic rolls over to the remaining connection.
As mentioned earlier, it is important to understand the difference between Root Path Cost and
Path Cost. This point is especially true when working with port/VLAN cost load balancing.
Root Path Cost is the cumulative Spanning Tree cost from a bridge or switch to the Root Bridge.
Path Cost is the amount that is added to Root Path Cost as BPDUs are received on a port. Notice
that set spantree portvlancost is manipulating Path Cost, not Root Path Cost. It might help if
you remember the command as set spantree portvlanpathcost (just don't try typing that in!).
Notice that the set spantree portvlancost command allows you to omit several of the
parameters as a shortcut. If you omit the cost parameter, it lowers the cost by one from its
current value. If you omit the preferred_vlans parameter, it uses the VLAN list from the last
time the command was used. In other words, the command in Example 7-5 is designed to
make Port 1/2 the preferred path for VLAN 3.
Example 7-5 Selecting Cat-1C:Port-1/2 As the Preferred Path for VLAN 3 Using the Automatically Calculated Value
Cat-1C> (enable) set spantree portvlancost 1/2 3
Port 1/2 VLANs 1-2,4-1005 have path cost 19.
Port 1/2 VLANs 3 have path cost 18.
TIP Unlike most Spanning Tree commands on Catalysts that substitute the default value of 1
when the vlan parameter is omitted, set spantree portvlancost uses the same VLAN (or
VLANs) as the previous use of this command. To avoid surprises, it is safer to always
specify both the cost and the preferred_vlans parameters.
However, lowering the cost to 18 on Port 1/2 for VLAN 3 does not work in situations such
as Cat-1C in Figure 7-19. In this case, Cat-1C sees two paths to the Root Bridge. As
explained earlier, the Root Path Cost values before tuning are 19 on the 1/1 link and 23
(19+4) on the 1/2 link. Lowering the 1/2 Path Cost by one results in a Root Path Cost for
Port 1/2 of 22, not enough to win the Root Port election.
One solution is to manually specify a cost parameter that is low enough to do the trick as
in Example 7-6.
Example 7-6 Selecting Cat-1C:Port-1/2 As the Preferred Path for VLAN 3 By Manually Specifying a Lower Cost
on Port 1/2
Console> (enable) set spantree portvlancost 1/2 cost 14 3
Port 1/2 VLANs 1-2,4-1005 have path cost 19.
Port 1/2 VLANs 3 have path cost 14.
This lowers the cumulative Root Path Cost on Port 1/2 to 18 (14+4) and causes it to win out
against the cost of 19 on Port 1/1.
However, this approach might not be stable in the long run. What if the link between Cat-1A and
Cat-1B is replaced with Fast Ethernet or Fast EtherChannel? Or what if an additional switch is
added in the middle of this link? In fact, if anything is done to increase the cost between Cat-1A
and Cat-1B, this load balancing scheme fails. Therefore, as a general rule of thumb, it is better
to increase the cost of undesirable paths than decrease the cost of desirable paths.
TIP Increasing the cost of undesirable paths is more flexible and scalable than decreasing the
cost of desirable paths.
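A small sketch (hypothetical Python, using the costs from this example) shows why the rule holds: lowering Port 1/2's cost to 14 depends on the Cat-1A/Cat-1B link staying at cost 4, while raising Port 1/1's cost to 1000 does not:

```python
# Root Port selection on Cat-1C for one VLAN: the port with the lowest
# Root Path Cost wins. Port 1/1 goes straight to the Root Bridge; Port
# 1/2 goes through Cat-1B, so the inter-MDF link cost is added.

def root_port(cost_1_1, cost_1_2, inter_mdf_cost):
    rpc = {"1/1": cost_1_1, "1/2": cost_1_2 + inter_mdf_cost}
    return min(rpc, key=rpc.get)

# Lowering 1/2 to 14 works while the Cat-1A/Cat-1B link costs 4 (14+4=18 < 19)...
assert root_port(19, 14, 4) == "1/2"
# ...but fails if that link is replaced by Fast Ethernet (cost 19): 14+19 > 19.
assert root_port(19, 14, 19) == "1/1"
# Raising 1/1 to 1000 instead keeps 1/2 preferred in both cases:
assert root_port(1000, 19, 4) == "1/2"
assert root_port(1000, 19, 19) == "1/2"
```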
Example 7-7 Selecting Cat-2C:Port-1/2 As the Preferred Path for VLAN 3 By Manually Specifying a Higher Cost
on Port 1/1
Cat-2C (enable) set spantree portvlancost 1/1 cost 1000 3
Port 1/1 VLANs 1-2,4-1005 have path cost 19.
Port 1/1 VLANs 3 have path cost 1000.
Then, force VLAN 2 over the 1/1 link, as shown in Example 7-8.
See how much easier set spantree portvlancost is to use than the port/VLAN priority and
Bridge Priority load balancing techniques? First, it is much easier to visualize the impact that
the commands are having on the network. Second, both IDF switches use similar and
consistent commands. Third, and best of all, the commands are entered on the very switch
where the load balancing is occurring: the IDF switch. If you have an IDF switch with
multiple uplinks, just Telnet to that device and use set spantree portvlancost to spread the
load over all available links. With port/VLAN cost, there is no need to make strange
modifications to upstream switches.
The results of using set spantree portvlancost are also much easier to read in show
spantree. For example, the output in Example 7-9 would appear for VLAN 2 on Cat-2C,
the IDF switch in Building 2.
Example 7-9 show spantree Output for VLAN 2 After Using Port/VLAN Cost Load Balancing
Cat-2C (enable) show spantree 2
VLAN 2
Spanning tree enabled
Spanning tree type ieee
The increased cost on the 1/2 link is plainly shown along with the fact that Port 1/1 is in the
Forwarding state. Also notice that Port 1/1 is acting as the Root Port.
The output in Example 7-10 shows the Spanning Tree information for VLAN 3 on the
same switch.
Example 7-10 show spantree Output for VLAN 3 After Using Port/VLAN Cost Load Balancing
Cat-2C (enable) show spantree 3
VLAN 3
Spanning tree enabled
Spanning tree type ieee
In the case of Example 7-10, the 1/1 port is Blocking, whereas the 1/2 port is Forwarding.
The Root Port has also shifted to 1/2 with a Root Path Cost of 23 (19 for the riser link plus
4 to cross the Gigabit Ethernet link between Cat-2B and Cat-1B).
TIP Carefully observe the difference between Root Path Cost and Path Cost. Root Path Cost is
the cumulative cost to the Root Bridge. Path Cost is the amount that each port contributes
to Root Path Cost.
Because it is both flexible and easy to understand, port/VLAN cost is one of the most useful
STP load balancing tools (along with Root Bridge placement). The only requirement is that
you run 3.1 or higher code on the switches where you want to implement the load balancing
(generally your IDF switches). Note that you do not need to run 3.1+ code everywhere; it
is only necessary on the actual devices where set spantree portvlancost load balancing is
configured (modifying set spantree portvlancost on some devices does not create any
interoperability problems with your other switches).
TIP As with port/VLAN priority, the per-VLAN value must be less than the per-port value. In other
words, the set spantree portvlancost value must be less than the set spantree portcost value.
For example, the command in Example 7-11 increases the cost to 1000 for VLANs 2 and 3
on Port 1/1.
Example 7-11 Increasing Port/VLAN Cost on Port 1/1 for VLANs 2 and 3
Cat-A> (enable) set spantree portvlancost 1/1 cost 1000 2-3
Port 1/1 VLANs 1,4-1005 have path cost 19.
Port 1/1 VLANs 2-3 have path cost 1000.
Next, try increasing the Path Cost for VLAN 5 to 2000 as in Example 7-12.
Notice how it also changes the Path Cost for VLANs 2 and 3. A quick look at the output of
show spantree in Example 7-13 confirms the change.
Example 7-13 Only the Most Recently Specied Port/VLAN Cost Value Is Used
Console> (enable) show spantree 1/1
Port Vlan Port-State Cost Priority Fast-Start Group-Method
--------- ---- ------------- ----- -------- ---------- ------------
1/1 1 forwarding 19 31 disabled
1/1 2 forwarding 2000 31 disabled
1/1 3 forwarding 2000 31 disabled
1/1 4 forwarding 19 31 disabled
1/1 5 forwarding 2000 31 disabled
1/1 6 forwarding 19 31 disabled
1/1 7 forwarding 19 31 disabled
1/1 8 forwarding 19 31 disabled
1/1 9 forwarding 19 31 disabled
1/1 10 forwarding 19 31 disabled
Poof! The cost of 1000 is gone. As mentioned in the "Port/VLAN Priority Load Balancing"
section that covered the set spantree portvlanpri command (which also exhibits the
behavior seen in Example 7-13), this sleight of hand is a subtle side effect caused by a
technique the Catalysts use to save NVRAM storage. Internally, Catalysts store three
different values related to cost per port:
A global cost for that port
A single set spantree portvlancost value
A list of VLANs using the set spantree portvlancost value
Therefore, separate cost values cannot be stored for every VLAN. However, the good news
is that, apart from its strange appearance, this is not a significant drawback in most
situations. Networks with lots of redundancy and multiple links might find it a limitation,
but most never notice it at all.
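The behavior in Example 7-13 follows directly from that storage scheme. As a rough model (hypothetical Python, not the actual NMP data structure):

```python
# Toy model of per-port cost storage: one global cost, one portvlancost
# value, and one list of VLANs using that value. Setting a new value
# overwrites the old one for every VLAN on the list -- the "poof" effect.

class PortCost:
    def __init__(self, global_cost):
        self.global_cost = global_cost  # set spantree portcost
        self.vlan_cost = None           # the single portvlancost value
        self.vlan_list = set()          # VLANs using vlan_cost

    def set_portvlancost(self, cost, vlans):
        self.vlan_cost = cost           # replaces any earlier value
        self.vlan_list |= set(vlans)    # the VLAN list accumulates

    def cost(self, vlan):
        return self.vlan_cost if vlan in self.vlan_list else self.global_cost

p = PortCost(19)
p.set_portvlancost(1000, [2, 3])  # Example 7-11
p.set_portvlancost(2000, [5])     # Example 7-12: VLANs 2 and 3 move too
print([p.cost(v) for v in (1, 2, 3, 4, 5)])  # [19, 2000, 2000, 19, 2000]
```

This reproduces the show spantree output in Example 7-13: VLANs 2, 3, and 5 all show the most recently specified value.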
Fast STP Convergence 249
[Figure: Cat-A (the Root Bridge) and Cat-B joined by a cost-4 link. Cat-C connects to
Cat-A via Port 1/1 (cost 1000) and to Cat-B via Port 1/2 (cost 19), its Root Port. The
Cat-B to Cat-C link fails; recovery takes 30 seconds.]
In this network, Cat-A is the Root Bridge, and Cat-C (an IDF switch) has selected Port 1/2
as its Root Port because it has a lower Root Path Cost (23 versus 1000). Assume that the
cable connecting Cat-C and Cat-B fails. This produces an immediate physical layer loss of
link on Cat-C:Port-1/2 and causes that port to be placed in the not-connected state. Port 1/2
is then immediately excluded from STP processing and causes Cat-C to start searching
for a new Root Port. Thirty seconds later (twice the Forward Delay), Port 1/1 enters the
Forwarding state and connectivity resumes.
TIP Max Age is used to detect and recover from indirect failures.
[Figure: the same network, but now the link between Cat-A and Cat-B fails. Cat-C
receives no direct notification of the failure, and recovery takes 50 seconds.]
In this case, the link between Cat-A and Cat-B fails. Cat-C:Port-1/2 receives no direct
notification that anything has changed. All Cat-C notices is that Configuration BPDUs stop
arriving on Port 1/2. After waiting for the number of seconds specified by the Max Age
timer, Cat-C:Port-1/1 starts to take over as the Root Port. This reconvergence takes
considerably longer: 50 seconds as opposed to 30 seconds.
The default Max Age value of 20 seconds is designed to take two factors into account:
End-to-End BPDU Propagation Delay
Message Age Overestimate
These values can be used to calculate the End-to-End BPDU Propagation Delay using the
following formula:
end-to-end_bpdu_propagation_delay
= ((lost_msgs + 1) * hello_t) + (bpdu_delay * (dia - 1))
= ((3 + 1) * 2) + (1 * (7 - 1))
= 8 + 6 = 14 seconds
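The formula can be expressed as a function, a direct transcription using the 802.1D defaults named in the text:

```python
# End-to-End BPDU Propagation Delay with the 802.1D defaults: 3 lost
# messages, a 2-second Hello Time, 1 second of BPDU delay per bridge,
# and a network diameter of 7 bridges.

def end_to_end_bpdu_delay(lost_msgs=3, hello_t=2, bpdu_delay=1, dia=7):
    return (lost_msgs + 1) * hello_t + bpdu_delay * (dia - 1)

print(end_to_end_bpdu_delay())  # 8 + 6 = 14 seconds
```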
TIP Note that this process of simply incrementing the Message Age field in each bridge causes
bridges farther from the Root Bridge to age out their Max Age counters first. Therefore, this
effect is more pronounced in flat-earth networks that consist of many Layer 2 switches
connected without any intervening routers or Layer 3 switches. This is another advantage
to creating hierarchy with Layer 3 switching as prescribed by the multilayer model in
Chapter 14.
TIP When using Layer 3 switching to limit the size of your Spanning Tree domains, the Max
Age timer can be safely tuned. The extent of tuning that is possible is based on the style of
Layer 3 switching in use and the overall campus network design. See Chapter 15 for
specic details and recommendations.
You can only modify the timer values on Root Bridges. Don't forget to also change the
values on any backup Root Bridges.
If you do lower the Hello Time interval, carefully consider the impact that it has on your
CPU. STP can easily be the single most intensive CPU process running on a modern bridge
(which does all of the frame forwarding in hardware). Cutting the Hello Time interval in
half doubles the load that STP places on your CPU. See the sections "Lowering Hello Time
to One Second" and "Tips and Tricks: Mastering STP" later in this chapter for more
guidance.
TIP Decreasing Hello Time to one second doubles the STP load placed on your CPU. Use the
formula presented at the end of the chapter to be certain that this does not overload your
CPU. However, in networks that contain a limited number of VLANs, lowering Hello
Time to one second can be an excellent way to improve convergence times. See the section
"Lowering Hello Time to One Second" for more information.
TIP Be careful when calculating bridge diameter: unexpected hops can creep in when other
links or devices fail.
Some of the 802.1D values might appear overly conservative. For instance, most users
would argue that their networks would never drop three BPDUs while transferring
information across a mere seven bridge hops. Likewise, the assumption that each bridge takes
one second to propagate a BPDU seems strange in a world of high-horsepower switching.
Although it might be tempting to recalculate the formula with more real-world values, I
strongly recommend against this. Keep in mind that these values were chosen to provide
adequate margin in networks experiencing failure conditions, not just networks happily
humming along while everything is operating at peak efficiency. When a failure does occur,
your bandwidth and CPU capacity can be depleted as the network tries to recover. Be sure
to leave some reserves to handle these situations.
TIP Only modify the diameter and Hello Time variables in the Max Age calculation. Modifying
the other values can surprise you some day (when you least expect it!).
Although any form of STP timer tuning can be dangerous, reducing Max Age can be less
risky than other forms. If you set Max Age too low, a brief interruption in the flow of
Configuration BPDUs in the network can cause Blocking ports to age out their BPDU
information. When this happens, this rogue bridge starts sending Configuration BPDUs in
an attempt to move into the Forwarding state. If there is a functioning Designated Port
available for that segment, it refutes the BPDU with a Configuration BPDU of its own (this
TIP Modifying Max Age is less dangerous than changing the other timer values. Unfortunately,
it only improves convergence in the case of an indirect failure.
Calculating End-to-End BPDU Propagation Delay and Message Age Overestimate for
Forward Delay
These components are used to calculate Forward Delay as follows:
end-to-end_bpdu_propagation_delay
= ((lost_msgs + 1) * hello_t) + (bpdu_delay * (dia - 1))
= ((3 + 1) * 2) + (1 * (7 - 1))
= 8 + 6 = 14 seconds
message_age_overestimate
= (dia - 1) * overestimate_per_bridge
= (7 - 1) * 1
= 6 seconds
These two calculations are the same two used to derive Max Age. With Forward Delay, just
as in Max Age, they account for the time it takes to propagate BPDUs across the network
and for the error present in the Message Age field of Configuration BPDUs.
TIP You can only modify the timer values on Root Bridges. Don't forget to also change the
values on any backup Root Bridges so as to be consistent during primary Root Bridge
failure. As with other Spanning Tree commands, it is best to get into the habit of always
specifying the VLAN parameter.
TIP Be very careful and conservative when you adjust Forward Delay. If you set Forward Delay
too low, it can create network-wide outages.
TIP Unlike Forward Delay and Max Age, lowering the Hello Time value does not improve
convergence. On the contrary, you lower the Hello Time to make it possible for you to also
lower the Forward Delay and/or Max Age timers. In general, it is simplest to use the set
spantree root macro discussed in Chapter 6 (it automatically makes all necessary
adjustments based on the suggested formulas in 802.1D).
The Hello Time can be adjusted with the set spantree hello command. For instance, the
following command lowers the Hello Time for VLAN 3 to one second:
set spantree hello 1 3
If you do lower the Hello Time value, carefully consider the CPU overload warning
mentioned in the "Tuning Max Age" section. For more information, see the formula
presented in the "Tips and Tricks: Mastering STP" section.
PortFast
PortFast is a feature that is primarily designed to optimize switch ports that are connected
to end-station devices. By using PortFast, these devices can be granted instant access to the
Layer 2 network.
Think for a moment about what happens when you boot your PC every morning. You flip
the big red switch, the monitor flickers, it beeps and buzzes. Somewhere during that process
your network interface card (NIC) asserts Ethernet link, causing a Catalyst port to jump
from not connected to the STP Listening state. Thirty seconds later, the Catalyst puts your
port into Forwarding mode, and you are able to play Doom to your heart's content.
Normally, this sequence never even gets noticed because it takes your PC at least 30
seconds to boot. However, there are two cases where this might not be true.
First, some NICs do not enable link until the MAC-layer software driver is actually loaded.
Because most operating systems try to use the network almost immediately after loading
the driver, this can create an obvious problem. Several years ago, this problem was fairly
common with certain Novell ODI NIC drivers. With more modern NICs, this problem is
fairly common with PC Card (PCMCIA) NICs used in laptop computers.
Second, there is a race: a race between Microsoft and Intel. Intel keeps making the CPUs
faster and Microsoft keeps making the operating systems slower, and so far Intel is
winning. In other words, PCs are booting faster than ever. In fact, some modern machines
are done booting (or at least far enough along in the process) and need to use the network
before STP's 30-second countdown has finished. Dynamic Host Configuration Protocol (DHCP)
and NT Domain Controller authentication are two common activities that occur late in the
initialization process.
In both cases, STP's default settings can create a problem. How do you know if you have this
problem? Probably the easiest way is to plug both the PC and the Catalyst port into a hub. This
provides a constant link to the Catalyst and keeps the port in Forwarding mode regardless of
whether the PC is booted or not. Another classic symptom is if your PC always has problems
when you first cold boot it in the morning, but never has problems when you warm boot it
during the day or try to manually complete login or DHCP sequences after booting.
This problem motivates some network administrators to disable STP altogether. This
certainly fixes any STP booting problems, but it can easily create other problems. If you
employ this strategy, it requires that you eliminate all physical loops (a bad idea from a
resiliency standpoint) and carefully avoid all physical layer loops (something that can be
difficult to do in the real world). Also, keep in mind that you can't disable STP for a single
port. set spantree disable [vlan] is a per-VLAN global command that disables STP for
every port that participates in the specified VLAN (and, as you would expect, VLAN 1 is
the default if you do not specify the VLAN parameter). Moreover, some of the Layer 3
switching technologies, such as the Catalyst 5000 Supervisor Module III NetFlow Feature
Card (NFFC), require that Spanning Tree be disabled on the entire box (all VLANs)!
In short, rather than disabling STP, you should consider using Cisco's PortFast feature. This
feature gives you the best of both worlds: immediate end-station access and the safety net
of STP.
PortFast works by making a fairly simple change in the STP process. Rather than starting
out at the bottom of the Blocking-to-Listening-to-Learning-to-Forwarding hierarchy of
states as with normal STP, PortFast starts at the top. As soon as your switch sees the link,
the port is placed in the Forwarding state (Catalyst 3000s actually spend one second in both
Listening and Learning, but who's counting?). If STP later detects that you have a loop, it
does all of the Root and Designated Port calculations discussed earlier. If a loop is detected,
the port is put in the Blocking state.
This magic only occurs when the port first initializes. If the port is forced into the Blocking
state for some reason and later needs to return to the Forwarding state, the usual Listening
and Learning processing is done.
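In pseudocode terms, the behavior looks something like this (a simplified Python sketch, not the actual NMP implementation):

```python
# PortFast only short-circuits the first transition after link-up; any
# later recovery from Blocking walks the normal state sequence.

def states_on_link_up(portfast, first_init):
    if portfast and first_init:
        return ["forwarding"]  # straight to the top of the hierarchy
    return ["listening", "learning", "forwarding"]  # the 30-second climb

print(states_on_link_up(portfast=True, first_init=True))   # ['forwarding']
print(states_on_link_up(portfast=True, first_init=False))  # full sequence
```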
However, contrary to what you might expect, PortFast can actually improve the stability of
large networks! Recall the discussion of TCN BPDUs. TCN BPDUs are sent every time a bridge
detects a change in the active topology to shorten the bridge table age-out time to the Forward
Delay interval. Do you really want to potentially flush large sections of your bridging tables
every time a user boots? Probably not.
TIP Use PortFast on your end-station ports. Not only does it avoid problems when these devices
boot, it reduces the amount of Topology Change Notications in your network.
Despite all of PortFast's benefits, you should not carelessly enable it on every port. Only
enable it on ports that connect to workstations. Because servers rarely reboot (you hope),
don't enable it there.
TIP One exception to the rule of not using PortFast on server ports involves the use of
fault-tolerant NICs. If you are using one of these NICs that toggles link state during failover
(most don't), you should enable PortFast on these server ports.
Finally, you cannot use PortFast on trunk ports. Although Catalysts allow the command to
be entered on trunk links, it is ignored. In short, PortFast is like any other power tool: it is
extremely useful, but only if used correctly.
Using PortFast
Enabling PortFast is simple: use the set spantree portfast command:
set spantree portfast mod_num/port_num {enable | disable}
For example, to enable PortFast on every port of a 24-port module in slot 3, issue the
following command:
set spantree portfast 3/1-24 enable
If you want to check to see where you have PortFast enabled, you can use the show
spantree command as in Example 7-14.
Look under the Fast-Start column. Notice how the end-station ports on module three have
PortFast enabled, whereas the uplink ports on the Supervisor do not.
TIP In many cases, you might experience a 17-20 second delay even after you have enabled
PortFast. This is almost always caused by a side effect of the Port Aggregation Protocol
(PAgP) used to handle EtherChannel negotiations. As discussed in the "Disabling Port
Aggregation Protocol" section later in this chapter, PAgP hides port initialization changes
for approximately 17-18 seconds. In other words, although PortFast might enable the link
as soon as it is aware that the port has transitioned, PAgP delays this notification. In a future
software release, Cisco is considering disabling PAgP on ports where PortFast is enabled,
a change that would avoid this problem.
UplinkFast
UplinkFast is an exciting feature that Cisco rolled out in the 3.1 NMP release. This
exclusive feature (it is patented) allows wiring closet switches to converge in two to three
seconds!
The syntax for UplinkFast is even simpler than PortFast:
set spantree uplinkfast {enable | disable} [rate station_update_rate]
You should only enable UplinkFast on IDF-like wiring closet switches in correctly
designed networks. UplinkFast is designed to only operate on switches that are leaves (end
nodes) in your Spanning Tree. If you enable it in the core of your network, it generally leads
to unexpected traffic flows.
For example, consider Figure 7-22, the typical campus introduced earlier.
Figure 7-22 A Typical Campus Network Using UplinkFast
[Cat-D, an IDF switch, has uplink Port 1/1 (cost 19) and Port 1/2 (cost 1000) to two MDF
switches, which connect down to Cat-A, the Root Bridge, in the server farm.]
Cat-D is an IDF switch that is connected to two MDF switches (Cat-B and Cat-C).
Although set spantree uplinkfast is a global command that applies to all VLANs, this
section only analyzes a single VLAN: VLAN 2. Cat-A, the server farm switch, is the Root
Bridge for VLAN 2. Cat-D has two uplink ports that are potential Root Port candidates.
Utilizing the load balancing techniques discussed earlier, the cost on Port 1/2 has been
increased to 1000 to force VLAN 2's traffic across the 1/1 link. Notice that Port 1/1
becomes the Root Port. UplinkFast is then enabled on Cat-D with the following command:
Cat-D> (enable) set spantree uplinkfast enable
This causes Cat-D to notice that Port 1/2 is Blocking and therefore constitutes a redundant
connection to the Root Bridge. By making a note of this backup uplink port, Cat-D can set
itself up for a quick rollover in the event that Port 1/1 fails. The list of potential uplink ports
can be viewed with the show spantree uplinkfast command as in Example 7-15.
Port 1/1 is shown as the primary port (it is in the Forwarding state) and Port 1/2 is the
backup. If three uplink ports exist, all three appear in the output.
It is important to recognize that UplinkFast is a Root Port optimization. It allows wiring
closet switches to quickly bring up another Root Port in the event that the primary port fails.
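The Root Port optimization can be sketched as a small state machine. The following Python sketch is purely illustrative (the class and method names are hypothetical, not Catalyst code): it keeps an ordered group of candidate uplink ports and promotes the next backup the instant the primary fails, which is the essence of the two-to-three second rollover.

```python
# Hypothetical sketch of the UplinkFast idea: track an ordered group of
# potential Root Ports and promote a backup immediately when the primary
# uplink fails, instead of waiting for the normal STP timers.

class UplinkGroup:
    def __init__(self, ports):
        # Ports are ordered by STP preference (e.g., by Root Path Cost).
        self.ports = list(ports)          # ["1/1", "1/2", ...]
        self.primary = self.ports[0]      # Forwarding Root Port
        self.backups = self.ports[1:]     # Blocking backup uplinks

    def link_failed(self, port):
        """Return the new Root Port after a failure, or None."""
        if port != self.primary:
            # A backup failed; just drop it from the candidate list.
            self.backups.remove(port)
            return self.primary
        if not self.backups:
            return None                   # no redundant uplink left
        # Immediate rollover: the next backup becomes the Root Port.
        self.primary = self.backups.pop(0)
        return self.primary

group = UplinkGroup(["1/1", "1/2"])
print(group.link_failed("1/1"))   # 1/2 takes over immediately
```

The real feature does considerably more (it also repairs remote bridging tables, as described next), but the ordered-candidate idea matches the primary/backup output of show spantree uplinkfast.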
TIP Do not enable UplinkFast on every switch in your network! Only enable UplinkFast on
leaf-node Catalysts such as your IDF switches.
To enforce the requirement of leaf-node status, Cisco modifies several STP parameters
when UplinkFast is enabled. Take a look at the output of the set spantree uplinkfast
command in Example 7-16.
First, the Bridge Priority is modified to an unusually high value of 49,152. This causes the
current switch to effectively take itself out of the election to become the Root Bridge. Second,
it adds 3000 to the cost of all links. This is done to discourage other switches from using the
current switch as a transit switch to the Root Bridge. Notice that neither of these actions limits
STP failover in your network. The Bridge Priority modification only discourages other switches
from electing this switch as the Root Bridge. If the other switches fail, this switch happily
becomes the Root Bridge. Also, the increase to Path Cost only discourages other switches from
using the current switch as a transit path to the Root Bridge. However, if no alternate paths are
available, the current switch gleefully transfers traffic to and from the Root Bridge.
Notice the third line in the output in Example 7-16 (in bold). This is evidence of a subtle
trick that is the crux of what UplinkFast is all about. It should probably be fairly obvious
by now that a failure on Cat-D:Port-1/1 forces Cat-D to take all MAC addresses associated
with Port 1/1 in the Bridging Table and point them to Port 1/2. However, a more subtle
process must take place to convert the bridging tables in other switches. Why is this extra
step necessary? Figure 7-23 shows the network with the left-hand link broken.
[Figure 7-23: the network with Cat-D's Port 1/1 uplink (cost 19) failed; Port 1/2 (cost 1000)
remains. Host B (MAC 00-00-1D-2B-DE-AD) connects to Cat-D, and Host A
(MAC 00-AA-00-12-34-56) connects to Cat-A, the Root Bridge.]
Cat-D changes MAC address 00-AA-00-12-34-56 (Host-A) to Port 1/2 so that it has a correct
view of the network. However, notice that Cat-A, Cat-B, and Cat-C are still trying to send
traffic for 00-00-1D-2B-DE-AD (Host-B) to the broken link! This is where the real ingenuity
of UplinkFast comes in: Cat-D sends out a dummy multicast frame for the addresses in its
local Bridging Table. One frame is sent for each MAC address that is not associated with one
of the uplink ports. These packets are sent to a multicast 01-00-0C-CD-CD-CD destination
address to ensure that they are flooded throughout the bridged network. Recall from Chapter
3 that multicast addresses are flooded just as broadcast frames are. However, note that Cisco
does not use the traditional multicast address of 01-00-0C-CC-CC-CC. Because this multicast
address is reserved for single-hop protocols such as Cisco Discovery Protocol (CDP), VLAN
Trunk Protocol (VTP), Dynamic ISL (DISL), and Dynamic Trunk Protocol (DTP), Cisco
devices have been programmed to not flood 01-00-0C-CC-CC-CC. To avoid this behavior,
a new multicast address needed to be introduced.
Each frame contains the source address of a different entry in the local Bridging Table. As
these packets are flooded through the network, all of the switches and bridges make a note
of the new interface the frame arrived on and, if necessary, adjust their bridging tables. By
default, the Catalyst sends 15 of these dummy frames every 100 milliseconds, but this rate
can be adjusted with the [rate station_update_rate] parameter (the number represents how
many dummy updates to send every 100 milliseconds).
However, adjusting the rate parameter usually does not improve failover performance.
Notice that only MAC addresses not learned over the uplinks are flooded. Because
UplinkFast only runs on leaf-node switches where the vast majority of the MAC addresses
in the bridging table are associated with the uplink ports, usually only a few hundred
addresses require flooding. The default rate floods 450 to 600 addresses in the 3–4 second
UplinkFast convergence period. Therefore, it only makes sense to increase the rate if you
have more than about 500 devices connected to your wiring closet switch.
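The arithmetic behind the default rate is easy to verify. A minimal sketch, assuming only the numbers stated above (15 dummy frames every 100 milliseconds):

```python
# Verify the UplinkFast flooding arithmetic from the text:
# station_update_rate dummy frames are sent every 100 ms,
# so there are 10 sending intervals per second.

def addresses_flooded(rate_per_100ms, seconds):
    """How many dummy multicast frames fit in the given window."""
    return rate_per_100ms * 10 * seconds

def seconds_to_flood(addresses, rate_per_100ms=15):
    """How long flooding a bridging table of this size takes."""
    return addresses / (rate_per_100ms * 10)

print(addresses_flooded(15, 3))   # 450 addresses in 3 seconds
print(addresses_flooded(15, 4))   # 600 addresses in 4 seconds
print(seconds_to_flood(1500))     # a larger closet: 10.0 seconds
```

The last line shows why the rate might need raising for unusually large wiring closets: at the default rate, 1500 local addresses would take about 10 seconds to announce, well past the 3–4 second convergence window.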
UplinkFast is an extremely effective and useful feature. It provides much faster
convergence than any of the timer tuning techniques discussed earlier and is much safer. As
long as you only deploy it in leaf-node switches, it can be a wonderful way to maintain the
safety of STP while dramatically improving failover times in most situations.
BackboneFast
BackboneFast is a complementary (and patented) technology to UplinkFast. Whereas
UplinkFast is designed to quickly respond to failures on links directly connected to leaf-
node switches, it does not help in the case of indirect failures in the core of the backbone.
This is where BackboneFast comes in.
Don't expect BackboneFast to provide the two to three second rollover performance of
UplinkFast. As a Max Age optimization, BackboneFast can reduce the indirect failover
time from 50 to 30 seconds (with default parameters; and from 14 to 8 seconds with
the tunable values set at their minimums). However, it never eliminates Forwarding Delay and
provides no assistance in the case of a direct failure (recall from the Tuning Max Age
section that direct failures do not use Max Age).
TIP BackboneFast is a Max Age optimization. It allows the default convergence time for
indirect failures to be reduced from 50 seconds to 30 seconds.
As discussed in the previous section, UplinkFast should only be enabled on a subset of all
switches in your network (leaf-node, wiring closet switches). On the other hand,
BackboneFast should be enabled on every switch in your network. This allows all of the
switches to propagate information about link failures throughout the network.
When a device detects a failure on the link directly connected to its Root Port, the normal
rules of STP dictate that it begin sending Configuration BPDUs in an attempt to become the
Root Bridge. What other devices do with these Configuration BPDUs depends on where the
Designated Ports are located. If a Designated Port hears these inferior BPDUs, it
immediately refutes them with a Configuration BPDU as discussed in the Configuration
BPDU section earlier. If a non-Designated Port receives the inferior BPDU, it is ignored.
However, in either case, the 802.1D standard does not provide a mechanism that allows
switches receiving inferior BPDUs to make any judgments about the state of the network.
How does BackboneFast magically eliminate Max Age from the STP convergence delay?
By taking advantage of the following two mechanisms:
The first allows switches to detect a possible indirect failure.
The second allows them to verify the failure.
The BackboneFast detection mechanism is built around the concept that inferior BPDUs are
a signal that another bridge might have lost its path to the Root Bridge. BackboneFast's
verification mechanism employs a request and response protocol that queries other switches
to determine if the path to the Root Bridge has actually been lost. If this is the case, the switch
can expire its Max Age timer immediately, reducing the convergence time by 20 seconds.
To detect the possible failure of the Root Bridge path, BackboneFast checks the source of
the inferior BPDU. If the BPDU is from the local segment's Designated Bridge, this is
viewed as a signal of an indirect failure. If the inferior BPDU came from another switch, it
is discarded and ignored.
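The detection rule reduces to two tests. The following sketch is a simplification (the function and its parameters are hypothetical, and "inferior" is reduced to comparing Root BIDs, ignoring the full BPDU comparison sequence), but it captures the decision described above:

```python
# Sketch of the BackboneFast detection rule: an inferior BPDU matters
# only when it arrives from the segment's own Designated Bridge; from
# any other switch it is simply discarded.

def signals_indirect_failure(bpdu_root_bid, our_root_bid,
                             sender_bid, designated_bridge_bid):
    # "Inferior" here means the sender claims a worse Root Bridge
    # (a numerically higher BID) than the one we already know about.
    inferior = bpdu_root_bid > our_root_bid
    from_designated = sender_bid == designated_bridge_bid
    return inferior and from_designated

# Cat-B (BID 200) loses its Root Port and announces itself as Root;
# Cat-C, which knows the real Root (BID 100), hears this on a port
# whose Designated Bridge is Cat-B.
print(signals_indirect_failure(200, 100, 200, 200))  # True
print(signals_indirect_failure(200, 100, 300, 200))  # False: wrong sender
```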
The verification process is more complex than the detection process. First, BackboneFast
considers whether there are alternate paths to the Root Bridge. If the switch receiving an inferior
BPDU has no ports in the Blocking state (ports looped to itself are excluded), it knows that
it has no alternate paths to the Root Bridge. Because it just received an inferior BPDU from
its Designated Bridge, the local switch can recognize that it has lost connectivity to the Root
Bridge and immediately expire the Max Age timer.
If the switch does have blocked ports, it must utilize a second verification mechanism to
determine if those alternate paths have lost connectivity to the Root Bridge. To do this, the
Catalysts utilize a Root Link Query (RLQ) protocol. The RLQ protocol employs two types
of packets: RLQ Requests and RLQ Responses.
RLQ Requests are sent to ask upstream bridges whether their connection to the Root Bridge is
stable. RLQ Responses are used to reply to RLQ Requests. The switch that originates the
RLQ Request sends RLQ frames out all non-Designated Ports except the port that received
the inferior BPDU. A switch that receives an RLQ Request replies with an RLQ Response
if it is the Root Bridge or it knows that it has lost its connection to the Root Bridge. If neither
of these conditions is true, the switches propagate the RLQ Requests out their Root Ports
until the stability of the Root Bridge is known and RLQ Responses can be sent. If the RLQ
Response is received on an existing Root Port, the switch knows that its path to the Root
Bridge is stable. On the other hand, if the RLQ Response is received on some port other
than the current Root Port, it knows that it has lost its connection to the Root Bridge and
can immediately expire the Max Age timer. A switch propagates RLQ Responses out all
Designated Ports until the switch that originated the RLQ Request is reached.
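The verification decision hinges entirely on which port the RLQ Response arrives on. A minimal sketch of that rule (the function name and return strings are hypothetical shorthand, not protocol fields):

```python
# Sketch of the RLQ verification rule: the meaning of an RLQ Response
# depends on whether it arrives on the receiving switch's current
# Root Port or on some other port.

def handle_rlq_response(arrival_port, root_port):
    """Return the action the receiving switch should take."""
    if arrival_port == root_port:
        # The path through the existing Root Port is still good,
        # so let the Max Age timer run normally.
        return "path stable - keep Max Age running"
    # The Root Bridge is now reachable only via another port: the old
    # path is gone, so skip the 20-second Max Age wait entirely.
    return "expire Max Age immediately"

print(handle_rlq_response("1/1", root_port="1/1"))
print(handle_rlq_response("1/2", root_port="1/1"))
```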
To illustrate this process, consider the simplified campus network shown in Figure 7-24.
[Figure 7-24: Cat-A, the Root Bridge, connects to Cat-B over Segment 1 and to Cat-C over
Segment 2; Cat-B and Cat-C share Segment 3. All links have a cost of 19. Cat-C:Port-1/2
is the only non-Designated (Blocking) Port.]
As discussed earlier, BackboneFast must be enabled on all three switches in this network.
Assume that Cat-A is the Root Bridge. This results in Cat-B:Port-1/2 and Cat-C:Port-1/1
becoming Root Ports. Because Cat-B has the lower BID, it becomes the Designated Bridge
for Segment 3, resulting in Cat-C:Port-1/2 remaining in the Blocking state.
Next, assume that Segment 1 fails. Cat-A and Cat-B, the switches directly connected to this
segment, instantly know that the link is down. To repair the network, it is necessary that
Cat-C:Port-1/2 enter the Forwarding state. However, because Segment 1 is not directly
connected to Cat-C, Cat-C does not start sending any BPDUs on Segment 3 under the
normal rules of STP until the Max Age timer has expired.
BackboneFast can be used to eliminate this 20-second delay with the following eight-step
process (illustrated in Figure 7-25):
Step 1 Segment 1 breaks.
Step 2 Cat-B immediately withdraws Port 1/2 as its Root Port and begins
sending Configuration BPDUs announcing itself as the new Root Bridge
on Port 1/1. This is a part of the normal STP behavior (Steps 3–7 are
specific to BackboneFast).
Step 3 Cat-C:Port-1/2 receives the first Configuration BPDU from Cat-B and
recognizes it as an inferior BPDU.
Step 4 Cat-C then sends an RLQ Request out Port 1/1.
Step 5 Cat-A:Port-1/1 receives the RLQ Request. Because Cat-A is the Root
Bridge, it replies with an RLQ Response listing itself as the Root Bridge.
Step 6 When Cat-C receives the RLQ Response on its existing Root Port, it
knows that it still has a stable connection to the Root Bridge. Because
Cat-B originated the RLQ Request, it does not need to forward the RLQ
Response on to other switches.
Step 7 Because Cat-C has a stable connection to the Root Bridge, it can
immediately expire the Max Age timer on Port-1/2.
Step 8 As soon as the Max Age timer expires in Step 7, the normal rules of STP
require Cat-C:Port-1/2 to start sending Configuration BPDUs.
Because these BPDUs list Cat-A as the Root Bridge, Cat-B quickly
learns that it is not the Root Bridge and that it has an alternate path to Cat-A.
[Figure 7-25: the eight steps overlaid on the network of Figure 7-24, from the Segment 1
failure (Step 1) through Cat-B's inferior Configuration BPDUs, Cat-C's RLQ Request and
Cat-A's RLQ Response, to Cat-C expiring Max Age (Step 7) and sending Configuration
BPDUs listing Cat-A as the Root Bridge (Step 8).]
Although this allows Cat-B to learn about the alternate path to the Root Bridge within
several seconds, it still requires that Cat-C:Port-1/2 go through the normal Listening and
Learning states (adding 30 seconds of delay to the convergence with the default values and
8 seconds with the minimum value for Forward Delay).
TIP BackboneFast requires 4.1 or later code on the Catalyst 5000. All Catalyst 4000s and 6000s
support BackboneFast.
Disabling Port Aggregation Protocol
Port Aggregation Protocol (PAgP) is a protocol that assists in correctly configuring bundles
of Fast and Gigabit Ethernet links that act as one large EtherChannel pipe. PAgP defaults
to a mode called the auto state where it looks for other EtherChannel-capable ports. While
this process is occurring, STP is not aware that
the link is even active. This condition can be observed with show commands. For example,
show port displays a connected status for Port 1/1 immediately after it has been connected
as a trunk link (see Example 7-17).
Example 7-17 show port Output Immediately After Port 1/1 Is Connected
Cat-D (enable) show port
Port Name Status Vlan Level Duplex Speed Type
----- ------------------ ---------- ---------- ------ ------ ----- ------------
1/1 connected trunk normal a-half a-100 10/100BaseTX
1/2 notconnect trunk normal a-half a-100 10/100BaseTX
3/1 notconnect 1 normal half 10 10BaseT
3/2 notconnect 1 normal half 10 10BaseT
However, if show spantree is issued at the same time, it still displays the port as not-
connected as demonstrated in Example 7-18.
Example 7-18 show spantree Output Immediately After Port 1/1 is Connected
Cat-D (enable) show spantree 1
VLAN 1
Spanning tree enabled
Spanning tree type ieee
After approximately 15–20 seconds, PAgP releases the port for use by the rest of the box.
At this point, the port enters Listening, Learning, and then Forwarding. In short, because of
PAgP, the port took 50 seconds instead of 30 seconds to become active.
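The 50-second figure is just a sum of the delays involved. A quick sketch with the default STP timers and the approximate 20-second PAgP negotiation window discussed above (the function is illustrative, and the PAgP figure is the rough value from the text, not an exact constant):

```python
# Rough timeline of port activation with and without PAgP, using the
# default STP timers (Forward Delay = 15 seconds, spent once each in
# the Listening and Learning states).

FORWARD_DELAY = 15      # seconds per state (Listening, then Learning)
PAGP_DELAY = 20         # approximate PAgP negotiation window

def activation_time(pagp_enabled, portfast=False):
    # PortFast skips Listening and Learning entirely.
    stp = 0 if portfast else 2 * FORWARD_DELAY
    return (PAGP_DELAY if pagp_enabled else 0) + stp

print(activation_time(pagp_enabled=True))                 # 50 seconds
print(activation_time(pagp_enabled=False))                # 30 seconds
print(activation_time(pagp_enabled=True, portfast=True))  # 20: PAgP still hides the port
```

The last line shows why PortFast alone does not fix the delay on EtherChannel-capable ports: PAgP hides the port transition for its full negotiation window regardless.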
Therefore, you should carefully consider the impact of PAgP in your campus
implementations. First, it is advisable to use the desirable channeling state for links where
an EtherChannel bundle is desired. Specifically, you should avoid using the on state
because it hard-codes the links into a bundle and disables PAgP's capability to intelligently
monitor the bundle. For example, all STP BPDUs are sent over a single link of the
EtherChannel. If this one link fails, the entire bundle can be declared down if PAgP is not
running in the auto or desirable states.
TIP When using EtherChannel, code the ports to the desirable channeling state. Do not use the
on state because it disables PAgP's capability to handle Spanning Tree failover situations.
However, in cases where EtherChannel is not in use, disabling PAgP can improve Spanning
Tree performance dramatically. In general, campus networks benefit from disabling PAgP
in three situations:
End-Station Ports
Servers using fault-tolerant NICs that toggle link state during failover
Testing
End-stations can benefit from disabling PAgP on their switch ports. This can be especially
noticeable when used in conjunction with PortFast. Even with PortFast enabled, EtherChannel-
capable ports still require almost 20 seconds for activation because PAgP hides the port activation
from STP. By disabling PAgP with the set port channel mod_num/port_num off command, this
20-second delay can be almost eliminated, allowing PortFast to function as expected.
Fault-tolerant server NICs that toggle link state during failover can also benefit from a
similar performance improvement (however, most fault-tolerant NICs do not toggle link).
Otherwise, the PAgP delay needlessly interrupts server traffic for almost 20 seconds.
Finally, you should consider disabling PAgP on non-channel ports when performing STP
performance testing. Otherwise, the 20-second PAgP delay can skew your results.
TIP You might want to disable PAgP on EtherChannel-capable end-station and fault-tolerant
server ports. This can also be useful when testing STP performance.
The good news is that this should not affect trunk link failover performance in production
in most situations. For example, assume that Cat-D is using Port 1/1 and Port 1/2 as uplinks.
If Port 1/1 fails, failover can start immediately because both links have been active for some
time and are therefore past the initial PAgP lockout period. On the other hand, if Port 1/2
was acting as a cold standby and not connected when Port 1/1 failed, that is a different
matter. In this case, you need to walk up and physically plug in Port 1/2 and PAgP does add
to the STP failover time.
Although optional, it is best to get into the habit of always specifying the vlan parameter.
Otherwise, it is easy to waste crucial time looking at output for VLAN 1 (the default) when
you thought you were looking at some other VLAN. The same is also true of show spantree
statistics and show spantree blockedports.
BPDU-related parameters
port Spanning Tree enabled
state forwarding
port_id 0x8001
port number 0x1
path cost 19
message age (port/VLAN) 0(20)
designated_root 00-90-92-55-80-00
designated_cost 0
designated_bridge 00-90-92-55-80-00
designated_port 0x8001
top_change_ack FALSE
config_pending FALSE
port_inconsistency none
The output of show spantree statistics is broken into five sections. Several of the more
useful fields are discussed here. The message age (port/VLAN) field under the BPDU-
related parameters section displays two values. The first value (outside the parentheses)
displays the age of the most recently received BPDU plus any time that has elapsed since
it arrived. This can be useful to determine if Configuration BPDUs have stopped
arriving from the Root Bridge. The second value (inside the parentheses) displays the Max
Age for the VLAN, currently at the default of 20 seconds in the sample output. This is the
locally configured value, not the value received from the Root Bridge (and actually in use).
The PORT based information & statistics section presents some very useful BPDU counter
statistics. The first two lines display the number of Configuration BPDUs transmitted and
received. The next two lines display the same information for TCN BPDUs. Each line
contains two values. The first value (outside the parentheses) displays the number of
BPDUs transmitted or received on that port for the specified VLAN (if it is a trunk). The
second value (inside the parentheses) shows the total number of BPDUs received for the
entire VLAN (all ports).
If you are experiencing STP problems, this information can be used to verify that BPDUs
are flowing. However, notice that both ends of a link generally do not increment both the
transmit and the receive counters. During steady-state processing, only the Designated Port
increments the Configuration BPDU transmit counter, whereas the Root Port (or Ports) at
the other end only increments the receive counter. The BPDU counters can be invaluable
when troubleshooting situations where a link has failed in such a way that traffic cannot
flow in both directions. Without this information, it can take days to narrow down the source
of the instability.
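The troubleshooting logic described above boils down to a consistency check between the two ends of a link. A minimal sketch (the function, its parameters, and the tolerance value are hypothetical; real counters also need to account for polling skew):

```python
# Sketch of the BPDU-counter check: on a healthy link, Configuration
# BPDUs transmitted by the Designated Port should show up as received
# BPDUs on the Root Port at the far end. Counters drifting far apart
# suggest one-way traffic loss (a unidirectional link).

def link_looks_unidirectional(designated_tx, root_rx, tolerance=5):
    """True when far more BPDUs were sent than the far end received."""
    return designated_tx - root_rx > tolerance

# Healthy steady state: the two counters track each other closely.
print(link_looks_unidirectional(designated_tx=1000, root_rx=998))   # False
# Broken receive path: BPDUs leave one end but never arrive.
print(link_looks_unidirectional(designated_tx=1000, root_rx=340))   # True
```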
TIP Use the BPDU transmit and receive counters to troubleshoot link failure problems. Also,
Cisco's UniDirectional Link Detection (UDLD) can be very useful.
The VLAN based information & statistics section contains helpful information on
topology changes. Last topology change occurred shows the time and date that the last
change took place. The topology change count field shows the total number of topology
changes that have occurred since Spanning Tree initialized on this port. The topology
change last recvd. from field shows the port MAC address (the MAC address used in the
802.3 header, not the MAC address used for the BID) of the last bridge or switch to send
the current bridge a TCN BPDU. Use these fields to track instability caused by excessive
Topology Change Notifications. However, notice that unless you are using PortFast on all
of your end-station ports, every time a PC or workstation boots or shuts down it generates
a TCN BPDU.
TIP Use the topology change information in the VLAN based information & statistics section
to track down TCN BPDU problems.
NOTE Although the initial release of 802.1Q only specied a single instance of the Spanning-Tree
Protocol, the IEEE is working on multiple instances of STP in the 802.1s working group.
[Figure: a PVST+ region in the backbone, connecting to a PVST region via an ISL trunk
and to an MST region via an 802.1Q trunk.]
Because it provides interoperability between the other two types of regions, a PVST+
region is generally used in the backbone. It connects to MST regions via 802.1Q trunks and
PVST regions via ISL links. However, more flexible configurations are allowed. For
example, two PVST+ regions can connect via an MST backbone region.
[Figure: a sample network with Cat-D at the top connected to Cat-B and Cat-C (their Ports
2/1); Cat-B and Cat-C interconnect on Ports 2/3 and connect down to Cat-A's Ports 1/1
and 1/2.]
Five basic combinations of MST, PVST, and PVST+ switches can be used in this network:
All switches are PVST devices: This is the case discussed in the Spanning Tree
Load Balancing section earlier. All of the load balancing techniques covered in that
section obviously work here.
All switches are PVST+ devices: Load balancing can be implemented using exactly
the same techniques as with PVST switches. The PVST+ and CST BPDUs are
handled without any user configuration.
All switches are MST devices: No STP load balancing is possible under the current
version of 802.1Q.
An IDF switch is an MST device and remaining devices are PVST+ devices: For
PVST+ BPDUs and traffic to pass through both uplink ports on the IDF switch, both
ports need to be in the STP Forwarding state. Beyond that, normal STP load balancing
techniques can be utilized.
An MDF switch is an MST device and remaining devices are PVST+ devices:
For PVST+ BPDUs and traffic to pass through all ports of the MDF switch, all of the
ports must be placed in the Forwarding state. After that has been accomplished,
normal STP load balancing can be done.
In short, PVST+ allows all of the usual Spanning Tree load balancing techniques to be used
with one exception: all inter-switch ports on MST devices should be in the Forwarding state.
NOTE Load balancing might be possible if some MST ports are Blocking. However, the load
balancing design requires careful analysis. In general, it is easiest to design the network to
have all inter-switch MST ports in the Forwarding state (if possible).
Two problems arise when the MST ports are not forwarding:
PVST+ BPDUs are only flooded out a subset of the ports, so the PVST+ switches learn
only a subset of the topology.
Blocking MST ports destroy the capability to implement load balancing. Recall that
when an MST switch puts a port in the Blocking state, it blocks that port for all
VLANs. Because this forces all traffic to use a single path, load balancing is no longer
possible.
TIP The mapping and tunneling aspects of PVST+ require no user configuration or intervention.
This allows plug-and-play interoperability. However, load balancing might require some
extra STP tuning to force the MST ports into the Forwarding state.
It can be tricky to meet the requirement that all ports on the MST switches forward while
also distributing the traffic across multiple paths. A couple of examples illustrate this point.
First, consider the case of an MDF MST switch as shown in Figure 7-28. Cat-B has been
replaced by Sw-B, a generic 802.1Q MST switch.
[Figure 7-28: Part A shows the PVST+ VLANs (VLANs > 1), where the left path through
Sw-B appears to cost only 19 while the right path through Cat-C costs 38. Part B shows the
MST/CST VLAN (VLAN 1), where Cat-A is the Root Bridge and Cat-C:Port-2/3's cost is
raised to 20 so that all of Sw-B's inter-switch ports forward.]
Part A in Figure 7-28 illustrates the tunneling process of PVST+ BPDUs through an MST
switch (this is used for VLANs other than VLAN 1). Because the MST switch floods the
PVST+ BPDUs out all inter-switch ports (assuming that the requirement of all ports
Forwarding is met), it is as though the MST switch does not exist. An interesting consequence
of this is that the left path appears to only have a cost of 19, whereas the right path has the
usual cost of 38 (19+19). In other words, Cat-A, the Root Bridge, originates Configuration
BPDUs with a cost of zero. These BPDUs arrive without any modification on Cat-D:Port-1/
1 where the cost is increased to 19. On the right link, Cat-C receives the BPDUs and increases
the Root Path Cost to 19. When Cat-D:Port-1/2 receives these, it increases the Root Path Cost
to 38. This issue is easily overcome by increasing the cost to some large value such as 1000 on
the link you do not want the traffic to take (this is another example of a case where using the
default portvlancost behavior of lowering the cost by one does not work, as discussed in the
earlier section Decreasing Path Cost for Load Balancing). For example, traffic for VLAN 3
could be forced to take the right link by increasing VLAN 3's Path Cost on Cat-D:Port-1/1 to
1000 (the cost of the right path would remain 38 and be more attractive).
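The cost comparison in this example is easy to check numerically. A minimal sketch, assuming only the figures above (link costs of 19, and the tunneling behavior in which the MST switch adds nothing to the PVST+ cost):

```python
# Root Path Cost seen by Cat-D on each uplink for a PVST+ VLAN, per
# the tunneling example: BPDUs pass through the MST switch unmodified.

LINK_COST = 19

def left_path_cost(portvlancost=LINK_COST):
    # The whole left path costs only what Cat-D:Port-1/1 itself adds,
    # because Sw-B tunnels the PVST+ BPDUs without touching the cost.
    return portvlancost

def right_path_cost():
    # Cat-C adds 19, then Cat-D:Port-1/2 adds another 19.
    return LINK_COST + LINK_COST

print(left_path_cost())         # 19: the left link wins by default
print(right_path_cost())        # 38
# Forcing VLAN 3 onto the right link by raising the left port's cost:
print(left_path_cost(1000) > right_path_cost())   # True
```

This also makes the portvlancost caveat concrete: lowering one side by a single unit could never overcome a 19-versus-38 gap in the wrong direction, whereas raising the unwanted side to 1000 decides it cleanly.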
Part B in Figure 7-28 illustrates the active topology seen in VLAN 1, the MST/CST VLAN. This
is the VLAN where the STP parameters must be tuned to meet the requirement that all ports on
the MST switch be Forwarding. One easy way to meet this requirement is to make Sw-B the
Root Bridge for VLAN 1. In a simple topology such as that shown in Figure 7-28, this is probably
the most effective approach. However, a more flexible technique is generally required for larger
networks. Part B shows a solution that can be utilized in these cases (note that Cat-A is the Root
Bridge). Sw-B:Port-1 and Sw-B:Port-3 are Forwarding by default (Sw-B:Port-1 becomes a
Designated Port and Sw-B:Port-3 becomes the Root Port). Sw-B:Port-2, on the other hand,
might or might not become a Designated Port (if Cat-C has a lower BID, Cat-C:Port-2/2 wins
the election). To eliminate this chance and force Sw-B:Port-2 to win the Designated Port
election, the Path Cost of Cat-C:Port-2/3 can be increased from 19 to 20 (or something even
higher).
TIP Load balancing generally requires all inter-switch ports on MST switches to be in the
Forwarding state.
Figure 7-29 illustrates the case of an IDF MST switch. Cat-B has been put back into service
and the generic 802.1Q switch has been relocated to the IDF wiring closet in place of Cat-D
(it is called Sw-D).
Figure 7-29 IDF MST Switch Load Balancing
[Part A shows the PVST+ VLANs (VLANs > 1), with both of Sw-D's uplink ports
Forwarding. Part B shows the MST/CST VLAN (VLAN 1), where Cat-A is the Root Bridge
and the costs on Cat-C:Port-2/2 and Cat-C:Port-2/3 are raised to 20 so that Cat-C:Port-2/1
enters the Blocking state.]
In this case, both uplink ports on Sw-D must be Forwarding. However, to ensure this behavior,
one of Sw-D's ports must become a Root Port and the other must become a Designated Port.
This can be accomplished by increasing the cost on Cat-C:Port-2/2 and Cat-C:Port-2/3
enough to force Cat-C:Port-2/1 into the Blocking state (because Sw-D:Port-2 has the most
attractive port for the Cat-C to Sw-D segment). Now that the MST switch has all inter-switch
ports in the Forwarding state, load balancing can be addressed. In this case, Cat-B should be
configured to forward half of the VLANs (for example, the even VLANs) while Cat-C
handles the other half (the odd VLANs). This can be done by increasing (or decreasing) the
Path Cost on Ports 2/3 of Cat-B and Cat-C for alternating VLANs.
NOTE Note that the Path Cost on Cat-C:Port-2/3 needs to be set between 20 and 37 for the
topology to work as described in Figure 7-29. If it were set lower, Cat-C:Port-2/1 would
have a lower Root Path Cost than Sw-D:Port-2, causing Sw-D:Port-2 to enter the Blocking
state. On the other hand, if Cat-C:Port-2/3's Path Cost were set higher than 37, Cat-
C:Port-2/1 would become the Root Port for Cat-C, causing Cat-C:Port-2/3 to enter the
Blocking state.
Disabling STP
It might be necessary to disable Spanning Tree in some situations. For example, some
network administrators disable STP in frustration after not being able to resolve STP bugs
and design issues. Other people disable STP because they have loop-free topologies. Some
shops resort to disabling STP because they are not aware of the PortFast feature (not to
mention its interaction with PAgP as discussed earlier).
If you do need to disable STP, Catalysts offer the set spantree disable command. On most
Catalyst systems, STP can be disabled on a per-VLAN basis. For example, set spantree
disable 2 disables STP for VLAN 2. However, don't forget that this disables STP for all ports
in the specified VLAN; Layer 2 Catalyst switches such as the 4000s, 5000s, and 6000s
currently do not offer the capability to disable STP on a per-port basis. Example 7-26 shows
the use of the set spantree disable command to disable STP for VLAN 1 on Cat-A.
If you are using certain Layer 3 switching technologies such as the NetFlow Feature Card,
STP can only be disabled for an entire device (all VLANs).
TIP STP cannot be disabled per port on Layer 2-oriented Catalyst equipment such as the 4000s,
5000s, and 6000s. When these Catalysts are not using a NFFC, you are allowed to disable
STP per VLAN, but this applies to all ports in the VLAN on the specied device. Because
devices such as the Catalyst 8500 use the full router IOS, you have complete control over
where Spanning Tree runs (through the use of bridge-group statements).
Disabling STP on an entire device can be accomplished with the set spantree disable all
command.
However, it is generally better to use features such as PortFast, UplinkFast, Layer 3
switching, and a scalable design than it is to completely disable Spanning Tree. When
Spanning Tree is disabled, your network is vulnerable to misconfigurations and other
mistakes that might create bridging loops.
TIP Dont take disabling STP lightly. If loops are formed by mistake, the entire network can
collapse. In general, it is preferable to utilize features such as UplinkFast than to entirely
disable STP.
One of the more common places where Spanning Tree can be disabled is when using an
ATM campus core. Because LANE naturally provides a loop-free environment, some
ATM-oriented vendors leave Spanning Tree disabled by default. However, for this to work,
you must be very careful to avoid loops in the Ethernet portion of your network. Besides
preventing loops between end-user ports, you generally must directly connect every IDF
switch to the ATM core (in other words, redundant Ethernet links cannot be used from the
MDF closets to the IDF closets because they would form loops).
Finally, notice that when STP is disabled on Layer 2 Catalyst equipment such as the 4000s,
5000s, and 6000s, BPDUs are flooded through the box. In other words, as soon as Spanning
Tree is disabled, the 01-80-C2-00-00-00 multicast address is again treated as a normal
multicast frame (rather than being directed to the Supervisor where the frames are absorbed
and possibly regenerated). The net effect of this is that Catalysts with Spanning Tree
disabled are invisible to neighboring switches that are still running the protocol. To these
switches, the Catalyst with STP disabled is indistinguishable from a Layer 1 hub (at least
as far as STP goes).
Tips and Tricks: Mastering STP 289
TIP Many CWSI/CiscoWorks 2000 users are not aware that it contains a Spanning Tree
mapping tool. To use it, first pull up VLAN Director. Then select a VTP domain. Then pick
a VLAN in that domain. This highlights the nodes and links that participate in the VLAN.
It also brings up the VLAN section (it has a yellow light bulb next to it). Click the Spanning
Tree checkbox, and the Blocking ports are marked with a sometimes-hard-to-see X.
Update your design after adding to the network. After carefully implementing load
balancing and Root Bridge failover strategies, be sure to evaluate the impact of network
additions and modifications. By adding devices and paths, it is easy for innocent-looking
changes to completely invalidate your original design. Also be sure to update your
documentation and diagrams. This is especially true if you are using a flat-earth design (see
the section "Campus-Wide VLANs Model" in Chapter 14).
Avoid timer tuning in flat designs. Unless you have a narrow Spanning Tree diameter and
a very controlled topology, timer tuning can do more harm than good. It is usually safer and
more scalable to employ techniques such as UplinkFast and BackboneFast.
Max Age tuning is less risky than Forward Delay tuning. Although overly-aggressive Max
Age tuning can lead to excessive Root Bridge, Root Port, and Designated Port elections, it
is less dangerous than overly-aggressive Forward Delay tuning. Because Forward Delay
controls the time a device waits before placing ports in the Forwarding state, a very small
value can allow devices to create bridging loops before accurate topology information has
had time to propagate. See the sections "Tuning Max Age" and "Tuning Forward Delay"
earlier in this chapter for more information.
If you do resort to timer tuning, consider using the set spantree root macro. This macro sets
Spanning Tree parameters based on the recommended formulas at the end of the 802.1D spec.
For more information, see the section "Using A Macro: set spantree root" in Chapter 6.
Use timer tuning in networks using the multilayer design model. Because this approach
constrains the Layer 2 topology into lots of Layer 2 triangles, consider using STP timer
tuning in networks using the multilayer design model. In general, it is recommended to use
the set spantree root command and specify a diameter of 2 to 3 hops and a Hello Time of
two seconds. See the section "Timer Tuning" in Chapter 15 for more detailed information.
Utilize Root Bridge placement load balancing in networks employing MLS and the
multilayer design model. This chapter discussed the importance of using Layer 3
switching to limit the size of your Spanning Tree domains. Chapters 11, 14, 15, and 17 look
at the use of various approaches to Layer 3 switching and make some specific
recommendations on how to use this technology for maximum benefit. Chapter 14 details
one of the most successful campus design architectures, the so-called multilayer model.
When implemented in conjunction with Cisco's MLS/NFFC, the multilayer model seeks to
reduce Spanning Tree domains to a series of many small Layer 2 triangles. Because these
triangles have very predictable and deterministic traffic flows, they are very well suited to
using the Root Bridge Placement form of STP load balancing. In general, the Root Bridges
should be located near or collocated with the default gateway router for that VLAN.
Port/VLAN cost is the most flexible STP load balancing technique. Other than Root
Bridge placement, which can be useful in networks with well-defined traffic patterns for
each VLAN, port/VLAN cost load balancing is the preferable option. In general, it should
be used in all situations other than the case mentioned in the previous tip. For details, see
"Load Balancing with Port/VLAN Cost" earlier in this chapter.
Spanning Tree load balancing requires that multiple VLANs be assigned to IDF
wiring closet switches. Although assigning a single VLAN to IDF switches can make
administration easier, it prevents STP load balancing from being possible. In some cases,
non-STP load balancing techniques such as MHSRP and EtherChannel might still be
possible with a single IDF VLAN.
Use a separate management VLAN. Just like end-stations, Catalysts must process all
broadcast packets in the VLAN where they are assigned. In the case of a Layer 2 Catalyst,
this is the VLAN where SC0 is placed. If this VLAN contains a large amount of broadcast
traffic, it can then overload the CPU and cause it to drop frames. If end-user traffic is
dropped, no problem. However, if STP (or other management) frames are dropped, the
network can quickly destabilize. Isolating the SC0 logical interfaces in their own VLAN
protects them from end-user broadcasts and allows the CPU to focus only on important
management traffic. In many cases, the factory default VLAN (VLAN 1 for Ethernet)
works well as the management VLAN. Chapter 15 discusses management VLAN design
issues and techniques.
Minimize Layer 2 loops in the management VLAN. Many networks contain lots of
redundancy in the management VLAN. The thought is that it prevents failovers from
isolating switch management capabilities. Unfortunately, this can also destabilize the
network. In networks with lots of management VLAN loops, all it takes is a single switch
to become overloaded or run into an STP bug. If this switch then opens up a bridging loop
in the management VLAN, suddenly neighboring bridges see a flood of broadcast and
multicast traffic. As this traffic inevitably overloads the neighboring switches, they can
create additional bridging loops. This phenomenon can pass like a wave across the entire
network with catastrophic results. Although it might only directly affect VLAN 1, by
disabling the CPUs in all bridges and switches, it effectively shuts down the entire Layer 2
network. To avoid these problems, it is advisable to use Layer 3 routing, not Layer 2
bridging, to provide redundancy in the management VLAN. This point is discussed further
in Chapter 15.
Use Layer 3 switching to reduce the size of Spanning Tree domains. Now that you are
armed with a truck-load of STP knowledge, creating scalable and flexible STP designs
should be a breeze! However, this knowledge should also lead you to the conclusion that
excessively large Spanning Tree domains are a bad idea. Small STP domains provide the
best mix of failover performance and reliability. See Chapters 11, 14, 15, and 17 for more
information.
Try to design your network such that Spanning Tree domains consist of MDF-IDF-
MDF triangles. This maximizes STP's value as a Layer 2 failover feature while minimizing
any scalability concerns.
Use PortFast on end-station ports to reduce Topology Change Notifications. Not only
can PortFast eliminate problems associated with devices that boot and access the network
quickly, it reduces the number of Topology Change Notifications in the network. See the
"PortFast" section in this chapter for more information.
Use UplinkFast to improve IDF wiring closet failover time. UplinkFast is an extremely
effective and clever STP optimization that reduces most wiring closet failover times to two
or three seconds. See the "UplinkFast" section in this chapter for more information.
PVST+ load balancing requires all inter-switch ports on MST switches to be in the
Forwarding state. PVST+ allows traditional PVST Catalysts to interoperate with 802.1Q
switches that only use a single Spanning Tree (MST). Best of all, it does this without any
additional configuration! However, it might require careful planning to maintain effective
STP load balancing. See the "PVST+" section of this chapter for more information.
Always specify the VLAN parameter with Spanning Tree commands to avoid
accidental changes to VLAN 1. Many of the Spanning Tree set and show commands allow
you to omit the VLAN parameter. When doing so, you are implying VLAN 1. To avoid
confusion and unintentional modications to VLAN 1, it is best to get in the habit of always
specifying this parameter.
The original implementations of Fast EtherChannel in 2.2 and 2.3 NMP images did
not support STP over EtherChannel. Spanning Tree still viewed the link as two or four
separate ports and would block all but one (obviously defeating the purpose of
EtherChannel). The limitation was solved in 3.1 and later versions of code. Don't be misled
by older or incorrect documentation that does not reflect this enhancement: using STP
over EtherChannel is generally a good idea.
Utilize the desirable EtherChannel mode for maximum Spanning Tree stability. When
using the on mode, it is possible for the entire channel to be declared down when only the single
link carrying STP fails. See Chapter 8 for more information on EtherChannel technology.
Be certain that you do not overload your CPU with Spanning Tree calculations. Keep
the total number of logical ports below the values specified in Table 7-4.
Use the following formula to calculate the number of logical ports on your devices:
Logical ports = (number of VLANs on non-ATM trunks) +
(2 × number of VLANs on ATM trunks) + (number of non-trunk ports)
In other words, you want to add up the total number of VLANs on every port in your box.
ATM VLANs (these are called ELANs; see Chapter 9, "Trunking with LAN Emulation")
are more heavily weighted by counting them twice.
For example, consider a Catalyst 5000 MDF switch with 100 Ethernet trunk ports, each of
which carries 25 VLANs. Also assume that the MDF switch has 32 Ethernet-attached servers
using non-trunk links. In this case, the total number of logical ports would be:
2,532 = (100 trunks × 25 VLANs) + 32 non-trunk ports
Exercises 293
Although this is by no means the largest MDF switch possible with Catalyst equipment,
notice that it requires a Supervisor III. If the trunks were ATM trunks, the total number of
logical ports swells to 5,032, more than even a Catalyst 5000 Supervisor III can handle.
Finally, note that this calculation assumes a Hello Time of two seconds. If you have
decreased your Hello Time to one second to speed convergence, double the value you
calculate in the formula. For instance, the number of Ethernet logical ports in the previous
example would be 5,064, and the number of ATM logical ports would swell to 10,064!
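The bookkeeping above can be sketched in a few lines (logical_ports is a hypothetical helper, not a Catalyst command; it simply applies the formula and the Hello Time doubling just described):

```python
def logical_ports(ethernet_trunk_vlans, atm_trunk_vlans, non_trunk_ports,
                  hello_time=2):
    # Sum the VLANs carried on every trunk port; ATM ELANs count double.
    total = (sum(ethernet_trunk_vlans)
             + 2 * sum(atm_trunk_vlans)
             + non_trunk_ports)
    # Cutting the Hello Time from two seconds to one doubles the load.
    if hello_time == 1:
        total *= 2
    return total

# The MDF example: 100 Ethernet trunks carrying 25 VLANs each,
# plus 32 non-trunk server ports.
print(logical_ports([25] * 100, [], 32))   # 2532
```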
Exercises
This section includes a variety of questions and hands-on lab exercises. By completing
these you can test your mastery of the material included in this chapter as well as help
prepare yourself for the CCIE written and lab tests.
Review Questions
1 Label the port types (RP=Root Port, DP=Designated Port, NDP=non-Designated
Port) and the STP states (F=Forwarding, B=Blocking) in Figure 7-30. The Bridge IDs
are labeled. All links are Fast Ethernet. Assume that there is only a single VLAN and
that the portvlanpri command has not been used.
[Figure 7-30 artwork: Cat-A and Cat-B joined by two Fast Ethernet links between ports 1/1
and 1/2 on each switch; BID = 32,768.AA-AA-AA-AA-AA-AA (Cat-A) and
BID = 100.BB-BB-BB-BB-BB-BB (Cat-B).]
[Figure artwork: switches C-1, C-4, C-5, C-6, C-8, and C-10 with links carrying
VLANs 101-103, VLANs 101-102, VLAN 101, and VLANs 101-115.]
5 When is the Root Bridge placement form of STP load balancing most effective? What
command(s) are used to implement this approach?
6 When is the Port Priority form of STP load balancing useful? What command(s) are
used to implement this approach? What makes this technique so confusing?
7 When is the Bridge Priority form of STP load balancing useful? What command(s)
are used to implement this approach? What makes this technique so confusing?
8 When is the portvlancost form of load balancing useful? What is the full syntax of
the portvlancost command? What is the one confusing aspect of this technique?
9 What technology should be used in place of portvlanpri?
10 What are the components that the default value of Max Age is designed to account
for? There is no need to specify the exact formula, just the major components captured
in the formula.
11 What are the components that the default value of Forward Delay is designed to
account for? There is no need to specify the exact formula, just the major components
captured in the formula.
12 What are the main considerations when lowering the Hello Time from the default of
two seconds to one second?
13 Where should PortFast be utilized? What does it change about the STP algorithm?
14 Where should UplinkFast be utilized? In addition to altering the local bridging table
to reflect the new Root Port after a failover situation, what other issue must UplinkFast
address?
Hands-On Lab
Complete an STP design for the network shown in Figure 7-32.
[Figure 7-32 artwork: a three-building campus. Building 1 contains C-1A, C-1B, C-1C, and
C-1D plus a Server Farm switch; Building 2 contains C-2A through C-2D; Building 3
contains C-3A through C-3D.]
Figure 7-32 shows a three-building campus. Each building contains two MDF switches (A
and B) and two IDF switches (C and D). The number of IDF switches in each building is
expected to grow dramatically in the near future. The server farm has its own switch that
connects to Cat-1A and Cat-1B. The network contains 20 VLANs. Assume that each server
can be connected to a single VLAN (for example, the SAP server can be connected to the
Finance VLAN). Assume that all links are Fast Ethernet except the ring of links between
the MDF switches, which are Gigabit Ethernet.
Be sure to address the following items: STP timers, Root Bridges, Load Balancing, failover
performance, and traffic flows. Diagram the primary and backup topologies for your design.
NOTE This design utilizes what Chapters 14 and 15 refer to as the campus-wide VLANs design
model. In general, this design is not recommended for large campus designs. However, it is
used here because it makes extensive use of the Spanning-Tree Protocol. For more
information on campus-wide VLANs and other design alternatives, please consult Chapters
11, 14, 15, and 17.
PART III: Trunking
Chapter 8 Trunking Technologies and Applications
Why Trunks?
When all of the Catalysts in a network support one VLAN and need connectivity, you can
establish links between the Catalysts to transport intra-VLAN traffic. One approach to
interconnecting Catalysts uses links dedicated to individual VLANs. For example, the
network in Figure 8-1 connects several Catalysts together. All of the Catalyst configurations
include only one VLAN; all ports belong to the same VLAN. Catalysts A and B
interconnect with two direct links for resiliency. If one link fails, Spanning Tree enables the
second link.
302 Chapter 8: Trunking Technologies and Applications
When you dedicate a link to a single VLAN, this is called an access link. Access links never
carry traffic from more than one VLAN. You can build an entire switched network with
access links. But as you add VLANs, dedicated links consume additional ports in your
network when you extend the VLAN to other switches.
In Figure 8-1, multiple links interconnect the Catalysts, but each link belongs to only 1
VLAN. This is possible because there is only one VLAN in the network. What if there were
more than one? To interconnect multiple VLANs, you need a link for each VLAN. The
network in Figure 8-2 interconnects six Catalysts and contains three distributed VLANs.
Notice that Cat-B has members of all three VLANs, whereas its neighbors only have
members of two VLANs. Even though the neighbors do not have members of all VLANs,
an access link for all three VLANs is necessary to support Cat-B. Without the VLAN 3
access links attached to Cat-B, VLAN 3 members attached to Cat-B are isolated from
VLAN 3 members on other Catalysts.
Why Trunks? 303
[Figure 8-2 artwork: six Catalysts interconnected with a dedicated access link per VLAN;
the switches carry various combinations of VLANs 1, 2, and 3 (for example, Cat-A carries
VLANs 1 and 2, and Cat-B carries VLANs 1, 2, and 3).]
When deploying a network with access links, each link supplies dedicated bandwidth to the
VLAN. The link could be a standard 10-Mbps link, a Fast Ethernet, or even a Gigabit
Ethernet link. You can select the link speed appropriate for your VLAN requirements.
Further, the link for each VLAN can differ. You can install a 10-Mbps link for VLAN 1 and
a 100-Mbps link for VLAN 2.
Unfortunately, access links do not scale well as you increase the number of VLANs or
switches in your network. For example, the network of Figure 8-1 uses 34 interfaces and 17
links to interconnect the VLANs. Imagine if there were 20 switches in the network with
multiple VLANs. Not only does your system cost escalate, but your physical layer tasks as
an administrator quickly become unbearable as the system expands.
Alternatively, you can enable a trunk link between Catalysts. Trunks allow you to distribute
VLAN connectivity without needing to use as many interfaces and cables. This saves you
cost and administrative headaches. A trunk multiplexes traffic from multiple VLANs over
a single link. Figure 8-3 illustrates the network from Figure 8-2 deployed with trunks.
[Figure 8-3 artwork: the same six Catalysts interconnected with single trunk links, each
carrying all of the VLANs shared by its two endpoints.]
In this network, only 12 ports and six links are used. Although VLANs share the link
bandwidth, you conserve capital resources in your network by sharing the links. The
majority of this chapter focuses on connectivity between switches. As a practical
introduction to trunks, the following section describes reasons to attach routers and file
servers to switches with trunks.
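The port arithmetic behind this comparison is easy to verify. The two helpers below are invented for illustration; each counts two switch ports per link:

```python
def access_link_ports(vlans_per_adjacency):
    # One dedicated link (two switch ports) per VLAN shared on each adjacency.
    return 2 * sum(vlans_per_adjacency)

def trunk_ports(adjacencies):
    # One trunk (two switch ports) per adjacency, however many VLANs it carries.
    return 2 * adjacencies

# Six trunked adjacencies, as in Figure 8-3:
print(trunk_ports(6))   # 12
```

With 17 single-VLAN access links, the same count yields the 34 interfaces mentioned above.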
[Figure 8-4 artwork: workstations in VLANs 2, 3, and 4 attach to a switch; trunk links
carrying VLANs 2, 3, and 4 connect the switch to a file server and a router.]
In Figure 8-4, workstations belong to VLANs 2, 3, and 4. Because these stations attach to
different broadcast domains, they cannot communicate with each other except through a
router. Trunks connect a file server and a router to the switched network. The trunk
connection to the router enables inter-VLAN connectivity. Without trunks, you can use
multiple interfaces on the router and attach each to a different port on the switch as in Figure
8-5. The difficulty you might experience, though, is in the number of VLANs that this
configuration supports. If the connections are high-speed interfaces like Fast Ethernet, you
might only install a couple of interfaces. If you use 10-Mbps interfaces, you might not have
the bandwidth that you want to support the VLANs.
Figure 8-5 A Brute Force Method of Attaching Routers and Servers to Multiple VLANs
Likewise, you could attach a file server to more than one VLAN through multiple interface
cards. As when interconnecting switches with dedicated links, this does not scale well and
costs more than a trunk link. Therefore, the trunk connectivity used in Figure 8-4 is usually
more reasonable.
When a router or file server attaches as a trunk to the switch, it must understand how to
identify data from each of the VLANs. The router must, therefore, understand the
multiplexing technique used on the link. In a Cisco environment, this can be either ISL or
802.1Q over Ethernet, 802.10 over FDDI, or LANE/MPOA over ATM. In a mixed vendor
environment, you must trunk with 802.1Q or LANE/MPOA.
NOTE Some vendors such as Intel and others supply ISL-aware adapter cards for workstations
allowing you to use Cisco's trunk protocols. This is beneficial if you want to attach a file
server to the Catalyst using a trunk link rather than multiple access links.
Ethernet Trunks
Most trunk implementations use Ethernet. You can construct Ethernet trunks using Fast
Ethernet or Gigabit Ethernet, depending upon your bandwidth needs. EtherChannel
(defined in greater detail in the sections that follow) creates additional bandwidth options
by combining multiple Fast Ethernet or Gigabit Ethernet links. The combined links behave
as a single interface, load distribute frames across each segment in the EtherChannel, and
provide link resiliency.
Ethernet Trunks 307
Simply interconnecting Catalysts with Ethernet does not create trunks. By default, you
create an access link when you establish an Ethernet interconnection. When the port
belongs to a single VLAN, the connection is not a trunk in the true sense, as this connection
never carries traffic from more than one VLAN.
To make a trunk, you must not only create a link, but you must enable trunk processes. To
trunk over Ethernet between Catalysts, Cisco developed a protocol to multiplex VLAN
traffic. The multiplexing scheme encapsulates user data and identifies the source VLAN for
each frame. The protocol, called Inter-Switch Link (ISL), enables multiple VLANs to share
a virtual link such that the receiving Catalyst knows in what VLAN to constrain the packet.
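Conceptually, the multiplexing works as sketched below. This is only an illustration of the idea, not ISL itself (ISL actually prepends a 26-byte header and appends a CRC to the frame), and the helper names are invented:

```python
def trunk_transmit(frame, source_vlan):
    # Encapsulate: tag the frame with the VLAN it came from.
    return (source_vlan, frame)

def trunk_receive(tagged_frame, port_vlans):
    # De-encapsulate and constrain the frame to ports in the same VLAN.
    vlan, frame = tagged_frame
    return [port for port, v in port_vlans.items() if v == vlan]

# Hypothetical port-to-VLAN assignments on the receiving Catalyst:
ports = {'1/1': 2, '1/2': 3, '2/1': 2}
print(trunk_receive(trunk_transmit(b'payload', 2), ports))   # ['1/1', '2/1']
```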
TIP Trunks allow you to more easily scale your network than access links. However, be aware
that a Layer 2 broadcast loop (normally eliminated with Spanning Tree) for a VLAN carried
on a trunk degrades all VLANs on the trunk. Be sure to enable Spanning Tree for all VLANs
when using trunks.
The following sections describe EtherChannel and ISL. The physical layer aspects of
EtherChannel are covered first, followed by a discussion of ISL encapsulation.
EtherChannel
EtherChannel provides you with incremental trunk speeds between Fast Ethernet and
Gigabit Ethernet, or even at speeds greater than Gigabit Ethernet. Without EtherChannel,
your connectivity options are limited to the specific line rates of the interface. If you want
more than the speed offered by a Fast Ethernet port, you need to add a Gigabit Ethernet
module and immediately jump to this higher-speed technology. You do not have any
intermediate speed options. Alternatively, you can create multiple parallel trunk links, but
Spanning Tree normally treats these as a loop and shuts down all but one link to eliminate
the loop. You can modify Spanning Tree to keep links open for some VLANs and not others,
but this requires significant configuration on your part.
EtherChannel, on the other hand, allows you to build incremental speed links without
having to incorporate another technology. It provides you with some link speed scaling
options by effectively merging or bundling the Fast Ethernet or Gigabit Ethernet links and
making the Catalyst or router use the merged ports as a single port. This simplifies
Spanning Tree while still providing resiliency. EtherChannel resiliency is described later.
Further, if you want to get speeds greater than 1 Gbps, you can create Gigabit
EtherChannels by merging Gigabit Ethernet ports into an EtherChannel. With a Catalyst
6000 family device, this lets you create bundles up to 8 Gbps (16 Gbps full duplex).
Unlike the multiple Spanning Tree option just described, EtherChannel treats the bundle of
links as a single Spanning Tree port and does not create loops. This reduces much of your
configuration requirements, simplifying your job.
EtherChannel works as an access or trunk link. In either case, EtherChannel offers more
bandwidth than any single segment in the EtherChannel. EtherChannel combines multiple
Fast Ethernet or Gigabit Ethernet segments to offer more apparent bandwidth than any of the
individual links. It also provides link resiliency. EtherChannel bundles segments in groups of
two, four, or eight. Two links provide twice the aggregate bandwidth of a single link, and a
bundle of four offers four times the aggregate bandwidth. For example, a bundle of two Fast
Ethernet interfaces creates a 400-Mbps link (in full-duplex mode). This enables you to scale
links at rates between Fast Ethernet and Gigabit Ethernet. Bundling Gigabit Ethernet
interfaces exceeds the speed of a single Gigabit Ethernet interface. A bundle of four Gigabit
Ethernet interfaces can offer up to 8 Gbps of bandwidth. Note that the actual line rate of each
segment remains at its native speed. The clock rate does not change as a result of bundling
segments. The two Fast Ethernet ports comprising the 400-Mbps EtherChannel each operate
at 100 Mbps (in each direction). The combining of the two ports does not create a single 200-
Mbps connection. This is a frequently misunderstood aspect of EtherChannel technology.
EtherChannel operates as either an access or trunk link. Regardless of the mode in which
the link is configured, the basic EtherChannel operation remains the same. From a
Spanning Tree point of view, an EtherChannel is treated as a single port rather than multiple
ports. When Spanning Tree places an EtherChannel in either the Forwarding or Blocking state,
it puts all of the segments in the EtherChannel in the same state.
Bundling Ports
When bundling ports for EtherChannel using early EtherChannel-capable line modules,
you must follow several rules:
- Bundle two or four ports.
- Use contiguous ports for a bundle.
- All ports must belong to the same VLAN. If the ports are used for trunks, all ports
must be set as a trunk.
- If you set the ports to trunk, make sure that all ports pass the same VLANs.
- Ensure that all ports at both ends have the same speed and duplex settings.
- You cannot arbitrarily select ports to bundle. See the following descriptions for
guidelines.
These rules are generally applicable to many EtherChannel-capable modules; however,
some exceptions exist with later Catalyst modules. For example, the Catalyst 6000 line
cards do not constrain you to use even numbers of links. You can create bundles with three
links if you so choose. Nor do the ports have to be contiguous, or even on the same line
card, as is true with some Catalyst devices and line modules. The previously mentioned
exceptions of the Catalyst 6000 EtherChannel rules come from newer chipsets on the line
modules. These newer chips are not present on all hardware. Be sure to check your
hardware features before attempting to create any of these other bundle types.
Early EtherChannel-capable modules incorporate a chip called the Ethernet Bundling
Controller (EBC), which manages aggregated EtherChannel ports. For example, the EBC
manages traffic distribution across each segment in the bundled link. The distribution
mechanism is described later in this section.
When selecting ports to group for an EtherChannel, you must select ports that belong to the
same EBC. On a 24-port EtherChannel capable module, there are three groups of eight
ports. On a 12-port EtherChannel capable module, there are three groups of four ports.
Table 8-1 shows 24- and 12-port groupings.
Table 8-2 Valid and Invalid 12-Port EtherChannel Examples (for Original Catalyst 5000 Implementations)
Port             1   2   3   4   5   6   7   8   9  10  11  12
Example A (OK)   1   1   2   2   3   3   4   4   5   5   6   6
Example B (OK)   1   1   -   -   2   2   -   -   3   3   -   -
Example C (OK)   1   1   1   1   2   2   -   -   -   -   -   -
Example D (NOK)  -   -   1   1   -   -   -   -   -   -   -   -
Example E (NOK)  1   1   2   2   2   2   -   -   -   -   -   -
Example F (NOK)  1   -   1   -   -   -   -   -   -   -   -   -
Example G (NOK)  -   -   -   -   -   -   1   1   -   -   -   -
For example, in a 12-port module, you can create up to two dual segment EtherChannels
within each group as illustrated in Example A of Table 8-2. Or, you can create one dual
segment EtherChannel within each group as in Example B of Table 8-2. Example C
illustrates a four-segment and a two-segment EtherChannel.
You must avoid some EtherChannel configurations on early Catalyst 5000 equipment.
Example D of Table 8-2 illustrates an invalid two-segment EtherChannel using Ports 3 and
4 of a group. The EBC must start its bundling with the first ports of a group. This does not
mean that you have to use the first group. In contrast, a valid dual segment EtherChannel
can use Ports 5 and 6 with no EtherChannel on the first group.
Example E illustrates another invalid configuration. In this example, two EtherChannels are
formed. One is a dual-segment EtherChannel, the other is a four-segment EtherChannel.
The dual-segment EtherChannel is valid. The four-segment EtherChannel, however,
violates the rule that all ports must belong to the same group. This EtherChannel uses two
ports from the first group and two ports from the second group.
Example F shows an invalid configuration where an EtherChannel is formed with
discontiguous segments. You must use adjacent ports to form an EtherChannel.
Finally, Example G shows an invalid EtherChannel because it does not use the first ports on
the module to start the EtherChannel. You cannot start the EtherChannel with middle ports
on the line module.
All of the examples in Table 8-2 apply to the 24-port modules too. The only difference
between a 12- and 24-port module is the number of EtherChannels that can be formed
within a group. The 12-port module allows only two EtherChannels in a group, whereas the
24-port module supports up to four EtherChannels per group.
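These grouping constraints can be collected into a small checker. The valid_bundles helper below is hypothetical, written against the rules and examples above for a 12-port module (ports 1-4, 5-8, and 9-12 form the three EBC groups):

```python
GROUP_SIZE = 4   # a 12-port module has three EBC groups of four ports

def valid_bundles(bundles, group_size=GROUP_SIZE):
    used = sorted(port for bundle in bundles for port in bundle)
    for bundle in bundles:
        ports = sorted(bundle)
        if len(ports) not in (2, 4):
            return False                  # bundle two or four segments
        if ports != list(range(ports[0], ports[0] + len(ports))):
            return False                  # ports must be contiguous (Example F)
        if len({(p - 1) // group_size for p in ports}) > 1:
            return False                  # cannot span EBC groups (Example E)
    # Bundled ports must fill each group starting from its first port
    # (Example D is invalid, yet ports 5-6 alone are fine).
    for group in {(p - 1) // group_size for p in used}:
        members = [p for p in used if (p - 1) // group_size == group]
        first = group * group_size + 1
        if members != list(range(first, first + len(members))):
            return False
    return True
```

For instance, Example C passes as `valid_bundles([(1, 2, 3, 4), (5, 6)])`, while Example D fails as `valid_bundles([(3, 4)])`.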
One significant reason for constraining bundles within an EBC stems from the load
distribution that the EBC performs. The EBC distributes frames across the segments of an
EtherChannel based upon the source and destination MAC addresses of the frame. This is
accomplished through an Exclusive OR (X-OR) operation. X-OR differs from a normal
OR operation. OR states that when at least one of two bits is set to a 1, the result is a 1.
X-OR yields a 1 only when exactly one of the two bits being compared is a 1. Otherwise,
the result is a 0. This is illustrated in Table 8-3.
The EBC uses X-OR to determine over what segment of an EtherChannel bundle to
transmit a frame. If the EtherChannel is a two-segment bundle, the EBC performs an X-OR
on the last bit of the source and destination MAC address to determine what link to use. If
the X-OR generates a 0, segment 1 is used. If the X-OR generates a 1, segment 2 is used.
Table 8-4 shows this operation.
The middle column denotes a binary representation of the last octet of the MAC address.
An x indicates that the value of that bit does not matter. For a two-segment link, only the
last bit matters. Note that the first column only states Address 1 or Address 2. It does not
specify which is the source or destination address. X-OR produces exactly the same result
regardless of which is first. Therefore, Example 2 really indicates two situations: one where
the source address ends with a 0 and the destination address ends in a 1, and the inverse.
Frames between devices use the same link in both directions.
A four-segment operation performs an X-OR on the last two bits of the source and
destination MAC address. An X-OR of the last two bits yields four possible results. As with
the two-segment example, the X-OR result specifies the segment on which the frame travels.
Table 8-5 illustrates the X-OR process for a four-segment EtherChannel.
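The selection logic can be sketched as follows (choose_segment is an invented name; MAC addresses are given as integers, and the returned index is 0-based where the text numbers segments from 1):

```python
def choose_segment(mac_a, mac_b, segments):
    # X-OR the two addresses and keep the last bit (2 segments)
    # or the last two bits (4 segments) to pick the outgoing link.
    mask = segments - 1
    return (mac_a ^ mac_b) & mask

# The same address pair maps to the same segment in both directions:
print(choose_segment(0x0A, 0x0B, 2), choose_segment(0x0B, 0x0A, 2))   # 1 1
```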
NOTE Newer Catalyst models such as the 6000 series have the ability to perform the load
distribution on just the source address, the destination address, or both. Further, they have
the ability to use the IP address or the MAC addresses for the X-OR operation.
Some other models such as the 2900 Series XL perform X-OR on either the source or the
destination MAC address, but not on the address pair.
The results of Examples 1 and 5 force the Catalyst to use Segment 1 in both cases because
the X-OR process yields a 0.
The end result of the X-OR process forces a source/destination address pair to use the same
link for each frame they transmit. What prevents a single segment from becoming
overwhelmed with traffic? Statistics. Statistically, the MAC address assignments are fairly
random in the network. A link is not likely to experience a traffic loading imbalance due to
source/destination MAC address values. Because the source and destination use the same
MAC addresses for every frame between each other, the frames always use the same
EtherChannel segment. It is possible, too, that a workstation pair can create a high volume
of traffic, creating a load imbalance due to their application. The X-OR process does not
remedy this situation because it is not application aware.
TIP RSMs connected together with a Catalyst EtherChannel might not experience load
distribution. This occurs because the RSM MAC addresses remain the same for every
transmission, forcing the X-OR to use the same segment in the bundle for each frame.
However, you can force the RSM to use multiple user-assigned MAC addresses, one for
each VLAN, with the mac-address command. This forces the switch to perform the X-OR
on a per-VLAN basis and enables a level of load distribution.
The set port channel command enables EtherChannel. It does not establish a trunk. With
only this configuration statement, a single VLAN crosses the EtherChannel. To enable a
trunk, you must also enter a set trunk command. The set trunk command is described in
the following sections.
The on and off options indicate that the Catalyst always (or never) bundles the ports as an
EtherChannel. The desirable option tells the Catalyst to enable EtherChannel as long as the
other end agrees to configure EtherChannel and as long as all EtherChannel rules are met.
For example, all ports in the EtherChannel must belong to the same VLAN, or they must
all be set to trunk. All ports must be set for the same duplex mode. If any of the parameters
mismatch, PAgP refuses to enable EtherChannel. The auto option allows a Catalyst to
enable EtherChannel if the other end is set as either on or desirable. Otherwise, the
Catalyst isolates the segments as individual links.
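The mode rules above amount to a small truth table. The sketch below is an illustration, not Cisco software; it ignores the port-consistency checks PAgP also performs, and the function name is hypothetical:

```python
# Whether a working EtherChannel forms for a pair of PAgP modes, per the
# rules above: off never bundles; on bundles unconditionally; desirable
# negotiates a bundle with a desirable or auto peer; auto only responds.
# Simplified model: an on/off mismatch counts as "no working channel".

def channel_forms(local: str, remote: str) -> bool:
    modes = {"on", "off", "desirable", "auto"}
    assert local in modes and remote in modes
    if "off" in (local, remote):
        return False
    if "on" in (local, remote):
        return True
    if "desirable" in (local, remote):
        return True          # desirable + desirable/auto negotiates a bundle
    return False             # auto + auto: neither end initiates
```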
Figure 8-6 shows two Catalysts connected with two Fast Ethernet segments. Assume that
you desire to enable EtherChannel by bundling the two segments.
Figure 8-6 A Catalyst 5000 and a Catalyst 5500 Connected with EtherChannel
(Cat-A ports 2/1 and 2/2 connect to Cat-B ports 10/1 and 10/2.)
Examples 8-2 and 8-3 show sample configurations for both Cat-A and Cat-B.
TIP Note that when you enable PAgP on a link where Spanning Tree is active, Spanning Tree
takes about 18 more seconds to converge. This is true because PAgP takes about 18 seconds
to negotiate a link. The link negotiation must be completed before Spanning Tree can start
its convergence algorithm.
TIP If you change an attribute on one of the EtherChannel segments, you must make the same
change on all of the segments for the change to be effective. All ports must be configured
identically.
EtherChannel Resiliency
What happens when an EtherChannel segment fails? When a Catalyst detects a segment
failure, it informs the Encoded Address Recognition Logic (EARL) ASIC on the Supervisor
module. The EARL is a special application-specific integrated circuit that learns MAC
addresses. In essence, the EARL is the learning and address storage device creating the bridge
tables discussed in Chapter 3. The EARL ages any addresses that it learned on that segment
so it can relearn address pairs on a new segment in the bundle. On what segment does it
relearn the source? In a two-segment EtherChannel, frames must cross the one remaining
segment. In a four- or eight-segment bundle, trafc migrates to the neighboring segment.
When you restore the failed segment, you do not see the traffic return to the original
segment. When the segment failed, the EARL relearned the addresses on a new link. Until
the addresses age out of the bridge table, the frames continue to cross the backup link;
aging out requires that the stations not transmit for the duration of the bridge aging timer.
You can manually clear the bridge table, but that forces the Catalyst to recalculate and
relearn all the addresses associated with that segment.
EtherChannel Development
EtherChannel defines a bundling technique for standards-based segments such as Fast
Ethernet and Gigabit Ethernet. It does not cause the links to operate at clock rates different
from those used without bundling, so the segments remain Fast Ethernet- and Gigabit
Ethernet-compliant. EtherChannel enables devices to distribute a traffic load over more
than one segment while providing a level of resiliency that does not involve Spanning Tree
or other failover mechanisms. The IEEE is examining a standards-based approach to
bundling in the 802.3ad committee.
ISL
When multiplexing frames from more than one VLAN over a Fast Ethernet or Fast
EtherChannel link, the transmitting Catalyst must identify each frame's VLAN membership.
This allows the receiving Catalyst to constrain the frame to the same VLAN as the source,
thereby maintaining VLAN integrity. Otherwise, the frame crosses VLAN boundaries and
violates the intention of creating VLANs.
Cisco's proprietary Inter-Switch Link (ISL) encapsulation enables VLANs to share a
common link between Catalysts while allowing the receiver to separate the frames into the
correct VLANs.
When a Catalyst forwards or floods a frame out an ISL-enabled trunk interface, the Catalyst
encapsulates the original frame, identifying the source VLAN. Generically, the
encapsulation looks like Figure 8-7. When the frame leaves the trunk interface at the source
Catalyst, the Catalyst prepends a 26-octet ISL header and appends a 4-octet CRC to the
frame. This is called double-tagging or two-level tagging encapsulation.
(Figure 8-7: on the trunk side, the 26-octet ISL header and 4-octet CRC surround the original non-trunk frame of 1 to 24,575 octets.)
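As a quick size check, the overhead is simple arithmetic. This sketch assumes the 26-octet header and 4-octet CRC described above, plus the 1518-octet maximum Ethernet frame from the 802.3 standard:

```python
# ISL double-tagging adds a 26-octet header before the original frame
# and a 4-octet CRC after it.
ISL_HEADER_OCTETS = 26
ISL_CRC_OCTETS = 4

def isl_encapsulated_size(frame_octets: int) -> int:
    return ISL_HEADER_OCTETS + frame_octets + ISL_CRC_OCTETS

# A maximum-size 1518-octet Ethernet frame grows to 1548 octets on the
# trunk, which is why non-ISL devices reject such frames as oversized.
```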
ISL trunk links can carry trafc from LAN sources other than Ethernet. For example, Token
Ring and FDDI segments can communicate across an ISL trunk. Figure 8-8 shows two Token
Rings on different Catalysts that need to communicate with each other. Ethernet-based
VLANs also exist in the network. The connection between the Catalysts is an Ethernet trunk.
Figure 8-8 Using Token Ring ISL (TRISL) to Transport Token Ring Over an Ethernet Trunk
(Both Catalysts carry VLANs 10, 20, and 30; ISL encapsulates the Ethernet VLANs, while TRISL carries the Token Ring traffic between Ring 100 and Ring 200.)
These differences make transporting Token Ring frames over an Ethernet segment
challenging at the least.
To effectively transport Token Ring frames over an Ethernet link, the Catalyst must deal
with each of these issues.
When Cisco developed ISL, it included a provision for Token Ring and FDDI over
Ethernet. The ISL header includes a space for carrying Token Ring- and FDDI-specific
header information. These are carried in the Reserved field of the ISL header.
When specifically dealing with Token Ring over ISL, the encapsulation is called Token
Ring ISL (TRISL). TRISL adds seven octets to the standard ISL encapsulation to carry
Token Ring information. The trunk passes both ISL- and TRISL-encapsulated frames.
Note that off is not listed because it disables trunking as described below.
When configured as off, the interface locally disables ISL and negotiates (informs) the
remote end of the local state. If the remote end configuration allows dynamic trunk state
changes (auto or desirable), it configures itself as a non-trunk. If the remote side cannot
change state (such as when configured to on), the local unit still disables ISL. Additionally,
if the local unit is configured as off and it receives a request from the remote Catalyst to
enable ISL, the local Catalyst refuses the request. Setting the port to off forces the interface
to remain off, regardless of the ISL state at the remote end. Use this mode whenever you
don't want an interface to be a trunk, but want it to participate in ISL negotiations to inform
the remote side of its local policy.
On the other hand, if the local interface configuration is on, the Catalyst locally enables ISL
and negotiates (informs) the remote side of the local state. If the remote side configuration is
auto or desirable, the link enables trunking and ISL encapsulation. If the remote end state is
off, the link never negotiates to an enabled trunk mode. The local Catalyst enables trunking
while the remote end remains disabled. This creates an encapsulation mismatch, preventing
successful data transfers. Use trunk mode on when the remote end supports DISL, and when
you want the local end to remain in trunk mode regardless of the remote end's mode.
The desirable mode causes a Catalyst interface to inform the remote end of its intent to
enable ISL, but does not actually enable ISL unless the remote end agrees to enable it. The
remote end must be set in the on, auto, or desirable mode for the link to establish an ISL
trunk. Do not use the desirable mode if the remote end does not support DISL.
NOTE Not all Catalysts, such as the older Catalyst 3000 and the Catalyst 1900, support DISL. If
you enable the Catalyst 5000 end as desirable and the other end does not support DISL, a
trunk is never established. Only use the desirable mode when you are confident that the
remote end supports DISL, and you want to simplify your configuration requirements.
Configuring a Catalyst in auto mode enables the Catalyst to receive a request to enable ISL
trunking and to automatically enter that mode. The Catalyst configured in auto never
initiates a request to create a trunk and never becomes a trunk unless the remote end is
configured as on or desirable. The auto mode is the Catalyst default configuration. If, when
enabling a trunk, you do not specify a mode, auto is assumed. A Catalyst never enables
trunk mode when left to the default values at both ends. When one end is set as auto, you
must set the other end to either on or desirable to activate a trunk.
The nonegotiate mode establishes a Catalyst configuration where the Catalyst enables
trunking, but does not send any configuration requests to the remote device. This mode
prevents the Catalyst from sending DISL frames to set up a trunk port. Use this mode when
establishing a trunk between a Catalyst and a router to ensure that the router does not
erroneously forward the DISL requests to another VLAN component. You should also use
this mode whenever the remote end does not support DISL. Sending DISL announcements over
the link is unproductive when the receiving device does not support it.
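The mode interactions described above can be modeled as a small function. This is an illustrative sketch, not Cisco software; it returns whether each end ends up trunking, and a link works only when both ends agree:

```python
# DISL trunk-state model per the rules above: on and nonegotiate trunk
# unconditionally; off never trunks; desirable trunks when the peer is
# on, desirable, or auto; auto trunks only when the peer actively
# requests (on or desirable). A nonegotiate peer sends no DISL frames,
# so a desirable or auto neighbor never hears a request from it.

def trunk_states(local: str, remote: str):
    def trunks(mine: str, peer: str) -> bool:
        if mine in ("on", "nonegotiate"):
            return True
        if mine == "desirable":
            return peer in ("on", "desirable", "auto")
        if mine == "auto":
            return peer in ("on", "desirable")
        return False                       # off
    return trunks(local, remote), trunks(remote, local)
```

A mismatched result such as (True, False) corresponds to the encapsulation-mismatch case discussed below Table 8-7, where both ends pass traffic but neither can decode it.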
Table 8-7 shows the different combinations of trunk modes and the corresponding effect.
With all of these combinations, the physical layer might appear to be operational. If you do a
show port, the display indicates connected. However, that does not necessarily mean that the
trunk is operational. If both the remote and local sides of the link do not have the same
indication (on or off), you cannot transmit any traffic due to encapsulation mismatches. Use
the show trunk command to examine the trunk status. For example, in Table 8-7, the
combination on/auto results in both sides trunking. The combination auto/auto results in
both sides remaining configured as access links. Therefore, trunking is not enabled. Both of
these are valid in that both ends agree to trunk or not to trunk. However, the combination on/
off creates a situation where the two ends of the link disagree about the trunk condition. Both
sides pass traffic, but neither side can decode the received traffic because of the
encapsulation mismatch that results from the disagreement. The end with trunking enabled
looks for ISL-encapsulated frames, but actually receives nonencapsulated frames. Likewise,
the end that is configured as an access link looks for nonencapsulated Ethernet frames, but
sees encapsulation headers that are not part of the Ethernet standard and interprets these as
errored frames. Therefore, traffic does not successfully transfer across the link.
Do not get confused between DISL and PAgP. In the section on EtherChannel, PAgP was
introduced. PAgP allows two Catalysts to negotiate how to form an EtherChannel between
them. PAgP does not negotiate whether or not to enter trunk mode. This is the domain of
DISL and Dynamic Trunk Protocol (DTP). DTP is a second generation of DISL and allows
the Catalysts to negotiate whether or not to use 802.1Q encapsulation. This is discussed
further in a later section in this chapter. On the other hand, note that DISL and DTP do not
negotiate anything about EtherChannel. Rather, they negotiate whether to enable trunking.
TIP It is best to hard-code the trunk configuration on critical links between Catalysts, such as in
your core network, or to critical servers that are trunk attached.
TIP If you configure the Catalyst trunk links for dynamic operations (desirable, auto), ensure
that both ends of the link belong to the same VTP management domain. If they belong to
different domains, Catalysts do not form the trunk link.
802.1Q/802.1p
In an effort to provide multivendor support for VLANs, the IEEE 802.1Q committee
defined a method for multiplexing VLANs in local and metropolitan area networks. The
multiplexing method, similar to ISL, offers an alternative trunk protocol in a Catalyst
network. Like ISL, 802.1Q explicitly tags frames to identify each frame's VLAN
membership. The tagging scheme differs from ISL in that ISL uses an external tag, whereas
802.1Q uses an internal tag.
The IEEE also worked on a standard called 802.1p, which allows users to specify priorities
for their traffic. The priority value is inserted into the priority field of the 802.1Q header. If
a LAN switch supports 802.1p, the switch might forward traffic flagged as higher priority
before it forwards other traffic.
ISL's external tag scheme adds octets to the beginning and to the end of the original data
frame. Because information is added to both ends of a frame, this is sometimes called
double-tagging. (Refer back to Table 8-6 for ISL details.) 802.1Q is called an internal tag
scheme because it adds octets inside of the original data frame. In contrast to double-
tagging, this is sometimes called a single-tag scheme. Figure 8-9 shows an 802.1Q tagged
frame.
(Figure 8-9: an 802.1Q tagged frame over Ethernet consists of the DA, SA, the 4-octet 802.1Q tag, Type/Length, variable-length data, and FCS.)
The following bullets describe each of the fields in the 802.1Q header illustrated in Figure 8-9:
TPID (Tag Protocol Identifier): This indicates to the receiver that an 802.1Q tag
follows. The TPID is the hexadecimal value 0x8100.
Priority: This is the 802.1p priority field. Eight priority levels are defined in 802.1p
and are embedded in the 802.1Q header.
CFI (Canonical Format Indicator): This single bit indicates whether the
MAC addresses in the MAC header are in canonical (0) or non-canonical (1) format.
VID (VLAN Identifier): This indicates the source VLAN membership for the
frame. The 12-bit field allows for VLAN values between 0 and 4095. However,
VLANs 0, 1, and 4095 are reserved.
An interesting situation arises from the 802.1Q tag scheme. If the tag is added to a
maximum-sized Ethernet frame, the frame size exceeds that specified by the original IEEE
802.3 specification. To carry the tag in a maximum-sized Ethernet frame requires 1522
octets, four more than the specification allows. The 802.3 committee created a workgroup,
802.3ac, to extend Ethernet's maximum frame size to 1522 octets.
If you have equipment that does not support the larger frame size, it might complain if it
receives these oversized frames. These frames are sometimes called baby giants.
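The four fields above fit in a 4-octet tag. The following sketch packs one using Python's struct module; it is an illustration of the 802.1Q layout (16-bit TPID, 3-bit priority, 1-bit CFI, 12-bit VID), not vendor code:

```python
import struct

def build_dot1q_tag(priority: int, cfi: int, vid: int) -> bytes:
    """Pack a 4-octet 802.1Q tag: TPID 0x8100, then a 16-bit TCI holding
    3 bits of 802.1p priority, the CFI bit, and the 12-bit VID."""
    assert 0 <= priority <= 7 and cfi in (0, 1) and 0 <= vid <= 4095
    tci = (priority << 13) | (cfi << 12) | vid
    return struct.pack("!HH", 0x8100, tci)

# Inserting these 4 octets into a maximum-size 1518-octet Ethernet frame
# yields the 1522-octet "baby giant" that 802.3ac legitimizes.
```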
Configuring 802.1Q
Configuration tasks to enable 802.1Q trunks include the following:
1 Specify the correct encapsulation mode (ISL or 802.1Q) for the trunk.
2 Enable the correct DTP trunking mode or manually ensure that both ends of the link
support the same trunk mode.
3 Select the correct native VLAN-id on both ends of the 802.1Q trunk.
dot1q specifies the trunk encapsulation type. Specifically, it enables the trunk using 802.1Q
encapsulation. This is an optional field for ISL trunks, but mandatory if you want dot1q. Of
course, if you want an ISL trunk, you do not use dot1q, but rather ISL. If you do not specify
the encapsulation type, the Catalyst uses the default value (ISL). Not all modules support both
ISL and 802.1Q modes. Check current Cisco documentation to determine which modes your
hardware supports. Further, not all versions of the Catalyst software support 802.1Q. Only
since version 4.1(1) does the Catalyst 5000 family support dot1q encapsulation. Automatic
negotiation of the encapsulation type between the two ends of the trunk was not available until
version 4.2(1) of the Catalyst 5000 software. 4.2(1) introduced DTP, which is described in the
following section. Prior to 4.2(1), you must manually configure the trunk mode.
Example 8-5 shows a sample output for configuring Port 1/1 for dot1q encapsulation. This
works whether the interface is Fast Ethernet or Gigabit Ethernet.
Enabling 802.1Q trunks on a router is similar to enabling ISL. Like ISL, you must include
an encapsulation statement in the interface configuration. Example 8-6 shows a sample
router configuration.
The number at the end of the encapsulation statement specifies the VLAN number. The
802.1Q specification allows VLAN values between 0 and 4095 (with reserved VLAN
values as discussed previously). However, a Catalyst supports VLAN values only up to 1005.
Generally, do not use values greater than 1005 when specifying the 802.1Q VLAN number,
to remain consistent with Catalyst VLAN numbers. Note that newer code releases allow
you to map 802.1Q VLAN numbers into the valid ISL number range. This is useful in a
hybrid 802.1Q/ISL environment by enabling you to use any valid 802.1Q value for 802.1Q
trunks, while using valid ISL values on ISL trunks.
At the end of Example 8-8, the complete list of allowed VLANs is 1-9, 15, 21-1005.
You can use these commands on any trunk, regardless of its tagging mode. Note that if you enter
these commands on an EtherChannel trunk, the Catalyst modifies all ports in the bundle to
ensure consistency. Ensure that you configure the remote link to carry the same set of VLANs.
(Figure 8-10: an FDDI 802.10 trunk carrying VLANs 100, 200, and 300 interconnects Catalysts whose Ethernet ports belong to VLANs 10, 20, and 30.)
By enabling 802.10 encapsulation on the FDDI interfaces in the network, the FDDI
backbone becomes a Catalyst trunk. The network in Figure 8-10 attaches many Catalysts
allowing them to transport data from distributed VLANs over the FDDI trunk. Member
stations of VLAN 10 on Cat-A can communicate with stations belonging to VLAN 10 on
Cat-B. Likewise, members of VLAN 20 can communicate with each other regardless of
their location in the network.
As with any multiple VLAN network, routers interconnect VLANs. The Cisco router in
Figure 8-10 attached to the FDDI network understands 802.10 encapsulation and can
therefore route trafc between VLANs.
The configuration in Example 8-9 demonstrates how to enable 802.10 encapsulation on a
Cisco router so that VLAN 100 can communicate with VLAN 200.
(Figure 8-11: the 802.10 frame's Clear header contains the 802.10 LSAP, SAID, and MDF fields; the remainder of the frame is optionally encrypted.)
Figure 8-11 shows three fields in the Clear header portion. Only the Security Association
Identifier (SAID) field is relevant to VLANs. Therefore, the other two fields (802.10 LSAP
and MDF) are ignored in this discussion.
The SAID field as used by Cisco identifies the source VLAN. The four-byte SAID allows
for many VLAN identifiers on the FDDI network. When you create an FDDI VLAN, you
provide the VLAN number. By default, the Catalyst adds 100,000 to the VLAN number to
create a SAID value. The receiving Catalyst subtracts 100,000 to recover the original FDDI
VLAN value. Optionally, you can specify a SAID value, but this is not usually necessary.
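The default SAID mapping is simple arithmetic, sketched below. The helper names are hypothetical; the 100,000 offset is the Catalyst default described above:

```python
DEFAULT_SAID_OFFSET = 100_000   # added to the FDDI VLAN number by default

def vlan_to_said(vlan: int) -> int:
    return vlan + DEFAULT_SAID_OFFSET

def said_to_vlan(said: int) -> int:
    return said - DEFAULT_SAID_OFFSET

# VLAN 500 therefore appears with SAID 100500 unless you override it.
```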
The Catalyst commands in Example 8-10 enable 802.10 encapsulation for VLANs 500 and
600 and modify the VLAN 600 SAID value to 1600.
FDDI Trunks and 802.10 Encapsulation 329
After establishing the VLANs, the show vlan command displays the addition of the
VLANs with the specified SAID values, as in Example 8-11. Note that VLAN 500 has a
SAID value of 100500 because a SAID value was not specified and the Catalyst by default
added 100,000 to the VLAN number.
VLAN Type SAID MTU Parent RingNo BrdgNo Stp BrdgMode Trans1 Trans2
---- ----- ---------- ----- ------ ------ ------ ---- -------- ------ ------
1 enet 100001 1500 - - - - - 0 0
100 trbrf 100100 4472 - - 0x5 ibm - 0 0
110 trcrf 100110 4472 100 0x10 - - srb 0 0
120 trcrf 100120 4472 100 0x20 - - srb 0 0
500 fddi 100500 1500 - 0x0 - - - 0 0
600 fddi 1600 1500 - 0x0 - - - 0 0
1002 fddi 101002 1500 - 0x0 - - - 0 0
1003 trcrf 101003 4472 1005 0xccc - - srb 0 0
1004 fdnet 101004 1500 - - 0x0 ieee - 0 0
1005 trbrf 101005 4472 - - 0xf ibm - 0 0
Although the FDDI VLANs were successfully created, all that was accomplished was the
creation of yet another broadcast domain. The Catalysts treat the FDDI VLAN as distinct
from any of the Ethernet VLANs unless you associate the broadcast domains as a single
domain. Use the set vlan command to merge the FDDI and the Ethernet broadcast domains.
Until you do this, the Catalyst cannot transport the Ethernet VLAN over the FDDI trunk.
To make an Ethernet VLAN 10 and an FDDI VLAN 100 part of the same broadcast domain,
you enter the following command:
Console> (enable) set vlan 10 translation 100
Conversely, the following command is equally effective, where you specify the FDDI
VLAN first, and then translate it into the Ethernet VLAN:
Console> (enable) set vlan 100 translation 10
These are bidirectional commands. You do not need to enter both commands, only one or
the other.
ATM Trunks
Asynchronous Transfer Mode (ATM) technology has the inherent capability to transport
voice, video, and data over the same infrastructure. And because ATM does not have any
collision domain distance constraints like LAN technologies, ATM deployments can reach
from the desktop to around the globe. With these attributes, ATM offers users the
opportunity to deploy an infrastructure suitable for consolidating what are traditionally
independent networks. For example, some companies have a private voice infrastructure
between corporate and remote offices. The business leases T1 or E1 services to interconnect
private branch exchanges (PBXs) between the offices. The company can deploy or lease a
separate network to transport data between the offices. And finally, to support video
conferencing, an ISDN service can be installed. Each of these networks has its own
equipment requirements, maintenance headaches, and in many cases recurring costs. By
consolidating all of the services onto an ATM network, as in Figure 8-12, the infrastructure
complexities significantly reduce. Even better, the recurring costs can diminish. Most
importantly, this keeps your employer happy.
(Figure 8-12: three sites, each with a PBX, a video CODEC, and data equipment, consolidate their traffic over a single ATM network.)
For those installations where ATM provides a backbone service (either at the campus or
WAN level), users can take advantage of the ATM infrastructure to trunk between
Catalysts. By inserting a Catalyst LANE module, the Catalyst can send and receive data
frames over the ATM network. The Catalyst bridges the LAN traffic onto the ATM network,
which transports the frames (segmented into ATM cells by the LANE module) through the
ATM system to be received by another ATM-attached Catalyst or router.
Catalysts support two modes of transporting data over the ATM network: LANE and
MPOA. Each of these is covered in detail in other chapters. LANE is discussed in Chapter
9, Trunking with LAN Emulation, and Chapter 10, Trunking with Multiprotocol over
ATM, covers MPOA operations. The ATM Forum defined LANE and MPOA for data
networks. If you plan to use ATM trunking, you are strongly encouraged to visit the ATM
Forum Web site (www.atmforum.com) and obtain, for free, copies of the LANE and MPOA
documents. The following sections on LANE and MPOA provide brief descriptions of
these options for trunking over ATM.
LANE
LANE emulates Ethernet and Token Ring networks over ATM. Emulating an Ethernet or
Token Ring over ATM defines an Emulated LAN (ELAN). A member of the ELAN is
referred to as a LANE Client (LEC). Each ELAN is an independent broadcast domain. An
LEC can belong to only one ELAN. Both Ethernet and Token Ring networks are described
as broadcast networks; if a station generates a broadcast message, all components in the
network receive a copy of the frame. ATM networks, on the other hand, create direct point-
to-point connections between users. This creates a problem when a client transmits a
broadcast frame. How does the broadcast get distributed to all users in the broadcast
domain? ATM does not inherently do this. A client could create a connection to all members
of the ELAN and individually forward the broadcast to each client, but this is impractical
due to the quantity of virtual connections that need to be established even in a small- to
moderately-sized network. Besides, each client does not necessarily know about all other
clients in the network. LANE provides a solution by defining a special server responsible
for distributing broadcasts within an ELAN.
In Figure 8-13, three Catalysts and a router interconnect over an ATM network. On the LAN
side, each Catalyst supports three VLANs. On the ATM side, each Catalyst has three clients,
making it a member of three ELANs.
(Figure 8-13: each Catalyst carries VLANs 1, 2, and 3 and joins ELAN1, ELAN2, and ELAN3 across the ATM cloud.)
Within the Catalyst configurations, each VLAN maps to one ELAN. This merges the
broadcast domains so that the distributed VLANs can intercommunicate over the ATM
network. Figure 8-14 shows a logical depiction of the VLAN-to-ELAN mapping that occurs
inside a Catalyst.
Figure 8-14 A Catalyst with Three LECs Configured to Attach to Three ELANs
You need the router shown in Figure 8-13 if workstations in one VLAN desire to
communicate with workstations in another VLAN. The router can reside on the LAN side
of the Catalysts, but this example illustrates the router on the ATM side. When a station in
VLAN 1 attempts to communicate with a station in VLAN 2, the Catalyst bridges the frame
out LEC 1 to the router. The router, which also has three clients, routes the frame out the
LEC that is a member of ELAN 2 to the destination Catalyst. The destination Catalyst
receives the frame on LEC 2 and bridges the frame to the correct VLAN port.
MPOA
In most networks, several routers interconnect subnetworks. Only in the smallest networks is a
router a member of all subnetworks. In larger networks, therefore, a frame can cross multiple
routers to get to the intended destination. When this happens in an ATM network, the same
information travels through the ATM cloud as many times as there are inter-router hops. In
Figure 8-15, a station in VLAN 1 attached to Cat-A desires to communicate with a station in
VLAN 4 on Cat-B. Normally, the frame exits Cat-A toward Router 1, the default gateway.
Router 1 forwards the frame to Router 2, which forwards the frame to Router 3. Router 3
transfers the frame to the destination Cat-B. This is the default path and requires four transfers
across the ATM network, a very inefficient use of bandwidth. This is particularly frustrating
because the ATM network can build a virtual circuit directly between Cat-A and Cat-B. IP rules,
however, insist that devices belonging to different subnetworks interconnect through routers.
(Figure 8-15: Cat-A on VLAN 1 (172.16.1.0) and Cat-B on VLAN 4 (172.16.4.0) connect across an ATM cloud carrying ELANs 1 through 4; the default path crosses three routers, whereas the shortcut path runs directly between the Catalysts.)
MPOA enables devices to circumvent the default path and establish a direct connection
between the devices, even though they belong to different subnets. This shortcut path,
illustrated in Figure 8-15, eliminates the multiple transits of the default path, conserving
ATM bandwidth and reducing the overall transit delay.
MPOA does not replace LANE, but supplements it. In fact, MPOA requires LANE as one
of its components. Intra-broadcast domain communications (transfers within an ELAN) use
LANE. MPOA kicks in only when devices on different ELANs try to communicate with
each other. Even so, MPOA might not always get involved. One reason is that MPOA is
protocol dependent. A vendor must provide MPOA capabilities for a protocol. Currently,
IP is the dominant protocol supported. Another reason MPOA might not create a shortcut
is that it might not be worth it. For MPOA to request a shortcut, the MPOA client must
detect enough traffic between two hosts to merit any shortcut efforts. This is determined by
an administratively configurable threshold of packets per second between two specific
devices. If the client detects a packets-per-second rate between an IP source and an IP
destination greater than the configured threshold, the client attempts to create a shortcut to
the IP destination. But if the packets-per-second rate never exceeds the threshold, frames
continue to travel through the default path.
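The threshold test amounts to a rate comparison, sketched below. The class and method names are hypothetical, and a real MPOA client tracks rates per source/destination flow:

```python
# An MPOA client requests a shortcut only when the observed rate between
# one IP source and one IP destination exceeds the administratively
# configured packets-per-second threshold described above.

class ShortcutTrigger:
    def __init__(self, threshold_pps: float):
        self.threshold_pps = threshold_pps

    def should_request_shortcut(self, packets: int, interval_s: float) -> bool:
        return packets / interval_s > self.threshold_pps
```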
Trunk Options
Three trunk methods and their encapsulation methods were described in the previous
sections. Fast Ethernet and Gigabit Ethernet use ISL or 802.1Q encapsulation. FDDI trunks
encapsulate with a Cisco proprietary adaptation of 802.10. With ATM, you can use LANE
encapsulation. Optionally, you can augment LANE operations with MPOA. Which option
should you use?
Criteria you need to consider include the following:
Existing infrastructure
Your technology comfort level
Infrastructure resiliency needs
Bandwidth requirements
Existing Infrastructure
Your trunk choice might be limited to whatever technology you currently deploy in your
network. If your Catalyst interfaces are Ethernet and Fast Ethernet, and your cabling is
oriented around that, you probably elect to use some form of Ethernet for your trunk lines.
The question then becomes one of how much bandwidth you need to support your
users.
If your backbone infrastructure currently runs FDDI, you might not be able to do much with
other trunk technologies without deploying some additional cabling. You might need to
shift the FDDI network to the distribution layer and use another technology for the core
backbone. Figure 8-16 shows the FDDI network below the core network.
(Figure 8-16: a hierarchy with the core layer on top, the FDDI distribution layer beneath it, and the access layer at the bottom.)
The FDDI segments are dual-homed to core-level Catalysts providing fault tolerance in the
event that a primary Catalyst fails. The connection type between the core Catalysts is again
determined by the bandwidth requirements. Remember that the FDDI segment is shared.
The bandwidth is divided between all of the attached components and operates in half-
duplex mode. Today, FDDI is probably your last choice for a backbone technology.
ATM proves to be a good choice if you are interested in network consolidation as described
in the ATM trunk section, or if you need to trunk over distances not easily supported by
Ethernet or FDDI technologies.
FDDI Resiliency
FDDI probably has the quickest failover rate because its resiliency operates at Layer 1, the
physical layer. FDDI operates in a dual counter-rotating ring topology. Each ring runs in the
opposite direction of the other ring. If a cable breaks between Cat-A and Cat-B as in Figure
8-17, both Catalysts see the loss of optical signal and enter into a wrapped state. Data
continues to ow between all components in the network in spite of the cable outage. The
cutover time is extremely fast because failure detection and recovery occur at Layer 1.
(Figure 8-17: Cat-A, Cat-B, and Cat-C attach to the dual counter-rotating FDDI rings; the rings wrap at the break between Cat-A and Cat-B.)
ATM Resiliency
ATM also provides physical layer recovery. However, the failover time is longer than for
FDDI. In an ATM network, a cable or interface failure can occur at the Catalyst or between
ATM switches. If the failure occurs between ATM switches, the Catalyst requests the ATM
network to re-establish a connection to the destination client(s). The ATM network attempts
to find an alternate path to complete the connection request. This happens automatically.
Figure 8-18 shows a Catalyst attached to two ATM switches for redundancy. One link, the
preferred link, is the active connection. The second link serves as a backup and is inactive.
Traffic only passes over the active link.
[Figure 8-18: a Catalyst LANE module with two physical interfaces, PHY A on the preferred link and PHY B on the backup link, attached to two ATM switches]
A failure can occur at the Catalyst. To account for this, the Catalyst LANE module provides
two physical interfaces, PHY A and PHY B. In Figure 8-18, a Catalyst attaches to two ATM
switches. PHY A attaches to ATM Switch 1 and PHY B attaches to ATM Switch 2. The
Catalyst activates only one of the interfaces at a time. The other simply provides a backup
path. If the active link fails, the Catalyst activates the backup port. The Catalyst must rejoin
the ELAN and then reattach to the other client(s) in the network. Although ATM connections
can establish quickly, the additional complexity increases the failover time as compared to
FDDI links. The actual failover time varies depending upon the tasks that the ATM switches
are performing when the Catalyst requests a connection to the ELAN or to another client.
Other types of failures can also occur in a LANE environment. For example, various server
functions must be enabled for LANE to function. The LANE version 1 standard provides
for only one of each of these servers per ELAN. If one of these servers fails, the ELAN is
disabled. Cisco has a protocol called Simple Server Redundancy Protocol (SSRP) that
enables backup servers so that the ELAN can remain functional in the event of a server
failure. This is discussed in more detail in Chapter 9, "Trunking with LAN Emulation."
Ethernet Resiliency
Ethernet options (both Fast Ethernet and Gigabit Ethernet) rely upon Spanning Tree for
resiliency. Spanning Tree, discussed in Chapter 6, "Understanding Spanning Tree,"
operates at Layer 2, the data link layer. Components detect failures when they fail to receive
BPDUs from the Root Bridge. Spanning Tree recovery can take as much as 50 seconds,
depending upon the values you set for the timers.
340 Chapter 8: Trunking Technologies and Applications
EtherChannel, both Fast and Gigabit, provides local resiliency. Figure 8-19 shows two
Catalysts interconnected with an EtherChannel.
An EtherChannel has more than one link actively carrying data. If one of the links in Figure
8-19 fails, the remaining link(s) continue to carry the load, although with a reduced
aggregate bandwidth. This happens without triggering any Spanning Tree events.
Therefore, Spanning Tree timers do not get involved. Failover for EtherChannel occurs
quickly because it uses Layer 1 failure detection and recovery. If you implement redundant
EtherChannels, however, Spanning Tree activation times must be anticipated.
Bandwidth Requirements
Right or wrong, network engineers most often use bandwidth capabilities for selecting a
trunk technology. Catalyst offers a spectrum of options ranging from half-duplex FDDI
through full-duplex Gigabit EtherChannel. Figure 8-20 illustrates a number of Fast
Ethernet and Fast EtherChannel options with increasing bandwidth.
[Figure 8-20: interconnection options with increasing bandwidth: (A) non-trunk links with one VLAN dedicated per link; (B) a single Fast Ethernet/Gigabit Ethernet trunk carrying VLANs 1-4; (C) parallel trunks carrying VLANs 1,3 and 2,4; (D) a Fast EtherChannel/Gigabit EtherChannel bundle carrying VLANs 1-4]
Part A of Figure 8-20 shows an interconnection where each link is dedicated to a VLAN.
No trunk encapsulation is used and frames are transported in their native format. Only one
link per VLAN between the Catalysts can be active at any time. Spanning Tree disables any
additional links. Therefore, bandwidth options are only 10/100/1000 Mbps.
By enabling ISL trunking, you can share the link bandwidth with multiple VLANs. A single
Fast Ethernet or Gigabit Ethernet link as in Part B of Figure 8-20 offers 100 or 1000 Mbps
bandwidth with no resiliency. Running multiple trunks in parallel provides additional
bandwidth and resiliency. However, traffic from any single VLAN can only use one
path while the other path serves as a backup. For example, in Part C of Figure 8-20, two
links run between the Catalysts. One link carries the traffic for VLANs 1 and 3, and the
other link carries the traffic for VLANs 2 and 4. Each serves as a Spanning Tree backup for
the other. This provides more bandwidth than in Part B of Figure 8-20 by having fewer
VLANs contend for the bandwidth while providing another level of resiliency. However,
each VLAN can still have no more than 100 or 1000 Mbps of bandwidth, depending upon
whether the link is Fast Ethernet or Gigabit Ethernet.
On the other hand, the VLANs in Parts D and E of Figure 8-20 share the aggregate
bandwidth of the links. These links use Fast or Gigabit EtherChannel. With a two-port
EtherChannel, the VLANs share a 400/4000 Mbps bandwidth. (Each link is full duplex.) A
four-port version has 800/8000 Mbps bandwidth.
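The bandwidth arithmetic above is simple enough to sketch. The helper below is illustrative only (the function name and the full-duplex doubling convention are mine, matching how this chapter counts EtherChannel capacity):

```python
def etherchannel_aggregate_mbps(num_links, link_mbps, full_duplex=True):
    """Aggregate EtherChannel capacity: each link contributes its line rate,
    doubled when the links run full duplex (as the chapter assumes)."""
    return num_links * link_mbps * (2 if full_duplex else 1)

# The Fast and Gigabit EtherChannel figures quoted in the text.
print(etherchannel_aggregate_mbps(2, 100))   # two-port Fast EtherChannel: 400
print(etherchannel_aggregate_mbps(2, 1000))  # two-port Gigabit EtherChannel: 4000
print(etherchannel_aggregate_mbps(4, 1000))  # four-port Gigabit EtherChannel: 8000
```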
Table 8-8 compares the various interconnection modes providing a summary of the
bandwidth capabilities, resiliency modes, and encapsulation types.
Table 8-8 A Comparison of Different Trunk Modes
Trunk Mode     Bandwidth (Mbps)          Resiliency             Encapsulation  Comments
Per-VLAN link  Dedicated per VLAN:       Spanning Tree          None           VLAN traffic dedicated
               10/100/1000                                                     per link.
Ethernet       Shared 100/1000           Spanning Tree          ISL/802.1Q     Bandwidth reflects half
                                                                               duplex; full duplex
                                                                               doubles bandwidth.
EtherChannel   Shared 200/400/2000/8000  Layer 1                ISL/802.1Q     Spanning Tree might
                                                                               activate in some cases.
FDDI           Shared 100                Layer 1 wrap           802.10
ATM            Shared 155/622            Layer 1, diverse path  LANE/MPOA      Resiliency against
                                                                               network and local
                                                                               failures.
Review Questions
This section includes a variety of questions on the topic of campus design implementation.
By completing these, you can test your mastery of the material included in this chapter as
well as help prepare yourself for the CCIE written test.
1 What happens in a traffic loading situation for EtherChannel when two servers pass
files between each other?
2 If you have access to equipment, attempt to configure a two-segment EtherChannel
where one end is set to transport only VLANs 1-10 and the other end of the segment
is set to transport all VLANs. What gets established?
3 In Figure 8-13, the configuration shows an 802.1Q encapsulation for VLAN 200 on a
router. How would you add VLAN 300 to the trunk?
4 Configure a Catalyst trunk to transport VLAN 200 and VLAN 300 with 802.1Q.
Repeat the exercise with ISL.
This chapter covers the following key topics:
- A Brief ATM Tutorial – For engineers accustomed to working in frame-based
technologies such as Ethernet, ATM can seem strange and mysterious. However, as
this section discusses, it is based on many of the same fundamental concepts as
technologies that are probably more familiar.
- LANE: Theory of Operation – Introduces the theory used by LAN Emulation
(LANE) to simulate Ethernet and Token Ring networks over an ATM infrastructure.
Explores the conceptual approach used by LANE and its four main components. This
is followed by a detailed description of the LANE initialization sequence and the
required overhead connections.
- Configuration Concepts – Discusses several concepts used to configure LANE on
Cisco equipment.
- Configuration Syntax – Introduces a five-step process that can be used to configure
LANE on Cisco routers and Catalyst equipment.
- A Complete LANE Network – Pulls together the material discussed in previous
sections by examining a complete end-to-end LANE configuration in a sample
campus network.
- Testing the Configuration – Explains several useful and important commands used
to troubleshoot and maintain LANE networks on Cisco equipment.
- Advanced Issues and Features – Discusses a variety of advanced LANE topics
such as LANE design, Simple Server Redundancy Protocol (SSRP), PVC-based
connectivity, and traffic shaping.
Chapter 9: Trunking with LAN Emulation
Why Cells?
Most readers are probably aware that ATM uses fixed-length packages of data called cells.
But what's the big deal about cells? Because it takes quite a lot of work for a network
device's hardware to cellify all of its data, what is the payoff to justify all of this extra work
and complexity? Fortunately, cells do have many advantages, including the following:
- High throughput
- Advanced statistical multiplexing
- Low latency
- Support for multiservice traffic (voice, video, and data)
Each of these advantages of ATM cells is addressed in the sections that follow.
High Throughput
High throughput has always been one of the most compelling benefits of ATM. At the time ATM
was conceived, routers were slow devices that required software-based processing to handle
complex variable-length and variable-format multiprotocol (IP, IPX, and so forth) traffic. The
variable data lengths resulted in inefficient processing and many complex buffering schemes (to
illustrate this point, just issue a show buffers command on a Cisco router). The variable data
formats required every Layer 3 protocol to utilize a different set of logic and routing procedures.
Run this on a general-purpose CPU and the result is a low-throughput device.
A Brief ATM Tutorial 347
The ATM cell was designed to address both of these issues. Because cells are fixed in
length, buffering becomes a trivial exercise of simply carving up buffer memory into fixed-
length cubbyholes. Because cells have a fixed-format, 5-byte header, switch processing is
drastically simplified. The result: it becomes much easier to build very high-speed,
hardware-based switching mechanisms.
[Figure 9-1: a channelized T1 from the NY site, split through DACS equipment into a 14-timeslot (896 kbps) bundle toward DC and a 10-timeslot (640 kbps) bundle toward LA]
Figure 9-1 illustrates a small corporate network with three sites: headquarters is located in
New York City with two remote sites in Washington, DC and Los Angeles. The NY site has
a single T1 to the carrier's nearest Central Office (CO). This T1 has been channelized into
two sections: one channel to DC and another to LA.
T1 technology uses something called time-division multiplexing (TDM) to allow up to 24
voice conversations to be carried across a single 4-wire circuit. Each of these 24 conversations
is assigned a timeslot that allows it to send 8 bits of information at a time (typically, these 8
348 Chapter 9: Trunking with LAN Emulation
bits are pulse code modulation [PCM] digital representations of human voice conversations).
Repeat this pattern 8000 times per second and you have the illusion that all 24 conversations
are using the wire at the same time. Also note that this results in each timeslot receiving
64,000 bits/second (bps) of bandwidth (8 bits/timeslot8,000 timeslots per second).
However, the data network in Figure 9-1 does not call for 24 low-bandwidth connections;
the desire is for two higher-bandwidth connections. The solution is to group the timeslots into
two bundles. For example, a 14-timeslot bundle can be used to carry data to the remote site in
DC, whereas the remaining 10 timeslots are used for data traveling to the remote site in LA.
Because each timeslot represents 64 Kbps of bandwidth, DC is allocated 896 Kbps and LA
receives 640 Kbps. This represents a form of static multiplexing. It allows two connections to
share a single link, but it prevents a dynamic reconfiguration of the 896/640 bandwidth split.
In other words, if no traffic is being transferred between NY and LA, the NY-to-DC circuit is
still limited to 896 Kbps. The 640 Kbps of bandwidth allocated to the other link is wasted.
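The timeslot arithmetic behind these numbers can be checked in a few lines of Python (a sketch of the math only; the constants are the T1 parameters given above):

```python
BITS_PER_TIMESLOT = 8     # one PCM sample per timeslot
FRAMES_PER_SECOND = 8000  # the T1 frame pattern repeats 8,000 times per second
TIMESLOTS_PER_T1 = 24

def timeslot_bps():
    """Bandwidth of a single timeslot: 8 bits sent 8,000 times per second."""
    return BITS_PER_TIMESLOT * FRAMES_PER_SECOND

def bundle_kbps(timeslots):
    """Bandwidth of a static bundle of timeslots, in kbps."""
    return timeslots * timeslot_bps() // 1000

print(timeslot_bps())                 # 64000 bps per timeslot
print(bundle_kbps(14))                # DC bundle: 896 kbps
print(bundle_kbps(10))                # LA bundle: 640 kbps
print(bundle_kbps(TIMESLOTS_PER_T1))  # all 24 timeslots: 1536 kbps
```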
Figure 9-2 shows an equivalent design utilizing ATM.
[Figure 9-2: an unchannelized T1 from NY into an ATM network; the virtual circuits to LA and DC only use cells when required]
In Figure 9-2, the NY office still has a T1 to the local CO. However, this T1 line is
unchannelized; it acts as a single pipe to the ATM switch sitting in the CO. The advantage
of this approach is that cells are only sent when there is a need to deliver data (other than
some overhead cells). In other words, if no traffic is being exchanged between the NY and
LA sites, 100 percent of the bandwidth can be used to send traffic between NY and DC.
Seconds later, 100 percent of the bandwidth might be available between NY and LA. Notice
that, although the T1 still delivers a constant flow of 1,544,000 bps, this fixed amount of
bandwidth is better utilized because there are no hard-coded multiplexing patterns that
inevitably lead to unused bandwidth. The result is a significant improvement over the static
multiplexing configured in Figure 9-1. In fact, many studies have shown that cell
multiplexing can double the overall bandwidth utilization in a large network.
Low Latency
Latency is a measurement of the time that it takes to deliver information from the source to
the destination. Latency comes from two primary sources:
- Propagation delay
- Switching delay
Propagation delay is based on the amount of time that it takes for a signal to travel over a given
type of media. In most types of copper and fiber optic media, signals travel at approximately
two-thirds the speed of light (in other words, about 200,000 kilometers per second). Because this
delay is ultimately controlled by the speed of light, propagation delay cannot be eliminated or
reduced (unless, of course, the two devices are moved closer together).
Switching delay results from the time it takes for data to move through some
internetworking device. Two factors come into play here:
- The length of the frame – If very large frames are in use, it takes a longer period of
time for the last bit of a frame to arrive after the first bit arrives.
- The switching mechanics of the device – Software-based routers can add several
hundred microseconds of delay during the routing process, whereas hardware-based
devices can make switching and routing decisions in only a few microseconds.
Cells are an attempt to address both of these issues simultaneously. Because cells are small,
the difference between the arrival time of the first and last bit is minimized. Because cells
are of a fixed size and format, they readily allow for hardware-based optimizations.
Why 53 Bytes?
As discussed in the previous section, every cell is a fixed-length container of information.
After considerable debate, the networking community settled on a 53-byte cell. This 53-
byte unit is composed of two parts: a 5-byte header and a 48-byte payload. However, given
that networking engineers have a long history of using even powers of two, 53 bytes seems
like a very strange number. As it turns out, 53 bytes was the result of an international
compromise. In the late 1980s, the European carriers wanted to use ATM for voice traffic.
Given the tight latency constraints required by voice, the Europeans argued that a 32-byte
cell payload would be most useful. U.S. carriers, interested in using ATM for data traffic,
were more interested in the efficiency that would be possible with a larger, 64-byte payload.
The two groups compromised on the mid-point value, resulting in the 48-byte payload still
used today. The groups then debated the merits of various header sizes. Although several
sizes were proposed, the 5-byte header was ultimately chosen.
Each of these steps equates to a layer in the ATM stack shown in Figure 9-3.
[Figure 9-3: the ATM stack: large IP packets are sliced by the AAL, the ATM layer builds 53-byte cells, and the physical layer ships the cells]
First, ATM must obviously chop up large IP packets before transmission. The technical term
for this function is the ATM Adaptation Layer (AAL); however, I use the more intuitive term
Slice & Dice Layer. The purpose of the Slice & Dice Layer is to act like a virtual Cuisinart
that chops up large data into small, fixed-size pieces. This is frequently referred to as SAR, a
term that stands for Segmentation And Reassembly and accurately portrays this Slice & Dice
function (it is also one of the two main functions performed by the AAL). Just as Cuisinarts
are available with a variety of different blades, the AAL blade can Slice & Dice in a variety
of ways. In fact, this is exactly how ATM accommodates voice, video, and data traffic over a
common infrastructure. In other words, the ATM Adaptation Layer adapts all types of traffic
into common ATM cells. However, regardless of which Slice & Dice blade is in use, the AAL
is guaranteed to pass a fixed-length, 48-byte payload down to the next layer, the ATM layer.
The middle layer in the ATM stack, the ATM layer, receives the 48-byte slices created by
the AAL. Note the potential for confusion here: all three layers form the ATM stack, but the
middle layer represents the ATM layer of the ATM stack. This layer builds the 5-byte ATM
cell header, the heart of the entire ATM process. The primary function of this header is to
identify the remote ATM device that should receive each cell. After this layer has completed
its work, the cell is guaranteed to be 53 bytes in length.
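The Slice & Dice idea can be sketched in a few lines. This is a conceptual illustration only: real AALs (AAL5, for example) also append a trailer with a length field and CRC before slicing, and a real 5-byte header carries VPI/VCI and other fields rather than the placeholder used here:

```python
CELL_PAYLOAD = 48  # bytes the AAL hands down per slice
CELL_HEADER = 5    # bytes added by the ATM layer

def slice_and_dice(packet: bytes, header: bytes) -> list:
    """Conceptual SAR: pad the packet to a multiple of 48 bytes, chop it into
    48-byte payloads, and prepend a 5-byte header to each slice."""
    assert len(header) == CELL_HEADER
    padded = packet + bytes((-len(packet)) % CELL_PAYLOAD)
    return [header + padded[i:i + CELL_PAYLOAD]
            for i in range(0, len(padded), CELL_PAYLOAD)]

cells = slice_and_dice(bytes(1500), bytes(CELL_HEADER))  # a 1500-byte IP packet
print(len(cells))     # 32 cells
print(len(cells[0]))  # every cell is 53 bytes
```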
NOTE Technically, there is a small exception to the statement that the ATM layer always passes 53-
byte cells to the physical layer. In some cases, the physical layer is used to calculate the cell
header's CRC, requiring only 52-byte transfers. In practice, this minor detail can be ignored.
At this point, the cells are ready to leave the device via a physical layer protocol. This physical
layer acts like a shipping department for cells. The vast majority of campus ATM networks
use Synchronous Optical Network (SONET) as a physical layer transport. SONET was
developed as a high-speed alternative to the T1s and E1s discussed earlier in this chapter.
NOTE SONET is very similar to T1s in that it is a framed, physical layer transport mechanism that
repeats 8,000 times per second and is used for multiplexing across trunk links. On the other
hand, they are very different in that SONET operates at much higher bit rates than T1s while
also maintaining much tighter timing (synchronization) parameters. SONET was devised
to provide efficient multiplexing of T1, T3, E1, and E3 traffic.
Think of a SONET frame as a large, 810-byte moving van that leaves the shipping dock
every 1/8000th of a second (810 bytes is the smallest/slowest version of SONET; higher
speeds use an even larger frame!). The ATM layer is free to pack as many cells as it can fit
into each of these 810-byte moving vans. On a slow day, many of the moving vans might
be almost empty. However, on a busy day, most of the vans are full or nearly full.
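Staying with the moving-van picture, the numbers work out as follows (a back-of-the-envelope sketch; a real STS-1 frame reserves some of its 810 bytes for transport and path overhead, so slightly fewer cells actually fit):

```python
SONET_FRAME_BYTES = 810   # smallest/slowest SONET frame (STS-1/OC-1)
FRAMES_PER_SECOND = 8000  # one frame leaves every 1/8000th of a second
CELL_BYTES = 53

line_rate_mbps = SONET_FRAME_BYTES * FRAMES_PER_SECOND * 8 / 1_000_000
cells_per_frame = SONET_FRAME_BYTES // CELL_BYTES  # ignoring SONET overhead

print(line_rate_mbps)   # 51.84 Mbps
print(cells_per_frame)  # at most 15 whole cells per 810-byte "van"
```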
One of the most significant advantages of ATM is that it doesn't require any particular type
of physical layer. That is, it is media independent. Originally, ATM was designed to run
only over SONET. However, the ATM Forum wisely recognized that this would severely
limit ATM's potential for growth and acceptance, and developed standards for many
different types and speeds of physical layers. In fact, the Physical Layer Working Group of
the ATM Forum has been its most prolific group. Currently, ATM runs over just about any
media this side of barbed wire.
TIP ATM and SONET are two different concepts. ATM deals with cells and SONET is simply
one of the many physical layers available for moving ATM cells from point to point.
[Figure 9-4: a point-to-point VC, which can be unidirectional or bidirectional, and a point-to-multipoint VC, which is unidirectional only, from a single root to multiple leaves]
Point-to-point virtual circuits behave exactly as the name suggests: one device can be
located at each end of the circuit. This type of virtual circuit is also very common in
technologies such as Frame Relay. These circuits support bidirectional communication;
that is, both endpoints are free to transmit cells.
Point-to-multipoint virtual circuits allow a single root node to send cells to multiple leaf nodes.
Point-to-multipoint circuits are very efficient for this sort of one-to-many communication
because they allow the root to generate a given message only once. It then becomes the duty of
the ATM switches to pass a copy of the cells that comprise this message to all leaf nodes.
Because of their unidirectional nature, point-to-multipoint circuits only allow the root to
transmit. If the leaf nodes need to transmit, they need to build their own virtual circuits.
TIP Do not confuse the root of a point-to-multipoint ATM VC with the Spanning Tree Root
Bridge and Root Port concepts discussed in Chapter 6, "Understanding Spanning Tree."
They are completely unrelated concepts.
ATM Addressing
As with all other cloud topologies, ATM needs some method to identify the intended
destination for each unit of information (cell) that gets sent. Unlike most other topologies,
ATM actually uses two types of addresses to accomplish this task: Virtual Path Indicator/
Virtual Channel Indicator (VPI/VCI) addresses and Network Services Access Point
(NSAP) addresses, both of which are discussed in greater detail in the sections that follow.
VPI/VCI Addresses
VPI/VCI, the first type of address used by ATM, is placed in the 5-byte header of every cell.
This address actually consists of two parts: the Virtual Path Indicator (VPI) and the Virtual
Channel Indicator (VCI). They are typically written with a slash separating the VPI and the
VCI values; for example, 0/100. The distinction between VPI and VCI is not important to a
discussion of LANE. Just remember that together these two values are used by an ATM edge
device (such as a router) to indicate to ATM switches which virtual circuit a cell should follow.
For example, Figure 9-5 adds VPI/VCI detail to the network illustrated in Figure 9-2 earlier.
[Figure 9-5: VPI/VCI detail added to the network of Figure 9-2: the NY router uses 0/50 and 0/65 on its link to Port 0 of the NY switch; the circuits continue as 0/51 toward LA and 0/185 toward DC]
The NY router uses a single physical link connected to Port 0 on the NY ATM switch
carrying both virtual circuits. How, then, does the ATM switch know where to send each
cell? It simply makes decisions based on VPI/VCI values placed in ATM cell headers by
the router. If the NY router (the ATM edge device) places the VPI/VCI value 0/50 in the cell
header, the ATM switch uses a preprogrammed table indicating that the cell should be
forwarded out Port 2, sending it to LA. Also, note that this table needs to instruct the switch
to convert the VPI/VCI value to 0/51 as the cells leave the Port 2 interface (only the VCI is
changed). The ATM switch in LA has a similar table indicating that the cell should be
switched out Port 1 with a VPI/VCI value of 0/52. However, if the NY router originates a
cell with the VPI/VCI value 0/65, the NY ATM switch forwards the cell to DC. The ATM
switching table in the NY ATM switch would contain the entries listed in Table 9-1.
Notice that the previous paragraph mentions that the ATM switch had been preprogrammed
with this switching table. How this preprogramming happens depends on whether the
virtual circuit is a PVC or an SVC. In the case of a PVC, the switching table is programmed
through human intervention (for example, through a command-line interface). However,
with SVCs, the table is built dynamically at the time the call is established.
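The preprogrammed switching table described above amounts to a lookup keyed by incoming port and VPI/VCI. The sketch below uses the values from the NY example (the dictionary layout is purely illustrative; a real switch holds these entries in hardware tables):

```python
# NY ATM switch: (in_port, in_vpi_vci) -> (out_port, out_vpi_vci)
NY_SWITCH_TABLE = {
    (0, "0/50"): (2, "0/51"),   # circuit toward LA
    (0, "0/65"): (1, "0/185"),  # circuit toward DC
}

def switch_cell(table, in_port, vpi_vci):
    """Forward a cell: pick the outgoing port and rewrite the VPI/VCI."""
    return table[(in_port, vpi_vci)]

print(switch_cell(NY_SWITCH_TABLE, 0, "0/50"))  # (2, '0/51')
print(switch_cell(NY_SWITCH_TABLE, 0, "0/65"))  # (1, '0/185')
```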
NSAPs
The previous section mentioned that SVCs build the ATM switching tables dynamically.
This requires a two-step process:
Step 1 The ATM switch must select a VPI/VCI value for the SVC.
Step 2 The ATM switch must determine where the destination of the call is located.
Step 1 is a simple matter of having the ATM switch look in the table to find a value currently
not in use on that port (in other words, the same VPI/VCI can be in use on every port of the
same switch; it just cannot be used twice on the same port).
To understand Step 2, consider the following example. If the NY router places an SVC call
to DC, how does the New York ATM switch know that DC is reachable out Port 1, not Port
2? The details of this process involve a complex protocol called Private Network-Network
Interface (PNNI) that is briefly discussed later in this chapter. For now, just remember that
the NY switch utilizes an NSAP address to determine the intended destination. NSAP
addresses function very much like regular telephone numbers. Just as every telephone on
the edge of a phone network gets a unique phone number, every device on an ATM network
gets a unique NSAP address. Just as you must dial a phone number to call your friend
named Joe, an ATM router must signal an NSAP address to call a router named DC. Just as
you can look at a phone number to determine the city and state in which the phone is
located, an NSAP tells you where the router is located.
However, there is one important difference between traditional phone numbers and NSAP
addresses: the length. NSAPs are fixed at 20 bytes in length. When written in their standard
hexadecimal format, these addresses are 40 characters long! Try typing in a long list of
NSAP addresses and you quickly learn why an ATM friend of mine refers to NSAPs as
"Nasty SAPs!" You also learn two other lessons: the value of cut-and-paste, and that even
people who understand ATM can have friends (there's hope after all!).
NSAP addresses consist of three sections:
- A 13-byte prefix. This is a value that uniquely identifies every ATM switch in the
network. Logically, it functions very much like the area code and exchange of a U.S.
phone number (for example, the 703-242 of 703-242-1111 identifies a telephone
switch in Vienna, Virginia). Cisco's campus ATM switches are preconfigured with
47.0091.8100.0000, followed by a unique MAC address that gets assigned to every
switch. For example, a switch that contains the MAC address 0010.2962.E801 uses a
prefix of 47.0091.8100.0000.0010.2962.E801. No other ATM switch in your network
can use this prefix.
- A 6-byte End System Identifier (ESI). This value identifies every device connected to
an ATM switch. A MAC address is typically used (but not required) for this value.
Logically, it functions very much like the last four digits of a U.S. phone number (for
example, the 1111 of 703-242-1111 identifies a particular phone attached to the
Vienna, Virginia telephone switch).
- A 1-byte selector byte. This value identifies a particular software process running in
an ATM-attached device. It functions very much like an extension number associated
with a telephone number (for example, 222 in 703-242-1111 x222). Cisco devices
typically use a subinterface number for the selector byte.
Figure 9-6 illustrates the ATM NSAP format.
[Figure 9-6: the ATM NSAP format: 20 bytes total, divided into a 13-byte prefix, a 6-byte ESI, and a 1-byte selector byte]
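Because the three sections always occupy fixed positions within the 40 hex characters, an NSAP can be split apart mechanically. A small sketch (the function name and returned field names are mine), using the NY router's NSAP shown in Figure 9-7:

```python
def parse_nsap(nsap: str) -> dict:
    """Split a 20-byte NSAP (40 hex characters) into its three sections."""
    digits = nsap.replace(".", "")
    assert len(digits) == 40, "an NSAP is always 20 bytes"
    return {
        "prefix": digits[:26],    # 13 bytes: identifies the ATM switch
        "esi": digits[26:38],     # 6 bytes: identifies the attached device
        "selector": digits[38:],  # 1 byte: identifies a software process
    }

parts = parse_nsap("47.0091.8100.0000.0060.12AB.7421.0000.0C48.29A1.00")
print(parts["prefix"])    # 47009181000000006012AB7421
print(parts["esi"])       # 00000C4829A1
print(parts["selector"])  # 00
```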
TIP ATM NSAP addresses are only used to build an SVC. After the SVC is built, only VPI/VCI
addresses are used.
Figure 9-7 illustrates the relationship between NSAP and VPI/VCI addresses. The NSAPs
represent the endpoints whereas VPI/VCIs are used to address cells as they cross each link.
Figure 9-7 Using NSAP Addresses to Build VPI/VCI Values for SVCs
[Figure 9-7: the NY router's globally unique NSAP address (47.0091.8100.0000.0060.12AB.7421.0000.0C48.29A1.00) identifies the endpoint of the SVC, whereas VPI/VCI values such as 0/50 and 0/51 are unique to each link along the path]
Table 9-2 documents the characteristics of NSAP and VPI/VCI addresses for easy
comparison.
ATM Switches
ATM switches only handle ATM cells. Cisco's campus switches include the LightStream
LS1010 and 8500 MSR platforms (Cisco also sells several carrier-class switches developed
as a result of its StrataCom acquisition). ATM switches are the devices that contain the
ATM switching tables referenced earlier. They also contain advanced software features
(such as PNNI) to allow calls to be established and high-speed switching fabrics to shuttle
cells between ports. Except for certain overhead circuits, ATM switches generally do not
act as the termination point of PVCs and SVCs. Rather, they act as the intermediary
junction points for the circuits connected between ATM edge devices.
Figure 9-8 illustrates the difference between edge devices and ATM switches.
Figure 9-8 Catalyst ATM Edge Devices Convert Frames to Cells, Whereas ATM Switches Handle Only Cells
[Figure 9-8: an Ethernet/ATM edge device converts frames to cells at the boundary of the ATM network; the ATM switches inside the network handle only cells]
TIP Remember the difference between ATM edge devices and ATM switches. ATM edge
devices sit at the edge of the network, but ATM switches actually are the ATM network.
Products such as the Catalyst 5500 and 8500 actually support multiple functions in the
same chassis. The 5500 supports LS1010 ATM switching in its bottom five slots while
simultaneously accommodating one to seven LANE modules in the remaining slots.
However, it is often easiest to think of these as two separate boxes that happen to share the
same chassis and power supplies. The bottom five slots (slots 9-13) accommodate ATM
switch modules while slots 2-12 accommodate ATM edge device modules (note that slots
9-12 can support either service).
TIP Cisco sells several other devices that integrate ATM and frame technologies in a single
platform in a variety of different ways. These include the Catalyst 5500 Fabric Integration
Module (FIM), the Catalyst 8500 MSR, and the ATM Router Module (ARM). See Cisco's
Product Catalog for more information on these devices.
ILMI
Integrated Local Management Interface (ILMI) is a protocol created by the ATM Forum to
handle various automation responsibilities. Initially called the Interim Local Management
Interface, ILMI utilizes SNMP to allow ATM devices to automagically learn the configuration
of neighboring ATM devices. The most common use of ILMI is a process generally referred to
as address registration. Recall that NSAP addresses consist of three parts: the switch's prefix and
the edge device's ESI and selector byte. How do the two devices learn about each other's
addresses? This is where ILMI comes in. Address registration allows the edge device to learn
the prefix from the switch and the switch to learn the ESI from the edge device (because the
selector byte is locally significant, the switch doesn't need to acquire this value).
PNNI
Private Network-Network Interface (PNNI) is a protocol that allows switches to
dynamically establish SVCs between edge devices. However, edge devices do not
participate in PNNI; it is a switch-to-switch protocol (as the Network-Network Interface
portion of the name suggests). Network-Network Interface (NNI) protocols consist of two
primary functions:
- Signaling
- Routing
Signaling allows devices to issue requests that create and destroy ATM SVCs (because
PVCs are manually created, they do not require signaling for setup and teardown).
Routing is the process that ATM switches use to locate the destination NSAP addresses
specified in signaling requests. Note that this is very different from IP-based routing. IP
routing is a connectionless process that is performed for each and every IP datagram
(although various caching and optimization techniques do exist). ATM routing is only
performed at the time of call setup. After the call has been established, all of the traffic
associated with that SVC utilizes the VPI/VCI cell-switching table. Note that this
distinction allows ATM to simultaneously fulfill the conflicting goals of flexibility and
performance. The unwieldy NSAP addresses provide flexible call setup schemes, and the
low-overhead VPI/VCI values provide high-throughput and low-latency cell switching.
TIP Do not confuse ATM routing and IP routing. ATM routing is only performed during call
setup when an ATM SVC is being built. On the other hand, IP routing is a process that is
performed on each and every IP packet.
For many network designers, one of the most compelling advantages of ATM is freedom
from distance constraints. Even without repeaters, ATM supports much longer distances
than any form of Ethernet. With repeaters (or additional switches), ATM can cover any
distance. For example, with ATM it is very simple and cost-effective to purchase dark fiber
between two sites that are up to 40 kilometers apart (much longer distances are possible)
and connect the fiber to OC-12 long-reach ports on LS1010 ATM switches (no repeaters are
required). By using repeaters and additional switches, ATM can easily accommodate
networks of global scale. On the other hand, a number of vendors have introduced
forms of Gigabit Ethernet ports capable of reaching 100 kilometers without a repeater, such
as Cisco's ZX GBIC. Although this does not allow transcontinental Ethernet connections,
it can accommodate many campus requirements.
ATM has historically been considered one of the fastest (if not the fastest) networking
technologies available. However, this point has recently become the subject of considerable
debate. The introduction of hardware-based, Gigabit-speed routers (a.k.a. Layer 3
switches) has nullified the view that routers are slow, causing many to argue that modern
routers can be just as fast as ATM switches. On the other hand, ATM proponents argue that
ATM's low-overhead switching mechanisms will always allow for higher bandwidth than
Layer 3 switches can support. Only time will tell.
In short, the decision to use ATM is no longer a clear-cut choice. Each organization must
carefully evaluate its current requirements and plans for future growth. For additional
guidelines on when to use ATM and when not to use ATM, see Chapter 15, Campus Design
Implementation.
NOTE Recall that LANE is a technique for bridging traffic over an ATM network. As a bridging
technology, LANE therefore uses the Spanning-Tree Protocol discussed in Chapters 6 and
7. Note that this also implies a lower limit on failover performance: features such as SSRP,
discussed later, might fail over in 10-15 seconds; however, Spanning Tree prevents traffic
from flowing for approximately 30 seconds by default.
Because of this delay, some ATM vendors disable Spanning Tree by default. Because of the
risks associated with doing this (loops can easily be formed in the Ethernet portion of the
network), Cisco enables Spanning Tree over LANE by default.
TIP The acronyms used by LANE can lead to considerable confusion. This is especially true
when using the terms LECs (more than one LANE Client) and LECS (a single LANE
Configuration Server). You might want to always pronounce LEC as Client and LECS as
Config Server to avoid this confusion.
The Setting
Our play is set in the exciting community of Geektown. In building the set, the director has
spared no expense. On stage, the workers have carefully built an entire nightclub. This is a
single, large brick building (ATM cloud) that contains two separate barrooms (ELANs). On
the right, we have The Funky Bit-Pipe, a popular local discotheque. On the left, The Dusty
Switch, an always-crowded country-western bar.
Plot Synopsis
Act I is where the main drama occurs and consists of five scenes:
Scene 1 - Configuration Direct - Client Contacts the Bouncer (LECS): The play
begins with a lone Client standing outside the barroom. The Client is clad in a pair of
shiny new cowboy boots, a wide-brimmed Stetson hat, and a Garth Brooks look-alike
shirt. Before the Client can enter the bar and quench his thirst, he must locate the
Bouncer (LECS) standing at the front door to the nightclub. The Bouncer performs a
quick security check and, noticing the Client's cowboy attire, points the Client in the
direction of the Bartender (LES) for the Dusty Switch barroom (ELAN).
Scene 2 - Control Direct - Client Contacts the Bartender (LES): As soon as the
Client enters the bar, it must approach the Bartender. Then, the Bartender makes a
note of the Client's name (MAC address) and table number (NSAP address). Notice
that this solves the first requirement of the LANE fake out by providing MAC address
to NSAP address mapping.
Scene 3 - Control Distribute - Bartender (LES) Contacts the Client: As every Client
enters the barroom, the Bartender adds it as a leaf node to a special fan-out point-to-
multipoint VC. After the new Client contacts the Bartender, the Bartender must add
this Client as well. This fan-out circuit allows the Bartender to easily send a single
frame that gets distributed to all Clients in the ELAN by the ATM switches. Happy
Hour! and Last Call! are commonly heard messages.
Scene 4 - Multicast Send - Client Contacts the Gossip (BUS): The Client then
acquires the location of the Gossip (BUS) from the Bartender (LES). Next, the Client
uses this address to build a connection to the Gossip, allowing the Client to easily send
broadcasts and multicasts to everyone in the bar. Notice that this solves the second
requirement of the fake out: the capability to send broadcast and multicast frames.
Scene 5 - Multicast Forward - The Gossip (BUS) Contacts the Client: Just as the
Bartender has special fan-out point-to-multipoint circuits that can be used to
efficiently reach every Client in the barroom, the Gossip maintains a similar circuit.
This allows the Gossip to quickly distribute all of the information he collects.
After the five scenes of Act I are complete, the Client has joined the ELAN, and the action
moves on to Act II. Notice that Acts I and II only consider the action in the Dusty Switch
ELAN. While the cowboys are having fun in their ELAN, another group of Clients is
dancing the night away in the discotheque ELAN. Although both barrooms share the
services of the Bouncer, each bar requires its own Bartender and Gossip.
NOTE Anycast addresses are special ATM NSAPs that allow multiple devices to advertise a single
service. As clients request connections to these addresses, ATM switches automatically
connect them to the nearest available device providing that service. Not only can this
optimize traffic flows, it can provide an automatic form of redundancy (if the nearby server
goes away, the client simply is sent to a more distant device when it tries to reconnect).
Under the well-known VPI/VCI approach, Clients always use 0/17 to communicate with
the LECS. Because this method doesn't lend itself well to failure recovery, the well-known
VPI/VCI technique is rarely used (it's also not included in the second version of the LANE
specification).
Assuming that the LEC has acquired the Configuration Server's NSAP via ILMI, the Client
then places an ATM phone call (SVC) to the LECS. This SVC is referred to as the
Configuration Direct (it is a direct connection to the Configuration Server). After the
Configuration Direct has been established, the Client tells the Configuration Server its
NSAP address and the desired ELAN (barroom). The LECS then lives up to its title of the
Bouncer by providing the Client with the following:
Security - Optionally checks the Client's NSAP address as a security measure
Hospitality - Tells the Client how to reach the Bartender (LES) of the desired
barroom (ELAN)
Also, if the Client doesn't request a specific ELAN, the Bouncer can optionally provide a
default ELAN.
Figure 9-9 illustrates the Configuration Direct VC.
Figure 9-9 Step 1: The Client Contacts the Configuration Server to Get the NSAP of the LES
Assuming that the Client meets the security requirements and the requested ELAN exists,
the LECS provides the NSAP of the LES to the Client. At this point, the Configuration Direct
can optionally be torn down to conserve virtual circuits on the ATM switch (Cisco devices
take advantage of this option).
Figure 9-10 Step 2: The Client Contacts the LES and Registers Its MAC and NSAP Addresses
Figure 9-11 Step 3: The LES Adds the Client to the Already Existing Control Distribute Point-to-Multipoint Circuit
Figure 9-12 Step 4: The Client Builds the Multicast Send to the BUS
Figure 9-13 Step 5: The BUS Adds the Client to the Multicast Forward
Figure 9-14 Data Direct VCs are created by Clients after they have joined the ELAN
Just as Gossips tell everyone everything they hear, BUSs are used to spray information
to all Clients in an ELAN.
Just as Clients are not allowed to come through the bar's back door when they are
thrown out (they must dust themselves off and go back to see the Bouncer first),
LANE Clients must rejoin an ELAN by visiting the LECS first.
LANE Version 1.0 frame format:
LEC ID (2 bytes) | Dest. MAC (6) | Src. MAC (6) | Type/Length (2) | Payload (46-1500)
LANE Version 2.0 frame format:
LLC (3 bytes) | OUI (3) | Frame Type (2) | ELAN ID (4) | LEC ID (2) | Dest. MAC (6) | Src. MAC (6) | Type/Length (2) | Payload (46-1500)
If you compare the LANE Version 1.0 format to the traditional Ethernet frame, you will
notice two changes:
The addition of the 2-byte LEC ID field - As LECs contact the LES, they are
assigned a unique, 2-byte LEC ID identifier. In practice, the first LEC that joins is 1,
the second is 2, and so on.
The removal of the 4-byte CRC - Because ATM has its own CRC mechanism, the
ATM Forum removed the Ethernet CRC.
Notice that this changes the Ethernet MTU for LANE. As discussed in Chapter 1, the
traditional Ethernet MTU is 1,518 bytes: 14 bytes of header, 1500 bytes of payload, and 4
bytes of CRC. Because Ethernet LANE removes the 4-byte CRC but adds a 2-byte LEC ID,
the resulting MTU is 1516. However, notice that the payload portion is still 1500 bytes to
ensure interoperability with all Ethernet devices.
LANE version 2.0 adds an optional 12-byte header to the front of the format used in version
1.0. The 12 bytes of this header are composed of the following four fields:
802.2 Logical Link Control (LLC) header - The value AAAA03 signifies that the
next five bytes are a SNAP header.
SNAP Organizationally Unique Identifier (OUI) - Used to specify the
organization that created this protocol format. In this case, the OUI assigned to the
ATM Forum, 00A03E, is used.
SNAP Frame Type - Used to specify the specific frame type. This field allows each
of the organizations specified in the previous field to create up to 65,535 different
protocols. In the case of LANE V2, the value 000C is used.
LANE V2 ELAN ID - A unique identifier for every ELAN.
This allows the second version of LANE to multiplex traffic from multiple ELANs over a
single Data Direct VC (the ELAN ID is used to differentiate the traffic).
In the example, Host-A issues an IP ping to Host-B. Both devices are Ethernet-attached
PCs connected to Catalysts that contain LANE uplink cards in slot 4. Host-A is using IP
address 1.1.1.1 and MAC address AAAA.AAAA.AAAA. Host-B has IP address 1.1.1.2
and MAC address BBBB.BBBB.BBBB. Notice that this example focuses only on the
building of a Data Direct - both Clients (Catalyst LANE cards) are assumed to have already
joined the ELAN (using the five-step process discussed earlier). All caches and LANE
tables are assumed to be at a state just after initialization. The following sequence outlines
the steps that allow two Clients to establish a Data Direct VC.
Step 1 The user of Host-A enters ping 1.1.1.2 at a command prompt.
Step 2 Host-A issues an IP ARP for the IP address 1.1.1.2. Figure 9-17
illustrates this ARP packet.
Figure 9-17 Ethernet ARP Request
Figure 9-18 The First Five Steps of the Data Direct VC Creation Process
Step 6 Host-B receives the IP ARP request. Recognizing its IP address in the ARP
packet, it builds an IP ARP reply packet. Figure 9-19 illustrates the reply.
In this case, the ARP message contains the MAC address in question.
Also notice that ARP unicasts the reply back to the source node; it is not
sent to all nodes via the broadcast address.
Step 7 The LEC-B Catalyst receives the IP ARP reply. Having just added a
bridging table entry for AAAA.AAAA.AAAA in Step 5, the frame is
forwarded to the LANE module in slot 4.
Step 8 The LEC-B software running on the LANE module must then send the IP
ARP reply over the ATM backbone. At this point, two separate threads of
activity take over. The first thread, an LE_ARP process, is detailed in Step 8;
the second thread, forwarding the IP ARP, is explained in Step 9.
(a) LEC-B must resolve MAC address AAAA.AAAA.AAAA
into an NSAP address. To do this, LEC-B sends an
LE_ARP_REQUEST to the LES. Notice the important
differences between this LE_ARP and the earlier IP ARP.
With the IP ARP, the destination IP address was known but
the MAC address was not known. In the case of the
LE_ARP, the MAC address is now known (via the IP ARP)
and the NSAP address is unknown. In other words, the
LE_ARP is only possible after the IP ARP has already
resolved the MAC address.
(b) The LES consults its local MAC-address-to-NSAP-address
mapping table. Although this table contains a mapping entry
for LEC-A, it does not contain a mapping entry for Host-A.
It is important to realize that LEC-A and Host-A are using
different MAC addresses. Just because LEC-A is acting as a
Proxy Client for Host-A doesn't mean that it has assumed
Host-A's MAC address. When LEC-A joined the ELAN, it
could have optionally registered all known addresses for its
Ethernet-attached hosts. However, because transparent
bridges rarely know all MAC addresses (after all, they are
learning bridges), most vendors opt to only register the MAC
address specifically assigned to the Proxy LEC.
If it still seems unclear why the LES wouldn't learn about
MAC address AAAA.AAAA.AAAA at the time LEC-A
joined the ELAN, consider the following: What if Host-A
wasn't even running at the time LEC-A joined? In this case,
the MAC address isn't even active, so it is impossible for
LEC-A to register MAC address AAAA.AAAA.AAAA.
(c) Because the LES doesn't have a mapping for the requested
MAC address, it forwards the LE_ARP message on to all
Proxy Clients via the Control Distribute VC.
Figure 9-20 diagrams Steps 6 through 8c.
Step 10 At this point, both LECs have attempted to build a Data Direct VC. Which
Data Direct VC is built first can be different in every case and depends on the
timing of Steps 8 and 9. Assume that LEC-B completes its Data Direct VC
first and the timing is such that LEC-A also builds a Data Direct VC. In this
case, both LECs begin using the Data Direct VC built by the LEC with the
lower NSAP address. Assume that LEC-A has the lower NSAP. This causes
all traffic to flow over the Data Direct VC created by LEC-A (LEC-B's Data
Direct VC times out after five minutes by default on Cisco equipment).
Step 11 Before the Clients can begin communicating, they must take an extra step
to ensure the in-order delivery of information. Notice that both LEC-B
(Step 9b) and LEC-A (Step 9f) have sent information via the BUS. If the
Clients were to start sending information via the Data Direct VC as soon
as it became available, this could lead to out-of-order information. For
example, the Data Direct frame (the second frame sent) could arrive
before the frame sent via the BUS (the first frame sent). To prevent this,
the Clients can optionally use a Flush protocol. Steps 11a-11e follow the
Flush protocol from LEC-A's perspective:
(a) LEC-A sends an LE_FLUSH_REQUEST message over
the Multicast Send.
(b) The BUS forwards the LE_FLUSH_REQUEST to LEC-B.
Figure 9-24 Steps 10 and 11a-11e in the Data Direct VC Creation Process
Notice that after the process forks in Step 8, the order of the events becomes indeterminate.
In some cases, LEC-A is the first to build a Data Direct VC; in other cases, LEC-B is the first.
Configuration Concepts
There are several important concepts used in configuring LANE on Cisco devices,
including the following:
Mapping VLAN numbers to ELAN names
Addressing
Subinterfaces
Subinterfaces
As explained in Chapter 8, Trunking Technologies and Applications, Cisco uses the
subinterface concept to create logical partitions on a single physical interface. In this case,
each partition is used for a separate ELAN, as shown in Figure 9-25.
In Figure 9-25, subinterface ATM 0.1 is used for ELAN1, and ATM 0.2 is used for ELAN2.
Cat-A is a client on both ELANs and therefore requires a LEC on both subinterfaces. Because
Cat-A is also acting as the LES and BUS for ELAN2 (but not ELAN1), these services are only
configured on subinterface ATM 0.2. However, notice that the LECS is configured on the
major interface, ATM 0. This placement mirrors the roles that each component plays:
Because the LECS doesn't belong to any particular ELAN, it is placed on the major
interface. As a global concept, the LECS should be placed on the global interface.
LECs, LESs, and BUSs that do belong to specific ELANs are placed on subinterfaces.
In short, subinterfaces allow the interface configuration to match the LANE configuration.
TIP Although most LANE components are configured on a subinterface, the LECS is always
configured on the major interface of Cisco equipment.
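The layout in Figure 9-25 can be sketched as IOS configuration (Cat-A per the text: a client on both ELANs, LES/BUS for ELAN2 only; the database name is a hypothetical placeholder):

```
! LECS on the major interface (a global service)
int atm 0
 lane config database my_database
 lane config auto-config-atm-address
!
! One subinterface per ELAN: Cat-A is a client on both,
! but LES/BUS only for ELAN2
int atm 0.1 multipoint
 lane client ethernet 1 ELAN1
!
int atm 0.2 multipoint
 lane server-bus ethernet ELAN2
 lane client ethernet 2 ELAN2
```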
Addressing
LANE's reliance on SVCs requires careful planning of NSAP addresses (recall that SVCs
are built by placing an ATM phone call to an NSAP address). Although Cisco allows you
to manually configure NSAP addresses, an automatic NSAP addressing scheme is provided
to allow almost plug-and-play operation.
Recall from Figure 9-6 that NSAP addresses have three sections:
A 13-byte prefix from the ATM switch (LS1010)
A 6-byte ESI from the edge device (Catalyst LANE module)
A 1-byte selector byte from the edge device (Catalyst LANE module)
The LANE components automatically acquire the prefix from the ATM switch. However,
what does the Catalyst provide for the ESI and selector byte values? Cisco has created a
simple scheme based on MAC addresses to fulfill this need. Every ATM interface sold by
Cisco has a block of at least eight MAC addresses on it (some interfaces have more). This
allows each LANE component to automatically adopt a unique NSAP address using the
pattern shown in Table 9-3.
As discussed earlier, notice that the LECS always appears on the major interface and uses
.00 as a selector byte. Also note that although subinterfaces are expressed in decimal to the
IOS (interface ATM 0.29), the selector byte is expressed in hexadecimal (0x1D).
TIP Although subinterface numbers are expressed in decimal in the IOS configuration, they are
expressed in hex when used for the ATM NSAP's selector byte.
In practice, Cisco makes it extremely easy to determine the NSAP addresses for a specific
Catalyst LANE module - just use the show lane default command. For example, the
Catalyst output in Example 9-1 is connected to an ATM switch with the prefix
47.0091.8100.0000.0010.2962.E801.
The selector byte is shown as .** for the LEC, LES, and BUS because the subinterface
numbers are not revealed by this command.
Configuration Syntax
The good news is that, although the theory of LANE is very complex and cumbersome,
Cisco has made the configuration very simple. In fact, LANE uses the same configuration
syntax across almost the entire product line. In other words, learn how to configure LANE
on a Catalyst and you already know how to configure it on a Cisco router or ATM switch.
To configure a Catalyst LANE module, you must first use the session command to open a
LANE command prompt. For example, if you currently have a Telnet session into the
Catalyst 5000 Supervisor and you want to configure a LANE blade in slot 4, you issue the
command in Example 9-2.
ATM>
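The Example 9-2 listing is not fully reproduced here; as a sketch, on the Supervisor it would look roughly like this, assuming the LANE module sits in slot 4 as the text states:

```
Console> (enable) session 4
ATM>
```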
This suddenly catapults you to the IOS-style command prompt on the LANE module!
That's right, the LANE module runs the traditional IOS software (although it's obviously a
separate binary image that must be downloaded from CCO). Almost all of the router's
command-line interface (CLI) features you know and love are available:
Bash-style command-line recall (using the arrow keys)
config term to alter the configuration
debug commands
copy run start or write mem to save the configuration
Don't forget the last bullet: unlike the Catalyst Supervisor, you must remember to save the
configuration. Forget to do this and you are guaranteed to have a miserable day (or night)
after the next power outage!
TIP Don't forget to use the copy run start command to save your LANE configuration!
It is easiest to think of the LANE module as an independent device that connects to the
Catalyst backplane. In other words, it has its own DRAM and CPU for use while
operational. When the Catalyst is powered down, the LANE module uses its own NVRAM
to store the configuration and flash to store the operating system.
The Catalyst LANE module can be configured in five simple steps, each of which is
detailed in the sections that follow:
Step 1 Build overhead connections
Step 2 Build the LES and BUS
The VCD parameter is used to specify a locally significant Virtual Circuit Descriptor. The
IOS uses a unique VCD to track every ATM connection. In the case of PVCs, you must
manually specify a unique value. In the case of SVCs, the Catalyst automatically chooses
a unique value.
The VPI and VCI parameters are used to specify the Virtual Path Identifier and Virtual
Channel Identifier, respectively. Recall that these are the two addressing fields in the 5-byte
ATM cell header.
The two PVCs listed in Example 9-3 must exist on every LEC.
The first PVC provides signaling (QSAAL stands for Q.Signaling ATM Adaptation Layer),
whereas the second PVC obviously provides ILMI. You are free to use any VCD values you
want; however, they must be unique, and the values 1 and 2 are most common.
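For reference, the two overhead PVCs referred to as Example 9-3 are the same statements that appear in the complete configurations later in the chapter:

```
int atm 0
 atm pvc 1 0 5 qsaal
 atm pvc 2 0 16 ilmi
```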
Be careful not to enter the commands in Example 9-4.
This common mistake results in only one PVC (ILMI) because the VCD numbers are
the same.
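A sketch of the mistake: because both statements specify VCD 1, the second overwrites the first, and only the ILMI PVC survives:

```
! Incorrect - both PVCs use VCD 1, so only the ILMI PVC remains
int atm 0
 atm pvc 1 0 5 qsaal
 atm pvc 1 0 16 ilmi
```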
The best news of all is that Step 1 is not required for Catalyst LANE modules! All Catalyst
LANE images since 3.X automatically contain the two overhead PVCs. On the other hand,
if you are configuring LANE on a Cisco router, don't forget to enter these two commands.
TIP The ATM PVC statements for signaling and ILMI are not required for Catalyst LANE
module configuration. However, they are required for LANE configurations on Cisco
routers.
For example, the configuration in Example 9-5 creates a LES and BUS for the ELAN
named ELAN1.
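The Example 9-5 configuration is presumably the single lane server-bus command on the ELAN's subinterface (the subinterface number here is an assumption), as also seen in Example 9-10 later:

```
int atm 0.1 multipoint
 lane server-bus ethernet ELAN1
```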
After completing the configuration in Example 9-5, you should be able to view the status
of the LES with the show lane server command, as demonstrated in Example 9-6.
Finally, be sure to make note of the LES address on the second-to-last line (the ATM
address: field); it is used in Step 3.
Unless you are using SSRP server redundancy (discussed later), be sure to configure only
one LES/BUS for each ELAN.
At this point, you can now enter one line per ELAN. Each line lists the name of the ELAN
and the NSAP of the LES for that ELAN using the following syntax (some advanced
options have been omitted for simplicity):
name elan-name server-atm-address atm-addr
NOTE Multiple lines are possible if you are using a feature known as SSRP. This option is
discussed later.
For example, the commands in Example 9-7 build a database for two ELANs.
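Such a database takes the following general shape (this mirrors the database shown in full in Example 9-12 later in the chapter):

```
lane database my_database
 name ELAN1 server-atm-address 47.0091.8100.0000.0010.2962.E801.0010.2962.E431.01
 name ELAN2 server-atm-address 47.0091.8100.0000.0010.2962.E801.0010.1F26.6431.02
```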
A common mistake is to configure the NSAP of the LECS in the database (instead of the
LES NSAP). The LECS doesn't need to have its own address added to the database (it
already knows that); it needs to know the NSAP of the LES. Just remember, the LECS
(Bouncer) needs to tell Clients how to reach the LES (Bartender).
TIP Be certain that you specify the name of the LES, not the LECS, in the LECS database.
Be aware that the capability to edit the LECS database on the ATM device itself is a very
useful Cisco-specific feature. Several other leading competitors require you to build the file
on a TFTP server using cryptic syntax and then TFTP the file to the ATM device, an
awkward process at best.
The lane config database command binds the database built in the previous step to the
specified major interface. The complete syntax for this command is as follows:
lane config database database-name
The lane config auto-config-atm-address command tells the IOS to use the automatic
addressing scheme (MAC address + 3, with a selector byte of .00) discussed earlier. The
following is the complete syntax for this command:
lane config auto-config-atm-address
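Both commands are applied together under the major interface, as in Example 9-12 later in the chapter:

```
int atm 1/0
 lane config database my_database
 lane config auto-config-atm-address
```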
The 1 ties VLAN 1 to ELAN1, merging the two into a single broadcast domain.
The following is the complete syntax for the lane client command:
lane client [ethernet | tokenring] vlan-num [elan-name]
TIP It is necessary to specify a VLAN number to ELAN name mapping when creating a LEC
on a Catalyst LANE module. This is not required when configuring LANE on a Cisco
router (by default, routers route, not bridge, between the LEC and other interfaces).
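A sketch of the difference (the router form without a VLAN number is an assumption based on the TIP above and the optional vlan-num in the syntax):

```
! Catalyst LANE module: VLAN number required
lane client ethernet 1 ELAN1
!
! Cisco router: ELAN name only, no VLAN number
lane client ethernet ELAN1
```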
This is a global command that applies to all ports on the switch. You can obtain the LECS
address by issuing the show lane config command or the show lane default command on
the device functioning as the Configuration Server. This command must be configured on
all ATM switches (there is no automated protocol to disseminate the LECS NSAP
between ATM switches). The full syntax of the atm lecs-address-default command is as
follows:
atm lecs-address-default lecs-address [sequence-#]
The sequence-# parameter is used by the SSRP feature discussed later in the chapter.
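On an LS1010 this might look like the following (the NSAP shown is a hypothetical LECS address in the auto-addressing format, with the .00 selector byte the LECS uses):

```
atm lecs-address-default 47.0091.8100.0000.0010.2962.E801.0010.2962.E433.00 1
```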
The network consists of two Catalysts that contain Ethernet and LANE modules. Each
Catalyst has been configured with two VLANs that use ATM as a trunk media. VLAN 1 is
transparently bridged to ELAN1, creating a single broadcast domain. VLAN 2 uses
ELAN2. Both Catalysts have two LANE Clients, one for each ELAN. Cat-A is acting as
the LES/BUS for ELAN1, and Cat-B is serving as the LES/BUS for ELAN2. Example 9-10
shows the configuration code for Cat-A.
Example 9-10 Cat-A Configuration for the LANE Network in Figure 9-26
int atm 0
atm pvc 1 0 5 qsaal
atm pvc 2 0 16 ilmi
!
int atm 0.1 multipoint
lane server-bus ethernet ELAN1
lane client ethernet 1 ELAN1
!
int atm 0.2 multipoint
lane client ethernet 2 ELAN2
Example 9-11 Cat-B Configuration for the LANE Network in Figure 9-26
int atm 0
atm pvc 1 0 5 qsaal
atm pvc 2 0 16 ilmi
!
int atm 0.1 multipoint
lane client ethernet 1 ELAN1
!
int atm 0.2 multipoint
lane server-bus ethernet ELAN2
lane client ethernet 2 ELAN2
An ATM-attached router on a stick has also been configured to provide routing services
between the VLANs/ELANs. To function in this capacity, the router must be configured
with the two overhead PVCs on the major interface and two subinterfaces containing
LANE Clients and IP addresses. The router is also acting as the LECS for the network.
Example 9-12 shows the configuration code for the router.
Example 9-12 Router Configuration for the LANE Network in Figure 9-26
lane database my_database
name ELAN1 server-atm-address 47.0091.8100.0000.0010.2962.E801.0010.2962.E431.01
name ELAN2 server-atm-address 47.0091.8100.0000.0010.2962.E801.0010.1F26.6431.02
!
int atm 1/0
atm pvc 1 0 5 qsaal
atm pvc 2 0 16 ilmi
lane config database my_database
lane config auto-config-atm-address
!
int atm 1/0.1 multipoint
ip address 10.0.1.1 255.255.255.0
lane client ethernet 1 ELAN1
!
int atm 1/0.2 multipoint
ip address 10.0.2.1 255.255.255.0
lane client ethernet 2 ELAN2
TIP Use the show lane client command to remember the order of the LANE joining process.
There are several items in Example 9-13 showing that the LEC has successfully joined
the ELAN:
The State is operational.
The LEC has been up for eight seconds.
All five overhead VCs have valid NSAPs listed.
Example 9-14 shows a common failure condition.
Several sections of the output in Example 9-14 point to the failure, including:
The State is initialState.
The LEC attempts to join again in nine seconds.
A Last Fail Reason is provided.
Only the first VC (the Configuration Direct) lists a non-zero NSAP.
Because the first line represents the Configuration Direct to the LECS, the information in
this display is strong evidence of a problem very early in the joining process. Also look
carefully at the NSAP for the Configuration Direct - it is the well-known NSAP. Because
Cisco uses the well-known NSAP as a last-gasp effort to contact the LECS if all other
measures have failed, it is often a sign that the join process failed. In fact, this display is
almost always the result of one of three specific errors:
The ATM switch provided the wrong NSAP (or no NSAP) for the LECS - When
the LEC contacted the device specified by this NSAP and tried to begin the joining
process, it was rudely told the equivalent of Join an ELAN?? I don't know what you
are talking about! Look at the atm lecs-address-default command on the LS1010 to
verify and fix this problem.
The LEC was given the correct NSAP address for the LECS, but the LECS is down - In
this case, the Client contacted the correct device, but it was still rebuffed because the
LECS couldn't yet handle LE_CONFIGURE_REQUESTs (the message sent
between Clients and LECSs). Look at the configuration of the LECS to verify and fix
this problem. The show lane config command is often useful for this purpose.
The LEC requested an ELAN that doesn't exist - To verify this condition, issue a
terminal monitor command on the LANE module to display error messages. If you
see CFG_REQ failed, no configuration (LECS returned 20) messages, use the show
lane client command to check the ELAN name. Make certain that this ELAN name
is identical to one of the ELANs displayed in show lane database on the LECS.
Carefully check for incorrect case - ELAN names are case sensitive.
The output in Example 9-15 shows the results of an LEC that made it past contacting the
LECS but has run into problems contacting the LES.
The output in Example 9-15 shows most of the same symptoms as the output in Example
9-14; however, the first two overhead connections have NSAPs listed. For two reasons, this
pair of overhead connections proves that the LEC successfully reached the LECS. First, the
Configuration Direct (first overhead circuit) lists the correct NSAP for the LECS. Second,
the LEC could not have attempted a call to the LES (second overhead circuit) without
having received an NSAP from the LECS. In this case, the LEC called the LES (Control
Direct, second VC), but the LES didn't call the LEC back (Control Distribute, third VC).
This diagnosis is further supported by the Last Fail Reason indicating Control Direct VC
being released. In almost all cases, one of two problems is the cause:
The LEC received the incorrect NSAP address for the LES from the LECS - The
LEC again jumped off in the weeds and called the wrong device, only to receive an
I don't know what you're talking about! response. Use the show lane database
command on the LECS to verify LES NSAP addresses for the appropriate ELAN. If
necessary, correct the LECS database.
The LEC was given the right NSAP address for the LES, but the LES is down -
The LEC called the correct device, but the device was not a functioning LES. Verify
the problem with the show lane server command on the LES (make certain that the
LES is operational). If necessary, adjust the LES configuration parameters.
It is very uncommon to have a problem with only the Multicast Send and Multicast Forward
VCs on Cisco equipment. Such behavior suggests a problem with the BUS: either the LES
provided the wrong NSAP for the BUS or the BUS was not active. Because Cisco requires
that the LES and BUS be co-located, the BUS should always be reachable if the LES is
reachable. However, if you are using a third-party device for the LES and BUS, it is possible
to run into this problem.
If you are still not able to successfully join the ELAN, consult the more detailed
information given in Chapter 16, Troubleshooting, which explores additional
troubleshooting techniques.
Table 9-5 BUS Performance by Hardware Platform (in Kilopackets per Second)
Platform KPPS
Catalyst 5000 OC-12 LANE Module 500+
Catalyst 5000 OC-3 LANE Module 125
ATM PA (in 7500 and 7200) 70
Catalyst 3000 50
4700 NMP-1A 41
LS1010 30
7000 AIP 27
NOTE Although Catalyst LANE modules provide the highest throughput in Cisco's product line,
routers with very powerful CPUs (such as the 7500 RSP4) can provide faster recovery during
network failover (the additional CPU capacity allows them to build VCs more quickly). In
general, this is less important than the throughput numbers presented in Table 9-5.
SSRP
The LANE 1.0 spec released in January 1995 did not provide for a server-to-server protocol
(known as the LNNI [LANE Network-Network Interface] protocol). Because this
limitation prohibited multiple servers from communicating and synchronizing information
with each other, the ATM Forum had a simple solution: allow only a single instance of each
type of server. The entire network was then dependent on a single LECS, and each ELAN
was dependent on a single LES and BUS. Because this obviously creates multiple single
points of failure, most ATM vendors have created their own redundancy protocols. Cisco's
protocol is known as the Simple Server Redundancy Protocol (SSRP).
SSRP allows for an unlimited number of standby LECS and LES/BUS servers, although
one standby of each type is most common. It allows the standby servers to take over if the
primary fails, but it does not allow multiple active servers. One of the primary benefits of
SSRP is that it is interoperable with most third-party LECs.
An additional benefit of SSRP is that it is easy and intuitive to configure. To provide for
multiple LECSs, you simply need to configure multiple atm lecs-address-default
commands on every ATM switch. To provide for multiple LES/BUSs for every ELAN, just
add multiple name ELAN_NAME server-atm-address NSAP commands to the LECS
database. For example, the commands in Example 9-16 configure two LECSs on a LS1010
ATM switch.
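A hedged sketch of this style of ATM switch configuration follows (the NSAP addresses and sequence numbers are illustrative placeholders, not the book's actual Example 9-16):

```
! Illustrative LS1010 configuration: two LECS NSAPs, primary listed first
atm lecs-address-default 47.0091.8100.0000.1111.2222.3333.0000.0c11.1111.00 1
atm lecs-address-default 47.0091.8100.0000.aaaa.bbbb.cccc.0000.0c22.2222.00 2
```

The trailing integer sets the order in which the switch offers the addresses, which is why the entry order matters.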
If the first LECS is available, the LS1010 always provides this address to LECs via ILMI.
However, if the first LECS becomes unavailable, the ATM switch just returns the second
LECS NSAP to the LEC. Other than an address change, the LEC doesn't even know the
first LECS failed.
CAUTION You must be certain to enter redundant LECS addresses in the same order on all ATM
switches.
The commands in Example 9-17 configure redundant LES/BUS servers for ELAN1
and ELAN2.
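A sketch of what such a database configuration might look like (the database name and NSAPs below are illustrative assumptions, not the book's Example 9-17):

```
! Illustrative LECS database: for each ELAN, the primary LES/BUS NSAP
! is listed first and the backup second
lane database BACKBONE_DB
 name ELAN1 server-atm-address 47.0091.8100.0000.1111.2222.3333.0000.0c11.1111.01
 name ELAN1 server-atm-address 47.0091.8100.0000.aaaa.bbbb.cccc.0000.0c22.2222.01
 name ELAN2 server-atm-address 47.0091.8100.0000.1111.2222.3333.0000.0c11.1111.02
 name ELAN2 server-atm-address 47.0091.8100.0000.aaaa.bbbb.cccc.0000.0c22.2222.02
```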
If the first LES listed is available, the LECS returns this address to the LEC in the
LE_CONFIGURE_REPLY. If the first LES fails, the LECS returns the second LES NSAP
when the LEC tries to rejoin. Again, the LEC isn't aware of the change (other than it was
given a different address).
CAUTION Make certain that all of the LECS databases are identical.
Dual-PHY
Cisco offers Dual-PHY on all of the high-end Catalyst LANE modules (such as those for the
Catalyst 5000 and 6000). Although these cards still have only a single set of ATM and AAL
logic, they provide dual physical layer paths for redundant connections. This allows the
LANE card to connect to two different ATM switches (in case one link or switch fails), but
only one link can be active at a time. You can configure which link is preferred with the atm
preferred phy command.
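Assuming the command takes effect at the interface level, the preference might be set as follows (a sketch; verify the exact syntax against your software release):

```
! Illustrative: prefer PHY A on a dual-PHY LANE module
interface ATM0
 atm preferred phy A
```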
Although this feature is of great value in large networks, it does introduce a subtle
configuration change. Notice the configuration in Figure 9-27.
[Figure 9-27 shows a Catalyst LANE module (ESI 0000.0012.3456.01) with PHY A and
PHY B connected to two ATM switches whose prefixes are 47.0091.8100.0000.1111.2222.3333
and 47.0091.8100.0000.AAAA.BBBB.CCCC.]
TIP Using SSRP with Dual-PHY connections generally requires careful planning of the order
of SSRP entries.
For example, Figure 9-28 illustrates two LES/BUSs linked to two LS1010s through dual-
PHY connections.
Figure 9-28 Redundant LES/BUSs Connected via Dual-PHY to Two ATM Switches
[Figure content: LES/BUS devices SW-1 (ESI 0000.0012.3456.01) and SW-2 (ESI
0000.00AB.CDEF.01), each with PHY A and PHY B links to LS-1010_A (prefix
47.0091.8100.0000.1111.2222.3333) and LS-1010_B (prefix 47.0091.8100.0000.AAAA.BBBB.CCCC).]
The LECS database should contain four name ELAN_NAME server-atm-address NSAP
commands for each ELAN. Look at Example 9-18.
the ELAN. However, a few seconds later, Port B on SW-1 completes the ILMI address
registration process and becomes active. Now that the second LES in the database is active,
Port A on SW-2 reverts to backup mode, causing all of the LECs to again be thrown out of
the ELAN. When the LECs try again to rejoin, the LECS sends them to Port B on SW-1.
This unnecessary backtracking can be avoided by carefully coding the order of the LECS
database as in Example 9-19.
The configuration in Example 9-19 offers faster failover because the second device listed
completes address registration before Port A on SW-1 fails. This allows the second LES to
be immediately available without backtracking.
One of the advantages of using the well-known NSAP is that it does not require any
configuration at all on the LS1010 ATM switch. Because the LECs automatically try the
well-known NSAP if the hard-coded NSAP and ILMI options fail, the LEC reaches the
LECS with minimal configuration.
Although the well-known NSAP is potentially easier to configure in small networks, Cisco
recommends the ILMI approach because it provides better support for SSRP server
redundancy. There is a problem if multiple LECS servers try to use the well-known NSAP
for LECS discovery: both servers try to register the same 47.0079 address! With ILMI
and SSRP, every device registers a unique NSAP address, avoiding any PNNI routing
confusion.
If you run into a situation where you need to run SSRP and also support the well-known
NSAP (this can be the case if you are using third-party LECs that do not support the ILMI
LECS discovery method), there is a solution. First, configure all of the LECS devices with
the lane config fixed-config-atm-address command. Next, be sure to also configure every
LECS for ILMI support with the lane config auto-config-atm-address command. This
allows SSRP to function properly and elects only a single primary LECS. Because the
backup LECS servers do not register the well-known NSAP, only the primary originates the
47.0079 route into PNNI. You now have the resiliency of SSRP combined with the
simplicity and widespread support of the well-known NSAP!
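A sketch of the resulting configuration on each LECS (the interface and database names here are assumptions):

```
! Illustrative LECS configuration supporting both discovery methods:
! the ILMI-registered NSAP (for SSRP) plus the well-known NSAP
interface ATM0
 lane config auto-config-atm-address
 lane config fixed-config-atm-address
 lane config database BACKBONE_DB
```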
This creates a PVC using VPI 0 and VCI 50 and then binds it to VLAN 1. PVCs can be
useful in small networks, especially when connecting two Catalyst LANE blades back-to-back
without using an ATM switch. Another advanced feature offered by the Catalyst is
PVC-based traffic shaping. This allows you to set a peak bandwidth level on a particular
circuit. It can be very useful for controlling bandwidth across expensive wide-area links.
This feature requires you to load a special image into flash on the LANE module and does
not work with SVCs. For example, Example 9-22 adds traffic shaping to the configuration
in Example 9-21.
The final parameter on the atm pvc command limits the traffic to a peak rate of 11 Mbps.
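An illustrative combination of the two ideas might look like the following (the VCD, encapsulation, and VLAN-binding syntax are assumptions to be checked against your LANE module image, not the book's Examples 9-21 and 9-22):

```
! Illustrative: PVC 0/50 bound to VLAN 1, shaped to an 11-Mbps peak
interface ATM0
 atm pvc 1 0 50 aal5snap 11000
 atm bind pvc vlan 1 1
```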
Exercises
This section includes a variety of questions and hands-on lab exercises. By completing
these you can test your mastery of the material included in this chapter as well as help
prepare yourself for the CCIE written and lab tests.
Review Questions
1 What are the three layers of the ATM stack? What does each do?
2 What is the difference between ATM and SONET?
3 What is the difference between a Catalyst with two LANE modules and a
two-port ATM switch?
4 What is the difference between a VPI, a VCI, and an NSAP? When is each used?
5 Assume you attached an ATM network analyzer to an ATM cloud consisting of one
LS1010 ATM switch and two Catalysts with LANE modules. What types of cells
could you capture to observe VPI and VCI values? What type of cells could you
capture to observe NSAP addresses?
6 What are the three sections of an NSAP address? What does each part represent?
7 How do Catalysts automatically generate ESI and selector byte values for use
with LANE?
8 What is the five-step initialization process used by LANE Clients to join an ELAN?
9 What are the names of the six types of circuits used by LANE? What type of trafc
does each carry?
10 What is the difference between an IP ARP and an LE_ARP?
11 In a network that needs to trunk two VLANs between two Catalysts, how many LECs
are required? How many LECSs? How many LESs? How many BUSs?
12 If the network in Question 11 grows to ten Catalysts and ten VLANs, how many
LECs, LECSs, LESs, and BUSs are required? Assume that every Catalyst has ports
assigned to every VLAN.
13 Trace the data path in Figure 9-26 from an Ethernet-attached node in VLAN 1 on Cat-A
to an Ethernet-attached node in VLAN 2 on Cat-B. Why is this inefficient?
Hands-On Lab
Build a network that resembles Figure 9-29.
[Figure 9-29 shows the lab topology with Cat-A and Cat-B connected through an ATM network.]
Table 9-6 shows the LANE components that should be congured on each device.
When you are done building the network, perform the following tasks:
• Test connectivity to all devices.
• Turn on debug lane client all and ping another device on the network (you might need
to clear the Data Direct if it already exists). Log the results.
• With debug lane client all still running, issue shut and no shut commands on the atm
major interface. Log the results.
• Examine the output of the show lane client, show lane config, show lane server,
show lane bus, and show lane database commands.
• Add SSRP to allow server redundancy.
• If you have multiple ATM switches, add dual-PHY support (don't forget to update
your SSRP configurations).
This chapter covers the following key topics:
• Why Two ATM Modes?—Describes the relationship between LANE and MPOA, and
discusses the choices of when to use one as opposed to the other.
• MPOA Components and Model—Provides an overview of MPOA including the
various components defined and utilized by MPOA, their relationship with each other,
and how they interact to support MPOA. Various traffic flows for management and user
data are also described.
• MPOA Configuration—Details the commands to enable MPOA on an MPOA-capable
Catalyst LANE module and on a Cisco MPOA-capable router. This details the
configuration for both the Multiprotocol Server (MPS) and the Multiprotocol
Client (MPC).
• Sample MPOA Configuration—This section puts it all together, and shows a sample
network and the supporting MPOA configurations.
• Troubleshooting an MPOA Network—Offers guidelines for getting MPOA
functional. It provides steps for ensuring that the MPOA components are operational,
and provides insight for using MPOA debug.
CHAPTER 10
Trunking with Multiprotocol Over ATM
Catalysts rarely stand alone. More often than not, they interconnect with other Catalysts.
Chapter 8, "Trunking Technologies and Applications," discussed various methods for
interconnecting Catalysts. Methods range from multiple links, each dedicated to a
single VLAN, to a variety of trunk technologies. Fast Ethernet, Gigabit Ethernet, FDDI, and
ATM technologies are all suitable for trunking Catalysts. Within ATM, you can use LAN
Emulation (LANE) or Multiprotocol over ATM (MPOA) data transfer modes to trunk
Catalysts. Chapter 9, "Trunking with LAN Emulation," described the LANE option. This chapter
focuses on MPOA in a Catalyst environment: specifically, when to use MPOA, its
components, and configuring and troubleshooting MPOA.
Further, consider what happens when a LEC in one ELAN needs to communicate with a
LEC in another ELAN. The traffic must pass through a router because each ELAN defines
a unique broadcast domain. ELANs interconnect through routers, just like VLANs.
Frames traveling from a source LEC in one ELAN to a destination LEC in a second ELAN
must traverse routers, as discussed in Chapter 9. If a frame passes through multiple routers
to get from the source LEC to the destination LEC, multiple data direct circuits must be
established so routers can pass the frame from one ELAN to another. A source LEC must
establish a data direct circuit to its default gateway, another data direct must be established
from that router to the next router, and so on to the final router. The last router must establish
a data direct to the destination LEC. Consider the network in Figure 10-1. When an Ethernet
frame from VLAN 1 (destined for VLAN 2) hits the switch, the switch LEC segments the
frame into cells and passes it to the default router LEC (Router 1). The default router LEC
reassembles the cells into a frame, and routes it to the LEC on the next ELAN. This LEC
segments the data and forwards the frame (cells) to the next hop router LEC (Router 2) over
ATM. When the cells hit Router 2's ATM interface, Router 2 reassembles the cells, routes
the frame, and then segments the frame before it can pass the frame to Router 3. Router 3
reassembles the cells, routes the frame, segments the frame into cells, and forwards them
to the destination Catalyst. This Catalyst reassembles the cells into a frame and passes the
frame onto the destination Ethernet segment.
Each hop through a router introduces additional latency and consumes routing resources
within each router. Some of the latency stems from the segmentation/reassembly process.
Another latency factor includes the route processing time to determine the next hop. This
element can be less significant in routers that do hardware routing (as opposed to legacy
software-based routers).
[Figure 10-1 presents a logical view: ELAN 1 through ELAN 4 interconnected by Router 1,
Router 2, and Router 3, carrying traffic between VLAN 1 and VLAN 2.]
The hop-by-hop approach was necessary when networks interconnected with shared media
systems such as Ethernet. Physical connections force frames to pass through a router
whenever a device in one network wants to connect to a device in another network. LANE
maintains this model. It honors the rule that devices in different networks must interconnect
through a router. MPOA, however, creates a virtual circuit directly between two devices
residing in different ELANs.
MPOA, in fact, bypasses intermediate routers but gives the appearance that traffic from a
device in one ELAN passes through routers to reach the destination device in another ELAN.
To get from one VLAN to another over ATM can require communication across several
ELANs. The following sections compare how traffic flows within an ELAN as opposed to
when it needs to cross more than one ELAN.
Intra-Subnet Communications
When Catalysts need to communicate with each other within the same ELAN, the Catalysts
use LANE. Intra-subnet (or intra-ELAN) communications occur whenever devices in the
same broadcast domain trunk to each other. The Catalyst LANE module was exclusively
designed to support LANE operations with high performance to handle flows at LAN rates.
Details for intra-subnet communications are described in Chapter 9.
Inter-Subnet Communications
As discussed earlier, occasions arise where hosts in different VLANs need to communicate
with each other over ATM. VLANs have similarities to ELANs on the ATM network.
VLANs describe broadcast domains in a LAN environment, whereas ELANs describe
broadcast domains in an ATM environment. Whenever hosts in one VLAN need to
communicate with hosts in another VLAN, the traffic must be routed between them. If the
inter-VLAN routing occurs on Ethernet, Multilayer Switching (MLS) is an appropriate
choice to bypass routers. If the routing occurs in the ATM network, MPOA is a candidate
to bypass routers. MLS and MPOA both have the same objective: to bypass routers. One
does it in the LAN world (VLANs), the other does it in the ATM world (ELANs).
This chapter details the inter-VLAN/inter-ELAN communications performed with MPOA.
Without MPOA, the traffic follows the hop-by-hop path described earlier in the chapter.
[Figure: The MPOA/NHRP model. Ingress and egress MPCs and MPSs attach to ELANs.
MPOA resolution requests and replies flow between the ingress MPC and the ingress MPS;
NHRP resolution requests and replies flow between the NHSs in the ingress and egress
MPSs; MPOA cache imposition requests and replies flow between the egress MPS and the
egress MPC; and a shortcut connects the ingress MPC directly to the egress MPC.]
Note also the presence of LANE components. MPOA depends upon LANE for intra-ELAN
communications. Communication between a Multiprotocol Client (MPC) and a
Multiprotocol Server (MPS) occurs over an ELAN. Communication between adjacent Next
Hop Servers (NHSs), another MPOA component discussed later in the section on Next Hop
Resolution Protocol (NHRP), also occurs over ELANs. Finally, MPSs also communicate
over ELANs. Additionally, if frames are sent between MPCs before a shortcut is
established, the frames transit ELANs. The objective of MPOA is to ultimately circumvent
this situation by transmitting frames over a shortcut between MPCs. However, the shortcut
is not immediately established. It is the responsibility of the ingress MPC to generate a
shortcut request with the egress MPC as the target. The ingress MPC request asks for the
egress MPC's ATM address from the ingress MPS. The ingress MPS receives the shortcut
request from the ingress MPC and resolves the request into a Next Hop Resolution Protocol
(NHRP) request. NHSs forward the request to the final NHS, which resides in the egress
MPS. The egress MPS resolves the request, informs the egress MPC to expect trafc from
the ingress MPC, and returns the resolution reply to the ingress MPC.
In this process, three kinds of information flows exist: configuration, inbound/outbound,
and control.
MPOA components acquire configuration information from the LECS. LANE version 2
defines this configuration flow. Both the MPC and the MPS can obtain configuration
parameters from the LECS. Alternatively, they can get configurations from internal
statements.
Inbound and outbound flows occur between the MPCs and the MPSs. The inbound flow
occurs between the ingress MPC and MPS, whereas the outbound flow occurs between the
egress MPC and MPS. Inbound and outbound are defined from the perspective of the
MPOA cloud.
MPOA defines a set of control flows used to establish and maintain shortcut information.
Control flows occur over ELANs between adjacent devices. Control flows as defined by
MPOA include:
• MPOA Resolution Request—Sent from the ingress MPC to the ingress MPS.
• MPOA Resolution Reply—Sent from the ingress MPS to the ingress MPC.
• MPOA Cache Imposition Request—Sent from the egress MPS to the egress MPC.
• MPOA Cache Imposition Reply—Sent from the egress MPC to the egress MPS.
• MPOA Egress Cache Purge Request—Sent from the egress MPC to the
egress MPS.
• MPOA Egress Cache Purge Reply—Sent from the egress MPS to the egress MPC.
• MPOA Keep-Alive—Sent from an MPS to an MPC.
• MPOA Trigger—Sent from an MPS to an MPC. If an MPS detects a flow from an
MPC, the MPS issues a request to the MPC to issue an MPOA resolution request.
• NHRP Purge Request—Sent from the egress MPC to the ingress MPC.
• NHRP Purge Reply—Sent from the ingress MPC to the egress MPC.
Another approach to describing the control flows categorizes the flows by the components
that exercise them. Flows between an MPC and an MPS manage the MPC cache. These
include the MPOA resolution request/reply and the MPOA cache imposition request/reply.
The MPC/MPS control flows communicate over the common ELAN.
Control flows between MPCs include the MPOA egress cache purge request and reply. This
control flow occurs over the shortcut, which normally carries only user data. It is used to
eliminate cache errors in the ingress MPC. When the egress MPC detects errors, it sends
the purge request to the ingress MPC, forcing the ingress MPC to reestablish its cache
information.
Control flows also exist between MPSs. However, these are defined by the internetwork
layer routing protocols and NHRP. MPOA does not define any new control flows
between MPSs.
The control flow list does not define the actual sequence of the messages. Figure 10-3
shows two MPCs and MPSs interconnected in an ATM network.
[Figure 10-3: The message sequence between two MPCs and two MPSs in an ATM
network. The ingress MPC sends an MPOA resolution request to the ingress MPS; NHRP
resolution requests travel NHS-to-NHS toward the egress MPS; NHRP resolution replies
return to the ingress MPS, which passes the reply to the ingress MPC; the ingress MPC
then opens a shortcut to the egress MPC.]
MPS
The Multiprotocol Server (MPS) interacts with NHRP on behalf of the Multiprotocol
Client (MPC). An MPS always resides in a router and includes a full NHS. The MPS works
with the local NHS and consults the local routing tables to resolve shortcut requests from
an MPC. Figure 10-4 illustrates the components of an MPS.
[Figure 10-4: An MPS resides in a router alongside an NHS and the routing processes;
its LANE client(s) and MPOA client(s) attach to ELAN(s) on the ATM side.]
The MPS has a set of interfaces attached to the ATM cloud and at least one interface for
internal services. The external connections pointing to the ATM cloud consist of LANE
client(s) and an MPS interface. The LANE clients support the MPOA device discovery
protocol described later, and the actual flow of data before the shortcuts are established. The
MPS also uses the LEC to forward resolution requests to the next NHS in the system. The
service interface interacts with internal processes such as the router processes and the NHS
to facilitate MPOA resolution requests and replies.
When an MPC detects an inter-ELAN flow, the ingress MPC issues a shortcut request to
the MPS, asking if there is a better way to the target MPC. The ingress MPS translates the
MPC's request into an NHRP request, which is forwarded to the egress MPS. The egress
MPS resolves the request and performs a couple of other activities. These activities include
cache imposition to the egress MPC and resolution reply back toward the ingress MPC.
MPC
In most cases, an MPC detects an inter-ELAN flow and subsequently initiates an MPOA
resolution request. The MPC detects inter-ELAN flows by watching the internetwork layer
destination address. When the destination and source network addresses differ, the MPC
identifies a candidate flow. The MPC continues to transmit using hop-by-hop Layer 3
routing. But, it counts the number of frames transmitted to the target. If the frame count
exceeds a threshold configured by the network administrator (or default values), the MPC
triggers an MPOA resolution request to an appropriate MPS. The threshold is defined by
two parameters: the number of frames sent and the time interval.
When the ingress MPC receives a resolution reply from the ingress MPS, the MPC can then
establish a shortcut to the target MPC. Additional frames between the ingress and egress
MPCs flow through the shortcut, bypassing the routers in the default path.
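On Cisco devices, this frame-count/time-interval threshold is typically set in the MPC definition; a hedged sketch follows (the client name and the values of 10 frames in 1 second are assumptions, so verify the command names against your IOS release):

```
! Illustrative MPC definition: request a shortcut after 10 frames
! to the same destination within 1 second
mpoa client config name MYMPC
 shortcut-frame-count 10
 shortcut-frame-time 1
```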
MPOA identifies two types of MPC devices: a host device and an edge device. They differ
in how they originate data in the MPOA system. The sections that follow discuss these two
types of MPOA devices in greater detail.
[Figure: The MPC host device. Higher layers pass traffic to the internal MPC service
interface; the MPC's external interfaces are its LEC(s) and its shortcut(s) into the ATM
network.]
Like the MPS, the MPC host device has internal and external interfaces. The external
interfaces include the LEC and the MPC. The MPC communicates to the LES through the
LEC to detect MPOA neighbors. Also, traffic transmissions that are initiated before a
shortcut is established will pass through the LEC.
The MPC interface, on the other hand, is used for shortcuts. When the MPC detects a flow,
it issues a resolution request through its MPC interface. The MPC receives the resolution
reply through the MPC interface too. After the MPC establishes a shortcut, the MPC
interface becomes the origination point of the data circuit to the other MPC.
When you enable MPOA, all outbound traffic is forced through the MPC, whether or not a
shortcut exists. The MPC internal service interface accepts the host's outbound traffic and
passes it through the MPC. This enables the MPC to watch for flows so that shortcuts can
be established as needed.
[Figure: The MPC edge device. It mirrors the host device, except that LAN connections
feed the service interface through bridging processes.]
The MPC edge device is nearly identical to the MPC host device, except that its service
interface connects to bridging processes. This happens because the edge device is
facilitating connectivity of non-ATM capable devices onto the ATM network. These non-
ATM capable devices are connected into the MPOA environment through bridged
interfaces in the MPC edge device.
[Figure: The default routed path versus the shortcut. Frames initially hop from router to
router across the ELANs; after resolution, the shortcut (step 8) connects the ingress and
egress MPCs directly.]
(1) Before the ingress MPC requests a shortcut, the MPC forwards frames through the LEC
interface to the ATM cloud to the MPS. The MPS receives the flow on its LEC interface,
performs routing, and (2) forwards the frame to the next MPS. This continues until the frame
reaches the egress MPS, where the frame is forwarded (3) over the ELAN to the egress MPC.
Until the ingress MPC establishes a shortcut, all frames pass through LECs at each device.
When the ingress MPC detects a flow that exceeds the configured threshold levels (# of
frames/time), the MPC issues an MPOA resolution request (4) through the MPC control
interface to the ingress MPS. The ingress MPS forwards the request (5) to the next NHS,
which may or may not reside in an MPS. The resolution request continues to be forwarded
until it reaches the device that serves as the egress MPS. Note that the resolution request
propagates through LANE clients. Resolution replies propagate back to the ingress MPS (6)
through LANE clients. The ingress MPS forwards the reply (7) to the ingress MPC through
the MPOA control circuit. Then the ingress MPC establishes a shortcut (8) to the egress
MPC. The shortcut is established to the MPC interface, not the LEC interface. Subsequent
data frames stop transiting the LEC interfaces and pass through the MPC interface directly
to the egress MPC.
In summary, intra-ELAN sessions flow through LANE clients, whereas inter-ELAN flows
pass through the MPC interfaces.
LANE V2
In July of 1997, the ATM Forum released LANE version 2, which introduces enhancements
over version 1. MPOA depends upon some of the enhancements to support MPOA
operations. For example, one of the enhancements was the addition of the elan-id. MPOA
uses the elan-id to identify what broadcast domains (ELANs) MPOA devices belong to.
This is expected behavior in MPOA. In a non-MPOA environment, LECs use the elan-id
value to filter traffic from other ELANs. If a LEC in one ELAN somehow obtains the NSAP
of a LEC in another ELAN, the LEC can issue a connection request. However, because they
are in different ELANs, the receiving ELAN can (and should) reject the connection request.
Why? Because they are in different ELANs, they also belong to different subnetworks. A
direct connection between them, then, is illegal outside of the scope of MPOA. Another
LANEv2 enhancement supports neighbor discovery. When a LEC registers with the LES,
it reports the MPOA device type associated with it. For example, if the LEC is associated
with an MPC, the LEC informs the LES that the LEC serves an MPC. LECs associated with
MPSs also report to the LES. The MPOA devices can then interrogate the LES to discover
any other MPOA devices on the ELAN.
NHRP
Frequently, networks are described as having logically independent IP subnets (LISs). In
legacy LANs, devices on each segment normally belong to a different IP subnet and
interconnect with devices in other subnets through routers. The physical construction of the
networks forces traffic through the routers. In an ATM environment, a hard physical
delineation doesn't exist. Connections exist whenever one ATM device requests the ATM
network to create a logical circuit between the two devices. In a LAN environment, devices
in the same subnet usually are located within a close proximity of each other. But in the
ATM network, the devices can be located on opposite sides of the globe and still belong to
the same LIS. In Figure 10-8, several LISs exist in an ATM network. But the illustration
provides no hint as to the geographical proximity of devices in the system.
[Figure 10-8: An ATM network containing several LISs: 10.0.0.0, 11.0.0.0, and 12.0.0.0.]
NHRP describes the attributes of each LIS in an NBMA network. Quoting RFC 2332
(NHRP):
• All members of an LIS have the same IP network/subnet number and address mask.
• All members of an LIS are directly connected to the same NBMA subnetwork.
• All hosts and routers outside of the LIS are accessed via a router.
• All members of an LIS access each other directly (without routers).
Whenever an IP station in a LIS desires to talk to another IP station in the same LIS, the
source station issues an ARP request to the destination station. If the source desires to
communicate with a destination in another LIS, the source must ARP for a router that is a
member of at least two LISs. A router must belong to the LIS of the source device and the
next hop LIS. In other words, traffic in a LIS flows hop by hop just as it does for legacy
networks interconnected with Layer 3 routers. Routers must, therefore, interconnect
multiple LISs to provide a Layer 3 path between networks.
NHRP identifies another method of modeling stations in an NBMA network. Logical
Address Groups (LAGs) differ from LISs in that LISs forward traffic in a hop-by-hop
manner. Traffic must always be forwarded to another device in the same LIS. LAGs, on the
other hand, associate devices based on Quality of Service (QoS) or traffic characteristics.
LAGs do not group stations according to their logical addresses. Therefore, in a LAG
model, two devices in the same NBMA network can talk directly with each other, even if
they belong to different LISs. MPOA shortcuts interconnect devices belonging to different
LISs, creating a LAG.
NHRP works directly with routing protocols to resolve a shortcut between workstations in
different LISs. The primary NHRP component to do this is the Next Hop Server (NHS),
which interacts with the router to determine next hop information. Each MPS has an NHS
collocated with it. The MPS translates a resolution request to an NHS request. The NHS
interrogates the router for next hop information. If the destination is local, the NHS finishes
its job and reports the egress information to the ingress MPS. If the destination is not local,
the NHS forwards the request to the next NHS toward the destination. The NHS determines
the next NHS based upon the local routing tables. It answers the question, "What is the next
hop towards the destination?" The request is forwarded from NHS to NHS until it reaches
the final NHS, whereupon the egress information is returned to the ingress MPS.
MPOA Configuration
Surprisingly, in spite of all of the background complexity of MPOA, configuring the MPS
and MPC is quite simple. You must first have LANE configured, though. Without proper
LANE configurations, MPOA never works.
NOTE Before attempting to configure MPOA on your Catalyst LANE module, ensure that you
have an MPOA-capable module. The legacy LANE modules do not support MPOA. The
MPOA-capable modules include hardware enhancements to support the MPC functions.
Generally, the LANE module must be a model number WS-X5161 or WS-X5162 for
OC-12, or WS-X5167 or WS-X5168 for OC-3c support.
Although you can configure the MPOA components in any sequence, the following
sequence helps to ensure that the initialization processes acquire all necessary values to
enable the components.
Step 1 Configure the LECS database with the elan-id.
Step 2 Enable the LECS.
A configuration sequence other than that listed does not prevent MPOA from functioning,
but you might need to restart the LANE components so that MPOA can correctly operate.
Specifically, you should ensure that the LEC can acquire the elan-id from the LECS before
you enable the LEC. The elan-id is used by the MPOA components to identify broadcast
domain membership. This is useful when establishing shortcuts.
Cisco implementations of LANE and MPOA use a default NSAP address scheme. Chapter
9 describes the NSAP format in detail. Remember, however, that the NSAP comprises
three parts:
• The 13-byte prefix
• The 6-byte end-station identifier (esi)
• The 1-byte selector (sel)
The show lane default command enables you to see what the NSAP address for each
LANE component will be if you enable that service within that device. Cisco's
implementation of MPOA also uses a default addressing scheme that can be observed with
the show mpoa default command. Example 10-1 shows the output from these two
commands.
Note that the esi portion (highlighted in italics) of the MPS and MPC NSAPs continues to
increment beyond the esi portion of the LECS NSAP address. The selector byte, however, does
not correlate to a subinterface as happens with the LANE components. Rather, the selector
byte indicates which MPS or MPC sources the traffic. A host, edge device, or router can have
more than one MPC or MPS enabled. The selector byte identifies the intended device.
Example 10-2 shows how to configure the LECS database for MPOA.
Every ELAN with MPC components must have an elan-id assigned to it. In Example 10-2, two
ELANs are defined, elan1 and elan2, each with a unique elan-id of 101 and 102, respectively.
The actual value used does not matter, so long as the value is unique to the ATM domain.
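As a rough sketch of what such a database definition looks like (the database name and the LES NSAP addresses below are placeholder assumptions, not values from the text), the elan-id is assigned per ELAN inside the LECS database:

```
! Hypothetical LECS database; the elan-id statements are the point here
lane database MPOA_DB
 name elan1 server-atm-address 47.009181000000009092BF7401.0090AB16B008.01
 name elan1 elan-id 101
 name elan2 server-atm-address 47.009181000000009092BF7401.0090AB16B008.02
 name elan2 elan-id 102
```

Each LEC that joins through this LECS can then learn its ELAN's elan-id at join time, which is what the MPOA components rely on.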
Rather than letting the LECs obtain the elan-id value from the LECS, you can manually
configure the elan-id value in each of the LECs in your network. But this can become
administratively burdensome and is not, therefore, a widely used approach. By having the
elan-id configured in the LECS database, you simplify your configuration requirements by
placing the value in one location rather than many.
Whenever a LANE client connects to the LECS as part of the initialization process, the
LEC acquires the elan-id. This value is then used by the MPS and the MPC during their
initialization processes, and during the shortcut establishment.
Confirm that the LEC acquired the elan-id with the show lane client command. The bold
highlight in Example 10-3 indicates that the Cat-A LEC belongs to the ELAN with an elan-id
value of 101. This value came from the ELAN configuration statement in the LECS database.
Example 10-3 show lane client with ELAN-ID Acquired from LECS
Cat-A#show lane client
LE Client ATM0.1 ELAN name: elan1 Admin: up State: operational
Client ID: 2 LEC up for 13 minutes 42 seconds
ELAN ID: 101
Join Attempt: 1
HW Address: 0090.ab16.b008 Type: ethernet Max Frame Size: 1516
ATM Address: 47.009181000000009092BF7401.0090AB16B008.01
If you enable the LANE components before you put the necessary MPOA statements in the
LECS database, the LANE components do not acquire the elan-id value. In this case, you need
to restart the LEC so it reinitializes and obtains the elan-id. Without the elan-id, the MPS and
MPC cannot establish neighbor relationships. Nor can the egress MPS issue a cache imposition
request, as the elan-id is one of the parameters passed in the request as defined by MPOA.
The MPS issues keepalive messages to the neighbor MPCs at a frequency defined by the
keepalive-time value. This maintains the neighbor relationship between the MPC and the MPS.
Example 10-4 shows a sample global configuration for the MPS. The global configuration
requires you to name the MPS. The name is only locally significant, so you can name it
whatever you want. If you enable more than one MPS in the unit, each must be uniquely
named within the device.
The network-id parameter allows you to prevent shortcuts between LECs on ELANs served
by one MPS and LECs on ELANs served by another MPS. By default, all MPSs belong to
network-id 1. If you have two MPSs, one with a network-id of 1 and the other with a
network-id of 2, LECs associated with network-id 1 cannot develop a shortcut to LECs
associated with network-id 2.
TIP Even if you elect to retain the default values, you must still enter the global configuration
statement mpoa server config name MPS_server_name.
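A minimal MPS configuration consistent with this description might look like the following sketch (the interface numbers, ELAN name, and IP address are placeholder assumptions; the network-id and keepalive-time lines simply restate the defaults discussed above):

```
! Hypothetical router MPS configuration
mpoa server config name MPS_server_name
 network-id 1        ! default; use differing IDs to block shortcuts between MPSs
 keepalive-time 10   ! default keepalive interval toward neighbor MPCs (seconds)
!
interface ATM1/0
 mpoa server name MPS_server_name     ! attach the MPS to the major interface
!
interface ATM1/0.1 multipoint
 lane client mpoa server name MPS_server_name   ! bind this LEC to the MPS
 lane client ethernet elan1
 ip address 1.1.1.1 255.255.255.0
```

The key point is the two-step binding: the MPS attaches to the major interface, and each LEC on a subinterface is then associated with it.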
In Example 10-7, the MPS sees two MPC neighbors. The output displays the virtual circuits
used to communicate with each of the MPCs. These circuits should not experience idle
timeouts and should, therefore, remain open. The default idle timeout for Cisco equipment is
5 minutes. If the device sees no traffic on the circuit for the idle timeout value, it releases the
circuit. However, by default, the MPS issues keepalives to MPCs every 10 seconds. You can
modify this value, but generally you should leave the timers at the default values.
If any of the neighbors are MPSs, the display also presents their NSAP addresses along
with the MPC addresses.
TIP Even if you elect to retain the default value of 10 frames per second, you must still enter
the global configuration statement mpoa client config name MPC_client_name.
Remember that you can bind the LEC to only one MPC. If you bind the MPC to the LEC before
you enable the LEC, you receive a warning indicating that no LEC is configured. You can ignore
this warning as long as you remember to eventually enable a LEC. When you complete the lane
client mpoa command, you can create the LEC on the subinterface. If you create the LEC first,
you can enable the MPC afterwards. In either case, make sure that the LEC acquires the elan-id.
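On the Catalyst LANE module, the corresponding MPC configuration follows the same pattern as the MPS. The sketch below uses placeholder names, interface numbers, and VLAN/ELAN values; the lane client mpoa client name statement performs the LEC-to-MPC binding just described:

```
! Hypothetical Catalyst LANE module MPC configuration
mpoa client config name MPC_client_name
!
interface ATM0
 mpoa client name MPC_client_name       ! attach the MPC to the major interface
!
interface ATM0.1 multipoint
 lane client mpoa client name MPC_client_name   ! a LEC binds to only one MPC
 lane client ethernet 1 elan1                   ! create the LEC afterwards
```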
Suppose that you send an extended ping to an egress MPC with 20 pings in the sequence.
When the ingress MPC sends enough frames to cross the shortcut threshold, it issues the
shortcut request to its neighbor MPS. Assuming that the MPS resolved the shortcut, the
ingress MPC can establish the shortcut and start using it rather than the default path.
But during the extended ping operation, the egress MPC sends an echo reply for each echo
request issued from the ingress MPC. If the egress MPC's shortcut threshold is set to the
same value as (or less than) the ingress MPC's, the echo replies cause the egress MPC to
issue a shortcut request too. Ultimately, both the ingress and egress MPCs develop shortcuts
between each other as illustrated by Example 10-12.
The MPCs use only one of the shortcuts, though. The shortcut established by the MPC with
the lowest NSAP address is used by both clients. The ingress MPC of Example 10-12 has the
lowest MAC address and should be the one used by the two MPCs. Issuing the show atm
vc 49 command confirms that the local device originated virtual circuit (VC) 49 (see
Example 10-13).
VC 47 eventually times out and disappears. Until a device tears down the circuit, both
circuits remain in place, but only one is used. Note that the traffic counters for VC 47 show
zero frames sent or received. Whether or not this happens is a function of your traffic flows.
Not all application/protocol flows create a one-for-one reply-response exchange and
therefore might create only one VC.
Figure: MPOA example network. A 7204 router hosts the MPS, the LECS, and the LESs for both ELANs. Elan 1 uses elan-id 101 and addresses 1.1.1.1 and 1.1.1.2; Elan 2 uses elan-id 102 and addresses 1.1.2.1 and 1.1.2.2.
The MPCs reside inside Catalysts equipped with MPOA-capable LANE modules. The
MPS resides in a 7204 router. Each MPC has one LEC enabled. The MPS has two LECs
enabled, one for each of the two ELANs. The LECS and LESs also reside in the 7204 router,
although they could just as easily have been configured in either of the Catalysts.
Example 10-14 shows the relevant configuration statements for Cat-A.
The MPS configuration of Example 10-16 resides in a router and has an IP address
associated with each subinterface. Not shown in this abbreviated output, but vitally
important in the configuration, is a routing protocol configuration. You must have routing
enabled for the MPS/NHS to function correctly.
Troubleshooting an MPOA Network
If the resolution request counter does not increment, the MPC does not see interesting
traffic to trigger a request to an MPS. If the resolution request counter increments, but the
resolution reply counter does not match the request counter, the MPC did not receive a reply
to its request. When this happens, ensure that the MPSs are operational. Also check that a
default path actually exists to the egress MPC. If the default path does not exist, the MPS
cannot resolve a shortcut.
Another method of examining the MPC behavior uses debug. The debug mpoa client
command provides an opportunity to track how the MPC monitors a potential ow and to
determine if the MPC actually triggers an MPOA resolution request. Example 10-18 shows
an abbreviated debug output from an MPC.
. . . . . .
The first two highlighted portions of the output illustrate where the MPC recognized what
is called interesting traffic. Interesting traffic targets a host in another ELAN. Therefore,
the source and destination Layer 3 addresses differ. But the destination MAC address
targets a neighbor ingress MPS. Why does it see a MAC address for the MPS? The MPC
sees a MAC address for the MPS because this is the first router in the default path. The MPC
puts the first-hop router's MAC address in the data link header. The two highlighted
statements are for frames 9 and 10 within the configured one-second period (the first eight
frames were removed for simplicity). There is no indication of the frame count, so you
might need to sort through the debug output to see if at least ten frames were seen. Because
there are ten frames per second, the MPC triggers a resolution request to the ingress MPS.
This is shown in the third highlighted area of Example 10-18.
Eventually, the MPC should receive a resolution reply from the MPS as shown in
Example 10-19.
Ensuring the ELANs Are Functional between the Source and the Destination 445
mandatory part:
src_proto_len 4, dst_proto_len 4, flags 0, request_id 2
src_nbma_addr: 47.009181000001000200030099.0090AB16540D.00
src_prot_addr: 0.0.0.0
dst_prot_addr: 3.0.0.1
cie 0:
code 0, prefix_length 0, mtu 1500, holding_time 1200
cli_addr_tl 20, cli_saddr_tl 0, cli_proto_len 0, preference 0
cli_nbma_addr: 47.009181000001000200030099.0090AB164C0D.00
tlv 0:
type 4097, length 4
data: 15 05 00 01
tlv 1:
type 4096, length 23 compulsory
data: 00 00 00 01 00 00 00 67 0E 00 90 AB 16 4C 08 00 90 AB 16 B0 08 08 00
The middle portion of the debug output displays various items from the MPOA resolution
reply messages. For example, cie refers to a client information element as specified by
MPOA. code 0 means that the operation was successful. Reference the MPOA documents
for decode specifics. The important parts of the debug for the immediate purposes of
troubleshooting are highlighted.
The egress MPC can reject the imposition whenever local resources prevent it from
accepting the entry. For example, the egress MPC might not have enough memory to hold
another cache entry, or it might already have too many virtual circuits
established and cannot support another circuit. Any of these conditions can cause the egress
MPC to reject the imposition, preventing a shortcut from being established.
If you use the MPC debug, you should see a line like that shown at the end of Example
10-19 to confirm that the cache imposition worked successfully.
Review Questions
This section includes a variety of questions on the topic of this chapter: MPOA. By
completing these, you can test your mastery of the material included in this chapter as well
as help prepare yourself for the CCIE written and lab tests.
1 A network administrator observes that the MPC cannot develop a shortcut. An ATM
analyzer attached to the network shows that the MPC never issues a shortcut request,
even though the 10 frames per second threshold is crossed. Why doesn't the MPC
issue a shortcut request? The show mpoa client command displays as shown in
Example 10-20.
2 When might the ingress and egress MPS reside in the same router?
3 What creates the association of an MPC with a VLAN?
4 Example 10-6 has the following configuration statement in it: lane client ethernet
elan_name. Where is the VLAN reference?
5 If a frame must pass through three routers to get from an ingress LEC to an egress
LEC, do all three routers need to be configured as an MPS?
6 Can you configure both an MPC and an MPS in a router?
PART IV: Advanced Features
Chapter 11: Layer 3 Switching
Layer 3 switching is one of the most important but over-hyped technologies of recent
memory. On one hand, vendors have created a labyrinth of names, architectures, and
options that have done little but confuse people. On the other hand, Layer 3 switching
(routing) is one of the most important ingredients in a successful campus design. While
providing the bandwidth necessary to build modern campus backbones, it also provides the
scalability necessary for growth and ease of maintenance.
The goal of this chapter is to clear up any confusion created by competing marchitectures
(marketing architectures). By digging into the details behind Cisco's approach to Layer 3
switching, myth and fact can be separated. The chapter takes a chronological look at inter-
VLAN routing. It begins with a brief discussion of switching terminology and the
importance of routing. It then dives into the first technique commonly used to connect
virtual LANs (VLANs) in a switched environment: the router-on-a-stick design. The
chapter then looks at more integrated approaches such as the Catalyst Route Switch Module
(RSM), followed by a discussion of two hardware-based approaches to Layer 3 switching.
The chapter concludes with coverage of Cisco's Hot Standby Router Protocol (HSRP) and
bridging between VLANs.
Layer 3 switching is a term that encompasses a wide variety of techniques that seek to
merge the benefits of these previously separate technologies. The goal is to capture the
speed of switching and the scalability of routing. In general, Layer 3 switching techniques
can be grouped into two categories:
Routing switches
Switching routers
As a broad category, routing switches use hardware to create shortcut paths through the
middle of the network, bypassing the traditional software-based router. Some routing
switch devices have been referred to as router accelerators. Routing switches do not run
routing protocols such as Open Shortest Path First (OSPF) or Enhanced Interior Gateway
Routing Protocol (EIGRP). Instead, they utilize various techniques to discover, create, or
cache shortcut information. For example, Multiprotocol over ATM was discussed in
Chapter 10, "Trunking with Multiprotocol over ATM." This is a standards-based technique
that allows ATM-attached devices to build a virtual circuit that avoids routers for sustained
flows of information. Although Cisco obviously supports MPOA, it has developed another
shortcut technique that does not require an ATM backbone. This feature is called Multilayer
Switching (MLS), although many people (and Cisco documents) still refer to it by an earlier
name, NetFlow LAN Switching (NFLS). MLS is discussed in detail during this chapter.
WARNING Do not confuse MLS with other shortcut Layer 3 switching techniques that are not
standards-compliant (many of these use the term cut-through switching). Many of these
other techniques quickly switch the packets through the network without making the
necessary modifications to the packet (such as decrementing the TTL field and rewriting
the source and destination MAC addresses). MLS makes all of the same modifications as a
normal router and is therefore completely standards-compliant.
Unlike routing switches, switching routers do run routing protocols such as OSPF. These
operations are typically run on a general-purpose CPU as with a traditional router platform.
However, unlike traditional routers that utilize general-purpose CPUs for both control-plane and
data-plane functions, Layer 3 switches use high-speed application-specific integrated circuits
(ASICs) in the data plane. By removing CPUs from the data-forwarding path, wire-speed
performance can be obtained. This results in a much faster version of the traditional router.
Switching routers such as the Catalyst 8500 are discussed in more detail later in this chapter.
Although the terms routing switch and switching router seem arbitrarily close, the terms are
actually very descriptive of the sometimes subtle difference between these types of devices.
For example, in the case of routing switch, switch is the noun and routing is the adjective
(you didn't know you were in for a grammar lesson in this chapter, did you?). In other
words, it is primarily a switch (a Layer 2 device) that has been enhanced or taught some
routing (Layer 3) capabilities. In the case of a switching router, it is primarily a router
(Layer 3 device) that uses switching technology (high-speed ASICs) for speed and
performance (as well as also supporting Layer 2 bridging functions).
The Importance of Routing
TIP Routing switches are Layer 2-oriented devices that have been enhanced to provide Layer 3
(and 4) functionality. On the other hand, switching routers are primarily Layer 3 devices
that can also do Layer 2 processing (like any Cisco router).
Of the variety of other switching devices and terminology released by vendors, Layer 4 and
Layer 7 switching have received considerable attention. In general, these approaches refer
to the capability of a switch to act on Layer 4 (transport layer) information contained in
packets. For example, Transmission Control Protocol (TCP) and User Datagram Protocol
(UDP) port numbers can be used to make decisions affecting issues such as security and
Quality of Service (QoS). However, rather than being viewed as a third type of campus
switching devices, these should be seen as a logical extension and enhancement to the two
types of switches already discussed. In fact, both routing switches and switching routers
can perform these upper-layer functions.
TIP You should build routing (Layer 3 switching) into all but the smallest campus networks. See
Chapters 14, Campus Design Models, and 15, Campus Design Implementation, for
more information.
Router-on-a-Stick
Early VLAN designs relied on routers connected to VLAN-capable switches in the manner
shown in Figure 11-1.
Figure 11-1 Router-on-a-Stick Design (a router attached by a single link to a Layer 2 backbone carrying the Red and Blue VLANs)
In this approach, traditional routers are connected via one or more links to a switched
network. Figure 11-1 shows a single link, the stick, connecting the router to the rest of the
campus network. Inter-VLAN traffic must cross the Layer 2 backbone to reach the router
where it can move between VLANs. It then travels back to the desired end station using
normal Layer 2 forwarding. This out to the router and back flow is characteristic of all
router-on-a-stick designs.
Figure 11-1 portrays the router connection in a general sense. When discussing specic
options for linking a router to a switched network, two alternatives are available:
One-link-per-VLAN
Trunk-connected router
One-Link-per-VLAN
One of the earliest techniques for connecting a switched network to a router was the use of
one-link-per-VLAN as shown in Figure 11-2.
Figure 11-2 One-Link-per-VLAN Design (ISL trunks interconnect the switches; router interfaces e0, e1, and e2 each connect to one of the Red, Blue, and Green VLANs)
In this case, the switched network carries three VLANs: Red, Blue, and Green. Inter-Switch
Link (ISL) trunks are used to connect the three switches together, allowing a single link to
carry all three VLANs. However, connections to the router use a separate link for every
VLAN. Figure 11-2 illustrates the use of 10 Mbps router ports; however, Fast Ethernet,
Gigabit Ethernet, or even other media such as Asynchronous Transfer Mode (ATM) or
Fiber Distributed Data Interface (FDDI) can be used.
There are several advantages to using the one-link-per-VLAN approach:
It allows existing equipment to be redeployed in a switched infrastructure,
consequently saving money.
It is simple to understand and implement. Network administrators do not have to learn any
new concepts or configuration commands to roll out the one-link-per-VLAN approach.
Because it relies on multiple interfaces, it can provide high performance.
Furthermore, notice that every router interface is unaware of the VLAN infrastructure (they
are access ports). This allows the router to utilize normal processing to move packets
between VLANs. In other words, there is no additional processing or overhead.
Although there are advantages to the one-link-per-VLAN design, it suffers from several
critical flaws:
It can require more interfaces than is practical. In effect, this limits the one-link-per-
VLAN approach to networks carrying fewer than 10 VLANs. Trying to use this model
with networks that carry 15 or more VLANs is generally not feasible because of port-
density and cost limitations.
Although it can initially save money because it allows the reuse of existing equipment,
it can become very expensive as the number of VLANs grows over time. Keep in mind
that every VLAN requires an additional port on both the router and the switch.
It can become difficult to maintain the network over time. Although the one-link-per-
VLAN design can be simple to initially configure, it can become very cumbersome as
the number of VLANs (and therefore cables) grows.
In short, the downside of the one-link-per-VLAN approach can be summarized as a lack of
scalability. Therefore, you should only consider this to be a viable option in networks that
contain a small number of VLANs.
TIP The one-link-per-VLAN model can be appropriate in networks with a limited number
of VLANs.
Example 11-1 presents a possible configuration for the router in Figure 11-2.
The configuration in Example 11-1 provides inter-VLAN routing services for three VLANs:
VLAN 1 is connected to the Ethernet0 interface and is only using the IP protocol.
VLAN 2 is linked to the Ethernet1 interface and uses the IP and IPX protocols.
VLAN 3 is linked to the Ethernet2 interface and supports three network layer
protocols: IP, IPX, and AppleTalk.
Notice that the router is unaware of VLANs directly; it sees the network as three normal
segments.
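A configuration along these lines (all addresses, IPX network numbers, and the AppleTalk cable-range and zone are placeholder assumptions) would be:

```
! Hypothetical one-link-per-VLAN router configuration
interface Ethernet0
 ip address 10.1.1.1 255.255.255.0     ! VLAN 1: IP only
!
interface Ethernet1
 ip address 10.1.2.1 255.255.255.0     ! VLAN 2: IP and IPX
 ipx network 2
!
interface Ethernet2
 ip address 10.1.3.1 255.255.255.0     ! VLAN 3: IP, IPX, and AppleTalk
 ipx network 3
 appletalk cable-range 300-310 300.1
 appletalk zone Vlan3Zone
```

Because each interface is an ordinary access-mode segment, no trunking or VLAN-specific commands appear anywhere in the router configuration.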
Trunk-Connected Routers
As technologies such as ISL became more common, network designers began to use trunk links
to connect routers to a campus backbone. Figure 11-3 illustrates an example of this approach.
Figure 11-3 Trunk-Connected Router
Although any trunking technology such as ISL, 802.1Q, 802.10, LAN Emulation (LANE),
or MPOA can be used, Ethernet-based approaches are most common (ISL and 802.1Q).
Figure 11-3 uses ISL running over Fast Ethernet. The solid lines refer to the single physical
link running between the top Catalyst and the router. The dashed lines refer to the multiple
logical links running over this physical link.
The primary advantage of using a trunk link is a reduction in router and switch ports. Not
only can this save money, it can reduce configuration complexity. Consequently, the trunk-
connected router approach can scale to a much larger number of VLANs than the one-link-
per-VLAN design.
However, there are disadvantages to the trunk-connected router configuration, including the
following:
Inadequate bandwidth for each VLAN
Additional overhead on the router
Older versions of the IOS only support a limited set of features on ISL interfaces
With regard to inadequate bandwidth for each VLAN, consider, for example, the use of a
Fast Ethernet link where all VLANs must share 100 Mbps of bandwidth. A single VLAN
could easily consume the entire capacity of the router or the link (especially if there is a
broadcast storm or Spanning Tree problem).
With regard to the additional overhead on the router caused by using a trunk-connected
router, not only must the router perform normal routing and data forwarding duties, it must
handle the additional encapsulation used by the trunking protocol. Take ISL running on a
7500 router as an example. Cisco's software-based routers have a number of different
switching modes, a term that Cisco uses to generically refer to the process of data
forwarding in a router.
NOTE Don't confuse the term switching here with how it normally gets used throughout this book.
These software-based routers use the term switching to refer to the process of forwarding
frames through the box, regardless of whether the frames are routed or bridged.
Every Cisco router supports multiple forwarding techniques. Although a full discussion of
these is not appropriate for a campus-oriented book, it is easiest to think of them as gears
in an automobile transmission. For example, just as every car has a first gear, every Cisco
router (including low-end routers such as the 2500) supports something called Process
Switching. Process Switching relies on the CPU to perform brute-force routing on each and
every packet. Just as first gear is useful in all situations (uphill, flat roads, rain, snow, dry,
and so on), Process Switching can route all packets and protocols. However, just as first
gear is the slowest in a car, Process Switching is the slowest forwarding technique for a router.
Every Cisco router also has a second gear: this is referred to as Fast Switching. By taking
advantage of software-based caching techniques, it provides faster data forwarding.
However, just as second gear is not useful in all situations (going up a steep hill, starting
away from a traffic signal, and so on), Fast Switching cannot handle all types of traffic (for
example, many types of SNA traffic).
Finally, just as high-end automobiles offer fancy six-speed transmissions, high-end Cisco
routers offer a variety of other switching modes. These switching modes go by names such
as Autonomous Switching, Silicon Switching, Optimum Switching, and Distributed
Switching. Think of these as gears three, four, five, and six (respectively) in a Ferrari's
transmission: they can allow you to move very quickly, but can be useful only in ideal
conditions and very limited situations (that is, dry pavement, a long country road, and no
police!).
Getting back to the example of an ISL interface on a 7500 router, 7500 routers normally use
techniques such as Optimum Switching and Distributed Switching to achieve data
forwarding rates from 300,000 to over 1,000,000 packets per second (pps).
NOTE Several performance figures are included in this chapter to allow you to develop a general
sense of the throughput you can expect from the various Layer 3 switching options. Any
sort of throughput numbers are obviously highly dependent on many factors such as
configuration options, software version, and hardware revision. You should not treat them
as an absolute indication of performance (in other words, your mileage may vary).
However, when running ISL, that interface becomes limited to second-gear Fast Switching.
Because of this restriction, ISL routing is limited to approximately 50,000 to 100,000 pps
on a 7500 (and considerably less on many other platforms).
Some of this limitation is due to the overhead of processing the additional 30-byte ISL
encapsulation. With older interfaces such as the Fast Ethernet Interface Processor (FEIP),
this can be especially noticeable because the second CRC (cyclic redundancy check)
contained in the ISL trailer must be performed in software. In the case of newer interfaces
such as the PA-FE (Fast Ethernet port adapter for 7200 and VIP interfaces) or the FEIP2,
hardware assistance has been provided for tasks such as the ISL CRC. However, even in the
case of the PA-FE and the FEIP2, the Fast Switching limitation remains.
TIP The RSM Versatile Interface Processor (VIP) (the card into which you put port adapters) is
not the same as a 7500 VIP. It is the port adapters themselves that are the same in both
platforms.
Note that switching routers such as the Catalyst 8500s use ASICs to handle ISL and 802.1Q
encapsulations, effectively removing the overhead penalty of trunk links. However, devices
such as the 8500 are rarely deployed in router-on-a-stick configurations. See the section on
8500-style switching routers later in this chapter.
TIP Software-based routers containing Fast Ethernet interfaces, such as the 7500, 7200, 4000,
and 3600, are limited to Fast Switching speeds for ISL operations. ASIC-based routers such
as the Catalyst 8500 do not have this limitation and can perform ISL routing at wire speed.
The third disadvantage of the trunk-connected router design is that older versions of the
IOS only support a limited set of features on ISL interfaces. Although most limitations were
removed in 11.3 and some later 11.2 images, networks using older images need to carefully
plan the inter-VLAN routing in their network. Some of the more significant limitations
prior to 11.3 include the following:
Support for only IP and IPX. All other protocols (including AppleTalk and DECnet)
must be bridged. Inter-VLAN bridging is almost always a bad idea and is discussed
later in the section "Integration between Routing and Bridging."
IPX only supports the novell_ether encapsulation (Novell refers to this as
Ethernet_802.3).
HSRP is not supported. This can make it very difficult or impossible to provide default
gateway redundancy.
Secondary IP addresses are not supported.
TIP ISL interfaces prior to 11.3 (and some later versions of 11.2) only support a limited set of
protocols and features. 11.3+ code addresses all four of the issues mentioned in the
preceding list.
Just as subinterfaces allow each ELAN on a single ATM interface to belong to its own logical
grouping, subinterfaces on Fast Ethernet (or other media) interfaces allow a logical
partition for each VLAN. If the physical interface is FastEthernet1/0 (this is also called the
major interface), subinterfaces can use designations such as FastEthernet1/0.1,
FastEthernet1/0.2, and FastEthernet1/0.3. For example, the configuration in Example 11-2
configures a Fast Ethernet port to perform ISL routing for three VLANs.
TIP Although the router allows the subinterface numbers and VLAN numbers to differ, using
the same numbers provides easier maintenance. For example, configure VLAN 2 on
subinterface X.2 (where X equals the major interface designation).
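A sketch of such a configuration (the IP addresses are placeholder assumptions) matches each subinterface number to its VLAN number:

```
! Hypothetical trunk-connected router configuration using ISL subinterfaces
interface FastEthernet1/0
 no ip address           ! the major interface carries no Layer 3 configuration
!
interface FastEthernet1/0.1
 encapsulation isl 1     ! VLAN 1 on subinterface .1
 ip address 10.1.1.1 255.255.255.0
!
interface FastEthernet1/0.2
 encapsulation isl 2     ! VLAN 2 on subinterface .2
 ip address 10.1.2.1 255.255.255.0
!
interface FastEthernet1/0.3
 encapsulation isl 3     ! VLAN 3 on subinterface .3
 ip address 10.1.3.1 255.255.255.0
```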
The RSM
In the case of the router-on-a-stick design, traffic flows to the router within the source
VLAN, where it is routed into the destination VLAN. This creates an out-and-back flow to
the router. Technically, the Catalyst 5000 RSM uses a very similar flow, but with one
important difference: the stick becomes the Catalyst 5000 backplane (the high-speed
switching path used inside the Catalyst chassis). This difference provides two key benefits:
Speed
Integration
Because the RSM directly connects to the Catalyst 5000 backplane, it allows the router to
be much more tightly integrated into the Catalyst switching mechanics. Not only can this
ease configuration tasks, it can provide intelligent communication between the Layer 2 and
Layer 3 portions of the network (several examples are discussed later in the chapter). Also,
because it provides a faster link than a single Fast Ethernet ISL interface, the performance
can be greater. In general, the RSM provides 125,000 to 175,000 pps for IP and
approximately 100,000 pps for other protocols.
TIP If necessary, more than one RSM can be used in a single Catalyst chassis for additional
throughput.
RSM Configuration
One of the appealing benefits of the RSM is its familiarity. From a hardware perspective, it
is almost identical to an RSP2 (the second version of the Route Switch Processor) from a
Cisco 7500. It has the same CPU and contains the same console and auxiliary ports for out-
of-band configuration. It has its own flash, dynamic random-access memory (DRAM), and
nonvolatile random-access memory (NVRAM). And, because it runs the full IOS, the RSM is
configured almost exactly like any Cisco router.
TIP Although the IOS is identical from a configuration standpoint, do not try to use a 7500
router image on an RSM; the RSM uses its own image sets. Under Cisco's current naming
convention, RSM images begin with the characters c5rsm.
The most obvious modification is a set of dual direct memory access (DMA) connections
to the Catalyst 5000 backplane. (The backplane connection remains 1.2 Gbps even in 3.6
Gbps devices such as the Catalyst 5500.) The status of these two connections is indicated
by the Channel 0 and Channel 1 LEDs on the front panel. Each channel provides 200 Mbps
of throughput for a total of 400 Mbps.
Because the RSM runs the full IOS and contains its own image and memory, it shares some
of the same configuration aspects as the LANE module discussed in Chapter 9. To configure
the RSM, you need to enter the session slot command. For example, to configure an RSM
located in slot 3, you enter session 3. This instantly transports you from the Catalyst world
of set and show commands to the router world of config t. The full range of IOS help and
command-line editing features are available. The RSM also requires you to save your
configuration changes to NVRAM using the copy run start command.
TIP Don't forget to save your RSM configurations with copy run start or write mem. Unlike
the Catalyst Supervisor, the RSM does not automatically save configuration changes.
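Putting the session and save steps together, a minimal workflow for an RSM in slot 3 looks roughly like this (a sketch; the prompts and hostname are illustrative, not taken from the book's examples):

Console> (enable) session 3
Trying Router-3...
Connected to Router-3.
RSM> enable
RSM# config t
RSM(config)# ...make changes...
RSM(config)# end
RSM# copy run start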
Although the session command is the most common way to configure an RSM, the console
and auxiliary ports can be useful in certain situations. Many organizations use the auxiliary
port to connect a modem to the Catalyst. This is especially useful for Supervisors that do
not contain an Aux port (or, in the case of the Catalyst 5000 Supervisor III, where the Aux
port is not enabled).
TIP The session command opens a Telnet session across the Catalyst's backplane. The
destination address is 127.0.0.slot_number + 1. For example, slot 3 uses 127.0.0.4. Some
versions (but unfortunately not all) of the RSM code allow you to enter telnet 127.0.0.2 to
Telnet from the RSM to the Supervisor in slot 1 (or 127.0.0.3 for slot 2). This can be very
useful when accessing the box from a modem connected to the RSM's auxiliary port. If the
code on your RSM does not permit the use of the 127.0.0.X addresses, use normal IP
addresses assigned to both SC0 and an RSM interface. However, this obviously requires a
valid configuration on the Supervisor before you remotely dial into the RSM.
Just as the auxiliary port is useful for connecting a modem to the Catalyst, the RSM's
console port is useful for password recovery operations.
TIP RSM password recovery is identical to normal Cisco router password recovery. See the IOS
System Management documentation for more details.
RSM Interfaces
The RSM uses interfaces just as any Cisco router does. However, instead of using the usual
Ethernet0 and Fast Ethernet1/0, the RSM uses virtual interfaces that correspond to VLANs.
For example, interface vlan 1 and interface vlan 2 can be used to create interfaces for
VLANs 1 and 2, respectively. These virtual interfaces are automatically linked to all ports
configured in that VLAN on the Catalyst Supervisor. This creates a very flexible and intuitive
routing platform. Simply use the set vlan vlan_number port_list command on the Supervisor
to make VLAN assignments at will, and the RSM automatically reflects these changes.
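As a sketch of this pairing (the port list and address are illustrative), assigning ports to VLAN 2 on the Supervisor and creating the matching virtual interface on the RSM might look like:

Console> (enable) set vlan 2 3/1-12

RSM(config)# interface vlan 2
RSM(config-if)# ip address 10.1.2.1 255.255.255.0
RSM(config-if)# no shutdown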
TIP RSMs do not use subinterfaces for VLAN configuration. Instead, the RSM uses virtual
VLAN interfaces (that function as major interfaces). In fact, these VLAN interfaces
currently do not support subinterfaces.
Except for the earliest versions of RSM code, RSM virtual interfaces only become active if
the Supervisor detects active ports that have been assigned to that VLAN. For example, if
VLAN 3 has no ports currently active, a show interface vlan 3 command on the RSM
shows the interface in the down state. If a device in VLAN 3 boots, the RSM's VLAN 3
interface enters the up state. This value-added feature further reflects the tight integration
between the Supervisor and RSM and is useful for avoiding black hole routing situations.
TIP You cannot activate an RSM interface until the corresponding VLAN has one or more
active ports.
This black hole prevention feature can be controlled through the use of the set
rsmautostate [ enable | disable ] Supervisor command. In modern Catalyst images, this
feature is enabled by default.
The RSM contains no VLANs by default. The VLAN virtual interfaces are created as
interface vlan commands are first entered. Each VLAN interface can then be configured
with the addressing and other parameters associated with that VLAN. For example, the
code sample in Example 11-3 creates three interfaces that correspond to three VLANs.
As with the earlier examples, Vlan1 is only used for IP traffic. Vlan2 adds support for IPX
routing, and Vlan3 is running IP, IPX, and AppleTalk services.
TIP RSM interfaces are in a shutdown state when they are first created. Don't forget to use the
no shutdown command to enable them.
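Example 11-3 itself is not reproduced in this extract. Based on the description above, a configuration along these lines would create the three interfaces (all addresses, network numbers, and zone names here are illustrative; IPX and AppleTalk routing are assumed to be enabled globally):

ipx routing
appletalk routing
!
interface Vlan1
 ip address 10.1.1.1 255.255.255.0
!
interface Vlan2
 ip address 10.1.2.1 255.255.255.0
 ipx network 2
!
interface Vlan3
 ip address 10.1.3.1 255.255.255.0
 ipx network 3
 appletalk cable-range 3-3
 appletalk zone Marketing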
failed to reveal the problem. For example, Example 11-4 shows some of the extended ping
options.
Example 11-4 illustrates the use of the repeat count, datagram size, and data pattern
options. Respectively, these can be useful when trying to create a sustained stream of rapid-
fire pings, to probe for maximum transmission unit (MTU) problems, and to detect ones-
density problems on serial links.
However, the most powerful troubleshooting advantage to the RSM is debug. For example,
debug ip icmp or debug ip packet [access-list-number] can be extremely useful when
trying to track down the reason why some ping operation mysteriously fails.
The usual caveats about debug output volumes apply, though. Be very careful, especially
when using commands such as debug ip packet in production networks. It is almost always
advisable to use the access-list-number parameter to very specically limit the amount of
output. Also, because the RSM automatically sends debug output to a connection made via
the session command, the terminal monitor command is not necessary.
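For example, to restrict debug ip packet to a single suspect conversation, define an access list first (a sketch; the addresses are illustrative):

RSM# config t
RSM(config)# access-list 101 permit ip host 10.1.1.2 host 10.1.2.2
RSM(config)# end
RSM# debug ip packet 101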
TIP To make it easier to enter commands while the router is generating debug or informational
output, use the logging synchronous line command. For the RSM, it is most useful to enter
this under line vty 0 4. However, this command can be useful on all of Cisco's router
platforms running 10.2+ code (in which case it should also be applied to line con 0 and
line aux 0).
TIP Be careful when using the HSSI port adapter (typically used for T3 connections) with the
RSM VIP because it can overload the power supplies on some models. Check the current
release notes for the latest list of models that are affected by this.
As discussed earlier, the RSM is a software-based routing device that cannot provide
enough Layer 3 performance for larger campus networks on its own. However, another
appealing benefit to the RSM is that it can be easily upgraded to provide hardware-based
forwarding via MLS, the subject of the next section.
MLS
Multilayer Switching (MLS) is Cisco's Ethernet-based routing switch technology. MLS is
currently supported in two platforms: the Catalyst 5000 and the Catalyst 6000. The Catalyst
5000 makes use of the NetFlow Feature Card (NFFC) I or II to provide hardware-assisted
routing. The Catalyst 6000 performs the same operations using the Multilayer Switch
Feature Card (MSFC) in conjunction with the Policy Feature Card (PFC). In keeping with
the chronological presentation of this chapter, this section focuses on the Catalyst 5000's
implementation of MLS. The Catalyst 6000's Layer 3 capabilities are discussed in the
"Catalyst 6000 Layer 3 Switching" section later in the chapter and in Chapter 18, "Layer 3
Switching and the Catalyst 6000/6500s." Also, although MLS supports both IP and IPX
traffic, this section focuses on IP.
NOTE IPX MLS is supported on all Catalyst 6000s using a Multilayer Switch Feature Card
(MSFC, discussed later in the chapter). IPX MLS is supported on Catalyst 5000s using an
NFFC II and 5.1+ software.
In its most basic sense, the NFFC is a pattern-matching engine. This allows the Catalyst to
recognize a wide variety of different packets. By matching on various combinations of
addresses and port numbers, the routing switch form of Layer 3 switching can be performed.
However, a host of other features are also possible. By matching on Layer 3 protocol type, a
feature called Protocol Filtering can be implemented. By matching on Internet Group
Management Protocol (IGMP) packets, the Catalyst can perform IGMP Snooping to
dynamically build efficient multicast forwarding tables. Finally, by matching on Layer 2 and
Layer 3 QoS and CoS information, traffic classification and differentiation can be performed.
This section initially only considers the Layer 3 switching aspects of the NFFC. The other
capabilities are addressed at the end of the section (as well as in other chapters such as
Chapter 13, "Multicast and Broadcast Services").
One of the important things to keep in mind when discussing MLS is that, like all shortcut
switching mechanisms, it is a caching technique. The NFFC does not run any routing
protocols such as OSPF, EIGRP, or BGP.
It is also important to realize that MLS, formerly known as NetFlow LAN Switching, is a
completely different mechanism than the NetFlow switching on Cisco's software-based
routers. In its current implementation, NetFlow on the routers is targeted as a powerful data
collection tool via NetFlow Data Export (although it can also be used to reduce the
overhead associated with things like complex access lists). Although MLS also supports
NetFlow Data Export (NDE), its primary mission is something very different: Layer 3
switching.
Because the NFFC does not run any routing protocols, it must rely on its pattern-matching
capabilities to discover packets that have been sent to a router (notice this is the device
running protocols such as OSPF) and then sent back to the same Catalyst. It then allows the
NFFC to shortcut future packets in a manner that bypasses the router. In effect, the NFFC
notices that it sent a particular packet to the router, only to have the router send it right back.
It then says to itself, "Boy, that was a waste of time!" and starts shortcutting all remaining
packets following this same path (or flow).
NOTE NetFlow defines a flow as being unidirectional. Therefore, when two nodes communicate
via a bi-directional protocol such as Telnet, two flows are created.
Although MLS is fundamentally a very simple technique, there are many details involved.
The following section presents an in-depth account of the entire MLS mechanism. Later
sections examine how to configure and use MLS.
The following sections describe each of these steps using the sample network shown in
Figure 11-4.
[Figure 11-4: Host-A on port 2/1 (Red VLAN) and Host-B on port 3/1 (Blue VLAN) attach to a Catalyst; an ISL-attached router connects on port 1/1.]
This network consists of two VLANs, VLAN 1 (Red) and VLAN 2 (Blue). Two end stations
have been shown. Host-A has been assigned to the Red VLAN, and Host-B has been
assigned to the Blue VLAN. An ISL-attached router has also been included. Its single Fast
Ethernet interface (Fast Ethernet1/0) has been logically partitioned into two subinterfaces,
one per VLAN. The IP and MAC addresses for all devices and subinterfaces are shown.
Figure 11-4 portrays the router as an ISL-attached external device using the router-on-a-stick
configuration. Other possibilities include an RSM or a one-interface-per-VLAN attached router.
and Catalysts to boot at random times while also serving as a router keepalive mechanism
for the NFFC (if a router goes offline, its cache entries are purged).
TIP There is one XTAG per MLS-capable router. The XTAG serves as a single handle for a router's
multiple MAC addresses (each interface/VLAN could be using a different MAC address).
XTAGs are locally significant (different NFFCs can refer to the same router with different
XTAGs).
[Figure 11-5: MLSP hello packets from the router populate the Layer 2 CAM tables of the attached Catalysts; Host-A is on port 2/1 and Host-B on port 3/1.]
As shown in Figure 11-5, the MLSP packets are sourced from subinterface Fast Ethernet1/
0.1 on the router (this is a configurable option; the router commands are presented later).
These packets are then used to populate the Layer 2 CAM table (a form of bridging table
commonly used in modern switches) with special entries that are used to identify packets
going to or coming from a router interface (the show cam Catalyst command places an R
next to these entries). Each router is also assigned a unique XTAG value. If a second router
were present in Figure 11-5, it would receive a different XTAG number than the value of 1
assigned to the first router. However, notice that all MAC addresses and VLANs for a single
router are associated with a single XTAG value.
Although it is not illustrated in Figure 11-5, the MLSP hello packets flow throughout the
Layer 2 network. Because they are sent using a multicast address (01-00-0C-DD-DD-DD,
the same address used by CGMP), non-MLS-aware switches simply flood the hello packets
to every segment in VLAN 1. In this way, all MLS switches learn about all MLS-capable
routers.
For example, refer to Figure 11-6 and assume that Host-A Telnets to Host-B. Recognizing
that Host-B is in a different subnet, Host-A sends the packets to its default gateway,
subinterface 1/0.1 on the router.
[Figure 11-6: Host-A's first packet travels through the Catalyst to the router in the Red VLAN as a candidate packet.]
Figure 11-7 illustrates the relevant fields in this packet as it traverses the ISL link to the
router.
The ISL header contains a VLAN ID of 1. The Ethernet header contains a source MAC
address equal to Host-A and a destination MAC address equal to 00-00-0C-11-11-11, the
MAC address of subinterface 1/0.1 on the router. The source and destination IP addresses
belong to Host-A and Host-B, respectively. The switch uses the destination MAC address
to perform two actions:
It forwards the packet out Port 1/1 toward the router using Layer 2 switching.
It recognizes the destination MAC address as one of the router's addresses
learned in Step 1. This triggers a lookup for an existing Layer 3 shortcut entry based
on the destination IP address (other options are available, but these are discussed
later). Assuming that a shortcut does not exist (it is a new flow), the packet is flagged
as a candidate packet and a partial shortcut entry is created.
[Figure 11-8: the routed packet returns from the router in the Blue VLAN as an enable packet destined for Host-B.]
Figure 11-9 shows the relevant fields contained in the packet as it crosses the ISL link
between the router and switch.
[Figure 11-9: packet layout: ISL header, Ethernet header, IP header, remaining packet.]
The router has rewritten the Layer 2 header. Not only has it changed the VLAN number in
the ISL header, it has modified both MAC addresses. The source MAC address is now equal
to 00-00-0C-22-22-22, the MAC address used on the router's Fast Ethernet1/0.2
subinterface, and the destination address is set to Host-B. Although the IP addresses have
not been changed, the router must modify the IP header by decrementing the Time To Live
(TTL) field and updating the IP checksum.
As the packet traverses the Catalyst on its way from the router to Host-B, five functions are
performed:
1 The destination MAC address is used to Layer 2 switch the packet out Port 3/1.
2 The NFFC recognizes the source MAC address as one of the entries created in Step 1
via the hello process.
3 The NFFC uses the destination IP address to look up the existing partial shortcut entry
created in Step 2.
4 The NFFC compares the XTAG values associated with the source MAC address of
this packet and the partial shortcut entry. Because they match, the NFFC knows that
this is the enable packet coming from the same router targeted by the candidate
packet.
5 The NFFC completes the shortcut entry. This entry will contain all of the information
necessary to rewrite the header of future packets (in other words, the fields shown in
Figure 11-9).
[Figure 11-10: the NFFC performs the shortcut and rewrite operation as traffic flows from Host-A (Red) to Host-B (Blue).]
There are two options that MLS can use to rewrite the packet. In the first option, the NFFC
card itself is used to rewrite the packet. The NFFC actually contains three rewrite engines,
one per Catalyst 5500 bus. These rewrite engines are referred to as central rewrite engines.
The downside of using a central rewrite engine is that it requires the packet to traverse the
bus twice. For example, in Figure 11-10, the packet first arrives through Port 2/1 and is
flooded across the backplane as a VLAN 1 frame. The NFFC is treated as the destination
output port. After the NFFC has completed the shortcut lookup operation, it uses the rewrite
information contained in the Layer 3 CAM table to update the packet appropriately. It then
sends the rewritten packet back across the bus as a VLAN 2 frame, where the Layer 2 CAM
table is used to forward it out Port 3/1. In other words, it crosses the bus first as a packet in
the Red VLAN and again as a packet in the Blue VLAN. As a result, performance is limited
to approximately 750,000 pps (on Catalyst 5000s).
The second rewrite option uses a feature called inline rewrite to optimize this flow. When
using Catalyst modules that support this feature, the rewrite operation can be performed on
the output module itself, allowing the packet to cross the bus a single time. Figure 11-11
illustrates the inline rewrite operation.
Figure 11-11 Inline Rewrite
[Figure 11-11: the frame is sent over the bus once; the NFFC performs the shortcut lookup and passes rewrite information to the output module, which rewrites the frame locally.]
When the packet comes in from Host-A, it is flooded across the bus. All ports make a copy
of the frame, including the destination Port 3/1 and the NFFC. The NFFC looks up the
existing shortcut entry and sends just the rewrite information to Module 3 (this occurs on a
separate bus from the data bus). Module 3 uses its local rewrite engine to modify the packet
and immediately forwards it out Port 3/1. Because the frame only traversed the bus once,
throughput is doubled to approximately 1,500,000 pps.
NOTE The central rewrite versus inline rewrite issue is not a problem on the Catalyst 6000 because
all of its Ethernet line cards support inline rewrite.
Cache Aging
To prevent the MLS cache from overflowing, an aging process must be run. This is a software-
controlled operation that runs in the background. Although the architecture of current NFFCs
can theoretically hold 128,000 entries, it is recommended to keep the total number of entries
below 32,000 on current versions of the card. MLS supports three separate aging times:
Quick
Normal
Fast
Quick aging is utilized to age out partial shortcut entries that never get completed by an
enable packet. The aging period for these entries is fixed at five seconds.
Normal aging is used for the typical sort of data transfer flow. This is a user-configurable
interval that can range from 64 to 1,920 seconds with the set mls agingtime [agingtime]
command. The default is 256 seconds. When changing the default value, it is rounded to the
nearest multiple of 64 seconds.
Fast aging is used to age short-term data flows such as DNS, ping, and TFTP. The Fast aging
time can be adjusted with the set mls agingtime fast [fastagingtime] [pkt_threshold] command.
If the entry does not have more than pkt_threshold packets within fastagingtime seconds, the
entry is removed. By default, Fast aging is not enabled because the fastagingtime parameter
is set to 0. The possible fastagingtime values are 0, 32, 64, 96, and 128 seconds (it uses the
nearest value if you enter a different value). The pkt_threshold parameter can be set to 0, 1, 3, 7,
15, 31, or 63 (again, you can enter other values and it uses the closest possible value).
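As a sketch, shortening the Normal aging interval and turning on Fast aging (the values here are chosen purely for illustration) looks like this on the Supervisor:

Console> (enable) set mls agingtime 128
Console> (enable) set mls agingtime fast 32 7

With these settings, Normal entries age out after 128 seconds, and any entry carrying seven or fewer packets in 32 seconds is removed early.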
The MLSP protocol to notify the NFFC to flush all shortcut entries if the access list is
modified
A flow mask
The first mechanism handles the case where a packet is forwarded to the router and never
returned to any Catalyst because it failed an access list. As a result, MLS can be a safe and
effective technique.
The MLSP flush mechanism provides important integration between the router and the
NFFC. If the router is configured with an access list, the MLSP protocol can be used to
cause all cache entries to be flushed (forcing new entries to be processed by the access list).
The flush mechanism is also used to remove cache entries after a routing table change.
The flow mask is used to set the granularity with which the NFFC determines what
constitutes a flow. In all, three flow masks are possible:
Destination flow mask
Destination-source flow mask
Full flow mask
A destination flow mask enables flows based on Layer 3 destination addresses only. A
single shortcut is created and used for all packets headed to a specific destination IP (or
IPX) address, regardless of the source node or application. This flow mask is used if no
access lists are configured on the router.
A destination-source flow mask uses both the source and destination Layer 3 addresses. As
a result, each pair of communicating nodes uses a unique shortcut entry. However, all of the
applications flowing between each pair of nodes use the same shortcut entry. This flow
mask is used if a standard access list or a simple extended access list without port numbers
is in use on the router.
A full flow mask uses Layer 4 port numbers in addition to source and destination Layer 3
addresses. This creates a separate shortcut for every application flowing between every pair
of nodes. By doing so, a full flow mask provides the highest level of control and allows the
NFFC to perform Layer 4 switching. Because it tracks flows at the application level, it can
also be used to provide very detailed traffic statistics via NetFlow Data Export (NDE), a
feature that is discussed in more detail later. The full flow mask is applied if extended access
lists referencing port numbers are in use.
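To make the correspondence concrete, here is a sketch of router access lists and the flow mask each would cause the NFFC to use (the addresses are illustrative):

! No access lists configured: destination flow mask
!
! Standard access list: destination-source flow mask
access-list 10 permit 10.1.1.0 0.0.0.255
!
! Extended access list referencing port numbers: full flow mask
access-list 110 permit tcp 10.1.1.0 0.0.0.255 any eq telnet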
For example, consider the network shown in Figure 11-12.
[Figure 11-12: a router with addresses 10.1.1.1 (VLAN 1) and 10.1.2.1 (VLAN 2); Host-A (10.1.1.2) and Host-B (10.1.1.3) reside in VLAN 1, and Host-C (10.1.2.2) resides in VLAN 2.]
Host-A and Host-B are assigned to VLAN 1, whereas Host-C is in VLAN 2. The Catalyst
and the router have been correctly configured for MLS. Example 11-5 shows the output of
the show mls entry command when using a destination flow mask.
Example 11-5 Sample show mls entry Output When Using a Destination Flow Mask
Cat-A> (enable) show mls entry
Last Used Last Used
Destination IP Source IP Prot DstPrt SrcPrt Destination Mac Vlan Port
--------------- --------------- ---- ------ ------ ----------------- ---- -----
MLS-RP 10.1.1.1:
10.1.1.2 10.1.2.2 TCP 11000 Telnet 00-00-0c-7c-3c-90 1 2/16
10.1.1.3 10.1.2.2 ICMP - - 00-60-3e-26-96-00 1 2/15
10.1.2.2 10.1.1.3 ICMP - - 00-00-0c-5d-0b-f4 2 2/17
Because only three destination IP addresses exist between the two VLANs in this network,
only three lines are displayed with a destination flow mask. Notice that all of the traffic
flowing to a single destination address uses a single shortcut entry. Therefore, each line in
the output only shows information on the most recent packet going to each destination. This
fact is reflected in the column headers that use names such as Last Used Source IP.
Example 11-6 shows the same output after a destination-source flow mask has been
configured.
Example 11-6 Sample show mls entry Output When Using a Destination-Source Flow Mask
Cat-A> (enable) show mls entry
Last Used
Destination IP Source IP Prot DstPrt SrcPrt Destination Mac Vlan Port
--------------- --------------- ---- ------ ------ ----------------- ---- -----
MLS-RP 10.1.1.1:
10.1.1.3 10.1.2.2 ICMP - - 00-60-3e-26-96-00 1 2/15
10.1.2.2 10.1.1.3 ICMP - - 00-00-0c-5d-0b-f4 2 2/17
10.1.1.2 10.1.2.2 TCP 61954 Telnet 00-00-0c-7c-3c-90 1 2/16
10.1.2.2 10.1.1.2 TCP Telnet 61954 00-00-0c-5d-0b-f4 2 2/17
Example 11-6 displays two lines of output for every pair of nodes that communicate
through the router (one for each direction). For example, the first two lines indicate that the
last packets to travel between 10.1.1.3 and 10.1.2.2 were pings (the first line shows the flow
from 10.1.1.3 to 10.1.2.2, and the second line shows the flow in the opposite direction). The
last two lines show the two sides of a Telnet session between 10.1.1.2 and 10.1.2.2 (notice
how the source and destination port numbers are swapped). Notice that traffic
between 10.1.1.2 and 10.1.1.3 does not show up (this information is Layer 2 switched and
does not use MLS). Also notice that the Last Used header only applies to the Prot
(protocol), DstPrt (destination port), and SrcPrt (source port) fields. It no longer applies
to the Source IP field because every new source address creates a new shortcut entry.
Finally, Example 11-7 displays sample output after configuring a full flow mask.
Example 11-7 Sample show mls entry Output When Using a Full Flow Mask
Cat-A> (enable) show mls entry
Destination IP Source IP Prot DstPrt SrcPrt Destination Mac Vlan Port
--------------- --------------- ---- ------ ------ ----------------- ---- -----
MLS-RP 10.0.1.1:
10.0.1.2 10.0.2.2 TCP 11778 69 00-00-0c-7c-3c-90 1 2/16
10.0.2.2 10.0.1.3 TCP 110 11004 00-00-0c-5d-0b-f4 2 2/17
10.0.2.2 10.0.1.2 TCP 69 11778 00-00-0c-5d-0b-f4 2 2/17
10.0.1.2 10.0.2.2 TCP 65026 SMTP 00-00-0c-7c-3c-90 1 2/16
10.0.1.3 10.0.2.2 TCP 11002 Telnet 00-60-3e-26-96-00 1 2/15
10.0.1.2 10.0.2.2 TCP 12290 110 00-00-0c-7c-3c-90 1 2/16
10.0.1.2 10.0.2.2 TCP 11266 WWW 00-00-0c-7c-3c-90 1 2/16
10.0.1.2 10.0.2.2 TCP 64514 FTP 00-00-0c-7c-3c-90 1 2/16
10.0.2.2 10.0.1.2 TCP FTP 64514 00-00-0c-5d-0b-f4 2 2/17
10.0.2.2 10.0.1.3 TCP 69 11005 00-00-0c-5d-0b-f4 2 2/17
10.0.2.2 10.0.1.2 TCP WWW 63490 00-00-0c-5d-0b-f4 2 2/17
10.0.2.2 10.0.1.3 TCP 9 11001 00-00-0c-5d-0b-f4 2 2/17
10.0.2.2 10.0.1.3 ICMP - - 00-00-0c-5d-0b-f4 2 2/17
10.0.1.2 10.0.2.2 TCP 62978 9 00-00-0c-7c-3c-90 1 2/16
10.0.1.2 10.0.2.2 TCP 64002 20 00-00-0c-7c-3c-90 1 2/16
10.0.2.2 10.0.1.2 TCP Telnet 62466 00-00-0c-5d-0b-f4 2 2/17
10.0.1.2 10.0.2.2 TCP 63490 WWW 00-00-0c-7c-3c-90 1 2/16
10.0.1.2 10.0.2.2 TCP 62466 Telnet 00-00-0c-7c-3c-90 1 2/16
10.0.2.2 10.0.1.3 TCP Telnet 11002 00-00-0c-5d-0b-f4 2 2/17
10.0.2.2 10.0.1.2 TCP WWW 11266 00-00-0c-5d-0b-f4 2 2/17
10.0.1.3 10.0.2.2 TCP 11004 110 00-60-3e-26-96-00 1 2/15
10.0.2.2 10.0.1.2 TCP SMTP 65026 00-00-0c-5d-0b-f4 2 2/17
10.0.1.3 10.0.2.2 TCP 11005 69 00-60-3e-26-96-00 1 2/15
10.0.2.2 10.0.1.2 TCP 110 12290 00-00-0c-5d-0b-f4 2 2/17
10.0.2.2 10.0.1.3 TCP WWW 11003 00-00-0c-5d-0b-f4 2 2/17
10.0.1.3 10.0.2.2 TCP 11003 WWW 00-60-3e-26-96-00 1 2/15
10.0.2.2 10.0.1.2 TCP 20 64002 00-00-0c-5d-0b-f4 2 2/17
10.0.1.3 10.0.2.2 ICMP - - 00-60-3e-26-96-00 1 2/15
10.0.1.3 10.0.2.2 TCP 11001 9 00-60-3e-26-96-00 1 2/15
10.0.2.2 10.0.1.2 TCP 9 62978 00-00-0c-5d-0b-f4 2 2/17
Notice that Example 11-7 includes every pair of communicating applications (both IP
addresses and port numbers are considered). Also notice that none of the fields include a
Last Used header because all of the individual flows are fully accounted for.
The multiple flow masks allow the NFFC to track information at a sufficient level of
granularity to ensure that denied packets do not slip through using a pre-existing shortcut
entry. However, to be truly secure, input access lists need to process every packet. As a
result, configuring an input access list on the router disables MLS on that interface. However,
an optional parameter was introduced in 12.0 IOS images to allow input access lists at the
expense of some security risk. To enable this feature, specify the input-acl parameter on
the end of the mls rp ip global router command (Step 1 in the five-step router configuration
process discussed later).
TIP The mls rp ip input-acl command can be used to enable input access lists at the expense
of some fairly minor security risks.
If multiple routers are in use with different flow masks, all MLS-capable Catalysts use the
most granular (longest) flow mask. In other words, if there are two routers without access
lists and a third router with a standard access list, the destination-source flow mask is used.
If you are not using access lists but you want to use a source-destination or full flow mask,
you can use the set mls flow {destination | destination-source | full} command to set a
minimum flow mask. For example, by forcing the flow mask to full, you can collect detailed
traffic statistics (see the "Using MLS" section).
Configuring MLS
Although the theory behind MLS is somewhat involved, it is fairly easy to configure. To fully
implement MLS, you must separately configure the router and the Catalyst Supervisor.
TIP If no VTP domain has been specified on the Catalysts (check this with show vtp domain),
you do not need to set one on the router (in other words, the Null domain is used). If you
use the mls rp ip or mls rp management-interface commands before specifying a VTP
domain, the interface is automatically assigned to the Null domain. To change the domain
name to something else, you need to remove all mls rp commands from that interface and
start reconfiguring it from scratch (current versions automatically remove the mls rp
commands when you enter no mls rp vtp-domain domain_name).
Example 11-8 illustrates a router MLS configuration that is appropriate for the example
presented in Figures 11-4 through 11-11.
Example 11-8 External Router MLS Configuration
mls rp ip
!
interface FastEthernet1/0
mls rp vtp-domain Skinner
!
interface FastEthernet1/0.1
encapsulation isl 1
ip address 10.1.1.1 255.255.255.0
mls rp management-interface
mls rp ip
!
interface FastEthernet1/0.2
encapsulation isl 2
ip address 10.1.2.1 255.255.255.0
mls rp ip
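The intervening Example 11-9 (the equivalent RSM configuration) is not reproduced in this extract. On an RSM, the same mls rp commands attach to the VLAN virtual interfaces, roughly as follows (a sketch patterned on Example 11-8, not the book's exact listing):

mls rp ip
!
interface Vlan1
 ip address 10.1.1.1 255.255.255.0
 mls rp vtp-domain Skinner
 mls rp management-interface
 mls rp ip
!
interface Vlan2
 ip address 10.1.2.1 255.255.255.0
 mls rp vtp-domain Skinner
 mls rp ip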
And finally, the same configuration running on a router using two Ethernet ports looks like
Example 11-10.
Example 11-10 External Router MLS Configuration for Multiple Ethernet Ports
mls rp ip
!
interface Ethernet0
ip address 10.1.1.1 255.255.255.0
mls rp vtp-domain Skinner
mls rp vlan-id 1
mls rp management-interface
mls rp ip
!
interface Ethernet1
ip address 10.1.2.1 255.255.255.0
mls rp vtp-domain Skinner
mls rp vlan-id 2
mls rp ip
TIP The address to include is displayed in the mls ip address field of the show mls rp router
command. Be sure to enter show mls rp on the router, not the Catalyst Supervisor.
The switch supports a set mls [enable | disable] command. However, because MLS is enabled by default
(given that you have the proper hardware and software), this command is not necessary.
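For an external router, the Supervisor must also be told the router's MLS address referred to in the preceding Tip. A sketch using the address from the earlier examples:

Console> (enable) set mls include 10.1.1.1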
Using MLS
Because the routing switch form of Layer 3 switching is a fairly new technique to most network
administrators, this section takes a look at some of the more important MLS commands.
The first three lines after the initial prompt tell you if MLS is enabled on this Catalyst and
the configured aging timers. The next two lines indicate the flow mask currently in use and
if a minimum flow mask has been configured. The Total packets switched and Active
shortcuts lines can be very useful for keeping track of the amount of shortcut switching
being performed and the size of your shortcut cache (as mentioned earlier, it is best to keep
this value below 32,000 entries on current versions of the NFFC). The next three lines report
the status of NetFlow Data Export, a feature that is discussed later. The bottom section lists
all of the known routers, their XTAG values, and a list of the MAC addresses and VLANs.
Because the flow mask is set to destination, the cache only creates a single entry per
destination address. Each cache entry is shown on a separate line. The Last Used Source IP,
Protocol, Destination Port, and Source Port fields show the characteristics of the packet
that most recently used this shortcut entry. Because a single entry exists for all source
nodes, protocols, and applications targeted to the destination address listed in the first
column, it cannot list every type of packet individually (use a full flow mask for that level
of detail).
If your cache is large, you probably want to use one of the options to filter the output. The
full syntax for the show mls entry command is:
show mls entry {[destination ip_addr_spec] [source ip_addr_spec] | [flow protocol
src_port dst_port]} [rp ip_addr]
For example, show mls entry rp 10.1.1.1 lists all of the cache entries created from router
10.1.1.1. show mls entry destination 10.1.2.20 lists the entries created for packets
containing a destination IP address of 10.1.2.20.
This information is very similar to the output of the show ip cache flow router command.
By listing traffic volumes by both packet and byte counts, it can be very useful for profiling
your network.
As always, you should be very careful when using debug on production networks.
TIP Note in advance that almost all of the issues discussed in this section can be avoided by simply
making sure that every NFFC is paired with its own internal router such as the RSM (or
RSFC). Because this automatically creates a "to the router and back" set of flows (across the
backplane of the Catalyst), it can dramatically simplify your overall design considerations.
WAN Links
For example, MLS currently cannot be used on WAN links. Consider the network
illustrated in Figure 11-13.
[Figure 11-13: Router-A and Router-B joined by a WAN link; Cat-A (Red VLAN) with Host-A behind Router-A, and Cat-B with Host-B behind Router-B]
As with the earlier examples, Host-A is sending packets to Host-B. Recognizing that Host-
B is on a different subnet, Host-A forwards all of the traffic to its default gateway, Router-
A. The NFFC in Cat-A recognizes the first packet as a candidate packet and creates a partial
shortcut entry. However, an enable packet is never received by Cat-A because the traffic is
forwarded directly out the router's serial interface. The "to the router and back" flow
necessary for MLS is not present. Because the shortcut entry is never completed, it ages out
using the five-second quick aging scheme.
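The candidate/enable handshake and the quick-aging behavior can be modeled as a toy state machine. The class, method names, and clock handling here are invented for illustration; only the five-second and 256-second aging values come from the text:

```python
# Toy model of the MLS candidate/enable handshake and aging behavior
# (a sketch of the described behavior, not NFFC internals).

QUICK_AGE = 5      # seconds: partial entries that never see an enable packet
NORMAL_AGE = 256   # seconds: completed entries with no traffic (default)

class ShortcutCache:
    def __init__(self):
        self.entries = {}  # cache key -> {"state": ..., "stamp": ...}

    def candidate(self, key, now):
        # First packet sent toward the router's MAC creates a partial entry.
        self.entries[key] = {"state": "partial", "stamp": now}

    def enable(self, key, now):
        # A matching packet back from the router completes the entry.
        entry = self.entries.get(key)
        if entry and entry["state"] == "partial":
            entry["state"] = "complete"
            entry["stamp"] = now

    def age(self, now):
        for key, entry in list(self.entries.items()):
            limit = QUICK_AGE if entry["state"] == "partial" else NORMAL_AGE
            if now - entry["stamp"] >= limit:
                del self.entries[key]

cache = ShortcutCache()
cache.candidate("10.1.2.20", now=0)  # candidate packet seen
cache.age(now=5)                     # the enable packet never arrives...
print("10.1.2.20" in cache.entries)  # False: quick-aged after 5 seconds
```

In the WAN case above, the enable() call simply never happens, so every partial entry is quick-aged just as in the final line of the example.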
[Figure 11-14: a router with interfaces e0 (Red VLAN, to Cat-A and Host-A) and e1 (Blue VLAN, to Cat-B and Host-B); 1 Candidate, 2 Enable]
The results in Figure 11-14 are very similar to those in Figure 11-13. Cat-A sees the
candidate packet, but only Cat-B sees the enable packet. Shortcut switching is not possible.
TIP MLS requires that the same NFFC or MSFC/PFC must see the flow traveling to and from
the router. This can require careful planning and design work in certain situations.
However, simply placing both VLANs on both switches does not necessarily solve the
problem. In Figure 11-15, both Cat-A and Cat-B contain the Red and Blue VLANs. An ISL
trunk has even been provided to create a contiguous set of Layer 2 bridge domains.
Figure 11-15 Although Both Switches Contain Both VLANs, MLS Is Not Possible
[Diagram: a router with non-trunk interfaces e0 (Red, to Cat-A) and e1 (Blue, to Cat-B); Cat-A and Cat-B each carry the Red and Blue VLANs and are joined by an ISL trunk; Host-A on Cat-A, Host-B on Cat-B; 1 Candidate, 2 Enable]
However, because non-trunk links are used to connect the router, the router only sends and
receives traffic for the Red VLAN to/from Cat-A, whereas all Blue traffic flows to/from
Cat-B. The effect of this is the same as in the previous two examples: Cat-A only sees the
candidate packet because the enable packet is sent to Cat-B.
[Figure 11-16: the router attaches only to Cat-A via e0 (Red) and e1 (Blue); an ISL trunk links Cat-A and Cat-B; both 1 Candidate and 2 Enable pass through Cat-A]
As shown in Figure 11-16, this forces all inter-VLAN traffic to flow through Cat-A and,
therefore, makes shortcut switching possible.
[Figure 11-17: Cat-A is the Root Bridge; Segment 1 joins Cat-A and Cat-B, Segment 2 joins Cat-A and Cat-C, and Segment 3 joins Cat-B and Cat-C with one port Blocking; ISL trunks throughout; 1 Candidate travels Segment 1, 2 Enable travels Segment 2; Host-A on Cat-B, Host-B on Cat-C]
Because this is a redundant (that is, looped) Layer 2 topology, the Spanning-Tree Protocol
becomes involved. Figure 11-17 assumes that Cat-A is functioning as the Root Bridge for
all VLANs. This places one of the ports on Segment 3 in the Spanning Tree Blocking state.
As a result, traffic flowing in the Red VLAN from Host-A to the router uses Segment 1.
Both Cat-B and Cat-A recognize this as a candidate packet and create a partial shortcut
entry. However, because traffic flowing from the router to Host-B uses Segment 2, only Cat-
A sees the enable packet and creates a full shortcut entry. Cat-B's partial shortcut entry ages
out in five seconds.
Consider what happens if Cat-B becomes the Spanning Tree Root Bridge. Figure 11-18
provides a diagram for this situation.
494 Chapter 11: Layer 3 Switching
[Figure 11-18: Cat-B is now the Root Bridge and one port on Segment 2 is Blocking; 1 Candidate travels Segment 1, and 2 Enable returns through Cat-B over Segment 1 and Segment 3; Host-A on Cat-B, Host-B on Cat-C]
This causes Spanning Tree to reconverge to a logical topology where one of the ports on
Segment 2 is Blocking. This allows the traffic from Host-A to the router to follow the same
path as in Figure 11-17. Both Cat-A and Cat-B recognize the first packet as a candidate
packet and create a partial shortcut entry. However, the traffic flowing from the router to
Host-B cannot use Segment 2 because it is blocked. Instead, the traffic flows back through
Cat-B and uses Segment 1 and Segment 3. Notice that this causes both Cat-A and Cat-B to
see the enable packet and complete the shortcut entry.
When the second packet is sent from Host-A to Host-B, Cat-B uses its shortcut entry to
Layer 3 switch the packet directly onto Segment 3, bypassing the router. Because Cat-A
does not see any traffic for the shortcut entry it created, the entry ages out in 256 seconds
by default. Although this allows MLS to function (in fact, it creates a more efficient flow in
this case), it can be disconcerting to see the shortcut switching operation move from Cat-A
to Cat-B only because of Spanning Tree. Obviously, the interaction between MLS and
Spanning Tree can get very complex in large and very flat campus networks (yet one more
reason to avoid the flat-earth approach to campus design; see Chapters 14 and 15 for more
information).
[Figure 11-19: Cat-A (Red and Blue VLANs, with Host-A and Host-B), Cat-B (Blue VLAN), and Cat-C (with Host-C) connected by ISL links toward the router; 1 Candidate, 2 Enable, 3 Remaining Packets shortcut switched (with rewrite) by Cat-A]
First, look at the case of Host-A sending traffic to Host-B. The traffic from Host-A to the router
travels up the ISL links connecting the Catalysts and the router to each other. As the first packet
hits the NFFC in each Catalyst, it is recognized as a candidate packet and three partial shortcut
entries are created (one per Catalyst). As the packet travels back down from the router to reach
Host-B, all three NFFC cards see the enable packet and complete the shortcut entries. However,
as additional packets travel from Host-A to Host-B, Cat-A shortcut switches them directly to
Host-B. The shortcut entries in Cat-B and Cat-C simply age out in 256 seconds (by default).
Now consider the flow from Host-A to Host-C. Again, all three NFFCs see the initial packet
as a candidate packet. However, the return packet only passes through Cat-C and Cat-B.
The partial shortcut entry in Cat-A ages out in five seconds. As Host-A sends additional
packets, Cat-A uses normal Layer 2 switching to send the packets towards the MAC address
of the router. When Cat-B receives the packets, it recognizes that it has a completed shortcut
for this flow and shortcut switches the packets directly to Host-C. Cat-C's shortcut entry is
not used and therefore ages out in 256 seconds. Figure 11-20 illustrates this sequence.
[Figure 11-20: for the Host-A-to-Host-C flow, 1 Candidate is seen by all three Catalysts, 2 Enable passes only through Cat-C and Cat-B, and 3 Remaining Packets are shortcut switched (with rewrite) by Cat-B]
[Figure 11-21: Router-A and Router-B both attach to the Catalyst; Host-A in the Red VLAN, Host-B in the Blue VLAN; 1 Candidate, 4 Enable for 3, 5 Shortcut Switching = Double Lookup and Double Rewrite]
Here, Host-A is still located in the Red VLAN and Host-B is still located in the Blue VLAN.
However, a new VLAN has been created between the two routers (call it the Purple VLAN).
Host-A still sends traffic destined to Host-B to its default gateway using the Red VLAN. As
the first packet passes through the Catalyst, the NFFC recognizes it as a candidate packet
and creates a partial shortcut entry (labeled Step 1 in Figure 11-21). Router-A then forwards
the traffic over the Purple VLAN to Router-B. As the packet passes back through the
Catalyst, the NFFC recognizes the packet as an enable packet and completes the shortcut
entry (Step 2 in Figure 11-21). However, it also recognizes the destination MAC address as
that of Router-B and therefore sees this packet as another candidate packet (Step 3 in Figure
11-21). Router-B then routes the packet normally and forwards it to Host-B over the Blue
VLAN. As the packet passes back through the Catalyst for the third time, it is identified as
an enable packet for the partial entry created in Step 3. A second shortcut entry is created
(Step 4 in Figure 11-21).
When additional traffic flows from Host-A to Host-B (Step 5 in Figure 11-21), two sets of
shortcut lookups and rewrite operations are performed. As a result, the additional packets
are not sent to either router. Neat!
Protocol Filtering
Protocol Filtering is the capability of the NFFC to limit broadcast and multicast traffic on
a per-port and per-protocol basis. As discussed in Chapter 5, "VLANs," it allows a group
of nodes to be placed in a single VLAN and only receive traffic associated with the
protocols they are actually running. Four groupings of protocols exist: IP, IPX, a combined
group of AppleTalk and DECnet (some platforms also include VINES here), and a final
group that contains all other protocols. By pattern matching on the protocol type
information contained in the Layer 2 header, the NFFC can, for example, filter IPX SAPs
on ports that are only using IP.
Protocol Filtering is disabled by default. To enable this feature, use the set protocolfilter
enable command. To configure Protocol Filtering, use the set port protocol command:
set port protocol mod_num/port_num {ip|ipx|group} {on|off|auto}
The group parameter corresponds to AppleTalk and DECnet (and, in some cases, VINES).
The on state forces that port to send broadcasts of the specified type. The off state forces
that port to not send broadcasts of the specified type. The auto state only sends broadcasts
for the specified protocol if that protocol is detected coming in that port. This creates a
dynamic configuration where the Catalyst is detecting the protocols being run on each port
and only sending the appropriate broadcasts in response. IP defaults to the on state, and the
other protocol categories (IPX and group) default to auto.
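The on/off/auto behavior can be summarized as a simple decision function. This is a sketch of the behavior described above, not CatOS code:

```python
def forwards_broadcast(port_state, protocol_seen_inbound):
    """Should this port receive broadcasts for a protocol group?

    port_state is the configured state for that protocol group on the
    port ('on', 'off', or 'auto'); protocol_seen_inbound is True if the
    Catalyst has detected that protocol arriving on the port.
    """
    if port_state == "on":
        return True
    if port_state == "off":
        return False
    # 'auto': forward only if the protocol was detected on the port.
    return protocol_seen_inbound

# Defaults described in the text: IP is 'on'; IPX and 'group' are 'auto'.
defaults = {"ip": "on", "ipx": "auto", "group": "auto"}

# A port running only IP: IPX SAP broadcasts are filtered out...
print(forwards_broadcast(defaults["ipx"], protocol_seen_inbound=False))  # False
# ...while IP broadcasts are always delivered.
print(forwards_broadcast(defaults["ip"], protocol_seen_inbound=False))   # True
```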
The show protocolfilter command can be used to determine if Protocol Filtering is running on
a device. The show port protocol command can be used to view the configuration on a per-
port basis (including the number of nodes detected on a per-port and per-protocol basis). For
ports and protocols in the auto state, auto-on and auto-off are used to indicate the dynamically
selected setting currently in use. Trunk ports are excluded from Protocol Filtering.
One solution to this problem is the use of static CAM entries. However, given the growing
popularity of multicast usage, this can rapidly become a huge management problem. For
example, every time a user wants to join or leave a multicast group, it requires manual
intervention by the network administrators. In a large network, this can easily amount to
hundreds of entries and adjustments per day.
Clearly some sort of dynamic process is required. Three options are available for
dynamically building multicast forwarding tables: CGMP, GMRP, and IGMP Snooping.
This section briefly discusses these three options, especially as they pertain to the NFFC.
For a more thorough discussion, please see Chapter 13.
Of these techniques, Cisco developed the Cisco Group Management Protocol (CGMP)
first. This allows routers running the Internet Group Management Protocol (IGMP) to
update the Catalyst Layer 2 CAM table. IGMP is a protocol that allows end stations to
request that routers send them a copy of certain multicast streams. However, because it is a
Layer 3 protocol, it is difficult for a Layer 2 switch to speak this protocol. Therefore, Cisco
developed CGMP. Think of it as a mechanism that allows a Layer 3 router to tell a Layer 2
Catalyst about multicast group membership. As a result, the Layer 2 Catalyst forwards IP
multicast traffic only to end-station ports that are actually interested.
Configuring CGMP on a Catalyst is simple. It runs by default on most Catalysts, requiring
no configuration whatsoever. Other Catalysts, such as the 5000, require the set cgmp
enable command. The show multicast group cgmp command can be used to display the
multicast MAC address to port mappings created via the CGMP protocol. To configure
CGMP on the router, the ip cgmp command must be configured on the interfaces where
CGMP support is desired. In addition, some sort of multicast routing protocol must be
configured (PIM dense-mode is the simplest option).
In the future, the GARP Multicast Registration Protocol (GMRP) might become a
commonly used approach. GMRP uses the Generic Attribute Registration Protocol
(GARP) specified in 802.1p to provide registration services for multicast MAC addresses.
However, because work is still ongoing in the development of GMRP, this is not a suitable
option today.
The third option, IGMP Snooping, is a standards-based alternative to the Cisco Group
Management Protocol (CGMP). This relies on the pattern-matching capabilities of the
NFFC to listen for IGMP packets as they flow between the router and the end stations. By
inspecting these packets, the Catalyst can learn which ports have end stations interested in
which multicast groups.
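Conceptually, the snooping process maintains a group-to-ports table from the IGMP messages it observes. The following is a highly simplified sketch; the dict-based "messages" and method names are invented stand-ins, since real switches parse actual IGMP membership reports and leaves:

```python
# Toy model of IGMP Snooping table maintenance (illustration only).

from collections import defaultdict

class SnoopingTable:
    def __init__(self):
        # multicast group address -> set of interested ports
        self.groups = defaultdict(set)

    def observe(self, msg, port):
        # Called for each IGMP message seen arriving on a port.
        if msg["type"] == "membership-report":
            self.groups[msg["group"]].add(port)
        elif msg["type"] == "leave":
            self.groups[msg["group"]].discard(port)

    def egress_ports(self, group, router_port):
        # Forward a multicast stream only to interested host ports,
        # plus the router port.
        return self.groups.get(group, set()) | {router_port}

table = SnoopingTable()
table.observe({"type": "membership-report", "group": "224.1.1.1"}, port="3/1")
table.observe({"type": "membership-report", "group": "224.1.1.1"}, port="3/7")
table.observe({"type": "leave", "group": "224.1.1.1"}, port="3/7")
print(sorted(table.egress_ports("224.1.1.1", router_port="1/1")))  # ['1/1', '3/1']
```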
Some vendors have implemented IGMP Snooping using general-purpose CPUs. However,
without some sort of hardware-based support, this approach suffers from extreme scaling
problems. This situation arises because IGMP messages are intermixed with data in
literally every multicast flow in the network. In short, vendors cannot simply point a single
IGMP multicast MAC address at the CPU. Instead, the switch must sort through every
packet of every multicast stream looking for and processing IGMP packets. Do not try this
on a general-purpose CPU!
NOTE This also suggests that IGMP Snooping is not a replacement for CGMP. IGMP Snooping
is suitable for high-end devices that contain ASIC-based pattern-matching capabilities.
However, low-end devices without this support still require the services of CGMP. In fact,
many multicast networks require both.
The good news is that IGMP Snooping is extremely easy to configure. Because IGMP
Snooping is a passive listening process much like routed running in quiet mode on a UNIX
box, no configuration is required on the router (although it still needs to be running a
multicast routing protocol). On the Catalyst, simply enter the set igmp enable command.
Use the show multicast group igmp command to display the list of multicast MAC address
to port mappings created via the IGMP Snooping process.
Quality of Service
Although the initial version of the NFFC (NFFC I) did not support Quality of Service (QoS)
and Class of Service (COS), more recent versions (NFFC II and MSFC/PFC) have included
this feature. This capability is targeted at being able to reclassify traffic in the wiring closet
at the edge of the network. This can allow mission-critical traffic to be flagged as such using
Layer 3 IP Type of Service (ToS) bits or Layer 2 capabilities such as 802.1p and ISL (the
ISL header contains 3 bits for COS). Devices that support sophisticated queuing and
scheduling algorithms such as the Catalyst 8500 can then act on these QoS/COS fields to
provide differentiated service levels. Because these capabilities are still evolving at the time
this book goes to press, they are not discussed here.
TIP The set mls nde flow command can be used to filter the amount of information collected
by NDE. For example, set mls nde flow destination 10.1.1.1/32 source 10.1.1.2/32
collects information flowing only from 10.1.1.2 to 10.1.1.1.
Switching Routers
Whereas MLS relies on hardware-based caching to perform shortcut switching, the
Catalyst 8500 relies on hardware to perform the same tasks as a traditional router, only
faster. To accomplish the extremely high throughput required in modern campus
backbones, the 8500s split routing tasks into two functional groups. The job of running
routing protocols such as OSPF and EIGRP for purposes of topology discovery and path
determination is handled by a general-purpose, RISC-based CPU (these are often referred
to as control plane activities). The job of doing routing table lookups and data forwarding
is handled by high-speed ASICs (this is often called the data plane). Combined, these
create a very fast but feature-rich and flexible platform.
NOTE The Native IOS Mode of the Catalyst 6000 can also be used to implement the switching
router style of Layer 3 switching. This will be discussed later in this chapter, as well as in
Chapter 18.
In the case of the Catalyst 8510, Cisco's first switching router targeted at the campus market,
the routing functions are performed by a Switch Route Processor (SRP). From a hardware
perspective, the SRP is essentially the same as the ATM Switch Processor (ASP) from a
LightStream 1010 ATM switch. However, rather than running ATM routing protocols such as
PNNI, the SRP is used to run datagram routing protocols such as RIP and OSPF.
After the routing protocol has been used to build a routing table, the CPU uses this information
to create what is called a Cisco Express Forwarding (CEF) table. Just as the routing table lists
all of the possible locations this router can deliver packets to, the CEF table contains an entry
indicating how to reach every known location in the network. However, unlike a routing table,
which is limited to very basic information such as destination route, next hop, and routing
metric, the CEF table can be used to store a variety of information that pertains to features such
as Queuing and QoS/COS. Furthermore, because it is stored in a format that provides
extremely efficient longest-match lookups, it is very fast. CEF fulfills the competing goals of
speed and functionality and represents an important step forward in routing technology. Cisco
has been using CEF with great success in their high-end, Internet-oriented routing platforms
since 1997 and has introduced it to the entire line of routers starting in IOS 12.0.
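The longest-match lookup that the CEF table optimizes can be sketched naively in a few lines. Real CEF uses specialized data structures rather than the linear scan shown here, and the prefixes and next hops are invented for illustration, but the matching rule is the same:

```python
import ipaddress

# Toy CEF-style table: prefix -> next hop. Real CEF stores far more
# (adjacency, QoS/COS data) in a structure built for fast lookups.
cef_table = {
    "10.0.0.0/8": "Router-B",
    "10.1.0.0/16": "Router-C",
    "0.0.0.0/0": "Router-A",  # default route
}

def lookup(dst):
    """Return the next hop for dst using longest-prefix matching."""
    addr = ipaddress.ip_address(dst)
    best = max(
        (ipaddress.ip_network(p) for p in cef_table
         if addr in ipaddress.ip_network(p)),
        key=lambda net: net.prefixlen,  # most specific prefix wins
    )
    return cef_table[str(best)]

print(lookup("10.1.4.9"))   # Router-C (the /16 beats the /8)
print(lookup("192.0.2.1"))  # Router-A (falls through to the default route)
```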
Although the basic concept of CEF is available throughout Cisco's product line, the 8510
introduced a new use of this technology. The CPU located on the SRP is used to create the
CEF table, but it is not used to make forwarding decisions. Instead, the CPU downloads a
copy of the CEF table to every line card. The line cards then contain ASICs that perform
the actual CEF lookups at wire-rate speeds. From the point of view of the ingress port on
the 8510, it has a bunch of ATM-like virtual circuits that connect it to every other port in
the box (there are multiple virtual circuits [VCs] between all of the ports to facilitate QoS).
You can think of these VCs as tubes that the input port can use to send data to each output
port. If you then think of the incoming data as marbles, each input port simply uses the CEF
table to determine which marble gets dropped in which tube. The result is a mechanism that
builds an efficient and flexible forwarding table centrally using a general-purpose CPU, but
uses a distributed set of high-speed ASICs to handle the resource-intensive process of
determining how to move frames through the box. When this is combined with the fact that
8500 switches are based on ATM technology internally and therefore support sophisticated
QoS mechanisms, the benefits of CEF become extremely compelling.
The 8540, Cisco's next switching router, uses the same technique but with different
hardware. The primary differences are a new set of control and line cards and a larger
chassis that supports more interfaces and a higher-speed backplane/fabric (because the
8500s use ATM technology internally, they have more of a fabric than a backplane). In the
8540, the single SRP of the 8510 has been split into the Route Processor (RP) and the
Switch Processor (SP). The RP handles functions such as running routing protocols and
building the CEF tables (control plane). The line cards still contain ASICs that use a local
copy of the CEF table to make forwarding decisions (data plane). However, to move data
across the backplane/fabric, the line cards must use the services of the SP.
In most respects, another advantage of the 8500's approach to Layer 3 switching is that the
CPU runs the full IOS. Not only does this result in a more mature implementation of routing
protocols and other features, it makes configuration a breeze for anyone familiar with
Cisco's traditional router platform. Simply perform the normal conf t, int fa X/X/X, and
router ospf 1 sequence of commands and you are ready to roll in most situations. For
example, consider the network illustrated in Figure 11-22.
[Figure 11-22: a central 8500 connecting Cat-A (VLAN 1, port 0/0/0), Cat-B, Cat-C, and Cat-D (VLANs 4 and 5)]
Cat-A, Cat-B, Cat-C, and Cat-D are Catalyst 5000 devices performing the usual Layer 2
switching. Each of these has a single VLAN except Cat-D which has two VLANs. All of
the Catalyst 5000s are connected to a central 8500 for Layer 3 routing services. Example
11-16 shows a possible conguration for the 8500.
All VLANs are configured for both IP and IPX traffic except VLAN 1, which is only using
IP. All of the IPX interfaces are using the default Ethernet encapsulation of novell_ether
except Fast Ethernet0/0/2, which is using ARPA (DIX V2). Also, because Cat-D is using
two VLANs, Fast Ethernet0/0/3 is configured for ISL. As with the ISL router-on-a-stick
examples earlier, each VLAN is configured on a separate subinterface.
TIP The show vlan command on the 8500 can be very useful for getting a quick overview of
which VLANs have been configured on which ports.
EtherChannel
One feature that deserves special mention is EtherChannel. The 8500s support both Fast
and Gigabit EtherChannel. When configuring EtherChannel on any Cisco router (including
the 8500s), the configuration centers around a virtual interface known as the Port-Channel
interface. Your IP and IPX configurations are placed on this interface. The real Ethernet
interfaces are then included in the channel by using the channel-group command. For
example, the partial configuration in Example 11-17 converts Cat-D in Figure 11-22 to use
EtherChannel on Ports 0/0/3 and 0/0/4.
Example 11-17 Configuring EtherChannel on the Catalyst 8500 and Cisco Routers
interface Port-Channel1
description To Cat-D
no ip address
!
interface Port-Channel1.4
description VLAN 4
encapsulation isl 4
ip address 10.1.4.1 255.255.255.0
ipx network 4
!
interface Port-Channel1.5
description VLAN 5
encapsulation isl 5
ip address 10.1.5.1 255.255.255.0
ipx network 5
!
interface FastEthernet0/0/3
no ip address
channel-group 1
!
interface FastEthernet0/0/4
no ip address
channel-group 1
Notice that the ISL subinterfaces are created under the Port-Channel interface, not under
the real Fast Ethernet interfaces.
From a design perspective, MLS and the 8500s approach the same problem (Layer 3 switching)
from completely different angles. On one hand, MLS is a technique that adds Layer 3
capabilities into predominately Layer 2 Catalysts. Think of MLS as enabling Layer 2
Catalyst Supervisors to move up into Layer 3 processing. On the other hand, the 8500s
function as a pure router that, like all Cisco routers, happens to also support bridging
functionality. It is not an issue of which device can or cannot do Layer 3 processing; after
all, both devices can do both Layer 2 and Layer 3. Instead, the issue is what layer a device
is most comfortable with (or what the device does by default).
TIP Routing switches and switching routers both support Layer 3 switching, but they approach
it from opposite directions. Routing switches are predominately Layer 2 devices that have
moved up into the Layer 3 arena. Conversely, switching routers are predominately Layer 3
devices that also happen to support Layer 2 bridging.
NOTE As will be discussed later in this section, it turns out that the 8500s make it fairly difficult
to implement VLANs that span multiple IDF switches. Under the 8500 approach, the
recommendation is to use different VLANs on every IDF. This design looks at things from
the point of view of "Why do they need to be in the same VLAN/subnet?" Simply put both
users in different VLANs/subnets and let the wire-speed Layer 3 performance of the 8500
route all packets between these two nodes (after all, it essentially routes and bridges at the
same speed). Also, DHCP can be used to handle user-mobility problems, further
minimizing the need to place these two devices in the same subnet.
Another case where MLS's strengths shine is in the wiring closet, where port densities and
cost are very important issues. Placing a switching router in the wiring closet is usually cost
prohibitive. Instead, high-density and cost-effective Catalyst 5000s and 6000s can be used.
Where local traffic can be shortcut switched, MLS can offload processing from the
backbone routers. Furthermore, the NFFC's additional capabilities such as Protocol
Filtering, IGMP Snooping, and QoS classification can be extremely useful in wiring-closet
applications (in fact, this is where they are most useful).
TIP The primary advantage of a routing switch (MLS) is its unique capability to blend Layer 2
and Layer 3 technology.
On the other hand, MLS requires that you take specific actions to fully realize the scalability
benefits of Layer 3 processing. For example, Chapter 7 discussed the importance of using
Layer 3 processing to break large campus networks into smaller Spanning Tree domains.
However, just blindly installing MLS-capable switches does not do this. Figure 11-23
illustrates a large network containing 50 MLS-capable switches with RSMs (for simplicity,
not all are shown) and 50 VLANs.
MLS versus 8500s 509
[Figure 11-23: a Layer 2 core carrying all 50 VLANs, surrounded by MLS-capable switches]
As you can see, the net effect is a huge, flat network with lots of routers sitting on the
perimeter. The RSM and the MLS processing are not creating any Layer 3 barriers. The
VLAN Trunking Protocol (VTP) discussed in Chapter 12, "VLAN Trunking Protocol,"
automatically puts all 50 VLANs on all 50 switches by default (even if every switch only
uses two or three VLANs). Every switch then starts running 50 instances of the Spanning-
Tree Protocol. If a problem develops in a single VLAN on a single switch, the entire
network can quickly collapse.
Creating Layer 3 partitions when using the MLS-style of Layer 3 switching requires careful
design and planning of VLANs and trunk links. Figure 11-24 illustrates one approach.
[Figure 11-24: Building 1 (Cat-1C, Cat-1D) and Building 2 (Cat-2C, Cat-2D) joined by two non-trunk links placed in VLAN 30 and VLAN 31]
In this case, VLANs have not been allowed to spread throughout the campus. Assume that
the campus represents two buildings. VLANs 1 through 10 have been contained within
Building 1. VLANs 11 through 20 have been placed in Building 2. A pair of links connects
the two buildings. Rather than simply creating ISL links that trunk all VLANs across to the
other building, non-trunk links have been used. By placing each of these links in a unique
VLAN, you are forcing the traffic to utilize Layer 3 switching before it can exit a building.
Also, because VTP advertisements are sent only on trunk links, this prevents VTP's default
tendency of spreading every VLAN to every switch.
TIP Another strategy that helps create Layer 3 barriers in an MLS network is assigning a unique
VTP domain to each building. VTP advertisements are only shared between Catalysts that
have matching VTP domain names. If each building has a different VTP domain name, the
VLANs are contained.
This is where the 8500 style of Layer 3 switching excels. Because 8500s are simply a faster
version of the traditional Cisco router, they automatically create Layer 3 barriers that are
the key to network stability and scalability. For example, 8500s do not run the Spanning-
Tree Protocol unless bridging is specifically enabled. Similarly, the 8500s do not pass
VLANs by default. Instead, they terminate VLANs and then route them into other VLANs.
Therefore, you must take specific steps (such as enabling bridging) on an 8500 to not get
the benefits of Layer 3 partitions. Figure 11-25 illustrates this point.
Figure 11-25 Using an 8500 to Link Layer 2 Catalysts
[Diagram: an 8500 routing between Building 1 (Cat-1C, Cat-1D) and Building 2 (Cat-2C, Cat-2D)]
Without any special effort on the part of the Catalyst 5000s, the 8500s automatically isolate
each building behind a Layer 3 barrier. This provides many benefits such as improved
Spanning Tree stability and performance, easier configuration management, and improved
multicast performance.
TIP The primary advantage of switching routers (8500s) is simplicity. They allow a network to
be as simple as the old router and hub design while also having the performance of modern-
day switching.
Notice that the Catalyst 8500 is such a Layer 3-oriented box that it essentially has no
concept of a VLAN. Yes, it does support bridge groups, an alternate means of creating
multiple broadcast domains. However, it currently does not directly support VLANs and all
of the VLAN-related features you find on more Layer 2-oriented platforms such as the
Catalyst 5000 and 6000 (such as VTP and Dynamic VLANs). This essentially brings the
discussion full circle to the opening point of this section: if you need a box with
sophisticated Layer 2 features such as VLANs, VTP, and DISL/DTP, but you also need
high-performance Layer 3 switching, go with MLS. If, on the other hand, you desire the
simplicity of a traditional router-based network, 8500s are the solution of choice.
TIP One implication of the discussion in this section is that 8500s virtually require a design that
does not place the same VLAN/subnet on different IDF switches (it can be done through
IRB, but, as discussed earlier, the use of IRB on a large scale should be avoided). On the other
hand, the more Layer 2-oriented nature of MLS makes it fairly easy to have a single VLAN
connect to multiple IDF switches.
NOTE In a Catalyst 6000, the NFFC functionality is technically handled by an additional card
known as the Policy Feature Card (PFC). However, because current implementations
require an MSFC to allow a PFC to perform Layer 3 switching (alone, the PFC can provide
QoS and access list features), this text will simply refer to the MSFC.
The MSM was the initial Layer 3 offering for the Catalyst 6000s. Based on the 8510 SRP,
this card offers approximately 5 million pps for IP and IPX routing. From a configuration
standpoint, it uses four Gigabit Ethernet connections to the backplane. Each of these ports
can be used in a separate VLAN. Or, by enabling Gigabit EtherChannel on these ports, it
can be used as a single interface supporting any number of VLANs. As with the router-on-
a-stick approach discussed earlier, each VLAN can then be configured on a separate
subinterface.
The second phase of Layer 3 switching for the Catalyst 6000s introduced the MSFC. This
brings NFFC II functionality to the Catalyst 6000s, allowing full MLS support at 15 million
pps for both IP and IPX. This also provides software-based routing services via technology
derived from the 7200 router NPEs. By doing so, it completely eliminates the need for also
having an MSM in the same chassis. The on-board router uses software routing to handle
the first packet of every IP or IPX flow. The remaining packets are then handled in hardware
by MLS. Finally, the on-board router can also be used to provide full software-based
multiprotocol routing for protocols such as AppleTalk, DECnet, and VINES at
approximately 100,000 pps (Fast-Switched speeds).
One of the most interesting features of the MSFC is that its configuration and management
characteristics can be completely changed by using one of two different software images.
Under the first option, the software-based router uses a traditional IOS image while the
Supervisor uses the traditional XDI/CatOS image. This results in a user interface and
configuration process that is virtually identical to that discussed in the MLS section
earlier in the chapter. This is referred to as the MSFC Hybrid Mode. In the second option,
the MSFC Native IOS Mode, both the software-based router and the Supervisor run full
IOS images. This creates an extremely integrated user interface. In short, by simply
modifying the software on your Catalyst you can convert a very switch-like device into a
full-blown router! For more information, see Chapter 18, "Layer 3 Switching and the
Catalyst 6000/6500s."
HSRP
Cisco's Hot Standby Router Protocol (HSRP) plays an important role in most campus
networks. The primary mission of HSRP is providing a redundant default gateway for end
stations. However, it can also be used to provide load balancing. This section discusses both
of these issues.
Many end stations allow only a single default gateway. Normally, this makes these hosts
totally dependent on one router when communicating with all nodes off the local subnet.
To avoid this limitation, HSRP provides a mechanism to allow this single IP address to be
shared by two or more routers as illustrated in Figure 11-26.
514 Chapter 11: Layer 3 Switching
Figure 11-26 HSRP Allows Multiple Routers to Share IP and MAC Addresses
[Figure: Router-A and Router-B connect to the backbone via their e1 interfaces. Router-A
attaches to Cat-A and Router-B to Cat-B; Segment 3 links Cat-A and Cat-B. Cat-C connects
to Cat-A over Segment 1 and to Cat-B over Segment 2. Host-A (IP 10.1.1.42, default
gateway 10.1.1.1) attaches to Cat-C.]
Although both routers are assigned unique IP addresses as normal (10.1.1.2 and 10.1.1.3),
HSRP provides a third address that both routers share. The two routers exchange periodic
hello messages (every three seconds by default) to monitor the status of each other. One
router is elected the active HSRP peer and handles all router responsibilities for the shared
address. The other node then acts as the standby HSRP peer. If the standby peer misses
HSRP hellos for Hold Time seconds (by default, Hold Time is 10 seconds and Hello Time
is 3 seconds; therefore, the default timers require that 3 hellos are missed), it assumes that
the active peer has failed and takes over the role of the active peer.
One of the subtleties of HSRP is that the routers do not just share an IP address. To create
a truly transparent failover mechanism, they must also share a MAC address. The routers
therefore use an algorithm to create a shared virtual MAC address. As with the shared IP
address, the active peer is the only node using the derived MAC address. However, if the
active peer fails, the other device adopts not only the shared IP address, but also the shared
MAC address. As a result, the ARP caches in the end stations on the network do not
require updating after a failover.
TIP Although the shared MAC address prevents ARP cache problems during an HSRP failover
scenario, it can be a problem when initially testing HSRP. For example, assume that you
convert an existing router using the 10.1.1.1 address into an HSRP configuration where
10.1.1.1 becomes the shared IP address. At this point, the end stations still have the real
MAC address associated with the original router, not the virtual MAC address created by
HSRP. To solve this problem, reboot the end stations or clear their ARP caches.
Note that HSRP can be useful even in cases where the TCP/IP stack running on your clients
supports multiple default gateways. In some cases, the mechanisms used by these stacks to
failover to an alternate default gateway do not work reliably. In other cases, such as with
current Microsoft stacks, the redundancy feature only works for certain protocols (such as
TCP, but not UDP). In either case, most organizations do not want to leave default gateway
reliability to chance and instead implement HSRP.
TIP HSRP is useful even if your TCP/IP stack allows multiple default gateways.
Example 11-18 presents a sample HSRP configuration for Router-A in Figure 11-26.
The real IP address is assigned with the usual ip address command. HSRP parameters are then
configured using various standby commands. The shared IP address is added with the standby
group_number ip ip_address command. This command needs to be entered on both routers.
In most campus designs, some thought should be given to the proper placement of the
active peer. In general, the following guidelines should be used:
The active HSRP peer should be located near or at the Spanning Tree Root Bridge.
A router should relinquish its role as the active HSRP peer if it loses its connection
to the backbone.
In networks that contain Layer 2 loops, the Spanning Tree Root Bridge acts as the center of the
universe. Other bridges then look for the most efficient path to this device. By placing the active
HSRP peer at or near the Root Bridge, the Spanning-Tree Protocol automatically helps end-
user traffic follow the best path to the default gateway. For example, if Router-A is the active
HSRP peer in Figure 11-26 but Cat-B is the Spanning Tree Root Bridge, Segment 1 has a port
in the Blocking state. This forces all of the default gateway traffic to take an inefficient path
through Cat-B (using Segment 2 and Segment 3). By co-locating the active HSRP peer and the
Root Bridge at Cat-A and Router-A, this unnecessary bridge hop can be eliminated.
To force Cat-A to be the Root Bridge, the set spantree root or set spantree priority
commands discussed in Chapter 6 can be used. To force Router-A to be the active HSRP peer,
the standby group_number priority priority_value command can be used. The peer with
the highest priority_value becomes the active peer (the default is 100). In this case, Router-
A has a configured priority of 110, making it win the active peer election. However, if
Router-A boots after Router-B, it does not supersede Router-B by default (it waits for
Router-B to fail first), creating the same inefficient pattern discussed earlier. This can be
avoided by configuring the standby group_number preempt command. This causes a
router to instantly take over as soon as it has the highest priority.
TIP Unlike the Spanning-Tree Protocol where lower values are always preferred, HSRP prefers
higher values.
The second guideline speaks to a situation where a router has the highest priority, but it has
lost its connection to the rest of the network. For example, Router-A is the active HSRP peer
but its Ethernet1 link goes down. Although this does not prevent traffic from reaching the
backbone (Router-A can use its Ethernet0 interface to send traffic to the backbone through
Router-B), it does lead to an inefficient flow. To prevent this situation, the standby track
option can be used as shown in Example 11-18. The value indicated by the standby track
command is the value that gets decremented from the node's priority if the specified
interface goes down. Multiple standby track commands can be used to list multiple
interfaces to track (if more than one interface goes down, the decrement values are
cumulative). In this example, if Router-A loses interface Ethernet1, the priority is lowered
to 95. Because this is lower than the default priority of 100 being used by Router-B, Router-
B takes over as the active peer and provides a more optimal flow to the backbone.
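Pulling these commands together, a configuration for Router-A along these lines (a sketch, not the book's actual Example 11-18 listing; the interface and group numbers are assumptions, and the track decrement of 15 matches the drop from 110 to 95 described above) might look like this:

```
interface Ethernet0
 ip address 10.1.1.2 255.255.255.0
 ! Shared address used by the end stations as their default gateway
 standby 1 ip 10.1.1.1
 ! Priority 110 wins the active peer election over Router-B's default of 100
 standby 1 priority 110
 standby 1 preempt
 ! Drop priority to 95 if the backbone-facing interface fails
 standby 1 track Ethernet1 15
```

Router-B would carry its own real address (10.1.1.3) and the same standby 1 ip 10.1.1.1 command, but keep the default priority of 100.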
Although the configuration discussed in this section does provide a redundant default
gateway for the end stations connected to Cat-C, it does suffer from one limitation: Router-
A is handling all of the traffic. To eliminate this problem, multiple VLANs should be
created on Cat-C. Each VLAN uses a separate group_number on the standby command.
Then, the VLANs should alternate active peers between the two routers. For example,
Router-A could be the active peer for all odd-numbered VLANs, and Router-B could be the
active peer for all even-numbered VLANs. Example 11-19 presents a sample configuration
for Router-A (two VLANs and an ISL interface are used).
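Such an alternating configuration for Router-A might be sketched as follows (this is not the book's actual Example 11-19 listing; the VLAN numbers, subnets, and interface name are assumptions). Router-A is given the higher priority for odd-numbered VLAN 1 only:

```
interface FastEthernet0/0/0.1
 encapsulation isl 1
 ip address 10.1.1.2 255.255.255.0
 standby 1 ip 10.1.1.1
 ! Router-A is active for the odd-numbered VLAN
 standby 1 priority 110
 standby 1 preempt
!
interface FastEthernet0/0/0.2
 encapsulation isl 2
 ip address 10.1.2.2 255.255.255.0
 ! Default priority (100) leaves Router-B active for the even-numbered VLAN
 standby 2 ip 10.1.2.1
 standby 2 preempt
```

Router-B would mirror this configuration, raising its priority under standby group 2 instead.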
TIP Alternate HSRP active peers for different VLANs between a pair of routers. This provides
load balancing in addition to redundancy.
TIP The HSRP syntax allows a single standby group to be created without specifying the
group_number parameter. However, I recommend that you always specify this parameter
so that it is much easier to add other standby groups in the future. Also, using the default
standby group can lead to very strange behavior if you accidentally use it in an attempt to
configure HSRP for multiple VLANs.
[Figure 11-27: Router-A (real IP 10.1.1.3) and Router-B (real IP 10.1.1.4) attach to the
backbone and connect directly to Cat-C; both routers share HSRP Group 1 (10.1.1.1) and
HSRP Group 2 (10.1.1.2). Host-A and Host-B attach to Cat-C.]
The design in Figure 11-27 has the wiring closet switch directly connected to a pair of
switching routers such as the Catalyst 8500. This eliminates all Layer 2 loops and removes
Spanning Tree from the equation (although there is a link between the two routers, it is a
separate subnet). Furthermore, because a single VLAN is in use on Cat-C (the wiring
closet switch), the alternating VLANs trick cannot be used.
In this case, the most effective solution is the use of Multigroup HSRP (MHSRP). This
feature allows multiple HSRP group_numbers to be used on a single interface. For
example, Example 11-21 shows a possible conguration for Router-A.
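A sketch of such an MHSRP configuration for Router-A (not the book's actual Example 11-21 listing; the interface name and priority values are assumptions) might place both standby groups on the single interface facing Cat-C:

```
interface FastEthernet0/0/0
 ip address 10.1.1.3 255.255.255.0
 ! Router-A is active for the first shared address
 standby 1 ip 10.1.1.1
 standby 1 priority 110
 standby 1 preempt
 ! Standby for the second shared address (default priority 100)
 standby 2 ip 10.1.1.2
```

Router-B would use its real address 10.1.1.4 and raise its priority under group 2 instead, making each router active for one shared address.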
NOTE Some of the low-end routers use a Lance Ethernet chipset that does not support MHSRP.
However, all of the devices suitable for campus backbone use do support MHSRP.
The code in Example 11-21 creates two shared addresses between Router-A and Router-B
for a single subnet. Load balancing can then be implemented by having half of the hosts on
Cat-C use 10.1.1.1 as a default gateway and the other half use 10.1.1.2. The potential
downside is that you have to come up with some way of configuring different hosts to use
different default gateways. Fortunately, DHCP provides a simple and effective technique to
accomplish this.
Because existing DHCP standards do not provide for server-to-server communication,
organizations are forced to divide every scope (a scope can be loosely defined as a subnet's
worth of DHCP addresses) into two blocks (assuming they want redundant
DHCP servers). Each of two redundant DHCP servers receives one half of each scope. For
example, a /24 subnet with 54 addresses reserved for fixed configuration leaves 200
addresses available for DHCP. 100 of these addresses can be placed on the first DHCP
server and the other 100 are placed on the second DHCP server. If one of the DHCP servers
fails, the other can provide addresses for clients on the network (assuming that no more than
100 new addresses are required). Because each server has its own block of globally unique
addresses for every scope, the lack of a server-to-server protocol is not a problem.
DHCP supports a variety of options that can be used to configure client stations. DHCP
Option 3 allows DHCP servers to provide a default gateway (or a list of default gateways) to
clients. Simply configure one DHCP server with the first shared HSRP address (10.1.1.1 in
Figure 11-27) and the other DHCP server with the second shared HSRP address (10.1.1.2).
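As an illustration only, using the Cisco IOS DHCP server syntax as a stand-in for whatever DHCP server software is actually deployed (the pool name and the exact address split are assumptions), the first server's half of the scope might look like this:

```
! Hand out only 10.1.1.55-10.1.1.154; the rest of the /24 is excluded
ip dhcp excluded-address 10.1.1.1 10.1.1.54
ip dhcp excluded-address 10.1.1.155 10.1.1.254
!
ip dhcp pool SUBNET1-SERVER-A
 network 10.1.1.0 255.255.255.0
 ! Option 3: first shared HSRP address as the default gateway
 default-router 10.1.1.1
```

The second server would exclude the complementary range and use default-router 10.1.1.2.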
For this technique to work, it requires a fairly random distribution of leases between the two
DHCP servers. If one server ends up issuing 90 percent of the leases, one of the routers
likely receives 90 percent of the traffic. To help ensure a random distribution, you should
place the two DHCP servers close to each other (generally in the same server farm). You
can also alternate the order of ip helper-address statements between the two routers. For
example, assuming that the DHCP servers have the addresses 10.1.55.10 and 10.1.55.11,
Router-A might use the ip helper-address configuration in Example 11-22.
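The alternating order might be sketched like this (an illustration rather than the book's actual Examples 11-22 and 11-23; the interface name is an assumption). On Router-A:

```
interface Ethernet0
 ip address 10.1.1.2 255.255.255.0
 ! Broadcast DHCP requests are relayed to both servers, in this order
 ip helper-address 10.1.55.10
 ip helper-address 10.1.55.11
```

Router-B would list 10.1.55.11 first and 10.1.55.10 second.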
Integration between Routing and Bridging 521
Conversely, Router-B can then use the opposite order as demonstrated in Example 11-23.
This causes Router-A to give a slight advantage to the DHCP server located at 10.1.55.10,
but Router-B gives an advantage to the other DHCP server. In general, both servers have an
equal chance of responding rst to the DHCP_DISCOVER packets that clients use to
request a lease.
TIP If this DHCP and MHSRP trick is not to your liking, consider placing a Layer 3 switch in
the IDF wiring closet. Although this can be cost-prohibitive, it allows all devices connected
to that IDF to use the IDF switch itself as a default gateway. The Layer 3 routing capabilities
in the IDF switch can then choose the best path to use into the campus backbone and
automatically balance the load over both uplinks. However, I should also point out that this
can be difficult to implement with routing switch (MLS) devices. In general, it is much
easier to accomplish with switching router designs such as the Catalyst 8500 and the native
IOS router mode of the 6000.
NOTE A third technique is possible using the Catalyst 6000 Native IOS Mode. This will be
discussed in Chapter 18.
TIP Other network designers cringe at this thought because it no longer keeps the VLANs
separate.
This sort of bridging can be easily configured using the same bridge-group technology that
Cisco routers have supported for many years. For example, the configuration in Example
11-24 enables bridging between VLANs 2 and 3 on an 8500.
The configuration in Example 11-24 results in IP and IPX traffic being routed between
subinterfaces Fast Ethernet1/0.2 and 1/0.3 while all other protocols are bridged. No
bridging is performed on subinterfaces Fast Ethernet1/0.1 and 1/0.4. Notice that this
requires IP users in VLANs 2 and 3 to use different IP subnets (and IPX networks) but the
same AppleTalk cable range.
TIP When using bridge-groups, remember that protocols configured with a Layer 3 address will
be routed while all other protocols will be bridged. For example, if you only configure an
IP address on a given interface, IP will be routed and all other protocols (IPX, AppleTalk,
DECnet, and so on) will be bridged.
IRB
Integrated Routing and Bridging (IRB) is a technique Cisco introduced in IOS 11.2 to allow
a single protocol to be both bridged and routed on the same box. IOS 11.1 introduced a
precursor to IRB called Concurrent Routing and Bridging (CRB). This allowed a particular
protocol such as IP to be both bridged and routed on the same device. It allowed all of the
routed interfaces using this protocol to communicate and all of the bridged interfaces to
communicate. However, CRB did not let routed interfaces communicate with the bridged
interfaces. In other words, the routed and bridged worlds for the congured protocol were
treated as two separate islands. Most people were not looking for this functionality.
IRB filled this gap by allowing communication between these two islands. This enables
configurations such as those shown in Figure 11-28.
[Figure 11-28: An IRB router with routed interfaces fa0/0/0, fa0/0/1, and fa0/0/2
connecting to Host-A (10.1.1.20), Host-B (10.1.2.15), and Host-C (10.1.3.42), and bridged
interfaces fa0/0/3 and fa0/0/4 connecting to Host-D (10.1.4.62) and Host-E (10.1.4.63) on
a common subnet.]
The interfaces on the right side of the router (fa0/0/0, fa0/0/1, and fa0/0/2) all use IP
addresses on separate IP subnets. Conversely, the interfaces on the left (fa0/0/3 and
fa0/0/4) both fall on the same subnet. And, because IRB is in use, 10.1.4.62 could ping 10.1.1.20
(this is not possible in CRB).
To create a link between the routed and bridged domains, Cisco created a special virtual
interface known as a Bridged Virtual Interface (BVI). The BVI can be configured with
Layer 3 addresses (it cannot be configured with bridging statements) and acts as a routed
interface into the rest of the box. For example, the BVI in Figure 11-28 might use an IP
address of 10.1.4.1, as illustrated in Figure 11-29.
[Figure 11-29: The BVI (10.1.4.1) serves the bridged interfaces fa0/0/3 and fa0/0/4, which
connect Host-D (10.1.4.62) and Host-E (10.1.4.63). The routed interfaces fa0/0/0
(10.1.1.1), fa0/0/1 (10.1.2.1), and fa0/0/2 (10.1.3.1) connect Host-A (10.1.1.20), Host-B
(10.1.2.15), and Host-C (10.1.3.42).]
If interface fa0/0/4 receives a frame with Host-D's MAC address, it bridges it out interface
fa0/0/3. However, if Host-D pings Host-A, Host-D sends the IP packet to its default
gateway, address 10.1.4.1 (the BVI's address). If necessary, Host-D ARPs for 10.1.4.1 to
learn the BVI's MAC address. When interface fa0/0/4 receives the traffic with a MAC
address that belongs to the BVI, it knows to route, not bridge, the traffic. The normal routing
process then sends the traffic out interface fa0/0/0.
The BVI essentially acts as a single routed interface on behalf of all of the bridged
interfaces in a particular VLAN. In Figure 11-29, the BVI communicates with the right side
of the box through routing, whereas the left side uses bridging.
TIP You need one BVI interface for each VLAN that contains two or more interfaces on a single
IOS-based router (except for the ports that make up an individual EtherChannel bundle).
Example 11-25 shows a sample configuration for the network illustrated in Figures 11-28
and 11-29.
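A sketch of such an IRB configuration (not the book's actual Example 11-25 listing; it is reconstructed from the addresses in Figure 11-29) might look like this:

```
bridge irb
bridge 1 protocol ieee
! Route IP between the BVI and the routed interfaces
bridge 1 route ip
!
interface FastEthernet0/0/0
 ip address 10.1.1.1 255.255.255.0
!
interface FastEthernet0/0/3
 no ip address
 bridge-group 1
!
interface FastEthernet0/0/4
 no ip address
 bridge-group 1
!
! The BVI acts as the routed interface for bridge group 1
interface BVI1
 ip address 10.1.4.1 255.255.255.0
```

Interfaces fa0/0/1 and fa0/0/2 would be configured like fa0/0/0, each with its own subnet.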
TIP The BVI interface number must match the bridge group number.
IRB is an important feature on platforms such as the Catalyst 8500. For instance, consider the
case where you want to directly connect 10 servers to an 8540 along with 20 trunks that lead
to separate wiring closet switches. Assume the servers are all going to be placed in the same
subnet (therefore bridging IP traffic), whereas each wiring closet switch uses a separate
subnet (therefore routing IP). Without IRB, it is not possible to both route and bridge the IP
traffic in one device. In other words, IRB allows you to route IP subnets while also extending
the server farm VLAN through the router. Another advantage of IRB is that it is performed at
nearly wire speed in the 8500 (IRB is Fast-Switched in software-based routers).
However, IRB does have its downsides. Most importantly, the extensive use of IRB can
create configuration nightmares. For example, consider an 8540 with 100+ interfaces and
30 or 40 BVIs (one for every VLAN that needs to mix routing and bridging). Also, it can
quickly deplete the number of available logical interfaces supported by the IOS. To avoid
these issues, IRB should be used to solve specific and narrowly defined problems. Do not
try to build your entire network on IRB.
TIP A single BVI that contains many interfaces does not present configuration challenges.
However, a design that uses many BVIs can lead to problems and confusion.
NOTE As discussed earlier, switching router platforms such as the 8500s do not suffer from this
drawback because they perform both Layer 2 and Layer 3 forwarding at essentially the
same speed.
However, even if the throughput problem is not an issue in your network, there is another
problem that can surface. Assume that you link two Layer 2 domains with the configuration
in Example 11-26.
Example 11-26 Routing IP and IPX While Bridging All Other Protocols
interface FastEthernet0/0/0
no ip address
!
interface FastEthernet0/0/0.1
encapsulation isl 1
ip address 10.1.1.1 255.255.255.0
ipx network 1
bridge-group 1
!
interface FastEthernet0/0/0.2
encapsulation isl 2
ip address 10.1.2.1 255.255.255.0
ipx network 2
bridge-group 1
!
bridge 1 protocol ieee
The configuration in Example 11-26 routes IP and IPX between VLANs 1 and 2 but also allows
non-routable traffic such as NetBIOS/NetBEUI to be bridged through the router. However, this
also merges the Spanning Trees in VLANs 1 and 2. This introduces two significant problems:
Scalability problems: When two (or more) Spanning Trees are merged into a single
tree, all of the Spanning Tree scalability benefits created by introducing routers
disappear. A single Root Bridge is established between both VLANs. A single set of
least-cost paths is found to this Root Bridge. Spanning Tree instability can easily and
quickly spread throughout both VLANs and create network-wide outages.
Defeats Spanning Tree load balancing: Recall from Chapter 7 that Spanning Tree
load balancing uses multiple VLANs to create different logical topologies between
VLANs. For example, VLAN 1 can use Trunk-1 and VLAN 2 can use Trunk-2.
However, if the Spanning Trees are merged because of inter-VLAN bridging, both
VLANs are forced to use a single trunk. This cuts your available bandwidth in half!
Even if you find a way to avoid the issues previously discussed, you must plan very carefully
to avoid a third problem that I refer to as the Broken Subnet Problem. Figure 11-30 illustrates
this issue.
Figure 11-30 The Broken Subnet Problem
[Figure: A ring of five devices. Cat-A, the Root Bridge, sits at the top with Host-A (10.1.1.5)
attached. Router-A (10.1.1.1 and 10.1.2.1) and Router-B (10.1.1.2 and 10.1.2.2) connect
Cat-A to Cat-B and Cat-C, respectively. Cat-B and Cat-C are linked at the bottom of the
ring, with a Blocking port on that link; Host-B (10.1.2.6) also attaches at the bottom of the
ring.]
Three Layer 2 Catalysts and two routers have been used to form a ring. Router-A and
Router-B have been configured to route IP and bridge all other protocols using a bridge-
group 1 command on both Ethernet interfaces. As a result, IP sees the topology shown in
Figure 11-31.
[Figure 11-31: IP's view of the network: one subnet containing Host-A (10.1.1.5) and the
routers' 10.1.1.1 and 10.1.1.2 interfaces, and a second subnet containing the routers'
10.1.2.1 and 10.1.2.2 interfaces and Host-B (10.1.2.6).]
In other words, IP has divided the network into two subnets. However, the Spanning-Tree
Protocol has a very different view of the world. Spanning Tree knows nothing about IP's
interpretation. Instead, it sees only a ring of five Layer 2 devices. Assuming that Cat-A
becomes the Root Bridge, Spanning Tree creates the topology shown in Figure 11-32 (see
Chapter 6 for information on DP, RP, F, and B).
Figure 11-32 Spanning Tree's View of the Network Shown in Figure 11-30
[Figure: Cat-A, the Root Bridge, has Designated Ports toward Router-A and Router-B (each
link cost 19). The routers' ports toward Cat-A are Root Ports, and their ports toward Cat-B
and Cat-C are Designated Ports. On the Cat-B to Cat-C link (cost 19), Cat-B holds the
Designated Port and Cat-C's port is Blocking.]
Unfortunately, this can create a very subtle problem. Consider what happens if Host-A tries to
send IP data to Host-B. Host-A recognizes that Host-B is in a different subnet and forwards the
traffic to Router-A, its default gateway. Router-A does the normal IP routing thing and forwards
the traffic out its interface E1. However, the traffic cannot be delivered to Host-B because the
Layer 2 switches have blocked the path (remember that Spanning Tree blocks all traffic on a
Catalyst, not just non-routable traffic)! The dashed line in Figure 11-30 shows this path.
The Broken Subnet Problem can be extremely difficult to troubleshoot and diagnose. In many
cases, only a very small number of nodes experience problems communicating with each
other. For example, Host-A cannot reach Host-B, but it might be able to reach every other
address in the network. If your network has Spanning Tree stability problems, the broken link
is constantly shifting locations. Furthermore, the failure is protocol specific. If Host-A tries to
reach Host-B using any protocol other than IP, it succeeds. All of these issues can lead to many
extremely long days of troubleshooting before the actual problem is discovered.
One way to avoid this problem is to adjust Spanning Tree parameters so that the Blocking
port no longer splits a subnet. Figure 11-33 shows how this can be done by increasing the
cost of the link between Cat-A and Router-A to 1000 (the cost actually needs to be increased
on the Router, not Cat-A, because Cat-A is the Root Bridge).
[Figure 11-33: With the cost of Router-A's link toward Cat-A (the Root Bridge) raised to
1000, Router-A's port on that link becomes the Blocking port; the remaining links (cost 19)
stay Forwarding, and the active topology runs through Router-B.]
Another option is to move the Root Bridge to Router-B. When using this approach, Bridge
Priorities or Path Costs might also have to be adjusted to force the Blocking port to the
router-end of the link to Cat-B.
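On an IOS router, the cost adjustment described above might be sketched as follows (the interface name is an assumption; bridge group 1 follows the pattern of Example 11-26):

```
interface Ethernet0
 ! Interface facing Cat-A, the Root Bridge
 bridge-group 1
 ! Raise the Spanning Tree path cost so this port goes into Blocking
 bridge-group 1 path-cost 1000
```

Because a Blocking bridge-group port on a router still routes, this moves the Layer 2 break onto the router without splitting a subnet.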
As discussed in Chapter 6, Radia Perlman created the DEC version of the Spanning-Tree
Protocol, which was initially offered on DEC equipment. The IEEE version was the one
standardized in 802.1D. VLAN-Bridge is a Cisco-specific extension to the IEEE protocol
(it uses the same BPDU layout with a SNAP header). It was created to provide exactly the
sort of feature being discussed in this section.
For example, in Figure 11-30, the Catalysts Cat-A, Cat-B, and Cat-C are using the IEEE
Spanning-Tree Protocol, and Router-A and Router-B are using the DEC Spanning-Tree
Protocol.
At first thought, this sounds like a dangerous proposition. After all, mixing the two
protocols can lead to situations where one of the protocols does not detect a loop and the
whole network collapses. However, because Layer 2 Catalysts process BPDUs slightly
differently than IOS routers, the results can be both safe and effective. This is possible
because of the following two characteristics:
Layer 2 Catalysts flood DEC BPDUs
IOS routers swallow IEEE BPDUs if they are only running the DEC variation of the
Spanning-Tree Protocol
As a result, the IEEE BPDUs are blocked by the routers, creating a topology that resembles
that shown in Figure 11-34.
Figure 11-34 IEEE Topology When Running Two Versions of the Spanning-Tree Protocol
[Figure: Cat-A forms one IEEE Spanning Tree partition and elects itself Root Bridge;
Cat-B and Cat-C form a second partition with its own Root Bridge.]
The IEEE Spanning-Tree Protocol views the network as being partitioned into two separate
halves. Each half elects its own Root Bridge.
On the other hand, the DEC Spanning-Tree Protocol running on the routers sees a very
different view of the network. From the DEC perspective, the Layer 2 Catalysts do not exist
because DEC BPDUs are flooded without alteration by Cat-A, Cat-B, and Cat-C. This
creates the logical topology diagrammed in Figure 11-35.
Figure 11-35 DEC Topology When Running Two Versions of the Spanning-Tree Protocol
[Figure: From the DEC protocol's perspective, Router-A and Router-B appear directly
connected by two parallel paths. Router-B is the Root Bridge; Router-A's port on one path
is Blocking, while all other ports are Forwarding.]
NOTE Notice that the DEC Spanning-Tree Protocol uses a cost of 1 for Fast Ethernet links.
NOTE As a minor detail, this assumes that the upper interface on Router-B has a lower Port ID.
See the Port/VLAN Priority Load Balancing section of Chapter 7 for more detail.
Although this can hardly be called the most intuitive approach, it can be very useful in
networks that need to mix routing and bridging technology.
TIP Use multiple versions of the Spanning-Tree Protocol to avoid the Broken Subnet Problem
where the simpler techniques discussed in the previous two sections are not an option.
Review Questions
This section includes a variety of questions on the topic of this chapter: Layer 3 switching.
By completing these, you can test your mastery of the material included in this chapter as
well as help prepare yourself for the CCIE written and lab tests.
1 What is the difference between routing and Layer 3 switching?
2 Can the router-on-a-stick approach to inter-VLAN routing also support inter-VLAN
bridging?
3 How can the RSM be especially useful in remote ofce designs?
Understanding VTP
VTP is a Cisco proprietary, Layer 2 multicast messaging protocol that can ease some of the
administrative burden associated with maintaining VLANs. VTP maps VLANs across all
media types and VLAN tagging methods between switches, enabling VLAN configuration
consistency throughout a network. VTP reduces the manual configuration steps required at
each switch to add a VLAN when the VLAN extends to other switches in the network.
Further, VTP minimizes potential configuration mismatches and manages the addition,
deletion, and renaming of a VLAN in a more secure fashion than making manual changes
at every switch. VTP is a value-add software feature specific to Cisco Catalyst switch
products such as the 1900, 2820, 2948G, 3000, 4003, 5000 family, and 6000 family.
Not infrequently, users confuse VTP, ISL, 802.1Q, DISL, and DTP.
All of these protocols involve trunks, but have different purposes. Table 12-1 compares
these protocols.
ISL and 802.1Q specify how to encapsulate or tag data transported over trunk ports. The
encapsulation and tagging methods identify a packet's source VLAN. This enables the
switches to multiplex the traffic from multiple VLANs over a common trunk link. Chapter 8,
"Trunking Technologies and Applications," describes these two methods and how they
function.
DISL and DTP help Catalysts to automatically negotiate whether to enable a common link
as a trunk or not. The Catalyst software included DISL until Cisco incorporated support for
802.1Q. When 802.1Q was introduced, the protocol needed to negotiate whether to use ISL
or 802.1Q encapsulation. Therefore, Cisco introduced the second generation trunk
negotiation protocol, DTP. DISL and DTP are described in Chapter 8.
VTP provides a communication protocol between Catalysts over trunks. The protocol
allows Catalysts to share information about VLANs in the VTP management domain. VTP
operates only after DISL/DTP complete the trunk negotiation process and functions as a
payload of ISL/802.1Q. VTP does not work over non-trunk ports. Therefore, it cannot send/
receive any messages until DISL or DTP negotiates a link into trunk status. VTP works
separately from ISL and 802.1Q in that VTP messages transport configuration data,
whereas ISL and 802.1Q specify encapsulation methods. If you have a protocol analyzer
capable of decoding these protocols and set it up to capture trunk traffic, it displays VTP as
encapsulated within an ISL or 802.1Q frame. Figure 12-12, discussed later in this chapter,
shows a VTP frame encapsulated in ISL.
VTP primarily distributes VLAN information. You must configure VTP before you can
configure any VLANs. Chapter 5 presented the three steps for creating a VLAN.
Specifically, the steps include:
Step 1 Assign the Catalyst to a VTP domain (unless the Catalyst is configured
in VTP transparent mode, discussed later).
Step 2 Create the VLAN.
Step 3 Assign ports to the VLAN.
Chapter 5 described the details of the last two steps, but deferred the discussion about VTP
domains to this chapter.
A VTP domain associates Catalysts with a common configuration interest. Catalysts within
a VTP domain share VLAN information with each other. If an administrator on a Catalyst
creates or deletes a VLAN, the other Catalysts in the VTP domain automatically become
aware of the change in the list of VLANs. This helps to ensure administrative conformity
between the Catalysts. Without configuration uniformity, Spanning Tree might not, for
example, converge upon an optimal topology for the VLANs. VTP also serves to eliminate
configuration steps for you. Without VTP, you need to manually create and delete VLANs
in each Catalyst. But with VTP, VLANs automatically propagate to all other Catalysts
throughout the VTP management domain. This is the principal benefit of VTP. Although
this might not sound significant in a small network, it becomes particularly beneficial in
larger networks.
A parallel benefit of the management domain is that it limits the extent to which changes are
propagated. Figure 12-1 shows a Catalyst system with two management domains, wally and
world. Domain wally has VLANs 1, 2, 3, and 4 configured, and domain world has VLANs
1, 2, 3, and 10 configured. Assuming there are no Layer 3 issues, workstations assigned to
the same VLAN can communicate with each other even though they are in different
management domains. A station in VLAN 2 in wally belongs to the same broadcast domain
as a station in VLAN 2 in world.
540 Chapter 12: VLAN Trunking Protocol
[Figure 12-1: A Catalyst system divided into two VTP management domains, wally and
world.]
Suppose a network administrator decides to add a VLAN 5 to both domains. If you create
VLAN 5 in wally, VTP propagates the new VLAN throughout the domain wally. When the
VTP announcement reaches the border Catalyst in world, that Catalyst ignores the
information from wally. The administrator needs to also create VLAN 5 in world to spread
the VLAN existence.
Suppose the administrator decides to remove VLAN 3 from world. At a Catalyst in domain
world, the administrator clears VLAN 3. What happens to VLAN 3 in wally? Nothing.
When the border Catalyst in world advertises a VTP announcement to the Catalyst in wally,
the wally border Catalyst ignores the information and retains VLAN 3.
TIP In this case where the administrator deletes a VLAN, VTP propagates the deletion
information to the other Catalysts in the management domain. Any hosts attached to ports
in the deleted VLAN lose network connectivity because all Catalyst ports in the domain
assigned to the VLAN become disabled.
At times, network administrators get new equipment. The equipment, of course, arrives
with no configuration. But you cannot immediately start to create VLANs. You must first
define a VTP domain. If your Catalyst is configured in the default server VTP mode and
you do not assign a Catalyst to a VTP domain, the Catalyst does not let you create a VLAN,
as demonstrated in Example 12-1. Note that the Catalyst posts a message to the console
stating that it refuses to change any VLAN status until it has a domain name. The message
also references a VTP server. This is described in more detail later in the section
"Configuring VTP Mode."
The Catalyst accepts domain names up to 32 characters long. Example 12-3 shows a VTP
domain name configuration example. The administrator configures the Catalyst as a
member of the VTP domain wally.
Example 12-3 set vtp domain Example
Console> (enable) set vtp domain wally
VTP domain wally modified
Console> (enable)
What domain and VLAN do the Catalysts belong to when they have a clean configuration?
If no VTP domain is assigned to the Catalysts, the domain is NULL. All ports belong to
VLAN 1. Use the command show vtp domain any time to discover what VTP domain a
Catalyst belongs to as illustrated in Example 12-4.
542 Chapter 12: VLAN Trunking Protocol
For example, in the highlighted portion of Example 12-4, the Catalyst's display indicates that
it belongs to the domain wally. If the Domain Name field is blank, the domain is NULL.
TIP VTP domain names are case sensitive. A VTP domain name San Jose is not the same as san
jose. Catalysts with the former domain name cannot exchange VLAN configuration
information with Catalysts configured with the latter domain name.
As an example of how Catalysts obtain VTP membership, consider the Catalyst system in
Figure 12-2 where multiple Catalysts interconnect over trunks, but no management domain
is assigned.
Cat-A, Cat-B, Cat-C, and Cat-E all have trunk connections to other Catalysts, whereas Cat-D
and Cat-F only attach through access ports. This prevents Cat-D and Cat-F from receiving
VTP messages from any of the other Catalysts. Entering the command show vtp domain
reveals an output similar to that seen in Example 12-4, but with no domain name assigned on
any unit. Assume that you attach to Cat-A's console port to make configuration changes, and you enter
the command set vtp domain wally. What happens to the other Catalysts in the network? You
examine the VTP status on each Catalyst with the show vtp domain command, and you
discover that Cat-A, Cat-B, Cat-C, and Cat-E all learn about the management domain, but not
Cat-D and Cat-F. Cat-D and Cat-F fail to learn about the management domain because they
do not have trunk ports on which to receive VTP updates.
By setting the VTP domain in one unit, all other Catalysts attached with trunk ports
automatically learn about VTP domain wally and configure themselves as part of the
domain. This works because Cat-B, Cat-C, and Cat-E had trunk connections and did not
already belong to a VTP domain. If, however, you had previously associated them with a
different domain, they would ignore the VTP announcement for wally.
At this point, you can create new VLANs in Cat-A, Cat-B, Cat-C, and Cat-E, but not Cat-D
and Cat-F. You can only create VLANs in Catalysts configured with a VTP domain name (as
mentioned earlier, this assumes the default setting of VTP server mode as discussed in the
next section). Because Cat-D and Cat-F do not have a domain name, you cannot create VLANs in
them. So how do you add VLANs to Cat-D and Cat-F? You need to manually enter a VTP
domain name in these two units before you can create any VLANs in them. When you assign
them to a domain, they do not make any VTP announcements because there is no trunk link.
But they do belong to a management domain. Alternatively, you can enable the links between
Cat-A and Cat-D and Cat-C and Cat-F as trunks. When they are enabled as trunks, Cat-D and
Cat-F can then receive VTP updates and become members of the management domain wally.
The requirements for defining a VTP domain are listed earlier. One of the requirements is
that Catalysts with the same VTP domain name must be adjacent to belong to the same
management domain. In Figure 12-3, Catalysts interconnect with trunk links and belong to
several management domains. Two of the domains have the same name, but are separated
by other domains. Even though the Catalysts in the first and last domain have the same
management domain name, they actually belong to two different domains from a system
point of view.
Figure 12-3 A Multiple VTP Domain Network
Whenever a Catalyst makes a VTP announcement, it includes the VTP domain name. If the
receiving Catalyst belongs to a different management domain, it ignores the announcement.
Therefore, VTP announcements from the wally domain on the left of the drawing are never
seen by the Catalysts in the wally domain on the right of the drawing.
TIP If you are installing a domain border switch that connects two domains, it becomes a
member of the management domain that it first hears from. Therefore, be sure to attach it
to the domain that you want it to belong to first. Make sure that it acquires the VTP domain
name, and then attach it to the other domain. Otherwise, you need to manually configure
the domain name.
WARNING A side effect of VTP domains can prevent an ISL trunk link from negotiating correctly with
dynamic ISL (DISL). When DISL initiates trunk negotiation, it includes the VTP domain
name in the message. If the two ends of the link belong to different domains, the trunk fails
to establish automatically. To enable a trunk on border Catalysts between domains, set both
ends of the trunk to ON or nonegotiate.
VTP Modes
Referencing the set vtp ? output of Example 12-2, notice that you have the option to
configure a VTP mode. You can configure VTP to operate in one of three modes: server,
client, or transparent. The three modes differ in how they source VTP messages and in how
they respond when they receive VTP messages. Table 12-2 summarizes the differences
between the three modes.
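The distinctions between the modes can be sketched as a small model (hypothetical Python, not anything that runs on a Catalyst): servers and transparent switches accept local VLAN changes, but only servers advertise them, and clients only learn VLANs from a server's advertisements.

```python
# Hypothetical model of the three VTP modes (illustrative only).
class Switch:
    def __init__(self, mode):
        assert mode in ("server", "client", "transparent")
        self.mode = mode
        self.vlans = {1}          # every Catalyst starts with default VLAN 1
        self.advertised = []      # VLANs this switch has announced on trunks

    def create_vlan(self, vlan_id):
        if self.mode == "client":
            return False          # clients cannot create VLANs locally
        self.vlans.add(vlan_id)
        if self.mode == "server":
            self.advertised.append(vlan_id)  # servers advertise the change
        return True               # transparent: local change, no advertisement

    def receive_advertisement(self, vlan_id):
        if self.mode == "transparent":
            return                # ignored locally (a real switch still forwards it)
        self.vlans.add(vlan_id)

server, client, transparent = Switch("server"), Switch("client"), Switch("transparent")
server.create_vlan(2)                 # succeeds and is advertised
client.receive_advertisement(2)       # client learns VLAN 2 from the server
transparent.receive_advertisement(2)  # transparent ignores it
```

The sketch also mirrors the pass-through behavior described later: a transparent switch does not apply a received advertisement to its own VLAN list, even though it forwards the message on.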
Create VLANs
If you desire to create a VLAN, you must create it on a Catalyst configured in server or
transparent mode. These are the only modes authorized to accept set vlan and clear vlan
commands. The difference between them, though, is the behavior after you create the VLAN.
In the case of the server mode, the Catalyst sends VTP advertisements out all trunk ports to
neighbor Catalysts. Transparent mode Catalysts do not issue any type of VTP announcement
when a VLAN is created. The new VLAN is only locally significant. If you build an entire
network with Catalysts configured in transparent mode, you need to create new VLANs in each
and every Catalyst as an individual command. Catalysts in transparent mode do not, for all
intents and purposes, participate in VTP. To them, it is as if VTP does not exist. You do not need
to assign the transparently configured Catalyst to a VTP domain before you can create any local
VLANs. VTP transparent mode switches do not advertise changes or additions made for
VLANs on the local switch, but they pass through VLAN additions or changes made elsewhere.
Catalysts configured as clients do not have the authority to create any VLANs. If you
associate a client port to a VLAN that it does not know about, the Catalyst generates a
message informing you that you must create a VLAN on a server before it can move the
ports to the VLAN. If, after assigning the ports to the VLAN, you look at the VLAN status
with the show vlans command, you might notice that the ports belong to the new but non-
existent VLAN and are in a suspended state preventing the forwarding of frames in the
VLAN. When you actually create the VLAN on a server in the same management domain
as the client, the client eventually hears about the new VLAN and interprets this as
authorization to activate ports in the new VLAN.
Similarly, you cannot delete VLANs in a client, only in a server or transparent device.
Deleting a VLAN in a transparent device only affects the local device as opposed to when
you delete a VLAN from a server. When deleting a VLAN from a server, you get a warning
message from the Catalyst informing you that this action places any ports assigned to the
VLAN within the management domain into a suspended mode, as shown in Example 12-5.
Example 12-5 Clearing a VLAN in a Management Domain
Console> (enable) clear vlan 10
This command will deactivate all ports on vlan 10
in the entire management domain
Do you want to continue(y/n) [n]?y
Vlan 10 deleted
Console> (enable)
TIP Clearing a VLAN does not cause the ports in the management domain to reassign
themselves to the default VLAN 1. Rather, the Catalysts keep the ports assigned to the
previous VLAN, but in an inactive state. You need to reassign ports to an active VLAN
before the attached devices can communicate again.
Remember VLANs
Whenever you create, delete, or suspend a VLAN in a server or transparent Catalyst, the
Catalyst stores the configuration information in NVRAM so that on power up it can recover
to the last known VLAN configuration. If the unit is a server, it also transmits configuration
information to its Catalyst neighbors.
Clients, on the other hand, do not store VLAN information. When a Catalyst configured in
client mode loses power, it forgets about all the VLANs it knew except for VLAN 1, the
default VLAN. On power up, the client cannot locally activate any VLANs, except VLAN
1, until it hears from a VTP server authorizing a set of VLANs. Any ports assigned to
VLANs other than VLAN 1 remain in a suspended state until they receive a VTP
announcement from a server. When the client receives a VTP update from a server, it can
then activate any ports assigned to VLANs included in the VTP announcement.
Figure 12-4 VLAN Distribution for VTP Modes Server, Client, and Transparent
Table 12-3 shows the starting condition and all subsequent conditions for Catalysts A, B,
and C.
Table 12-3 Status for Catalysts in Figure 12-4 Demonstrating Server, Client, and Transparent Mode

                                               Configured VLANs for:
                                           Cat-A      Cat-B          Cat-C
Step #  Event                              (Server)   (Transparent)  (Client)
1       Starting condition.                1          1              1
2       Create VLAN 2 in Cat-C.            1          1              1
3       Assign ports to VLAN 2 in Cat-C.   1          1              1
4       Create VLAN 2 in Cat-A.            1, 2       1              1, 2
5       Create VLAN 10 in Cat-B.           1, 2       1, 10          1, 2
6       Lose and restore power on          1, 2       1, 10          1, 2
        Cat-A and Cat-B.
7       Lose power on Cat-A and Cat-C.     N/A        1, 10          1
        Restore power to Cat-C only.
8       Restore power on Cat-A.            1, 2       1, 10          1, 2
9       Create VLAN 20 on all three        1, 2, 20   1, 10, 20      1, 2, 20
        Catalysts.
In the starting configuration of Step 1, the Catalysts have a VTP domain name assigned to
them. In reality, VTP in Cat-B isn't really participating and can be ignored for now. All
three Catalysts start with only the default VLAN 1. In Step 2, the administrator starts work
on the client and tries to create a new VLAN. But because Cat-C is a client, Cat-C rejects
the command, posts an error to the console, and does not create any new VLAN. Only
VLAN 1 remains in existence. The administrator assigns ports to VLAN 2 in Step 3 with
the set vlan command. Even though VLAN 2 does not exist yet, Cat-C accepts the port
assignment, moves ports to VLAN 2, and places them in the suspend state. However,
Cat-C only knows about VLAN 1.
In Step 4, the administrator moves to Cat-A, a server, and creates VLAN 2, which then gets
propagated to its neighbor, Cat-B. But Cat-B, configured in transparent mode, ignores the
VTP announcement. It does not add VLAN 2 to its local VLAN configuration. Cat-B floods
the VTP announcement out any other trunk ports to neighbor Catalysts. In this case, Cat-C
receives the VTP update, checks the VTP management domain name, which matches, and
adds VLAN 2 to its local list. Cat-C then activates any ports assigned to VLAN 2. Any
devices attached to ports in VLAN 2 on Cat-A now belong to the same broadcast domain
as ports assigned to VLAN 2 on Cat-C. If Layer 3 permits, these devices can now
communicate with each other.
The administrator now moves (or Telnets) to Cat-B in Step 5 and creates a new broadcast
domain, VLAN 10. As a Catalyst configured in transparent mode, Cat-B is authorized
to create VLAN 10. But Cat-B does not propagate any information about VLAN 10 to the
other Catalysts. VLAN 10 remains local to Cat-B and is not a global VLAN.
A disaster occurs in Step 6 when the facility loses power to Cat-A and Cat-B. But, because
you are a savvy network engineer, you configured them in server and transparent modes so
they can remember their VLAN configuration information. Although Cat-A and Cat-B
remain without power, Cat-C continues to operate based upon the last VLAN configuration
it knew. All ports on Cat-C remain operational in their assigned VLANs. When power is
restored to Cat-A and Cat-B, both Catalysts remember their authorized VLANs and enable
them. Cat-A also issues VTP messages to neighbors. This does not affect the operations of
Cat-B or Cat-C, though, because they both remember their configuration.
Now consider what happens if Cat-A and Cat-C lose power. In Step 7, this occurs, but
power is restored to Cat-C before Cat-A. When Cat-C recovers, it starts with only VLAN 1
authorized. Ports in any other VLAN are disabled until Cat-C hears a VTP message from a
server. If Cat-A takes one hour to recover, Cat-C remains in this state the entire time. When
Cat-A finally restarts in Step 8, it sends VTP messages that then authorize Cat-C to enable
any ports in VLANs included in the VTP announcement.
Finally, the administrator creates another VLAN for the entire management domain. In Step
9, the administrator creates VLAN 20. But this takes two configuration statements. The
administrator must create the VLAN in Cat-A and in Cat-B. Now there are two global
VLANs in the domain. Any devices in VLAN 20 belong to the same broadcast domain,
regardless of which Catalyst they connect to. VLAN 1 is the other global broadcast domain.
Any devices in VLAN 1 can also communicate with each other. But, devices in VLAN 1
cannot communicate with devices in VLAN 20 unless there is a router in the network.
Although Cisco routers understand trunking protocols like ISL, LANE, and 802.1Q, they
currently do not participate in VTP; routers ignore VTP messages and discard them at the
router interface. Therefore, VTP messages propagate no further than a router interface, or
to another Catalyst that belongs to a different VTP management domain. Figure 12-6 shows
a system with three management domains isolated through varied domain assignments and
through a router. Domain 1 has three management domain border points, one to the router
and two into Domain 2.
Figure 12-6 A Multiple-Domain Network (Cat-A in Domain 1, Cat-B in Domain 3, and Cat-C in Domain 2, interconnected by an ISL link and trunk links)
When Cat-A in Domain 1 issues a VTP message, the message gets distributed to all of the
other Catalysts in the domain. Cat-B and Cat-C receive the message and forward it to the
two Catalysts in Domain 2. However, these Catalysts see that the source domain differs
from their own and, therefore, discard the VTP message.
The VTP message generated in Domain 1 also propagates to the router. But the router does
not participate with VTP and discards the message.
Likewise, VTP messages generated in Domain 2 or Domain 3 never affect devices outside
of their domain.
TIP You can quickly and easily reset the configuration revision number with the set vtp domain
name command. Changing the domain name sets the configuration revision number to zero.
Or, you can make 4,294,967,295 VLAN changes to your system to force the counter to roll
back to zero. Unfortunately, this could take a very long time to accomplish.
Summary Advertisements
By default, server and client Catalysts issue summary advertisements every five minutes.
Summary advertisements inform neighbor Catalysts what they believe to be the current VTP
configuration revision number and management domain membership. The receiving Catalyst
compares the domain names and, if they differ, ignores the message. If the domain names
match, the receiving server or client Catalyst compares the configuration revision number. If
the advertisement contains a higher revision number than the receiving Catalyst currently has,
the receiving Catalyst issues an advertisement request for new VLAN information.
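The decision process just described reduces to two comparisons, sketched here in illustrative Python (not the actual Catalyst implementation; the function and its return strings are assumptions for the example):

```python
# Sketch of how a server or client Catalyst treats a received summary advertisement.
def summary_action(my_domain, my_revision, adv_domain, adv_revision):
    if adv_domain != my_domain:
        return "ignore"      # different management domain
    if adv_revision > my_revision:
        return "request"     # issue an advertisement request for the newer VLAN list
    return "no action"       # revision is not higher; nothing to learn

print(summary_action("wally", 7, "world", 9))  # ignore
print(summary_action("wally", 7, "wally", 9))  # request
print(summary_action("wally", 9, "wally", 9))  # no action
```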
Figure 12-7 shows the protocol format for a summary advertisement.
Figure 12-7 VTP Summary Advertisement Format (fields include the Updater Identity, a 12-byte Update Timestamp, and a 16-byte MD5 Digest)
Each row in Figure 12-7 is four octets long. The Version, Type, Number of Subset
Advertisement Messages, and Domain Name Length fields are all one octet long. Some of
the fields can extend beyond four octets and are indicated in the figure. A description of
each of the fields follows the decode in Figure 12-8.
Figure 12-8 decodes a summary advertisement packet encapsulated in an ISL trunking
protocol frame. If the trunk uses 802.1Q rather than ISL, the VTP message is exactly the
same, only encapsulated in an 802.1Q frame.
The decode starts with the SNAP header that follows the other headers shown in Figure 12-5.
Although VTP uses the same Ethernet multicast address as CDP, the SNAP value differs
between the two. CDP uses a SNAP value of 0x2000, but VTP uses a SNAP value of 0x2003.
This allows the receiving Catalysts to distinguish the protocols.
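That demultiplexing step amounts to a lookup on the SNAP protocol-ID value. The 0x2000 and 0x2003 values come from the text; the dispatch function itself is an illustrative sketch:

```python
# CDP and VTP share the same destination multicast MAC address, so a
# receiving Catalyst uses the SNAP protocol-ID value to tell them apart.
SNAP_CDP = 0x2000
SNAP_VTP = 0x2003

def classify_snap(protocol_id):
    return {SNAP_CDP: "CDP", SNAP_VTP: "VTP"}.get(protocol_id, "unknown")

print(classify_snap(0x2000))  # CDP
print(classify_snap(0x2003))  # VTP
```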
The VTP header contains the VTP version in use. All Catalysts in the management domain
must run the same version. In this case, they are running VTP version 1. If there are Token Ring
switch ports in your domain, this would have to be VTP version 2. The message type value
indicates which of the four VTP messages listed earlier was transmitted by the source Catalyst.
The following field, Number of Subset Advertisement Messages, indicates how many VTP
type 2 messages follow the summary advertisement frame. This value can range from zero
to 255. Zero indicates that no subset advertisements follow. A Catalyst only transmits the
subset advertisement if there is a change in the system, or in response to an advertisement
request message.
The domain length and name follow this field, along with any padding bytes necessary to
complete the Domain Name field.
The source also transmits the VTP configuration revision number and identifies itself
through its IP address. Remember from the earlier section, VTP Configuration Revision
Number, that the receiving Catalyst compares the configuration revision number with its
internal number to determine if the source has new configuration information or not.
The message includes a timestamp which indicates the time the configuration revision
number incremented to its current value. The timestamp has the format of yymmddhhmmss,
which represents year/month/day and hour/minute/second.
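The yymmddhhmmss layout can be unpacked two digits at a time, as in this small illustration of the stated format (the sample timestamp value is made up):

```python
# Parse a VTP update timestamp in yymmddhhmmss form.
def parse_vtp_timestamp(ts):
    fields = ("year", "month", "day", "hour", "minute", "second")
    # Slice the 12-character string into six 2-digit values.
    return dict(zip(fields, (int(ts[i:i + 2]) for i in range(0, 12, 2))))

stamp = parse_vtp_timestamp("990311142659")
print(stamp)  # {'year': 99, 'month': 3, 'day': 11, 'hour': 14, 'minute': 26, 'second': 59}
```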
Finally, the source performs an MD5 one-way hash on the header information. An MD5
(message digest type 5) hash algorithm is frequently used in security systems as a non-
reversible encryption process. The receiving Catalyst also performs a hash and compares
the result to detect any corruptions in the frame. If the hashes do not match, the receiving
Catalyst discards the VTP message.
Subset Advertisements
Whenever you change a VLAN in the management domain, the server Catalyst where you
configured the change issues a summary advertisement followed by one or more subset
advertisement messages. Changes that trigger the subset advertisement include:
Creating or deleting a VLAN
Suspending or activating a VLAN
Changing the name of a VLAN
Changing the MTU of a VLAN
Figure 12-9 shows the VTP subset advertisement packet format.
Figure 12-9 VTP Subset Advertisement Format (Version, Code, Seq-Number, Domain Name Length, and Domain Name fields, followed by VLAN-info Fields 1 through N, each carrying an 802.10 Index)
The summary advertisement has a field in its header indicating the number of
subset advertisements that follow. If you have a long VLAN list, VTP might need to send
the entire list over multiple subset advertisements.
Figure 12-10 shows a subset advertisement (partial listing). As with the summary
advertisement, the message includes the VTP version type, the domain name and related fields,
and the configuration revision number. The header sequence number indicates the identity of
the subset advertisement. If multiple subset advertisements follow the summary advertisement,
this number indicates the subset instance sent by the updater. The sequence numbering starts
with 1. The receiving Catalyst uses this value to ensure that it receives all subset advertisements
and, if not, can request a resend starting with a specific subset advertisement.
The subset advertisement then lists all of the VLANs in the management domain along
with the following information for each:
Length of the VLAN description field
Status of the VLAN. The VLAN can be active or suspended
VLAN type. Is it Ethernet, Token Ring, FDDI, or other?
MTU (maximum transmission unit) for the VLAN. What is the maximum frame size
supported on this VLAN?
Length of the VLAN name
The VLAN number for this named VLAN
The SAID value to use if the frame is passed over an FDDI trunk
The VLAN name
The VTP subset advertisement individually lists this information for each VLAN, even the
default VLANs.
Advertisement Requests
A Catalyst issuing the third VTP message type, an advertisement request, solicits summary
and subset advertisements from a server in the management domain. Catalysts transmit an
advertisement request whenever you reset the Catalyst, whenever you change its VTP
domain membership, or whenever it hears a VTP summary advertisement with a higher
configuration revision number than it currently has. This can happen if a Catalyst is
temporarily partitioned from the network and a change occurs in the domain.
Figure 12-11 shows a VTP advertisement request frame format.
An advertisement request includes six fields. The Version field identifies the VTP version
used by the device. The Code field identifies this as an advertisement request. The reserved
(Rsvd) portion is set to zero. The Management Domain Length field (MgmtD Len)
indicates the length of the domain name in the following field. These four fields are
followed by the Management Domain Name. Finally, if the Catalyst expected to receive
subset advertisements but failed to receive one or more, it can request a resend starting at a
particular subset instance value. This is signaled in the Start-Value field. For example, if
the Catalyst expected to see three subset advertisements but only received instances 1 and
3, it can request a resend starting at instance 2.
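The resend logic in that example (received instances 1 and 3, request from 2) amounts to finding the first gap in the sequence. A sketch of that calculation (the function itself is hypothetical):

```python
# Compute the Start-Value for an advertisement request: the first subset
# advertisement instance missing from the set received (numbering starts at 1).
def start_value(expected_count, received):
    for seq in range(1, expected_count + 1):
        if seq not in received:
            return seq
    return None  # nothing missing; no resend request is needed

print(start_value(3, {1, 3}))  # 2, matching the example in the text
```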
Figure 12-12 shows an advertisement request captured on an analyzer.
The advertisement request in Figure 12-12 requests all subset advertisements for the
management domain testvtp. This is recognized because the start value is zero.
As illustrated in the section on decoding VTP subset advertisements, all VLAN information
sent over the wire is cleartext. Anyone with an analyzer package can capture the VTP frame
and decode it. A system attacker can use this information to disrupt your network. For
example, an attacker can fabricate a false subset advertisement with a higher configuration
revision number and delete the VLANs in your management domain.
As a security option, you can specify a password for VTP subset advertisements. If you
specify a password, the Catalyst uses this as a key to generate an MD5 hash. Whenever a
source Catalyst issues a subset advertisement, it calculates the hash for the advertisement
message and uses the password as the key. VTP includes the hash result in the subset
advertisement. The receiving Catalyst must also have the same password locally
configured. When it receives the subset advertisement, it generates a hash using its
password and compares the result with the received message. If they match, it accepts the
advertisement as valid. Otherwise, it discards the subset advertisement.
You can configure a password using the command set vtp passwd passwd.
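The keyed-hash check can be modeled with Python's hashlib. This is a simplification: real VTP computes the MD5 digest over specific packet fields with the password folded in, and the exact field layout is not reproduced here. The payload string below is a stand-in, not real frame contents.

```python
import hashlib

# Simplified model of VTP's password-keyed MD5 check: sender and receiver
# mix the shared password into the digest; a mismatch means the message
# is discarded (wrong password, or a corrupted/forged frame).
def vtp_digest(password, advertisement):
    return hashlib.md5(password.encode() + advertisement).hexdigest()

adv = b"domain=wally revision=8 vlans=1,2,10"   # hypothetical payload
sent = vtp_digest("s3cret", adv)
print(vtp_digest("s3cret", adv) == sent)   # True: passwords match, accepted
print(vtp_digest("wrong", adv) == sent)    # False: advertisement discarded
```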
Example 12-7 show vtp statistics Output
VTP statistics:
summary advts received 5392
subset advts received 16
request advts received 3
summary advts transmitted 5280
subset advts transmitted 7
request advts transmitted 0
No of config revision errors 0
No of config digest errors 0
VTP pruning statistics:
Trunk Join Trasmitted Join Received Summary advts received from
non-pruning-capable device
-------- --------------- ------------- ---------------------------
1/1 0 0 0
1/2 0 0 0
Cat-A > (enable)
Notice the highlighted line in Example 12-7. It lists the number of incidents where the
received MD5 hash did not match the locally calculated value. This can indicate either a
corruption of the frame from a physical layer problem or a security issue. If the problem
stems from a physical layer issue, you might anticipate that other frames also have
problems. If you do not experience other transmission errors, there might be an attack
where the attacker is attempting to spoof a Catalyst and corrupt your management domain.
Figure 12-13 A Partitioned Network of VTP Servers (Cat-C on the left connects to Cat-D on the right over an ISL trunk; all Catalysts are servers)
If you make any changes to the VLAN list on any server in the network, the change gets
distributed to the other Catalysts in the management domain. But what happens if you make
changes while the network is partitioned? For example, assume in Figure 12-13 that the loss
of the trunk link between the Cat-C and Cat-D servers isolates the three Catalysts on the left
from the three on the right. All Catalysts should have the same VTP configuration revision
number, N. See Table 12-4 for VLAN table and VTP configuration revision numbers for
these steps.
While the network is partitioned, an administrator on the left creates a new VLAN thinking,
"When the network recovers, the new VLAN will get distributed to the Catalysts on the
right." At the same time, an administrator on the right makes a VLAN change by deleting
a VLAN. This administrator thinks like the first administrator and expects VTP to propagate
the deletion when the network recovers.
When the trunk link between Cat-C and Cat-D is restored, both Catalysts issue VTP
summary advertisements announcing a configuration revision update of N+1. Catalysts in
each half compare the number with their own values, see that they match, and never issue
a VTP advertisement request. This leaves the two portions with unsynchronized VLAN
tables as shown in Step 5. To resynchronize the tables, the administrator needs to enter a
VLAN modification as in Step 6, which forces an increment in the revision number. The
Catalyst where the configuration change was made then issues a VTP summary
advertisement and a VTP subset advertisement. This information gets distributed to the
other Catalysts in the network, synchronizing the VLAN tables.
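The failed resynchronization reduces to the revision comparison: both partitions advertise N+1, neither sees a strictly higher number, so neither requests an update. A sketch with hypothetical numbers:

```python
# Both partitions made one change while split, so both sit at revision N+1.
N = 14                               # arbitrary pre-partition revision
left_rev, right_rev = N + 1, N + 1   # VLAN created on left, deleted on right

def wants_update(my_rev, neighbor_rev):
    # A Catalyst only requests an update for a strictly higher revision.
    return neighbor_rev > my_rev

print(wants_update(left_rev, right_rev))      # False: left sees no newer data
print(wants_update(right_rev, left_rev))      # False: tables stay out of sync
print(wants_update(left_rev, right_rev + 1))  # True: one more change forces a resync
```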
A more severe case can occur when two normally isolated portions merge. Suppose the
three Catalysts on the left belong to one corporate department (Engineering) that doesn't
like the corporate division using the three Catalysts on the right (Marketing). The Catalysts
remain isolated until a higher level manager mandates that the two divisions need to
connect their networks. It is decided ahead of time by the manager that the two groups will
change their domain names to a new third name so that they all belong to the same
management domain. The two groups now have a configuration as shown in Step 1 of Table
12-5. (The configuration revision number was arbitrarily selected for this example.)
The big moment arrives when the two groups shake hands and join the networks. What
happens? Disaster. Because the Marketing side has a higher configuration revision number
than the Engineering side, VTP updates the Engineering VLAN tables to match the
Marketing side. All of the Engineering users that were attached to VLANs 2, 3, 4, and 5
now attach to ports in suspended mode. Marketing just torpedoed the Engineering LAN!
Now Engineering is really mad at Marketing.
TIP Before merging different domains, be sure to create a superset of VLANs in both domains
to prevent VLAN table deletion problems. Make sure that all partitions have a complete list
of all VLANs from each partition before joining them together.
Similarly, if you have a Catalyst dedicated to you for playing around, and you attach it to
the production network, you might delete the VLANs in your production network if you are
not careful. Make sure that you either clear config all, change the VTP domain name, or
put it in transparent mode before attaching it to the production network.
TIP Another real-world situation exists that could delete VLANs in your network. If you replace
a failed Supervisor module on a Catalyst configured in the server mode, you run the
possibility of deleting VLANs if the new module has a higher configuration revision
number than the others in the domain. Be sure to reset the configuration revision number
on the new module before activating it in the system.
Broadcast Flooding
Another issue involves trunks and VTP. Chapter 8 described trunks and the syntax to
establish trunks. Whenever you enable a trunk, the trunk, by default, transports traffic for
all VLANs. This includes all forwarded and flooded traffic. If you have a VLAN that
generates a high volume of flooded traffic from broadcasts or multicasts, the frames flood
throughout the entire network. They even cross VTP management domains. This can have
a significant impact on your default management VLAN 1. Refer to Chapters 15, Campus
Design Implementation, and 17, Case Studies: Implementing Switches, for discussions
on cautions and suggestions regarding the management VLAN. In Figure 12-14, trunk and
access links interconnect a system of Catalysts. PC-1 in the system attaches
through an access link assigned to VLAN 2. This particular station generates multicast
traffic for video distribution on VLAN 2. The network administrator dedicated VLAN 2 to
this application to keep it separate from other user traffic. Although the administrator
managed to keep any of the multicast traffic from touching Cat-A and Cat-B, it still touches
all of the other Catalysts in the network, even though VLAN 2 is not active in Domain 2.
Why doesn't the multicast traffic bother Cat-A and Cat-B? Cat-A and Cat-B do not see the
traffic from PC-1 because all of the links to these units are access links on different VLANs.
Figure 12-14 Flooding in a Multiple Domain Network
(The figure shows PC-1 and Cat-A in Domain 1, Cat-B on VLAN 3 in Domain 2, and a legend distinguishing trunk lines from access links.)
There are methods of controlling the distribution of flooded traffic throughout the network.
These methods include the features of VTP pruning to control flooding, and modifications
to the multicast behavior through Cisco Group Management Protocol (CGMP). VTP
pruning is discussed in the section in this chapter, VTP Pruning: Advanced Traffic
Management. Details on controlling multicast with CGMP are described in Chapter 13,
Multicast and Broadcast Services.
Excessive STP
A related issue with VTP and trunking is the universal participation in Spanning Tree by all
Catalysts in the network for all VLANs. Whenever you create a new VLAN, the Catalysts
create a new instance of Spanning Tree for that VLAN. This feature, Per-VLAN Spanning
Tree (PVST), allows you to optimize the Spanning Tree topology individually for each
VLAN. All Catalysts, by default, participate in PVST. But as in the case of flooding traffic
in the network, each additional instance of Spanning Tree creates additional Layer 2
background traffic. The additional traffic comes from the BPDUs distributed for each
VLAN's instance of Spanning Tree. Not only does this add traffic to each link, it also
requires more processing by each Catalyst. Each Catalyst must generate and receive hello
messages at whatever interval is specied by the hello timer, which is every two seconds by
default. And, each Catalyst must calculate a Spanning Tree topology for each VLAN
whenever there is a Spanning Tree topology change. This, too, temporarily consumes CPU
cycles with a complex Spanning Tree topology.
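The hello-driven overhead scales with both the VLAN count and the trunk count: one BPDU per VLAN per trunk per hello interval (two seconds by default). A back-of-the-envelope estimate with hypothetical numbers:

```python
# Rough PVST background load: one BPDU per VLAN per trunk per hello interval.
def bpdus_per_second(vlans, trunks, hello_seconds=2):
    return vlans * trunks / hello_seconds

print(bpdus_per_second(50, 4))   # 100.0 BPDUs/sec for 50 VLANs on 4 trunks
print(bpdus_per_second(200, 8))  # 800.0 as the VLAN and trunk counts grow
```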
TIP If you have a fairly static network where you are not adding or deleting globally significant
VLANs, consider using the transparent mode to prevent erroneous actions from deleting
your VLANs. However, setting up a unit in transparent mode does increase your
administrative burden.
VTP V2
With the introduction of Token Ring switching in the Catalyst, Cisco updated the VTP
protocol to version 2. By default, Catalysts disable VTP version 2 and use version 1. As an
administrator, you need to select which version of VTP to use and ensure that all Catalysts
in the VTP management domain run the same version.
VTP version 2 assists Token Ring VLANs by ensuring correct Token Ring VLAN configuration.
If you plan on doing Token Ring switching, you MUST use VTP version 2 so that processes like
DRiP operate. Chapter 3, "Bridging Technologies," describes DRiP in more detail.
TIP VTP version 2 adds functionality over VTP version 1, but uses a modified form of the protocol.
Version 1 and version 2 cannot coexist in the same management domain. All Catalysts in a
management domain must run the same VTP version to function correctly.
If you enable VTP version 2 on a Catalyst, all VTP version 2 capable Catalysts in the
management domain automatically enable VTP version 2. Not all Catalysts support VTP
version 2, nor do all releases of the Supervisor engine software. If you elect to use VTP
version 2, ensure that all Catalysts in the domain are VTP version 2 capable.
Support for Token Ring is not the only feature added in VTP version 2, but it is the
principal one.
To enable VTP version 2, use the command set vtp v2 enable.
Verify that you correctly changed it with the show vtp domain command.
Example 12-8 shows a Catalyst configured for VTP version 2 and the corresponding output
from the show vtp domain command. Note the warning whenever you attempt to enable
version 2 that all Catalysts in the management domain must support version 2. When you
enable version 2, all Catalysts in the management domain enable version 2. Note in Example
12-8 that there are two columns related to the VTP version. One column is titled VTP Version
and the other V2 Mode. VTP Version identifies what version of VTP your code supports, but
does not indicate what is currently in use. Even if you use VTP version 1, this value still shows
a 2 because you have the possibility of enabling version 2. The other column, V2 Mode,
indicates which VTP version you currently have enabled. If you have the default
version 1 enabled, this column indicates disabled, as seen in Example 12-6.
VTP pruning limits the distribution of flooded frames to only those Catalysts that have
members of VLAN 2. Otherwise, the sending Catalyst blocks flooded traffic from that VLAN.
Cisco introduced VTP pruning with Supervisor engine software release 2.1 as an extension
to VTP version 1. VTP pruning defines a fourth VTP message type which announces VLAN
membership. Whenever you associate Catalyst ports with a VLAN, the Catalyst sends a
message to its neighbor Catalysts informing them that it is interested in receiving traffic
for that VLAN. The neighbor Catalyst uses this information to decide whether flooded traffic from
a VLAN should transit the trunk.
An administrator enables pruning in Figure 12-16. When PC-1 generates a broadcast frame
with pruning enabled in the Catalysts, the broadcast does not reach Cat-C as it did in Figure
12-15. Cat-B receives the broadcast and normally floods the frame to Cat-C. But Cat-C
does not have any ports assigned to VLAN 2. Therefore, Cat-B does not flood the frame out
the trunk toward Cat-C. This preserves bandwidth on the trunk and on the Catalyst's
backplane.
Review Questions
This section includes a variety of questions on the topic of this chapter: VTP. By
completing these, you can test your mastery of the material included in this chapter as well
as help prepare yourself for the CCIE written and lab tests.
Refer to Figure 12-17 for all review questions.
A LAN switch is not a router (although a router can be incorporated, such as the RSM).
What happens, then, to multicast traffic in a switched network? By default, a switch
(bridge) floods multicast traffic within a broadcast domain. This consumes bandwidth on
access links and trunk links. Depending upon the host's TCP/IP stack implementation and
network interface card (NIC) attributes, the multicast frame can cause a CPU interrupt.
Why does a switch flood multicast traffic? A switch floods multicast traffic because it has
no entry in the bridge table for the destination address. Multicast addresses never appear as
source addresses; therefore, the bridge/switch cannot dynamically learn multicast
addresses. You can manually configure an entry with the set cam static command.
IGMP is a multicast protocol that directly affects hosts. IGMP allows hosts to inform
routers that they want to receive multicast traffic for a specific multicast group address.
Current Catalysts don't understand IGMP messages (unless you have the NetFlow Feature
Card [NFFC]). IGMP messages appear to a Catalyst like any other multicast frames. Cisco
developed the proprietary CGMP that enables routers to inform Catalysts about hosts and
their interest in receiving multicast traffic. This modifies the Catalyst's default behavior of
flooding the multicast frame to all hosts in the broadcast domain. Rather than flooding the
frame to all hosts, the Catalyst limits the flooding scope to only those hosts in the broadcast
domain that registered with the router through IGMP. If a host does not register with the
router, it does not receive a copy of the multicast frame. This preserves access link bandwidth.
Multicast Addresses
Whenever an application needs to send data to more than one station, but wants to restrict
the distribution to only stations interested in receiving the traffic, the application typically
uses a multicast destination address. Multicast addresses target a subset of all stations in the
network. The other two transmission choices for a transmitting device are unicast or
broadcast frames. If the source uses a broadcast address, all stations in the broadcast
domain must process the frame, even if they are not interested in the information. If the
source transmits unicast frames, it must send multiple copies of the frame, one addressed
to each intended receiver. This is a very inefficient use of network resources and does not
scale well as the number of receivers increases.
By using multicast addresses, the source transmits only one copy of the frame onto the wire,
and routers distribute the multicast message to the other segments where interested receivers
reside. Multicast addresses appear at Layer 2 and at Layer 3. A network administrator assigns
the multicast Layer 3 address for an application. The Layer 2 multicast address is then
calculated from the Layer 3 multicast address. This is shown in the section "Layer 2
Multicast Addresses." When you configure a multicast application, the NIC adds the multicast
address to its list of valid MAC addresses. Usually this list consists of the built-in MAC
address plus any user-configured addresses. Whenever the station receives a frame with a
matching multicast destination address, the receiver sends the frame to the CPU.
The router examines multicast addresses at both Layer 2 and Layer 3, whereas a switch
examines the Layer 2 address. If the switch has hardware such as a Catalyst 5000 Supervisor III
module with an NFFC, the Catalyst can examine the Layer 3 addresses as well. (The advantage
of this is seen in the section "IGMP Snooping: Advanced Traffic Management," later in the
chapter.) Otherwise, the Catalyst simply examines the MAC address in the frame.
Figure 13-1 Mapping a Layer 3 multicast address to a Layer 2 MAC address. The high-order four bits of the 32-bit Layer 3 address are fixed at 1110; the Layer 2 MAC address is formed from the 24-bit OUI 01-00-5E plus the low-order 23 bits of the Layer 3 address, leaving 5 ambiguous bits.
Consider an example. The IP address 224.1.10.10, assigned by an administrator, has a low-order
23-bit value of 1.10.10. In hexadecimal format, this is 0x01-0A-0A. The MAC address
takes the low-order 23 of those 24 bits and places them into the MAC address field. The complete MAC
address in this case is 01-00-5E-01-0A-0A.
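The mapping just described can be sketched in a few lines of Python. This is purely an illustration (the function name is ours, not part of any Cisco software):

```python
def multicast_ip_to_mac(ip: str) -> str:
    """Map a Layer 3 IP multicast address to its Layer 2 MAC address.

    The MAC address is the fixed OUI 01-00-5E followed by the
    low-order 23 bits of the IP address.
    """
    octets = [int(o) for o in ip.split(".")]
    if not 224 <= octets[0] <= 239:
        raise ValueError("not a Class D (multicast) address")
    addr = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]
    low23 = addr & 0x7FFFFF                     # keep only the low 23 bits
    mac = (0x01005E << 24) | low23              # prepend the 01-00-5E OUI
    return "-".join(f"{(mac >> s) & 0xFF:02X}" for s in range(40, -8, -8))

print(multicast_ip_to_mac("224.1.10.10"))   # 01-00-5E-01-0A-0A
print(multicast_ip_to_mac("225.1.10.10"))   # same MAC: the 5 high-order bits are lost
```

Running it for both 224.1.10.10 and 225.1.10.10 demonstrates the address ambiguity discussed next: the two different Layer 3 groups yield the identical MAC address.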
What happens if the IP multicast address is 225.1.10.10? A side effect of this scheme is
address ambiguity. Although a different IP multicast group is identified at Layer 3, the
Layer 2 address is the same as for 224.1.10.10. Layer 2 devices cannot distinguish the two
multicast groups and receive frames from both multicast groups. The user application needs
to filter the two streams and discard the unwanted multicast frames. Any bit value
combination for the 5 bits in Figure 13-1 identified as ambiguous generates the same Layer
2 MAC multicast address. Five bits means that there are 2^5 combinations, or 32 Layer 3
addresses, that create the same Layer 2 address.
TIP When assigning multicast addresses, be sure to remember the 32:1 ambiguity and try to
avoid multicast overlaps. This helps to preserve bandwidth on access and trunk links. The
end station discards the unwanted multicast at Layer 3, after it interrupts the CPU.
IGMP
IGMP defines a protocol for hosts to register with a router to receive multicast traffic for a
specific multicast group. Two versions of IGMP exist: version 1, specified in RFC 1112,
and version 2, specified in RFC 2236. Version 2 added significant features to version 1,
making it more efficient and enabling hosts to explicitly leave a multicast group. These
features are described in the section on IGMP version 2.
IGMP Version 1
Figure 13-2 shows the 8-octet IGMP version 1 frame format.
Figure 13-2 The IGMP version 1 frame format: a 4-bit Version field, a 4-bit Message Type field, an 8-bit Unused field, a 16-bit Checksum, and the 32-bit Group Address.
The first field of the frame indicates what version of IGMP generated the frame. For version
1, this value must be 1. The next field specifies the message type. Version 1 defines two
messages: a host membership query and a host membership report. The Checksum field
carries the checksum computed by the source. The receiving device examines the checksum
value to determine if the frame was corrupted during transmission. If the checksum value
doesn't match, the receiver discards the frame. The source computes the checksum over the
entire IGMP message. The final field indicates the multicast destination group address
targeted by the message.
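As a sketch of how these fields fit together, the following Python builds an 8-octet IGMPv1 host membership report and computes the standard Internet checksum over the whole message. The field layout follows RFC 1112; the helper names are ours:

```python
import struct

def inet_checksum(data: bytes) -> int:
    """Standard Internet checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data)//2}H", data))
    while total > 0xFFFF:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def igmpv1_report(group: str) -> bytes:
    """8-octet IGMPv1 host membership report (version = 1, type = 2)."""
    ver_type = (1 << 4) | 2                    # high nibble: version, low nibble: type
    group_bytes = bytes(int(o) for o in group.split("."))
    msg = struct.pack("!BBH4s", ver_type, 0, 0, group_bytes)
    csum = inet_checksum(msg)                  # computed over the entire message
    return struct.pack("!BBH4s", ver_type, 0, csum, group_bytes)

pkt = igmpv1_report("239.255.160.171")
# A receiver validates the frame by checksumming the whole message:
# a result of 0 means the frame arrived intact.
assert len(pkt) == 8 and inet_checksum(pkt) == 0
```

This mirrors the receiver behavior described above: recomputing the checksum over a message that already carries a correct Checksum field yields zero, and any other result causes the frame to be discarded.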
In Figure 13-3, several hosts and a router share a LAN segment. When a host on the
segment wants to receive multicast traffic, it issues an unsolicited host membership report
targeting the intended multicast group.
Figure 13-3 A host issues an unsolicited IGMP host membership report ("I want to receive multicast traffic for group 239.255.160.171") onto the segment it shares with the router.
The destination MAC address targets the multicast group the host intends to join. If this were the only
information in the frame to identify the group, any of 32 groups might be desired due to the
address ambiguity discussed earlier. However, the Layer 3 address is included in both the IP
header and in the IGMP header. The Layer 3 multicast group desired by the host is
239.255.160.171. This translates to a MAC address of 01-00-5E-7F-A0-AB. All
multicast-capable devices on the shared media receive the membership report. In this situation, however,
only the router is interested in the frame. The frame tells the router, "I want to receive any
messages for this multicast group." The router now knows that it needs to forward a copy of
any frames with this multicast address to the segment where the host that issued the report lives.
A device issues an IGMP membership report under two conditions:

- Whenever the device first intends to receive a multicast stream: When you
enable the multicast application, the device configures the NIC, and the built-in IGMP
processes send an unsolicited membership report to the router requesting copies of the
multicast frames.

- In response to a membership query from the router: This is a solicited
membership report and helps the router to confirm that hosts on the segment still want
to receive multicast traffic for a particular multicast group.
Routers periodically issue host membership queries. You can configure the router query
period from 0 seconds to a maximum of 65,535 seconds with the router command ip igmp
query-interval seconds. The default is 60 seconds.
The host membership query message in Figure 13-5 interrogates the segment to determine
if hosts on the segment still desire to receive multicast frames.
Figure 13-5 An IGMPv1 Multicast Host Membership Query
Only one router on each segment issues the membership query message. If the segment uses
IGMP version 1, the designated router for the multicast routing protocol issues the query.
If the segment uses IGMP version 2, the multicast router with the lowest IP address on the
segment issues the queries.
The host membership query message targets the "all multicast hosts" address, 224.0.0.1, in
the Layer 3 address field. The Layer 2 address is 01-00-5E-00-00-01. The IGMP header
uses a group address of 0.0.0.0. This translates to a query for all multicast hosts in all
groups on the segment.
A host for each active multicast group must respond to this message. When a multicast host
receives the host membership query, and the host wants to continue to receive multicast
frames for a multicast group, the multicast host starts an internal random timer with an
upper range of 10 seconds. The host waits for the timer to expire and then sends a
membership report for each multicast group that it wants to continue to receive.
If another station sends a membership report before the local timer expires, the host cancels
the timer and suppresses the report. This behavior prevents a segment and a router from
experiencing host membership report floods. Only one station from each multicast group
needs to respond for each group on the segment.
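The suppression logic can be sketched as a small simulation. The host names, timer values, and function are hypothetical; this is a simplified model of the behavior, not an implementation of any IGMP stack:

```python
import random

def respond_to_query(hosts, max_response_time=10.0, seed=1):
    """Simulate IGMPv1 report suppression for one group on one segment.

    Each host picks a random delay in [0, max_response_time]. The host
    whose timer expires first sends the membership report; every other
    host hears that report on the shared segment and cancels its timer.
    """
    rng = random.Random(seed)
    timers = {h: rng.uniform(0, max_response_time) for h in hosts}
    first = min(timers, key=timers.get)        # first timer to expire
    # All other hosts cancel: exactly one report reaches the router.
    return [first]

reports = respond_to_query(["PC-1", "PC-2", "PC-3"])
print(len(reports))   # 1
```

However many hosts belong to the group, the router sees a single report per group per query, which is the point of the mechanism.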
If the router does not receive a membership report for a particular multicast group for three
query intervals, the router assumes no hosts remain that are interested in that group's
multicast stream. The router stops forwarding the multicast packets for this group and
informs upstream routers to stop sending frames.
This process denes an implicit leave from the multicast group. A host using IGMP version
1 cannot explicitly inform a router that it left the multicast group. A router learns this from
repeated queries that receive no responses.
IGMP Version 2
Figure 13-6 shows the construction of a frame per the version 2 document, RFC 2236. Note
that the version number disappeared, but the message type field expanded in length, and new
values allow a version 1 device to receive a version 2 frame for backwards compatibility.
Another field added and not found in IGMP version 1 is the Maximum Response Time. This
is explained later in this section.
Figure 13-6 The IGMP version 2 frame format: an 8-bit Message Type field, an 8-bit Maximum Response Time field, a 16-bit Checksum, and the 32-bit Group Address.
IGMP version 2 adds two messages not found in IGMP version 1 to streamline the join and
leave process. RFC 2236 added a version 2 membership report and a leave group message.
The complete list of messages now consists of the following:
- 0x11: Membership query
- 0x12: Version 1 membership report
- 0x16: Version 2 membership report
- 0x17: Leave group
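The type codes above (per RFC 2236) can be dispatched on directly; a minimal sketch, with names of our own choosing:

```python
# IGMPv2 type codes from RFC 2236. Note that 0x12 is the old IGMPv1
# octet (version nibble 1, type nibble 2) carried forward unchanged,
# which is what makes version 1 reports parseable by version 2 devices.
IGMP_TYPES = {
    0x11: "Membership query",
    0x12: "Version 1 membership report",
    0x16: "Version 2 membership report",
    0x17: "Leave group",
}

def classify(first_octet: int) -> str:
    """Name the IGMP message from the first octet of the IGMP header."""
    return IGMP_TYPES.get(first_octet, "Unknown")

print(classify(0x17))   # Leave group
```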
The membership query and version 1 membership report carry over from IGMP version 1.
However, the membership query can now target specific multicast groups. In version 1, the query
message is a general query with the group address set to 0.0.0.0. All active groups respond to the
general query. Version 2 allows a multicast query router to target a specific multicast group.
When it sends this type of message, a multicast host, if it is running version 2, responds with a
version 2 membership report. Members of other multicast groups ignore a group-specific query that is not
directed to their group. The group-specific membership query only works for version 2 systems.
If a version 2 host leaves a multicast group, it sends an unsolicited leave group message to
inform the query router that it no longer desires to receive the multicast stream. The router
maintains a table of all hosts in the multicast group on the segment. If other hosts still want
to receive the multicast stream, the router continues to send the multicast frames onto the
segment. If, however, the leave message arrives from the last host on the segment for
the multicast group, the router terminates the multicast stream for that group.
Consider the multicast group shown in Figure 13-7. Two hosts belong to the multicast
group 224.1.10.10, and one host belongs to the group 224.2.20.20.
The router currently forwards frames for both groups on the segment. The host currently
subscribed to 224.2.20.20 decides that it no longer wants to receive the multicast stream for
this group, so it transmits a leave message. The router receives this message and checks its
multicast table to see if there are any other hosts on the segment that want the stream. In
this example, there are no other hosts in the group. The router sends a group-specific query
message to the group 224.2.20.20 to make sure. If the router does not receive a membership
report, it stops the 224.2.20.20 stream.
Continuing the example, assume that sometime later, Host 3 decides to leave the group
224.1.10.10 and issues a leave message. The router receives the leave message, sends a
group-specific query, and realizes that additional hosts on the segment still want to receive
the multicast stream. Therefore, it continues to transmit all multicast frames for this group.
If at any time the router wants to confirm its need to send the stream, it can transmit a
general or group-specific query onto the segment. If it does not receive any responses to a
couple of query messages, the router assumes no more hosts want the stream.
Another feature of IGMP version 2 affects the method for selecting the query router. In
version 1, the query router is selected by the multicast routing protocol. The designated
router for the protocol becomes the querier. In version 2, the IP address determines the
query router. The multicast router with the lowest IP address becomes the query router. All
routers initially assume that they are the query router and send a query message. If a router
hears a query message from another multicast router with a lower IP address, the router
becomes a non-querier router.
A final feature added to IGMP version 2 is the capability for the multicast router to specify
the hosts' response timer range. Remember that when a host receives a membership query,
the host starts a random timer. The timer value is in the range of 0 to the maximum response
time, with the maximum response time specified in the router's query message. Version 2
allows you to configure the upper range of the timer to a maximum of 25 seconds. The
default in a Cisco router is 10 seconds.
TIP If there are many groups on a segment, you might want to use a larger timer to spread out
the responses. This helps to smooth out any membership report bursts on the segment. If
you have fewer groups, you might want to lower the value so that a router can terminate a
multicast flow sooner.
[Figure: the router runs IGMP version 1 on a segment with both version 1 and version 2 capable hosts; all hosts run IGMP version 1.]
A second case exists when the router supports IGMP version 2, but hosts use IGMP version
1. Although the router understands more message types than the hosts, it ultimately uses
only version 1 messages. When the version 2 router receives a version 1 membership
report, it remembers that version 1 hosts are present and uses only version 1 membership queries.
Version 1 queries use the group address 0.0.0.0, and the router does not generate group-specific
queries. If it generated group-specific queries, the version 1 hosts would not recognize the
message and would not know how to respond.
What if there are both version 1 and version 2 hosts on the segment with the version 2 router,
as in Figure 13-9? As in the previous case, the router must remember that there are version
1 hosts and must, therefore, issue version 1 membership queries. Additionally, if any of the
version 2 hosts send a leave message, the router must ignore the leave notification. It
ignores the message because it must still issue general queries in case a version 1 member
is still active on the segment.
Figure 13-9 An IGMP version 2 router on a segment where hosts run either IGMP version 1 or IGMP version 2.
If two routers attach to the segment where one supports version 1 and the other version 2,
the version 2 router must be administratively configured as a version 1 router. The version
1 router has no way to detect the presence of the version 2 router. Because the two versions
use different methods of selecting the query router, they might not reliably agree on the
query router.
[Figure: PC-1 through PC-4 attach to Catalyst ports 2/1, 2/2, and 2/3; a host membership report and a membership query each flood to all ports.]
When the router sends a general membership query, it uses the MAC multicast address 01-
00-5E-00-00-01. This multicast address forces the switch to send the frame to all ports.
When a host responds to the query with a report, the report goes to all ports.
Clearly, though, it would be nice to restrict the distribution of multicast frames in the
switched network to only those hosts that really want the traffic. In a Catalyst, you have
three potential ways of limiting the multicast scope: static configurations, CGMP, and
IGMP Snooping, each of which is covered later in the chapter.
This router uses PIM for the multicast routing. Notice the global configuration statement ip
multicast-routing. This is mandatory to enable the routing. If you do not enter this
statement, neither CGMP nor PIM functions.
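A minimal router-side sketch of these statements might look as follows. The interface name and the choice of PIM dense mode are assumptions for illustration; the commands available depend on your IOS release:

```
ip multicast-routing          ! global: without this, neither PIM nor CGMP works
!
interface Ethernet0
 ip pim dense-mode            ! enable PIM on the LAN interface
 ip cgmp                      ! generate CGMP messages toward the Catalyst
```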
To check the operation of your multicast router, use the show ip igmp interface command
as demonstrated in Example 13-2.
The output of Example 13-2 verifies that IGMP is enabled and shows the version number. It also
displays the timer values and the identity of the query router. In this case, this router is the
query router.
Note that the configuration includes the router port. If you exclude the router from the
command, the router never receives a membership report from a host. When the host
transmits the multicast frame with the address in the configuration, the switch looks at the
destination address and checks the bridge table. The properly configured bridge table in this
case has three ports eligible to receive the multicast frame. The Catalyst does not forward
the frame to any ports without the multicast configuration.
This is a reliable method of modifying the CAM table. But, it is completely manual. If you
need to add or delete a multicast group, or if you need to add or move a host, you need to
change the configuration in the switch. Manually modifying the CAM table does not scale
well if you have many hosts and many multicast groups.
The Catalyst has two dynamic tools for modifying the bridge tables in a multicast
environment. These methods, CGMP and IGMP Snooping, eliminate the need for manual
configuration and become very attractive in dynamic multicast environments. The
following sections describe both dynamic processes.
CGMP
The Cisco proprietary CGMP protocol interacts with IGMP to dynamically modify
bridge tables. Because CGMP is Cisco proprietary, you must use Cisco routers and
Catalyst switches for it to be effective. When a host sends IGMP membership reports to
a CGMP-capable router, the router sends configuration information, via CGMP, to the
Catalyst. The Catalyst modifies its local bridge table based upon the information
contained in the CGMP message. Figure 13-11 shows a multicast system with two
Catalysts cascaded together and a router attached to one of them. PCs attach to the
Catalysts and desire to receive a multicast stream.
[Figure 13-11: PC-1 (MAC address 00-60-08-93-DB-C1) and PC-2 attach to the cascaded Catalysts.]
In Figure 13-11, a Cisco router receives IGMP membership reports from PC-1 and PC-2.
The router sends a CGMP configuration message to the Catalyst telling it the source
MAC address of the host and the multicast group from which it wants to receive traffic. For
example, PC-1 asks to join 224.1.10.10. The router tells the Catalyst to send multicast
traffic with the destination MAC address of 01-00-5E-01-0A-0A to the host with the source
MAC address 00-60-08-93-DB-C1. The Catalyst searches its bridge table for the
corresponding unicast address and adds the multicast group address to the port the host
attaches to. Any frames that the Catalyst sees with the multicast address get forwarded to
that port without bothering other ports. It is possible that many hosts belong to the multicast
group. Each host individually registers with the router, and the router updates the Catalyst.
Notice two things about the CGMP operations. First, CGMP does not require any
modifications to the host. CGMP operates independently of the hosts and only involves the
router and the switch. Second, CGMP messages flow from the router to the switch, never
from the switch to the router.
Figure 13-12 The CGMP frame format: a 4-bit Version field, a 4-bit Message Type field, a Reserved field, a Count field, and Count pairs of Group Destination Address (GDA) and Unicast Source Address (USA).
Several fields comprise the CGMP frame of Figure 13-12, as detailed in the following list:

- Version: Describes the version of CGMP sending this message.
- Message Type: Two messages are defined: join or leave.
- Reserved: Not used; set to 0.
- Count: Indicates the number of multicast/unicast address pairs contained in this CGMP message.
- Group Multicast Address: Indicates the multicast address to be modified. Referred to as the Group Destination Address (GDA).
- Source Unicast Address: Indicates the source address of the device joining or leaving the multicast group. Referred to as the Unicast Source Address (USA).

The Version field indicates the version of CGMP that the transmitting router is using.
Currently, only one version exists. Therefore, this 4-bit field has a value of 1.
The Message Type might have one of two values. The message is either a join message with
a type value of 0, or a leave message with a type value of 1.
The Reserved field is currently unused, with all bits in the field set to 0.
When the router sends a CGMP message to the switch, it includes address pairs of hosts
and the multicast address each host wants to receive. The router can include more than one
pair in the CGMP message. The Count field tells the switch how many address pairs are
included in the CGMP message.
CGMP refers to the Multicast Group Address as a Group Destination Address (GDA).
Many of the show statements display values labeled as the GDA. The first three octets of
this field value should be 0x01-00-5E.
Finally, the Unicast Source Address (USA) appears in the list. It helps the Catalyst
recognize the specific host desiring to receive the multicast stream. The GDA/USA forms
an address pair. A router can include many address pairs in the CGMP frame when it sends
configuration information to the switch.
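Based on the field list above, a join message's payload could be packed as sketched below. This is an inference from the described layout, not a reference encoder; in particular, the 16-bit width assumed for the Reserved field is our reading of the figure:

```python
import struct

CGMP_JOIN, CGMP_LEAVE = 0, 1

def cgmp_payload(msg_type, pairs, version=1):
    """Pack a CGMP payload: version/type octet, reserved bits, count,
    then (GDA, USA) MAC-address pairs of 6 bytes each."""
    def mac(s):                                # "01-00-5E-01-0A-0A" -> bytes
        return bytes(int(b, 16) for b in s.split("-"))
    hdr = struct.pack("!BHB", (version << 4) | msg_type, 0, len(pairs))
    body = b"".join(mac(g) + mac(u) for g, u in pairs)
    return hdr + body

frame = cgmp_payload(CGMP_JOIN,
                     [("01-00-5E-01-0A-0A", "00-60-08-93-DB-C1")])
print(len(frame))   # 16: a 4-byte header plus one 12-byte GDA/USA pair
```

A router joining two hosts to the same group would simply pass two pairs, with the Count field rising to 2, which is exactly what the Count field exists to express.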
You might see several combinations of GDAs and USAs in a CGMP message decode.
Table 13-1 shows possible combinations and describes the meaning of each.
The first two messages of Table 13-1 represent typical messages because the router sends
these when it instructs the Catalyst to add or delete a host or set of hosts from the multicast
group specified in the GDA field. When the Catalyst receives this message, it correspondingly
modifies the bridge table to correctly forward or filter group multicast traffic to the port.
The Catalyst needs to learn where a CGMP-capable router resides. You can manually
configure the information in the switch. However, a CGMP discovery process enables a router
to announce itself to a switch. A router can also tell the switch that it no longer participates in
CGMP by sending a leave message with its MAC address and an all-zeros GDA value.
Finally, the router can inform a switch to forget about a specific multicast group and to
remove any entries for this group. Or, it can send a flush to delete all multicast entries for
all multicast groups from the bridge table. Figure 13-13 shows a decoded CGMP join
message for a specific host. What multicast group does this client want to join?
You cannot tell exactly which multicast group the client wants to join, because there are no
Layer 3 addresses in the message. If you could capture the IGMP join message that
spawned the CGMP join, you could know for certain the desired multicast group. From the
CGMP decode, the only thing you can tell about the multicast group is the last 23 bits of
the address. These bits translate to a decimal address representation of xxx.127.160.171.
You cannot determine the first octet due to the address ambiguity issue.
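For illustration, the colliding Layer 3 candidates can be enumerated directly. The five ambiguous bits are the low four bits of the first octet plus the high-order bit of the second octet, so the second octet of a candidate can differ too (127 or 255 in this case); the function name is ours:

```python
def candidate_groups(mac: str):
    """All 32 Layer 3 multicast addresses that share one MAC address."""
    b = [int(x, 16) for x in mac.split("-")]
    low23 = ((b[3] & 0x7F) << 16) | (b[4] << 8) | b[5]
    out = []
    for high5 in range(32):                    # the 5 ambiguous bits
        addr = (0xE << 28) | (high5 << 23) | low23
        out.append(".".join(str((addr >> s) & 0xFF) for s in (24, 16, 8, 0)))
    return out

groups = candidate_groups("01-00-5E-7F-A0-AB")
print(len(groups))                   # 32
print("239.255.160.171" in groups)   # True
```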
The GDA 0100.5E00.0128 is for the multicast routing protocol. The router indicates to the
switch that it wants to receive any multicast frames belonging to this group to ensure that
it receives multicast route updates. Examine the network in Figure 13-14. Multiple hosts
desire to receive a multicast stream from a single multicast group.
Figure 13-14 PC-1's IGMP membership report for group 224.1.10.10 triggers a CGMP join with GDA1 = 01-00-5E-01-0A-0A and USA1 = 00-60-08-93-DB-C1; Cat-A and Cat-B update their bridge tables.
PC-1 wants to join multicast group 224.1.10.10. It sends an IGMP membership report
informing routers that it wants to see these frames. The router creates a CGMP join message
with a GDA of 01-00-5E-01-0A-0A and a USA with PC-1's source address. The router
sends the frame as a CGMP multicast (01-00-0C-DD-DD-DD). The Catalyst detects the
CGMP join message, looks in its bridge table for the host, adds the GDA to the port bridge
table, and starts to forward any frames for this GDA to PC-1. The CGMP join message from
the router causes both Cat-A and Cat-B to modify their bridge tables. Cat-A modifies its
bridge table so that the multicast frame forwards to Cat-B. This works whether the port is an
access or trunk link. In either case, the multicast frame stays within the VLAN boundaries.
If PC-2 decides to join the multicast group too, the process repeats, but with PC-2's USA.
The multicast query router occasionally issues a general query that both hosts receive. They
both start random timers to determine when to issue a solicited membership report. The
report suppression mechanism works here, because the report frame has the GDA in the
MAC layer destination field. The switch forwards this to all ports with members of the
group. When the other host receives the membership report, it cancels its timer and does
not send another membership report for that group.
After some period of time, PC-1 decides it wants to leave the group. In an IGMP version 1
environment, PC-1 does nothing proactively to announce its leave, so the router does not
know explicitly. However, at some point in time, the query router issues a general query.
PC-2 responds to the query, and the router continues to forward multicast traffic for the
group. The router still does not know that PC-1 left the group. Therefore, the switch
continues to forward the multicast traffic to PC-1 even though it no longer wants to receive
the frames. In fact, PC-1 continues to receive the group traffic until all hosts leave the group
and the router detects that there are no more members in the group.
TIP In an IGMP version 1 environment, members of a group continue to receive traffic until all
members of the group leave. If at some point in time there were ten members, but only one
currently remains active, all ten continue to receive the multicast stream. IGMP version 2
improves this situation. If all clients on your segments (broadcast domain) support IGMP
version 2, it is wise to set them up to run the later version and gain bandwidth efficiencies
that are not possible with IGMP version 1.
Eventually, PC-2 decides that it no longer wants to receive the multicast stream. When the
query router issues a membership query, no hosts respond for this group. The multicast
query router stops forwarding the multicast frames after it fails to see a solicited
membership report for at least two queries. At this point, the router stops forwarding the
multicast traffic and sends a CGMP group leave message to the switch. The group leave
message has the GDA for the group and a USA of 0000.0000.0000. This forces the Catalyst
to flush all entries in the bridge table for this group.
TIP Note that the Catalyst does not prune all multicast addresses. For example, the switch
forwards reserved multicast addresses such as 224.0.0.5 and 224.0.0.6. These are used by
Open Shortest Path First (OSPF) routing. If the switch pruned traffic for these multicast
groups, the protocols would break.
TIP Enabling CGMP Fast Leave processing on a Catalyst switch forces the switch to capture all
IGMP leave messages sent to the IP multicast address 224.0.0.2 (all routers multicast). This is
the same address used for the Cisco protocol Hot Standby Router Protocol (HSRP) and
translates to a multicast MAC address of 01-00-5E-00-00-02. To capture the IGMP leave
message, the Enhanced Address Recognition Logic (EARL) table has static entries for this MAC
address causing the switch to absorb the leave message. This results in the switch also
consuming HSRP frames. Normally the switch does not forward absorbed frames (such as
CDP) to any other ports, and would break HSRP. However, the switch behavior is modified in
the supervisor code so that the switch Supervisor module recognizes non-IGMP frames and
floods them out all router ports and the Spanning Tree Root Port, preserving HSRP functionality.
CGMP/IGMP: Advanced Traffic Management 593
Configuring CGMP
Enabling CGMP requires configuration on both the router and the Catalyst. By default,
CGMP is disabled on both. Note that if you are using IGMP Snooping (described in the
next section) on a Catalyst, you cannot use CGMP. CGMP and IGMP Snooping are mutually
exclusive.
Catalyst configurations for CGMP include three set commands. The set cgmp enable
command enables CGMP. This command was introduced in switch supervisor code version
2.2. Ensure that you have at least this revision before trying to use CGMP. The set multicast
router mod_num/port_num command is an optional command that statically configures a
multicast router. Normally, the router announces itself, enabling the switch to dynamically
learn the presence of the multicast router. You can, however, elect to statically configure this
so that your Catalyst does not need to wait to learn this information. Finally, you can use
the set cgmp leave enable command in an IGMP version 2 environment to enable the
Catalyst to look for IGMP version 2 leave messages. If the Catalyst sees a leave message
from a host, it waits to see if any join messages appear on the interface. If not, the Catalyst
prunes the port from the multicast group without sending the leave message to the router.
Without this feature, the Catalyst waits to see a CGMP leave message from the router.
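Putting the three commands together, a minimal Catalyst-side sketch might look like the following (the module/port value 1/1 is only an illustrative placeholder, and set cgmp leave enable applies only if your hosts run IGMP version 2):

```
set cgmp enable
set multicast router 1/1
set cgmp leave enable
```

Remember that the second command is optional; without it, the Catalyst simply waits to learn the multicast router port dynamically.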
You must also turn on CGMP in the router. Use the ip cgmp interface configuration command
to enable CGMP. You need to enter this command on all router ports participating in CGMP.
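A sketch of the router side follows. This is not the book's Example 13-2; the interface name, IP address, and PIM mode are placeholders, and it assumes IP multicast routing and PIM are enabled on the interface (CGMP depends on both):

```
ip multicast-routing
!
interface Ethernet0
 ip address 10.1.1.1 255.255.255.0
 ip pim dense-mode
 ip cgmp
```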
The router output in Example 13-4 shows a router announcing itself with CGMP to any
switches listening on the interface.
Notice that the GDA value targets the all groups address, whereas the USA value reflects
the router's built-in MAC address on the Ethernet interface.
The router output comes from the router with the configuration shown in Example 13-2.
The router also announces itself as a member of GDA 0100.5e00.0128. Where did this
group come from?
The GDA 0100.5E00.0128 is for the multicast routing protocol. The router indicates to the
switch that it wants to receive any multicast frames belonging to this group to ensure that
it receives multicast route updates.
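The text does not name the group behind this MAC, but you can decode it yourself: an IP multicast MAC address is formed by appending the low-order 23 bits of the group address to the prefix 01-00-5E, so the mapping can be reversed (with the caveat that 32 different groups share each MAC, because the top 9 bits of the group address are discarded). A quick worked check:

```
GDA (MAC):        0100.5E00.0128
low 23 bits:      0x00-01-28
one matching IP:  224.0.1.40  =  hex E0.00.01.28  ->  low 23 bits 0x00-01-28
```

224.0.1.40 happens to be the group Cisco routers join for Auto-RP discovery, which squares with the statement that the router wants this traffic to receive multicast route updates.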
594 Chapter 13: Multicast and Broadcast Services
Depending upon your version of Catalyst, you have two methods to measure the broadcast and
multicast frames. One method measures the amount of port bandwidth consumed by multicasts
and broadcasts (hardware-based broadcast suppression). The other method measures the
number of broadcast and multicast frames (software-based broadcast suppression). Both
metrics integrate over a 1-second interval. The effect of the two varies, though.
[Figure 13-16 illustrates hardware-based broadcast suppression: a plot of percentage of bandwidth (0 to 100) across 1-second intervals, showing unicast and broadcast/multicast levels against the configured threshold and the frames dropped once the threshold is exceeded.]
To configure hardware-based suppression, use the command set port broadcast mod_num/
port_num threshold%. Note the percent sign at the end of the command. You must include
this for the Catalyst to distinguish the value as a bandwidth threshold rather than a packet
count threshold.
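For example, to cap broadcast and multicast traffic at three quarters of a port's bandwidth (the module/port 3/1 and the 75 percent figure are illustrative placeholders, not values from the book):

```
set port broadcast 3/1 75%
```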
In Figure 13-16, a Catalyst reacts to frames during three time intervals. In the first interval,
both the unicast and broadcast frames remain below the configured threshold. Therefore,
the Catalyst forwards all frames. In the second interval, unicast frames exceed the
threshold, whereas the broadcast level remains below the threshold. The Catalyst forwards
all frames. In the third interval, the broadcast level exceeds the threshold. At the point in the
interval when this occurs, the Catalyst drops all frames (both broadcast and unicast) for the
rest of the time interval.
[The corresponding figure for software-based broadcast suppression plots frames per second across 1-second intervals, again showing unicast, broadcast/multicast, and dropped frames relative to the configured threshold.]
To enable software-based broadcast suppression on your Catalyst, use the set port
broadcast mod_num/port_num threshold command. Note the absence of the percent sign.
This instructs the Catalyst to use software-based broadcast suppression.
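A matching software-based sketch (again, the port and the threshold value are illustrative) that drops broadcast and multicast frames on port 3/1 beyond 500 frames in any one-second interval:

```
set port broadcast 3/1 500
```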
Review Questions
1 IGMP version 2 includes an explicit leave message for hosts to transmit whenever
they no longer want to receive a multicast stream. Why, then, does version 2 include
the query message?
2 Why doesn't a Catalyst normally learn multicast addresses?
3 What Layer 2, Layer 3, and IGMP information does a multicast device transmit for a
membership report?
4 Assume that you have a switched network with devices running IGMP version 1 and
the switches/routers have CGMP enabled. One of the multicast devices surfs the Web
looking for a particular multicast stream. The user first connects to group 1 and finds
it isn't the group that he wants. So he tries group 2, and then group 3, until he finally
finds what he wants in group 4. Meanwhile, another user belongs to groups 1, 2, and
3. What happens to this user's link?
PART V
Real-World Campus Design and
Implementation
Chapter 14 Campus Design Models
Chapter 16 Troubleshooting
The earliest seeds of today's campus networks began with departmental servers. In the mid-
1980s, the growth of inexpensive PCs led many organizations to install small networks
utilizing Ethernet, ArcNet, Token Ring, LocalTalk, and a variety of proprietary solutions.
Many of these networks utilized PC-based server platforms such as Novell's NetWare. Not
only did this promote the sharing of information, it allowed expensive hardware such as
laser printers to be shared.
Throughout the late-1980s, these small networks began to pop up throughout most
corporations. Each network was built to serve a single workgroup or department. For
example, the finance department would have a separate network from the human resources
department. Most of these networks were extremely decentralized. In many cases, they
were installed by non-technical people employed by the local workgroup (or outside
consultants hired by the workgroup). Although some companies provided centralized
support and guidelines for deploying these departmental servers, few companies provided
links between these pockets of network computing.
In the early 1990s, multiprotocol routers began to change all of this. Routers suddenly
provided the flexibility and scalability to begin hooking all of these network islands into
one unified whole. Although routers allowed media-independent communication across the
many different types of data links deployed in these departmental networks, Ethernet and
Token Ring became the media of choice. Routers were also used to provide seamless
communication across wide-area links.
Early routers were obviously extremely bandwidth-limited compared to today's products.
How then did these networks function when the Gigabit networks of today strain to keep
up? There are two main factors: the quantity of traffic and the type of traffic.
First, there was considerably less traffic in campus networks at the time early router-based
campus networks were popular. Simply put, fewer people used the network. And those who
did use it tended to use less network-intensive applications.
However, this is not to say that early networks were like a 15-lane highway with only three
cars on it. Given the lower available bandwidth of these networks, many had very high
average and peak utilization levels. For instance, before the rise of client/server computing,
many databases utilized file servers as a simple hard drive at the end of a long wire.
Thousands of dBase and Paradox applications were deployed that essentially pulled the
entire database across the wire for each query. Therefore, although the quantity of traffic
has grown dramatically, another factor is required to explain the success of these older,
bandwidth-limited networks.
To explain this difference, the type of traffic must be considered. Although central MIS
organizations used routers and hubs to merge the network into a unified whole, most of the
traffic remained on the local segment. In other words, although the networks were linked
together, the workgroup servers remained within the workgroups they served. For example,
a custom financial application developed in dBase needed to use only the finance
department's server; it never needed to access the human resource server. The growing
amount of file and printer server traffic also tended to follow the same patterns.
Changing Traffic Patterns 605
These well-established and localized traffic flows allowed designers to utilize the popular
80/20 rule. Eighty (or even 90+) percent of the traffic in these networks remained on the
local segment. Hubs (or possibly early switching hubs) could support this traffic with
relative ease. Because only 20 (or even less than 10) percent of the traffic needed to cross
the router, the limited performance of these routers did not pose significant problems.
With blinding speed, all of this began to change in the mid-1990s. First, enterprise
databases were deployed. These were typically large client/server systems that utilized a
small number of highly centralized servers. On one hand, this dramatically cut the amount
of traffic on networks. Instead of pulling the entire database across the wire, the application
used technologies such as Structured Query Language (SQL) to allow intelligent database
servers to first filter the data before it was transmitted back to the client. In practice, though,
client/server systems began to significantly increase the utilization of network resources for
a variety of reasons. First, the use of client/server technology grew at a staggering rate.
Although each query might only generate one fourth of the traffic of earlier systems, many
organizations saw the number of transactions increase by a factor of 10 to 100. Second, the
centralized nature of these applications completely violated the 80/20 rule. In the case of
this traffic component, 100 percent needs to cross the router and leave the local segment.
Although client/server applications began to tax traditional network designs, it took the rise
of Internet and intranet technologies to completely outstrip available router (and hub)
capacity. With Internet-based technology, almost 100 percent of the traffic was destined to
centralized servers. Web and e-mail traffic generally went to a small handful of large UNIX
boxes running HTTP, Simple Mail Transfer Protocol (SMTP), and Post Office Protocol
(POP) daemons. Internet-bound traffic was just as centralized because it needed to funnel
through a single firewall device (or bank of redundant devices). This trend of centralization
was further accelerated with the rise of server farms that began to consolidate workgroup
servers. Instead of high-volume file and print server traffic remaining on the local wire,
everything began to flow across the corporate backbone.
As a result, the traditional 80/20 rule has become inverted. In fact, most modern networks
have less than five percent of their traffic constrained to the local segment. When this is
combined with the fact that these new Internet-based technologies are wildly popular, it is
clear that the traditional router and hub design is no longer appropriate.
TIP Be sure to consider changing traffic patterns when designing a campus backbone. In doing
so, try to incorporate future growth and provide adequate routing performance.
606 Chapter 14: Campus Design Models
IDF/MDF
For years, the telephone industry has used the terms Intermediate Distribution Frame (IDF)
and Main Distribution Frame (MDF) to refer to various elements of structured cabling. As
structured cabling has grown in popularity within data-communication circles, this IDF/
MDF terminology has also become common.
The following sections discuss some of the unique requirements of switches placed in IDF
and MDF closets. In addition to these specialized requirements, some features should be
shared across all of the switches. For new installations, all of the switches should offer a
wide variety of media types that include the various Ethernet speeds and ATM. FDDI and
Token Ring support can be important when migrating older networks. Also, because
modern switched campus infrastructures are too complex for the plug-it-in-and-forget-it
approach, comprehensive management capabilities are a must.
IDF
IDF wiring closets are used to connect end-station devices such as PCs and terminals to the
network. This horizontal wiring connects to wall-plate jacks at one end and typically
consists of unshielded twisted-pair (UTP) cabling that forms a star pattern back to the IDF
wiring closet. As shown in Figure 14-1, each floor of a building generally contains one or
more IDF switches. Each end station connects back to the nearest IDF wiring closet. All of
the IDFs in a building generally connect back to a pair of MDF devices often located in the
building's basement or ground floor.
Campus Design Terminology 607
[Figure 14-1 shows a three-story building: the end users on each floor connect to an IDF switch on that floor, and all of the IDFs connect back to a pair of MDF devices in the basement.]
Given the role that they perform, IDF wiring closets have several specific requirements:
Port density: Because large numbers of end stations need to connect to each IDF,
high port density is a must.
Cost per port: Given the high port density found in the typical IDF, cost per port
must be reasonable.
Redundancy: Because several hundred devices often connect back to each IDF
device, a single IDF failure can create a significant outage.
Reliability: This point is obviously related to the previous point; however, it
highlights the fact that an IDF device is usually an end station's only link to the rest
of the world.
Ease of management: The high number of connections requires that per-port
administration be kept to a minimum.
Because of the numerous directly connected end users, redundancy and reliability are
critical to the IDF's role. As a result, IDFs should not only utilize redundant hardware such
as dual Supervisors and power supplies, they should have multiple links to MDF devices.
Fast failover of these redundant components is also critical.
IDF reliability brings up an interesting point about end-station connections. Outside of
limited environments such as financial trading floors, it is generally not cost-effective to have
end stations connected to more than one IDF device. Therefore, the horizontal cabling
serves as a single point of failure for most networks. However, note that these failures
generally affect only one end station. This is several orders of magnitude less disruptive
than losing an entire switch. For important end stations such as servers, dual-port network
interface cards (NICs) can be utilized with multiple links to redundant server farm switches.
The traditional device for use in IDF wiring closets is a hub. Because most hubs are fairly
simple devices, the price per port can be very attractive. However, the shared nature of hubs
obviously provides less available bandwidth. On the other hand, routers and Layer 3
switches can provide extremely intelligent bandwidth sharing decisions. On the downside,
these devices can be very expensive and generally have limited port densities.
To strike a balance between cost, available bandwidth, and port densities, almost all
recently deployed campus networks use Layer 2 switches in the IDF. This can be a very
cost-effective way to provide 500 or more end stations with high-speed access into the
campus backbone.
However, this is not to say that some Layer 3 technologies are not appropriate for the wiring
closet. Cisco has introduced several IDF-oriented features that use the Layer 3 and 4
capabilities of the NetFlow Feature Card (NFFC). As discussed in Chapter 5, "VLANs,"
and Chapter 11, "Layer 3 Switching," Protocol Filtering can be an effective way to limit
the impact of broadcasts on end stations. By allowing a port to only output broadcasts for
the Layer 3 protocols that are actually in use, valuable CPU cycles can be saved. For
example, a broadcast-efficient TCP/IP node in VLAN 2 can be spared from being burdened
with IPX SAP updates. IGMP Snooping is another feature that utilizes the NFFC to inspect
Layer 3 information. By allowing the Catalyst to prune ports from receiving certain
multicast addresses, this feature can save significant bandwidth in networks that make
extensive use of multicast applications. Finally, the NFFC can be used to classify traffic for
Quality of Service/Class of Service (QoS/CoS) purposes.
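Where these NFFC features are available, enabling them is typically a one-line affair per feature. The sketch below is illustrative only: 3/1 is a placeholder port, and exact command support depends on your supervisor software version:

```
set igmp enable
set protocolfilter enable
set port protocol 3/1 ip on
```

Here set igmp enable turns on IGMP Snooping (which, as noted earlier, cannot coexist with CGMP), and the protocol filtering commands constrain which Layer 3 protocol broadcasts a port receives.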
TIP The most important IDF concerns are cost, port densities, and redundancy.
MDF
IDF devices collapse back to one or more Main Distribution Frame (MDF) devices in a star-
like fashion. Each IDF usually connects to two different MDF devices to provide adequate
redundancy. Some organizations place both MDF devices in the same physical closet and rely
on disparate routing of the vertical cabling for redundancy. Other organizations prefer to place
the MDF devices in separate closets altogether. The relationship between buildings and MDFs
is not a hard rule: larger buildings might have more than two MDF switches, whereas a pair
of redundant MDF devices might be able to carry multiple buildings that are smaller in size.
Figure 14-2 shows three buildings with MDF closets. To meet redundancy requirements,
each building generally houses two MDF devices. The MDF devices can also be used to
interconnect the three buildings (other designs are discussed later).
[Figure 14-2 shows three buildings (Buildings 1, 2, and 3), each containing IDF switches that connect back to a pair of MDF devices; the MDF devices also interconnect the buildings.]
MDF closets have a different set of requirements and concerns than IDF closets:
Throughput
High availability
Routing capabilities
Given that they act as concentration points for IDF traffic, MDF devices must be able to
carry extremely high levels of traffic. In the case of a Layer 2 switch, this bandwidth is
inexpensive and readily available. However, as is discussed later in this chapter, many of
the strategies to achieve robust and scalable designs require routing in the MDF. Achieving
this level of Layer 3 performance can require some careful planning. For more information
on Layer 3 switching, see Chapter 11. Issues associated with Layer 3 switching are also
addressed later in this chapter and in Chapter 15.
High availability is an important requirement for MDF devices. Although the failure of
either an MDF or IDF switch potentially affects many users, there is a substantial
distinction between these two situations. As discussed in the previous section, the failure of
an IDF device completely disables the several hundred attached end stations. On the other
hand, because MDFs are almost always deployed in pairs, failures rarely result in a
complete loss of connectivity. However, this is not to say that MDF failures are
inconsequential. To the contrary, MDF failures often affect thousands of users, many more
than with an IDF failure. This requires as many features as possible that transparently
reroute traffic around MDF problems.
In addition to the raw Layer 3 performance discussed earlier, other routing features can be
important in MDF situations. For example, the issue of what Layer 3 protocols the router
handles can be important (IP, IPX, AppleTalk, and so forth). Routing protocol support
(OSPF, RIP, EIGRP, IS-IS, and so on) can also be a factor. Support for features such as
DHCP relay and HSRP can be critical.
Three types of devices can be utilized in MDF closets:
Layer 2 switches
Hybrid routing switches such as MLS
Switching routers such as the Catalyst 8500
The first of these is also the simplest: a Layer 2 switch. The moderate cost and high
throughput of these devices can make them very attractive options. Examples of these
devices include current Catalyst 4000 models and traditional Catalyst 5000 switches
without a Route Switch Module (RSM) or NFFC.
However, as mentioned earlier, there are compelling reasons to use Layer 3 processing in
the MDF. This leads many network designs to utilize the third option, a Layer 3 switch that
is functioning as a hardware-based router, what Chapter 11 referred to as a switching router.
The Catalyst 8500 is an excellent example of this sort of device.
Cisco also offers another approach, Multilayer Switching (MLS), that lies between the
previous two. MLS is a hybrid approach that allows the Layer 2-oriented Supervisors to
cache Layer 3 information. It allows Catalysts to operate under the routing switch form of
Layer 3 switching discussed in Chapter 11. A Catalyst 5000 with an RSM and NFFC is an
example of an MLS switch. Other examples include the Catalyst 5000 Route Switch
Feature Card (RSFC) and the Catalyst 6000 Multilayer Switch Feature Card (MSFC).
NOTE It is important to understand the differences between the routing switch (MLS) and
switching router (Catalyst 8500) styles of Layer 3 switching. These concepts are discussed
in detail in Chapter 11.
Although the switching router (8500) and routing switch (MLS) options both offer very
high throughput at Layer 3 and/or 4, there are important differences. For a thorough
discussion of the technical differences, please see Chapter 11. This chapter and Chapter 15
focus on the important design implications of these differences.
TIP The most important MDF factors are availability and Layer 3 throughput and capabilities.
[Figure: the three layers of the hierarchical design model: access, distribution, and core.]
Access Layer
The IDF closets are termed access layer closets under the three-layer model. The idea is
that the devices deployed in these closets should be optimized for end-user access. Access
layer requirements here are the same as those discussed in the IDF section: port density,
cost, resiliency, and ease of management.
Distribution Layer
Under the three-layer model, MDF devices become distribution layer devices. The
requirement for high Layer 3 throughput and functionality is especially important here.
TIP In campus networks, the term access layer is synonymous with IDF, and distribution layer
is equivalent to MDF.
Key Requirements of Campus Designs 613
Core Layer
The connections between the MDF switches become the core layer under the three-layer
model. As is discussed in detail later, some networks have a very simple core consisting of
several inter-MDF links or a pair of Layer 2 switches. In other cases, the size of the network
might require Layer 3 switching within the core. Many networks utilize an Ethernet-based
core; others might use ATM technology.
NOTE In general, the terms access layer and distribution layer are used interchangeably with IDF
and MDF. However, the IDF/MDF terms are used most often when discussing two-layer
network designs; the access/distribution/core terminology is used when explaining three-
layer topologies.
Advantages of Routing
One of the key themes that is developed throughout this chapter is the idea that routing is
critical to scalable network design. Hopefully, this is not news to you. However, given the
recent popularity and focus on extremely at, avoid-the-router designs, a fair amount of
attention is devoted to this subject. Many people are convinced that the key objective in campus
network design is to eliminate as many routers as possible. On the contrary, my experience
suggests that this is exactly the wrong aimrouters have a proven track record of being the key
to achieving the requirements of campus design discussed in the previous section.
Scalable bandwidth: Routers have traditionally been considered slower than other
approaches used for data forwarding. However, because a routed network uses a very
decentralized algorithm, higher aggregate rates can be achieved than with less
intelligent and more centralized Layer 2 forwarding schemes. Combine this fact with
newer hardware-based routers (Layer 3 switches) and routing can offer extraordinary
forwarding performance.
Broadcast filtering: One of the Achilles' heels of Layer 2 switching is broadcast
containment. Vendors introduced VLANs as a partial solution to this problem, but key
issues remain. Not only do broadcasts rob critical bandwidth resources, they also
starve out CPU resources. Techniques such as ISL and LANE NICs that allow servers
to connect to multiple VLANs in an attempt to build flat networks with a minimal use
of routers only make this situation much worse: now the server must process the
broadcasts for 10 or 20 VLANs! On the other hand, the more intelligent forwarding
algorithms used by Layer 3 devices allow broadcasts to be contained while still
maintaining full connectivity.
Superior multicast handling: Although progress is being made to improve
multicast support for Layer 2 devices through schemes such as IGMP Snooping,
CGMP, and 802.1p (see Chapter 13, "Multicast and Broadcast Services"), it is
extremely unlikely that these efforts will ever provide the comprehensive set of
features offered by Layer 3. By running Layer 3 multicast protocols such as PIM,
routers always provide a vast improvement in multicast efficiency and scalability.
Given the predictions for dramatic multicast growth, this performance will likely be
critical to the future (or current) success of your network.
Optimal path selection: Because of their sophisticated metrics and path
determination algorithms, routing protocols offer much better path selection
capabilities than Layer 2 switches. As discussed in the Spanning Tree chapters, Layer
2 devices can easily send traffic through many unnecessary bridge hops.
Fast convergence: Not only do routing protocols pick optimal paths; they do it very
quickly. Modern Layer 3 routing protocols generally converge in 5 to 10 seconds. On
the other hand, Layer 2 Spanning-Tree Protocol (STP) convergence takes 30 to 50
seconds by default. Although it is possible to change the default STP timers and to
make use of optimizations such as UplinkFast in certain topologies, it is very difficult
to obtain the consistently speedy results offered by Layer 3 routing protocols.
Advantages of Routing 615
TIP Large networks almost always benefit from the scalability, flexibility, and intelligence of
routing. Try to build routing (Layer 3 switching) into your campus design.
[Figure: the router and hub model in a three-story building: the end users on each floor share a hub, and each hub connects to its own Ethernet interface (e0, e1, e2) on a router in the basement.]
Campus Design Models 617
The traditional router and hub model uses Layer 1 hubs in IDF/access wiring closets. These
connect back to unique ports on routers located in MDF/distribution closets. Several options
are available for the campus core. In one approach, the distribution layer routers directly
interconnect to form the network core/backbone. Because of its reliability and performance,
an FDDI ring has traditionally been the media of choice for these connections. In other cases,
some network designers prefer to form a collapsed backbone with a hub or router.
There are several advantages to the router and hub model as well as several reasons why
most new designs have shied away from this approach. Table 14-1 lists the advantages and
disadvantages of the router and hub model.
Table 14-1 Advantages and Disadvantages of the Router and Hub Model

Advantages:
Its reliance on routers makes for very good broadcast and multicast control.
Because each hub represents a unique IP subnet or IPX network, administration is
straightforward and easy to understand.
Given moderate levels of traffic and departmental servers located on the local segment,
the router and hub model can yield adequate performance.
The hardware for this model is readily available and inexpensive.

Disadvantages:
Shared-media hubs do not offer enough bandwidth for modern applications. For example,
in Figure 14-2, each floor must share a single 10 Megabit segment (factor in normal
Ethernet overhead and these segments become extremely slow).
This design generally uses software-based routers that cannot keep up with increasing
traffic levels.
Traffic patterns have changed, invalidating the assumption that most traffic would remain
local. As a result, the campus-wide VLANs model became popular.
The chief advantage of this approach is the simplicity and familiarity that it brings to campus
network design and management. The primary disadvantage is the limited bandwidth that this
shared-media approach offers. The multilayer design model discussed later attempts to
capitalize on the simplicity of the router and hub model while completely avoiding the limited
bandwidth issue through the use of Layer 2 and 3 switching technology.
Campus-wide VLANs strive to eliminate the use of routers. Because routers had become a
significant bottleneck in campus networks, people looked for ways to minimize their use.
Because broadcast domains still needed to be held to a reasonable size, VLANs were used to
create logical barriers to broadcasts. Figure 14-5 illustrates a typical campus-wide VLANs
design.
[Figure 14-5 shows a campus-wide VLANs design: IDF switches at the access layer, MDF switches at the distribution layer, and a Layer 2 core connecting to a server farm; Link 1 is one of the Layer 2 trunks into the core.]
Figure 14-5 uses Layer 2 switching throughout the entire network. To provide
communication between VLANs, two routers have been provided using the router-on-a-
stick configuration (see Chapter 11).
Lack of scalability
Most modern traffic violates the stay in one subnet rule employed by the campus-
wide VLAN model
Modern routers are not a bottleneck
The paragraphs that follow provide more detailed coverage of each of these disadvantages.
Management of these networks can be much more difficult and tedious than originally
expected. The router and hub design had the logical clarity of one subnet per wiring closet.
Conversely, many networks using campus-wide VLANs have developed into a confusing
mess of VLAN and Layer 3 address assignments.
Another downside to campus-wide VLANs is that the lack of logical structure can be
problematic, especially when it comes to troubleshooting. Without a clearly defined
hierarchy, it is very difficult to narrow down the source of each problem. Before each
troubleshooting session, valuable time can be wasted trying to understand the constantly
changing VLAN structure.
Also, campus-wide VLANs result in large and overlapping Spanning Tree domains. As
discussed in Chapter 6, "Understanding Spanning Tree," and Chapter 7, "Advanced
Spanning Tree," STP uses a complex set of evaluations that elect one central device (the
Root Bridge) for every VLAN. Other bridges and switches then locate the shortest path to
this central bridge/switch and use this path for all data forwarding. The Spanning-Tree
Protocol is extremely dynamic: if the Root Bridge (or a link to the Root Bridge) is
flapping, the network continuously vacillates between the two switches acting as the Root
Bridge (disrupting traffic every time it does so). Large Spanning Tree domains must use
very conservative timer values, resulting in frustratingly slow failover performance. Also,
as the size and number of the Spanning Tree domains grow, the possibility of CPU overload
increases. If a single device in a single VLAN falls behind and opens up a loop, this can
quickly overload every device connected to every VLAN. The result: network outages that
last for days and are difficult to troubleshoot.
Yet another downside to campus-wide VLANs is that the wide use of trunk links that carry
multiple VLANs makes the Spanning Tree problems even worse. For example, consider
Link 1 in Figure 14-5, a Fast Ethernet link carrying VLANs 1-15. Assume that the CPU in
a single switch in VLAN 1 becomes overloaded and opens up a bridging loop. Although the
loop might be limited to VLAN 1, this VLAN's traffic can consume all of the trunk's
capacity and starve out all other VLANs. This problem is even worse if you further assume
that VLAN 1 is the management VLAN. In this case, the broadcasts caught in the bridging
loop devour 100 percent of the switch CPU horsepower throughout the network. As more and
more switch CPUs become overloaded, more and more VLANs experience bridging loops.
Within a matter of seconds, the entire network melts down.
An additional problem with the campus-wide VLAN model is that, to avoid these Spanning
Tree and trunking problems, many campus-wide VLAN networks have had to resort to
eliminating all redundant paths just to achieve stability. To do so, redundant links
can be physically disconnected or trunks can be pruned in such a way that a loop-free
Spanning Tree is manually created. In either case, this makes every device in the network a
single point of failure. Most network designers never intend to make this sort of sacrifice
when they sign up for a "flat earth" design. Without routers, there are no Layer 3 barriers in
the network, and it becomes very easy for problems to spread throughout the entire campus.
Furthermore, campus-wide VLANs are not scalable. Many small networks have been
successfully deployed using the campus-wide VLAN design. Initially, the users of these
networks are usually very happy with both the utility and the bandwidth of their new
infrastructure. However, as the network begins to grow in size, the previously mentioned
problems become more and more chronic.
Yet another downside to campus-wide VLANs is that it has become harder and harder to
bypass routers, the very premise that the entire campus-wide VLAN scheme was built upon.
As traffic patterns have evolved from departmental servers on the local segment to enterprise
servers located in a centralized server farm, it has become very difficult to remove routers
from this geographically dispersed path. For example, it can be difficult to connect an
enterprise web server to 20 or more VLANs (subnets) without going through a router. A
variety of solutions such as ISL, 802.1Q, and LANE NICs have become available; however,
these have generally produced very disappointing performance. And, as mentioned earlier,
these NICs require the server to process all broadcasts for all VLANs, robbing it of valuable
and expensive CPU cycles. Also, the multiple-VLAN NICs have been fraught with other
problems such as slow initialization time, a limited number of VLANs, and unexpected
server behavior.
Finally, another basic premise of the campus-wide VLAN strategy is no longer true.
Specifically, routers are now as fast (or nearly as fast) as Layer 2 switches. Although this
equivalent performance generally comes at a price premium, it is no longer worthwhile to
go to such great lengths to avoid Layer 3 routing.
TIP Carefully evaluate the downsides of the campus-wide model before designing your network
in this manner. Although some users are very happy with this approach to campus design,
most have been disappointed with the stability and scalability.
Multilayer Model
The multilayer model strives to provide the stability and scalability of the router and hub
model while also capturing the performance of the campus-wide VLANs model. This
approach takes full advantage of hardware-based routing (Layer 3 switching) to put routing
back into its rightful place. However, it does not ignore Layer 2 switching. In fact, it seeks
to strike the optimal balance: Layer 3 switching is used for control, whereas Layer 2
switching is used for cost-effective data forwarding.
Figure 14-6 illustrates a sample network using the multilayer model.
[Figure 14-6: two IDF/MDF modules with Layer 2 access (IDF), Layer 3 or 4 distribution (MDF), a Layer 2 or 3 core, and a server farm]
Each IDF/MDF cluster forms a separate module in the design. Figure 14-6 shows two
modules. The access layer IDF switches use Layer 2 forwarding to provide large amounts
of cost-effective bandwidth. The distribution layer MDF switches provide the Layer 3
control that is required in all large networks. These IDF/MDF modules then connect
through a variety of Layer 2 or Layer 3 cores.
TIP The multilayer model combines Layer 2 and Layer 3 processing into a cohesive whole. This
design has proven to be highly flexible and scalable.
In general, the multilayer model is the recommended approach for enterprise campus
design for several reasons.
First, the use of routers provides adequate Layer 3 control. In short, this allows all of the
benefits discussed in the "Advantages of Routing" section to accrue to your network.
Without listing all of these advantages again, a multilayer design is scalable, flexible,
high-performance, and easy to manage.
Second, as its name suggests, the multilayer model offers hierarchy. In hierarchical
networks, layers with specific roles are defined to allow large and consistent designs. As the
next section discusses, this allows each layer of the access/distribution/core model to meet
unique and specic requirements.
Third, this approach is very modular. There are many benefits to a modular design,
including the following:
- It is easy to grow the network.
- The total available bandwidth scales as additional modules are added.
- Modular networks are easier to understand, troubleshoot, and maintain.
- The network can use "cookie cutter" configurations. This consistency saves administrative headaches while also reducing the chance of configuration errors.
- It is easier to migrate to a modular network. The old network can appear as another module (although it does not have the consistent layout and configurations of modules in the new network).
- Modular networks allow consistent and deterministic traffic patterns.
- Modular designs promote load balancing and redundancy.
- It is much easier to provide fast failover in a consistent, modular design than it is in less structured designs. Because the topology is constrained and well defined, both Layer 2 and Layer 3 convergence benefit.
- Modular networks allow technologies to be easily substituted for one another. Not only does this allow organizations more freedom in the initial design (for example, the core can be either Ethernet or ATM), it makes it easier to upgrade the network in the long run.
624 Chapter 14: Campus Design Models
Distribution Blocks
A large part of the benefit of the multilayer model centers on the concept of a modular
approach to access (IDF) and distribution (MDF) switches. Given a pair of redundant MDF
switches, each IDF/access layer switch forms a triangle of connectivity as shown in Figure
14-7. If there are ten IDF switches connected to a given set of MDF switches, ten triangles
are formed (such as might be the case in a ten-story building). The collection of all triangles
formed by two MDF switches is referred to as a distribution block. Most commonly, a
distribution block equates to all of the IDF and MDF switches located in a single building.
[Figure 14-7: each IDF switch forms a triangle of connectivity with the pair of redundant MDF switches]
Because of its simplicity, the triangle creates the ideal building block for a campus network.
By having two vertical links (IDF uplink connections), it automatically provides
redundancy. Because the redundancy is formed in a predictable, consistent, and
uncomplicated fashion, it is much easier to provide uniformly fast failover performance.
TIP Use the concept of a distribution block to simplify the design and maintenance of your network.
The multilayer model does not take a dogmatic stance on Layer 2 versus Layer 3 switching
(although it is based around the theme that some Layer 3 processing is a requirement in
large networks). Instead, it seeks to create the optimal blend of both Layer 2 and Layer 3
technology to achieve the competing goals of low cost, high performance, and scalability.
To provide cost-effective bandwidth, Layer 2 switches are generally used in the IDF (access
layer) wiring closets. As discussed earlier, the NetFlow Feature Card can add significant
value in the wiring closet with features such as Protocol Filtering and IGMP Snooping.
To provide control, Layer 3 switching should be deployed in the MDF (distribution layer)
closets. This is probably the single most important aspect of the entire design. Without the
Layer 3 component, the distribution blocks are no longer self-contained units. A lack of
Layer 3 processing in the distribution layer causes Spanning Tree, VLANs, and broadcast
domains to spread throughout the entire network. This increases the interdependency of
various pieces of the network, making the network far less scalable and far more likely to
suffer a network-wide outage.
By making use of Layer 3 switching, each distribution block becomes an independent
switching system. The benefits discussed in the "Advantages of Routing" section are baked
into the network. Problems that develop in one part of the network are prevented from
spreading to other parts of the network.
You should also be careful to not circumvent the modularity of the distribution block
concept with random links. For example, Links 1 and 2 in Figure 14-8 break the modularity
of the multilayer model.
Figure 14-8 Links 1 and 2 Break the Modularity of the Multilayer Design
[Figure 14-8: Links 1 and 2 directly connect IDF switches in different distribution blocks, bypassing the MDF routers]
The intent here was good: provide a direct, Layer 2 path between three IDF switches
containing users in the same workgroup. Although this does eliminate one or two router
hops from the paths between these IDF switches, it causes the entire design to start falling
apart. Soon another exception is made, then another, and so on. Before long, the entire
network begins to resemble an interconnected mess more like a bowl of spaghetti than a
carefully planned campus network. Just remember that the scalability and long-term health
of the network are more important than a short-term boost in bandwidth. Avoid spaghetti
networks at all costs.
TIP Be certain to maintain the modularity of distribution blocks. Do not add links or inter-
VLAN bridging that violate the Layer 3 barrier that the multilayer model uses in the
distribution layer.
Without descending too far into marketing speak, it is useful to note the potential
application of Layer 4 switching in the distribution layer. By considering transport layer
port numbers in addition to network layer addressing, Layer 4 switching can more easily
facilitate policy-based networking. However, from a scalability and performance
standpoint, Layer 4 switching does not have a major impact on the overall multilayer
model: it still creates the all-important Layer 3 barrier at the MDF switches.
On the other hand, the choice of Layer 3 switching technology can make a difference in
matters such as addressing and load balancing.
Figure 14-9 Switching Router MDF Switches Break the Network into Two Subnets
[Figure 14-9: the IDF switch and its uplinks form Subnet 1; the switching router MDF pair places the links below in Subnet 2]
The resulting network is completely free of Layer 2 loops. Although some network
designers have viewed this as an opportunity to completely disable the Spanning-Tree
Protocol, this is generally not advisable because misconfiguration errors can easily create
loops in the IDF wiring closet or end-user work areas (therefore possibly taking down the
entire IDF). However, it does mean that STP load balancing cannot be used. Recall from
Chapter 7 that STP load balancing requires two characteristics to be present in the network.
First, it requires redundant paths, something that exists in Figure 14-9. Second, it requires
that these redundant paths form Layer 2 loops, something that the routers in Figure 14-9
prevent. Therefore, some other load balancing technique must be employed.
NOTE The decision of whether or not the Spanning-Tree Protocol should be disabled can be
complex. This book recommends leaving Spanning Tree enabled (even in Layer 2 loop-free
networks such as the one in Figure 14-9) because it provides a safety net for any loops that
might be accidentally formed through the end-user ports. Currently, most organizations
building large-scale campus networks want to take this conservative stance. This choice
seems especially wise when you consider that Spanning Tree does not impose any failover
delay for important topology changes such as a broken IDF uplink. In other words, the use
of Spanning Tree in this environment provides an important benet while having very few
downsides.
For more discussion on the technical intricacies of the Spanning-Tree Protocol, see
Chapters 6 and 7. For more detailed and specic recommendations on using the Spanning-
Tree Protocol in networks utilizing the various forms of Layer 3 switching, see Chapter 15.
In general, some form of HSRP load balancing is the most effective solution. As discussed
in the "HSRP" section of Chapter 11, if the IDF switch contains multiple end-user VLANs,
the VLANs can be configured to alternate active HSRP peers between the MDF switches.
For example, the left switch in Figure 14-9 could be configured as the active HSRP peer for
the odd VLANs, whereas the right switch would handle the even VLANs. However, if the
network only contains a single VLAN on the IDF switch (this is often done to simplify
network administration by making it more like the router and hub model), the Multigroup
HSRP (MHSRP) technique is usually the most appropriate technology. Figure 14-10
illustrates the MHSRP approach.
[Figure 14-10: MHSRP; the two MDF switches serve a single IDF subnet using the two HSRP addresses 10.1.1.1 and 10.1.1.2]
In Figure 14-10, two HSRP groups are created for a single subnet/VLAN. The first group
uses the address 10.1.1.1, whereas the second group uses 10.1.1.2. Notice that both
addresses intentionally fall within the same subnet. Half of the end stations connected to
the IDF switch are then configured to use a primary default gateway of 10.1.1.1, and the
other half use 10.1.1.2 (this can be automated with DHCP). For more information on this
technique, see the "MHSRP" section of Chapter 11 and the "Use DHCP to Solve User
Mobility Problems" section of Chapter 15.
TIP In general, implementing load balancing while using switching routers in the distribution
layer requires multiple IDF VLANs (each with a separate HSRP standby group) or MHSRP
for a single IDF VLAN.
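As a rough sketch of the MHSRP approach (the VLAN interface name, the .3 real address, and the priority values are assumptions for illustration, not taken from the text), the left MDF switch might be configured along these lines:

```
! Left MDF switch: active for HSRP group 1 (10.1.1.1),
! standby for group 2 (10.1.1.2). Sketch only.
interface Vlan10
 ip address 10.1.1.3 255.255.255.0
 standby 1 ip 10.1.1.1
 standby 1 priority 110
 standby 1 preempt
 standby 2 ip 10.1.1.2
 standby 2 priority 90
 standby 2 preempt
```

The right MDF switch would mirror the same two standby groups with the priorities reversed, so that each switch is the active gateway for half of the end stations.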
devices pass Layer 2 traffic by default (this default can be changed). For example, Figure 14-11
illustrates the Layer 2 loops that commonly result when MLS is in use.
Figure 14-11 MLS Often Creates Layer 2 Loops that Require STP Load Balancing
[Figure 14-11: an IDF switch uplinks through Ports 1/1 and 1/2 to two NFFC-equipped MDF switches; Port 1/1 carries VLAN 2 at cost 19 and VLAN 3 at cost 1000, whereas Port 1/2 carries VLAN 3 at cost 19 and VLAN 2 at cost 1000]
Both VLANs 2 and 3 are assigned to all three trunk links, forming a Layer 2 loop. In this
case, STP load balancing is required. As shown in Figure 14-11, the cost for VLAN 3 on
the 1/1 IDF port can be increased to 1000, and the same can be done for VLAN 2 on Port
1/2. For more detailed information on STP load balancing, please see Chapter 7.
TIP The Layer 2/3 hybrid nature of MLS generally requires STP load balancing.
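On a CatOS-based IDF switch, the cost changes shown in Figure 14-11 might be entered along these lines (a sketch; the port numbers follow the figure):

```
! Raise the VLAN 3 cost on uplink Port 1/1 so VLAN 3 prefers Port 1/2...
set spantree portvlancost 1/1 cost 1000 3
! ...and raise the VLAN 2 cost on Port 1/2 so VLAN 2 prefers Port 1/1.
set spantree portvlancost 1/2 cost 1000 2
```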
Core
Designing the core of a multilayer network is one of the areas where creativity and careful
planning can come into play. Unlike the distribution blocks, there is no set design for a
multilayer core. This section discusses some of the design factors that should be taken into
consideration.
One of the primary concerns when designing a campus core backbone should be fast
failover and convergence behavior. Because of the reliance on Layer 3 processing in the
multilayer design, fast-converging routing protocols can be used instead of the slower
Spanning-Tree Protocol. However, one must be careful to avoid unexpected Spanning Tree
slowdowns within the core itself.
Another concern is that of VLANs. In some cases, the core can utilize a single flat VLAN
that spans one or more Layer 2 core switches. In other cases, traffic can be segregated into
VLANs for a variety of reasons. For example, multiple VLANs can be used for policy
reasons or to separate the different Layer 3 protocols. A separate management VLAN is
also desirable when using Layer 2-oriented switches.
Broadcast and multicast traffic are other areas of concern. As much as possible, broadcasts
should be kept off of the network's core. Because the multilayer model uses Layer 3
switching in the MDF devices, this usually isn't an issue. Likewise, multicast traffic also
benefits from the use of routers in the multilayer model. If the core makes use of routing,
Protocol Independent Multicast (PIM) can be used to dynamically build optimized
multicast distribution trees. If sparse-mode PIM is used, the rendezvous point (RP) can be
placed on a Layer 3 switch in the core. If the core is composed of Layer 2 switches only,
then CGMP or IGMP Snooping can be deployed to reduce multicast flooding within
the core.
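As an illustration of the routed-core case, sparse-mode PIM might be enabled along these lines on each core Layer 3 switch (the interface name and RP address are assumptions):

```
! Enable multicast routing globally.
ip multicast-routing
! Run sparse-mode PIM on each core-facing interface.
interface GigabitEthernet0/1
 ip pim sparse-mode
! Point the network at a rendezvous point hosted on a core Layer 3 switch.
ip pim rp-address 10.1.0.1
```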
One of the important decisions facing every campus network designer has to do with the
choice of media and switching technology. The majority of campus networks currently
utilize Fast and Gigabit Ethernet within the core. However, ATM can be a viable choice in
many cases. Because it supports a wide range of services, can integrate well with wide area
networks, and provides extremely low-latency switching, ATM has many appealing
aspects. Also, Multiprotocol Label Switching (MPLS, also known as Tag Switching),
traditionally seen as a WAN-only technology, is likely to become increasingly common in
very large campus backbones. Because it provides excellent traffic engineering capabilities
and very tight integration between Layer 2 and 3, MPLS can be extremely useful in all sorts
of network designs.
However, the most critical decision has to do with the switching characteristics of the core.
In some cases, a Layer 2 core is optimal; other networks benefit from a Layer 3 core. The
following sections discuss issues particular to each.
Layer 2 Core
Figure 14-12 depicts the typical Layer 2 core in a multilayer network.
[Figure 14-12: Layer 2 access (IDF), Layer 3 distribution, and a redundant pair of Layer 2 core switches]
This creates an L2/L3/L2 profile throughout the network. The network's intelligence is
contained in the distribution-layer MDF switches. Both the access (IDF) and core switches
utilize Layer 2 switching to maintain a high price/performance ratio. To provide
redundancy, a pair of switches form the core. Because the core uses Layer 2 processing, this
approach is most suitable for small to medium campus backbones.
When building a Layer 2 core, Spanning Tree failover performance should be closely
analyzed. Otherwise, the entire network can suffer from excessively slow reconvergence.
Because the equipment comprising the campus core should be housed in tightly controlled
locations, it is often desirable to disable Spanning Tree entirely within the core of the
network.
TIP I recommend that you only disable Spanning Tree in the core if you are using switching
routers in the distribution layer. If MLS is in use, its Layer 2 orientation makes it too easy
to misconfigure a distribution switch and create a bridging loop.
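On a CatOS-based core switch, Spanning Tree is disabled on a per-VLAN basis; for example (the VLAN number is an assumption for illustration):

```
! Disable Spanning Tree for VLAN 10 on this switch only.
set spantree disable 10
```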
One way to accomplish this is through the use of multiple VLANs that have been carefully
assigned to links in a manner that creates a loop-free topology within each VLAN. An
alternate approach consists of physically removing cables that create Layer 2 loops. For
example, consider Figure 14-13.
[Figure 14-13: two distribution blocks (Layer 2 access, Layer 3 distribution) connected through four Layer 2 core switches cabled in a loop-free topology]
In Figure 14-13, the four Layer 2 switches forming the core have been kept loop free at
Layer 2. Although a redundant path does exist through each distribution (MDF) switch, the
pure routing behavior of these nodes prevents any Layer 2 loops from forming.
If Spanning Tree is required within the core, blocked ports should be closely analyzed.
Because STP load balancing can be very tricky to implement in the network core,
compromises might be necessary.
In addition to Spanning Tree, there are several other issues to look for in a Layer 2 core.
First, be careful that multicast flooding is not a problem. As mentioned earlier, IGMP
Snooping and CGMP can be useful tools in this situation (also see Chapter 13). Second,
keep an eye on router peering limits as the network grows. Because each MDF switch is a
router under the multilayer model, a Layer 2 core creates the appearance of many routers
sitting around a single subnet. If the number of routers becomes too large, this can easily
lead to excessive state information, erratic behavior, and slow convergence. In this case, it
can be desirable to break the network into multiple VLANs that reduce peering.
TIP Be careful to avoid excessive router peering when using Catalyst 8500s. One of the easiest
ways to accomplish this is through the use of a Layer 3 core (see the next section).
A Layer 2 core can provide a very useful campus backbone. However, because of the potential
issues and scaling limits, it is most appropriate in small to medium campus networks.
TIP A Layer 2 core can be a cost-effective solution for smaller campus networks.
Layer 3 Core
Figure 14-14 redraws Figure 14-12 with a Layer 3 core.
[Figure 14-14: Layer 2 access (IDF), Layer 3 distribution, and a redundant pair of Layer 3 core switches]
Although Figure 14-12 and Figure 14-14 look very similar, the use of Layer 3 switching
within the core makes several important changes to the network.
First, the path determination is no longer contained only within the distribution layer
switches. With a Layer 3 core, the path determination is spread throughout the distribution
and core layer switches. This more decentralized approach can provide many benefits:
- Higher aggregate forwarding capacity
- Superior multicast control
- Flexible and easy-to-configure load balancing
- Scalability
- Reduced router peering
- IOS features throughout a large percentage of the network
In short, the power and flexibility of Layer 3 processing eliminates many of the issues
discussed concerning Layer 2 backbones. For example, the switches can be connected in a
wide variety of looped configurations without concern for bridging loops or STP
performance. By cross-linking core switches, redundancy and performance can be
maximized. Also, by placing routing nodes within the campus core, the router mesh and
peering between the distribution switches can be dramatically reduced (however, it is still
advisable to consider areas of excessive router peering).
Notice that a Layer 3 core does add additional hops to the path of most traffic. In the case
of a Layer 2 core, most traffic requires two hops, one through the end user's MDF switch
and the other through the server farm's MDF switch. In the case of a Layer 3 core, an
additional hop (or two) is added. However, several factors minimize this concern:
- The consistent and modular design of the multilayer model guarantees a consistent and small number of router hops. In general, no more than four router hops within the campus should ever be necessary.
- Many Layer 3 switches have latencies comparable to Layer 2 switches.
- Windowing protocols (such as TCP or IPX Burst Mode) reduce the impact of latency for most applications.
- Switching latency is often a very small part of overall latency. In other words, latency is not as big an issue as most people make it out to be.
- The scalability benefits of Layer 3 are generally far more important than any latency concerns.
Figure 14-15 The Server Farm Can Form Another Distribution Block
[Figure 14-15: two workgroup distribution blocks and a server farm distribution block, each with Layer 2 access and Layer 3 distribution, connected through the core]
TIP An enterprise server farm is usually best implemented as another distribution block that
connects to the core.
Specific tips for server farm design are discussed in considerably more detail in the "Server
Farms" section of Chapter 15.
TIP The MLS approach to Layer 3 switching can lead to excessive VLAN propagation. Use a
different VTP domain name for each distribution block to overcome this default behavior.
When VTP domains are in use, it is usually best to make the names descriptive of the
distribution block (for example, Building1 and Building2).
TIP Recall from Chapter 8 that when using trunk links between different VTP domains, the
trunk state needs to be hard-coded to on. The use of auto and desirable will not work
across VTP domain names (in other words, the DISL and DTP protocols check for
matching VTP domain names).
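On a CatOS switch, these two tips translate into commands along the following lines (the domain name and port number are assumptions):

```
! Give each distribution block its own VTP domain name.
set vtp domain Building1
! Hard-code the trunk state on a link that crosses VTP domain boundaries.
set trunk 1/1 on
```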
IP Addressing
In a very large campus network, it is usually best to assign bitwise contiguous blocks of
address spaces to each distribution block. This allows the routers in each distribution block
to summarize all of the subnets within that block into a single advertisement that gets sent
into the core backbone. For example, the single advertisement 10.1.16.0/20 (/20 is a
shorthand way to represent the subnet mask 255.255.240.0) can summarize the entire range
of 16 subnets from 10.1.16.0/24 to 10.1.31.0/24 (/24 is equivalent to the subnet mask
255.255.255.0). This is illustrated in Figure 14-16.
[Figure 14-16: binary view of the /20 summary address and the 16 /24 subnets it covers; each bit position is labeled with its weight (128, 64, 32, 16, 8, 4, 2, 1)]
Summary Address (/20 Subnet Mask):
0000 1010 0000 0001 0001 0000 0000 0000 = 10.1.16.0/20

Summarized Addresses (/24 Subnet Mask):
0000 1010 0000 0001 0001 0000 0000 0000 = 10.1.16.0/24
0000 1010 0000 0001 0001 0001 0000 0000 = 10.1.17.0/24
0000 1010 0000 0001 0001 0010 0000 0000 = 10.1.18.0/24
0000 1010 0000 0001 0001 0011 0000 0000 = 10.1.19.0/24
0000 1010 0000 0001 0001 0100 0000 0000 = 10.1.20.0/24
0000 1010 0000 0001 0001 0101 0000 0000 = 10.1.21.0/24
0000 1010 0000 0001 0001 0110 0000 0000 = 10.1.22.0/24
0000 1010 0000 0001 0001 0111 0000 0000 = 10.1.23.0/24
0000 1010 0000 0001 0001 1000 0000 0000 = 10.1.24.0/24
0000 1010 0000 0001 0001 1001 0000 0000 = 10.1.25.0/24
0000 1010 0000 0001 0001 1010 0000 0000 = 10.1.26.0/24
0000 1010 0000 0001 0001 1011 0000 0000 = 10.1.27.0/24
0000 1010 0000 0001 0001 1100 0000 0000 = 10.1.28.0/24
0000 1010 0000 0001 0001 1101 0000 0000 = 10.1.29.0/24
0000 1010 0000 0001 0001 1110 0000 0000 = 10.1.30.0/24
0000 1010 0000 0001 0001 1111 0000 0000 = 10.1.31.0/24
(These 16 entries can be summarized into the single /20 above.)
As shown in Figure 14-16, the /20 and /24 subnet masks (or network prefixes) differ by four
bits (in other words, /20 is four bits shorter than /24). These are the only four bits that
differ between the 16 /24 subnet addresses. In other words, because all 16 /24 subnet
addresses match in the first 20 bits, a single /20 address can be used to summarize all of
them.
In a real-world distribution block, the 16 individual /24 subnets can be applied to 16
different end-user VLANs. However, outside the distribution block, a classless IP routing
protocol can distribute the single /20 route of 10.1.16.0/20.
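The summarization arithmetic can be verified with Python's standard ipaddress module; this small sketch (not from the book) collapses the 16 /24 subnets into the single /20 advertisement:

```python
import ipaddress

# The 16 per-VLAN subnets assigned within one distribution block.
subnets = [ipaddress.ip_network(f"10.1.{third}.0/24") for third in range(16, 32)]

# collapse_addresses merges bitwise-contiguous networks into the
# shortest covering prefixes -- here, a single /20.
summary = list(ipaddress.collapse_addresses(subnets))
print(summary)  # [IPv4Network('10.1.16.0/20')]
```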
TIP In very large campus networks, try to plan for future growth and address summarization by
pre-allocating bitwise contiguous blocks of address space.
Network Migrations
Finally, the modularity of the multilayer model can make migrations much easier. In
general, the entire old network can appear as a single distribution block to the rest of the
new network (for example, imagine that the server farm distribution block in Figure 14-15
is the old network). Although the old network generally does not have all of the benefits of
the multilayer model, it provides a redundant and routed linkage between the two networks.
After the migration is complete, the old network can be disabled.
Exercises
This section includes a variety of questions on the topic of this chapter: campus design
concepts and models. By completing these, you can test your mastery of the material
included in this chapter as well as help prepare yourself for the CCIE written and lab tests.
Review Questions
1 What are some of the unique requirements of an IDF switch?
2 What are some of the unique requirements of an MDF switch?
3 Describe the access/distribution/core terminology.
4 Why is routing an important part of any large network design?
5 What networks work best with the router and hub model?
6 What are the benefits of the campus-wide VLANs model?
7 What are the downsides of the campus-wide VLANs model?
8 Describe the concept of a distribution block.
9 Why is it important to have modularity in a network?
10 What are the concerns that arise when using a Layer 2 core versus a Layer 3 core?
11 How should a server farm be implemented in the multilayer model?
Design Lab
Design two campus networks that meet the following requirements. The first design should
employ the campus-wide VLANs model using Catalyst 5509 switches. The second design
should implement the multilayer model by using Catalyst 8540 MDF switches and Catalyst
5509 IDF switches. Here are the requirements:
- The campus contains three buildings.
- Each building has four floors.
- Each floor has one IDF switch. (In reality there would be more; however, these can be eliminated from this exercise for simplicity.)
- Each building has two MDF switches in the basement.
- Each IDF has redundant links (one to each MDF switch).
- The MDF switches are fully or partially meshed (choose which one you feel is more appropriate) with Gigabit Ethernet links (in other words, the core does not use a third layer of switches).
- Each IDF switch should have a unique management VLAN where SC0 can be assigned.
- In the campus-wide VLANs design, assume there are 12 VLANs and that every IDF switch participates in every VLAN.
- In the multilayer design, assume that every IDF switch only participates in a single end-user VLAN (for administrative simplicity).
How many VLANs are required under both designs?
This chapter covers the following key topics:
- VLANs: The chapter begins with a range of virtual LAN (VLAN)-related topics, from using VLANs to create a scalable design to pruning VLANs from trunk links.
- Spanning Tree: Covers important Spanning Tree issues that are essential to constructing a stable network.
- Load Balancing: Discusses the five techniques available for increasing campus network bandwidth.
- Routing/Layer 3 Switching: Discusses issues such as MLS (routing switches) and switching routers.
- ATM: Examines valid reasons for using ATM in your campus network and how to deploy it in a scalable fashion.
- Campus Migrations: Provides recommendations for migrating your campus network.
- Server Farms: Covers some basic server farm design principles.
- Additional Campus Design Recommendations: Discusses several other design issues such as VTP, port configurations, and passwords.
CHAPTER 15
VLANs
When the word "switching" is brought up, the first thing that comes to most network
engineers' minds is the subject of VLANs. The use of VLANs can make or break a campus
design. This section discusses some of the most important issues to remember when
designing and implementing VLANs in your network.
Second, campus-wide VLANs make it possible to use technology like Cisco's User
Registration Tool (URT). By functioning as a sophisticated extension to the VLAN
membership policy server (VMPS) technology discussed in Chapter 5, "VLANs," URT
allows VLAN placement to be transparently determined by authentication servers such as
Windows NT Domain Controllers and NetWare Directory Services (NDS). Organizations
such as universities have found this feature very appealing because they can create one or
more VLANs for professors and administrative staff while creating separate VLANs for
students. Consequently, the same physical campus infrastructure can be used to logically
segregate the student traffic while still supporting roving laptop users.
The third benefit of campus-wide VLANs is actually implied by the second benefit: campus-wide VLANs allow these roving users to be controlled by a centralized set of access lists. For example, a university using campus-wide VLANs might utilize a pair of 7500 routers located in the data center for all inter-VLAN routing. As a result, access lists between the VLANs only need to be configured in two places. Consider the alternative where routers (or Layer 3 switches) might be deployed in every building on campus. To maintain user mobility, each of these routers needs to be configured with all of the VLANs and access lists used throughout the entire campus. This can obviously lead to a situation where potentially hundreds of access lists must be maintained.
TIP Although campus-wide VLANs have several well-publicized benets and are quite popular, they
create many network design and management issues. Try to avoid using campus-wide VLANs.
Although these advantages are very alluring, many organizations that implement this approach quickly discover their downsides. Most of the disadvantages are the result of one characteristic of campus-wide VLANs: a lack of hierarchy. Specifically, this lack of hierarchy creates significant scalability problems that can affect the network's stability and maintainability. Furthermore, these problems are often difficult to troubleshoot because of the dynamic and non-deterministic nature of campus-wide VLANs (not to mention that it can be difficult to know where to start troubleshooting in a flat network). For more information on these issues, please refer to Chapter 14, "Campus Design Models," Chapter 11, "Layer 3 Switching," and Chapter 17, "Case Studies: Implementing Switches."
Although many books and vendors discuss campus-wide VLANs as simply the way to use switching, Layer 3 switching introduces a completely different approach that is definitely worthy of consideration. Chapter 14 discussed these Layer 3 approaches under the heading of the multilayer campus design model. Although this approach cannot match the support for centralized access lists available under campus-wide VLANs, it can allow you to build and maintain much larger networks than is typically possible with campus-wide VLANs. Layer 3 switching can also be used with the Dynamic Host Configuration Protocol (DHCP), a very proven and scalable technique for handling user mobility (see the next section). Therefore, as a general rule of thumb, use the multilayer model as your default design choice and only use flat earth designs if there is a compelling reason to justify the risks. For more information on the advantages and implementation details of the multilayer model, see Chapter 11, Chapter 14, and Chapter 17.
Note that this implies a fundamental difference in how VLANs are used between the two
design models. In the case of campus-wide VLANs, VLANs are used to create logical
partitions unique to the entire campus network. In the case of the multilayer model, they are
used to create logical partitions that may be unique to a single IDF/access layer wiring closet.
TIP The multilayer design model uses VLANs in a completely different fashion from the
campus-wide VLANs model. In the multilayer model, VLANs are very often only unique
to a single IDF device whereas campus-wide VLANs are globally unique.
TIP Be careful to not simply enter no ip forward-protocol udp. Prior to 12.0, entering this command disabled all of the default UDP ports, including ports 67 and 68 that are used by DHCP. Although no ip forward-protocol udp does not disable DHCP in early releases of 12.0, proceed with caution. For an example of ip helper-address and no ip forward-protocol, see Chapter 17.
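On the router side, the safer alternative is to keep DHCP relay working via ip helper-address and disable only the specific default UDP ports you do not want forwarded. A minimal sketch (the interface, addresses, and port choices here are hypothetical, not from the book):

```
! Hypothetical RSM/router configuration for DHCP relay
interface Vlan10
 ip address 10.1.10.1 255.255.255.0
 ! Relay end-station UDP broadcasts (including DHCP) to this server
 ip helper-address 10.1.100.5
!
! Disable unwanted default UDP ports one at a time instead of
! entering "no ip forward-protocol udp", which (pre-12.0) would
! also kill DHCP forwarding on ports 67 and 68
no ip forward-protocol udp 137
no ip forward-protocol udp 138
```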
VLAN Numbering
Although VLAN numbering is a very simple task, having a well thought-out plan can help
make the network easier to understand and manage in the long run. In general, there are two
approaches to VLAN numbering:
Globally-unique VLAN numbers
Pattern-based VLAN numbers
In globally-unique VLAN numbers, every VLAN has a unique numeric identifier. For example, consider the network shown in Figure 15-1. Here, the VLANs in Building 1 use numbers 10-13, Building 2 uses 20-23, and Building 3 uses 30-33.
646 Chapter 15: Campus Design Implementation
Figure 15-1 Globally-Unique VLAN Numbering (Building 1 IDFs: Management VLANs 10 and 12, User VLANs 11 and 13; Building 2 IDFs: Management VLANs 20 and 22, User VLANs 21 and 23; Building 3 IDFs: Management VLANs 30 and 32, User VLANs 31 and 33; all buildings connect through MDF switches to a common core)
TIP When using globally-unique VLANs, try to establish an easily remembered scheme such
as the one used in Figure 15-1 (Building 1 uses VLANs 1X, Building 2 uses 2X, and so on).
In the case of pattern-based VLAN numbers, the same VLAN number is used for the same purpose in each building. For example, Figure 15-2 shows a network where the management VLAN is always 1, the first end-user VLAN is 2, the second end-user VLAN is 3, and so on.
Figure 15-2 Pattern-Based VLAN Numbering (in every building, each IDF uses Management VLAN 1 and User VLAN 2, with the buildings connecting through a common core)
Which approach you use is primarily driven by what type of design model you adopt. If you have utilized the campus-wide VLANs model, you are essentially forced to use globally-unique VLAN numbers. Although there are special cases and hacks where this may not be true, not using unique VLANs in flat designs can lead to cross-mapped VLANs and widespread connectivity problems.
If you are using the multilayer model, either numbering scheme can be adopted. Because VLANs are terminated at MDF/distribution layer switches, there is no underlying technical requirement that the VLAN numbers must match (this is especially true when using switching router platforms such as the Catalyst 8500). In fact, even if the VLAN numbers do match, they are still maintained as completely separate broadcast domains because of Layer 3 switching/routing. If you like the simplicity of knowing that the management VLAN is always VLAN 1, the pattern-based approach might be more appropriate. On the other hand, some organizations prefer to keep every VLAN number unique just as every IP subnet is unique (this approach often ties the VLAN number to the subnet number; for example, VLAN 25 might be 10.1.25.0/24). In other cases, a blend of the two numbering schemes works best. Here, organizations typically adopt a single number for use in all management VLANs but use unique numbers for end-user VLANs.
TIP The multilayer model can be used with both globally-unique VLANs and pattern-based
VLANs.
TIP Descriptive VLAN names are especially important when using campus-wide VLANs. Although VLAN names are less important when the multilayer design model is in use, the names should at least differentiate management and end-user traffic. Try to include the name of the department or IDF/access layer closet where the VLAN is used. Also, some organizations like to include the IP subnet number in the VLAN name.
TIP Make sure every Layer 2 switch participates in at least two VLANs: one that functions as
the management VLAN and one or more for end-user VLANs.
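In CatOS terms, this separation might look like the following sketch (the VLAN numbers, name, IP address, and port range are illustrative assumptions):

```
! Create a management VLAN and an end-user VLAN
set vlan 2 name Management
set vlan 10 name Users-IDF-1A
! Place the Supervisor's sc0 interface in the management VLAN
set interface sc0 2 10.1.2.11 255.255.255.0
! Assign the end-user ports to the user VLAN
set vlan 10 3/1-24
```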
However, this is not to suggest that having more than two VLANs is a good idea. To the contrary, the simplicity of maintaining a single end-user VLAN (or at least a small number) can be very beneficial for network maintenance.
Why, then, is it so important to have at least two VLANs? Think back to the material discussed in Chapter 5 regarding the impact of broadcasts on end stations. Because broadcasts are not filtered by hardware on-board the network interface card (NIC), every broadcast is passed up to Layer 3 using an interrupt to the central CPU. The more time that the CPU spends looking at unwanted broadcast packets, the less time it has for more useful tasks (like playing Doom!).
Well, the CPU on a Catalyst's Supervisor is no different. The CPU must inspect every broadcast packet to determine if it is an ARP destined for its IP address or some other interesting broadcast packet. However, if the level of uninteresting traffic becomes too large, the CPU can become overwhelmed and start dropping packets. If it drops Doom packets, no harm is done. On the other hand, if it drops Spanning Tree BPDUs, the whole network could destabilize.
NOTE Note that this section is referring to Layer 2 Catalysts such as the 2900s, 4000s, 5000s, and
6000s. Because these devices currently have one IP address that is only assigned to a single
VLAN, the selection of this VLAN can be important. On the other hand, this point generally
does not apply to router-like Catalysts such as the 8500. Because these platforms generally
have an IP address assigned to every VLAN, trying to pick the best VLAN for an IP address
obviously becomes irrelevant. For more information on the Catalyst 8500, see Chapter 11.
In fact, this Spanning Tree problem is one of the more common issues in flat earth campus networks. The story usually goes something like this: The network is humming along fine until a burst of broadcast data in the management VLAN causes a switch to become overwhelmed to the point where it starts dropping packets. Because some of these packets are BPDUs, the switch falls behind in its Spanning Tree information and inadvertently creates a Layer 2 loop in the network. At this point, the broadcasts in the network go into a full feedback loop as discussed in Chapter 6, "Understanding Spanning Tree."
If this loop occurs in one or more VLANs other than the management VLAN, it can quickly starve out all remaining trunk bandwidth throughout the entire campus in a flat network. However, the Supervisor CPUs are insulated by the VLAN switching ASICs and continue operating normally (recall that all data forwarding is handled by ASICs in Catalyst gear). On the other hand, if the loop occurs in the management VLAN (the VLAN where SC0 is assigned), the results can be truly catastrophic. Suddenly, every switch CPU is hit with a tidal wave of broadcast traffic, completely crushing every switch in a downward spiral that virtually eliminates any chance of the network recovering from this problem. If a network is utilizing campus-wide VLANs, this problem can spread to every switch within a matter of seconds.
NOTE Recall that SC0 is the management interface used in Catalyst switches such as the 4000s,
5000s, and 6000s. This is where the management IP address is assigned to a Catalyst
Supervisor. Because the CPU processes all broadcast packets (and some multicast packets)
received on this interface, it is important to not overwhelm the CPU.
How do you know if your CPU is struggling to keep up with traffic in the network? First, you can use the Catalyst 5000 show inband command (this is used for Supervisor IIIs; use show biga on Supervisor Is and IIs [biga stands for Backplane Interface Gate Array]) to display low-level statistics for the device. Look under the Receive section for the RsrcErrors field. This lists the number of received frames that were dropped by the CPU. Second, to view the load directly on the CPU, use the undocumented command ps -c. The final line of this display lists the CPU idle time (subtract from 100 to calculate the load). Note that ps -c has been replaced by show proc cpu in newer images.
TIP Use the show inband, show biga, ps -c, and show proc cpu commands to determine if
your CPU is overloaded.
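The commands themselves, for reference (availability depends on the Supervisor model and code version):

```
! Supervisor III: low-level statistics; check RsrcErrors
! under the Receive section
show inband
! Supervisor I/II equivalent of show inband
show biga
! Older images: undocumented; last line shows CPU idle time
ps -c
! Newer images: replacement for ps -c
show proc cpu
```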
If you find that you are facing a problem of CPU overload, also read the section "Consider Using Loop-Free Management VLANs" later in this chapter.
TIP Never mix end-user traffic with control and management traffic.
When implementing this principle, you must generally choose one of two designs:
Use VLAN 1 for all control and management traffic while placing end-user traffic in other VLANs (VLANs 2-1000).
Use VLAN 1 for control traffic, another VLAN (such as VLAN 2) for management traffic, and the remaining VLANs for end-user traffic (such as VLANs 3-1000).
The first option combines control and management traffic in VLAN 1. The advantage of this approach is management simplicity (it is the default setting and uses a single VLAN). The primary disadvantage of this approach centers around the default behavior of VLAN 1: because VLAN 1 cannot currently be removed from trunk links, it is easy for this VLAN to become extremely large. For example, the use of Ethernet trunks throughout a network along with MLS Layer 3 switching in the MDF/distribution layer will result in VLAN 1 spanning every link and every switch in the campus, exactly what you do not want for your all-important management VLAN. Therefore, placing SC0 in such a large and flat VLAN can be risky.
NOTE Although VLAN 1 cannot be removed from Ethernet trunks in current versions of Catalyst code, Cisco is developing a feature that will provide this capability in the future. In short, this feature is expected to allow VLAN 1 to be removed from both trunk links and the VTP VLAN database. Therefore, from a user-interface perspective, enabling this feature effectively removes VLAN 1 from the device. However, from the point of view of the Catalyst internals, the VLAN will actually remain in use, but only for control traffic such as VTP and CDP (for example, a Sniffer will reveal these packets tagged with a VLAN 1 header on trunk links). In other words, this feature will essentially convert VLAN 1 into a reserved VLAN that can only be used for control traffic.
This risk can be avoided with the second option where the control and management traffic are separated. Whereas the control traffic must use VLAN 1, the management traffic is relocated to a different VLAN (many organizations choose to use VLAN 2, 999, or 1000). As a result, SC0 and the CPU will be insulated from potential broadcast problems in VLAN 1. This optimization can be particularly important in extremely large campus networks that are lacking in Layer 3 hierarchy.
TIP For the most conservative management/control VLAN design, only use VLAN 1 for control traffic while placing SC0 in its own VLAN (in other words, no end-user traffic will use this VLAN).
Also, when using the upcoming feature that removes VLAN 1 from a Catalyst, you are
effectively forced to use this approach.
TIP If you reconfigure SC0 for troubleshooting (or other) purposes, be sure to return it to its original state.
However, there is a hidden downside to the advantage of every switch not needing to know what VLANs other switches are using: flooded traffic must be sent to every switch in the Layer 2 network. In other words, by default, one copy of every broadcast, multicast, and unknown unicast frame is flooded across every trunk link in a Layer 2 domain.
Two approaches can be used to reduce the impact of this flooding. First, note that if you are using campus-wide VLANs, this flooding problem also becomes campus-wide. Therefore, one of the simplest and most scalable ways to reduce this flooding is to partition the network with several Layer 3 barriers that utilize routing (Layer 3 switching) technology. This breaks the network into smaller Layer 2 pockets and constrains the flooding to each pocket.
Where Layer 3 switching cannot prevent unnecessary flooding (such as with campus-wide VLANs or within each of the Layer 2 pockets created by Layer 3 switching), a second technique of VLAN pruning can be employed. By using the clear trunk command discussed in Chapter 8, "Trunking Technologies and Applications," unused VLANs can be manually pruned from a trunk. Therefore, when a given switch needs to flood a frame, it only sends it out access ports locally assigned to the source VLAN and trunk links that have not been pruned of this VLAN. For example, an MDF switch can be configured to flood frames only for VLANs 1 and 2 to a given IDF switch if the switch only participates in these two VLANs. To automate the process of pruning, VTP pruning can be used. For more information on VTP pruning, please refer to Chapter 12, "VLAN Trunking Protocol."
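As a sketch, restricting an MDF uplink to the two VLANs a given IDF switch actually uses could look like this (the module/port numbers and VLAN range are assumptions):

```
! Manually prune everything except VLANs 1 and 2 from the trunk
! (VLAN 1 cannot be cleared from Ethernet trunks in current code)
clear trunk 1/1 3-1005
! Or let VTP prune unused VLANs automatically across the domain
set vtp pruning enable
```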
One of the most important uses of manual VLAN pruning involves the use of a Layer 2
campus core, the subject of the next section.
TIP VLAN pruning on trunk links is one of the most important keys to the successful implementation of a network containing Layer 2 Catalyst switching.
Figure 15-3 A Loop-Free Layer 2 Core (Core-A and Core-B carry the core VLANs; be sure to remove the core VLAN from the links between the MDF and IDF switches)
The core in Figure 15-3 is formed by a pair of redundant Layer 2 switches each carrying a single VLAN. All four of the MDF switches connect to one of the core switches (Core-A or Core-B), allowing any single link or switch to fail without creating a permanent outage. If the four MDF switches are configured with Catalyst 8500-style switching routers, then this will automatically result in a loop-free core. On the other hand, the use of Layer 3 routing switches (MLS) in the MDF devices requires more careful planning. Specifically, the core VLAN must be removed from the links to IDF switches as well as on the link between MDF switches.
TIP When using MLS (and other forms of routing switches), be certain that you remove the core
VLAN from links within the distribution block (the triangles of connectivity formed by
MDF and IDF switches).
Larger Layer 2 campus cores require even more careful planning. For example, Figure 15-4
shows a network that covers a larger geographic area and therefore uses four Layer 2 switches
within the core. This design is often referred to as a split Layer 2 core.
Figure 15-4 A Split Layer 2 Core (four core switches serving the IDF/MDF distribution blocks, with no links between the core switches; Core-A and Core-C form one core VLAN, Core-B and Core-D the other)
In this case, the key to creating a fast-converging and resilient core is to actually partition the core into two separate VLANs and not cross-link the switches to each other. The first core VLAN is used for the pair of switches on the left, and the second VLAN is used for the pair of switches on the right. If the core switches in Figure 15-4 were cross-linked or fully meshed and a single VLAN were deployed, Spanning Tree convergence and load balancing issues would become a problem.
Finally, notice that creating a loop-free core requires the use of Layer 3 switching in the MDF/distribution layer closets. When using campus-wide VLANs, the only way to achieve a loop-free core is to remove all loops from the entire network, obviously a risky endeavor if you are at all concerned about redundancy. Again, follow the suggestion of this chapter's first section and try to always use the multilayer model and the scalability benefits it achieves through the use of Layer 3 switching.
TIP When using split Layer 2 cores, some network designers choose to segregate the traffic by protocol to provide additional control. For example, the Core-A and Core-C switches could be used for IP traffic while the Core-B and Core-D switches carry IPX traffic. This can be a useful way of guaranteeing a certain amount of bandwidth for each protocol. It is especially useful when you have non-routable protocols that require bridging throughout a large section of the network. This allows one half of the core to carry the non-routable/bridged traffic while the other half carries the multiprotocol routed traffic.
This section has repeatedly discussed the pruning of VLANs from links. Obviously, one way to accomplish this is to use the clear trunk command discussed in the "Restricting VLANs on a Trunk" section of Chapter 8. However, the simplest and most effective approach for removing VLANs from a campus core is to just use non-trunk links. By merely assigning these ports to the core VLAN, you will automatically prevent VLANs from spanning the core and creating flat earth VLANs.
TIP Use non-trunk links in the campus core to avoid campus-wide end-user VLANs.
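In practice, this means turning trunking off on the core-facing ports and statically assigning them to the core VLAN. A sketch (the port and VLAN numbers are illustrative):

```
! Force the core-facing port to be a non-trunk (access) link
set trunk 1/1 off
! Statically assign the port to the core VLAN
set vlan 100 1/1
```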
In fact, this technique is also the most effective method of removing VLAN 1 from the core.
Recall that current versions of Catalyst code do not allow you to prune VLAN 1 from Ethernet
trunks. Therefore, as discussed earlier, this can result in a single campus-wide VLAN in the
all-important VLAN 1 (the last place you want to have loops and broadcast problems).
TIP Use non-trunk links in the campus core to avoid a campus-wide VLAN in VLAN 1 (this is where you least want a flat earth VLAN, especially if SC0 is assigned to VLAN 1).
Figure 15-6 Each VLAN Redrawn as a Separate Segment on a Router Interface (Red: 10.0.1.0/24, Green: 10.0.3.0/24, Blue: 10.0.2.0/24, with the host 10.0.2.183 drawn on a non-Blue segment)
Each VLAN in Figure 15-6 has been redrawn as a separate segment connected to a different router interface. It replaces the logical separation of VLANs with the physical separation used in traditional router and hub designs. However, from a Layer 3 perspective, both networks are identical.
By doing this, it makes the network extremely easy to understand. In fact, it makes it painfully obvious that this network contains a problem: the host using 10.0.2.183 is located on the wrong segment/VLAN (it should be on the Blue VLAN).
Although this might seem like a simple example, simple addressing issues trip up even the best of us from time to time. Why not use a technique that removes VLANs as an extra layer of obfuscation? However, PLANs can be useful in many situations other than for your own troubleshooting. Even if you understand why a network is having a problem, PLANs can be useful for explaining it to other people who might not see the problem as clearly. PLANs can also be used to simplify a new design and help you better analyze the traffic flows and any potential problems.
TIP PLANs are no joke: use them to help troubleshoot and explain your network.
Spanning Tree
Intertwined with the issue of VLANs is the subject of the Spanning-Tree Protocol. In fact, it is the inappropriate use of VLANs (the flat earth theory) that most often leads to Spanning Tree problems in the first place. This section discusses some of the dos and don'ts of the Spanning-Tree Protocol.
One of the primary themes developed throughout this section is that although Spanning Tree can be quite manageable when used in conjunction with Layer 3 switching, it can also become very complex when used in large, flat designs like campus-wide VLANs. Combining good Spanning Tree knowledge with a good design is the key to success.
When using the switching router (Catalyst 8500) form of the multilayer design model, Spanning Tree load balancing can be eliminated. In this case, the IDF traffic creates Layer 2 Vs that are inherently loop free and therefore do not require the Spanning-Tree Protocol, although I recommend that you don't disable Spanning Tree; see the next section.
Note If you enable bridging and/or IRB on Catalyst 8500 devices, they will start bridging traffic and convert the Layer 2 Vs that they produce by default into Layer 2 triangles. This will obviously require the use of Spanning Tree (use of the Root Bridge placement technique discussed in the following bullet point is recommended).
When using the routing switch (MLS) form of the multilayer design model, Spanning Tree load balancing can be dramatically simplified through the use of the Root Bridge placement technique. When using MLS and the multilayer model, each IDF and a pair of MDFs create Layer 2 triangles that, although not loop free, are easy to manage. For more information on the Root Bridge placement approach to Spanning Tree load balancing, see Chapter 7, "Advanced Spanning Tree," and Chapter 17.
Spanning Tree becomes much simpler to design, document, and understand.
Troubleshooting becomes much easier.
Figure 15-7 illustrates the Layer 2 triangles created by MLS (Part A) and the Layer 2 Vs created by switching routers (Part B). Although MLS very often uses route-switch modules (RSMs), a logical representation has been used for Part A.
Figure 15-7 Layer 2 Topologies under Routing Switches (MLS) and Switching Routers
NOTE It is important to realize that both routing switches (MLS) and switching routers (8500s)
can be used to create the designs shown in Figure 15-7. This section is merely trying to
point out the default behavior and most common use of these platforms.
When using campus-wide VLANs, it is often possible to achieve some of the benets listed
in this section by manually pruning VLANs from selected trunks. However, it is not
possible to create the simplicity and scalability that are available when using Layer 3
switching. Also, the pruning action can often reduce redundancy in the network.
The multilayer model allows the benets listed in this section to be easily designed into the
network. When using routing switches (MLS) as shown in Part A, this can be accomplished
by pruning selected VLANs from key trunk links (such as links in the core and between
MDF switches). When using switching routers such as the 8500 as shown in Part B, the
benets of having small Spanning Tree domains accrue by default.
must be careful to not create Layer 2 loops outside the LANE backbone. Not only does
this include the examples discussed in the previous bullet, it also includes such practices
as using redundant Ethernet links to extend the ATM backbone to IDF wiring closets.
In general, it is better to use scalable design techniques and Spanning Tree tuning rather
than to disable the Spanning-Tree Protocol altogether. As discussed in the previous section,
designs such as the multilayer model can achieve network stability without having to resort
to disabling Spanning Tree. Also, a carefully planned design can then allow Spanning Tree
to be tuned for better performance.
NOTE Although Spanning Tree will not impact failover performance of the IDF uplink ports when using Layer 2 Vs, it is still enabled by default and may impact end-user devices. Therefore, you may wish to configure PortFast on end-user ports to facilitate start-up protocols such as DHCP and NetWare authentication.
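For example, on a Catalyst 5000 the end-user ports could be configured as follows (the module/port range is an assumption):

```
! Skip the Listening and Learning states on end-station ports so
! that DHCP and NetWare login traffic passes immediately at link-up.
! Never enable PortFast on trunk or uplink ports.
set spantree portfast 3/1-24 enable
```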
Unlike 8500s where Layer 2 Vs are far more common, MLS (and routing switches) allow you to easily configure either Layer 2 triangles or Vs. By default, MLS allows all VLANs to transit the switch. Therefore, assuming that you have removed end-user VLANs from the network core, you will be left with Layer 2 triangles by default (Part A of Figure 15-7). However, by pruning a given VLAN from the link between the MDF/distribution switches, this VLAN can easily be converted into a V (Part B of Figure 15-7). In other words, by simply pruning the VLAN from the triangle's base, it is converted into a V.
From a Spanning Tree perspective, it is important to evaluate the differences that this brings
to your network. If you opt for using triangles, then Spanning Tree will be in full effect. The
Root Bridge placement form of load balancing and features such as UplinkFast will be
important. If you opt for the Layer 2 Vs, you will be left with the same almost Spanning
Tree free situation described several paragraphs earlier in connection with the 8500s.
TIP Be sure to consider the impact and performance of Spanning Tree where you have Layer 2
triangles in your campus network.
NOTE Note that Layer 2 Vs can also be created with routing switch (MLS) platforms by pruning VLANs from selected links (in this case, the base of the triangle: the MDF-to-MDF link).
TIP Make certain that your design minimizes the risk of broadcast storms occurring in the management VLAN.
Therefore, ensuring that the management VLAN itself is loop free can provide an additional
layer of protection. In general, two techniques can be used to create a loop-free
management VLAN:
The use of Catalyst 8500-style switching routers in the MDF/distribution layer
automatically creates loop-free management VLANs on the IDF/access devices by
default. Notice that this also implies that you should not use IRB to merge the
management VLANs back into a single VLAN. Although this can appear to simplify
the management of your network by placing all of the switches in a single VLAN, it
can create management problems in the long term by adding loops into the
management VLAN.
As discussed in the section "Make Layer 2 Cores Loop-Free," you should also keep an eye on VLAN 1. Although you may have carefully used Layer 3 switching to create hierarchy in your network, you can still be left with a campus-wide VLAN in VLAN 1 (especially if you are using MLS Layer 3 switching). Note that this will be true even if you followed the earlier advice (see the section "Prune VLANs from Trunks") of pruning core VLANs from the wiring closet trunks and wiring closet VLANs from the core trunks (recall that VLAN 1 cannot be deleted and cannot be pruned from Ethernet trunk links in current code images).
Because VLAN 1 is given special priority due to the control traffic discussed in the section "Deciding What Number Should be Used for the Management VLAN," a broadcast loop in this VLAN can be devastating to the health of your network. How, then, are you supposed to control this situation? In general, organizations have used one or more of the following techniques:
Probably the simplest and most effective option involves using non-trunk links in the
core. By assigning each of these core links to a single VLAN (do not use VLAN 1
here!), the core will block the transmission of VLAN 1 information.
TIP Consider using non-trunk links in the core. This can be an extremely simple but effective
way to reduce Sprawling VLANs in your network.
Use switching routers such as the Catalyst 8500s that do not forward VLAN 1 by
default.
Once it is available, use the upcoming feature that will allow VLAN 1 to be removed
from trunk links (see the section Deciding What Number Should be Used for the
Management VLAN).
If you are using an ATM core, VLAN 1 can be removed from this portion of the network
(see Chapter 9).
NOTE For the record, heavy broadcast traffic can also be a problem for routers. They are no different from other devices: all broadcasts must be processed to see if they are interesting or not. In fact, this phenomenon can be worse for routers because, by definition, they are connected to multiple subnets and therefore must process the broadcasts from every subnet.
However, with this being said, routers (and Layer 3 switches) are still the best tools for
handling broadcast problems. Although the routers themselves can be susceptible to
broadcast storms, their very use can greatly reduce the risk of Layer 2 loops ever forming.
The multilayer model is designed to maximize this benet by reducing Layer 2 connectivity
to many small triangles and Vs. Furthermore, although a broadcast loop can overload any
directly-connected routers, the problem does not spread to other sections of the network, a
huge improvement over the problems described earlier in this section and in the section "Use Separate Management VLANs."
TIP All networks using groups of contiguous Layer 2 switches or transparent bridges should
specify a primary and a backup Root Bridge.
[Figure: Spanning Tree load balancing via Root Bridge placement, with MDF-A and MDF-B acting as Root Bridges for the odd and even VLANs used by IDF-B]
This causes the odd VLANs to use the left riser link (the right IDF port is Blocking for these
VLANs), whereas the even VLANs use the right link (the left IDF port is Blocking). As
discussed in the following section and Chapter 11, this should be coordinated with any Hot
668 Chapter 15: Campus Design Implementation
Standby Routing Protocol (HSRP) load balancing being performed by your MDF/
distribution layer devices.
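Using the set spantree root macro, the odd/even split might be sketched as follows (the VLAN lists and diameter value are assumptions for a single distribution block):

```
! On MDF-A: primary Root Bridge for the odd VLANs,
! secondary (backup) Root for the even VLANs
set spantree root 1,3,5 dia 4
set spantree root secondary 2,4,6 dia 4

! On MDF-B: the mirror image
set spantree root 2,4,6 dia 4
set spantree root secondary 1,3,5 dia 4
```

Any HSRP active/standby assignments on the MDF devices should then follow the same odd/even split so that Layer 2 and Layer 3 paths agree.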
TIP The Root Bridge placement form of Spanning Tree load balancing is both simple and effective.
TIP In general, distributed Root Bridges can add more complexity to the network than they
are worth.
Centralized Root Bridges are useful in situations where the traffic flows are highly
concentrated (such as in the case of a centralized server farm). Another advantage of this
approach is that it can ease troubleshooting by creating identical (or at least very similar)
logical topologies in all VLANs. Overall, centralized Root Bridges are more common.
Timer Tuning
Your decision to utilize Spanning Tree timer tuning should be based primarily on your
campus architecture. If you have utilized the campus-wide VLAN model, timer tuning is
almost always an exercise in futility and frustration. Because campus-wide VLANs lead to
very large Spanning Tree domains, timer tuning usually results in a network plagued by
instability.
TIP Do not attempt Spanning Tree timer tuning if your network uses the campus-wide
VLAN model.
On the other hand, the Layer 3 barriers created by the multilayer model make timer tuning
a very attractive option for most networks. When performing timer tuning, it is usually best
to use the set spantree root macro discussed in the "Using a Macro: set spantree root"
section of Chapter 6. In general, the values in Table 15-1 have been shown to be a good
compromise between network stability and speed of convergence (for more information on
the details of these timer values, refer to Chapters 6 and 7).
Because timer tuning is not recommended for campus-wide VLANs and should therefore
not be specified on the set spantree root command, these values have been omitted from
Table 15-1. (Although, as discussed in Chapter 7, 802.1D assumes a diameter of 7 hops and
the Hello Time defaults to 2 seconds.) The routing switch (MLS) and switching router
values are based on fairly conservative assumptions about link failures and the possibility
of additional bridging devices being attached to the network (these values are also used and
discussed in the case studies covered in Chapter 17).
Also, if you are willing and able to incur the extra load of Spanning Tree BPDUs, the Hello
Time can be reduced to 1 second to further improve convergence times. However, notice
that this doubles the bandwidth consumed by BPDUs, and, more importantly, the load on
the supervisor CPUs. Therefore, if each device only participates in a small number of
VLANs, Hello tuning can successfully improve Spanning Tree convergence times with
minimal impact on the CPU. Conversely, if your devices participate in a large number of
VLANs, changing the Hello Time can overload your CPUs. When using a large number of
VLANs, only lower the Hello Time for a subset of the VLANs where you need the
improved convergence time as a compromise. If you lower the Hello Time to one second,
consider using the values specified in Table 15-2.
Table 15-2 Spanning Tree Timer Values When Using a Hello Time of 1 Second

Network Design                             Specified Diameter   Specified Hello Time   Resulting Max Age   Resulting Forward Delay
Multilayer and routing switches (MLS)      3 hops               1 sec                  7 secs              5 secs
Multilayer and switching routers (8500s)   2 hops               1 sec                  5 secs              4 secs
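The Table 15-2 values can be applied through the macro, which (as discussed in Chapter 6) derives Max Age and Forward Delay from the specified diameter and Hello Time. A sketch, with a hypothetical VLAN list:

```
! MLS row of Table 15-2: diameter 3, Hello Time 1 second
Console> (enable) set spantree root 10-19 dia 3 hello 1
```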
Finally, be certain that you set the chosen timer values on both the primary and backup Root
Bridges. You can set the values on other bridges/switches, but doing so has no effect (for simplicity,
some organizations simply set the values on every device).
TIP A picture is worth a thousand words... diagram your Layer 2 topologies (including
Spanning Tree).
At a minimum, these diagrams should illustrate the extent of each VLAN, the location of
the Root Bridge, and which switch-to-switch ports are Blocking or Forwarding
(diagramming end-user ports is rarely beneficial). In addition, it might be useful to label the
Forwarding ports as either Designated Ports or Root Ports. See Chapters 6 and 7 for more
information on these ports.
The importance of having Layer 2 diagrams is influenced by, once again, the choice of the
network's design. Layer 2 diagrams are especially important in the case of campus-wide VLANs, where
the combination of many VLANs and Blocking/Forwarding ports can become very complex.
Fortunately, another benefit of the multilayer model is that it reduces the need for diagrams.
First, the Layer 3 hierarchy created by this design makes the traditional Layer 3 maps much
more useful. Second, the simple Layer 2 triangles and Vs created by this design allow two
or three template drawings to be used to document the entire Layer 2 network.
TIP Don't waste your time designing lots of Spanning Tree optimizations (such as UplinkFast and
BackboneFast) into a heavily Layer 3-oriented network; they will have little or no effect.
On the other hand, UplinkFast and BackboneFast can be extremely useful in more Layer 2-
oriented designs such as campus-wide VLANs and the multilayer model with routing
switches (MLS). In either case, UplinkFast should be enabled only on IDF wiring closet
switches while BackboneFast is enabled on every switch in each Spanning Tree domain. It
is important to follow these guidelines. Although both protocols have been carefully
engineered so that incorrect use does not completely disable the network, misconfiguration
causes the feature either to be completely ineffective (as is possible with BackboneFast) or to
invalidate load balancing and Root Bridge placement (as is possible with UplinkFast). See
Chapter 7 for more detailed information on BackboneFast and UplinkFast.
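On CatOS-based Catalysts, enabling these features is a single command each; a sketch following the guidelines above (switch names are illustrative):

```
! On IDF/wiring closet switches only:
IDF-1> (enable) set spantree uplinkfast enable

! On every switch in the Spanning Tree domain:
MDF-A> (enable) set spantree backbonefast enable
```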
Even though Catalysts allow you to enter the set spantree portfast mod_num/port_num
enable command on a trunk link, the command is ignored. Despite this safeguard, it is best to
leave PortFast disabled on trunk links and spare the network's other administrators some
confusion when they see it enabled.
TIP Although PortFast is extremely useful in Ethernet-only networks, you might wish to avoid its
use in networks that employ a LANE core. Because PortFast suppresses TCN BPDUs, it can
interfere with LANEs process of learning about devices/MAC addresses that have been
relocated to a different LANE-attached switch. As a result, nodes that relocate may have to
wait five minutes (by default) for their connectivity to be re-established if PortFast is in use.
With PortFast disabled, LANE receives a TCN (both when the node is initially
disconnected from the original switch and when it is reconnected to the new switch) that
shortens the MAC aging process to the Spanning Tree Forward Delay timer (see Chapter
6). As an alternative, you can manually (and permanently) lower the bridge table aging
period using the set cam agingtime command. Both techniques will cause LANE to
remove the MAC address to NSAP address mapping in the LES more quickly and force it
to relearn the new mapping for a device that has been relocated. See Chapter 7 for more
detailed information on the operation of LANE.
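For example, the bridge table aging period might be lowered as follows (the VLAN number and timer value are illustrative; verify the exact syntax for your CatOS version):

```
! Lower the CAM aging time on VLAN 1 from the 300-second default to 120 seconds
Console> (enable) set cam agingtime 1 120
```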
TIP Watch out for the broken subnet problem. It can create difficult-to-troubleshoot
connectivity problems.
As detailed in Chapter 11, the solution is to use two versions of the Spanning-Tree Protocol. The
Layer 2 Catalysts such as the 5000 and the 6000 only use the IEEE version of the Spanning-Tree
Protocol. However, IOS-based devices such as the routers and Catalyst 8500s can run either the
DEC version of the Spanning-Tree Protocol or Cisco's proprietary VLAN-Bridge Spanning-Tree
Protocol. In both cases, the BPDUs for these two protocols are treated as normal multicast data
by the Layer 2 Catalysts and flooded normally. Conversely, the IOS-based devices swallow the
IEEE BPDUs when they are running a different version of the Spanning-Tree Protocol.
Consequently, the IOS-based devices partition the IEEE protocol into smaller pockets. Within
each pocket, the IEEE Spanning-Tree Protocol ensures that the logical topology is loop free. The
DEC or VLAN-Bridge version of the Spanning-Tree Protocol ensures that the collection of
pockets remains loop free. The result is a network where both routed and non-routed protocols
have full connectivity throughout the network. For more information, see the "Using Different
Spanning-Tree Protocols" section in Chapter 11.
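On an IOS-based device, selecting one of these alternate protocols is a one-line change per bridge group. A sketch (the bridge-group number and interface name are hypothetical):

```
! Run Cisco's VLAN-Bridge Spanning-Tree Protocol for bridge group 1
Router(config)# bridge 1 protocol vlan-bridge

! (or, alternatively, the DEC version)
! Router(config)# bridge 1 protocol dec

! Apply the bridge group to each bridged interface
Router(config)# interface FastEthernet0/0
Router(config-if)# bridge-group 1
```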
Load Balancing
Load balancing can be one of the telltale signs that indicate whether a network has been
carefully planned or if it has grown up like weeds. By allowing redundant links to
effectively double the available bandwidth, load balancing is something that every network
should strive to implement.
This chapter briey mentions the most popular alternatives available for implementing load
balancing. As you go through this section, recognize that none of these accomplish round robin
or per-packet load balancing. Therefore, although these techniques are most often referred to with
the name load balancing, the name load sharing or load distribution might be more appropriate.
However, do not get hung up on trying to achieve an exact 50/50 split when you implement load
balancing over a pair of links. Just remember that any form of load balancing is preferable to the
default operation of most campus protocols where only a single path is ever used.
Finally, consider the intelligence of a load balancing scheme. For example, some
techniques such as EtherChannel use a very simple XOR algorithm on the low-order bits
of the IP or MAC addresses. On the other hand, Layer 3 routing protocols offer very sophisticated
and tunable load balancing and, more importantly, path selection tools.
Spanning Tree
Spanning Tree load balancing is useful within a redundant Layer 2 domain. As discussed
in Chapter 7, there are four techniques available for load balancing under the Spanning-
Tree Protocol:
Root Bridge placement
Port priority (portvlanpri)
Bridge priority
Port cost (portvlancost)
As discussed in Chapter 7 and earlier in this chapter, Root Bridge placement is the simplest
and most effective technique if the network's traffic flows support it. Fortunately, the
multilayer model with routing switches (MLS) automatically generates a topology where the
Root Bridges can be alternated between redundant MDF switches within a distribution block.
TIP When working with the Spanning-Tree Protocol, try to use the Root Bridge placement
form of Spanning Tree load balancing.
Root Bridge placement is not effective in less constrained topologies such as campus-wide
VLANs. In these cases, it is best to use the portvlancost form of load balancing. Although
portvlancost is harder to use than Root Bridge placement, it is useful in almost any
redundant topology. Think of it as the Swiss army knife of Spanning Tree load balancing.
TIP When working with the Spanning-Tree Protocol, use portvlancost load balancing when the
use of Root Bridge placement is not possible.
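As a sketch of portvlancost load balancing on an IDF switch with two uplinks (the port numbers, cost value, and VLAN lists here are hypothetical):

```
! Uplink 1/1: raise the cost for even VLANs so they prefer uplink 1/2
IDF-1> (enable) set spantree portvlancost 1/1 cost 1000 2,4,6,8

! Uplink 1/2: raise the cost for odd VLANs so they prefer uplink 1/1
IDF-1> (enable) set spantree portvlancost 1/2 cost 1000 1,3,5,7
```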
HSRP
In situations where Layer 3 switching is being used, HSRP plays an important role. When using
Layer 3 switching in networks that contain Layer 2 loops in the distribution block, such as with
the multilayer model and routing switches (MLS), Spanning Tree and HSRP load balancing
should be deployed in a coordinated fashion. For example, the network in Figure 15-9 modied
the Spanning Tree parameters to force the odd VLANs to use the left link and the even VLANs
to use the right link. HSRP should be added to this design by making MDF-A the active HSRP
peer for the odd VLANs and MDF-B the active peer for the even VLANs.
TIP Be sure to coordinate HSRP and Spanning Tree load balancing. This is usually required in
networks employing routing switches and the multilayer model.
In cases where the switching router (8500) approach to the multilayer model is in use,
HSRP might be the only option available for load balancing within each distribution block
(there are no loops, so Spanning Tree load balancing cannot be effective). Consequently, two
HSRP groups should be used for each subnet. This configuration was discussed in Chapter 11
and is referred to as Multigroup HSRP (MHSRP). MHSRP can be used to load balance by
alternating the HSRP priority values.
TIP Use MHSRP load balancing for networks using switching router technology.
Figure 15-10 illustrates an example that provides load balancing for one subnet/VLAN on
an IDF switch.
Figure 15-10 MHSRP Load Balancing
[Figure: an IDF switch dual-homed to two MDF routers with real addresses 10.0.1.3 and 10.0.1.4; standby group 1 (virtual address 10.0.1.1) is active on one MDF, and standby group 2 (virtual address 10.0.1.2) is active on the other]
The two MDF switches are assigned real IP addresses of 10.0.1.3 and 10.0.1.4. Rather
than using a single standby group (which results in only one router and one riser link
actively carrying traffic), two standby groups are configured. The first standby group uses
an IP address of 10.0.1.1, and the priority of MDF-A has been increased to make it the active
peer. The second standby group uses 10.0.1.2 and has MDF-B configured as the active peer.
If both MDF switches are active, both riser links and both devices actively carry traffic. If
either MDF device fails, the other MDF takes over with 100 percent of the load.
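Using the addresses in Figure 15-10, the MHSRP configuration might look like the following sketch (the VLAN interface name is hypothetical):

```
! MDF-A (10.0.1.3): active for standby group 1
interface Vlan10
 ip address 10.0.1.3 255.255.255.0
 standby 1 ip 10.0.1.1
 standby 1 priority 110
 standby 1 preempt
 standby 2 ip 10.0.1.2
 standby 2 priority 100

! MDF-B (10.0.1.4): active for standby group 2
interface Vlan10
 ip address 10.0.1.4 255.255.255.0
 standby 1 ip 10.0.1.1
 standby 1 priority 100
 standby 2 ip 10.0.1.2
 standby 2 priority 110
 standby 2 preempt
```

Load balancing is then achieved by pointing half of the end stations at default gateway 10.0.1.1 and the other half at 10.0.1.2.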
IP Routing
Another advantage in using Layer 3 switching is that IP routing protocols support very
intelligent forwarding and path determination mechanisms. Whereas it can take
considerable configuration to enable load balancing over two paths using techniques such
as Spanning Tree load balancing and HSRP, Cisco routers automatically load balance up to
six equal-cost paths (although Catalyst 8500s currently only load balance over two equal-
cost paths because of the microcode memory limitations). Moreover, Layer 3 routing
protocols support extensive path manipulation tools such as distribute lists and route maps.
Given that the multilayer design model focuses on Layer 3 switching in the MDF/
distribution layer closets (and possibly the core), IP routing can be an extremely effective
approach to load balancing across critical areas of the network such as the core (expensive
WAN links are another area).
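For example, with a routing protocol such as EIGRP, the number of equal-cost paths used can be tuned up to the six-path maximum (the process number here is hypothetical):

```
Router(config)# router eigrp 100
Router(config-router)# maximum-paths 6
```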
ATM
One of the benets in using ATM in a campus environment is the sophistication of Private
Network-Network Interface (PNNI) as an ATM routing and signaling protocol. Like IP,
PNNI automatically load balances traffic over multiple paths. However, unlike IP, PNNI
does not perform routing on every unit of information that it receives (cells). Instead, ATM
only routes the initial call setup that is used to build the ATM connection. After the
connection has been established, all remaining cells follow this single path. However, other
calls between the same two ATM switches can use a different set of paths through a
redundant ATM network (therefore, PNNI is said to do per connection load balancing). In
this way, all of the paths within the ATM cloud are automatically utilized.
For more information on ATM, LANE, and PNNI, please consult Chapter 9, "Trunking
with LAN Emulation."
EtherChannel
A final form of load balancing that can be useful for campus networks is EtherChannel.
EtherChannel can be used only between a pair of back-to-back switches connected by
two to eight links (although some platforms allow only limited combinations). It uses an XOR
algorithm on the low-order bits of the MAC or IP addresses to assign frames to individual links.
For more information, see Chapter 8, "Trunking Technologies and Applications."
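A minimal CatOS sketch of forming a two-link channel between back-to-back switches (the module/port numbers are hypothetical):

```
! On both switches, bundle ports 3/1 and 3/2 into a single channel
Console> (enable) set port channel 3/1-2 on

! Verify the channel
Console> (enable) show port channel
```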
TIP The 802.3ad committee of the IEEE is working on a standards-based protocol similar to
Cisco's EtherChannel.
Routing/Layer 3 Switching
As this chapter has already mentioned many times, Layer 3 switching is a key ingredient in
most successful large campus networks. This section elaborates on some issues specic to
Layer 3 switching.
[Figure: IDF switches at the access layer connecting to MDF switches at the distribution layer, which in turn connect to the core]
The Layer 3 barrier created by the routing function embedded in the MDF switches
separates each building from the core. The primary benefits of this technique are:
The modularity allows for cookie-cutter designs. Although the IP addresses (as well
as other Layer 3 protocol addresses) change, each distribution block can be
implemented with almost identical switch and router code.
The network is very easy to understand and troubleshoot. Technicians can apply most
of the same skills used for managing and troubleshooting router and hub networks.
The network is highly scalable. As new buildings or server farms are added to the
campus, they merely become new distribution blocks off the core.
The network is very deterministic. As devices or links fail, the traffic fails over in
clearly defined ways.
Although some degree of modularity can be created with more Layer 2-oriented designs
such as campus-wide VLANs, it is much more difficult to get the separation required for
true modularity. Without a Layer 3 barrier, the Layer 2 protocols tend to
become intertwined and tightly coupled. Consequently, it becomes more difficult to grow
and rearrange the network.
TIP If your design calls for the extensive use of IRB, consider using the Catalyst 6000 Native
IOS Mode detailed in Chapter 18. In general, it will result in a network that is considerably
easier to configure and maintain.
TIP Reducing unnecessary peering can be especially important with Catalyst 8500 routers and
the Catalyst 6000 MSM.
Load Balancing
As discussed in the Spanning Tree sections, the style of load balancing that is needed
depends primarily on the type of Layer 3 switching that is in use. To summarize the earlier
discussion, MLS generally requires that a combination of Spanning Tree and HSRP load
balancing techniques be used within the distribution block. When using switching routers,
MHSRP should be used.
Also, Layer 3 switches automatically load balance across the campus core if equal-cost
paths are available.
ATM
As Layer 3 switching has grown in popularity, it has demonstrated that ATM is not the only
technology capable of great speed. However, ATM does have its place in many campus
networks. This section examines some of the more important issues associated with
completing an ATM-based campus network design.
TIP Although the growth of ATM in campus networks has slowed at the time this book goes to
press, it is important to note that the use of ATM technology in the WAN continues to
expand rapidly.
[Figure: IDF switches attached to the MDF/distribution layer via Ethernet, with an ATM core]
The advantage of this approach is that it uses cost-effective Ethernet technology in the
potentially large number of IDF closets. This design is often deployed using the campus-
wide VLAN model to extend the speed of ATM through the Ethernet links. The downside
is that it creates a large number of Layer 2 loops where redundant MDF-to-IDF links are
used. Unfortunately, these links have been shown to create Spanning Tree loops that can
disable the entire campus network. Furthermore, it is harder to use ATM features such as
QoS when the edges of the network use Ethernet.
The opposing view is that the ATM backbone should extend all the way to the IDF closets.
Under this design, the entire network utilizes ATM except for the links that directly connect
to end-user devices. This approach is illustrated in Figure 15-13.
[Figure: IDF switches attached directly to the ATM core]
The downside of this alternative is a potentially higher cost because it requires more ATM
uplink and switch ports. However, the major benet of this design is that it eliminates the
Layer 2 loops formed by the Ethernet links in the previous approach. Because LANE
inherently creates a loop-free Layer 2 topology, the risk of Spanning Tree problems is
considerably less (in fact, some vendors who promote this design leave Spanning Tree
disabled by default, something many network engineers feel is a risky move).
Having worked with implementations using both designs, I feel that the answer should be
driven by the use of Layer 3 switching (like many other things). If you are using the
multilayer model to create hard Layer 3 barriers in the MDF/distribution layer devices, the
MDF switches can be the attachment point to the ATM core and Ethernet links to the IDF
devices can be safely used. However, when the campus-wide VLAN model is in use,
extending the ATM backbone to the IDFs allows for the most stable and scalable design.
Trying to use the MDF-attachment method with campus-wide VLANs results in Spanning
Tree loops and network stability issues.
TIP The use of Layer 3 switching in your network should drive the design of an ATM core.
Using SSRP
Until standards-based LANE redundancy mechanisms become widely available, Simple
Server Redundancy Protocol (SSRP) will remain an important feature in almost any
LANE-based core using Cisco ATM switches. Although SSRP allows more than one set of
redundant devices, experience has shown that this can lead to scaling problems. See
Chapter 9 for more information on SSRP.
BUS Placement
Always try to place your LANE Broadcast and Unknown Server (BUS) on a Catalyst
LANE module. Because the BUS must handle every broadcast and multicast packet in the
ELAN (at least in current versions of the protocols), the potential trafc volume can be
extremely high. The Catalyst 5000 OC-3 and Catalyst 5000/6000 OC-12 LANE modules
offer approximately 130 kpps and 450 kpps of BUS performance respectively, considerably
more than any other Cisco device currently offered.
One decision faced by designers of large LANE cores involves whether a single BUS or
multiple distributed BUSes should be utilized. The advantage of a single BUS is that every
ELAN has the same logical topology (at least the primary topologies are the same; the
backup SSRP topology is obviously different). The disadvantage is that the single BUS can
more easily become a bottleneck.
Distributed BUSes allow each ELAN to have a different BUS. Although this can offer
significantly higher aggregate BUS throughput, it can make the network harder to manage
and troubleshoot. With the introduction of OC-12 LANE modules and their extremely high
BUS performance, it is generally advisable to use a single BUS and capitalize on the
simplicity of having a single logical topology for every ELAN.
TIP With the high BUS throughput available with modern equipment, centralized BUS designs
are most common today.
MPOA
Multiprotocol Over ATM (MPOA) can be a useful technology for improving Layer 3
performance. MPOA, as discussed in Chapter 10, "Trunking with Multiprotocol over
ATM," allows shortcut virtual circuits to be created and avoids the use of routers for
extended flows. When considering the use of MPOA, keep the following points in mind:
MPOA can only create shortcuts in sections of the network that use ATM. Therefore,
if the MDF devices attach to an ATM core but Ethernet is used to connect from the
MDF to the IDF switches, MPOA is only useful within the core itself. If the core does
not contain Layer 3 hops, MPOA offers no advantage over LANE. In general, MPOA
is most useful when the ATM cloud extends to the IDF/access layer switches.
Because MPOA is mainly designed for networks using ATM on an IDF-to-IDF basis,
you must intentionally build Layer 3 barriers into the network. Without careful
planning, MPOA can lead to flat earth networks and the associated scaling problems
discussed earlier in this chapter and in Chapters 11, 14, and 17.
At press time, significant questions remain about the stability and scalability of MPOA.
TIP MPOA only optimizes unicast traffic (however, related protocols such as MARS can be
used to improve multicast performance).
Hardware Changes
In most Catalyst equipment (such as the Catalyst 5000), both MPOA and LANE use MAC
addresses from the chassis or Supervisor to automatically generate ATM NSAP addresses.
For a detailed discussion of how NSAP addresses are created, refer to Chapter 9. When
designing an ATM network, keep the following address-related points in mind:
Devices with active backplanes such as the Catalyst 5500s use MAC addresses pulled
from the backplane itself. Changing the chassis of one of these devices therefore
changes the automatically-generated NSAP addresses.
Devices with passive backplanes such as the Catalyst 5000 use MAC addresses from
the Supervisor. Therefore, changing a Catalyst 5000 Supervisor module changes the
pool of addresses used for automatically generating NSAP addresses.
In both cases, 16 MAC addresses are assigned to each slot. Therefore, simply moving
a LANE module to a different slot alters the automatically generated NSAP addresses.
Because of these concerns, many organizations prefer to use hard-coded NSAP addresses.
For more information, see the section "Using Hard-Coded Addresses" in Chapter 9.
Campus Migrations
It can be very challenging to manage a campus migration. New devices are brought online as
older equipment is decommissioned or redeployed. However, while the rollout is taking place,
connectivity must be maintained between the two portions of the network. This section makes
a few high-level recommendations.
In general, the most effective solution for dealing with campus migrations is to use the overlay
technique.
As shown in Figure 15-14, the overlay approach treats the two networks as totally separate.
Rather than connecting the new devices to the existing links, a completely separate, out-of-band
set of new links is used. If old and new devices are located in the same wiring closet, both connect
to separate links. Therefore, the new network is said to overlay the existing network.
Figure 15-14 The Overlay Approach to Campus Migrations
[Figure: the new network's core connects the server farm and the buildings; the old network attaches to the new core as if it were another distribution block]
To maintain connectivity between the old and the new network, a pair of redundant routers
is used. This provides a single line where the two networks meet. Issues such as route
redistribution and access lists can be easily handled here. Also notice that this causes the
old network to resemble just another distribution block connected to the core of the new
network (another benet of the modularity created by the multilayer model).
Server Farms
Servers play a critical role in modern networks. Given this importance, they should be
considered early in the design process. This section discusses some common issues
associated with server farm design.
[Figure 15-15: a server farm distribution block and a building distribution block (Building 2) each attached to the campus core]
The servers in Figure 15-15 can be connected by a variety of means. The gure shows the
servers directly connected to the pair of Layer 3 switches that link to the campus core. An
alternative design is to use one or more Layer 2 switches within the server farm. These
Layer 2 devices can then be connected to the Layer 3 switches through Gigabit Ethernet or
Gigabit EtherChannel. Although some servers can connect to only a single switch,
redundant NICs provide a measure of fault-tolerance.
The key to this design is the Layer 3 barrier created by the pair of Layer 3 switches that link
the server farm to the core. Not only does this insulate the server farm from the core, but it
also creates a much more modular design.
Some network designs directly connect the servers to the core as shown in Figure 15-16.
Figure 15-16 Connecting Servers Directly to the Campus Core
[Figure: servers attached directly to an ATM campus core alongside Building 1 and Building 2]
Figure 15-16 illustrates a popular method used for core-attached servers: using an ATM
core. By installing LANE-capable ATM NICs in the servers, the servers can directly join
the ELAN used in the campus core. A similar design could have been built using ISL or
802.1Q NICs in the servers.
Most organizations run into one of two problems when using servers directly connected to
the campus core:
Inefficient flows
Poor performance
The first problem occurs with implementations of the multilayer model where the routing
component contained in the MDF/distribution layer devices can lead to inefficient flows.
For example, consider Figure 15-16. Assume that one of the servers needs to communicate
with an end user in Building 1. When using default gateway technology, the server does not
know which MDF Layer 3 switch to send the packets to. Some form of Layer 3 knowledge
is required as packets leave the server farm. One way to achieve this is to run a routing
protocol on the servers themselves. However, this can limit your choice of routing protocols
throughout the remainder of the network, and many server administrators are reluctant to
configure routing protocols on their servers. A cleaner approach is to simply position the
entire server farm behind a pair of Layer 3 switches, as shown in Figure 15-15.
The second problem occurs with implementations of campus-wide VLANs where the
servers can be made to participate in every VLAN used throughout the campus (for
example, most LANE NICs allow multiple ELANs to be configured). Although this sounds
extremely attractive on paper (it can eliminate most of the need for routers in the campus),
these multi-VLAN NICs often have poor performance and are subject to frequent episodes
of strange behavior (for example, browsing services in a Microsoft-based network).
Moreover, this approach suffers from all of the scalability concerns discussed earlier in this
chapter and in Chapters 14 and 17.
In general, it is best to always place a centralized server farm behind Layer 3 switches. Not
only does this provide intelligent forwarding to the MDF switches located throughout the
rest of the campus, but it also provides a variety of other benets:
This placement encourages fast convergence.
Access lists can be congured on the Layer 3 switches to secure the server farm.
Server-to-server trafc is kept off of the campus core. This can not only improve
performance, but it can also improve security.
It is highly scalable.
Layer 3 switches have excellent multicast support, an important consideration for
campuses making widespread use of multicast technology.
TIP Fault-tolerant NICs allow two (or more) server NICs to share a single Layer 2 and Layer 3
address.
When selecting a fault-tolerant NIC, also consider what sort of load balancing it supports (some
do no load balancing, and others only load balance in one direction). Finally, closely analyze
the technique used by the NICs to inform the rest of the network that a change has occurred.
For example, many NICs perform a gratuitous ARP to force an update in neighboring switches.
In some cases, this update process can be fairly complex and require a compromise of timer
values. For example, when using fault-tolerant Ethernet NICs in conjunction with a LANE
backbone, it is not enough to simply update the Layer 2 CAM tables and Layer 3 ARP
tables. If redundant LANE modules are used to access the server farm, the LANE LE-ARP
tables (containing MAC address to ATM NSAP address mappings) also need to be updated.
When faced with this issue, you might be forced to disable PortFast and intentionally incur
a Spanning Tree delay. The upside of this delay is that it triggers a LANE topology change
message and forces the LE-ARP tables to update.
Obviously, redundant NICs should be carefully planned and thoroughly tested before a real
network outage occurs.
TIP You may need to disable PAgP on server ports using fault-tolerant NICs to support the
binding protocols used by some of these NICs during initialization.
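On CatOS, PAgP can be disabled per port by forcing the channel mode off (the port number is hypothetical):

```
! Disable PAgP negotiation on server port 5/1
Console> (enable) set port channel 5/1 off
```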
NOTE This feature had not received an official name at the time this book goes to press. Contact
your Cisco sales team for additional information.
[Figure 15-17: Host-A (10.0.1.10) behind Cat-A and Host-B (10.0.1.11) behind Cat-B share the broken subnet 10.0.1.0; Router-A (10.0.1.1) attaches below Cat-A and Router-B (10.0.1.2) below Cat-B, joined by subnet 10.0.2.0]
The link between Cat-A and Cat-B has failed, partitioning the 10.0.1.0 subnet into two
halves. However, because neither router is aware of the failure, they are both trying to
forward all trafc destined to this subnet out their upper interface. Therefore, Router-A will
not be able to reach Host-B and Router-B will not be able to communicate with Host-A.
In general, there are two simple and effective ways to fix this problem:
Utilize a mixture of Layer 2 and Layer 3 (such as with routing switches/MLS)
Place only a single Layer 2 switch between routers (as well as between the switching
router forms of Layer 3 switches)
Under the first approach, MLS is used to create a Layer 2 environment that, because it is
redundant, remains contiguous even after the failure of any single link. Figure 15-18
illustrates this approach.
Figure 15-18 Avoiding Discontiguous Subnets with a Routing Switch (MLS)
[Figure 15-18 diagram: Host-A (10.0.1.10) and Host-B (10.0.1.11) connect through Cat-A and Cat-B to MLS-Cat-A (10.0.1.1) and MLS-Cat-B (10.0.1.2); subnet 10.0.1.0 remains contiguous through redundant Layer 2 links even with one link broken, and the MLS devices share subnet 10.0.2.0.]
Figure 15-18 shows a logical representation of MLS devices where the Layer 2 and Layer
3 components are drawn separately in order to highlight the redundant Layer 2
configuration.
NOTE The design in Figure 15-18 could also be implemented using switching router technology
such as the Catalyst 8500s by utilizing bridging/IRB.
The second solution to the discontiguous subnet problem is to always use a single Layer 2
switch between routers, as shown in Figure 15-19.
[Figure 15-19: The routers (10.0.1.1 and 10.0.1.2) connect through the single Layer 2 switch Cat-A on subnet 10.0.1.0.]
Because this eliminates the chain of Layer 2 switches shown in Figure 15-17, it allows
any single link to fail without partitioning the subnet.
VTP
In some situations, the VLAN Trunking Protocol (VTP) can be useful for automatically
distributing the list of VLANs to every switch in the campus. However, it is important to realize
that this can automatically lead to campus-wide VLANs. Moreover, as discussed in Chapter
12, VTP can create significant network outages when it corrupts the global VLAN list.
When using VTP in large networks, consider overriding the default behavior using one of
two techniques:
VTP transparent mode
Multiple VTP domains
First, many large networks essentially disable VTP by using the transparent mode of the
protocol (there is no set vtp disable command). When using VTP transparent mode, you
have absolute control over which VLANs are configured on each switch. This allows you
to prune back VLANs where they are not required and thereby optimize your network.
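As a minimal sketch of this approach (the VLAN number, name, and port range are hypothetical), a wiring-closet switch taken out of VTP's control might be configured as follows:

```
Console> (enable) set vtp mode transparent
Console> (enable) set vlan 10 name Engineering
Console> (enable) set vlan 10 2/1-24
```

Because the switch is transparent, VLAN 10 exists only where you explicitly create it, which is exactly the pruning effect described above.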
Second, when organizations do decide to utilize VTP server and client mode, it is often
beneficial to use a separate VTP domain name for each distribution block. This provides
several benefits:
It breaks the default behavior of spreading every VLAN to every switch (in other
words, campus-wide VLANs).
It constrains VTP problems to a single building.
It allows the VTP protocol to better mirror the multilayer model.
It can reduce Spanning Tree overhead.
Passwords
Because the XDI/CatOS-interface Catalysts (currently this includes 4000s, 5000s, and
some 6000 configurations) automatically allow access by default, be sure to set user and
privileged mode passwords. In addition, be certain to change the default SNMP community
strings (unlike the routers, SNMP is enabled by default on XDI/CatOS-interface Catalysts).
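For example, the passwords and community strings might be changed with commands along these lines (both password commands prompt interactively; the community strings shown are placeholders, not recommendations):

```
Console> (enable) set password
Console> (enable) set enablepass
Console> (enable) set snmp community read-only OurRoString
Console> (enable) set snmp community read-write OurRwString
Console> (enable) set snmp community read-write-all OurRwaString
```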
Port Configurations
When configuring ports, especially important trunk links, hard-code as many parameters as
possible. For example, relying on 10/100 speed and duplex negotiation protocols has been
shown to occasionally fail. In addition, the state (on or off) and type (ISL or 802.1Q) of your
Ethernet trunks should be hard-coded.
TIP One exception to this rule concerns the use of PAgP, the Port Aggregation Protocol used to
negotiate EtherChannel links. If PAgP is hard-coded to the on state, this prevents the
Catalyst from performing some value-added processing that can help in certain situations
such as Spanning Tree failover.
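A hard-coded trunk configuration might therefore look something like the following sketch (the module/port numbers are examples only); note that, per the preceding TIP, the channel mode is left at desirable rather than forced on:

```
Console> (enable) set port speed 1/1 100
Console> (enable) set port duplex 1/1 full
Console> (enable) set trunk 1/1 on isl
Console> (enable) set port channel 1/1-2 desirable
```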
Review Questions
This section includes a variety of questions on the topic of campus design implementation.
By completing these, you can test your mastery of the material included in this chapter as
well as help prepare yourself for the CCIE written and lab tests.
1 This chapter mentioned many advantages to using the multilayer model. List as many
as possible.
2 This chapter also mentioned many disadvantages to using campus-wide VLANs. List
as many as possible.
3 List some of the issues concerning management VLAN design.
4 What are some factors to be considered when determining where to place Root Bridges?
5 List five techniques that are available for campus load balancing.
6 What is the primary difference between using routing switches (MLS) and switching
routers in MDF/distribution layer devices?
7 What are the pros and cons of using ATM?
This chapter covers the following key topics:
Troubleshooting Philosophies: Describes philosophical approaches and practices
to problem solving in a network. Lists probable areas where switched network
problems can occur. Also covers using the OSI network model to organize
troubleshooting thoughts.
Catalyst Troubleshooting Tools: Describes various show commands and other
facilities that provide further insight into network operations.
Logging: Discusses how to use and configure Catalyst features to log significant
events on your equipment.
CHAPTER
16
Troubleshooting
Throughout this book, you have seen suggestions on troubleshooting specic areas relevant
to the chapter topic. This chapter differs in that it focuses on high-level, structured
troubleshooting techniques in the LAN switched environment and describes a number of
resources and tools available to facilitate troubleshooting.
Troubleshooting Philosophies
Troubleshooting philosophies vary depending upon training, knowledge, ability,
suspicions, system history, personal discipline, and how much heat you are getting from
users and management. Many philosophies fall apart when pressure mounts from users
screaming that they need network services now, and when managers apply even more
pressure because they do not understand or appreciate the complexities of network
troubleshooting. If you are susceptible to these pressures, this causes you to become
unstructured in your approach and to depend upon random thoughts and clues. Ultimately,
this increases the time to restore or deploy network services. Characteristics of a good
troubleshooting philosophy, though, include structure, purpose, efficiency, and the
discipline to follow the structure.
Two philosophical approaches are presented here to organize your thoughts. The first
method recognizes problems based upon their probability of occurrence and then
categorizes them into one of three buckets for the differing probabilities. This is the
bucket approach to troubleshooting.
The second approach tackles network problems based upon the OSI model. Each layer
represents a different network structure that you might need to examine to restore your
network. This is the OSI model approach for troubleshooting.
The approaches really tackle problems in a very similar manner, but represent different
methods of remembering the approach. The second method does differ, though, in its
granularity. The bucket approach groups troubles into three buckets. Each bucket contains
problems with similar characteristics and represents areas of probable problems. The
second approach tackles problems through the OSI model. The model helps to think
through symptoms and what high-level sources might cause the problems.
In reality, your troubleshooting technique probably uses a little of both. The bucket method
lumps the OSI model, to some extent, into three areas.
Regardless of which approach you use, you must have one foundational piece to
troubleshoot your network: documentation.
Bucket 1: Cabling
The bucket of cable problems contains issues such as wrong cables, broken cables, and
incorrectly connected cables. Too often, administrators overlook cables as a trouble source.
This is especially true whenever the system was working. This causes troubleshooters to
assume that because it was working, it must still be working. They then investigate other
problem areas, only to return to cables after much frustration.
Common cable mistakes during installation generally include using the wrong cable type.
One, for example, is the use of a crossover cable rather than a straight through cable, or vice
versa. The following list summarizes many of the typical problems:
Crossover rather than straight through, or vice versa
Single-mode rather than multimode
Connecting transmit to transmit
Connecting to the wrong port
Partially functional cables
Cable works in simplex mode, but not full-duplex
Cables too long or too short for the media
Remember that when attaching an MDI (media dependent interface) port to an MDI-X
(media dependent crossover interface) port, you must use a straight through cable. All other
combinations require a crossover cable type. Fortunately, using the wrong cable type keeps
the link status light extinguished on equipment. This provides a clue that the cable needs to
be examined. An extinguished link status light can also result from the correct cable type
but a broken cable. Be aware that an illuminated link status light does not guarantee that the cable
is good either. The most that you can conclude from a status light is that you have the
correct cable type and that both pieces of equipment detect each other. This does not,
however, mean the cable is capable of passing data.
A form of partial cable failure can confuse some network operations like Spanning Tree.
For example, if your cable works well in one direction, but not the other, your Catalyst
might successfully transmit BPDUs, but not receive them. When this happens, the
converged Spanning Tree topology might be incorrect and, therefore, dysfunctional.
NOTE I have a box filled with cables that were healthy enough to illuminate the status light on
equipment, but not good enough to transmit data. Consequently, I wasted much time
investigating other areas only to circle back to cables. I should have stuck with my
troubleshooting plans and checked the cables rather than bypassing them. Make sure that you
have a cable tester handy, one capable of performing extensive tests on the cable, not just
continuity checks.
Cisco introduced a feature in the Catalyst 6000 series called Uni-Directional Link
Detection (UDLD), which checks the status of the cable in both directions, independently.
If enabled, this detects a partial cable failure (in one direction or the other) and alerts you
to the need for corrective action.
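Assuming an image that supports the feature, enabling it globally and on a specific port might look like this (the port number is an example; verify the exact syntax for your software release):

```
Console> (enable) set udld enable
Console> (enable) set udld enable 3/1
```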
Another copper cable problem can arise where you fully expect a link to autonegotiate to
100 Mbps, but the link resolves to 10 Mbps. This can happen when multiple copper cables
exist in the path, but are of different types. For example, one of the cable segments might
be a Category 3 cable rather than a Category 5. Again, you should check your cables with
a cable tester to detect such situations.
Another example of using a wrong cable type is the use of single-mode fiber rather than
multimode, or multimode rather than single-mode. Use the correct fiber mode based upon
the type of equipment you order. There are a couple of exceptions where you can use a
different fiber mode than that present in your equipment, but these are very rare. Plan on
using the correct fiber type. As with any copper installation, look for a status or carrier light
to ensure that you don't have a broken fiber or that you didn't connect the transmit of one
box to the transmit of the other box. And as with copper, a carrier light does not always
ensure that the cable is suitable for data transport. You might have too much attenuation in
your system for the receivers to decode data over the fiber. If using single-mode fiber, you
might have too much light entering the receiver. Make sure that you have at least the
minimal attenuation necessary to avoid saturating the optical receiver. Saturating the
receiver prevents the equipment from properly decoding the data.
Unless there is a clearly obvious reason not to do so, particularly in an existing installation,
check cables. Too often, troubleshooting processes start right into bucket 2 before
eliminating cables as the culprit.
Bucket 2: Configuration
After confirming that cables are intact, you can start to suspect problems in your
configuration. Usually, configuration problems occur during initial installations, upgrades,
or modifications. For example, you might need to move a Catalyst from one location to
another, but it doesn't work at the new location. Problems here can arise from not assigning
ports to the correct VLAN, or forgetting to enable a trunk port. Additional configuration
problems include Layer 3 subjects. Are routers enabled to get from one VLAN to another?
Are routing protocols in place? Are correct Layer 3 addresses assigned to the devices? The
following list summarizes some of the things to look for in this bucket:
Wrong VLAN assignments on a port.
Wrong addresses on a device or port.
Incorrect link type configured. For example, the link might be set as a trunk rather than
an access link, or vice versa.
VTP domain name mismatches.
NOTE You can enable PortFast on all ports except for trunk ports to alleviate the probability of
client/server attachment problems. However, enable this feature with caution, as you can
create temporary Layer 2 loops in certain situations. PortFast assumes the port is not part
of a loop and does not start up by looking for loops.
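For example, PortFast might be enabled on a range of access ports with a command such as the following (the port range is hypothetical):

```
Console> (enable) set spantree portfast 3/1-24 enable
```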
Bucket 3: Other
This bucket contains most other problem areas. The following list highlights typical
problems:
Hardware failures
Software bugs
Unrealistic user expectations
PC application inadequacies
Sometimes, a user attempts to do things with his application program that it was not
designed to do. When the user fails to make the program do what he thinks it should do, he
blames the network. Of course, this is not a valid user complaint, but it is an all-too-common
scenario. Ensure that the user's need is valid before launching into a troubleshooting session.
Unfortunately, you can discover another culprit in this bucket. Equipment designers and
programmers are not perfect. You will encounter the occasional instance where a product does
not live up to expectations due to a manufacturer's design flaw or programming errors. Most
manufacturers do not intentionally deploy flawed designs or code, but it does occasionally
happen. When corporate reputations are volatile and stockholder trust quickly evaporates,
vendors work aggressively to protect their image. Vendors rarely have a chance to intervene in
reputation slams on the Internet because word spreads quickly. It is much more difficult for a
vendor to recover a reputation than to maintain it. Therefore, vendors usually work under this
philosophy and strive to avoid the introduction of bad products into the market.
As administrators, though, we tend to quickly blame the manufacturer whenever we
experience odd network behavior that we cannot resolve. Although easy to do, this does not
reflect the majority of problems in a network. We do this because of the disreputable
companies that polluted the market and occasionally crop up today. Many networks fail to
achieve their objectives due to unscrupulous vendors. As an industry, we now tend to
overreact and assume that all companies operate that way. Do not be too quick to criticize
the manufacturer.
NOTE Yes. Even Cisco has occasional problems. Be sure to check the Cisco bug list on the Web
or send an inquiry to Cisco's Technical Assistance Center (TAC) if you experience unusual
network problems. This might be an unexpected feature of the software/hardware.
[Figure 16-2: PC-1 in VLAN 1 connects through Router 1 to VLAN 2, then through Router 2 to PC-2 in VLAN 3; points 1 through 4 mark the router interfaces along the path.]
In Figure 16-2, PC-1 desires, but fails, to communicate with PC-2. Assume that
it is an IP environment. From one PC or the other, attempt to communicate (maybe with
ping) to the first hop router. For example, you might first initiate a ping from PC-1 to the
Router 1 interface (point 1 in the figure). Do this by pinging the IP address of the ingress
port of Router 1, which belongs to the same subnet as PC-1. Then, try the outbound interface
on Router 1 (point 2 in the figure). Continue through the hops (points 3 and 4) until you
reach PC-2. Probably, somewhere along the path, pings fail. This is your problem area.
Now, you need to determine if there is a routing problem preventing the echo request from
reaching the router, or an echo reply from returning. For example, suppose the ping fails
on the ingress port of Router 2 (point 3). To determine if the problem is at Layer 2, attempt
to ping the router interface from another device in the VLAN. This might be another
workstation or router in the broadcast domain. If the ping fails, you might reverse the
process. Try pinging from the router to other devices in the broadcast domain. Check the
router's interface with the show interface command. On the other hand, you might need to
check the Catalyst port to ensure that the port is active. You might need to check the
following items for correctness:
Is the port enabled?
Is the port a trunk or access link?
show Commands
Throughout this book, each chapter has presented show commands relevant to the chapter
subject material. However, several additional show commands exist in the Catalyst to
further enable you to diagnose your switched network environment.
EARL Status :
NewLearnTest: .
IndexLearnTest: .
DontForwardTest: .
MonitorTest .
DontLearn: .
FlushPacket: .
ConditionalLearn: .
EarlLearnDiscard: .
EarlTrapTest: .
The first highlighted portion shows the results of the power supply tests. Because no F
appears next to the supply entries, they passed the test. Other environmental test results are
shown in this block. The second category tests the Enhanced Address Recognition Logic
(EARL) functionality. The EARL manages the bridge tables. Again, only dots (.) appear
next to each test and therefore represent a pass.
Ler
Port CE-State Conn-State Type Neig Con Est Alm Cut Lem-Ct Lem-Rej-Ct Tl-Min
----- -------- ---------- ---- ---- --------------- ---------- ---------- ------
3/1 isolated connecting A U no 9 9 7 0 0 102
3/2 isolated connecting B U no 9 8 7 0 0 40
Several of these fields merit discussion, as values in some columns can suggest areas to
investigate. Values in the Align-Err and FCS-Err fields indicate that the media cable
deteriorated or that the station NIC no longer operates correctly. These values increment
whenever the received frame has errors in it. The errors are detected by the receiver with
the CRC field on the frame. Align-Err further indicates that the frame had a bad number of
octets in it. This can strongly point to a NIC failure.
The Xmit-Err and Rcv-Err fields indicate that the port buffers overflowed, causing the
Catalyst to discard frames. This happens if the port experiences congestion preventing the
Catalyst from forwarding frames onto the switching BUS, or out the interface onto the media. To
help resolve the first case, where the port cannot transfer the frame over the BUS (Rcv-Err),
increase the port priority to high with the set port priority command. When set to high, the
BUS arbiter grants the port access to the BUS five times more frequently than
normal. This has the effect of emptying the buffer at a faster rate.
TIP Do not set all ports to high priority as this effectively eliminates any advantage to it. Use
this setting on your high volume servers.
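Following the command named above, raising the priority of a busy server port might look like this (one plausible form of the syntax; some releases express this as set port level, so check your documentation):

```
Console> (enable) set port priority 3/1 high
```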
If the Catalyst drops frames because it cannot place frames onto the media, this can indicate
a congestion situation where there is not enough bandwidth on the media to support the
amount of traffic trying to transmit through it. Figure 16-3 illustrates a switched network
where multiple sources need to communicate with the same device.
[Figure 16-3: Several stations attached at 100 Mbps all send traffic toward a single device attached at 10 Mbps.]
In Figure 16-3, the aggregate traffic from the sources exceeds the bandwidth available. The
upper devices connect at 100 Mbps, but attempt to access a device running at 10 Mbps. If
all of the stations transmit at the same time, they quickly overwhelm the 10 Mbps link. This
forces the Catalyst to internally buffer the frames until bandwidth becomes available. Like
any LAN device, however, the Catalyst does not hold onto the frame indefinitely. If it
cannot transmit the frame in a fairly short period of time, the frame is discarded. This can
happen if the Catalyst repeatedly experiences collisions when it attempts to transmit the
frame. Like other LAN devices, the Catalyst attempts to transmit the frame up to 16 times.
After 16 collisions, the Catalyst drops the frame. A Catalyst can also discard a frame if there
is no more buffer space available. To fix this, you might need to increase the port bandwidth,
or create multiple collision or broadcast domains on the egress side of the system.
Three fields indicate bad frame sizes: UnderSize, Runts, and Giants. The first two fields
indicate frames that are less than a legal media frame size, whereas Giants indicates frames
too large per the media specification. UnderSize and Giant frames usually mean that the
frame format and CRC are valid, but the frame size falls outside of the media parameters.
For example, a malfunctioning Ethernet station might create an Ethernet frame less than 64
bytes in length. Although the MAC header and CRC values are valid, they do not meet the
Ethernet frame size requirements. The Catalyst discards any such frame. Runt frames differ
from UnderSize frames in that they are usually a byproduct of a collision on a shared media.
Runts, unlike UnderSize frames, do not carry valid CRC values. If you see the Runt counter
continuously incrementing across periods of time, this can indicate that you either have too
many devices contending for bandwidth in the collision domain or you have a media
problem generating collisions and the runt byproduct. If the problem stems from bandwidth
contention, break the segment into smaller collision domains. If the problem is from media
(such as a 10BASE2 termination problem), fix it!
Four fields describe collision combinations: Single-Coll, Multi-Coll, Excess-Col, and Late-
Coll. Single-Coll counts how many times the Catalyst wanted to transmit a frame, but
experienced one and only one collision. After the collision, the Catalyst attempted to transmit
again, this time successfully. Multi-Coll counts collisions inclusively, from 2 to 15. The
Catalyst attempted multiple times to transmit the frame, but experienced collisions when doing
so. Eventually, it successfully transmitted the frame. Excess-Col counters increment whenever
the Catalyst tries 16 times to transmit. When this counter increments, the Catalyst discards the
frame. Late-Coll stands for late collision. A late collision occurs when the Catalyst detects a
collision outside of the collision time domain described in Chapter 1, "Desktop Technologies."
This means that the collision domain is too large. You either have too many cables or repeaters
extending the end-to-end distance beyond the media timeslot specifications. Shorten the
collision domain with bridges or by removing offending equipment.
Although most of the column headers are fairly self-explanatory, a couple deserve
additional clarification. Example 16-4 shows a partial listing of the show mac output.
The first line of the output shows the frame counters mentioned previously. The second line,
highlighted in this example, counts other events. Dely-Exced indicates how many times
the Catalyst had to discard a frame when it wanted to transmit, but had to defer (wait
to transmit) because the media was busy. The wait time was excessive because a source
transmitted longer than what is expected for the media. This is sometimes referred to as
jabber and is caused by a malfunctioning NIC in a shared media network. Rather than
indefinitely holding the frame, the Catalyst discards the frame. Therefore, this counter
displays the number of frames discarded because of the jabber. This should only occur
when the port is attached to shared media.
MTU-Exced counts how many times the port received a frame that exceeded
the Maximum Transmission Unit (MTU) frame size configured on the interface. The size
is set to the media maximum by default, but you can elect to reduce this value. You can do
this when you have an FDDI source trying to communicate with an Ethernet source and want
to ensure that any frames over the Ethernet MTU are discarded by the switch.
In-Discard reflects the number of times that the Catalyst discards a frame due to bridge
filtering. This occurs when the source and destination reside on the same interface. See
Chapter 3, "Bridging Technologies," for details on filtering.
Bridges (and Catalysts) have a finite amount of memory space for the bridge tables. The
bridge fills the table through the bridge learning process described in Chapter 3. Depending
upon the model of Catalyst you have, the Catalyst can remember up to 16,000 entries. But
if you have a very large system where this memory space gets filled because of many
stations, the Catalyst cannot learn new addresses until older entries are aged from the table
to free space. The Lrn-Discrd counter tracks the number of unlearned addresses, where the
switch would normally learn the source address, but cannot because the bridge table is already
full.
In-Lost and Out-Lost represent the number of frames dropped by the Catalyst due to
insufficient buffer space. In-Lost counts the frames coming into the port from the LAN.
Out-Lost counts the frames going out the port to the LAN.
SPAN
Sometimes you want to examine traffic flowing in and out of a port, or within a VLAN. In
a shared network, you attach a network analyzer to an available port and your analyzer
promiscuously listens to all traffic on the segment. Your analyzer can then decode the
frames and provide you with a detailed analysis of the frame content. In a switched
network, however, this is not nearly as simple as in a shared network. For one thing, a
switch filters a frame from transmitting out a port unless the bridge table believes the
destination is on the port, or unless the bridge needs to flood the frame. This is clearly
inadequate for traffic analysis purposes. Therefore, the normal Catalyst behavior must be
modified to capture traffic on other ports. The Catalyst feature called Switched Port
Analyzer (SPAN) enables you to attach an analyzer on a switch port and capture traffic from
other ports in the switch.
High-performance analysis tools are also available, such as the Network Analysis Module,
which provides enhanced RMON reporting to your network management station. This
module plugs into a slot in your Catalyst and monitors traffic from a SPAN port or from
NetFlow.
Another Cisco monitoring tool, the SwitchProbe, attaches externally to a Catalyst SPAN
port or network segment and gathers RMON statistics that can then be retrieved by your
network management station.
By default, this feature is disabled. You need to explicitly enable SPAN to capture traffic
from other ports. When you enable SPAN, you need to specify what you want to monitor
and where you want to monitor it.
What you can monitor includes:
An individual port
Multiple ports on the local Catalyst
Local traffic for a VLAN
Local traffic for multiple VLANs
Monitored traffic goes to a port on the local Catalyst. Figure 16-4 illustrates that the traffic
from VLAN 100 is monitored and directed to the analyzer attached to Port 4/1.
[Figure 16-4: VLAN 100 spans Cat-A and Cat-B; on Cat-A, the monitored VLAN 100 traffic is directed to SPAN port 4/1, where the analyzer attaches.]
Although the set span 100 4/1 command says to monitor VLAN 100, note that only VLAN
100 traffic local to Cat-A is captured. If stations on Cat-B transmit unicast traffic to each
other, and the frames are not flooded, the analyzer does not see that traffic. The only traffic
that the analyzer can see is traffic flooded within VLAN 100, and any local unicast traffic.
TIP Be careful when monitoring Gigabit Ethernet. The 9-port Gigabit Ethernet module provides
local switching and cannot SPAN the switch backplane. If you have the 3-port Gigabit
Ethernet module, this restriction does not apply; you can monitor all traffic within the Catalyst.
TIP Although you can direct VLAN traffic to a SPAN port, the port sees only the local VLAN
traffic. If a VLAN has a presence in multiple Catalysts, the SPAN port displays the VLAN
traffic found in the local Catalyst where you enable SPAN. Therefore, you get all of the
local traffic. You see VLAN traffic from other Catalysts only if the frame is forwarded or
flooded to your local Catalyst. If a frame stays local in a remote Catalyst, your SPAN port
does not detect this frame.
TIP Be careful with the syntax for this command. It is very similar to the set spantree
command. set span and set spant are the short forms of two different commands.
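A short hypothetical session illustrates the common forms: monitoring a single port, monitoring a VLAN (the set span 100 4/1 example from the text above), and disabling the feature. The source port number is an example only:

```
Console> (enable) set span 3/5 4/1
Console> (enable) set span 100 4/1
Console> (enable) set span disable
```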
Logging
It is a good idea to maintain a log of significant events on your equipment. An automatic
feature in your Catalyst can transmit information that you deem important to a TFTP file
server for you to evaluate at a later time. You might want this information for
troubleshooting reasons or security reasons. You can use the file, for example, to answer
questions such as "What was the last configuration?" or "Did any ports experience unusual
conditions?"
A number of configuration commands modify the logging behavior. By default, logging is
disabled. However, you can enable logging and direct the output to an internal buffer, to the
console, or to a TFTP server. The following commands send events to the server:
set logging server {enable | disable}: This command enables or disables the log-to-server
feature. You must enable it if you plan to automatically record events on the
server.
set logging server ip_addr: Use this command to inform your Catalyst of the IP
address of the TFTP server.
set logging server facility server_facility_parameter: A number of processes can
be monitored and logged to the server. For example, significant VTP, CDP, VMPS,
and security services can be monitored. Reference the Catalyst documentation for a
detailed list.
set logging server severity server_severity_level: Various degrees of severity,
ranging in value from 0 through 7, describe the events. 0 indicates emergency
situations, 6 is informational, and 7 is used for debugging levels. If you set the severity
level to 6, you will have many entries in the logging database because it reports
trivial as well as significant events. If you set the level to 0, you only get
records when something catastrophic happens. An intermediate level is appropriate
for most networks.
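Putting these together, a minimal logging sketch might resemble the following (the server address, facility keyword, and severity value are examples only; consult the Catalyst documentation for the valid facility keywords on your release):

```
Console> (enable) set logging server enable
Console> (enable) set logging server 172.16.1.50
Console> (enable) set logging server facility VTP
Console> (enable) set logging server severity 4
```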
This chapter covers the following key topics:
Real-World Design Issues: This chapter presents an opportunity to apply the skills
learned in earlier chapters in two real-world designs.
Campus-Wide VLANs: Considers the real-world downsides of flat earth designs.
MLS Design: Discusses and analyzes the pros and cons of a campus design that uses
Multilayer Switching (MLS) for Layer 3 switching.
Hardware-Based Routing Design: Analyzes the benefits and unique characteristics
of a campus design based on the Catalyst 8500-style of hardware-based routing.
Configuration Examples: Looks at actual configurations for two different campus
designs.
CHAPTER
17
Case Studies: Implementing
Switches
Previous chapters have focused on building specific skills required to successfully
understand and create scalable campus networks. This chapter steps back from specific
campus skills and technologies to focus on the big picture.
In doing so, this chapter examines the design requirements for a rapidly growing campus
network. To maximize the opportunity for analysis, two separate designs will be created for
this single client requirement. Because of its proven advantages, both designs utilize Layer
3 switching in the distribution layer/Main Distribution Frame (MDF) devices. However, the
first approach uses MLS to retain a distinct Layer 2 component at the distribution layer. The
second design uses Catalyst 8500-style technology to create a hard Layer 3 barrier in the
distribution layer.
Both designs also utilize a wide variety of other switching-related features. This presents a
real-world environment where the pros and cons of different features and approaches can
be discussed. Because, as a network designer, you are certain to face many of these same
decisions, this chapter should serve as a useful template for your own campus designs.
Finally, do not focus on the specific products and models of equipment discussed in this chapter. Although the chapter mentions various products in an effort to be as precise and real-world as possible, the main focus should be on campus design thought processes and methodologies. Although products are guaranteed to change at an ever-faster pace, the hallmarks of a good design rarely change (furthermore, the syntax shown in the configuration examples included in this chapter also rarely changes significantly).
[Figure: Building 1's existing network. Basement: Cisco 4000 router to the Internet and the externally hosted Web site. First floor (Sales): NetWare 4.02 server. Second floor (Engineering & Marketing): OS/2 server, NetWare 3.11 server, and a 2514 router. Third floor (Finance): NetWare 3.12 servers and an NT 3.51 server.]
The basement of Building 1 contains a Cisco 4000 router that links the company to a local Internet service provider. The company currently has a Web site hosted by its ISP but wants to bring this function in-house. The Ethernet1 port of the router connects to a 10Base2 segment linked to two NetWare 3.11 servers and a NetBIOS-based e-mail server. The Ethernet0 port connects to a 10BaseT hub located in the sales department on the first floor. The sales department uses a total of three 48-port hubs in a daisy-chain configuration as well as a single NetWare 4.02 server. In all, 109 connections are in use on the first floor.
For the second floor, a four-port software-based bridge was purchased. Two ports are used to link to the first and third floors, and a third port links to the marketing department's NetWare 3.11 server. The fourth port connects to a 96-port hub that contains the end-user connections for the marketing department. Being more technically savvy, the engineering department has installed its own 2514 router and an 8-port Ethernet switch. The switch connects to two 48-port hubs and an OS/2 server running a CAD package. In total, there are 163 users on the second floor.
The finance department uses a series of 48-port hubs daisy-chained off of the bridge on the second floor. The finance department has 99 end-user connections and three servers: two NetWare 3.12 servers and an NT 3.51 server.
Across the formal gardens from Building 1, Building 2 is in the final stages of construction. This building will be used to house the staff that replaces the regional office currently located in High Point, North Carolina. The engineering group will occupy the first floor, and the finance and marketing groups will use the second floor. The sales department has arranged mahogany offices on the third floor (they convinced senior management that the commanding view of Happy Homes' growing headquarters would improve sales). Four hundred and thirty-one employees are expected to occupy Building 2.
As additional regional offices are closed and relocated to Baltimore, more buildings will be built. When these relocations are combined with Happy's ongoing success in the marketplace, they expect a total of six buildings and 2,600 employees within two years. Although the initial design should only include Buildings 1 and 2, the client has repeatedly stressed the importance of having a design that will easily scale to accommodate all six of the planned buildings.
Should the Flat Earth Model Be Used?
Management is well aware of the drawbacks of the current network and wants to capitalize
on technology as a competitive weapon. They want Happy to not only be known for
building great houses, but also for being a technology leader. They recognize that a high-speed and flexible campus network will play a key role in this. However, bandwidth alone
will not be enough. Given the many outages experienced with the existing network, the new
design must offer redundancy, stability, and high availability.
[Figure: the staff's proposed campus-wide VLAN design, with Layer 2 switches in Buildings 1 and 2 linked by an ATM core.]
As the Happy staff described their design, it was clear that they subscribed to the campus-wide VLANs model discussed in Chapter 14, "Campus Design Models." Some of the key features they mentioned are included in the following list:
Using 30 or 40 VLANs to create lots of communities of interest. This would provide fine-grained control over security and broadcast radiation.
Because every VLAN would be configured on every switch, it would be easy to place users in different buildings and floors in the same subnet. This would allow traffic within the team to avoid the slowness of routing.
It would be easy to add a new user to any VLAN. A vendor had recently demonstrated
a drag-and-drop VLAN assignment tool that had the whole team very excited.
An ATM core could be used to trunk every VLAN between every building. The ATM
core could also provide multiservice voice and video capabilities in the future.
Servers could be attached to multiple VLANs/ELANs with LANE and ISL NICs.
Again, this would provide a direct Layer 2 path from every user to every server and
minimize the use of routers.
VMPS could be used to dynamically assign VLANs to the rapidly growing number of laptop computer users. This would allow VLANs to follow the users as they moved between various offices and conference rooms. Their Cisco Sales Representative had also demonstrated a product called User Registration Tool (URT). This would allow VLAN assignments to be made based on NT Domain Controller authentication. The flexibility of this feature excited the Happy Homes network staff.
The small amount of remaining inter-VLAN trafc could be handled by a pair of 7507
routers. These routers would use one HSRP group per VLAN to provide fault-tolerant
routing for each VLAN.
The design crew had run into these sorts of expectations in the past. In fact, they had
recommended almost this exact design to several clients one or two years earlier. At that
time, the design team felt that avoid-the-router designs were necessary because software-
based routers could not keep pace with the dramatic growth in campus bandwidth demands.
However, the results of these campus-wide VLANs created many unforeseen problems,
including the following:
Spanning Tree loops and instability were common. These frequent outages were often campus-wide and difficult to troubleshoot.
Even when fast-converging protocols such as EIGRP, HSRP, and SSRP provided 5- to 10-second failover performance, Spanning Tree still created 30- to 50-second outages.
Drag-and-drop VLAN assignment schemes were never as easy to use as everyone expected. Not only were some of the management platforms buggy, but the VLANs-everywhere approach made it almost impossible to troubleshoot network problems.
Instead of jumping in to solve the problem, network staff found themselves spending
lots of time simply trying to comprehend the constantly changing VLAN layout.
The performance of ISL and ATM NICs was disappointing and also led to
unexplainable server behavior.
It was becoming harder and harder to keep traffic within a single VLAN, the very premise of flat earth networks.
On the plus side, the design team did acknowledge that in certain situations, the advantages
of the campus-wide VLAN model might outweigh its downsides. For example, the design
team had recently worked on a large design for a university. The school wanted to create
separate VLANs for students, professors, and administrative staff. Furthermore, they
wanted these communities to be separate across each of the university's departments (for example, the biology department and physics department would use six VLANs: two for
administrative staff, two for students, and two for professors). Because the school assigned
a laptop to every student and professor, it was impossible to make static assignments to
these VLANs. Campus-wide VLANs and URT/VMPS allowed students and university
personnel to simply plug into any available outlet and receive the same connectivity
throughout the entire campus. However, the inter-VLAN routing could still be centralized, allowing for much simpler access list configuration.
The design team mentioned that they also had some large hospital and government clients
using similar designs. However, because the design team did not see the need for this sort
of dynamic VLAN assignment and centralized access lists in the Happy Homes network,
they recommended against this approach.
The discussion continued throughout the day. During this time, the design team brought up
other issues discussed in Chapter 7, "Advanced Spanning Tree"; Chapter 11, "Layer 3 Switching"; Chapter 14, "Campus Design Models"; and Chapter 15, "Campus Design Implementation." In the end, Happy Homes decided that the risks of the campus-wide
VLAN approach were too great. They agreed that Layer 3 switching eliminated virtually
all of the downsides they associated with traditional routers. Therefore, rather than striving to avoid Layer 3 routing, they decided to use it as a means to achieve stability, scalability,
simplicity, and ease of management.
As the design team left at the end of the day, they agreed to return in several weeks with two separate designs for Happy Homes' consideration. Although both designs would utilize
Layer 3 switching, one would blend Layer 2 and Layer 3 processing and the other would
maximize the Layer 3 component. The results of their design efforts are presented in the
following two sections.
Design 1: Using MLS to Blend Layer 2 and Layer 3 Processing
[Figure: Design 1 topology. In Building 1, IDF Catalyst 5509s (Cat-B1-1A and Cat-B1-2A) connect over Ethernet to a pair of MDF Catalyst 5500s (Cat-B1-0A and Cat-B1-0B). LANE links join these to the Building 2 MDFs (Cat-B2-0A and Cat-B2-0B), which serve the IDF switches Cat-B2-1A and Cat-B2-2A (5509s) and Cat-B2-3A (a 2820).]
Design Discussion
This section introduces some of the design choices that were made for the first design. However, before diving into the specifics, it is worth pausing to look at the big picture of Design 1. As discussed earlier, both designs use Layer 3 switching in the MDF/distribution layer devices. This isolates each building behind a Layer 3 barrier to provide scalability and stability. By placing each building behind the safety of an intelligent Layer 3 router, it is much more difficult for problems to spread throughout the entire campus. Also, by providing a natural hierarchy, routers (Layer 3 switches) simplify the troubleshooting and maintenance of the network.
However, notice that Layer 2-oriented Catalysts, such as the Catalyst 5000s used in this design, do not automatically provide this Layer 3 barrier. In other words, simply plugging in a bunch of Catalyst 5000s or non-Native IOS Mode Catalyst 6000s (see Chapter 18, "Layer 3 Switching and the Catalyst 6000/6500s," for more information) adds every VLAN to every switch (recall that VTP defaults to server mode). Only by manipulating VTP and carefully pruning selected VLANs from certain links can Layer 3 hierarchy be achieved when using technologies with a strong Layer 2 component (such as RSMs, MLS, and Catalyst 5000s and 6000s without any Layer 3 hardware/software).
For example, in this case the traffic from the end-user VLANs 11-14 and 21-24 could be forced through a separate VLAN in the core (VLAN 250) to create a true Layer 3 barrier. If left to the defaults, where all of the devices are VTP servers in the same domain and therefore contain the full list of VLANs, routing might still be required between VLANs, but a Layer 3 barrier of scalability is not created. For more information on this point, see Chapter 14 and the section "MLS versus 8500s" in Chapter 11.
NOTE It is extremely important to recognize that most of the devices in Cisco's product line can be used to build either Layer 2 or Layer 3 designs. This chapter focuses on each platform's relative strengths and default behavior. For example, as Chapter 11 pointed out, Catalyst 8500s can be used to build either Layer 2 or Layer 3 networks. However, by default, the 8500s function as switching routers, where every interface is a uniquely routed subnet/VLAN. Although you can use 8500s in Layer 2 designs, this generally involves the use of IRB, something that can easily become difficult to manage as the network grows.
Similarly, MLS can easily be used to build all of the Layer 3 topologies discussed in this
chapter. However, many people are misled into believing that they automatically have Layer
3 hierarchy simply because they paid for some Layer 3 switching cards. As stated in the
preceding text, this is not the case. Therefore, although MLS is suitable for almost all Layer
3 campus topologies, it does not maximize the scalability benets of Layer 3 switching by
default (you need to intervene to control VTP and implement selective VLAN pruning).
Finally, it is worth noting that the MSFC Native IOS Mode, discussed in Chapter 18, is equally
adept at both designs. Consider it the multipurpose tool of Layer 3 campus switching.
Although both Design 1 and Design 2 create a Layer 3 barrier, for the reasons mentioned
in previous paragraphs, the way in which the Layer 3 switching is implemented constitutes
the primary difference between the two designs. In the case of Design 1, the Layer 3 barrier is created at the point where traffic enters and leaves the building. The result: traffic can continue to maintain a Layer 2 path within each building. In effect, Layer 3 switching has been implemented in such a way that Layer 2 triangles have been maintained within each building (the IDF switch represents one corner of the triangle with the other two corners being the MDF switches). By breaking the Layer 2 processing into clearly defined and well-contained regions, this approach can provide a very scalable, high-performance, and cost-effective solution for campus networks.
By contrast, later sections of the chapter explore an alternate approach to Layer 3 switching
used in Design 2. This design uses 8500-style hardware-based routing to implement routing
both between and within the buildings. Although, as discussed in Chapter 11, Catalyst 8500s can be configured to provide a mixture of Layer 2 and Layer 3 switching, these devices are most comfortable as a pure Layer 3 device (this is from a configuration and
maintenance standpoint, not from the standpoint of the data forwarding rate). This
effectively chops off the bottom of the Layer 2 triangles in Design 1 to create Layer 2 Vs.
NOTE Note that MLS can also be used to create Layer 2 Vs by simply pruning the MDF-to-MDF
link of the IDF VLANs. Although this is a popular design choice successfully used by many
organizations, this chapter does not utilize it in an attempt to maximize the differences
between Design 1 and Design 2.
Although the difference between these two designs might seem trivial, it can be dramatic from a network implementation standpoint. By looking at specific configuration requirements and commands used by these two approaches, this chapter explores in detail the many implications of these two approaches to campus design.
Hardware Selection
Because of their high port densities and proven flexibility, Catalyst 5500s were chosen for the bulk of the devices used in Design 1. The horizontal wiring from end stations connects to an IDF/access layer switch located on each floor. Except for the third floor of Building 2, Catalyst 5509s have been selected as the IDF switches. Because the mahogany sales department offices on the third floor will take up considerably more space than other offices within Happy Homes, this will dramatically reduce the number of end stations located here. As a result, a Catalyst 2820 will be deployed on the third floor of Building 2.
The IDF switches will then connect via redundant links to a pair of MDF/distribution layer switches located in the basement of each building. Because they provide both ATM and Ethernet switching capabilities, Catalyst 5500s will be used in the MDFs. Route Switch Modules (RSMs) and MLS will also play a key role here.
The design also calls for a small server farm located in the basement of Building 1. This facility is designed to handle Happy Homes' server farm needs until construction can be completed on a separate data center building. The server farm will use a Catalyst 2948G
switch to provide 10/100 Ethernet connectivity to the servers and Gigabit Ethernet uplinks
to the Cat-B1-0A and Cat-B1-0B switches.
VLAN Design
The design utilizes five VLANs in each building plus an additional VLAN for the backbone. The first VLAN in each building is reserved for the management VLAN and only contains Catalyst SC0 interfaces (or ME1 interfaces on some models). The other four VLANs are used for end users: Sales, Marketing, Engineering, and Finance. Table 17-1 presents the VLAN names and numbers recommended by the design.
In other words, the first digit (or two digits in the case of the Backbone) of the VLAN number specifies the building number, and the last digit specifies the VLAN within the building.
The backbone VLAN, VLAN 250, corresponds to an ELAN named Backbone. Finally, notice that although the same five user communities exist in both buildings, separate broadcast domains are maintained because of the Layer 3 barrier created by MLS and the
RSMs in the distribution layer.
Also note that this approach implements the recommendation made in Chapters 14 and 15
to separate end-user and management traffic. This is done to isolate the Catalyst CPU from the broadcast traffic that might be present in the end-user VLANs. By doing so, the stability
of the network can be improved (for example, the CPU is not deprived of cycles for such
important tasks as network management and the Spanning-Tree Protocol).
IP Addressing
Each VLAN utilizes a single IP subnet. Happy Homes will use network 10.0.0.0 with
Network Address Translation (NAT) to reach the Internet. The design document calls for
the following IP address scheme:
10.Building.VLAN.Node
The subnet mask will be /24 (or 255.255.255.0) for all links. For example, the thirtieth
address on the Sales VLAN in Building 1 would be 10.1.11.30. Because HSRP will be in
use, three node addresses are reserved for routers on each subnet. The .1 node address is
reserved for the shared HSRP address, whereas .2 and .3 will be used for the real addresses
associated with each router (.1 will be the default gateway address used by the end users).
This scheme results in the IP subnets presented in Table 17-2.
NOTE The server farm is listed with a building of N/A because it has its own addressing space that falls outside the 10.Building.VLAN.Node convention. This is also true because it will originally be located in the basement of Building 1 and later be relocated to a separate building.
Happy Homes would like to start using DHCP in the new network. The first 20 addresses on each segment will be reserved for devices that do not (or should not) utilize DHCP, such as printers, servers, and router addresses. The remaining addresses in each subnet will be divided between a pair of DHCP servers for redundancy. For example, the Marketing subnet in Building 1 will have two DHCP scopes: the first DHCP server will be configured with 10.1.12.21-10.1.12.137, and the second server will receive 10.1.12.138-10.1.12.254. Therefore, if the first DHCP server fails, the second server will have its own block of unique addresses for every subnet.
NOTE DHCP scopes are typically split in this fashion because the DHCP protocol currently does
not specify a mechanism for server-to-server communication. For example, if the scopes
did overlap and one of the servers failed, the second server would have no way of knowing
what new leases were issued while it was down. Therefore, it might try to issue the same IP
address again and create a duplicate IP address problem. Future enhancements to the DHCP
standards (as well as proprietary DHCP implementations) can be used to avoid this
problem. See Chapter 11 for more information on using DHCP.
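Because the DHCP servers sit in the central server farm, each routed VLAN interface must relay DHCP broadcasts to them. A minimal sketch of how this might look on an RSM interface for the Building 1 Marketing subnet (the server addresses 10.100.100.31 and 10.100.100.32 are assumptions for illustration, not taken from the design document):

```
! Hypothetical RSM interface for the Building 1 Marketing VLAN.
! The two ip helper-address entries forward DHCP broadcasts to
! both DHCP servers so that either scope can answer a request.
interface Vlan12
 ip address 10.1.12.2 255.255.255.0
 ip helper-address 10.100.100.31
 ip helper-address 10.100.100.32
```

With both helper addresses configured, a client request reaches both servers, and whichever server answers first leases an address from its own non-overlapping scope.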
IPX Addressing
Although Happy Homes expects most new applications to be IP based, it currently makes
extensive use of Novell servers and the IPX protocol. For consistency, the design
recommends that the IPX network numbers should be based on the IP subnet values. IPX
network numbers are 32 bits in length, the same as a full IP address. Therefore, IP subnets
can be converted from the usual dotted quad notation to an eight-character hex value
suitable for use as an IPX network number. For example, the Sales VLAN in Building 1
uses IP subnet 10.1.11.0. By converting each of these four decimal values into their hex
equivalents, the corresponding IPX network number would be 0x0A010B00.
TIP For IPX internal network numbers on NetWare servers, the full IP address assigned to the server's NIC can be converted to hex.
Table 17-3 presents the IPX addresses along with the corresponding IP subnet values.
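The conversion described above can be applied directly on the RSM interfaces. A sketch for the Building 1 Sales VLAN follows (the interface numbering and the .2 router address follow the conventions stated earlier; this is illustrative, not the book's exact configuration):

```
! Building 1 Sales VLAN: IP subnet 10.1.11.0/24 maps to IPX
! network 0x0A010B00 (0A=10, 01=1, 0B=11, 00=0 in hex).
interface Vlan11
 ip address 10.1.11.2 255.255.255.0
 ipx network A010B00
```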
VTP
To maximize the Layer 2 orientation of this design, the proposal calls for the use of VTP server mode. However, to avoid some of the scalability issues of VTP, each building will use a unique VTP domain. Two mechanisms will be used to partition the VTP traffic:
The removal of VLAN 1 from the backbone
Separate VTP domain names
Because the backbone utilizes LANE as a trunking technology, VLAN 1 can be removed from the core of the network by simply not creating a default ELAN that maps to VLAN 1 (note that VLAN 1 cannot be removed from Ethernet trunks). Because VTP traffic must be carried in VLAN 1, this action prevents VTP information from propagating between buildings. However, it is not advisable to rely only on this technique: if someone accidentally enabled VLAN 1 on the backbone, it could seriously corrupt the VTP information, as discussed in the "VLAN Table Deletion" section of Chapter 12, "VLAN Trunking Protocol."
To prevent this sort of VTP database corruption between buildings, separate VTP domains
should be employed (however, note that using anything other than VTP transparent mode
still allows VLAN corruption to occur within a single building). Because Catalysts only
exchange VTP information if their VTP domain names match, this creates an effective
barrier for VTP. Design 1 calls for Building 1 to use the domain Happy-B1, whereas
Building 2 uses Happy-B2.
TIP By creating a VTP barrier, the use of unique VTP domain names in each building also modifies the Catalyst behavior to create a Layer 3 barrier at the edge of every building. Keep this technique in mind when you create your own campus designs.
Trunks
To enhance the stability and scalability of the network, Design 1 calls for several
optimizations on trunk links. First, it recommends that manual configuration be used to override all speed, duplex, and trunk state negotiation protocols. Relying on
autonegotiation of 10/100 Ethernet speed and duplex can lead to many frustrating hours of
troubleshooting and network downtime. To avoid these issues, important trunk and server
links should be hard-coded. End-station ports generally continue to use speed and duplex
autonegotiation protocols to maximize freedom of movement in PC hardware deployment.
Similarly, the trunk links have hard-coded trunk state information. By not relying on DISL
and DTP negotiation, network stability can be improved.
Second, the design recommends that the trunk links be pruned of unnecessary VLANs. Because this can constrict unnecessary broadcast flooding, it can also be an important optimization in Layer 2-oriented networks. For example, broadcasts and multicasts for VLANs 22-24 are not flooded to Cat-B2-3A because it only participates in VLANs 20 and 21 (the management and sales departments' VLANs). The need for pruning becomes even greater in very flat networks without the Layer 3 barriers of scalability that automatically reduce broadcast and multicast flooding.
Load Balancing
Because of the Layer 2 orientation of Design 1, Spanning Tree load balancing must be employed. As discussed in Chapter 7, the Root Bridge placement form of Spanning Tree load balancing is both effective and simple to configure and maintain, provided that your topology supports it. One of the advantages of having the Layer 2 triangles employed by this design is that it easily facilitates this form of load balancing. For example, by making Cat-B1-0A the Root Bridge for VLAN 11, traffic in the B1_Sales VLAN automatically uses the left-hand riser link. Design 1 calls for the A MDF devices (Cat-B1-0A and Cat-B2-0A) to act as the Root Bridge for the traffic for the odd-numbered VLANs, whereas the B devices (Cat-B1-0B and Cat-B2-0B) handle the even-numbered VLANs.
To create a cohesive load balancing scheme, the Spanning Tree Root Bridge placement
should be coordinated with HSRP. This can be done by using the HSRP priority command
to alternate the active HSRP peer for odd and even VLANs.
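This coordination might be sketched as follows for one odd-numbered VLAN on an A MDF device (the priority value of 110 and the group numbering are assumptions, not taken from the design document):

```
! On Cat-B1-0A, the Root Bridge for the odd VLANs: raise the
! HSRP priority above the default of 100 so this router is also
! the active HSRP peer for VLAN 11. The B device keeps the
! default priority here and is raised on the even VLANs instead.
interface Vlan11
 ip address 10.1.11.2 255.255.255.0
 standby 11 ip 10.1.11.1
 standby 11 priority 110
 standby 11 preempt
```

Aligning the active HSRP peer with the Root Bridge keeps both the Layer 2 forwarding path and the default gateway on the same riser link for each VLAN.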
Spanning Tree
In addition to Root Bridge placement, several other Spanning Tree parameters should be tuned in Design 1. Because the Layer 3 barrier in Design 1 limits Layer 2 connectivity to small triangles, the largest number of bridges that can exist between two end stations is three hops. For example, if the link between Cat-B1-1A and Cat-B1-0B failed, traffic flowing between an end station connected to Cat-B1-1A and the RSM in Cat-B1-0B would have to cross three Layer 2 switches (Cat-B1-1A, Cat-B1-0A, and Cat-B1-0B). This is illustrated in Figure 17-4 (note that the Catalyst backplane is being counted as a link here).
Figure 17-4 Path from an End User to the RSM in Cat-B1-0B after a Link Failure
[Figure: the end user connects to Cat-B1-1A; link 1 runs from Cat-B1-1A to Cat-B1-0A, link 2 from Cat-B1-0A to Cat-B1-0B, and link 3 (the Catalyst backplane) reaches the RSM in Cat-B1-0B.]
Therefore, the Spanning Tree Max Age and Forward Delay parameters can be safely reduced to 12 and 9 seconds, respectively (assuming the default Hello Time of 2 seconds). The safest and simplest way to accomplish this is to use the set spantree root macro to automatically modify the appropriate Spanning Tree parameters. As a result, convergence time can be reduced from a default of 30-50 seconds to 18-30 seconds.
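A hedged sketch of how the macro might be applied on an MDF switch (the VLAN lists and the diameter value are assumptions based on the design described above):

```
! On Cat-B1-0A: become Root Bridge for the odd-numbered VLANs.
! Specifying a network diameter of 3 causes the macro to compute
! and apply the reduced Max Age and Forward Delay timers.
set spantree root 11,13 dia 3 hello 2
! Also act as the backup (secondary) root for the even VLANs.
set spantree root secondary 10,12,14 dia 3 hello 2
```

The mirror-image commands would be entered on Cat-B1-0B, making it primary for the even VLANs and secondary for the odd ones.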
To further speed Spanning Tree convergence, UplinkFast, BackboneFast, and PortFast can be implemented. UplinkFast is only configured on the IDF switches and can reduce failover of uplinks to less than 3 seconds. BackboneFast, if in use, must be enabled on every switch in a Layer 2 domain and can reduce convergence time of indirect failures to 18 seconds (given the Forward Delay of 9 seconds specified in the previous paragraph). Although PortFast is not helpful in the failure of trunk links, it can be a useful enhancement to allow end stations more immediate access to the network and reduce the impact of Spanning Tree Topology Change Notifications (see Chapter 6, "Understanding Spanning Tree," and Chapter 7 for more information on TCNs).
Configurations
This section presents sample configurations used for Design 1. Rather than include all of the configurations, you see an example of each type of device. First, you see an IDF/access layer switch. Next, you see coverage of the various components of an MDF/distribution layer switch: the Supervisor, the RSM module, and the LANE module. This section concludes with discussion of a configuration for one of the ATM switches in the core.
NOTE Cisco is working on a feature that will only show non-default configuration commands. This should be available in the future.
Early releases of code also required the set prompt command to include the name in the display prompt. However, starting in 4.X Catalyst images, this step is done automatically.
Next, create the VTP domain and add the appropriate VLANs as in Example 17-2.
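A sketch of the commands Example 17-2 would contain on a Building 2 switch (the VLAN names are assumptions following the B#_Department convention implied by the B1_Sales name used elsewhere in this chapter):

```
! Set the VTP domain for Building 2 and confirm server mode.
set vtp domain Happy-B2
set vtp mode server
! Add the management and end-user VLANs for Building 2.
set vlan 20 name B2_Management
set vlan 21 name B2_Sales
set vlan 22 name B2_Marketing
set vlan 23 name B2_Engineering
set vlan 24 name B2_Finance
```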
Because Design 1 uses VTP server mode, the domain name must be set before the VLANs
can be added. Although VTP defaults to server mode, the second command ensures that the
default setting has not been changed.
Next, assign an IP address to the SC0 logical interface as in Example 17-3.
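Along the lines of the following sketch (the host address .11 is an assumption; the VLAN and gateway follow the addressing scheme described earlier):

```
! Place SC0 in the Building 2 management VLAN (VLAN 20).
set interface sc0 20 10.2.20.11 255.255.255.0
! Point the Catalyst at the HSRP default gateway.
set ip route default 10.2.20.1
```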
Notice that SC0 is assigned to VLAN 20, the management VLAN for Building 2. Next, the
set ip route command is used to provide a single default gateway for the Catalyst. 10.2.20.1
uses HSRP on the routers to provide redundancy (see the RSM section later).
Example 17-4 shows how to configure the Spanning-Tree Protocol for the IDF switch.
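A sketch of the commands Example 17-4 would contain (the end-user port range 3/1-24 is an assumption about the module layout; the portfast command is what produces the warning that follows):

```
! Enable PortFast on the end-user ports only, not the trunks.
set spantree portfast 3/1-24 enable
! Enable BackboneFast (must be done on every Catalyst in the domain).
set spantree backbonefast enable
! Enable UplinkFast on this leaf-node IDF switch only.
set spantree uplinkfast enable
```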
Warning: Spantree port fast start should only be enabled on ports connected
to a single host. Connecting hubs, concentrators, switches, bridges, etc. to
a fast start port can cause temporary Spanning Tree loops. Use with caution.
The first command (set spantree portfast) enables PortFast on all of the end-user ports. Notice that trunk links are not included (you can set PortFast on trunk ports and it will be ignored, but it is best to avoid this because it can lead to administrative confusion). Next, BackboneFast is enabled (set spantree backbonefast enable) to improve STP convergence time associated with an indirect failure. As discussed in Chapter 7, this command must be enabled on every Catalyst in a Layer 2 domain. The last command (set spantree uplinkfast enable) enables UplinkFast. Unlike BackboneFast, UplinkFast should only be enabled on leaf-node IDF switches. You can also see that enabling UplinkFast automatically modifies several Spanning Tree parameters to reinforce this leaf-node behavior. First, it increases the Bridge Priority to 49,152 so that the current bridge does not become the Root Bridge (unless there are no other bridges available). Second, the Path Cost is increased to greater than 3000 to encourage downstream bridges to use some other path to the Root Bridge (however, if no path is available, this bridge handles the traffic normally).
Next, configure the trunk links as in Example 17-5.
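A partial sketch of what Example 17-5 would contain (the port names and the exact VLAN range retained are assumptions; the discussion that follows mentions four set port name commands, of which only the two uplinks are shown here):

```
! Name the uplink ports: useful information when troubleshooting.
set port name 1/1 Uplink-to-Cat-B2-0A
set port name 2/1 Uplink-to-Cat-B2-0B
! Hard-code ISL trunking rather than relying on DISL/DTP.
set trunk 1/1 on isl
set trunk 2/1 on isl
! Prune all VLANs except those used on this IDF switch (20-24).
clear trunk 1/1 2-19,25-1005
clear trunk 2/1 2-19,25-1005
```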
The first four commands assign a name to the trunk ports, useful information when trying to troubleshoot and maintain the network. Next, the 1/1 and 2/1 ports are forced into ISL trunking mode with the set trunk command. If you know that a port is going to be a trunk, it is best to hard-code the trunking state rather than rely on the auto and negotiate settings (these mechanisms have been known to fail and also require that the VTP domain names match). Finally, the clear trunk command is used to remove unnecessary VLANs from the 1/1 and 2/1 links. This sort of pruning can significantly improve the scalability of your network.
The code in Example 17-6 sets up passwords in the form of SNMP community strings and
login passwords.
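A sketch of what Example 17-6 would contain (the community strings shown are placeholders, and the password commands are noted as comments because they prompt interactively):

```
! Replace the three well-known default SNMP community strings.
set snmp community read-only      Str1ngRO
set snmp community read-write     Str1ngRW
set snmp community read-write-all Str1ngRWA
! set password and set enablepass then prompt interactively for
! the console/Telnet login and privileged passwords.
```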
Because SNMP is enabled by default with widely known community strings (public, private, and secret), you should always modify the SNMP community strings. Do not forget to modify all three. (Most devices only use two community strings, one for reading and one for writing. Catalysts have a third community string that also allows the community strings themselves to be modified.) Finally, because community strings are not encrypted (either in the configuration or as they travel through the network), it is best to make them different than the console/Telnet login passwords.
The bottom section of the Example 17-6 sets both the user and privileged passwords.
Unlike Cisco routers that do not allow any remote access until passwords have been
congured, Catalysts allow full access by default. Therefore, always remember to change
the passwords.
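The body of Example 17-6 is not reproduced in this extract; the commands it describes follow the same pattern used later on the MDF switch in Example 17-16:

```
set snmp community read-only lesspublic
set snmp community read-write moreprivate
set snmp community read-write-all mostprivate
set password
set enablepass
```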
Next, you need to configure a variety of management commands as in Example 17-7.
Although none of the commands in Example 17-7 are essential for Catalyst operation, they
can all be useful when maintaining a network over the long term.
Example 17-8 creates an IP permit list to limit Telnet access to the device.
Because Design 1 calls for Supervisor IIIs with NetFlow Feature Cards (NFFCs), useful
IDF features such as IGMP Snooping (to reduce multicast flooding) and Protocol Filtering
(to reduce broadcast flooding) can be enabled as in Example 17-9.
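The bodies of Examples 17-8 and 17-9 are not reproduced in this extract; the equivalent commands, matching those shown later in Examples 17-16 and 17-26, are approximately:

```
set ip permit enable
set ip permit 10.100.100.0 255.255.255.0
set igmp enable
set protocolfilter enable
```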
Enabling SNMP traps causes the Catalyst to report to 10.100.100.21 information it detects
related to issues such as Spanning Tree changes, device resets, and hardware failures. Link
up/down traps are enabled for the important uplink ports (because of the potential volume
of data, it is almost always best not to enable this on end-station ports).
Finally, the commands in Example 17-11 configure the Catalyst to send Syslog information
to the network management station.
!
#dns
set ip dns server 10.100.100.42 primary
set ip dns server 10.100.100.68
set ip dns enable
set ip dns domain happy.com
!
#tacacs+
set tacacs attempts 3
set tacacs directedrequest disable
set tacacs timeout 5
!
#authentication
set authentication login tacacs disable console
set authentication login tacacs disable telnet
set authentication enable tacacs disable console
set authentication enable tacacs disable telnet
set authentication login local enable console
set authentication login local enable telnet
set authentication enable local enable console
set authentication enable local enable telnet
!
#bridge
set bridge ipx snaptoether 8023raw
set bridge ipx 8022toether 8023
set bridge ipx 8023rawtofddi snap
!
#vtp
set vtp domain Happy-B2
set vtp mode server
set vtp v2 disable
set vtp pruning disable
set vtp pruneeligible 2-1000
clear vtp pruneeligible 1001-1005
set vlan 1 name default type ethernet mtu 1500 said 100001 state active
set vlan 20 name B2_Management type ethernet mtu 1500 said 100020 state active
set vlan 21 name B2_Sales type ethernet mtu 1500 said 100021 state active
set vlan 22 name B2_Marketing type ethernet mtu 1500 said 100022 state active
set vlan 23 name B2_Engineering type ethernet mtu 1500 said 100023 state active
set vlan 24 name B2_Finance type ethernet mtu 1500 said 100024 state active
set vlan 250 name Backbone type ethernet mtu 1500 said 100250 state active
set vlan 1002 name fddi-default type fddi mtu 1500 said 101002 state active
!
#spantree
#uplinkfast groups
set spantree uplinkfast enable
#backbonefast
set spantree backbonefast enable
set spantree enable all
#vlan 1
set spantree fwddelay 15 1
set spantree hello 2 1
set spantree maxage 20 1
set spantree priority 32768 1
#vlan 20
set spantree fwddelay 15 20
set spantree hello 2 20
set spantree maxage 20 20
set spantree priority 32768 20
#vlan 21
set spantree fwddelay 15 21
set spantree hello 2 21
set spantree maxage 20 21
set spantree priority 32768 21
#vlan 22
set spantree fwddelay 15 22
set spantree hello 2 22
set spantree maxage 20 22
set spantree priority 32768 22
#vlan 23
set spantree fwddelay 15 23
set spantree hello 2 23
set spantree maxage 20 23
set spantree priority 32768 23
#vlan 24
set spantree fwddelay 15 24
set spantree hello 2 24
set spantree maxage 20 24
set spantree priority 32768 24
#vlan 250
set spantree fwddelay 15 250
set spantree hello 2 250
set spantree maxage 20 250
set spantree priority 32768 250
#vlan 1003
Example 17-13 Configuring the Catalyst Name, VTP, and IP Address Parameters
Console> (enable) set system name Cat-B2-0B
System name set.
Cat-B2-0B> (enable) set vtp domain Happy-B2
VTP domain Happy-B2 modified
Cat-B2-0B> (enable) set vtp mode server
VTP domain Happy-B2 modified
Cat-B2-0B> (enable)
Cat-B2-0B> (enable) set interface sc0 20 10.2.20.8 255.255.255.0
Interface sc0 vlan set, IP address and netmask set.
Cat-B2-0B> (enable) set ip route default 10.2.20.1
Route added.
Cat-B2-0B> (enable)
Notice that because VTP server mode is in use, the VLANs do not need to be manually
added to this switch. In fact, assuming that the Supervisor contained an empty
configuration, Cat-B2-0B would have also automatically learned the VTP domain name
(making the set vtp domain Happy-B2 command optional). Because all of the devices in
Building 2 share a single management VLAN, Cat-B2-0B receives an IP address from the
same IP subnet and uses the same default gateway address.
Next, you need to modify the Spanning Tree parameters as in Example 17-14.
Warning: Spantree port fast start should only be enabled on ports connected
to a single host. Connecting hubs, concentrators, switches, bridges, etc. to
a fast start port can cause temporary Spanning Tree loops. Use with caution.
To implement load balancing, the MDF switches require more Spanning Tree configuration
than the IDF switches. The first six set spantree root commands configure Cat-B2-0B's
portion of the Root Bridge placement for Building 2 (one command is required for each of
the six VLANs in use). Notice that Cat-B2-0B is configured as the primary Root Bridge for
the even-numbered VLANs (20, 22, and 24) and the secondary Root Bridge for the odd-
numbered VLANs (21 and 23). Cat-B2-0A would have the opposite configuration for
VLANs 20-24 (primary for odd VLANs and secondary for even VLANs). For VLAN 250,
the backbone VLAN, Cat-B1-0A is configured as the primary Root Bridge (not shown here)
with Cat-B2-0B as the secondary. This allows Cat-B2-0B to take over as the Root Bridge
for the core in the event that connectivity is lost to Building 1.
PortFast is configured for all the ports on module six. In the event that some of the Building
2 servers are connected here using fault-tolerant NICs that toggle link state (most fault-
tolerant NICs do not do this), this allows the NICs to quickly bring up the backup ports
without waiting through the Spanning Tree Listening and Learning states.
The last command enables BackboneFast (as discussed earlier, it must be enabled on all
switches to work correctly). Finally, notice that UplinkFast is not enabled on the MDF
switches. Doing so disturbs the Root Bridge placement carefully implemented with the
earlier set spantree root commands.
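Example 17-14 itself is not reproduced in this extract. A sketch of the commands just described might look like the following (the diameter value and the module 6 port range are assumptions):

```
set spantree root 20 dia 4
set spantree root 22 dia 4
set spantree root 24 dia 4
set spantree root secondary 21 dia 4
set spantree root secondary 23 dia 4
set spantree root secondary 250 dia 4
set spantree portfast 6/1-24 enable
set spantree backbonefast enable
```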
Example 17-15 shows how to configure the trunk ports.
As with the IDF switch, the ports are labeled with names and hard-coded to be ISL trunks. The
10/100 Supervisor connection to Cat-B2-3A is also hard-coded to 100 Mbps and full-duplex.
The Gigabit Ethernet links to Cat-B2-1A and Cat-B2-2A do not require this step because the
3-port Gigabit Ethernet Catalyst 5000 modules are fixed at 1000 Mbps and full-duplex.
The clear trunk command manually prunes VLANs from the trunk links. Because the
Catalyst on the third floor will only contain ports in the Sales VLAN, all VLANs except 20
and 21 have been removed from the 1/1 uplink. Happy Homes is less certain about the
location of employees on the first two floors of Building 2. Although the immediate plans call
for engineering to be located on the first floor and for finance and marketing to share the
second floor, the company knows that there will be a large amount of movement between
these floors for the next two years. As a result, both Cat-B2-1A and Cat-B2-2A will be
configured with all four end-user VLANs. However, other VLANs (2-19 and 25-1005) have
still been pruned.
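Example 17-15 is not reproduced in this extract. Based on the description, the pruning commands would resemble this sketch (the MDF port assignments are assumptions):

```
#uplink to Cat-B2-3A: keep only VLANs 1, 20, and 21
clear trunk 1/1 2-19,22-1005
#uplinks to Cat-B2-1A and Cat-B2-2A: keep VLANs 1 and 20-24
clear trunk 5/1 2-19,25-1005
clear trunk 5/2 2-19,25-1005
#link to Cat-B2-0A: keep VLANs 1 and 20-24 (VLAN 250 excluded)
clear trunk 5/3 2-19,25-1005
```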
TIP When manually pruning VLANs, be careful not to prune the Management VLAN. If you
do, Telnet, SNMP, and other IP-based communication with the Supervisor are not possible.
If you are using VLAN 1 for the Management VLAN, this is not an issue because VLAN 1
cannot be cleared from a trunk link.
It is important to notice that the backbone VLAN, VLAN 250, has been excluded from
every link within the building, including the link between the two MDF switches (Port 5/3
on Cat-B2-0B). In other words, the only port configured for VLAN 250 on the four MDF
switches should be the ATM link into the campus core. Doing this guarantees a loop-free
core with more deterministic and faster-converging traffic flows, as discussed in the
section "Make Layer 2 Cores Loop Free" in Chapter 15.
TIP When using a Layer 2 core, be sure to remove the core VLAN from all links within each
distribution block.
The commands in Example 17-16 complete the configuration and are almost identical to
the IDF configuration discussed with Examples 17-6 through 17-11.
Example 17-16 Configuring Passwords, Banner, System Information, DNS, IP Permit List, IGMP Snooping,
SNMP, and Syslog
Cat-B2-0B> (enable)
Cat-B2-0B> (enable) set password
Enter old password:
Enter new password:
Retype new password:
Password changed.
Cat-B2-0B> (enable)
Cat-B2-0B> (enable) set enablepass
Enter old password:
Enter new password:
Retype new password:
Password changed.
Cat-B2-0B> (enable)
Cat-B2-0B> (enable)
Cat-B2-0B> (enable) set banner motd ~PRIVATE NETWORK -- HACKERS WILL BE SHOT!!~
MOTD banner set
Cat-B2-0B> (enable) set system location Building 2 MDF
System location set.
Cat-B2-0B> (enable) set system contact Joe x111
System contact set.
Cat-B2-0B> (enable)
Cat-B2-0B> (enable) set ip dns enable
DNS is enabled
Cat-B2-0B> (enable) set ip dns domain happy.com
Default DNS domain name set to happy.com
Cat-B2-0B> (enable) set ip dns server 10.100.100.42
10.100.100.42 added to DNS server table as primary server.
Cat-B2-0B> (enable) set ip dns server 10.100.100.68
10.100.100.68 added to DNS server table as backup server.
Cat-B2-0B> (enable)
Cat-B2-0B> (enable) set ip permit enable
IP permit list enabled.
WARNING!! IP permit list has no entries.
Cat-B2-0B> (enable) set ip permit 10.100.100.0 255.255.255.0
10.100.100.0 with mask 255.255.255.0 added to IP permit list.
Cat-B2-0B> (enable)
Cat-B2-0B> (enable)
Cat-B2-0B> (enable) set igmp enable
IGMP feature for IP multicast enabled
Cat-B2-0B> (enable)
Cat-B2-0B> (enable)
Cat-B2-0B> (enable) set snmp community read-only lesspublic
SNMP read-only community string set to lesspublic.
Cat-B2-0B> (enable) set snmp community read-write moreprivate
SNMP read-write community string set to moreprivate.
Cat-B2-0B> (enable) set snmp community read-write-all mostprivate
SNMP read-write-all community string set to mostprivate.
Cat-B2-0B> (enable)
Cat-B2-0B> (enable) set snmp trap 10.100.100.21 trapped
SNMP trap receiver added.
Cat-B2-0B> (enable) set snmp trap enable module
SNMP module traps enabled.
Cat-B2-0B> (enable) set snmp trap enable chassis
SNMP chassis alarm traps enabled.
Cat-B2-0B> (enable) set snmp trap enable bridge
SNMP bridge traps enabled.
Cat-B2-0B> (enable) set snmp trap enable auth
SNMP authentication traps enabled.
Cat-B2-0B> (enable) set snmp trap enable stpx
SNMP STPX traps enabled.
Cat-B2-0B> (enable) set snmp trap enable config
SNMP CONFIG traps enabled.
Cat-B2-0B> (enable) set port trap 1/1 enable
Port 1/1 up/down trap enabled.
Cat-B2-0B> (enable) set port trap 5/1 enable
Port 5/1 up/down trap enabled.
Cat-B2-0B> (enable) set port trap 5/2 enable
Port 5/2 up/down trap enabled.
Cat-B2-0B> (enable)
Cat-B2-0B> (enable)
Cat-B2-0B> (enable) set logging server enable
System logging messages will be sent to the configured syslog servers.
Cat-B2-0B> (enable) set logging server 10.100.100.21
10.100.100.21 added to System logging server table.
Cat-B2-0B> (enable)
Cat-B2-0B> (enable)
The only significant difference between Examples 17-6 through 17-11 and Example 17-16
is that Protocol Filtering is not enabled.
Each VLAN interface has been configured with a separate HSRP group for default gateway
redundancy with Cat-B2-0A. Because Happy Homes will require NetWare and IPX services
for the foreseeable future, the RSM has been configured with IPX network addresses (notice
that IPX automatically locates a new gateway when the primary fails [although a reboot might
be required] and therefore does not require the support of a feature such as HSRP).
Each interface is also configured with a pair of ip helper-address commands to forward
DHCP traffic to the Server Farm. If desired, a single ip helper-address could have been
specified using the server farm subnet's broadcast address (10.100.100.255). Also notice
the two no ip forward-protocol udp statements. These prevent the flooding of chatty
NetBIOS over TCP/IP name resolution traffic, a potentially important enhancement in
networks with large numbers of Microsoft-based end stations.
EIGRP has been configured as the IP routing protocol (IPX uses IPX RIP by default). Because
EIGRP includes interfaces on a classful basis, the passive-interface command has been used
to keep routing traffic off the IDF segments. Although this is not going to save much update
traffic with a protocol such as EIGRP (in this case, it only prevents EIGRP hello packets from
being sent), it prevents a large number of unnecessary EIGRP neighbor relationships (by
default, there is one for every pair of routers in every VLAN). By reducing these peering
relationships, you can improve the performance and stability of the routing protocol.
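The RSM configuration appears in an example outside this extract. A condensed sketch of one VLAN interface and the routing stanza described above might look like the following; every address, network number, and the EIGRP AS number here is an assumption for illustration:

```
interface Vlan21
 ip address 10.2.21.3 255.255.255.0
 ip helper-address 10.100.100.10
 ip helper-address 10.100.100.11
 ipx network 1021
 standby 21 priority 110
 standby 21 preempt
 standby 21 ip 10.2.21.1
!
no ip forward-protocol udp netbios-ns
no ip forward-protocol udp netbios-dgm
!
router eigrp 100
 network 10.0.0.0
 passive-interface Vlan20
 passive-interface Vlan21
```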
TIP Reducing unnecessary peering can be especially useful in the Catalyst 8500s where
excessive control plane trafc can overwhelm the CPU. However, it is an important
optimization for all VLAN-based router platforms.
The RSM has also been configured with many of the same management features as Catalyst
Supervisors, including the following:
SNMP community strings
SNMP host and location information
SNMP traps
A message-of-the-day banner
Passwords
A VTY access-class to limit Telnet access from segments other than the Server Farm
DNS
Syslog logging
Timestamps of logging information
Both of the LS1010s require four configuration items to support LANE under Design 1:
The addresses of the LECSs (in this case, the LS1010s themselves) must be
configured with the atm lecs-address-default command. Because the design calls for
SSRP, both ATM switches are configured with two LECS addresses. See Chapter 9
for more information on SSRP.
The LECS database. Again because of SSRP, there are two LES/BUS devices in use.
Because both LES/BUSs are using dual-PHY connections to different ATM switches,
a total of four different LES addresses are possible and must all be included in the
database.
The configuration on logical interface atm 2/0/0 (the ATM Switch Processor [ASP]
itself) of the lane config auto-config-atm-address and lane config database
commands to start the LECS process.
The configuration on the logical subinterface atm 2/0/0.1 of a LANE client to provide
an in-band management channel for the ATM switch.
In addition to the in-band management channel provided by the LEC located on interface
atm 2/0/0.1, an additional connection is provided for occasions where the ATM network is
down. One way to accomplish this is to provide a modem on the AUX port of the ASP.
However, in campus networks, it is often more effective to utilize the ASP's Ethernet
management port. In this case, the port is configured with an IP address on the Building 1
Management VLAN and then connected to a 10/100 port on Cat-B1-0B.
The order of the statements in the LECS database deserves special notice. Figure 17-5
shows a detailed view of the ATM links specified in Design 1.
[Figure 17-5: Dual-PHY ATM links between Cat-B1-0A/Cat-B1-0B and Cat-B2-0A/Cat-B2-0B. Each Catalyst shows its A and B PHYs and its preferred link. The LS1010 prefixes are 47.0091.8100.0000.0010.2962.E801 and 47.0091.8100.0000.0010.11BE.AC01, and the secondary LES/BUS has LES MAC 0010.2941.D031.]
Recall from Chapter 9 that careful planning of the order of the LECS database can avoid
unnecessary backtracking. Because Cat-B1-0A is the primary LES/BUS and is configured
with PHY A as its preferred port, the combination of LS1010-A's prefix and Cat-B1-0A's
ESI is listed first in the database. If this port fails, it takes 10 or more seconds for Cat-B1-0A's
PHY B to become active, making it a poor choice for the secondary LES. Because
Cat-B2-0B's preferred port, PHY B, should already be fully active, it is more efficient as a
secondary LES address. If Cat-B2-0B's PHY B fails, the tertiary LES address can be
Cat-B1-0A's PHY B. As a last resort, Cat-B2-0B's PHY A is used. For more information on this
issue, see the "Dual-PHY" section of Chapter 9.
Finally, the LS1010 is configured with many of the same management options as earlier
devices: SNMP, passwords, logging, a banner, and DNS.
Design Alternatives
Although an endless variety of design alternatives exist, several are common enough to
deserve special mention. One popular design alternative involves pruning the IDF VLANs
from the link that connects the MDFs together. This effectively converts the Layer 2
triangles discussed in this design into the Layer 2 Vs used in Design 2 (a Catalyst 8500-based
design). It is exactly this sort of minor change in a campus topology that can have a
dramatic impact on Spanning Tree and the overall design. For details on how this affects
the network, refer to Design 2 (from a Spanning Tree and load balancing perspective, this
modification to Design 1 makes it equivalent to Design 2).
In addition, network designers wanting to fully utilize the Layer 2 features of their networks
might want to implement Dynamic VLANs and VMPS. Given the Layer 2 orientation of
MLS and the approach presented in Design 1, this enhancement is fairly simple to
configure. For more information on Dynamic VLANs and VMPS, see Chapter 12.
Furthermore, VTP pruning can be used to automate the removal of VLANs from trunk
links. This eliminates the need for the manual pruning via the clear trunk command as
discussed earlier.
Also, when implementing a design that maintains any sort of Layer 2 loops, you should at
least consider implementing a loop-free topology within the management VLAN. As
discussed in Chapter 15, loops in the management VLAN can quickly lead to collapse of
the entire network. Although one of the great benefits of creating a Layer 3 barrier is that it
isolates this failure to a single building (and it further helps by making the Layer 2 domains
small enough that loops are unlikely to form), some form of looping is always a possibility
when using Layer 2 technology.
In another common change, many organizations like to make trunk and server links high
priority using the set port level command.
Finally, the servers can be directly connected to the ATM core by supplying them with ATM
NICs. However, one of the downsides to this approach is the question of how to handle
default gateway routing from the servers to the routers located in the MDF switches. For
example, if the servers are configured with a default gateway address of 10.250.250.4, the
address of the interface VLAN 250 on Cat-B2-0B's RSM, all traffic is directed to Building 2.
Traffic destined for Building 1 would therefore incur an additional routing hop and cross the
backbone twice (unless ICMP redirects were supported). Another problem with using
default gateways is the issue of redundancy. Although HSRP can be configured, it
exacerbates the previous issue by disabling ICMP redirects on the router. In general, the
best solution is to run a routing protocol on your servers (also requiring you to migrate the
RSMs in this design from EIGRP to something like OSPF).
[Figure: Design 2 physical topology. In Building 1, IDF switches Cat-B1-3A, Cat-B1-2A, and Cat-B1-1A connect to MDF switches Cat-B1-0A and Cat-B1-0B, along with the Server Farm. In Building 2, MDF switches Cat-B2-0A and Cat-B2-0B connect to IDF switches Cat-B2-1A, Cat-B2-2A, and Cat-B2-3A.]
Several differences from the physical layout used in Design 1 are important. First, the ATM
core has been replaced with Gigabit Ethernet. Second, the Building 2 third floor has been
replaced with a Catalyst 5509. However, both designs are similar in that a pair of redundant
MDF devices is used in each basement with two riser links going to each IDF.
Design Discussion
Whereas Design 1 sought to blend Layer 2 and Layer 3 technology, Design 2 follows an
approach that maximizes the Layer 3 content in the MDF/distribution layer switches. This
somewhat subtle change has a dramatic impact on the rest of the design.
The most important change created by this design is that all IDF VLANs are terminated at the
MDF switch. In other words, users connected to different IDFs always fall in different
VLANs. As discussed in Chapter 11, although it is possible to have a limited number of
VLANs traverse a Catalyst 8500 using IRB, this is not a technique that you want to use many
times throughout your campus (it is appropriate for one or two special-case VLANs). In other
words, this style of Layer 3 switching is best used as a fast version of normal routing.
The second most important change, a simplification of Spanning Tree, is discussed in the
next section.
Spanning Tree
Although some view the loss of IDF-to-IDF VLANs as a downside to the approach taken
in Design 2, it is important to offset this with the simplifications that hardware-based
routing makes possible. One of the most important simplifications involves the area of Layer
2 loops and the Spanning-Tree Protocol. In fact, hardware-based routing has completely
eliminated the Layer 2 loops between the IDF and MDF switches. Whereas Design 1 used
Layer 2 triangles, this design uses Layer 2 Vs.
NOTE As was stressed in the discussion of Design 1, MLS can be used to build loop-free Layer 2
Vs. However, it is important to realize that switching routers such as the 8500 do this by
default, whereas MLS (and routing switches) require you to manually prune certain
VLANs from selected links. See the earlier section "Trunks" for more information.
Because this design removes all Layer 2 loops (at least the ones that are intentionally formed),
some organizations have decided to completely disable Spanning Tree when using this
approach. However, because it does not prevent unintentional loops on a single IDF switch
(generally as the result of a cabling mistake), other network designers want to maintain a
Spanning Tree security blanket on their IDF switches. However, it is important to recognize
that even in the cases where Spanning Tree remains enabled (as it is in Design 2), the
operation of the Spanning-Tree Protocol is dramatically simplified for a variety of reasons.
First, Root Bridge placement becomes a non-issue. Each IDF switch is not aware of any
other switches and naturally elects itself as the Root Bridge.
TIP It can still be a good idea to lower the Bridge Priority in case someone plugs in another
bridge some day.
In addition, Spanning Tree load balancing is not required (or, for that matter, possible).
Also, features such as UplinkFast and BackboneFast are no longer necessary for fast
convergence.
Finally, the Spanning Tree network diameter has been reduced to the IDF switch itself. As
a result, the Max Age and Forward Delay times can be aggressively tuned without concern.
For example, Design 2 specifies a Max Age of 10 seconds and a Forward Delay of 7
seconds. Although somewhat more aggressive values can be used, these were chosen as a
conservative compromise. As a result, failover performance where a loop exists is between
14 and 20 seconds. However, because the topology is loop free at Layer 2, there should be
no Blocking ports during normal operation. As a result, IDF uplink failover performance is
governed by HSRP, not Spanning Tree. Consequently, the network can recover from uplink
failures in as little as one second (assuming that the HSRP parameters are lowered).
TIP The Spanning-Tree Protocol does not affect failover performance in this network.
VLAN Design
Although the concept of a VLAN begins to blur (or fade) in this design, the IDF switches are
configured with the same end-user VLAN names as used in Design 1. However, notice that
all of the VLANs use essentially the same numbers throughout this version of the design. The
management VLAN in all switches is always VLAN 1 (even though they are different IP
subnets). Similarly, the first end-user VLAN on an IDF switch is VLAN 2. If more than one
VLAN is required on a given IDF switch, VLANs 3 and greater can be created.
Notice that this brings a completely different approach to user mobility than Design 1.
Design 1 attempted to place all users in the same community of interest located within a
single building in the same VLAN. In the case of Design 2, that is no longer possible
without enabling IRB on the Catalyst 8500. Here, it is expected that users in the same
community of interest may very well fall into different subnets. However, because DHCP
is in use, IP addressing is transparent to the users. Furthermore, because the available Layer
3 bandwidth is so high with 8500 technology, the use of routing (Layer 3 switching) does
not impair the network's performance.
NOTE Note that a similar case for Layer 3 performance can be made for the Catalyst 6000/6500.
See Chapter 18 for more detail.
VTP
Given the Layer 3 nature of Design 2, VTP server mode has little meaning (8500s do not
propagate VTP frames). Therefore, Design 2 calls for VTP transparent mode. Although not
a requirement, the design also calls for a VTP domain name of Happy (unlike server and
client modes, transparent mode does not require a VTP domain name).
As a result, each IDF switch must be individually configured with the list of VLANs it must
handle. However, this is rarely a significant issue because each IDF switch usually only
handles a small number of VLANs.
TIP If the VLAN configuration tasks are a concern (or, for that matter, any other configuration
task), consider using tools such as Perl and Expect. Both run on a wide variety of UNIX
platforms as well as Windows NT.
Trunks
To present an alternative approach, Design 2 uses Fast EtherChannel links between the
MDF and IDF switches. To provide adequate bandwidth in the core, Gigabit Ethernet links
are used.
Server Farm
This design calls for a separate Server Farm building (a third building at the corporate
headquarters campus will be used). The Server Farm could easily have been placed in
Building 1 as it was with Design 1; however, an alternate approach was used for variety.
Configurations
This section presents the configurations for Design 2. As with Design 1, you see only one
example of each type of device. First, you see configurations for and discussion of a
Catalyst 5509 IDF switch, followed by configurations for and discussion of a Catalyst 8540
MDF switch.
Unlike Design 1, this design utilizes VTP transparent mode and requires only a single end-
user VLAN for Cat-B2-1A as shown in Example 17-22.
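The body of Example 17-22 is not reproduced in this extract; the commands it describes would be on the order of the following (the VLAN name is an assumption based on the planned first-floor Engineering placement):

```
set vtp domain Happy
set vtp mode transparent
set vlan 2 name Engineering
```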
The SC0 interface also uses a different configuration under Design 2. First, the IP address and
netmask are obviously different. Second, SC0 is left in VLAN 1, the default. Third, Design 2
calls for two default gateway addresses to be specified with the ip route command (this
feature was first supported in Version 4.1 of Catalyst 5000 code). This can simplify the overall
configuration and maintenance of the network by not requiring a separate HSRP group to be
maintained for each management subnet/VLAN. Example 17-23 demonstrates these steps.
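The body of Example 17-23 is likewise missing from this extract; a sketch, with an assumed management subnet and gateway addresses, might look like:

```
set interface sc0 1 10.2.1.11 255.255.255.0
set ip route default 10.2.1.2
set ip route default 10.2.1.3
```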
The first two commands (set spantree root) lower the Max Age and Forward Delay timers
to 10 and 7 seconds, respectively. For consistency, this also forces the IDF switch to be the
Root Bridge. (Although this is useful in the event that other switches or bridges have been
cascaded off the IDF switch, in most situations this has no impact on the actual topology
under Design 2.) Finally, PortFast is enabled on all of the end-user ports in slots 4-8.
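The corresponding Example 17-24 commands are not reproduced here; they would resemble the following sketch (the dia value that yields the 10-second Max Age and 7-second Forward Delay, as well as the VLAN list and port ranges, are assumptions):

```
set spantree root 1 dia 2
set spantree root 2 dia 2
set spantree portfast 4/1-24 enable
set spantree portfast 5/1-24 enable
```

Similar set spantree portfast commands would cover the remaining end-user ports through slot 8.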
Next, the trunk ports are configured as in Example 17-25.
As mentioned earlier, Design 2 uses Fast EtherChannel links from Cat-B2-1A and Cat-B2-
2A to the MDF switches. For stability, these are hard-coded to the port channel on state.
The resulting EtherChannel bundles are also hard-coded as ISL trunks. Also notice that
although the set trunk command is only applied to a single port, the Catalyst automatically
applies it to every port in the EtherChannel bundle.
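Example 17-25 itself is not reproduced in this extract; a sketch of the channel and trunk commands described (the port range is an assumption):

```
set port channel 1/1-2 on
set trunk 1/1 on isl
```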
The commands in Example 17-26 are very similar to those used in Example 17-16 of
Design 1.
Example 17-26 Configuring SNMP, Password, Banner, System Information, DNS, IP Permit List, IGMP Snooping,
Protocol Filtering, SNMP, and Syslog
Cat-B2-1A> (enable) set snmp community read-only lesspublic
SNMP read-only community string set to lesspublic.
Cat-B2-1A> (enable) set snmp community read-write moreprivate
SNMP read-write community string set to moreprivate.
Cat-B2-1A> (enable) set snmp community read-write-all mostprivate
SNMP read-write-all community string set to mostprivate.
Cat-B2-1A> (enable)
Cat-B2-1A> (enable) set password
Enter old password:
Enter new password:
Retype new password:
Password changed.
Cat-B2-1A> (enable)
Cat-B2-1A> (enable) set enablepass
Enter old password:
Enter new password:
Retype new password:
Password changed.
Cat-B2-1A> (enable)
Cat-B2-1A> (enable)
Cat-B2-1A> (enable) set banner motd ~PRIVATE NETWORK -- HACKERS WILL BE SHOT!!~
MOTD banner set
Cat-B2-1A> (enable) set system location Building 2 First Floor
System location set.
Cat-B2-1A> (enable) set system contact Joe x111
System contact set.
Cat-B2-1A> (enable)
Cat-B2-1A> (enable) set ip dns enable
DNS is enabled
Cat-B2-1A> (enable) set ip dns domain happy.com
Default DNS domain name set to happy.com
Cat-B2-1A> (enable) set ip dns server 10.100.100.42
10.100.100.42 added to DNS server table as primary server.
Cat-B2-1A> (enable) set ip dns server 10.100.100.68
10.100.100.68 added to DNS server table as backup server.
Cat-B2-1A> (enable)
Cat-B2-1A> (enable) set ip permit enable
IP permit list enabled.
WARNING!! IP permit list has no entries.
Cat-B2-1A> (enable) set ip permit 10.100.100.0 255.255.255.0
10.100.100.0 with mask 255.255.255.0 added to IP permit list.
Cat-B2-1A> (enable)
Cat-B2-1A> (enable)
Cat-B2-1A> (enable) set igmp enable
IGMP feature for IP multicast enabled
Cat-B2-1A> (enable)
Cat-B2-1A> (enable) set protocolfilter enable
Protocol filtering enabled on this switch.
Cat-B2-1A> (enable)
Cat-B2-1A> (enable)
Cat-B2-1A> (enable) set snmp trap 10.100.100.21 trapped
SNMP trap receiver added.
Cat-B2-1A> (enable) set snmp trap enable module
SNMP module traps enabled.
Cat-B2-1A> (enable) set snmp trap enable chassis
SNMP chassis alarm traps enabled.
Cat-B2-1A> (enable) set snmp trap enable bridge
SNMP bridge traps enabled.
Cat-B2-1A> (enable) set snmp trap enable auth
SNMP authentication traps enabled.
Cat-B2-1A> (enable) set snmp trap enable stpx
SNMP STPX traps enabled.
Cat-B2-1A> (enable) set snmp trap enable config
SNMP CONFIG traps enabled.
Cat-B2-1A> (enable) set port trap 3/1-8 enable
Port 3/1-8 up/down trap enabled.
Cat-B2-1A> (enable)
Cat-B2-1A> (enable)
Cat-B2-1A> (enable) set logging server enable
System logging messages will be sent to the configured syslog servers.
Cat-B2-1A> (enable) set logging server 10.100.100.21
10.100.100.21 added to System logging server table.
Cat-B2-1A> (enable)
Cat-B2-1A> (enable)
Three logical port-channel interfaces are configured to handle the links to the three IDF
switches. Because the EtherChannels are using ISL encapsulation to trunk multiple VLANs
to the IDFs, each port-channel is then configured with multiple subinterfaces, one for each
IDF VLAN. For example, interface port-channel 2 is used to connect to Cat-B2-2A on the
second floor. Subinterface port-channel 2.1 is created for the management VLAN, 2.2 for
the Finance VLAN, and 2.3 for the Marketing VLAN. Each subinterface is configured with
an encapsulation isl statement and the appropriate IP and IPX Layer 3 information.
The subinterfaces supporting end-user traffic are also configured with two HSRP groups.
As explained in Chapter 11, HSRP load balancing should be employed in designs where a
single end-user VLAN is used on each IDF and there are no Layer 2 loops (making
Spanning Tree load balancing impossible). To enable HSRP load balancing, a technique
called Multigroup HSRP (MHSRP) is used. Under MHSRP, two (or more) HSRP groups
are created for every subnet. By having each MDF device be the active HSRP peer for one
of the two HSRP groups, load balancing can be achieved. For example, Design 2 calls for
two HSRP groups per end-user subnet (as mentioned earlier, the management VLANs use
multiple default gateways instead). The first HSRP group uses .1 in the fourth octet of the
IP address, and the second group uses .2. By making Cat-B2-0A the active peer for the first
group and Cat-B2-0B the active peer for the second group, both router ports can be active
at the same time.
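As a rough sketch, one such subinterface on Cat-B2-0A might look like the following. The interface number, VLAN number, addresses, and priority values here are illustrative assumptions, not taken from the actual Happy Homes configuration:

```
! Hypothetical Finance subinterface on Cat-B2-0A (MDF switch)
interface Port-channel2.2
 encapsulation isl 22
 ip address 10.1.22.3 255.255.255.0
 ! Group 1 owns the .1 gateway; the higher priority makes Cat-B2-0A active
 standby 1 ip 10.1.22.1
 standby 1 priority 110
 standby 1 preempt
 ! Group 2 owns the .2 gateway; Cat-B2-0B carries the higher priority here
 standby 2 ip 10.1.22.2
 standby 2 preempt
```

Cat-B2-0B would mirror this configuration, placing the higher priority on group 2 so that each MDF switch is active for one of the two groups.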
NOTE Note that the recommendation to use MHSRP is predicated upon the fact that a single
VLAN is being used on the IDF switches (as discussed in Chapter 11, this is often done to
facilitate ease of network management). If you are using multiple VLANs on the IDFs, you
can simply alternate active HSRP peers between the VLANs. See Chapter 11 for more
information and configuration examples.
The catch with this approach is finding some technique to have half of the end stations use
the .1 default gateway address and the other half use .2. Chapter 11 suggests using DHCP
for this purpose. For example, Happy Homes is planning to deploy two DHCP servers
(from the ip helper-address statements, we can determine that the IP addresses are
10.100.100.33 and 10.100.100.81). All leases issued by the first DHCP server,
10.100.100.33, specify .1 as the default gateway. On the other hand, all leases issued by the
second DHCP server, 10.100.100.81, specify .2 as the default gateway. To help ensure a fairly
random distribution of leases between the two DHCP servers, the order of the ip helper-address
statements can be inverted between the two MDF switches. For example, the configuration
for Cat-B2-0B shows 10.100.100.81 as the first ip helper-address on every end-user
subinterface. On the other MDF switch, Cat-B2-0A, 10.100.100.33 should be listed first.
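The DHCP server addresses below come from the text; the subinterface used is an illustrative assumption. The inverted ordering on the two MDF switches might be sketched as:

```
! On Cat-B2-0A: 10.100.100.33 listed first
interface Port-channel2.2
 ip helper-address 10.100.100.33
 ip helper-address 10.100.100.81
!
! On Cat-B2-0B: the order is inverted
interface Port-channel2.2
 ip helper-address 10.100.100.81
 ip helper-address 10.100.100.33
```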
Further down in the configuration, the actual Fast Ethernet ports are shown. Notice that
these do not contain any direct configuration statements (the entire configuration is done on
the logical port-channel interface). The only statement added to each interface is a channel-
group command that includes the physical interface in the appropriate logical port-channel
interface.
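A minimal sketch of this pattern (the physical interface numbers are assumptions):

```
! Physical Fast Ethernet ports carry no Layer 3 configuration;
! each is simply placed into the logical port-channel interface
interface FastEthernet1/0/0
 channel-group 2
interface FastEthernet1/0/1
 channel-group 2
```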
Because the Gigabit Ethernet interfaces are not using EtherChannel, the configuration is
placed directly on the interface itself. Each interface receives an IP address and an IPX
network statement. Because these interfaces do not connect to any end stations, HSRP and
IP helper addresses are not necessary.
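For example, one such routed Gigabit Ethernet interface might be configured along these lines (the addresses, IPX network number, and interface number are illustrative assumptions):

```
! Routed Gigabit Ethernet link toward the backbone: IP and IPX only,
! no HSRP groups and no ip helper-address statements
interface GigabitEthernet0/0/0
 ip address 10.1.250.1 255.255.255.0
 ipx network 1FA
```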
The remaining configuration commands set up the same management features discussed in
the earlier configurations.
Design Alternatives
As with Design 1, hundreds of permutations are possible for Design 2. This section briefly
discusses some of the more common alternatives.
First, as shown in Figure 17-5, Design 2 calls for a pair of 8500s for the server farm. Figure
17-7 illustrates a potential layout for the server farm under Design 2.
[Figure 17-7: A pair of Catalyst 6500s providing Layer 2 switching for the server farms, each connected by Gigabit Ethernet to the Layer 3 switching backbone via Cat-B1-0B and Cat-B2-0B.]
In this plan, a pair of Catalyst 6500 switches is directly connected to the backbone via Cat-
B1-0B and Cat-B2-0B. By using the Catalyst 6500's MSFC Native IOS Mode, you can
leverage the capability of these devices to simultaneously behave as both routing switches
and switching routers (see Chapter 18 for more information on this capability). This gives
you the flexibility to provide Layer 2 connectivity within the server farm while also
utilizing Layer 3 to reach the backbone. In essence, the server farm becomes a miniature
version of one of the buildings, but all contained within a pair of devices (the 6500s are
acting like MDF and IDF devices at the same time).
As an alternative, some organizations have used the design shown in Figure 17-8.
[Figure 17-8: Gigabit Ethernet server connections; a pair of Catalyst 4003s providing Layer 2 switching for Server Farms A and B, attached directly to the Layer 3 switching backbone via Cat-B1-0B and Cat-B2-0B.]
In this example, the Layer 2 Catalysts (in this case, 4003s) have been directly connected to
the existing 8540s, Cat-B1-0B and Cat-B2-0B. The advantage of this approach is that it
saves the expense of two Layer 3 switches and potentially removes one router hop from the
typical end-user data path.
Unfortunately, this design is susceptible to the same default gateway issues discussed
earlier in association with directly connecting servers to the LANE cloud in Design 1. As a
result, it can actually add router hops by unnecessarily forwarding traffic to the wrong
building. (You can run HSRP, but all traffic is directed to the active peer. MHSRP can be
used, but it is generally less effective with servers than end users because of their extremely
high bandwidth consumption.) If you do implement this design, consider running a routing
protocol on your servers.
However, potentially the most serious problem involves IP addressing and link failures.
Consider the case where the Gigabit Ethernet link between the 4000s fails: both 8500s
continue trying to send all traffic destined to the server farm subnet out their rightmost port.
For example, Cat-B2-0B still tries to reach servers connected to Server Farm A by sending
the traffic first to Server Farm B. And if the link between Server Farm B and Server Farm
A is down, the traffic obviously never reaches its destination. This is a classic case of the
discontinuous subnet problem.
TIP Look for potential discontinuous subnets in your network. This can be especially important
in mission-critical areas of your network such as a server farm.
Probably the most common modification to Design 2 entails using a Layer 2 core rather
than directly connecting the MDF switches to each other with a full or partial mesh of
Gigabit Ethernet links. Although the approach used in Design 2 is fine for smaller networks,
a Layer 2 core is more scalable for several reasons:
It is easier to add distribution blocks.
It is easier to upgrade access bandwidth to one building block (simply upgrade the
links to the Layer 2 core versus upgrading all the meshed bandwidth).
Routing protocol peering is reduced from the distribution layer to the core.
The most common implementation is to use a pair of Layer 2 switches for redundancy
(however, be careful to remove all Layer 2 loops in the core).
A third potential modification to Design 2 involves VLAN numbering. Notice that Design
2 uses the pattern-based VLAN numbering scheme discussed in Chapter 15. Because
designs with a strong Layer 3 switching component effectively nullify the concept of
VLANs being globally-unique broadcast domains, this approach is appropriate for designs
such as Design 2. However, some organizations prefer to maintain globally-unique VLAN
numbers even when utilizing Layer 3 switching. In this case, every subnet is mapped to a
unique VLAN number. See Chapter 15 for more information on pattern-based versus
globally-unique VLAN numbering schemes.
Finally, another option is to deploy Gigabit EtherChannel within the core and server farm.
By offering considerably more available bandwidth, this can provide additional room for
growth within the Happy Homes campus.
Summary
This chapter has sought to enhance the many concepts and commands covered earlier in
this book by looking at real-world issues, problems, and designs. Two out of hundreds of
possible solutions were considered. Although both designs have their advantages and
disadvantages, neither design represents the right answer. Table 17-5 summarizes some of
the more important differences discussed earlier.
NOTE The term MSFC is used in this chapter to refer to both the MSFC and the Policy Feature
Card (PFC). Note that currently the MSFC cannot be purchased without an accompanying
PFC. On the other hand, a PFC can be purchased and used without an MSFC. This
configuration supports QoS and access list functionality (but not Layer 3 switching).
CHAPTER
18
Layer 3 Switching and the
Catalyst 6000/6500s
The Catalyst 6000 family represents a significant step forward in switching technology
while retaining a strong foundation in existing and proven Cisco designs. As this chapter
discusses, the Catalyst 6000s and 6500s can act as faster versions of Layer 2 Catalyst 5000s.
For shops starved for Gigabit Ethernet bandwidth and port density, this can be extremely
useful. However, it is the Layer 3 switching capabilities of the Catalyst 6000s that set them
apart from other switches, and these capabilities are the primary focus of this chapter.
Because of these powerful new features, the Catalyst 6000s are intentionally discussed in the
last chapter of this book. In essence, the 6000s draw on material learned in virtually every
prior chapter of this book but also add exciting new capabilities. For example, Chapter 4,
"Configuring the Catalyst," discussed the XDI/CatOS Catalyst user interface of set, clear, and
show commands. In one configuration, Catalyst 6000s use all of these commands and
concepts. However, by taking advantage of the Catalyst 6000 Native IOS Mode, you can
instantly convert your box into a full-fledged Cisco router (an entirely new, but familiar, user
interface)! Similarly, the Catalyst 6000 can be used to implement most of the Layer 3
switching designs discussed in Chapter 11, "Layer 3 Switching." However, it goes beyond
these features by offering a completely new approach to Layer 3 switch configuration and
management: the previously mentioned Native IOS Mode. The Native IOS Mode builds on
the material discussed in Chapter 14, "Campus Design Models," and Chapter 15, "Campus
Design Implementation," by supporting more flexible Layer 3 switching designs.
NOTE Recall from earlier chapters that XDI is the name of the UNIX-like kernel originally used
to build early Catalyst devices.
Although Layer 2 Catalyst 6000s offer the same look and feel as Catalyst 5000s, they
obviously offer increased capacity and throughput. For example, whereas the Catalyst
5000s use a 1.2-Gbps backplane and the 5500s use a 3.6-Gbps crossbar backplane, the
6000s use a 16-Gbps backplane (most vendors have started measuring switch capacity on
a full-duplex basis, resulting in a backplane capacity rating of 32 Gbps for the 6000s). In
addition to the 16-Gbps backplane, Catalyst 6500s also provide a 256-Gbps crossbar matrix
(although initial Supervisor configurations do not utilize this capacity). Obviously, the
6000s and 6500s provide a dramatic increase in available Layer 2 switching throughput and
Gigabit Ethernet port densities.
Because of the similarity between the Layer 2 Catalyst 6000s and the other Catalyst platforms
discussed in detail throughout this book, this chapter does not dwell on their details.
TIP The Catalyst 5000 Supervisor III uses an RJ-45 style console port that is pinned out in the
exact opposite of the console port found on Cisco 2500 (and other) routers, creating
widespread confusion when it initially shipped. The Catalyst 6000 Supervisor also uses an
RJ-45 connector. However, to maintain backward compatibility with both the 2500 routers
and the Catalyst 5000 Supervisor III, the Catalyst 6000 features a switch to select the pinout
you prefer. When set to the in position, it uses the same console cable as a 2500 (that is, a
rolled cable). When set to the out position, it uses the same cable as a Catalyst 5000
Supervisor III (a straight-through cable). To adjust the setting of this switch, use a paper
clip (it is recessed behind a small hole in the faceplate).
NOTE The MSM contains a faster CPU than the one currently used in the 8510.
From a configuration standpoint, the MSM connects to the Catalyst 6000 backplane via
four Gigabit Ethernet interfaces. These interfaces are labeled as GigabitEthernet 0/0/0,
1/0/0, 3/0/0, and 4/0/0. Note that 2/0/0 is not used and that these numbers do not refer to
the slot where the MSM is installed (they are always locally significant). Figure 18-1
illustrates the Catalyst 6000 backplane connections.
[Figure 18-1: The MSM attaches to the 16/32-Gbps Catalyst 6000 backplane through its Gigabit Ethernet interfaces.]
TIP The MSM requires a coordinated configuration on both the Catalyst 6000 Supervisor and
the MSM itself.
Using each of the Gigabit Ethernet interfaces as a separate router port is the simpler of the
two configuration types. For example, the partial configuration shown in Example 18-1
configures the interfaces to handle routing for VLANs 1 through 4.
This assigns each MSM router interface to a separate Layer 2 VLAN. Note that Example
18-2 assumes the MSM is located in Slot 7.
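A hedged sketch of the two halves of this configuration follows. The addresses are assumptions; the interface names and the slot 7 placement follow the text, and the comment lines are annotations only:

```
! MSM side (IOS): one router port per VLAN (2/0/0 does not exist)
interface GigabitEthernet0/0/0
 ip address 10.0.1.1 255.255.255.0
interface GigabitEthernet1/0/0
 ip address 10.0.2.1 255.255.255.0
interface GigabitEthernet3/0/0
 ip address 10.0.3.1 255.255.255.0
interface GigabitEthernet4/0/0
 ip address 10.0.4.1 255.255.255.0
!
! Supervisor side (CatOS): place each MSM backplane port in one VLAN
set vlan 1 7/1
set vlan 2 7/2
set vlan 3 7/3
set vlan 4 7/4
```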
Although the configuration in Example 18-1 and Example 18-2 correctly provides routing
services, you can obtain a much more flexible configuration by grouping all four of the
Gigabit Ethernet interfaces into a single EtherChannel bundle. By doing so, certain VLANs
are not tied to specific Gigabit Ethernet ASICs onboard the MSM, allowing for a more even
distribution of traffic.
To create an EtherChannel bundle on the MSM, simply follow the steps outlined in the
EtherChannel section of Chapter 11:
Step 1 Create one subinterface for each VLAN using the interface port-channel
port-channel.subinterface-number command.
Step 2 Configure each subinterface. At a minimum, this consists of assigning a
VLAN with the encapsulation isl vlan-identifier command and an IP
and/or IPX address (802.1Q can also be used). (It is possible to configure
bridging but not recommended; see the "Integration between Routing
and Bridging" section of Chapter 11.)
Step 3 Assign all four Gigabit Ethernet interfaces to the Port-channel interface
using the channel-group channel-number command.
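Following those three steps, a minimal MSM configuration might look like this (the VLAN numbers and addresses are illustrative assumptions):

```
! Steps 1 and 2: one subinterface per VLAN, each with ISL encapsulation
interface Port-channel1.100
 encapsulation isl 100
 ip address 10.0.100.1 255.255.255.0
interface Port-channel1.101
 encapsulation isl 101
 ip address 10.0.101.1 255.255.255.0
! Step 3: place all four Gigabit Ethernet interfaces in the bundle
interface GigabitEthernet0/0/0
 channel-group 1
interface GigabitEthernet1/0/0
 channel-group 1
interface GigabitEthernet3/0/0
 channel-group 1
interface GigabitEthernet4/0/0
 channel-group 1
```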
NOTE Although generally not useful, it is possible to create more than one Port-channel interface
and assign different Gigabit Ethernet interfaces to each.
For instance, Example 18-3 displays the complete configuration from an MSM that has
been configured with a single EtherChannel interface to the Catalyst 6000 backplane.
Example 18-3 configures three subinterfaces on the EtherChannel interface: one each for
VLAN 100, VLAN 101, and VLAN 102. All four of the Gigabit Ethernet interfaces have
been included in the channel to provide a single high-speed pipe to the rest of the Catalyst.
To create the EtherChannel on the Layer 2 Supervisor, the commands shown in Example
18-4 are required.
These two commands first assign all four MSM interfaces to a single EtherChannel
interface and then enable ISL trunking across the entire bundle. (Although the trunk
command is only entered for Port 7/1, it is automatically applied to all four ports.)
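Assuming the MSM sits in slot 7, the pair of Supervisor commands described here would take roughly this form (exact keywords may vary by CatOS software release):

```
set port channel 7/1-4 on
set trunk 7/1 on isl
```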
NOTE Notice that the MSM, being derived from 8500 technology, functions under the switching
router form of Layer 3 switching discussed in Chapter 11.
[Figure: The Catalyst 6000 Supervisor (SP) with the PFC and MSFC RP daughter cards.]
The Supervisor/SP contains a RISC CPU and the ASICs necessary to perform the duties of
a Layer 2 switch. The PFC uses technology similar to the NFFC discussed in the MLS
section of Chapter 11. Functioning as a flexible pattern matching and rewrite engine, it can
be used to provide a wide range of high-speed features such as Layer 3 switching, Quality/
Class of Service (QoS/CoS), multicast support, and security filtering. From a Layer 3
switching perspective, it provides the MLS-SE shortcut services discussed in Chapter 11.
(Technically speaking, the PFC replaces the Layer 2 forwarding ASIC on the Supervisor
and also assumes these duties.) The MSFC daughter card is derived from the NPE-200 used
in the Cisco 7200 routers. Being a high-performance and feature-rich router, it handles the
MLS-RP end of the MLS scheme and routes the first packet in every IP and IPX flow. It can
also be used to provide software-based routing for other protocols such as AppleTalk and
DECnet (expect forwarding rates of approximately 125,000 to 150,000 pps).
In short, the MSFC Hybrid Mode offers the equivalent of a souped-up Catalyst 5000 Route
Switch Module (RSM) and NFFC in a single-slot solution.
Example 18-5 Using the show module Command to Determine the MSFC RP Virtual Slot Number
Cat6000 (enable) show module
Mod Slot Ports Module-Type Model Status
--- ---- ----- ------------------------- ------------------- --------
1 1 2 1000BaseX Supervisor WS-X6K-SUP1-2GE ok
15 1 1 Multilayer Switch Feature WS-F6001-RSFC ok
3 3 24 100BaseFX MM Ethernet WS-X6224-100FX-MT ok
4 4 24 100BaseFX MM Ethernet WS-X6224-100FX-MT ok
5 5 8 1000BaseX Ethernet WS-X6408-GBIC ok
6 6 48 10/100BaseTX (RJ-45) WS-X6248-RJ-45 ok
Mod MAC-Address(es) Hw Fw Sw
--- -------------------------------------- ------ ---------- -----------------
1 00-50-54-6c-a9-e6 to 00-50-54-6c-a9-e7 1.4 5.1(1) 4.2(0.24)DAY35
00-50-54-6c-a9-e4 to 00-50-54-6c-a9-e5
00-50-3e-05-58-00 to 00-50-3e-05-5b-ff
15 00-50-73-ff-ab-00 to 00-50-73-ff-ab-ff 0.305 12.0(2.6)T 12.0(2.6)TW6(0.14)
3 00-50-54-6c-a5-34 to 00-50-54-6c-a5-4b 1.2 4.2(0.24)V 4.2(0.24)DAY35
4 00-50-54-6c-a4-74 to 00-50-54-6c-a4-8b 1.2 4.2(0.24)V 4.2(0.24)DAY35
5 00-50-f0-a8-44-64 to 00-50-f0-a8-44-6b 1.4 4.2(0.24)V 4.2(0.24)DAY35
6 00-50-f0-aa-58-38 to 00-50-f0-aa-58-67 1.0 4.2(0.24)V 4.2(0.24)DAY35
Notice that the second line (marked in bold type) under the uppermost headers in Example
18-5 lists the MSFC RP as a Multilayer Switch Feature WS-F6001-RSFC in Slot 15.
NOTE Example 18-5 shows the output of a 6009/6509 containing a single Supervisor in Slot 1. An
MSFC physically located in Slot 2 uses a virtual slot number of 16. A 6006/6506 also uses
Slots 15 and 16.
Therefore, by entering the command session 15, you are connected to the MSFC RP where
you can enter router commands.
TIP Although the numbering pattern is fairly simple, use the show module command to
determine and remember the virtual slot numbers used by MSFC RP modules.
NOTE Chapter 11 presented this list as a five-step list because it included a step (Step 3) to
configure non-trunk links on external routers. Because this is not necessary for integrated
routers such as the MSFC RP, this step has been omitted here.
For example, the configuration displayed in Example 18-6 enables MLS for VLANs 1
through 3 on an MSFC RP (both IP and IPX are configured).
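An MLS-RP configuration of this kind generally follows the pattern sketched below; the VTP domain name, addresses, and network numbers are illustrative assumptions, and Chapter 11 remains the authoritative reference for the command set:

```
! Globally enable IP and IPX MLS on the route processor
mls rp ip
mls rp ipx
!
interface Vlan1
 ip address 10.0.1.1 255.255.255.0
 ipx network 1
 mls rp vtp-domain Happy
 mls rp management-interface
 mls rp ip
 mls rp ipx
!
interface Vlan2
 ip address 10.0.2.1 255.255.255.0
 ipx network 2
 mls rp vtp-domain Happy
 mls rp ip
 mls rp ipx
```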
Note that the configuration in Example 18-6 is functionally equivalent to the MSM
configuration shown in Example 18-3.
Example 18-7 shows the results of show mls rp on the MSFC RP.
mac 0050.73ff.ab38
vlan id(s)
1 2
The first section of Example 18-7 shows useful information such as whether IP and IPX
MLS are enabled and the currently active flow masks. The next section documents aspects
of the MultiLayer Switching Protocol (MLSP) such as the VTP domain name and MLSP
sequence number.
Example 18-8 displays the output of show mls on the Catalyst SP.
Example 18-8 Output of show mls on the Catalyst 6000 Supervisor (Continued)
IPX flow mask is Destination flow
IPX max hop is 255
Active IPX MLS entries = 0
Example 18-8 shows some of the statistics collected from the NFFC/PFC. For example, the
total number of packets Layer 3 switched using MLS is shown on the first line. The second
line displays the total number of active shortcut entries in the NFFC/PFC cache. The output
also displays information on aging, flow masks, NetFlow Data Export, and IP/IPX MLS-RPs.
For more information on configuring MLS, see the "MLS" section of Chapter 11.
Although both approaches can be very effective in the appropriate design (MLS in Layer 2-
oriented designs and 8500s in Layer 3-oriented designs), both suffer from some drawbacks
(although some argue that most of these downsides are not a big deal).
Native IOS Mode is uniquely positioned in between the MLS and Catalyst 8500 approaches to
Layer 3 switching. As such, it captures the Ethernet-based benefits of both while completely
avoiding the downsides of both. As a result, the Native IOS Mode offers the following
advantages:
It provides an extremely useful metaphor for conguring and integrating Layer 2 and
Layer 3 technology. These capabilities are discussed in detail later in the chapter.
Because it uses a single user interface, organizations can avoid the confusion and
training costs associated with supporting two interfaces.
Because it uses the IOS interface, most network personnel can use the Native IOS
Mode technology with minimal training.
Because of the integrated user interface, organizations can more readily see and
visualize their logical topology. Therefore, people are less likely to mistakenly create
flat earth networks and campus-wide VLANs.
Although it is based internally on the same MLS technology used in the Hybrid Mode,
the end user is insulated from having to configure MLS directly.
It understands and has full support for VLANs. (Although the Catalyst 6000 ASICs
support Dynamic VLANs and VMPS, this feature is currently not supported on the
platform because of its anticipated use as a backbone switch where all ports are hard-
coded into each VLAN.)
It maintains almost all of the Layer 2 features of the XDI/CatOS Catalysts such as
VTP, DTP/DISL, and PAgP.
These capabilities allow the Native IOS Mode MSFC to function as either a routing
switch (as with MLS) or a switching router (as with 8500s). Although this can create
confusion for those trying to discuss and explain such concepts, it creates an
extremely powerful and flexible approach to Layer 3 switching.
Because the differences between the Hybrid Mode and Native IOS Mode consist only
of software, it is easy to convert the box between the two as an organization's needs
change (it's merely a matter of changing the images on flash).
Most importantly, the Native IOS Mode allows you to achieve all of these benefits while
retaining the speed of the Hybrid Mode (first-generation speeds will be approximately 15 mpps).
NOTE Note that the Catalyst 6000s and Native IOS Mode do not have the Catalyst 8500's ATM
switching and ATM-to-Ethernet integration capabilities.
From a physical standpoint, the RJ-45 console port on the front of the Catalyst 6000
Supervisor is obviously connected to the SP CPU's hardware. However, during the 6000's
boot cycle, control is passed to the RP CPU. Therefore, in Native IOS Mode, the RP acts
as the primary CPU, and the SP acts as the secondary. All human interaction is done directly
through the RP CPU. As commands are entered that affect the Layer 2/SP side of the device,
commands are passed over an internal bus from the RP to the SP.
Also notice that for performance reasons, both CPUs are fully functioning (the SP CPU is not
sitting completely idle). The SP CPU concentrates on port-level management (such as link up/
down) and Layer 2 protocols such as Spanning Tree, VTP, and DTP. The RP CPU performs
duties such as routing the first packet in every IP or IPX flow (or all packets for protocols that
are not supported by the NFFC/PFC), running routing protocols, CDP, and PAgP.
The two CPUs boot from two different binary image files. First, the SP CPU takes control
and loads an image from flash memory. Then, it passes control to the RP image (along with
passing control of the console port) so that it can boot another image and take over as the
primary CPU for the entire device.
Nonvolatile RAM
As with all Cisco devices, there must be some place to store the conguration while the
device is powered down. Almost all Cisco devices use Nonvolatile RAM (NVRAM) for this
purpose. However, the Native IOS Mode MSFC is somewhat unique in that it maintains two
sets of NVRAM configurations:
The VLAN database
The local switch configuration
Although it might appear strange at first to have two different sorts of NVRAM
information, it makes complete sense upon closer inspection. Consider that each NVRAM
repository is storing a different type of information. The VLAN database contains
information that is global to the entire network. As information is added to this list, it should
be immediately (or almost immediately) saved and shared through the entire VTP domain.
(Assume that the current switch is a VTP server; for more information, see Chapter 12,
"VLAN Trunking Protocol.") Also, as VTP advertisements are received from other devices,
they should immediately be saved (recall that the definition of a VTP server requires that
all VLANs be saved to NVRAM). On the other hand, the switch configuration is locally
significant. Furthermore, these normal IOS configuration statements are only supposed to
be saved when the user enters the copy run start or write memory commands. After all,
one of the benefits of the IOS is that you can make all the changes you want and return to
the exact place you started from by merely rebooting the box (assuming that you did not
save the new configuration).
Therefore, rather than trying to interleave the two sets of data, two completely different stores
are maintained. The VLAN database holds globally significant information that gets saved
right away (as it would under the XDI/CatOS interface). The switch configuration stores
locally significant information only when the user enters some form of a save command.
Configuration Modes
The split in NVRAM storage also corresponds to two different configuration modes:
VLAN database configuration mode
Normal IOS configuration mode
NativeMode(vlan)# exit
APPLY completed.
Exiting....
NativeMode#
To use the VLAN database configuration mode, enter the vlan database command from
the EXEC prompt. This places you in the VLAN database mode and modifies the prompt
to indicate this (the prompt now consists of the RP's name and the word vlan in
parentheses). As shown by the online help in Example 18-9, the vtp command can be used
to control the VTP configuration of a device. For example, set vtp client and set vtp
transparent change a Catalyst to client and transparent modes, respectively. The device
can be returned to the default of server mode with the command set vtp server. The
command set vtp domain Skinner changes the VTP domain name to Skinner. There are
other commands to control VTP features such as passwords, pruning, and version 2 support.
The vlan command shown in Example 18-9 can be used to add, delete, or modify VLANs.
For example, the command vlan 2 name Marketing creates VLAN 2. Then, issuing the
command vlan 2 name Finance changes the VLAN's name from Marketing to Finance.
Entering no vlan 2 removes VLAN 2 altogether. When creating or modifying VLANs,
additional parameters can be specified to control attributes such as MTU and media type.
Example 18-10 shows a list from the online help screen.
After making changes, you can apply them. At this point, the changes are saved to the VLAN
database section of NVRAM (therefore, VLAN changes made under Native IOS Mode are
not as immediate as those made under the XDI/CatOS interface). The exit command can be
used to first apply the changes to the database and then return to the EXEC mode.
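Pulling these commands together, a short VLAN database session might look like the following sketch (the command forms echo the text; the output lines are illustrative):

```
NativeMode# vlan database
NativeMode(vlan)# set vtp domain Skinner
NativeMode(vlan)# vlan 2 name Marketing
NativeMode(vlan)# apply
APPLY completed.
NativeMode(vlan)# exit
APPLY completed.
Exiting....
NativeMode#
```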
The exit command can also be used in normal IOS configuration mode to move one level
at a time up towards the EXEC mode. To jump straight from a
submode to EXEC mode, use the end command or press Ctrl-Z. In normal IOS
configuration mode, commands are applied to the running configuration as soon as you
press the Enter key. However, commands are only saved to NVRAM when you enter copy
run start or write mem.
TIP The Native IOS Mode and the XDI/CatOS user interfaces use the terms interface and port
differently. In Native IOS Mode, Layer 3 external connections are called interfaces, and
Layer 2 connections are referred to as ports (actually, switchports). Under XDI/CatOS, the
term interface is used to only refer to the management entities such as SC0, SL0, and ME1
(in the case of SC0 and SL0, these are logical ports that you cannot see and touch; in the
case of ME1, it is a physical port, but it cannot be used for end-user traffic). XDI/CatOS
uses the term port to refer to all external points of connection (these are ports you can
physically touch). This chapter uses the terms interface and port interchangeably.
Also notice that the Native IOS Mode software numbers interfaces starting from 1, not 0
(in other words, the first interface is 1/1, not 0/0). Although this is different from other IOS
devices, it is consistent with the rest of the Catalyst platform.
In a similar fashion, you receive the following error message if you try to assign both
interfaces to the same IPX network number:
%IPX network 0A000100 already exists on interface FastEthernet5/1
NOTE Note that this behavior is the same throughout Cisco's entire router product line. It is not
some strange thing cooked up just for the Catalyst 6000 Native IOS Mode.
Example 18-11 Placing Two Interfaces in the Same VLAN (Default = VLAN 1)
NativeMode# configure terminal
NativeMode(config)# interface FastEthernet5/1
NativeMode(config-if)# switchport
NativeMode(config-if)# interface FastEthernet5/2
NativeMode(config-if)# switchport
NativeMode(config-if)# end
NativeMode#
Switchports automatically default to VLAN 1 (although this assignment is not made until
after the switchport command has been entered). To alter this assignment, you can use
additional switchport commands. First, decide if you want the interface to be an access
port (one VLAN) or a trunk port (multiple VLANs using ISL or 802.1Q). This section looks
at access ports (trunk ports are discussed later). To create an access port, first enter the
switchport mode access command on the interface. Then enter switchport access vlan
vlan-identifier to assign a VLAN. For example, Example 18-12 assigns 5/1 and 5/2 to
VLAN 2.
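Because Example 18-12 itself does not appear here, the following is a hedged reconstruction of what such a configuration typically looks like:

```
NativeMode# configure terminal
NativeMode(config)# interface FastEthernet5/1
NativeMode(config-if)# switchport
NativeMode(config-if)# switchport mode access
NativeMode(config-if)# switchport access vlan 2
NativeMode(config-if)# interface FastEthernet5/2
NativeMode(config-if)# switchport
NativeMode(config-if)# switchport mode access
NativeMode(config-if)# switchport access vlan 2
NativeMode(config-if)# end
```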
However, trying to assign an IP address to 5/1 or 5/2 at this point does not work. If you try
this, the RP outputs the following message:
% IP addresses may not be configured on L2 links.
If you think about it, this makes complete sense: these two interfaces have been converted
to Layer 2 ports that do not directly receive Layer 3 IP and IPX addresses. This is the same
restriction placed on other Layer 2 ports. For example, you cannot apply IP addresses to
ports in the XDI/CatOS Catalyst configuration. Also, when using bridging on IOS-based
Cisco routers, you cannot assign Layer 3 addresses to the same interface that contains a
bridge-group (the software lets you make the assignment, but the interface is now routing
that protocol, not bridging it; see Chapter 11 for more information on bridge-groups).
NOTE Notice that this is the same as the RSM. In fact, it is this automatic linkage between Layer
3 SVI VLAN interfaces and Layer 2 switchports that makes the Native IOS Mode such an
attractive vehicle for configuring campus networks.
830 Chapter 18: Layer 3 Switching and the Catalyst 6000/6500s
To create an SVI, simply enter the interface vlan vlan-identifier command. This places you
in interface mode for that VLAN (the prompt changes to RP_Name(Config-if)#) where you
can configure the appropriate Layer 3 information. For example, Example 18-13 creates and
configures VLANs 1 and 2 with IP and IPX addresses.
NOTE Although all ports are assigned to VLAN 1 by default, the VLAN 1 SVI does not exist by
default. To assign Layer 3 attributes to VLAN 1, you must create this SVI.
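Example 18-13 itself is not reproduced in this excerpt. A sketch of such a configuration follows; the specific IP addresses and IPX network numbers are assumptions chosen for illustration (IPX network 0A000100 is borrowed from the error message shown earlier in this section):

NativeMode# configure terminal
NativeMode(Config)# interface vlan 1
NativeMode(Config-if)# ip address 10.0.1.1 255.255.255.0
NativeMode(Config-if)# ipx network 0A000100
NativeMode(Config-if)# interface vlan 2
NativeMode(Config-if)# ip address 10.0.2.1 255.255.255.0
NativeMode(Config-if)# ipx network 0A000200
NativeMode(Config-if)# end
NativeMode#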
Therefore, in total, the MSFC Native IOS Mode uses four port/interface types as
summarized in Table 18-2.
[Figure 18-3 artwork: a Catalyst 6000 virtual router (RP) with routed ports 1/1 and 1/2, and SVIs for VLANs 2 and 3; the VLAN 2 SVI is configured with interface vlan 2 and ip address 10.0.2.1 255.255.255.0]
Native IOS Mode Configuration 833
In Figure 18-3, the Gigabit Ethernet ports on the Supervisor (1/1 and 1/2) have
been configured as fully routed interfaces. Slot 5 contains a Fast Ethernet line card.
Ports 5/1-5/3 have been configured as Layer 2 switchports in VLAN 2. Ports 5/4-5/6 have
been configured as switchports in VLAN 3. Port 5/10 has been configured as an 802.1Q
trunk. Summarized configuration examples are also illustrated in the figure. As shown in
Figure 18-3, the Native IOS mode centers around the concept of a virtual router. Physically
routed ports such as Gigabit Ethernet 1/1 and 1/2 directly connect to the virtual router. In
the case of switchports, they connect to the router via an SVI. The SVI acts as a logical
bridge/switch for the traffic within that VLAN. It is also assigned the Layer 3 characteristics
of that VLAN for the purpose of connecting to the virtual router. Trunk
links use the magic of ISL and 802.1Q encapsulation to simultaneously connect to multiple
VLANs and SVIs.
The steps for completing the MSFC Native IOS Mode configuration are as follows:
Step 1 Assign a name to the router
Step 2 Configure VTP
Step 3 Create the VLANs
Step 4 Configure the Gigabit Ethernet uplinks as routed interfaces
Step 5 Configure the VLAN 2 switchports
Step 6 Configure the VLAN 3 switchports
[Figure artwork: the Catalyst 6000 with routed Gigabit Ethernet uplinks 1/1 (10.100.100.1) and 1/2 (10.200.200.1) connecting to the RP, plus an 802.1Q trunk port]
TIP The apply command can be used to apply the VTP and VLAN changes without leaving the
vlan database mode.
TIP When creating similar configurations across many different ports, use the interface range
command discussed in the Useful Native IOS Mode Commands section later in this chapter.
This assigns 77 ports to VLAN 2 in one step, a huge time saver! In short, Example 18-24 is
equivalent to the set vlan 2 5/1-5,6/1-24,7/1-48 command on XDI/CatOS-based Catalysts.
Notice that interface range commands do not directly appear in the configuration. Instead, the
results of Example 18-24 will appear in the output of show running-config on each of the
77 separate interfaces it references.
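Although Example 18-24 is not reproduced in this excerpt, a sketch of what it presumably contains can be inferred from the set vlan equivalence quoted above (the exact module/port ranges are assumptions taken from that command):

NativeMode(Config)# interface range FastEthernet5/1 - 5, FastEthernet6/1 - 24, FastEthernet7/1 - 48
NativeMode(Config-if-range)# switchport
NativeMode(Config-if-range)# switchport mode access
NativeMode(Config-if-range)# switchport access vlan 2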
TIP Be sure to use the interface range command when configuring Native IOS Mode devices.
Users accustomed to the show port and show trunk XDI/CatOS commands will find
familiar ground in the enhancements to the show interface syntax. For example, the
XDI/CatOS command show trunk has been ported to the show interface trunk command as
shown in Example 18-25.
Useful Native IOS Mode Commands 839
Many of the show port XDI/CatOS commands have also been ported. For example,
Example 18-26 displays information on counters with the show interface counters
command.
Notice that Example 18-26 uses the module module-number argument to filter the output
to only show information for module 1. This option exists on most of the new
switching-oriented show interface commands.
TIP Also use the powerful Output Modifier feature introduced in IOS 12.0. Simply terminate
any show command with a pipe symbol (|) and specify a pattern to match (regular
expressions are supported!). There are options to include and exclude lines (including an
option to output all text found after the first match). The slash (/) key can also be used to
search for text.
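For instance, the include and begin options might be used as follows (the commands and search patterns here are chosen purely for illustration):

NativeMode# show running-config | include switchport
NativeMode# show running-config | begin interface Vlan2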
Example 18-27 displays the output of the errors option to show interfaces counters.
The show interfaces counters trunk command can be used to show the number of frames
transmitted and received on trunk ports. Encapsulation errors are also included (use this
information to check for ISL/802.1Q encapsulation mismatches).
The show vlan command has also been ported to the Native IOS Mode. In fact, as shown
in Example 18-28, it is almost identical to the XDI/CatOS version of this command.
VLAN Type SAID MTU Parent RingNo BridgeNo Stp BrdgMode Trans1 Trans2
---- ----- ---------- ----- ------ ------ -------- ---- -------- ------ ------
1 enet 100001 1500 - - - - - 0 0
2 enet 100002 1500 - - - - - 0 0
3 enet 100003 1500 - - - - - 0 0
1002 fddi 101002 1500 - 0 - - - 0 0
1003 tr 101003 1500 - 0 - - srb 0 0
1004 fdnet 101004 1500 - - - ieee - 0 0
1005 trnet 101005 1500 - - - ibm - 0 0
NativeMode#
Review Questions
This section includes a variety of questions on the topic of campus design implementation.
By completing these, you can test your mastery of the material included in this chapter as
well as help prepare yourself for the CCIE written and lab tests.
1 In what sort of situation would a Catalyst 6000/6500 using XDI/CatOS software and
no MSFC daughter-card be useful?
2 What Layer 3 switching configuration is used by the MSM?
3 The MSM connects to the Catalyst 6000 backplane via what type of interfaces?
4 What disadvantages are there in having an entire network running in 100BaseX full-
duplex mode?
One disadvantage can be cost. Every full-duplex device needs to attach to its own dedicated
switch port. You cannot have multiple full-duplex devices attached to the same port. If you
have many devices, you need many ports, which equates to increased cost.
Another disadvantage can be congestion at the servers. As you increase the number of
devices operating in full-duplex mode, higher amounts of bandwidth hit your servers. This
amount can greatly exceed the bandwidth capacity of the server connection.
5 Can a Class II repeater ever attach to a Class I repeater? Why or why not?
You should not attach a Class I repeater to a Class II repeater because this violates the
latency rules of Fast Ethernet. When you have a Class I repeater, there can be only one
repeater.
6 What is the smallest Gigabit Ethernet frame size that does not need carrier extension?
The need for the carrier extension bytes is driven by the slotTime. Gigabit Ethernet uses a slot
time of 4096 bits. This equates to 512 bytes. Therefore, any frames of 512 bytes or larger do
not need carrier extension, whereas all frames less than 512 bytes MUST have carrier extension.
1 Examine Figure 2-15. How many broadcast and collision domains are there?
There are 14 collision domains. Each port on a bridge, router, and switch defines a new
collision domain. You cannot easily tell how many broadcast domains exist in the network
due to the presence of the switch. A switch can create one or many broadcast domains,
depending upon how you configure it. If the switch is configured as one broadcast domain,
there are two broadcast domains. This is not a recommended solution, because both sides
of the router attached to the switch and bridge belong to the same broadcast domain. This
is not good. On the other hand, the switch can have many broadcast domains defined.
2 In Figure 2-15, how many Layer 2 and Layer 3 address pairs are used to transmit
between Stations 1 and 2?
For Station 1 to communicate with Station 2, two Layer 2 address pairs are needed. One
MAC pair is used on the top segment, and the other on the other side of the router. However,
only one Layer 3 address pair is needed end to end.
Refer to the network setup in Figure 2-16 to answer Question 3.
4 If you attach a multiport repeater (hub) to a bridge port, how many broadcast domains
are seen on the hub?
Legacy hubs have all ports in the same collision and broadcast domains, regardless of the
internetworking device they attach to.
5 Can a legacy bridge belong to more than one broadcast domain?
Generally, all ports on a legacy bridge belong to the same broadcast domain.
[Figure artwork (cl250401.eps): Cat-A and Cat-B, each equipped with dual Supervisors]
850 Appendix A: Answers to End of Chapter Exercises
[Figure artwork (cl250402.eps and cl250403.eps): additional dual-Supervisor topologies for Cat-A and Cat-B]
3 Table 4-4 shows how to recall and edit a command from the history buffer. How would
you recall and edit the following command so that you move the ports from VLAN 3 to
VLAN 4?
set vlan 3 2/1-10,3/12-21,6/1,5,7
You cannot simply use the edit ^3^4 because this changes not just the VLAN, but the port list
too. Ports 3/12-21 become 4/12-21. Rather, you could use the command ^vlan 3^vlan 4 and
be more specific about the string you are modifying. This changes only the VLAN assignment
without modifying the port values.
Answers to Chapter 5 Review Questions 851
4 What happens if you configure the Supervisor console port as sl0, and then you directly
attach a PC with a terminal emulator through the PC serial port?
The PC cannot attach to the console, because the Catalyst expects SLIP-based IP
connections through the interface. The sl0 configuration means that you access the Catalyst
console port by treating the serial line as an IP device. When you attach a PC through the
serial port and do not use the IP mode, the PC does not attempt to use its IP stack to build
the connection.
5 The Catalyst 5500 supports LS 1010 ATM modules in the last 5 slots of the chassis. Slot
13 of the 5500 is reserved for the LS1010 ATM Switch Processor (ASP). Can you use
the session command to configure the ASP?
No, you cannot use the session command. This command only works to attach to modules
through the Catalyst's switching bus, not the ATM bus. To configure the ASP, you need to
attach a console to the ASP's console port. Use IOS commands to configure the ASP and
line modules.
6 The command-line interface has a default line length of 24 lines. How can you
confirm this?
Example 4-6 in Chapter 4 shows the configuration file for a Catalyst 5000 family device.
Note the configuration command set length 24 default, which sets the screen length.
[Figure 5-22 artwork (L250521.eps): Stations A, B, C, and D attached to Cat-A and Cat-B (ports 2/5, 2/4, 2/18, and 2/8); the switches are joined by links 2/7-2/15 and 2/11-2/11; addresses 172.16.2.1 and 172.16.2.2 are shown]
Example 5-8 Cat-A and Cat-B Configurations
Cat-A> (enable) show vlan
VLAN Name Status Mod/Ports, Vlans
---- -------------------------------- --------- ----------------------------
1 default active 1/1-2
2/1-8
2 vlan2 active 2/9-24
1002 fddi-default active
1003 token-ring-default active
1004 fddinet-default active
1005 trnet-default active
Although Station A and B addresses belong to the same logical network, A and B attach to
different VLANs. This makes it very difficult for them to talk to each other. Layer 2
switches do not forward traffic between VLANs, and no router exists in the system to
interconnect the VLANs. Therefore, they cannot communicate with each other.
3 Again referring to Figure 5-22 and Example 5-8, can Station C communicate with
Station D?
No. Although Stations C and D belong to the same logical network and the same VLAN,
there is no connectivity between the switches for VLAN 2. Note that, on Cat-A, one
inter-switch port belongs to VLAN 1 and the other to VLAN 2. But for Cat-B, neither of the
inter-switch ports belong to VLAN 2. This prevents any VLAN 2 traffic from passing
between the Catalysts.
4 Are there any Spanning Tree issues in Figure 5-22?
There are no loops in the network, but VLANs 1 and 2 in Cat-A are merged into one
broadcast domain through the VLAN 1 virtual bridge in Cat-B. See Chapters 6 and 7 for a
discussion of Spanning Tree.
5 Draw a logical representation of Figure 5-22 of the way the network actually exists as
opposed to what was probably intended.
Figure A-4 presents a logical representation of the network in Figure 5-22.
[Figure A-4 artwork: the logical topology of Figure 5-22, with VLAN 1 bridged between Cat-A and Cat-B over the 2/7-2/15 link, and the two VLAN 2 segments (holding Stations C and D) left disconnected]
[Figure 6-24 artwork: Cat-1 and Cat-2, each with 100 attached PCs]
The network in Figure 6-24 contains the following: One Root Bridge, three Root Ports, and
404 Designated Ports (one per segment, including the 400 segments connected to end users).
3 When running the Spanning-Tree Protocol, every bridge port saves a copy of the best
information it has heard. How do bridges decide what constitutes the best information?
Bridges use a four-step decision sequence:
Step 1 Lowest Root BID
Step 2 Lowest Path Cost to Root Bridge
Step 3 Lowest Sender BID
Step 4 Lowest Port ID
The bridge uses this decision sequence to compare all BPDUs received from other bridges as
well as the BPDU that would be sent on that port. A copy of the best (lowest) BPDU is saved.
4 Why are Topology Change Notification BPDUs important? Describe the TCN process.
Topology Change Notification BPDUs play an important role in that they help bridges
relearn MAC addresses more quickly after a change in the active STP topology. A bridge
that detects a topology change sends a TCN BPDU out its Root Port. The Designated Port
for this segment acknowledges the TCN BPDU with the TCA flag in the next Configuration
BPDU it sends. This bridge also propagates the TCN BPDU out its Root Port. This process
continues until the BPDU reaches the Root Bridge. The Root Bridge then sets the TC flag
in all Configuration BPDUs sent for twice the Forward Delay period. As other bridges
receive the TC flag, they shorten the bridge table aging period to Forward Delay seconds.
5 How are Root Path Cost values calculated?
Root Path Cost is the cumulative cost of the entire path to the Root Bridge. It is calculated
by adding a port's Path Cost value to the Root Path Cost advertised in BPDUs received on
that port.
6 Assume that you install a new bridge and it contains the lowest BID in the network.
Further assume that this device is running experimental Beta code that contains a
severe memory leak and, as a result, reboots every 10 minutes. What effect does this
have on the network?
STP is a preemptive protocol that constantly seeks the Root Bridge with the lowest BID.
Therefore, in this network, the new bridge wins the Root War, and the entire active topology
converges on this bridge every ten minutes. Where links change state during this
convergence process, temporary outages of 30-50 seconds occur. When the bridge fails
several minutes later, the network converges on the next most attractive Root Bridge and
creates another partial network outage for 30-50 seconds.
In short, the network experiences partial outages every time the new bridge restarts and fails.
7 When using the show spantree command, why might the timer values shown on the
line that begins with Root Max Age differ from the values shown on the Bridge Max
Age line?
The values shown in the Root Max Age line are the timer values advertised in the
Configuration BPDUs sent by the current Root Bridge. All bridges adopt these values. On the
other hand, every bridge shows its locally-configured values in the Bridge Max Age line.
8 Label the port types (RP=Root Port, DP=Designated Port, NDP=non-Designated Port)
and the STP states (F=Forwarding, B=Blocking) in Figure 6-25. The Bridge IDs are
labeled. All links are Fast Ethernet.
[Figure 6-25 artwork (625.eps): Server-Farm-Cat-1 (BID 32768.00-90-92-11-11-11) at the bottom; MDF-Cat-2 (BID 32768.00-90-92-22-22-22) and MDF-Cat-3 (BID 32768.00-90-92-33-33-33) in the middle; IDF-Cat-4 (BID 32768.00-90-92-55-55-55) and IDF-Cat-5 (BID 32768.00-90-92-44-44-44) at the top]
[Answer figure artwork (0628.eps): Server-Farm-Cat-1 is the Root Bridge, and all of its ports are Designated Ports (Forwarding); MDF-Cat-2 and MDF-Cat-3 forward on their Root Ports toward the Root Bridge (all links cost 19); IDF-Cat-4 and IDF-Cat-5 forward on their Root Ports toward MDF-Cat-2 and block the non-Designated Ports on their links to MDF-Cat-3]
Server-Farm-Cat-1 becomes the Root Bridge because it has the lowest BID. MDF-Cat-2 and
MDF-Cat-3 elect Root Ports based on the lowest Root Path Cost (19 versus 38). IDF-Cat-4
and IDF-Cat-5 have two equal-cost paths (38) to the Root Bridge. Therefore, to elect a Root
Port, they have to use Sender BID as a tie breaker. Because MDF-Cat-2 has a lower Sender
BID than MDF-Cat-3, both IDF Cats select the link connecting to MDF-Cat-2 as the Root
Port. Designated Port elections are all based on Root Path Cost.
[Figure artwork (0626.eps): the network partitioned into two halves, one containing Cat-1 and Cat-2, the other Cat-6 and Cat-7]
The network is partitioned into two halves. Each half elects its own Root Bridge. There is
a partial outage of approximately 50 seconds. After the Root Bridges have been established,
connectivity resumes within the two halves, but the two halves cannot communicate.
[Figure artwork (0627.eps): PC-1 (10.1.1.1) and PC-2 (10.1.1.2)]
Answers to Chapter 6 Hands-On Lab 859
[Figure 7-30 artwork (0728.eps): Cat-A (BID 32,768.AA-AA-AA-AA-AA-AA) and Cat-B (BID 100.BB-BB-BB-BB-BB-BB) connected back-to-back on ports 1/1 and 1/2 with crossed links]
Figure A-6 provides the labels requested in Question 1 for Figure 7-30.
Answers to Chapter 7 Review Questions 861
[Figure A-6 artwork: Cat-B (BID 100.BB-BB-BB-BB-BB-BB) is the Root Bridge, with both of its ports Designated and Forwarding; on Cat-A (BID 32,768.AA-AA-AA-AA-AA-AA), Port 1/2 is the Root Port (Forwarding) and Port 1/1 is a non-Designated Port (Blocking)]
Cat-B becomes the Root Bridge because it has the lower BID. Cat-A therefore needs to
select a single Root Port. In the previous examples of back-to-back switches, the links did
not cross and Port 1/1 became the Root Port because of the lower Port ID (0x8001).
In this case, the crossed links force you to think about the fact that it is the received Port ID
that influences Cat-A, not Cat-A's local Port ID values. Although Cat-A:Port-1/2 has
the higher local value, it is receiving the lower value. As a result, Port-1/2 becomes the Root
Port. Understanding this issue is critical to using portvlanpri load balancing effectively.
2 When do bridges generate Configuration BPDUs?
Bridges generate Configuration BPDUs in the following instances:
Every Hello Time seconds on all ports of the Root Bridge (unless there is a
Physical-Layer loop).
When a non-Root Bridge receives a Configuration BPDU on its Root Port, it sends
an updated version of this BPDU out every Designated Port.
When a Designated Port hears a less attractive BPDU from a neighboring bridge.
3 When do bridges generate Topology Change Notification BPDUs?
Bridges generate Topology Change Notification BPDUs in the following instances:
A bridge port is put into the Forwarding state and the bridge has at least one
Designated Port.
A port in the Forwarding or Learning states transitions to the Blocking state.
A non-Root Bridge receives a TCN (from a downstream bridge) on a Designated Port.
4 How many Spanning Tree domains are shown in Figure 7-31? Assume that all of the
switches are using ISL trunks and PVST Spanning Tree.
[Figure 7-31 artwork (CL25 0729.eps): Cat-1 through Cat-10 separated by routers into four Layer 2 pockets; VLAN ranges shown include VLANs 101-103, VLANs 101-102, VLAN 101, and VLANs 101-115]
10 + 3 + 2 + 15 = 30 Spanning Tree domains.
Although the same numbers are used for all of the VLANs, the routers break the network
into four Layer-2 pockets (Cat-1 through Cat-3, Cat-4, Cat-5 through Cat-6, and Cat-7
through Cat-10). The VLANs in each Layer-2 pocket then form a separate STP domain.
One of the tricks in this layout is to notice that Cat-5 and Cat-6 form a single Layer-2
domain containing two, not three VLANs. Because of the backdoor links between the two
switches, the routers do not break this into separate Layer-2 pockets.
5 When is the Root Bridge placement form of STP load balancing most effective? What
command(s) are used to implement this approach?
When traffic patterns are well defined and clearly understood. In hierarchical networks such
as those adhering to the multilayer design model discussed in Chapters 14 and 15, Root
Bridge placement is an extremely effective form of STP load balancing. Simply collocate
the Root Bridge with the corresponding default gateway router for that VLAN (see
Chapters 14 and 15 for information). For non-hierarchical, flat-earth networks, load
balancing usually requires different VLANs to have server farms in different physical
locations.
When placing Root Bridges, either the set spantree priority or the set spantree root
commands can be used.
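On an XDI/CatOS Catalyst, either approach might look like the following sketch (the VLAN numbers and priority value here are chosen purely for illustration):

Cat-A> (enable) set spantree priority 100 1
Cat-A> (enable) set spantree root 2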
6 When is the Port Priority form of STP load balancing useful? What command(s) are
used to implement this approach? What makes this technique so confusing?
This form of load balancing is rarely useful. It can only be used with back-to-back switches.
It should only be used in early versions of code or when connecting to non-Cisco devices.
The set spantree portvlanpri command is used to implement this feature. This technique
can be very confusing because it requires that the set spantree portvlanpri command be
entered on the upstream switch.
7 When is the Bridge Priority form of STP load balancing useful? What command(s) are
used to implement this approach? What makes this technique so confusing?
The Bridge Priority form of STP load balancing can be useful if you are using pre-3.1 code
and cannot use Root Bridge placement (because of trafc patterns) or portvlanpri (because
the switches are not back-to-back). If you are using 3.1+ code, portvlancost is generally a
better choice. The set spantree priority command is used to implement this approach. This
technique can be confusing for several reasons:
The Bridge Priority values must be adjusted on devices that are upstream of where
the load balancing takes place.
The Bridge Priority values must not be adjusted too low or your Root Bridge
placement is disrupted.
It can be difficult to remember why each Bridge Priority was set.
8 When is the portvlancost form of load balancing useful? What is the full syntax of the
portvlancost command? What is the one confusing aspect of this technique?
The portvlancost form of load balancing is useful in almost all situations. It is the most
flexible form of STP load balancing. The full syntax of the portvlancost command is:
set spantree portvlancost mod_num/port_num [cost cost_value] [preferred_vlans]
One confusing aspect to this command is that it only allows two cost values to be set for
each port. One value is set with the portcost command and the other is set with the
portvlancost command.
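As an illustration of the syntax above (the port, cost, and VLAN values here are invented for the example), increasing the cost of port 1/2 for VLANs 2 through 10 might look like this:

Cat-A> (enable) set spantree portvlancost 1/2 cost 1000 2-10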
9 What technology should be used in place of portvlanpri?
EtherChannel.
10 What are the components that the default value of Max Age is designed to account for?
There is no need to specify the exact formula, just the major components captured in
the formula.
The default Max Age value of 20 seconds is designed to take two factors into account: End-
to-end BPDU propagation delay and Message Age Overestimate.
11 What are the components that the default value of Forwarding Delay is designed to
account for? There is no need to specify the exact formula, just the major components
captured in the formula.
The default Forward Delay value of 15 seconds is designed to take four factors into account:
End-to-End BPDU Propagation Delay, Message Age Overestimate, Maximum
Transmission Halt Delay, and Maximum Frame Lifetime.
The last two factors (Maximum Transmission Halt Delay and Maximum Frame Lifetime)
could be simplified into a single factor called "time for traffic to die out in the old topology."
12 What are the main considerations when lowering the Hello Time from the default of two
seconds to one second?
Lowering the Hello Time value can allow you to improve convergence time by lowering
Max Age or Forward Delay (you have to do this separately) but also doubles the load that
STP places on your network. Notice that load here refers to both the load of Configuration
BPDU traffic and, more importantly, Spanning Tree CPU load on the switches themselves.
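For example, lowering the Hello Time to one second on an XDI/CatOS Root Bridge might look like the following (the VLAN number is illustrative):

Cat-A> (enable) set spantree hello 1 1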
13 Where should PortFast be utilized? What does it change about the STP algorithm?
PVST+ is useful when you are trying to connect traditional PVST Catalyst devices with
802.1Q switches that only support a single instance of the Spanning-Tree Protocol.
Chapter 7 Hands-On Lab 865
MST and PVST regions cannot be connected through trunk links (MST switches only
support 802.1Q trunks, and PVST switches only support ISL trunks). However, the two
types of switches can be connected through access (non-trunk) links (although this is rarely
useful).
18 Can you disable STP on a per-port basis?
STP cannot be disabled on a per-port basis on Layer 2 Catalyst equipment such as the
4000s, 5000s, and 6000s. In fact, some Layer 3 Catalyst switches (Sup III with NFFC)
require that STP be disabled for the entire device (all VLANs).
19 Why is it important to use a separate management VLAN?
It is important to use a separate management VLAN to prevent CPU overload. If the CPU
does overload as a result of excessive broadcast or multicast traffic, the Spanning Tree
information can become out-of-date. When this occurs, it becomes possible that a bridging
loop could open. If this loop forms in the management VLAN, remaining CPU resources
are quickly and completely exhausted. This can spread throughout the network and create
a network-wide outage.
20 What happens if UplinkFast sends the fake multicast frames to the usual Cisco multicast
address of 01-00-0C-CC-CC-CC?
If UplinkFast sends the dummy frames to the usual Cisco multicast address of 01-00-0C-
CC-CC-CC, older, non-UplinkFast-aware Cisco Layer-2 devices do not flood the frames.
Therefore, this does not update bridging tables throughout the network.
[Figure artwork (CL25 0730.eps): a three-building campus; Building 1 contains MDF switches Cat-1A and Cat-1B, IDF switches Cat-1C and Cat-1D, and a server farm switch, while Buildings 2 and 3 contain Cat-2A through Cat-2D and Cat-3A through Cat-3D]
Each building contains two MDF switches (A and B) and two IDF switches (C and D). The
number of IDF switches in each building is expected to grow dramatically in the near
future. The server farm has its own switch that connects to Cat-1A and Cat-1B. The network
contains 20 VLANs. Assume that each server can be connected to a single VLAN (for
example, the SAP server can be connected to the Finance VLAN). Assume that all links are
Fast Ethernet except the ring of links between the MDF switches, which are Gigabit
Ethernet.
Be sure to address the following items: STP timers, Root Bridges, Load Balancing, failover
performance, and traffic flows. Diagram the primary and backup topologies for your design.
[Figure A-7 artwork (CL25 0732.eps): primary topology for the odd VLANs with the Root Bridge in Building 1; all ports on the Root Bridge are Designated Ports (Forwarding), each remaining switch forwards on its Root Port, and the redundant uplinks are non-Designated Ports (Blocking)]
Figure A-8 shows the same information for the even-numbered VLANs. Cat-1B is the Root
Bridge.
Figure A-8 Primary Topology for Even VLANs; Backup Topology for Odd VLANs
[Figure A-8 artwork (CL25 0733.eps): primary topology for the even VLANs with Cat-1B as the Root Bridge; each switch forwards on its Root Port and blocks the non-Designated Port on its redundant uplink]
Cat-1A is the backup Root Bridge for the even VLANs and Cat-2B is the backup Root Bridge
for the odd VLANs. Therefore, the backup topology for the odd VLANs is the same as Figure
A-7c, whereas the backup topology for the even VLANs is the same as Figure A-7b.
Answers to Chapter 9 Review Questions 869
This method is a little awkward in that it allows all VLANs and then removes the
disallowed VLANs.
To do the ISL trunk, you could follow the same process, except that the first command
would say set trunk 1/1 on isl.
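As a sketch of the allow-then-remove process described above (the port, encapsulation, and VLAN range are chosen for illustration, assuming an 802.1Q trunk):

Cat-A> (enable) set trunk 1/1 on dot1q
Cat-A> (enable) clear trunk 1/1 5-1000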
3 What is the difference between a Catalyst with two LANE modules and a two-port ATM
switch?
As an edge device, the Catalyst only switches frames between ports; the LS1010, as an
ATM switch, only switches cells. See Figure 9-8 in Chapter 9.
4 What is the difference between a VPI, a VCI, and an NSAP? When is each used?
VPI and VCI values are two parts of the address placed in the header of every cell (both
PVCs and SVCs). NSAPs are only used to build the SVC. After the SVC is built, the VPI/
VCI values are used to switch cells along the VC.
5 Assume you attached an ATM network analyzer to an ATM cloud consisting of one
LS1010 ATM switch and two Catalysts with LANE modules. What types of cells could
you capture to observe VPI and VCI values? What type of cells could you capture to
observe NSAP addresses?
All cells contain VPI/VCI values, but NSAP addresses can only be observed in cells that
carry signaling messages (such as a UNI 4.0 call SETUP message).
6 What are the three sections of an NSAP address? What does each part represent?
The following list outlines the three sections of an NSAP address and what each represents.
Prefix: What ATM switch?
ESI: What device on the ATM switch?
Selector Byte: What software component in the end station?
7 How do Catalysts automatically generate ESI and selector byte values for use with LANE?
The following list shows how Catalysts automatically generate ESI and selector byte values
for use with LANE.
LEC = MAC . **
LES = MAC + 1 . **
BUS = MAC + 2 . **
LECS = MAC + 3 . 00
8 What is the five-step initialization process used by LANE Clients to join an ELAN?
The five-step initialization process used by LANE Clients to join an ELAN is as follows:
Step 1 Client contacts LECS (Bouncer).
Step 2 Client contacts LES (Bartender).
Step 3 LES contacts Client.
Step 4 Client contacts BUS (Gossip).
Step 5 BUS contacts Client.
9 What are the names of the six types of circuits used by LANE? What type of traffic does
each carry?
The following list outlines the names of the six types of circuits used by LANE and what
type of traffic each carries.
Configuration Direct: Requests to join ELAN and NSAP of LES
Control Direct: LE_ARPs
Control Distribute: LE_ARPs that need to be flooded to all Proxy Clients
Multicast Send: Broadcast, multicast, and unknown unicast traffic that needs to
be flooded to all Clients
Multicast Forward: Broadcast, multicast, and unknown unicast traffic that is
being flooded
Data Direct: End-user data from LEC to LEC
10 What is the difference between an IP ARP and an LE_ARP?
IP ARPs are used to request the MAC address associated with an IP address.
LE_ARPs are used to request the NSAP address associated with a MAC address.
11 In a network that needs to trunk two VLANs between two Catalysts, how many LECs
are required? How many LECSs? How many LESs? How many BUSs?
LECs = 4: One per Catalyst per ELAN
LECSs = 1: One per LANE network
LESs = 2: One per ELAN
BUSs = 2: One per ELAN
12 If the network in Question 11 grows to ten Catalysts and ten VLANs, how many LECs,
LECSs, LESs and BUSs are required? Assume that every Catalyst has ports assigned to
every VLAN.
LECs = 100
LECSs = 1
LESs = 10
BUSs = 10
13 Trace the data path in Figure 9-26 from an Ethernet-attached node in VLAN 1 on
Cat-A to an Ethernet-attached node in VLAN 2 on Cat-B. Why is this inefficient?
[Figure 9-26 artwork: Cat-A and Cat-B connected across an ATM cloud; each Catalyst runs LECs, the LES and BUS for ELAN1 reside on one device, and VLAN 1 on each switch maps to ELAN1]
The traffic travels through the ELAN1 to the router where it is routed to ELAN2. It then
travels across ELAN2 to reach the node on Cat-B. This is inefficient because: 1) the router
might not be as fast as the ATM network and 2) the router was required to reassemble the
cells back into a complete packet before the routing decision could be made. After the
packet is routed, it has to again be segmented into cells.
Table 9-6 shows the LANE components that should be configured on each device.
When you are done building the network, perform the following tasks:
Test connectivity to all devices.
Turn on debug lane client all and ping another device on the network (you might need to clear the Data Direct if it already exists). Log the results.
With debug lane client all still running, issue shut and no shut commands on the atm major interface. Log the results.
Examine the output of the show lane client, show lane config, show lane server, show lane bus, and show lane database commands.
Add SSRP to allow server redundancy.
If you have multiple ATM switches, add dual-PHY support (don't forget to update your SSRP configurations).
LEC-A
Example A-1 provides a sample configuration for LEC-A.
Example A-1 Sample Configuration for LEC-A for Hands-On Lab
hostname LEC-A
!
interface ATM0
atm preferred phy A
atm pvc 1 0 5 qsaal
atm pvc 2 0 16 ilmi
!
interface ATM0.1 multipoint
lane server-bus ethernet ELAN1
lane client ethernet 1 ELAN1
!
interface ATM0.2 multipoint
lane client ethernet 2 ELAN2
!
interface ATM0.3 multipoint
lane client ethernet 3 ELAN3
!
line con 0
line vty 0 4
no login
!
end
LEC-B
Example A-2 provides a sample configuration for LEC-B.
Example A-2 Sample Configuration for LEC-B for Hands-On Lab
hostname LEC-B
!
interface ATM0
atm preferred phy A
atm pvc 1 0 5 qsaal
atm pvc 2 0 16 ilmi
!
interface ATM0.1 multipoint
lane client ethernet 1 ELAN1
!
interface ATM0.2 multipoint
lane server-bus ethernet ELAN2
lane client ethernet 2 ELAN2
!
interface ATM0.3 multipoint
lane client ethernet 3 ELAN3
!
!
line con 0
line vty 0 4
no login
!
end
LS1010
Example A-3 provides a sample configuration for LS1010.
Example A-3 Sample Configuration for LS1010 for Hands-On Lab
hostname LS1010
!
atm lecs-address-default 47.0091.8100.0000.0010.2962.e801.0010.2962.e805.00 1
atm address 47.0091.8100.0000.0010.2962.e801.0010.2962.e801.00
atm router pnni
node 1 level 56 lowest
redistribute atm-static
!
!
lane database Test_Db
name ELAN1 server-atm-address 47.00918100000000102962E801.00102962E431.01
name ELAN2 server-atm-address 47.00918100000000102962E801.00102941D031.02
name ELAN3 server-atm-address 47.00918100000000102962E801.001014310819.03
!
!
interface ATM13/0/0
no ip address
atm maxvp-number 0
lane config auto-config-atm-address
lane config database Test_Db
!
interface ATM13/0/0.1 multipoint
ip address 10.1.1.110 255.255.255.0
lane client ethernet ELAN1
!
interface ATM13/0/0.2 multipoint
ip address 10.1.2.110 255.255.255.0
lane client ethernet ELAN2
!
interface ATM13/0/0.3 multipoint
ip address 10.1.3.110 255.255.255.0
lane client ethernet ELAN3
!
interface Ethernet13/0/0
no ip address
!
no ip classless
!
line con 0
line aux 0
line vty 0 4
login
!
end
Router
Example A-4 provides a sample configuration for the router.
Example A-4 Sample Configuration for Router for Hands-On Lab
!
hostname Router
!
interface FastEthernet2/0
no ip address
shutdown
!
interface ATM3/0
no ip address
atm pvc 1 0 5 qsaal
atm pvc 2 0 16 ilmi
!
interface ATM3/0.1 multipoint
ip address 10.1.1.253 255.255.255.0
no ip redirects
lane client ethernet ELAN1
standby 1 preempt
standby 1 ip 10.1.1.254
!
interface ATM3/0.2 multipoint
ip address 10.1.2.253 255.255.255.0
no ip redirects
lane client ethernet ELAN2
standby 2 priority 101
standby 2 preempt
standby 2 ip 10.1.2.254
!
interface ATM3/0.3 multipoint
ip address 10.1.3.253 255.255.255.0
no ip redirects
lane server-bus ethernet ELAN3
lane client ethernet ELAN3
standby 3 preempt
standby 3 ip 10.1.3.254
!
router rip
network 10.0.0.0
!
ip classless
!
!
line con 0
line aux 0
line vty 0 4
login
!
end
Cat-A-RSM
Example A-5 provides a sample configuration for Cat-A-RSM.
Example A-5 Sample Configuration for Cat-A-RSM for Hands-On Lab
hostname Cat-A-RSM
!
interface Vlan1
ip address 10.1.1.252 255.255.255.0
no ip redirects
standby 1 priority 101
standby 1 preempt
standby 1 ip 10.1.1.254
!
interface Vlan2
ip address 10.1.2.252 255.255.255.0
no ip redirects
 standby 2 preempt
standby 2 ip 10.1.2.254
!
interface Vlan3
ip address 10.1.3.252 255.255.255.0
no ip redirects
standby 3 priority 101
standby 3 preempt
standby 3 ip 10.1.3.254
!
router rip
network 10.0.0.0
!
no ip classless
!
line con 0
line aux 0
line vty 0 4
login
!
end
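In Examples A-4 and A-5, each HSRP standby group elects the peer with the highest priority as active (the default priority is 100), and preempt lets a higher-priority peer reclaim the active role when it returns. A sketch of that election logic, not the actual protocol state machine:

```python
# Simplified HSRP election: highest priority wins; on a priority tie the
# higher IP address wins. (Tie-break here compares dotted-quad strings,
# which is adequate for these same-subnet examples.)
def hsrp_active(peers):
    """peers: list of (ip, priority) tuples; returns the active peer's IP."""
    return max(peers, key=lambda p: (p[1], p[0]))[0]

# Group 1 from the lab: Cat-A-RSM (10.1.1.252) has priority 101, the router
# (10.1.1.253) defaults to 100, so Cat-A-RSM answers for 10.1.1.254.
print(hsrp_active([("10.1.1.252", 101), ("10.1.1.253", 100)]))  # 10.1.1.252
```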
Answers to Chapter 10 Review Questions
The MPC cannot issue a shortcut request because it cannot establish a relationship with an
MPS. This results from the absence of an LEC to MPC binding. Notice in the last line of
the output that no LANE clients are bound to mpc2. Assuming that a valid LEC exists, you
can fix this with the lane client mpoa client command.
2 When might the ingress and egress MPS reside in the same router?
The ingress and egress MPS might reside in the same router whenever the ingress and
egress MPCs are only one router hop away along the default path. That router can then
service both the ingress and egress roles.
3 What creates the association of an MPC with a VLAN?
Because an LEC must be associated with an MPC in a Catalyst, the VLAN associated with
the LEC also associates the MPC to the VLAN.
4 Example 10-6 has the following configuration statement in it: lane client ethernet
elan_name. Where is the VLAN reference?
This is from an MPS configuration that resides on a router. The router does not associate
VLANs like a Catalyst does. Only Catalyst client interfaces need a VLAN reference to
bridge the VLAN to the ELAN. The router associates only with an ELAN.
The following lines appear in both Example 10-14 and Example 10-15: lane client
ethernet 21 elan1 and lane client ethernet 22 elan2. Is there any problem with this? Could
they both say ethernet 21? The values 21 and 22 bind those VLAN numbers to the
correct ELANs (1 and 2). Both ELANs define different broadcast domains and support
different IP subnetworks. Conventionally, then, the VLAN numbers differ. However, the
two VLAN numbers could be the same because they are isolated by a router. If, however,
they were not isolated by a router, the VLAN values could not be the same because they
would be bridged together, merging the broadcast domains.
5 If a frame must pass through three routers to get from an ingress LEC to an egress LEC,
do all three routers need to be configured as an MPS?
No. Only the ingress and egress routers need to be configured as an MPS. However, any
other intermediate routers in the default path must have at least an NHS configured. Further,
the NHS must be able to source and receive traffic through LECs.
6 Can you configure both an MPC and an MPS in a router?
Yes. The router may have both concurrently. You can elect to do this when the router
functions as an intermediate router or as an ingress/egress router, while at the same time
serving local Ethernet or other LAN connections as an MPC.
MHSRP stands for Multigroup Hot Standby Router Protocol. It is a technique that creates
two (or more) shared IP addresses for the same IP subnet. It is most useful for load
balancing default gateway trafc.
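The load-balancing effect of MHSRP can be sketched by splitting hosts between the two shared addresses; the second virtual address and the even/odd split below are illustrative, not from the lab examples:

```python
# MHSRP sketch: two standby groups on one subnet expose two virtual gateway
# addresses. Half the hosts point at one, half at the other, so outbound
# traffic splits across both routers while either router can back up the other.
gateways = ["10.1.1.254", "10.1.1.251"]  # hypothetical group-1 and group-2 virtual IPs

def default_gateway(host_id: int) -> str:
    # Simple static split: even hosts use group 1, odd hosts use group 2.
    return gateways[host_id % 2]

assignments = [default_gateway(h) for h in range(4)]
print(assignments)  # ['10.1.1.254', '10.1.1.251', '10.1.1.254', '10.1.1.251']
```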
12 What is the difference between CRB and IRB?
Although both features allow a particular protocol to be routed and bridged on the same
device, CRB does not let the bridged and routed halves communicate with each other. IRB
solves this by introducing the BVI, a single routed interface that all of the bridged interfaces
can use to communicate with routed interfaces in that device.
13 When is IRB useful?
When you want to have multiple interfaces assigned to the same IP subnet (or IPX network,
AppleTalk cable range, and so on), but also want to have other interfaces that are on
different IP subnets. The interfaces on the same subnet communicate through bridging. All
of these interfaces as a group use routing to talk to the interfaces using separate subnets.
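In the style of the configuration examples earlier in this appendix, a minimal IRB sketch (the interface names, addresses, and bridge group number are hypothetical): Ethernet0 and Ethernet1 are bridged into one subnet through the BVI, while Serial0 remains a separately routed subnet.

```
! Hypothetical IRB configuration: Ethernet0 and Ethernet1 are bridged
! together, and BVI1 gives that bridge group a single routed interface
! so the bridged hosts can reach the routed Serial0 subnet.
bridge irb
bridge 1 protocol ieee
bridge 1 route ip
!
interface Ethernet0
 bridge-group 1
!
interface Ethernet1
 bridge-group 1
!
interface BVI1
 ip address 10.1.4.1 255.255.255.0
!
interface Serial0
 ip address 10.1.5.1 255.255.255.0
```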
14 What are some of the dangers associated with mixing bridging and routing?
In a general sense, mixing the two technologies can lead to scalability problems.
Specically, it merges multiple Spanning Trees into a single tree. This can create Spanning
Tree instability and defeat load balancing. It can lead to excessive broadcast radiation. It
can make troubleshooting difcult. In general, it is advisable to create hard Layer 3 barriers
in the network to avoid these issues.
15 What is the benefit of using the IEEE and DEC Spanning-Tree Protocols at the same
time? Where should each be run?
Both protocols can be used to avoid the Broken Subnet Problem. IEEE must be run on the
Layer 2 Catalysts (they only support this variation of the Spanning-Tree Protocol). The
IOS-based routers therefore need to run the DEC or VLAN-Bridge versions.
4 Assume that you have a switched network with devices running IGMP version 1 and the
switches/routers have CGMP enabled. One of the multicast devices surfs the Web
looking for a particular multicast stream. The user first connects to group 1 and finds it
isn't the group that he wants. So he tries group 2, and then group 3, until he finally finds
what he wants in group 4. Meanwhile, another user belongs to groups 1, 2, and 3. What
happens to this user's link?
The user's link continues to carry traffic from all four multicast groups until there are no
members in the broadcast domain for those groups. CGMP and IGMP version 1 cannot
remove a user from a multicast stream until there are no more active members of the group.
This stems from the implicit leave function of IGMP version 1. This can create a bandwidth
problem for the user because he might have four multicast streams hitting his interface.
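The implicit-leave behavior amounts to reference counting on group membership; a sketch of the scenario above (the user names and group numbers are illustrative):

```python
# IGMPv1/CGMP sketch: a group's traffic stays on the segment until NO member
# of the broadcast domain remains joined, because IGMPv1 has no explicit
# leave message (the "implicit leave").
members = {1: {"surfer", "other"}, 2: {"surfer", "other"},
           3: {"surfer", "other"}, 4: {"surfer"}}

def groups_on_wire(members):
    """Groups whose traffic is still forwarded onto the shared segment."""
    return sorted(g for g, users in members.items() if users)

# The surfer moves on from groups 1-3, but "other" still belongs to them,
# so all four streams keep hitting the link:
for g in (1, 2, 3):
    members[g].discard("surfer")
print(groups_on_wire(members))  # [1, 2, 3, 4]
```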
Layer 2 cores are not as scalable as Layer 3 cores. Tuning Spanning Tree and load balancing
in a Layer 2 core can be tricky. In many cases, physical loops should be removed to improve
failover performance.
11 How should a server farm be implemented in the multilayer model?
As another distribution block off of the core. Workgroup servers can attach to MDF or IDF
switches (depending on what users they serve).
[Figure: campus design with per-building distribution blocks (Building 1 and Building 2) built with Layer 2 switching.]
Figure A-10 illustrates a campus design built around the multilayer model. Each
distribution block is a self-contained unit. The switching router form of Layer 3 switching
is used in the distribution layer. To maximize the potential scalability of the network, a
Layer 3 core is used.
Answers to Chapter 15 Review Questions
[Figure A-10: campus design with distribution blocks for Buildings 1 through 3; labels include "Layer 2", "Layer 3", and "Every Switch Participates in All 12 VLANs".]
7 Is a Catalyst 6000 running Native IOS Mode software more of a routing switch or a
switching router?
The flexibility of the Native IOS Mode interface allows the Catalyst 6000 to function as
either type of device. Because it is based on switching hardware, it has a wide variety of
Layer 2 features and functions. However, because both CPUs are running full IOS images,
it inherits the attributes shared by virtually all Cisco routers. By configuring most of the
ports as switchports, the box takes on a very routing switch-like feel. However, if you leave
the interfaces at their default (where every interface is a routed port), the box looks like a
switching router. At some point, the difference doesn't matter and the discussion drops off
into a meaningless debate of semantics. Don't let the flexibility of the MSFC Native IOS
Mode leave you in a situation of brain lock. Instead, simply take advantage of its benefits.
INDEX
Symbols
practical applications, 679
! (bang), 90–91 restricting SPT, 664
? (question mark), accessing help system, 93 See also switching routers
Numerics A
5/3/1 rule, 40 AAL (ATM Adaptation Layer), 351
10BaseT, Manchester encoding, 14 access layer closets, 612
12-port EtherChannel, 309310 See also distribution layer
24-port EtherChannel, 309310 access links, 302303
80/20 rule, 45, 605 EtherChannel, 307312
100BaseFX, 18 scalability, 303
100BaseT2, 1718 access lists
100BaseT4, 17 Layer 3, 615
100BaseTX, 17 legacy networks, 127
100BaseX, 1920, 2324 MLS, 478482
802.10 encapsulation, 326328 access methods
1000BaseCX Gigabit Ethernet, 29 Catalyst, 88
1000BaseLX Gigabit Ethernet, 28 CSMA/CD, 67
1000BaseSX Gigabit Ethernet, 28 access switchport interfaces, 828829
1000BaseT, 29 accessing
1900/2800 series (Catalyst), configuring, 109 Catalyst 5000s
3000 series, configuring, 109 via Telnet, 87–88
5000 series (Catalyst) via TFTP, 88
CLI, 89–90 via console, 86
configuring, 84–89 help system (Catalyst), 92–93
8500s activating Catalyst permit lists, 87
CEF (Cisco Express Forwarding), 502503 active peers (HSRP), placement, 516518
comparing to MLS (Catalyst 5000), 506512 active topologies
design scenario, 774 convergence, 209210
IDF Supervisor configuration, 779–791 load balancing, 235
IP addressing, 777 Spanning Tree, 192195
MDF Supervisor configuration, 792–797 manual Root Bridge placement, 195–196
server farms, 779 programming Root Bridge priority,
Spanning Tree, 775776 197198
trunks, 779 adaptive cut-through switching, 61, 63
VLAN design, 776 adding NSAP to ATM switches, 395
VTP, 778 addressing
EtherChannel, 506 ATM (Asynchronous Transfer Mode)
HSRP, load balancing, 676 hard-coded addresses, 403
Layer 3 switching, 503505, 774799 NSAPs, 356358, 388389
MDF switches, 626628 VPI/VCI addresses, 355356, 358
MLS, 628629
C
designing
cabling campus-wide VLANs model, 617620
100BaseTX, 17 multilayer model, 622626
carrier sense, 7 requirements, 613
collision enforcement signals, 7 router and hub model, 616617
Fast Ethernet discontiguous subnets, 693695
100BaseFX, 18 distribution layer, 693
100BaseT2, 1718 HSRP (Hot Standby Router Protocol), 513518
100BaseT4, 17 active peer placement, 516518
100BaseTX, 17 standby peers, 514
fiber mode, selecting, 703 IDF (Intermediate Distribution Frame),
fiber-optic, multimode fiber, 18 606608
IDF (Intermediate Distribution Frame), 606 access layer, 693
access layer, 612 cost, 608
hubs, 608 hubs, 608
Layer 2 Vs, 661 reliability, 608
reliability, 608 requirements, 607608
requirements, 607608 IP addressing, 637638
MDF (Main Distribution Frame), 609 link bandwidth, 638
distribution layer, 612 load balancing, 235237
high availability, 610 ATM, 677
Layer 2 switches, 610 Bridge Priority, 238241
Layer 2 triangles, 661 EtherChannel, 677678
MLS switches, 611 HSRP, 676677
redundancy, 609 IP routing, 677
requirements, 610 management VLANs
VTP domains, 637 isolating, 652
multiple access, 6 loop-free, 665666
sharing, 6 numbering, 651
troubleshooting, 702703 reassigning SC0, 652
calculating MDF (Main Distribution Frame)
End-To-End BPDU Propagation Delay, high availability, 610
252253 Layer 2 switches, 610
Ethernet slotTime, 12 MLS switches, 611
Forward Delay value, 258 redundancy, 609
Message Age Overestimate, 253254 requirements, 610
multicast addresses, 575 migration, overlay approach, 686687
packets/second rate, 13 multilayer model
campus networks core, 629633, 635
80/20 rule, 605 hierarchical structure, 623
ATM (Asynchronous Transfer Mode) Layer 2 loops, 666
practical applications, 681682 resiliency, 654
SONET, 351352 routing, 614615
core layer, 613, 656
filtering, 58 filtering, 58
Layer 3, 615 forwarding, 59
packets, MLS, 478482 jabber, 712
protocols, 152153 latency, 127
FIM (Fabric Integration Module), 361 multicast, 9–10
flat earth VLANs, 617 pause frames, 15
flat networks, Root Bridge placement, 669 RIF (routing information field), 65, 71
flawed designs, troubleshooting, 705 runt frames, 45
flooding, 58 Token Ring, 30
multicast addresses, 267 unicast, 8,
PVST+ BPDUs, 282 looping, 163
VTP domains, 563–564 unknown unicast, 567
flow charts, transparent bridging, 56–57 full-duplex operation, 15, 27
flow control, 15
flows
HSRP, optimizing, 516–517
inter-ELAN, 423
G
masks, 479–482 GARP (Generic Attribute Registration
MPOA, 420421 Protocol), 500
FLP (Fast Link Pulse), 1617 GARP Multicast Registration Protocol. See GMRP
Flush protocol, 385 GBIC (Gigabit Ethernet Interface Converter), 29
Forward Delay, 177, 217, 256–259 GDA (Group Destination Address), 588–590
forwarding packets, switched environments, 5961 geographical limitations, collision domains, 40
Forwarding state (STP), 174176, 209 Gigabit Ethernet, 24
four-segment EtherChannel, 311312 architecture, 2526
fragment-free switching, 61, 63 carrier extension, 27
frames DISL (Dynamic ISL), 319321
ARE (all routes explorer), 65 full-duplex operation, 27
ARP (Address Resolution Protocol), 41 half-duplex mode, 27
baby giants, 323 media options, 2729
best path, 117 monitoring, 714
BPDUs (Bridge Protocol Data Units), 166 segments, bundling, 308
Configuration BPDUs, 184–186 global configuration
Max Age, 177 MPCs (Multiprotocol Clients), 435
processing, 208211 MPS (Multiprotocol Server), 432434
TCN, 186187 global database settings, VMPS, 147148
broadcast, 9, 44 globally-unique VLAN numbers, 645646
CGMP (Cisco Group Management Protocol), GMRP (GARP Multicast Registration
588589 Protocol), 500
comparing to packets, 1011 Group Destination Address. See GDA
encapsulation
DTP, 325326
ISL, 316318
TRISL, 318319
Ethernet
rates, 13
LANE, 376377
H I
half-duplex operation, 15, 27 IDF (Intermediate Distribution Frame), 606608
Happy Homes case study access layer, 612
alternative designs, 773 campus networks, designing, 693
Layer 3 switching, 725772 cost, 608
switching routers, 774799 hubs, 608
hard-coded addresses (LANE), 403 Layer 2 Vs, 661
hardware redundancy, 608
broadcast suppression, 595596 reliability, 608
manufacturing flaws, 705 requirements, 607608
MLS (MultiLayer Switching), design scenario, STP (Spanning Tree Protocol), 206207
728729 Supervisor configuration
MSFC (Multilayer Switch Feature Card), 8500s, 779791
814815 MLS design scenario, 735749
headers UplinkFast, 265
802.10 encapsulation, 323, 328 VLAN definition, 113, 116
ISL (Inter-Switch Link), 317 See also MDF
Hello Time, 177, 217, 259260, 670 IEEE 802.1p, 322323, 500
help system (Catalyst), 9293 IEEE 802.1Q, 121122, 322323
hierarchical design configuring, 324
campus networks, multilayer model, 623 Spanning Tree, 123, 324
MLS, implementing, 507 tag scheme, 323
Root Bridge placement, 224225 IEEE 802.3, FLP (Fast Link Pulse), 1617
high availability, MDF, 610 IEEE 802.3z (Gigabit Ethernet), 24
history buffer (XDI/CatOS), 9091 architecture, 2526
hosts carrier extension, 27
membership queries, 576 Gigabit Ethernet media options, 2729
membership reports, 576 IEEE 802.10 encapsulation, 326329
MPOA (Multiprotocol Over ATM), 423425 IGMP (Internet Group Management Protocol),
HSRP (Hot Standby Router Protocol), 513518 576577
active peer placement, 516518 bridge tables, modifying, 586587
configuring, syntax, 518 configuring, 584
load balancing, 627, 676677, 798 host membership queries, 579
MHSRP (Multigroup HSRP), 798 membership reports, 578
standby peers, 514 messages, 574
testing, 515 non-querier routers, 581
See also MHSRP Snooping, 499501, 594, 608
HSSI port adapter, power overloads, 467 static multicast configuration, 585586
hubs, 608 version 2, 579
autonegotiation, 1617 interoperability with version 1, 581583
Token Ring architecture, 31 membership queries, 580
Hybrid mode, MSFC (Multilayer Switch Feature ILMI (Integrated Local Management Interface), 361
Card), 814815
commands modularity
show mls, 486 multilayer campus model, 623
show mls entry, 486487 routing, 678679
show mls rp, 488 modules
show mls statistics entry, 487 configurations, viewing, 9798
show mls statistics protocol, 487488 Supervisor
comparing to switching routers, 506–512 configuration files, saving, 102–103
design scenario, 725–727 images, importing, 105
hardware selection, 728–729 redundancy, 106
IDF Supervisor configuration, 735–749 standby mode, 106
IP addressing, 730–731 Supervisor Engine Software, 104
IPX addressing, 731 synchronizing, 107
load balancing, 733 Supervisor III, syntax, 102
MDF Supervisor configuration, 750–772 monitoring
Spanning Tree, 734–735 CPU performance, 650
trunks, 733 Gigabit Ethernet, 714
VLAN design, 729 MPOA (Multiprotocol over ATM)
VTP server mode, 732 clients, 437438
flow masks, 479482 servers, 435
hierarchical designs, 507 SPT (Spanning Tree Protocol),
IGMP Snooping, 499501 patterns, 663664
Layer 3 partitions, 509510 VTP activity, 560
MDF (Main Distribution Frame), 611 MPCs (Multiprotocol Clients), 423
multiple router ports, 489492, 497498 global configuration, 435437
NDE (NetFlow Data Export), 501 MPS (Multiprotocol Server), 422423
NFFC (NetFlow Feature Card) MPOA (Multiprotocol Over ATM), 119120,
central rewrite engines, 477 334336, 419, 685686
internal routers, 488 clients, monitoring, 437438
practical applications, 679 configuring, 429431
protocol filtering, 499 control flows, 420421
QoS (Quality of Service), 501 devices
Root Bridge placement, 669 edge, 424425
Spanning Tree, 492495 host, 423424
Root Bridge placement, 661 LECS (LAN Emulation Configuration Server),
stacking, 495497 database configuration, 431432
standards compliance, 476 MPS (Multiprotocol Server)
theory of operations, 469–478 global configuration, 432–434
WAN links, 488–489 major interface configuration, 434
mobility monitoring, 435
multilayer network users, 639 sample configuration, 439–441
VLAN users, 128–131, 645 subinterface configuration, 434
modes troubleshooting, 441
switching, 61 MPS (Multiprotocol Server), 422–423
VTP, configuring, 544–545, 559–560 global configuration, 432–434
modifying bridge tables, 586–587 major interface configuration, 434
subinterface configuration, 434
NHRP (Next Hop Router Protocol), 427429 OSPF (Open Shortest Path First),
LAGs (Logical Address Groups), 429 multicast frames, 10
non-Designated Ports, 211 OUI (Organizational Unique Identier), 8
non-querier routers (IGMP), 581 out-of-band networks, loop-free management
non-routable protocols, 660 VLANs, 665666
normal aging (MLS), 478 Output Modifier feature (IOS 12.0), 839
NORMAL mode, Catalyst 5000, 84 overhead connections, building, 391–392
NSAPs (Network Service Access Points), 356–358, overhead protocols, ATM (Asynchronous Transfer
388–389, 686 Mode)
adding to ATM switch, 395 ILMI (Integrated Local Management
well-known, 408409 Interface), 361
numbering PNNI (Private Network-Network Interface),
interfaces (Native IOS mode), 828 361362
VLANs, 645647, 651 overlapping Spanning Tree domains, 620
NVRAM (Non-Volatile RAM) overlay approach, campus migration, 686687
MSFC configurations, 824825
normal IOS configuration, 826–827
VLAN database configuration, 825–826
storing, 546547
P
saving storage space, 235 packet-by-packet switching, 117
packets
candidate packets, 472474
O comparing to frames, 1011
CPU processing, 649650
obtaining VTP domain membership, 542544 dropping, 650
one-link-per-VLAN approach filtering
(router-on-a-stick), 491492, 455456 Layer 3, 615
optimizing MLS, 478482
HSRP flows, 516517 enable packets, 474475
Max Age (SPT), 268 PAgP (Port Aggregation Protocol), disabling,
Root Port, 265 271273
Spanning Tree timers, 669670 paper documentation, maintaining, 700
OR operations, 310 parameters, restoring default values, 100
organizing troubleshooting "buckets", 701 partitions, Layer 3 switching
cabling, 702703 MLS (Multilayer Switching), 509510
configuration, 703704 modularity, 678679
miscellaneous problems, 704705 passwords
OSI model (Open Systems Interconnection) Catalysts, 696
Layer 2, multicast addresses, 575 enable, setting, 101
Layer 3 recovering, 101
deploying in VLANs, 133135 RSM, recovering, 464
legacy networks, 133 security, 100102
multicast addresses, 575
switching, 118
troubleshooting method, 705707
VLAN user mobility, 129131
three-layer model, campus network design, 611612 over legacy networks, 125
throughput simulating, 364
ATM cells, 346347 suppressing, 594597
bridged VLANs, 526527 capturing, 125
switching, 458459 collisions
timers 5/3/1 rule, 40
Max Age, tuning, 250256 late, 40
STP, 176178, 216218, 669670 runt frames, 45
Token Ring, 29 combining types, 651652
comparing to Ethernet, 7778, 318319 control traffic, 651
components, 31 end-user, 651
early token release, 30 Ethernet LANE, frame format, 376377
hub architecture, 31 filtering, 152153
migrating to Ethernet, 78 flow control, 15, 516517
ring monitor, 30 Gigabit Ethernet, monitoring, 714
SRB (source-route bridging), 6466 ISL encapsulation, 317318
SRT (source-route transparent bridging), 6668 isochronous, 362
SR-TLB (source-route translational bridging), management traffic, 651
6972 multicast, 573574
switching IGMP Snooping, 594
DRiP, 77 routing, 453
source-route, 7576 multiplexing
TrBRF, 7475 IEEE 802.1Q, 322323
TrCRF, 7374 over VLANs, 307
VLANs, 76 multiservice, ATM, 349
VTP version 2, 566567 packets, comparing to frames, 1011
topologies patterns, 207, 603605
back-to-back switches, load balancing, 231 peak utilization, 604
convergence unicasts, 7
Designated Ports, electing, 171172 VLANs, separating, 649650
improving, 250273 VTP pruning, 567569
Root Bridge, electing, 168–169 traffic shaping (ATM), 410
Root Ports, electing, 170171 translational bridging, Catalyst Token Ring
TCN BPDUs, 187191 modules, 72
loop-free, 160, 665 transparent bridging, 43, 55
STP (Spanning Tree Protocol) aging timers, 5960
convergence, 167172 filtering, 58
designing, 669 flooding, 58
domains, 220 flow chart, 5657
reachability, 218220 forwarding, 59
Root Bridge, positioning, 192195 learning process, 57
tracking VTP activity, 560 transparent devices, removing VLANs, 546
traffic traps, SNMP (Simple Network Management
80/20 rule, 605 Protocol), 741
broadcasts TrBRF (Token Ring Bridge Relay Function), 7475
ARP frames, 41 TrCRF (Token Ring Concentrator Relay
over LANE, 332333 Function), 68, 74
V
distribution, 547549
Vs (SPT), evaluating, 663664 domain management, 139
VCs (Virtual Circuits), ATM dynamic, 144148
building, 377386 end-to-end, 133
Configuration Direct, 369370 IEEE 802.1Q, 121123
Data Direct, 374 ISL (Inter-Switch Link), 316318
point-to-multipoint, 354 isolating, Layer 3 partitions, 509510
point-to-point, 354 latency, 127
verifying Layer 3,
ATM LANE configuration, 398401 deploying, 133135
DISL configuration, 321 switching, 8500s, 776777
ELAN connectivity, 442445 management VLANs
multicast router operation, 584 isolating, 664
versions, IGMP interoperability, 576, 581583 loop-free, 665666
viewing numbering, 651
configuration files, 9498 reassigning SC0, 652
history buffer, 91 MLS (Multilayer Switching), design
statistics (STP), 179183, 274275 scenario, 729
VLAN configuration, 142143 multilayer model, numbering, 648
VIP (Versatile Interface Processor), 459, 467 multiple SPT per-VLAN, 673674
virtual circuits. See VCs multiple, trunking, 653654
VLAN Trunking Protocol. See VTP multiplexing
VLANs, 129 DTP, 326
access links, 302303 IEEE 802.1Q, 322323
access lists, 127 naming conventions, 648649
affect on network administration, 128 non-routable protocols, 660
application switching, 120 numbering, 645647
bandwidth, 126 peering, restrictions, 680
bridging, 522523 popularity of, 643644
guidelines, 534 port/VLAN cost load balancing, 241248
merged Spanning Trees, 527534 port/VLAN priority load balancing, 227235
throughput, 526527 ports, assigning, 140141
BVI (Bridged Virtual Interface), 524525 pruning, 654, 662
campus-wide PVST (Per-VLAN Spanning Tree), 565
logical structure, 620 PVST+ (Per-VLAN Spanning Tree plus),
loop-free cores, 657 280282
managing, 620 redistribution, partitioned networks, 562
scalability, 207, 621 redundant links, 221222
stability, 620621 revision numbers, 552
clearing, 546 Root Bridges, specifying, 666
comparing to ELANs, 363 router-on-a-stick, 454
configuring (Catalyst), 136139 one-link-per-VLAN, 455456
creating, 139141, 546 trunking, 457462
deleting, 141142
designing, 658659