Be on Guard This Spooky Spanning Tree Season


It’s Halloween: a time for too much candy, scary movies, kids in fun costumes, and lots of tricks and treats. As I thought about what to write for my blog this month, I quickly went to one of the scariest things for every network engineer: SPANNING TREE!!!! That’s right… can anything else bring the same level of dread and cold sweats as the potential for a bridging loop?!

Fear not. With a bit of good practical design and configuration practices, spanning tree doesn’t have to be scary. However, even the best engineers (or moderately decent ones like myself) can forget a best practice or two. Let me set the spooky scene for you…

It was a dark and stormy night…

The following anecdote took place about three or four years ago when I was part of the DevNet Sandbox team. We had recently stood up a new data center for hosting labs, and I had returned home from California after spending several weeks onsite, standing up the network and systems at the data center. I was feeling pretty good about how well things had gone, particularly the speed and efficiency with which we had been able to bring things online, thanks to a heavy amount of automation and programmability. In retrospect, I should have known something was going to go wrong…

I think the first sign there might be a problem in the network was when I noticed my remote connection into the new location started to get really laggy. I even got disconnected from some servers. It would clear up fairly quickly. But when the issues repeated several times, I started to wonder what might be the cause.

I checked other monitoring systems. Intermittent network issues had recently started showing up: slow response from systems, occasional disconnects that would clear up fairly quickly, that sort of thing. Nothing overly drastic, but they certainly were symptoms that indicated something might not be completely healthy in the network. I began to poke around a bit more. Eventually, I stumbled across a few things that pointed to a possible issue somewhere in the layer 2 parts of the network.

It was quite a while ago, so the details are a little fuzzy. I think I was on one of the top-of-rack Nexus 9000 switches in a hardware hosting rack when syslog messages hit the terminal about MAC flapping occurring. Now, MACs will move around a network occasionally. However, a flapping MAC address happens when a switch sees it changing back and forth between two ports. This is not normal. It usually points to a network loop, which is exactly what spanning tree is supposed to prevent.

Here is an example syslog message related to MAC flapping:

*Apr 5 18:17:43.242 GMT: %SW_MATM-4-MACFLAP_NOTIF: Host d8e6.a5cd.3f41 in vlan 61 is flapping between port Ethernet1/23 and port Ethernet1/24
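If you collect syslog centrally, messages like this one are easy to pick out with a script. The following is a minimal sketch in Python, not a tool from the original troubleshooting session. The regular expression is modeled only on the single example message above, so treat the exact format as an assumption: other platforms and software releases may word the message differently. The function name `parse_macflap` is made up for illustration.

```python
import re

# Pattern modeled on the one example message in this post (an assumption);
# adjust it for the exact format your platform emits.
MACFLAP_RE = re.compile(
    r"%SW_MATM-4-MACFLAP_NOTIF: Host (?P<mac>[0-9a-fA-F.]+) "
    r"in vlan (?P<vlan>\d+) is flapping between port (?P<port_a>\S+) "
    r"and port (?P<port_b>\S+)"
)


def parse_macflap(line):
    """Return a dict describing the flap, or None if the line does not match."""
    match = MACFLAP_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["vlan"] = int(info["vlan"])  # VLAN ID as a number for easy filtering
    return info


sample = (
    "*Apr 5 18:17:43.242 GMT: %SW_MATM-4-MACFLAP_NOTIF: Host d8e6.a5cd.3f41 "
    "in vlan 61 is flapping between port Ethernet1/23 and port Ethernet1/24"
)
flap = parse_macflap(sample)
print(flap)
```

The two port names are the useful part: they tell you which links to trace when hunting for the loop.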

After a bit more troubleshooting, I also noticed that the network was reconverging spanning tree, changing the root bridge again and again. This was definitely a problem. Even “rapid” spanning tree convergence is noticeable to network users who find themselves waiting for a port to transition to forwarding after ports change state.

Explore how Loop Detection Guard prevents network loops on Catalyst 9000 switches. Read “Preventing Network Loops! A Feature You Need to be Aware of” now.

Enough of the trick already, Hank… where’s the treat?

Long story short, the root of the problem (pun TOTALLY intended) was a new physical switch that was being added to the network for one of the hardware labs we were setting up.

The new switch hadn’t been fully configured for its new role yet, and the upstream switches it was connected to already had the ports enabled in preparation for the new lab gear being added. The lab topology had several ports connected between this new switch and the data center fabric for different purposes and networks, but none of the final configuration had been applied yet. There were actually some remnants of old configuration applied to the switch, which resulted in the bridging loop and MACFLAP log messages.

Furthermore, this switch had previously served as the spanning tree root in another network and had a lower (i.e., better) priority than the actual spanning-tree root in our data center. Between connections being made/removed, ports getting errdisabled for different reasons, and other instabilities, the root was bouncing between this new switch and the main distribution switches in the data center every couple of minutes.

I was able to quickly stop the problems from happening by shutting down the ports connected to this new switch until it was correctly configured and ready to be made an active part of the network. So, problem solved… kinda.

The bigger problem was that I had neglected the important spanning tree design and best practices for the configuration step in bringing the new data center network up and online. Had I remembered my fundamentals, this problem wouldn’t have occurred: the network would have automatically blocked ports that were behaving in unexpected ways.

You are NOT root: Preventing unexpected root bridges with root guard

Consider this very simple triangle of switches as a quick review of the importance of the root bridge in a spanning-tree network.

Switches connected together with layer 2 links use BPDUs (bridge protocol data units) to learn about each other and determine where the “root” of the spanning tree will be located. The switch that has the best (i.e., lowest) priority becomes root. With the root bridge identified, switches begin the process of breaking loops in the network by blocking ports that spanning tree identifies as having the worst priority on redundant links.
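The election rule can be sketched in a few lines of Python. This is a simplified model rather than a protocol implementation: it compares bridge IDs only, where a bridge ID is the priority followed by the switch MAC address, and the lowest value wins (the MAC address breaks ties between equal priorities). The priorities and the first two MAC addresses mirror the CLI outputs in this post; the switch-2 MAC address is invented for illustration.

```python
from typing import NamedTuple


class Bridge(NamedTuple):
    name: str
    priority: int
    mac: str  # dotted hex notation, e.g. "5254.000e.dde8"


def bridge_id(bridge):
    """Bridge ID as a comparable tuple: priority first, MAC as tie-breaker."""
    return (bridge.priority, int(bridge.mac.replace(".", ""), 16))


def elect_root(bridges):
    """The bridge with the numerically lowest bridge ID becomes root."""
    return min(bridges, key=bridge_id)


bridges = [
    Bridge("root", 16385, "5254.000e.dde8"),
    Bridge("switch-1", 32769, "5254.0017.ae37"),
    Bridge("switch-2", 32769, "5254.0099.0001"),  # hypothetical MAC address
]
print(elect_root(bridges).name)

# A switch carrying a lower priority takes over as soon as it joins.
with_intruder = bridges + [Bridge("bad-root", 4097, "5254.001e.82a2")]
print(elect_root(with_intruder).name)
```

The second call shows why an unplanned switch with leftover configuration is dangerous: nothing in the election itself distinguishes an intended root from an accidental one.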

A full discussion on the spanning-tree process for building the tree is out of scope for this blog post. It is an important topic for network engineers to understand, so I might return to spanning tree in future blog posts. If you’d like to dive deeper into the topic now, check out our CCNA and ENCOR courses.

The process of electing the root bridge and converging on a loop-free network can take tens of seconds to even a minute (or more) in large networks, depending on which version of spanning tree is used and how well the network is designed. During the process of convergence, the network prevents bridging loops by defaulting to blocking traffic on ports. This will result in significant disruption to any users and applications that are actively using the network. Remember in my example above, how my network access had gotten “laggy” and my connections had even become disconnected? As long as the root bridge remains stable and does NOT change, adding a new switch to a network is a non-disruptive activity.

So, how does a network engineer prevent the root bridge from changing in the network? I’m glad you asked.

Identifying the root bridge for the network

The first step is to look at the network design and identify which switch makes the most logical sense to be the root, explicitly configuring it to have the best (i.e., lowest) priority. Here, I configure my root switch to run rapid per-VLAN spanning tree (rapid-pvst) and set the priority to 16384.

root#show run | sec spanning

spanning-tree mode rapid-pvst
spanning-tree extend system-id
spanning-tree vlan 1-4094 priority 16384


root#show span

VLAN0001
  Spanning tree enabled protocol rstp
  Root ID    Priority    16385
             Address     5254.000e.dde8
             This bridge is the root
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    16385  (priority 16384 sys-id-ext 1)
             Address     5254.000e.dde8
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec
             Aging Time  300 sec

Interface           Role Sts Cost      Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Gi0/1               Desg FWD 4         128.2    P2p 
Gi0/2               Desg FWD 4         128.3    P2p 
Gi0/3               Desg FWD 4         128.4    P2p 

Note: With “per-VLAN spanning tree,” each VLAN will have its own spanning tree built. The priority of each bridge is the configured priority plus the VLAN number. So for VLAN 1, the priority is 16384+1 or 16385.
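The arithmetic in the note can be written out as a short Python sketch. It assumes the extended system ID is in use (as in the configuration above), in which case the configurable priority has to be a multiple of 4096 between 0 and 61440 and the VLAN ID fills the remaining bits. The function name `effective_priority` is made up for illustration.

```python
def effective_priority(configured_priority, vlan_id):
    """Priority shown in the bridge ID: configured value plus the VLAN ID."""
    # With the extended system ID, only multiples of 4096 are configurable.
    if configured_priority % 4096 != 0 or not 0 <= configured_priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 from 0 to 61440")
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    return configured_priority + vlan_id


print(effective_priority(16384, 1))   # the configured root bridge, VLAN 1
print(effective_priority(32768, 1))   # a switch left at the default priority
print(effective_priority(16384, 61))  # the same root bridge in VLAN 61
```

Because the VLAN ID is added on every switch, the relative ordering of the switches stays the same in every VLAN that uses the same configured priorities.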

If we look at the spanning-tree state on one of the other switches in the network, we can confirm the root bridge and the creation of a loop-free network.

switch-1#show span

VLAN0001
  Spanning tree enabled protocol rstp
  Root ID    Priority    16385
             Address     5254.000e.dde8
             Cost        4
             Port        2 (GigabitEthernet0/1)
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    32769  (priority 32768 sys-id-ext 1)
             Address     5254.0017.ae37
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec
             Aging Time  300 sec

Interface           Role Sts Cost      Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Gi0/1               Root FWD 4         128.2    P2p 
Gi0/2               Desg FWD 4         128.3    P2p 
Gi0/3               Altn BLK 4         128.4    P2p 

switch-1#show cdp neighbors gigabitEthernet 0/1

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
root             Gig 0/1           146             R S I            Gig 0/1

If you compare the address of the root bridge shown on switch-1 to the output above from root, you will see that the Address and Priority for the root bridge match. Also, notice that interface Gi0/1 has the role of “Root”: this is the interface on the switch that has the best path back to the root bridge. And as the output from CDP shows, it’s actually directly connected to the root.

Stopping a new root on the block… err, network

Identifying an intended root bridge for your network is great, but it doesn’t prevent a newly added switch from causing trouble.

Think back to my anecdote, where a new switch that had previously been configured as the root in another network was being added to the network. While it could be argued that it’s best practice and important to clear old configuration from a switch before adding it to the network, the reality is… things like this happen. It is important to engineer a network to handle events like this.

First, let’s see what happens to the spanning-tree network when bad-root is cabled into the network without any further configuration protecting the spanning-tree network.

switch-1#show span

VLAN0001
  Spanning tree enabled protocol rstp
  Root ID    Priority    4097
             Address     5254.001e.82a2
             Cost        4
             Port        1 (GigabitEthernet0/0)
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    32769  (priority 32768 sys-id-ext 1)
             Address     5254.0017.ae37
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec
             Aging Time  300 sec

Interface           Role Sts Cost      Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Gi0/0               Root FWD 4         128.1    P2p 
Gi0/1               Desg FWD 4         128.2    P2p 
Gi0/2               Desg FWD 4         128.3    P2p 
Gi0/3               Altn BLK 4         128.4    P2p 


switch-1#show cdp neighbors gigabitEthernet 0/0

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
bad-root         Gig 0/0           154             R S I            Gig 0/1

Total cdp entries displayed : 1

Notice how the address and priority for the root bridge have changed, and that port Gi0/0 is now the “Root” port for switch-1. This is definitely not what we would want to happen if a bad-root were connected to the network.

Bringing out the Guard… root guard, that is

We can leverage root guard to prevent this from happening. Root guard is one of the “optional spanning-tree features” that really shouldn’t be considered “optional” in most network designs.

As a network engineer, you should be able to look at your network and know which port “should be” the root port on each switch. Then consider the redundancy that you’ve built into the network and identify which port should become the root port if the primary port were to have problems. Every other port on each switch should never become the root port. Those are the ports that should be configured with root guard.

Note: The root bridge in a network has NO root ports, as it is the root of the tree. Therefore ALL PORTS of the root bridge should have root guard enabled.
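The selection rule from the last two paragraphs is simple enough to express as a set difference. The sketch below is only an illustration in Python: the function name is made up, and the choice of Gi0/1 as the intended root port and Gi0/3 as the intended backup on switch-1 is an assumption for the example, not a statement about the lab design.

```python
def root_guard_candidates(all_ports, allowed_root_ports):
    """Ports that should never become root port, in their original order."""
    allowed = set(allowed_root_ports)
    return [port for port in all_ports if port not in allowed]


switch_ports = ["Gi0/0", "Gi0/1", "Gi0/2", "Gi0/3"]

# Assumed design for switch-1: Gi0/1 is the primary root port, Gi0/3 the backup.
print(root_guard_candidates(switch_ports, ["Gi0/1", "Gi0/3"]))

# The root bridge has no root port at all, so every port is a candidate.
print(root_guard_candidates(["Gi0/1", "Gi0/2", "Gi0/3"], []))
```

A list like this can feed whatever automation you use to push interface configuration, so the guard is applied when the network is built rather than remembered afterwards.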

Now we’ll go ahead and enable root guard on interface Gig0/0 on both switch-1 and switch-2.

switch-1(config)#interface gigabitEthernet 0/0
switch-1(config-if)#spanning-tree guard root 

*Oct 13 15:06:28.893: %SPANTREE-2-ROOTGUARD_CONFIG_CHANGE: Root guard enabled on port GigabitEthernet0/0.
*Oct 13 15:06:28.909: %SPANTREE-2-ROOTGUARD_BLOCK: Root guard blocking port GigabitEthernet0/0 on VLAN0001. 

And look at that. As soon as it is enabled, we see syslog messages indicating that root guard has begun blocking the port. If we check the status of spanning tree on switch-1, we can verify that the root of the spanning tree has returned to the correct root switch.

switch-1#show span

VLAN0001
  Spanning tree enabled protocol rstp
  Root ID    Priority    16385
             Address     5254.000e.dde8
             Cost        4
             Port        2 (GigabitEthernet0/1)
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    32769  (priority 32768 sys-id-ext 1)
             Address     5254.0017.ae37
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec
             Aging Time  300 sec

Interface           Role Sts Cost      Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Gi0/0               Desg BKN*4         128.1    P2p *ROOT_Inc 
Gi0/1               Root FWD 4         128.2    P2p 
Gi0/2               Desg LRN 4         128.3    P2p 
Gi0/3               Altn BLK 4         128.4    P2p  
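The behavior seen on Gi0/0 can be modeled as a small decision function. This Python sketch is a deliberately simplified model, not the switch implementation: it looks only at the root bridge ID advertised in a received BPDU and ignores path cost, timers, and recovery once the superior BPDUs stop. The function and state names are made up for illustration, while the bridge IDs are the ones from the outputs in this post.

```python
def mac_to_int(mac):
    """Convert dotted hex notation such as '5254.000e.dde8' to an integer."""
    return int(mac.replace(".", ""), 16)


def port_reaction(root_guard_enabled, bpdu_root_id, current_root_id):
    """Simplified reaction of a port to the root ID carried in a BPDU."""
    superior = bpdu_root_id < current_root_id  # lower bridge ID is better
    if not superior:
        return "unchanged"
    if root_guard_enabled:
        # Root guard refuses the new root and blocks the port instead.
        return "root-inconsistent"
    return "accept-new-root"


current_root = (16385, mac_to_int("5254.000e.dde8"))  # the intended root
bad_root = (4097, mac_to_int("5254.001e.82a2"))       # the unplanned switch

print(port_reaction(False, bad_root, current_root))
print(port_reaction(True, bad_root, current_root))
print(port_reaction(True, current_root, current_root))
```

The first case corresponds to the earlier output where bad-root took over, and the second to the `ROOT_Inc` state shown on Gi0/0 above.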

There’s one other command that’s helpful to know when troubleshooting spanning-tree ports that aren’t behaving as expected:

switch-1#show spanning-tree inconsistentports

Name                 Interface                Inconsistency
-------------------- ------------------------ ------------------
VLAN0001             GigabitEthernet0/0       Root Inconsistent

Number of inconsistent ports (segments) in the system : 1
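If you want to check for inconsistent ports across many switches, the table above is simple to parse. The following Python sketch assumes the three-column layout shown in this post and instance names that start with `VLAN`, as they do with per-VLAN spanning tree; other modes or software releases may print the table differently. How the text is collected from the switch is left out, and the function name is made up for illustration.

```python
def parse_inconsistent_ports(output):
    """Extract rows from 'show spanning-tree inconsistentports' text."""
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 2)  # name, interface, rest of the line
        # Skip the header, the separator, and the summary line.
        if len(parts) == 3 and parts[0].startswith("VLAN"):
            rows.append(
                {
                    "vlan": parts[0],
                    "interface": parts[1],
                    "inconsistency": parts[2].strip(),
                }
            )
    return rows


sample_output = """\
Name                 Interface                Inconsistency
-------------------- ------------------------ ------------------
VLAN0001             GigabitEthernet0/0       Root Inconsistent

Number of inconsistent ports (segments) in the system : 1
"""
rows = parse_inconsistent_ports(sample_output)
print(rows)
```

An empty list means nothing is being blocked by a guard feature, which is what you would hope to see on a healthy network.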

Take the scare out of spooky spanning tree with knowledge

Hopefully, this post helps to lower your heart rate a little the next time you think about making changes to the network that might impact your spanning-tree network. But I also hope it shows you, as a network engineer, the importance of recalling the fundamental skills and knowledge you have learned as you move onward to more specialized areas of networking. I was definitely kicking myself when I realized that I had completely neglected to ensure that our spanning-tree network was well-designed and protected from unexpected or unintended changes.

While no one wants to have a network outage or even a minor disruption, they will happen. What is important is that we learn from them. And we become better network engineers for them.

Do you have a spooky network ghost story from your own work as a network engineer? Ever had a scary encounter with a network outage or problem that helped you learn a lesson you’ll never forget? Share them in the comments. Trick or treat!

