CCIE Pursuit Blog

August 17, 2009

Internetwork Expert Volume IV (Troubleshooting) Workbook Review: Part 3

Once you get the initial configurations loaded you’re ready to begin the lab.  This is when the “fun” begins.  Those of us who are used to starting labs with barebone configurations and searching for a few misconfigurations will be in for a bit of a shock.  This is not how this troubleshooting will go.  You’ll be looking at a fully configured network…which you did not build. It was at this point that I should have realized that this would not be easy and that the 2 hour time limit – which initially sounded like all of the time in the world – would be an issue.

When I tell you that you’re looking at a fully configured network, that means things like QoS, Multicast, and IP Services.  You can start to see how difficult Cisco can make these labs.  Throw in a number of devices that you cannot access and INE’s recommendation that you only use show and debug commands, and you’re looking at a bad day on the CLI.

The lab document starts with a “Baseline” section.  This will give you a list of the devices under your control as well as details about how the network has been configured.  This is broken down by well-known sections.  For instance, Bridging and Switching might tell you which devices are STP roots, which VLANs are present, which flavor of STP is running, if and how VTP is set up, etc.  The IGP section describes the routing protocols, any route filtering, redistribution (yes, there is plenty of that), etc.

I read through the baseline and then started making network maps.  INE has some nice examples of the maps that you’ll want to build and how long it should take in the solution guide:

We recommend making your own diagrams, including the following
information:
• IP addressing + IGPs.
• Layer 2 topology.
• BGP diagram.
• IPv6 topology.
• Multicast and Redistribution diagram.

Overall, don’t spend too much time building the baseline – the goal is to spend around 20 minutes. By the end of the baseline analysis phase, you should have clear understanding of the protocols and applications deployed in your network.

It took me a LOT longer than 20 minutes to get my head around what was going on in the network.  It’s much harder to get quickly up to speed on a complex CCIE network when you haven’t built it from scratch.  😦

After getting a basic idea of what was going on, it was time to start looking at the tickets.  There are ten tickets, each with a point value between 2 to 4 points.  The total amount of points is 30 points, so each ticket will average 3 points.  Like the “classic lab”, you’ll need to fix each issue completely – no partial points are awarded.  There may also be tickets that you cannot resolve unless you’ve already fixed previous tickets.  For example, ticket 10 in lab 1 has the following requirements:

Ticket 10: Multicast
Note: Prior to starting with this ticket make sure you resolved Tickets 4 and 5

Since you need 80% to pass the troubleshooting portion of the lab, you’ll need to get at least 24 points.  This means that you can only really miss about 2 tickets (depending on point values).

Logging is turned off on the devices.  I would strongly suggest enabling logging buffered on all devices(remove the configuration before finishing the lab).  There are a number of logging messages that will point to some initial issues that you might miss if you’re not on the device when the log is generated.  This way you can issue “show log” and see what’s going on.

Another suggestion: work on the tickets that seem easy first.  Then work on any tickets that are requirements for other tickets.  Finally, work on the tougher tickets last.

I used some of my basic, initial troubleshooting habits to find a couple of issues.  In the lab – after building each section – I do basic troubleshooting.  For instance, once all Layer 2 configuration is complete, I verify that I can ping across each link.  After each IGP configuration, I verify that the proper routes are being advertised and received, as well as pinging (at least a subset) of the routes.  I would suggest putting together a “toolkit” of common commands to run on each device when approaching the troubleshooting section such as ‘show ip int br | e ass’, ‘show ip protocol’, ‘show ip [protocol] route’, etc.

Let’s look one of the (easy) tickets from Lab 1:

Ticket 4: Connectivity Issue
• Another ticket from VLAN7 users. They cannot reach any resource on VLAN 5 – all IP Phones have unregistered, and nothing else works.
• However, they are still able to reach the local resources.
• Using the baseline description as your reference, resolve this issue in optimal manner.

3 Points

You will notice this issue if you do a Layer 2 check by pinging across directly connected links.  Basically, you cannot ping from r4 to sw1 on VLAN41.  Looking at sw1, I could see that the SVI interface for VLAN41 was not up.  Sounds like an easy fix.  Make sure that VLAN 41 has been added to the VLAN database.

Actually, it VLAN41 was in the VLAN database.  The IP addressing was correct.  All of the other SVIs were up and working.  WTF?

Here is the configuration for the SVI:

interface Vlan41
ip address 164.16.47.7 255.255.255.0
ip access-group REMOTE_DESKTOP in
ip pim sparse-dense-mode
ntp broadcast client
ntp broadcast

Hmmm….I’ll bet that INE has a dastardly access-list configured.  Let’s see the configuration for that sucker:

ip access-list extended REMOTE_DESKTOP
dynamic RDP timeout 10 permit tcp any host 164.16.7.100 eq 3389
deny   tcp any host 164.16.7.100 eq 3389
permit ip any any

Oh fucking joy.  A dynamic access-list.  But I don’t see how this would be breaking connectivity, let alone keeping my SVI down.  Just to be sure I removed the ACL.  The SVI remained down, but I did catch a break when issued ‘no shut’ after readding the ACL:

*Mar  1 02:19:56.300: %SPANTREE-2-ROOTGUARD_BLOCK: Root guard blocking port FastEthernet0/14 on VLAN0041.

Interesting.  According to our baseline guidelines sw1 should be the root bridge for all active VLANs.  The initial configuration reflects this:

spanning-tree vlan 1-4094 priority 8192

Fa0/14 is a trunk link to sw2:

interface FastEthernet0/14
switchport trunk encapsulation dot1q
switchport mode trunk
spanning-tree guard root

So it looks like someone is generating a better BPDU for VLAN41.  Wanna bet that it’s either sw3 or sw4 – the two switches that are restricted?  🙂

Rack16SW2#sh spanning-tree root

Root    Hello Max Fwd
Vlan                   Root ID          Cost    Time  Age Dly  Root Port
—————- ——————– ——— —– — —  ————
VLAN0001          8193 0017.0e3f.3900        19    2   20  15  Fa0/14
VLAN0003          8195 0017.0e3f.3900        19    2   20  15  Fa0/14
VLAN0005          8197 0017.0e3f.3900        19    2   20  15  Fa0/14
VLAN0007          8199 0017.0e3f.3900        19    2   20  15  Fa0/14
VLAN0009          8201 0017.0e3f.3900        19    2   20  15  Fa0/14
VLAN0013          8205 0017.0e3f.3900        19    2   20  15  Fa0/14
VLAN0018          8210 0017.0e3f.3900        19    2   20  15  Fa0/14
VLAN0026          8218 0017.0e3f.3900        19    2   20  15  Fa0/14
VLAN0041          4137 0012.4337.1880        19    2   20  15  Fa0/17         <-NOTE!
VLAN0043          8235 0017.0e3f.3900        19    2   20  15  Fa0/14
VLAN0055          8247 0017.0e3f.3900        19    2   20  15  Fa0/14
VLAN0062          8254 0017.0e3f.3900        19    2   20  15  Fa0/14

Rack16SW2#sh spanning-tree vlan 41
VLAN0041
Spanning tree enabled protocol ieee
Root ID    Priority    4137                <-less than 8233 on sw1
Address     0012.4337.1880
Cost        19
Port        19 (FastEthernet0/17)
Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec

Bridge ID  Priority    32809  (priority 32768 sys-id-ext 41)
Address     001f.9e4a.fa00
Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec
Aging Time 300

Interface        Role Sts Cost      Prio.Nbr Type
—————- —- — ——— ——– ——————————–
Fa0/4            Desg FWD 19        128.6    P2p
Fa0/14           Desg FWD 19        128.16   P2p
Fa0/17           Root FWD 19        128.19   P2p             <-goes to sw3 not sw1!

I think that I’ve found my problem.  sw3 is advertising a lower priority for VLAN 41.  I went ahead and set the STP priority to 0 for ALL VLANs.  INE chose to only change VLAN 41.

Me:
Rack16SW1(config)#spanning-tree vlan 1-4094 priority 0

INE:
spanning-tree vlan 41 priority 0

Either way, this unblocked f0/14 for VLAN41 and restored the STP instance on that VLAN, which brought up the SVI…which brought up IP connectivity.  🙂

*Mar  1 02:33:29.608: %SPANTREE-2-ROOTGUARD_UNBLOCK: Root guard unblocking port FastEthernet0/14 on VLAN0041.

That should give you an idea of one of the easier tickets.  Even though this was a fairly easy issue, it threw me off my game because 99.99% of the time when I see an SVI down, it’s because the VLAN is not in the VLAN database.

In the final part of the review I’ll discuss the solution guide as well as my overall impressions.

7 Comments »

  1. I think that Cisco have finally defeated me with this one. I was ready, but couldn’t get a lab date. I admit that this was a good direction for the CCIE program to go in, but after one and 1/2 years of looking at lab books it look as though it would take another 6 months to get up to scratch with MPLS, edge routing, OEQ’s (the new ones covering the new technologies) and this trouble-shooting. I’d rather do my CCVP…

    Comment by Gunga_Jim — August 21, 2009 @ 3:52 am | Reply

  2. Overall I think the only major issue will be the 2 hour limit .. Will need to practice my ass off to beat this random tim elimit imposed by Cisco..

    Comment by mayurgai — August 28, 2009 @ 1:15 pm | Reply

  3. Where did you go? When is the lab scheduled? I miss reading the blog!

    Comment by Wally — September 14, 2009 @ 4:35 pm | Reply

  4. So, I bought INE’s troubleshooting workbook based on this review and tackled the first lab. To my chagrin, I immediately ran into an issue with the OSPF over frame network between R3, R4, and R5. Basically, OSPF adjacency kept failing during Exchange state between R3-R4 and R3-R5. After burning 2 sessions with this, I finally broke down and cheated by first reading through the solutions guide to see how it was handled (it wasn’t even addressed), then did “sh run”s on all three routers. I found out that R4 and R5 had frame fragmentation turned on (as shown in solutions guide), but R3 did not. As soon as I turned it on at R3, it fixed my problem. Did you run into this when you did this lab?

    Comment by jsnmngwn — September 14, 2009 @ 9:13 pm | Reply

  5. where are you i am missing you

    Comment by ciscogeek — October 8, 2009 @ 9:21 am | Reply

  6. Are you alive?

    Comment by Mike in OZ — January 25, 2010 @ 2:23 am | Reply

  7. Hi CCIE Pursuit! Where are you now man?

    Comment by Mar Apuhin — March 25, 2010 @ 6:54 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a comment

Blog at WordPress.com.