CCIE Pursuit Blog

January 21, 2008

LFU 14: Menu Lockout

I was working on a task where you need to build a router menu for the NOC.  I added this line to my (existing) menu:

r2(config)#menu NOCMENU clear-screen

I invoked my menu…

r2#menu NOCMENU

…and ended up with a blank screen (which I could not break out of).

I jumped on a different router and telnetted to r2 and found out that I had overwritten my menu with the single line:

r2(config)#do sh run | sec menu
menu NOCMENU clear-screen

No biggie.  I’ll just remove that configuration:

r2(config)#no menu NOCMENU clear-screen
% Menu NOCMENU not deleted. In use

CRAP!!!

What to do?  I thought of a few options:

1) Reload – that would work as I had not written the configuration yet.  But it would waste time.
2) Kill my session on the console port.  That would work, but then I would need to reestablish my reverse telnet from the access server.  This would be a little quicker than reloading.

r2(config)#do sh user
    Line       User       Host(s)              Idle       Location
   0 con 0                idle                 00:06:05
*514 vty 0                idle                 00:00:00 141.1.36.6

I ended up doing this instead:

r2(config)#menu NOCMENU text 1 1 END!!!
r2(config)#menu NOCMENU command 1 exit

I reverse-telnetted back in to r2 via the access server and hit “1”:

r2 con0 is now available

Press RETURN to get started.

Sweet!!!! I pressed RETURN as instructed and the curse of the evil menu was lifted!!!

I removed the menu:

r2(config)#no menu NOCMENU
r2(config)#do sh run | sec menu
r2(config)#

December 15, 2007

LFU 13: Know Your Defaults

If you reverse out something, make sure that it’s not a default.

I accidentally configured “logging monitor”.  No problem, I’ll just remove the command with “no logging monitor”.  No problem…except that “logging monitor” is a default command. 

r3(config)#logging mon
r3(config)#no logg mon
r3(config)#do sh run | i logg
no logging monitor   <-oops!!!!
logging trap debugging

By not knowing that “logging monitor” is on by default, I just changed the configuration of my router and I could lose points when the lab is graded.

If you accidentally turn on a function, do a quick “show run | i [function]”.  If it does not show up, then it is a default and you can go on your merry way.  If it does show up, then you will need to back it out.
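The same check can be done by comparing two captures of the running configuration, one taken before the accidental command and one after backing it out. This is only a rough Python sketch, not something from the lab: the function name `config_changes` is made up for illustration, and the "before" capture is an assumption (the output above only shows the "after" state).

```python
def config_changes(before: str, after: str) -> dict:
    """Return config lines that appeared or disappeared between two captures."""
    before_lines = {line.rstrip() for line in before.splitlines() if line.strip()}
    after_lines = {line.rstrip() for line in after.splitlines() if line.strip()}
    return {
        "added": sorted(after_lines - before_lines),
        "removed": sorted(before_lines - after_lines),
    }


# "before" is an assumed capture; "after" mirrors the r3 output shown above.
before = "logging trap debugging\n"
after = "no logging monitor\nlogging trap debugging\n"

changes = config_changes(before, after)
print(changes)
```

Anything listed under "added" or "removed" is a change that still needs to be backed out. An empty result in both lists means the configuration is back where it started.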

December 5, 2007

LFU 11: Interfaces Can Have Multiple IPv6 Addresses

I ran into an issue this weekend with an IPv6 address that I thought that I had removed showing up in my routing table. 

r1(config)#ipv6 unicast-routing
r1(config)#int s1/0
r1(config-if)#ipv6 address 2001::11/64
r1(config-if)#do sh ipv6 int br | e down|unass
Serial1/0                  [up/up]
    FE80::CE00:12FF:FE10:0
    2001::11

Oh crap!  I wanted to configure 2001::1/64 not 2001::11/64.

r1(config-if)#ipv6 address 2001::1/64
r1(config-if)#do sh ipv6 int br | e down|unass
Serial1/0                  [up/up]
    FE80::CE00:12FF:FE10:0
    2001::1
    2001::11

r1(config-if)#do sh run int s1/0
Building configuration…

Current configuration : 262 bytes
!
interface Serial1/0
 ip address 10.0.0.1 255.0.0.0
 encapsulation frame-relay
 ipv6 address 2001::1/64
 ipv6 address 2001::11/64
 serial restart-delay 0
 no dce-terminal-timing-enable
 frame-relay map ip 10.0.0.2 102 broadcast
 no frame-relay inverse-arp
end

Unlike IPv4, configuring a different IPv6 address on an interface will NOT overwrite the existing IPv6 address with the new IPv6 address.  This is because an interface can have multiple IPv6 addresses.

You can use “no ipv6 address” to remove all IPv6 addresses on an interface.  Then you can configure the new IPv6 address.

interface Serial1/0
 ip address 10.0.0.1 255.0.0.0
 encapsulation frame-relay
 ipv6 address 2001::1/64
 ipv6 address 2001::11/64
 ipv6 address 2001::111/64
 ipv6 address 2001::1111/64
 serial restart-delay 0
 no dce-terminal-timing-enable
 frame-relay map ip 10.0.0.2 102 broadcast
 no frame-relay inverse-arp
end

r1(config-if)#no ipv6 address
r1(config-if)#do sh run int s1/0
Building configuration…

Current configuration : 211 bytes
!
interface Serial1/0
 ip address 10.0.0.1 255.0.0.0
 encapsulation frame-relay
 serial restart-delay 0
 no dce-terminal-timing-enable
 frame-relay map ip 10.0.0.2 102 broadcast
 no frame-relay inverse-arp
end
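If wiping every address is too heavy-handed, a specific address can be removed with “no ipv6 address” followed by that address and prefix length. The Python sketch below is an illustration only (the helper name `ipv6_removal_commands` is hypothetical): it reads an interface stanza and builds the removal lines for every global address except the one to keep. Lines with a trailing keyword such as “link-local” are deliberately skipped.

```python
import re


def ipv6_removal_commands(interface_config: str, keep: str) -> list:
    """Build 'no ipv6 address ...' lines for every plain address except `keep`."""
    # Only match lines of the form "ipv6 address <prefix>" with nothing after the prefix.
    addresses = re.findall(
        r"^[ \t]*ipv6 address (\S+)[ \t]*$", interface_config, re.MULTILINE
    )
    return [
        f"no ipv6 address {addr}"
        for addr in addresses
        if addr.lower() != keep.lower()
    ]


stanza = """interface Serial1/0
 ip address 10.0.0.1 255.0.0.0
 encapsulation frame-relay
 ipv6 address 2001::1/64
 ipv6 address 2001::11/64
 ipv6 address 2001::111/64
 ipv6 address 2001::1111/64
"""

commands = ipv6_removal_commands(stanza, keep="2001::1/64")
for command in commands:
    print(command)
```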

December 1, 2007

LFU 10: No VLAN…No CDP!!!

I was working on a practice lab today and was using CDP to verify that all of the connections were correct.  I cruised through sw1 and sw2 but when I hit sw3 I started to see strange issues with CDP.
 
sw3 fa0/3 is connected to r3 fa0/1 but I cannot see a CDP neighbor:
 
sw3#sh run int fa0/3
Building configuration…
 
Current configuration : 95 bytes
!
interface FastEthernet0/3
 switchport access vlan 33
 switchport mode dynamic desirable
end
 
Port      Name               Status       Vlan       Duplex  Speed Type
Fa0/3                        connected    33         a-full  a-100 10/100BaseTX
 
sw3#sh int fa0/3
FastEthernet0/3 is up, line protocol is up (connected)
 
sw3#sh cdp neigh fa0/3
Capability Codes: R – Router, T – Trans Bridge, B – Source Route Bridge
                  S – Switch, H – Host, I – IGMP, r – Repeater, P – Phone
 
Device ID            Local Intrfce         Holdtme   Capability    Platform   Port ID
sw3#

The interface is up/up and CDP is running (it would have given me an error if it was not) but CDP does not see r3’s fa0/1 interface.

Let’s look at r3:

r3#sh run int fa0/1
Building configuration…
 
Current configuration : 95 bytes
!
interface FastEthernet0/1
 ip address 132.1.33.3 255.255.255.0
 duplex auto
 speed auto
end
 
r3#sh ip int br | i net0/1
FastEthernet0/1            132.1.33.3      YES manual up                    up
 
r3#sh cdp neigh fa0/1
Capability Codes: R – Router, T – Trans Bridge, B – Source Route Bridge
                  S – Switch, H – Host, I – IGMP, r – Repeater
 
Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
r3#

r3’s fa0/1 interface is up and up and CDP is running.  I can actually physically trace that cable to sw3’s fa0/3 interface.  So why can’t sw3 see r3 (and vice versa) via CDP?

Here’s the problem:
 
sw3#sh int fa0/3 switch
Name: Fa0/3
Switchport: Enabled
Administrative Mode: dynamic desirable
Operational Mode: static access
Administrative Trunking Encapsulation: negotiate
Operational Trunking Encapsulation: native
Negotiation of Trunking: On
Access Mode VLAN: 33 (Inactive)
Trunking Native Mode VLAN: 1 (default)
Administrative Native VLAN tagging: enabled
Voice VLAN: none

I configured the port as an access port assigned to VLAN 33, but VLAN 33 does not exist on the switch (it will once VTP does its magic).  Once VLAN 33 is configured on the switch CDP will work:

sw3#sh vlan id 33

VLAN Name                             Status    Ports
---- -------------------------------- --------- -------------------------------
33   VLAN0033                         active    Fa0/3, Po2

VLAN Type  SAID       MTU   Parent RingNo BridgeNo Stp  BrdgMode Trans1 Trans2
---- ----- ---------- ----- ------ ------ -------- ---- -------- ------ ------
33   enet  100033     1500  -      -      -        -    -        0      0

Remote SPAN VLAN
----------------
Disabled

Primary Secondary Type              Ports
------- --------- ----------------- ------------------------------------------

sw3#sh int fa0/3 switch
Name: Fa0/3
Switchport: Enabled
Administrative Mode: dynamic desirable
Operational Mode: static access
Administrative Trunking Encapsulation: negotiate
Operational Trunking Encapsulation: native
Negotiation of Trunking: On
Access Mode VLAN: 33 (VLAN0033)
Trunking Native Mode VLAN: 1 (default)
Administrative Native VLAN tagging: enabled
Voice VLAN: none

sw3#sh cdp neigh fa0/3
Capability Codes: R – Router, T – Trans Bridge, B – Source Route Bridge
                  S – Switch, H – Host, I – IGMP, r – Repeater, P – Phone

Device ID            Local Intrfce         Holdtme   Capability    Platform   Port ID
r3                  Fas 0/3               179           R S I     2651XM    Fas0/1
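The giveaway in the output above is the “(Inactive)” tag next to the access VLAN. A quick way to spot it across many ports is to scan captured “show interfaces switchport” text for that tag. The Python below is a minimal sketch under that assumption; the function name `inactive_access_vlan` is invented for illustration.

```python
import re


def inactive_access_vlan(switchport_output: str):
    """Return the access VLAN ID if the output marks it '(Inactive)', else None."""
    match = re.search(
        r"^Access Mode VLAN:\s+(\d+)\s+\(Inactive\)",
        switchport_output,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


broken = """Name: Fa0/3
Switchport: Enabled
Operational Mode: static access
Access Mode VLAN: 33 (Inactive)
"""

fixed = """Name: Fa0/3
Switchport: Enabled
Operational Mode: static access
Access Mode VLAN: 33 (VLAN0033)
"""

print(inactive_access_vlan(broken))
print(inactive_access_vlan(fixed))
```

The first call reports VLAN 33 as the missing VLAN; the second returns None because the VLAN now exists on the switch.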

November 26, 2007

LFU 9: Beware The Implied OSPF Process!!!

I ran into an interesting issue on the Internetwork Expert Volume II lab 2 this weekend.  I noticed that I had a rogue OSPF process showing up on one of my switches (3560).  I had configured OSPF process 100, but not OSPF process 17 (17 was the area that the sw1 OSPF interfaces were in):

sw1#sh ip proto sum
Index Process Name
0     connected
1     static
2     ospf 100
3     rip
4     ospf 17   <-where did this come from?
*** IP Routing is NSF aware ***

sw1#sh ip proto | b ospf
Routing Protocol is "ospf 100"
  Outgoing update filter list for all interfaces is not set
  Incoming update filter list for all interfaces is not set
  Router ID 150.1.7.7
  It is an autonomous system boundary router
  Redistributing External Routes from,
    rip, includes subnets in redistribution
  Number of areas in this router is 1. 1 normal 0 stub 0 nssa
  Maximum path: 4
  Routing for Networks:
    132.1.17.7 0.0.0.0 area 17
  Routing Information Sources:
    Gateway         Distance      Last Update
    150.1.3.3            110      00:18:43
    150.1.2.2            110      00:18:43
    150.1.1.1            110      01:44:44
    150.1.7.7            110      01:44:44
  Distance: (default is 110)

There was no other OSPF process configured.  I thought that maybe I accidentally configured “router os 17” at some point, but when I looked at the configuration all that was there was OSPF 100.  I thought that there must be some bug in the switch IOS.  The phantom OSPF process was showing up in the “show ip protocol summary” output but not in the “show ip protocol” output.  I reloaded the switch just in case.

That didn’t solve the issue.  I looked through the configuration and found the issue…under the RIP process!  This solves the mystery of the “ospf 17”:

sw1#sh run | b router rip
router rip
 version 2
 redistribute ospf 17 metric 1   <-whoops!  
 offset-list EVEN in 16 Vlan783
 network 150.1.0.0
 network 204.12.1.0
 no auto-summary

This explains my route redistribution issue as well!  I must have typed in the area number instead of the correct OSPF process (100) when redistributing OSPF into RIP.  This is the type of fat-finger issue that can make you fail your lab.  I banged my head on my desk and corrected the process ID.

sw1#sh run | b router rip
router rip
 version 2
 redistribute ospf 100 metric 1    
 offset-list EVEN in 16 Vlan783
 network 150.1.0.0
 network 204.12.1.0
 no auto-summary

Strange, IOS must assume the existence (or imminent existence) of a routing process if you reference it in a redistribution statement.  What’s even stranger is that even after I removed (changed) the reference…it still showed up:

sw1#sh ip proto sum
Index Process Name
0     connected
1     static
2     ospf 100
3     rip
4     ospf 17
*** IP Routing is NSF aware ***

WTF?  Was I being haunted by an undead OSPF process?  I verified that there was nothing left in the configuration that referred to OSPF process 17:

sw1#sh run | i 17
 ip address 132.1.17.7 255.255.255.0
interface FastEthernet0/17
 area 17 authentication
 network 132.1.17.7 0.0.0.0 area 17

So the process was not configured and there was no longer any reference to it, so why was it still showing up?

sw1#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
sw1(config)#no router ospf 17
sw1(config)#
sw1(config)#do sh ip proto sum
Index Process Name
0     connected
1     static
2     ospf 100
3     rip
*** IP Routing is NSF aware ***

Lesson learned: if you reference an OSPF process that does not exist (in RIP at least) then the router assumes that this process is actually running.  You’ll need to explicitly remove the process (“no router ospf x”) in order to remove it from the “show ip protocol summary” output.
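This kind of fat-finger can also be caught offline by comparing the OSPF process IDs referenced in “redistribute ospf” lines against the processes actually configured with “router ospf”. The Python below is a rough sketch of that idea, not a full config parser; the function name `undefined_ospf_references` is hypothetical and the sample config is trimmed down from the sw1 output above.

```python
import re


def undefined_ospf_references(running_config: str) -> set:
    """Return OSPF process IDs that are redistributed but never configured."""
    configured = set(
        re.findall(r"^router ospf (\d+)", running_config, re.MULTILINE)
    )
    referenced = set(
        re.findall(r"^[ \t]+redistribute ospf (\d+)", running_config, re.MULTILINE)
    )
    return referenced - configured


config = """router ospf 100
 network 132.1.17.7 0.0.0.0 area 17
!
router rip
 version 2
 redistribute ospf 17 metric 1
 network 150.1.0.0
"""

print(undefined_ospf_references(config))
```

Here the result contains “17”, the area number that was typed where the process ID (100) belonged.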

November 17, 2007

LFU 8: Read The Prompts!!!

ARRGHH!!!!!  Today I was saving out a running configuration on a 2651XM to flash.  Very simple thing to do.  I had plenty of space in flash:

r1#sh flash:

System flash directory:
File  Length   Name/status
  1   29631128  c2600-adventerprisek9-mz.124-10.bin
[29631192 bytes used, 20176164 available, 49807356 total]
49152K bytes of processor board System flash (Read/Write)

I went ahead and copied the running configuration to flash, but I neglected to read the prompts carefully before hitting the “enter” key:

r1#copy run flash:r1_basic_cfg
Destination filename [r1_basic_cfg]?
Erase flash: before copying? [confirm]
Erasing the flash filesystem will remove all files! Continue? [confirm] <-DOH!!!!
Erasing device… eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee

I didn’t figure on melting the entire box.  There was plenty of room in flash for the config.  I just hit enter one too many times!!!

The correct way:
r3#copy run flash:r3_basic_cfg
Destination filename [r3_basic_cfg]?
Erase flash: before copying? [confirm]n
Verifying checksum…  OK (0x1EC9)
1802 bytes copied in 4.707 secs (383 bytes/sec)

r3#sh flash:

System flash directory:
File  Length   Name/status
  1   29631128  c2600-adventerprisek9-mz.124-10.bin
  2   1802     r3_basic_cfg
[29633060 bytes used, 20174296 available, 49807356 total]
49152K bytes of processor board System flash (Read/Write)
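The erase was never needed, because the summary line in “show flash:” already shows how much room is left. As a sanity check, that line can be parsed and compared against the size of the file about to be copied. This is a minimal Python sketch; `fits_in_flash` is a made-up helper name, and it only looks at the “available” figure without accounting for any filesystem overhead.

```python
import re


def fits_in_flash(show_flash_output: str, file_size: int) -> bool:
    """Check a file size against the 'available' bytes in the flash summary line."""
    match = re.search(
        r"\[(\d+) bytes used, (\d+) available, (\d+) total\]", show_flash_output
    )
    if match is None:
        raise ValueError("no flash summary line found")
    available = int(match.group(2))
    return file_size <= available


summary = "[29631192 bytes used, 20176164 available, 49807356 total]"

# The 1802-byte config fits easily; a second copy of the IOS image would not.
print(fits_in_flash(summary, 1802))
print(fits_in_flash(summary, 29631128))
```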

Back to R1:

r1#sh ver | i IOS
Cisco IOS Software, C2600 Software (C2600-ADVENTERPRISEK9-M), Version 12.4(10), RELEASE SOFTWARE (fc1)

r1#sh flash:

System flash directory:
File  Length   Name/status
  1   1224     r1_basic_cfg
[1288 bytes used, 49806068 available, 49807356 total]
49152K bytes of processor board System flash (Read/Write)

So I have my config file, but no IOS.  At least this should be an easy fix.

I have the appropriate IOS running on r3:

r3#sh ver | i IOS
Cisco IOS Software, C2600 Software (C2600-ADVENTERPRISEK9-M), Version 12.4(10), RELEASE SOFTWARE (fc1)

I have connectivity to r3 via a PTP connection. I just need to set r3 up to be a tftp server for that image:

r3(config)#tftp-server flash:c2600-adventerprisek9-mz.124-10.bin

Now I need to copy that image to flash on r1:

r1#copy tftp flash:
Address or name of remote host []? 155.1.13.3
Source filename []? c2600-adventerprisek9-mz.124-10.bin
Destination filename [c2600-adventerprisek9-mz.124-10.bin]?
Accessing tftp://155.1.13.3/c2600-adventerprisek9-mz.124-10.bin…
Erase flash: before copying? [confirm]n <-learn from your mistakes!!!
Loading c2600-adventerprisek9-mz.124-10.bin from 155.1.13.3 (via Serial0/1):
!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[OK - 29631128 bytes]

Verifying checksum…  OK (0xC7EA)
29631128 bytes copied in 753.755 secs (39311 bytes/sec)

After getting a cup of coffee (or 3 cups – I have the PTP link set at 64K), everything is wonderful again on r1:

r1#sh flash:

System flash directory:
File  Length   Name/status
  1   1224     r1_basic_cfg
  2   29631128  c2600-adventerprisek9-mz.124-10.bin
[29632480 bytes used, 20174876 available, 49807356 total]
49152K bytes of processor board System flash (Read/Write)

Configure the boot statement and reload:

r1(config)#boot system flash:c2600-adventerprisek9-mz.124-10.bin

 

November 8, 2007

LFU 7: RTFM

Last night (actually early this morning due to my inability to translate time zones) I rented some time on Internetwork Expert’s racks.  I was assigned to rack 6.  I logged on and after doing some of the technology labs, I decided to do as much of Lab 1 from the Volume III workbook as time allowed.  I had already done the layer 2 part of this lab on my rack and figured that it was good practice to do it again and see how far I could get through the IGP configuration before I fell asleep.

One of the great things about renting a vendor’s rack is that they are set up precisely for their workbooks, right down to the interface.  On my rack I need to constantly tweak vendor provided initial configurations to match my actual interfaces.  It’s not a huge deal, but it does burn up some time when setting up for a lab.  On IE’s rack I was able to quickly load the initial configurations and start on the lab.

I tore through the layer 2 stuff.  A lot of that had to do with this being the second time that I’ve done this lab, but also because the tasks are pretty basic and unambiguous. 

I was zipping along until I hit task 3.2.  In this task you’re asked to create a point-to-point Frame Relay connection from r6 to bb1 (backbone router).  Pop on an IP address and configure Frame Relay.  Piece of cake.  That’s when the fun began:

I could not ping bb1 from r6.  I configured it correctly and I was getting a good frame map:

r6#p 54.1.2.254

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 54.1.2.254, timeout is 2 seconds:
…..
Success rate is 0 percent (0/5) 

r6#sh run | sec Serial0/0
interface Serial0/0
 no ip address
 encapsulation frame-relay
interface Serial0/0.1 point-to-point
 ip address 54.1.2.6 255.255.255.0
 frame-relay interface-dlci 100

r6#sh frame map
Serial0/0.1 (up): point-to-point dlci, dlci 100(0x64,0x1840), broadcast
          status defined, active

Weird.  I did a shut/no shut on the physical interface and was still unable to ping the bb1.  I jumped on the access server and tried to reverse telnet to bb1 but was not allowed to access it.  At this point I grabbed the answer key to verify my configuration.  I felt stupid doing this because the task was so simple and I had successfully completed it before on my own rack.  My configuration matched the answer key.  So what the hell was going on?

I verified that Frame Relay was working.  LMI, Frame Maps, Frame PVCs, interfaces were up and up…everything looked good.  Routing protocols had not been configured at this point, so it HAD to be something with the layer 2 configuration.  But what?

I had a couple of troubleshooting routes that I could take at this point.  I was leaning towards something being configured wrong on the bb1, so I decided to default r6’s s0/0 interface and then use Frame Relay Inverse-ARP on the physical interface to create a dynamic mapping.  If successful, that should allow me to verify the DLCI and IP address of bb1:

This is after I defaulted and rebuilt the s0/0 interface:
r6#sh run int s0/0
Building configuration…

Current configuration : 89 bytes
!
interface Serial0/0
 ip address 54.1.2.6 255.255.255.0
 encapsulation frame-relay
end

r6#sh ip int br | i Serial0/0
Serial0/0                  54.1.2.6        YES manual up                    up

Serial0/0.1                unassigned      YES manual deleted               down

r6#sh frame map
Serial0/0 (up): ip 54.6.1.254 dlci 101(0x65,0x1850), dynamic,
              broadcast,, status defined, active
Serial0/0 (up): ip 54.6.2.254 dlci 100(0x64,0x1840), dynamic,
              broadcast,, status defined, active
Serial0/0 (up): ip 54.6.3.254 dlci 51(0x33,0xC30), dynamic,
              broadcast,, status defined, active

Fuck me running!  The IP address on bb1 was wrong.  The second octet is 6 and should be 1:

Serial0/0 (up): ip 54.6.2.254 dlci 100(0x64,0x1840), dynamic,
              broadcast,, status defined, active

r6#sh ip int br | i Serial0/0
Serial0/0                  54.1.2.6        YES manual up                    up

My configuration matched the topology and the answer key.  I opened the initial configuration for the bb1 from Internetwork Expert’s own documentation (remember that I could not access the bb1):

interface Serial0.100 point-to-point
 description PVC 100 to Rack8
 ip address 54.1.2.254 255.255.255.0 
 ipv6 address 2001:54:1:2::254/64
 ipv6 address FE80::254 link-local
 frame-relay interface-dlci 100

Why hath thou forsaken me Internetwork Expert?  I was crushed.  I was up way past my bedtime and looking at only a couple hours of sleep before the roar of my alarm woke me to face a day at work with minimal sleep.  I would spend my few hours of sleep plotting the slow and painful demise of the Brians.  :-)

I went ahead and altered r6’s s0/0 interface IP and everything worked just fine.

I have learned a couple of things about myself over the years:

  1. I am often wrong.
  2. If I wait a bit and keep my big mouth shut, I’ll usually see the error of my ways.

Unfortunately, I usually skip step two and end up with my foot in my mouth.  In this case I thankfully saw the error of my ways before emailing IE complaining about the incorrect configuration of the bb1.

If you’ve used IE’s labs before, you’ll know that the IP addresses on the topology are written in the form of “10.x.1.0/24” where x = your rack number.  I have always used x = 1.  This has never been a problem because I use my own rack most of the time.  When I have rented racks before, it’s never been an issue either because I’ve never used the bb routers before.

In this case I was on rack 6 and had pasted in the initial configurations which used x=1 (this negates my “initial configs need no editing” statement from earlier).  This meant that IE had the bbs configured correctly (x = 6) but that ALL of my interfaces were configured incorrectly (x = 1).

ARRGHH!!!!!!!

I went ahead and changed the octet only on the connections to the bb routers.  I was getting pretty tired and it was obvious that I was not going to complete the entire lab before passing out, so I did not worry about any issues caused by the bb-injected routes using x=6.

In the end I was disappointed that I spent so much time troubleshooting this issue, but I am happy that I was able to troubleshoot the issue and eventually find the underlying problem (my idiocy).

The lesson: Read the instructions and don’t get so comfortable with a topology or routine that you don’t think/question why you’re doing something a certain way.  If this had happened to me on the actual lab, it would have sunk me.
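One way to avoid pasting x=1 configs onto a different rack is to renumber them before loading. The Python below is a heuristic sketch only, assuming the rack number always sits in the second octet: it rewrites every dotted-quad whose second octet equals the old rack number, which could also touch an address that merely happens to have that value there, so the result should be reviewed before pasting. The helper name `renumber_rack` is hypothetical.

```python
import re

# Matches dotted-quad IPv4 addresses and masks.
IPV4 = re.compile(r"\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b")


def renumber_rack(config: str, old_rack: int, new_rack: int) -> str:
    """Replace the second octet of addresses that carry the old rack number."""

    def swap(match):
        first, second, third, fourth = match.groups()
        if int(second) != old_rack:
            return match.group(0)  # leave masks and unrelated addresses alone
        return f"{first}.{new_rack}.{third}.{fourth}"

    return IPV4.sub(swap, config)


original = "interface Serial0/0.1 point-to-point\n ip address 54.1.2.6 255.255.255.0\n"
print(renumber_rack(original, old_rack=1, new_rack=6))
```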

October 25, 2007

LFU 6: Traffic Shaping Won’t Start By Itself

Frame Relay traffic-shaping tasks can be a real pain in the ass.  Make sure that you don’t skip the simple steps when tackling a complicated FRTS task. 

In this scenario I want to create a simple Frame Relay map-class and apply it to DLCI 102 on interface s1/0.  Here’s my configuration:

map-class frame-relay MYFRAMEMAP
 frame-relay tc 100
 frame-relay cir 128000
!
interface Serial1/0
 ip address 10.1.1.1 255.255.255.0
 encapsulation frame-relay
 frame-relay map ip 10.1.1.2 102 broadcast
 frame-relay interface-dlci 102
  class MYFRAMEMAP
 no frame-relay inverse-arp

Done, right?  I go to verify my traffic-shaping and get nothing, nada, zilch:

r1#sh traffic
   <-note: no output
r1#

I try a few more commands:

r1#sh traffic queue
  <-note: no output
r1#

r1#sh traffic stat
                  Acc. Queue Packets   Bytes     Packets   Bytes     Shaping
I/F               List Depth                     Delayed   Delayed   Active

Finally I stumble across the problem:

r1#sh traffic s1/0
Traffic shaping not configured on Serial1/0 dumbass!!!

Okay, so IOS didn’t actually say “dumbass”, but I know that it wanted to.  :-)

Of course I didn’t do “sh traffic s1/0” right away.  No, it took tons of swearing, adding and removing configurations, and making sure that Frame Relay was set up right before I discovered that I had not actually TURNED FRAME RELAY TRAFFIC SHAPING ON!!!

Quick fix:

r1(config)#int s1/0
r1(config-if)#frame traffic
r1(config-if)#do sh traffic

Interface   Se1/0
       Access Target    Byte   Sustain   Excess    Interval  Increment Adapt
VC     List   Rate      Limit  bits/int  bits/int  (ms)      (bytes)   Active
103           56000     875    7000      0         125       875       -
102           128000    2000   128000    0         125       2000      - <-booyah!!!

Don’t shoot yourself in the foot after mastering the fine art of FRTS.  Be sure to turn “frame-relay traffic-shaping” on for your interface.
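A simple pre-flight check for this mistake is to look at the interface stanza: if a map-class is applied but “frame-relay traffic-shaping” is absent, the shaping parameters will not take effect. The Python below is a small sketch of that check; `shaping_missing` is an invented name, and it only understands the fully spelled-out commands, not abbreviations like “frame traffic”.

```python
def shaping_missing(interface_config: str) -> bool:
    """True if a map-class is applied but FRTS is not enabled on the interface."""
    lines = [line.strip() for line in interface_config.splitlines()]
    uses_map_class = any(
        line.startswith("class ") or line.startswith("frame-relay class ")
        for line in lines
    )
    shaping_enabled = "frame-relay traffic-shaping" in lines
    return uses_map_class and not shaping_enabled


stanza = """interface Serial1/0
 ip address 10.1.1.1 255.255.255.0
 encapsulation frame-relay
 frame-relay map ip 10.1.1.2 102 broadcast
 frame-relay interface-dlci 102
  class MYFRAMEMAP
 no frame-relay inverse-arp
"""

print(shaping_missing(stanza))
print(shaping_missing(stanza + " frame-relay traffic-shaping\n"))
```

The first call flags the original configuration; the second passes once the missing command is added.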

October 11, 2007

LFU 5 – Know Your Technologies

I was doing some BGP confederation labs today and spent a ton of time trying to troubleshoot a next-hop issue.  The segment that I was having trouble on was something like this:

 r4            r2                    r3
AS 1 ------ AS 65002 ------------ AS 65013
        Confederation ID 2    Confederation ID 2

r2 and r3 are BGP confederation peers in BGP confederation ID 2.  So r4 peers to remote-as 2 (eBGP connection) then r2 peers with r3 using remote-as 65013 (eBGP connection).  There was a subnet on r4 that I was advertising into BGP (let’s say lo0 10.100.4.4) that was not getting installed into the route table of r3 because r3 did not have a route to the next-hop value of that route.  That next-hop was showing up as r4’s connection to r2.

I went over my configs again and again.  I compared “sh ip bgp” tables and “sh ip bgp neigh x.x.x.x advertised” output, but I simply could not see why the route on r3 still had r4’s address as the next-hop.

My logic went like this:

r4 advertises the route to r2.  Because this is an eBGP peering r4 modifies the next-hop value to itself. [note: in this case the next-hop modification did not matter as the route was originated on r4, but you get the idea]

r2 advertises the route to r3, but because this peering between confederation peers is treated as an eBGP peering, it will modify the next-hop value to itself.

r3 gets the route and sees it as valid and best because it has a route (directly connected in this case) to the next-hop of r2.  I should be able to see that route installed on r3.

Well, try as I might I could not get this to work.  If I installed a static route for r4 to go to r2, it worked.  If I configured “neighbor [r3's address] next-hop-self” on r2, it worked. 

Since confederations are meant to get around the iBGP rules and treat the peerings as eBGP peerings, why the hell wasn’t r2 changing the next-hop value to itself?  Arrgh!!!

I finally gave up and did a Google search (I know – I should use the DOC!) thinking that I’d find some weird IOS bug.  The only “bug” was with me:

Q. Do eBGP sessions between confederations modify the next hop?

A. No, eBGP sessions between confederation sub-ASes does not modify the next hop attribute. All iBGP rules still apply to have the whole AS behave as a single entity. The metric and local preference values also remain unaltered among confederation eBGP peers. Refer to the BGP Confederation section of BGP Case Studies for more information about confederations.
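The rule in that answer can be written down as a tiny decision function. The Python below is a deliberately simplified model, not how IOS implements it: it ignores corner cases such as locally originated routes and peers on a shared subnet, and the function name, the session labels, and the two IP addresses are all made up for illustration.

```python
def advertised_next_hop(
    session_type: str,
    received_next_hop: str,
    own_address: str,
    next_hop_self: bool = False,
) -> str:
    """Simplified model of which next hop a router advertises to a peer."""
    if next_hop_self or session_type == "ebgp":
        # True eBGP (or an explicit next-hop-self) rewrites the next hop.
        return own_address
    if session_type in ("ibgp", "confed-ebgp"):
        # iBGP rules apply inside the confederation: next hop is left alone.
        return received_next_hop
    raise ValueError(f"unknown session type: {session_type}")


# Hypothetical addresses: r4's link address and r2's address facing r3.
r4_link = "10.0.24.4"
r2_to_r3 = "10.0.23.2"

print(advertised_next_hop("confed-ebgp", r4_link, r2_to_r3))
print(advertised_next_hop("confed-ebgp", r4_link, r2_to_r3, next_hop_self=True))
```

The first call leaves r4’s address in place, which is what r3 was seeing; the second shows why adding next-hop-self on r2 made the route usable.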

It was my misunderstanding of BGP confederations that was the culprit.  I should have known my techno

September 25, 2007

LFU 4 – Fat Fingers Can Doom You

I was doing a NAT lab today and came to a dead stop because I couldn’t get BGP to work between two routers.  R4 and R5 share two links: a PTP serial link (155.1.45.0/24) and a PTP Frame Relay link (155.1.0.0/24).  I was running OSPF as an IGP and everything was fine until I found that BGP was not working:

r4#sh ip bgp sum
BGP router identifier 150.1.4.4, local AS number 1
BGP table version is 1, main routing table version 1

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
150.1.5.5       4     2       0       0        0    0    0 never    Active

r5#sh ip bgp sum
BGP router identifier 150.1.5.5, local AS number 2
BGP table version is 1, main routing table version 1

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
150.1.4.4       4     1       0       0        0    0    0 never    Active

I went over the BGP config on both routers and couldn’t find any issues:

r4#sh run | sec bgp
router bgp 1
 no synchronization
 bgp router-id 150.1.4.4
 bgp log-neighbor-changes
 neighbor 150.1.5.5 remote-as 2
 neighbor 150.1.5.5 ebgp-multihop 255
 neighbor 150.1.5.5 update-source Loopback0
 no auto-summary

r5#sh run | sec bgp
router bgp 2
 no synchronization
 bgp router-id 150.1.5.5
 bgp log-neighbor-changes
 neighbor 150.1.4.4 remote-as 1
 neighbor 150.1.4.4 ebgp-multihop 255
 neighbor 150.1.4.4 update-source Loopback0
 neighbor 150.1.4.4 default-originate
 no auto-summary

I issued “clear ip bgp *” multiple times on both sides.  I removed the whole BGP configuration on both routers and then re-added them.  Finally, I reloaded both routers.  I still couldn’t get BGP to work.

I debugged BGP events:

r4#debug ip bgp event
BGP events debugging is on
*Sep 25 16:52:58.743: BGP: Regular scanner event timer
*Sep 25 16:52:58.743: BGP: Import timer expired. Walking from 1 to 1

r4#clear ip bgp *

*Sep 25 16:52:58.743: BGP: Regular scanner event timer
*Sep 25 16:52:58.743: BGP: Import timer expired. Walking from 1 to 1
*Sep 25 16:53:04.371: BGP: reset all neighbors due to User reset
*Sep 25 16:53:04.375: BGP(IPv4 Unicast): will wait 60s for the first peer to establish
*Sep 25 16:53:04.375: BGP(IPv6 Unicast): computed bestpaths, table version went from 1 to 1
*Sep 25 16:53:04.375: BGP(VPNv4 Unicast): computed bestpaths, table version went from 1 to 1
*Sep 25 16:53:04.375: BGP(IPv4 Multicast): computed bestpaths, table version went from 1 to 1
*Sep 25 16:53:04.375: BGP(IPv6 Multicast): computed bestpaths, table version went from 1 to 1
*Sep 25 16:53:04.375: BGP(NSAP Unicast): computed bestpaths, table version went from 1 to 1
*Sep 25 16:53:13.743: BGP: Regular scanner event timer
*Sep 25 16:53:13.743: BGP: Import timer expired. Walking from 1 to 1
*Sep 25 16:53:28.743: BGP: Regular scanner event timer
*Sep 25 16:53:28.743: BGP: Import timer expired. Walking from 1 to 1
*Sep 25 16:53:43.743: BGP: Regular scanner event timer
*Sep 25 16:53:43.743: BGP: Performing BGP general scanning
*Sep 25 16:53:43.743: BGP(0): scanning IPv4 Unicast routing tables
*Sep 25 16:53:43.743: BGP(1): scanning IPv6 Unicast routing tables
*Sep 25 16:53:43.743: BGP(IPv6 Unicast): Performing BGP Nexthop scanning for general scan
*Sep 25 16:53:43.743: BGP(1): Future scanner version: 16, current scanner version: 15
*Sep 25 16:53:43.743: BGP(2): scanning VPNv4 Unicast routing tables
*Sep 25 16:53:43.743: BGP(VPNv4 Unicast): Performing BGP Nexthop scanning for general scan
*Sep 25 16:53:43.743: BGP: Import walker start version 0, end version 1
*Sep 25 16:53:43.743: BGP: … start import cfg version = 0

I did a Google search on “BGP: Import timer expired. Walking from 1 to 1” and came across a post suggesting the following:

1) You don’t have a route to it.

2) You need ebgp-multihop but haven’t configured it. (If it’s not on a directly connected network or you’re using update-source loopback, you need ebgp-multihop)

3) (Unlikely, I suspect you’d get a different error) It’s not configured to talk BGP to you.

1 – check.  2 – check.  3 – ummm check.

Actually, number 1 was my issue.  Even though I had looked at the OSPF config, I never did my due diligence and actually verified the loopback addresses from each side of the link(s).  When I finally did that, I found my problem:

r5#sh ip route 150.1.4.4
% Subnet not in table
  <-this is a problem  :-)

Although I had glanced at the OSPF configurations, I didn’t notice my problem the first couple of times:

r4#sh run | sec ospf
router ospf 100
 router-id 150.1.4.4
 log-adjacency-changes
 network 155.1.0.4 0.0.0.0 area 0
 network 155.1.4.4 0.0.0.0 area 0  <-DOH!!! 150 not 155!!!
 network 155.1.45.4 0.0.0.0 area 0

r4(config)#router os 100
r4(config-router)#no network 155.1.4.4 0.0.0.0 area 0
r4(config-router)#net 150.1.4.4 0.0.0.0 area 0
r4(config-router)#^Z
r4#
*Sep 25 17:00:39.999: %BGP-5-ADJCHANGE: neighbor 150.1.5.5 Up
*Sep 25 17:00:41.255: %SYS-5-CONFIG_I: Configured from console by console
r4#sh ip bgp sum
BGP router identifier 150.1.4.4, local AS number 1
BGP table version is 1, main routing table version 1

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
150.1.5.5       4     2       2       2        0    0    0 00:00:12        0  <-success!!!!

My OSPF neighbors were established on each router using the router-id which was the same as the loopback address.  I didn’t think the problem through enough to realize that this meant absolutely nothing about the state of the route from each router to the other router’s loopback address.  I had fat-fingered the network address in r4’s OSPF configuration and therefore the network was never advertised into OSPF.  BGP was using the loopback address as the neighbor address.  Since it did not have an IGP route to the loopback, the BGP adjacency never established.  About 45 minutes of head-scratching later, I discovered the problem.
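The mismatch is easy to test mechanically: an OSPF network statement covers an interface only when the address and the statement agree on every bit the wildcard mask does not ignore. The Python below sketches that comparison with the standard `ipaddress` module; the function name `network_statement_matches` is hypothetical.

```python
import ipaddress


def network_statement_matches(network: str, wildcard: str, interface_ip: str) -> bool:
    """True if 'network <network> <wildcard>' would cover interface_ip."""
    net = int(ipaddress.IPv4Address(network))
    wild = int(ipaddress.IPv4Address(wildcard))
    addr = int(ipaddress.IPv4Address(interface_ip))
    care = ~wild & 0xFFFFFFFF  # bits that must match (wildcard bit = 0)
    return (net & care) == (addr & care)


loopback = "150.1.4.4"

# The fat-fingered statement versus the corrected one.
print(network_statement_matches("155.1.4.4", "0.0.0.0", loopback))
print(network_statement_matches("150.1.4.4", "0.0.0.0", loopback))
```

Running every interface address through a check like this against the configured network statements would have flagged the loopback as uncovered right away.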

Internetwork Expert advises not to use loopback addresses like 1.1.1.1 (r1) because it is pretty easy for one of the BBC routers to use those types of address and inject some not-so-fun troubles into your lab.  On the other hand, if your loopback addresses are very similar to your active interface networks, it becomes pretty easy to mistype a network statement which will lead to problems like the one that I had.  It also makes it a bit more difficult to find the mistyped statement(s) when you’re quickly trying to troubleshoot.
