Replacing switches in a stack

Posted: 15 Jan 2014 09:08
by Th0mas
Hey guys,

I'm quite familiar with Cisco switches, but these Alcatel-Lucent boxes leave me puzzled! We have a stack of 5 switches, two of which died after a power interruption. We got replacements from Alcatel, removed the dead switches and put in the new ones. Now I have the following issue:
  • The switches are numbered correctly on the front panel, from 1 to 5.
  • show module and show chassis recognise all 5 switches in the stack.

gemeentehuis-sw-01-> show module
                                    HW    Mfg
Slot     Part-Number     Serial #   Rev   Date      Model Name
-------+--------------+------------+---+-----------+--------------------------------
CMM-3    903177-90      P1181604     05  MAR 16 2013  6450 48 PORT COPPER GE POE
CMM-4    903007-90      P0486172     07  JAN 28 2013  6450 24 PORT COPPER GE
NI-1     903177-90      P4080271     06  OCT 05 2013  OS6450-P48
NI-2     903177-90      P1181798     05  MAR 17 2013  6450 48 PORT COPPER GE POE
NI-3     903177-90      P1181604     05  MAR 16 2013  6450 48 PORT COPPER GE POE
NI-4     903007-90      P0486172     07  JAN 28 2013  6450 24 PORT COPPER GE
NI-5     903173-90      P2486721     05  JUN 28 2013  6450 24 PORT COPPER GE

gemeentehuis-sw-01-> show chassis

Chassis 1
  Model Name:                    OS6450-P48,
  Description:                   48 POE 10/100/1000 + 2 10G  + 2 1/10G STK/UPLINK,
  Part Number:                   903177-90,
  Hardware Revision:             06,
  Serial Number:                 P4080271,
  Manufacture Date:              OCT 05 2013,
  Admin Status:                  POWER ON,
  Operational Status:            UP,
  MAC Address:                   e8:e7:32:a3:7e:f8,

Chassis 2
  Model Name:                    6450 48 PORT COPPER GE POE,
  Description:                   6450 48 PORT COPPER GE POE,
  Part Number:                   903177-90,
  Hardware Revision:             05,
  Serial Number:                 P1181798,
  Manufacture Date:              MAR 17 2013,
  Admin Status:                  POWER ON,
  Operational Status:            UP,
  MAC Address:                   e8:e7:32:7c:43:28,

Chassis 3
  Model Name:                    6450 48 PORT COPPER GE POE,
  Description:                   6450 48 PORT COPPER GE POE,
  Part Number:                   903177-90,
  Hardware Revision:             05,
  Serial Number:                 P1181604,
  Manufacture Date:              MAR 16 2013,
  Admin Status:                  POWER ON,
  Operational Status:            UP,
  MAC Address:                   e8:e7:32:7c:15:b0,

Chassis 4
  Model Name:                    6450 24 PORT COPPER GE,
  Description:                   Virtual Chassis,
  Part Number:                   903007-90,
  Hardware Revision:             07,
  Serial Number:                 P0486172,
  Manufacture Date:              JAN 28 2013,
  Admin Status:                  POWER ON,
  Operational Status:            UP,
  Number Of Resets:              6
  MAC Address:                   e8:e7:32:76:65:c2,

Chassis 5
  Model Name:                    6450 24 PORT COPPER GE,
  Description:                   6450 24 PORT COPPER GE,
  Part Number:                   903173-90,
  Hardware Revision:             05,
  Serial Number:                 P2486721,
  Manufacture Date:              JUN 28 2013,
  Admin Status:                  POWER ON,
  Operational Status:            UP,
  MAC Address:                   e8:e7:32:90:ab:e0,
  • However, I cannot configure the switch ports in modules 1 and 5:

gemeentehuis-sw-01-> show vlan port 1/2
ERROR: Port does not exist

gemeentehuis-sw-01-> show vlan port 2/2
  vlan     type      status
--------+---------+--------------
     1    default   forwarding

gemeentehuis-sw-01-> show vlan port 3/2
  vlan     type      status
--------+---------+--------------
     1    default   forwarding

gemeentehuis-sw-01-> show vlan port 4/2
  vlan     type      status
--------+---------+--------------
     1    qtagged   forwarding
     2    qtagged   forwarding
     3    qtagged   forwarding
   999    default   forwarding

gemeentehuis-sw-01-> show vlan port 5/2
ERROR: Port does not exist
Where did I go wrong? How do I fix this, preferably without rebooting the entire stack, as it also connects servers and their storage? Rebooting would mean I first have to power down all my virtual machines, then power down the ESXi hosts, and only then reboot the switch.

Also, how can I view the contents of flash memory for a module other than the primary unit?

Re: Replacing switches in a stack

Posted: 15 Jan 2014 11:13
by devnull
I guess you are running 6.6.3.372 (GA), or something older than 6.6.3.431, on your 6450?
With these releases there is a known bug that sometimes causes flash problems, resulting in a corrupted flash (kbase.img missing or 0 bytes).

So you should upgrade to the latest release.

What image/miniboot/FPGA versions did the new switches come with?
Please post the output of
"show ni"
"show stack topology"

Did you downgrade them first? (My current switches all came with 6.6.4, which imho required an FPGA and miniboot downgrade before I could integrate them.)
Did you set the boot slot number?

Is there anything in the logs?

You should (according to the hardware guide) replace one switch after the other, not all at once.

What do you see on the console when you reboot e.g. slot 5 (i.e. power it down and up again)?

Access to a remote stack member:

Have a look at stack member 2:
rls 2 /flash

Have a look at stack member 3:
rls 3 /flash

Delete a file on a stack member:
rrm 3 KFfpga.upgrade_kit
rrm 2 /flash/working/xyz

Re: Replacing switches in a stack

Posted: 16 Jan 2014 10:34
by Th0mas
Many thanks for your reply!

We use the exact version you mention:

gemeentehuis-sw-01-> show system
System:
  Description:  Alcatel-Lucent 6450 24 PORT COPPER GE 6.6.3.372.R01 GA, May 14, 2012.,
...
However, it doesn't seem to be the bug you mention. The images exist and have exactly the same file sizes as the images on a working module (#2).

gemeentehuis-sw-01-> rls 1 /flash/working
drw         2048  Dec  4  2013   ./
drw         2048  Dec  4  2013   ../
-rw     15510736  Dec 31  2000   KFbase.img
-rw      2325506  Dec 31  2000   KFdiag.img
-rw      5083931  Dec 31  2000   KFeni.img
-rw      2511585  Dec 31  2000   KFos.img
-rw       597382  Dec 31  2000   KFsecu.img
-rw          705  Dec 31  2000   software.lsm
-rw         3562  Dec  3  2013   boot.cfg

gemeentehuis-sw-01-> rls 2 /flash/working
drw         2048  Dec  4  2013   ./
drw         2048  Dec  4  2013   ../
-rw     15510736  Dec 31  2000   KFbase.img
-rw      2325506  Dec 31  2000   KFdiag.img
-rw      5083931  Dec 31  2000   KFeni.img
-rw      2511585  Dec 31  2000   KFos.img
-rw       597382  Dec 31  2000   KFsecu.img
-rw          705  Dec 31  2000   software.lsm
-rw         3562  Dec  3  2013   boot.cfg

gemeentehuis-sw-01-> rls 5 /flash/working
drw         2048  Dec  4  2013   ./
drw         2048  Dec  4  2013   ../
-rw     15510736  Dec 31  2000   KFbase.img
-rw      2325506  Dec 31  2000   KFdiag.img
-rw      5083931  Dec 31  2000   KFeni.img
-rw      2511585  Dec 31  2000   KFos.img
-rw       597382  Dec 31  2000   KFsecu.img
-rw          705  Dec 31  2000   software.lsm
-rw         3562  Dec  3  2013   boot.cfg
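
For anyone who wants to repeat this comparison across a larger stack, here is a minimal Python sketch that parses pasted `rls` output and flags files that are missing, zero bytes long, or differ in size from a reference slot. The function names (`parse_rls`, `diff_listings`) are made up for this example, and the column layout is assumed to match the listings above.

```python
import re

# One regular-file line of an `rls` listing as pasted above, e.g.
# "-rw     15510736  Dec 31  2000   KFbase.img"
# (assumed layout: flags, size, month, day, year-or-time, name)
FILE_LINE = re.compile(r"^-\S+\s+(\d+)\s+\w{3}\s+\d+\s+\S+\s+(\S+)$")


def parse_rls(text):
    """Return {file name: size in bytes} for the regular files in a listing."""
    files = {}
    for line in text.splitlines():
        match = FILE_LINE.match(line.strip())
        if match:  # directory lines ("drw ...") and prompts do not match
            files[match.group(2)] = int(match.group(1))
    return files


def diff_listings(reference, other):
    """Return {file name: problem} for files that look wrong in `other`."""
    problems = {}
    for name, size in reference.items():
        if name not in other:
            problems[name] = "missing"
        elif other[name] == 0:
            problems[name] = "zero bytes"
        elif other[name] != size:
            problems[name] = f"size {other[name]} != {size}"
    return problems
```

Feeding it the listing of slot 2 as the reference and the listings of slots 1 and 5 as `other` should return an empty dict for the output above. Note that equal sizes only rule out missing or truncated files; they do not prove the contents are identical.
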
The stack topology shows no errors.

gemeentehuis-sw-01-> show stack topology
                                         Link A  Link A          Link B  Link B
NI      Role      State   Saved  Link A  Remote  Remote  Link B  Remote  Remote
                          Slot   State   NI      Port    State   NI      Port
----+-----------+--------+------+-------+-------+-------+-------+-------+-------
   1 IDLE        RUNNING    1    UP          2   StackB  UP          5   StackA
   2 IDLE        RUNNING    2    UP          3   StackB  UP          1   StackA
   3 SECONDARY   RUNNING    3    UP          4   StackB  UP          2   StackA
   4 PRIMARY     RUNNING    4    UP          5   StackB  UP          3   StackA
   5 IDLE        RUNNING    5    UP          1   StackB  UP          4   StackA
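
As a rough cross-check of "no errors", the table above can also be verified mechanically. The following Python sketch (helper names are hypothetical, and it assumes every data row has exactly the ten whitespace-separated columns shown above) checks that all stack links are UP and that each NI's link A lands on an NI whose link B points back at it, i.e. that the ring is closed.

```python
def parse_topology(text):
    """Parse the data rows of `show stack topology` into {ni: row dict}."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows: NI, role, state, saved slot, then state/remote NI/remote
        # port for link A and for link B. Header lines do not start with a digit.
        if len(fields) != 10 or not fields[0].isdigit():
            continue
        if not (fields[5].isdigit() and fields[8].isdigit()):
            continue  # unexpected layout; skipped in this sketch
        rows[int(fields[0])] = {
            "role": fields[1],
            "state": fields[2],
            "a_state": fields[4],
            "a_remote": int(fields[5]),
            "b_state": fields[7],
            "b_remote": int(fields[8]),
        }
    return rows


def check_ring(rows):
    """Return a list of problems; an empty list means the ring looks closed."""
    problems = []
    for ni, row in rows.items():
        for side in ("a", "b"):
            state = row[side + "_state"]
            if state != "UP":
                problems.append(f"NI {ni} link {side.upper()} is {state}")
        # Link A of this NI should reach an NI whose link B points back here.
        peer = rows.get(row["a_remote"])
        if peer is None or peer["b_remote"] != ni:
            problems.append(f"NI {ni} link A is not mirrored by NI {row['a_remote']}")
    return problems
```

For the five rows above, `check_ring` returns an empty list. This only says something about the stacking links, not about whether each NI is actually operational.
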
The show ni command, however, does show an issue. The operational status of modules 1 and 5 is "Unpowered". As the output of the command is very long, I've only included the output for module 5:

Module in slot 5
  Model Name:                    6450 24 PORT COPPER GE,
  Description:                   6450 24 PORT COPPER GE,
  Part Number:                   903173-90,
  Hardware Revision:             05,
  Serial Number:                 P2486721,
  Manufacture Date:              JUN 28 2013,
  Firmware Version:              ,
  Admin Status:                  POWER ON,
  Operational Status:            UNPOWERED,
  Power Consumption:             40,
  Power Control Checksum:        0xfd37,
  CPU Model Type   :             ,
  MAC Address:                   e8:e7:32:90:ab:e2,
  ASIC - Physical 1:             ,
  FPGA - Physical 1:             ,
  UBOOT Version :                ,
  UBOOT-miniboot Version :       ,
  POE SW Version :               n/a
    Daughter Board         1
      Model Name:                    6450 2 PORT SFP+ MOD 10G,
      Description:                   6450 2 PORT SFP+ MOD 10G,
      Part Number:                   903040-90,
      Hardware Revision:             05,
      Serial Number:                 P0983059,
      Manufacture Date:              FEB 28 2013,
      Firmware Version:              ,
      Admin Status:                  POWER ON,
      Operational Status:            DOWN
      MAC Address:                   e8:e7:32:78:e9:3a,
      ASIC - Physical 1:             ,
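
Since the full `show ni` output is long, a small script can pick out the slots that look like the one above. This is only a sketch: the function names are invented for illustration, daughter-board fields are ignored, and it assumes that a healthy module reports "UP" and non-empty firmware/FPGA/miniboot fields.

```python
import re

# Fields that are empty for slot 5 above (assumed to be filled in when healthy).
WATCHED = ("Firmware Version", "FPGA - Physical 1", "UBOOT-miniboot Version")


def parse_show_ni(text):
    """Return {slot: {field: value}} for the top-level module fields."""
    modules = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        header = re.match(r"^Module in slot (\d+)", stripped)
        if header:
            current = modules.setdefault(int(header.group(1)), {})
            continue
        if stripped.startswith("Daughter Board"):
            current = None  # skip daughter-board fields in this sketch
            continue
        if current is None or ":" not in stripped:
            continue
        # Split at the first colon only, so MAC addresses stay intact.
        key, _, value = stripped.partition(":")
        current[key.strip()] = value.strip().rstrip(",").strip()
    return modules


def suspicious_slots(modules):
    """Return {slot: [reasons]} for modules that are not UP or lack versions."""
    result = {}
    for slot, fields in modules.items():
        reasons = []
        status = fields.get("Operational Status", "")
        if status != "UP":
            reasons.append("status " + (status or "unknown"))
        reasons += [name + " empty" for name in WATCHED if not fields.get(name)]
        if reasons:
            result[slot] = reasons
    return result
```

Run against the complete `show ni` output, `suspicious_slots(parse_show_ni(output))` would be expected to list slots 1 and 5 here.
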
Finally, to answer some of your questions:
  • I did not upgrade/downgrade the new switches. I wrongly assumed a plug & play scenario where the current stack would push the correct software version to the new members.
  • I changed the boot slot numbers, and the front LEDs indicate the correct numbers from 1 to 5.
  • I added both new members at the same time. I must not have read the documentation carefully enough, as I missed the note about not adding two members at the same time.
I'll upload some logs after a reboot of module 1 or 5 later.

Re: Replacing switches in a stack

Posted: 16 Jan 2014 11:13
by devnull
In general it should work when adding two at a time; still, the documentation states you should not do that.

Software will be pushed to the new member by the stack master, but this is limited to AOS itself, not e.g. the FPGA software or miniboot.

A wrong FPGA version will at least stop the stack from communicating correctly, but in that case the switch probably should not even show up in the stack (I'm unsure about that).
The fact that show ni displays nothing for ASIC/FPGA/miniboot shows that something is wrong.

The bug I mentioned was probably the reason you needed the replacements in the first place, as it mainly shows up after a power failure.

I would:
reboot slot 5 or remove it from the stack, power it up separately and check
show ni with respect to FPGA/miniboot/...

I would definitely plan an update to either 6.6.3.520 or 6.6.4.208 (requires miniboot/FPGA update) and not stay on the (known buggy) 372, as the flash corruption can happen again and take out further switches after a power cycle.

Maybe build a stack of two out of just the two replacement switches and/or update these first.

Cheers