Problem re-forming a VFL stack
Posted: 11 Mar 2024 05:28
I'd like to start a discussion in the hope of getting an explanation for a serious problem we ran into on a production site.
Let me give you the background:
- Two OS6900-X72 switches stacked over a VFL
- Several remote buildings, each with a stack of two OS6860E-U28s as the heart of the building (an optical link between the two 6900s).
The central core carries a large configuration.
After a power shutdown of unit 1 of the 6900 stack, that unit would not restart (a flash problem: there was no OS left on it).
We decided to replace it with a spare we had on our premises.
We configured the VFL on the spare identically to the unit that had failed (identical priority, identical chassis-group, identical VF-Link port, identical code version).
When the stack was rebuilt, it restarted correctly (normal behavior up to that point). However, instead of the new switch retrieving the configuration from the current master (unit 2), the new switch pushed its blank configuration to unit 2.
In my opinion, this shouldn't have happened. It's the master that should have pushed its configuration to the newcomer...
The mistake we may have made was to give the new switch the same priority as the faulty one. The faulty unit originally had priority 200, versus priority 100 on chassis 2.
chassis id 1 = priority 200 (new switch)
chassis id 2 = priority 100 (former slave, which became master after chassis 1's power shutdown)
Could you give us an explanation?
Thanks in advance for your reply.