Problem reforming a VFL stack

Posted: 11 Mar 2024 05:28
by fripouille
I'd like to start a discussion in the hope of getting an explanation for a serious problem we had on a production site.

Let me give you the background:
- Two OS6900-X72 stacked in VFL
- Several remote buildings, each with a stack of two OS6860E-U28s at the core of the building (an optical link between the two 6900s).
A large configuration is set up on the central core.

After an electrical shutdown of unit 1 of the 6900 stack, the unit would not restart (a flash problem: there was no OS left on it).
We decided to replace it with a spare we had on our premises.

We configured the VFL on the spare identically to the unit that had failed (same priority, same chassis-group, same VF-Link port, same code version).
When the stack was rebuilt, it restarted correctly (normal behavior up to that point). However, instead of retrieving the configuration from the current master switch (unit 2), the new switch pushed its blank configuration to unit 2.

In my opinion, this shouldn't have happened. It's the master that should have pushed its configuration to the newcomer...

The mistake we may have made was to give the new switch the same priority as the faulty one. The faulty unit originally had priority 200, versus priority 100 on chassis 2.
chassis id 1 = priority 200 (new switch)
chassis id 2 = priority 100 (slave became master following chassis 1's electrical shutdown)

Could you give us an explanation?
Thanks in advance for your reply.

Re: Problem reforming a VFL stack

Posted: 12 Mar 2024 10:20
by Gleylancer
The explanation is that the switch with the higher priority has the higher priority. :shock:

Since a reboot of all the switches initiates a master election, the higher priority chassis will become the master and overwrite all the others. This is normal behavior and explained in the switch management guide.
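
To make the election rule concrete, here is a minimal Python sketch of it. This is only a model for illustration, not switch code: the Chassis class and elect_master function are hypothetical names, the uptime values are made up, and the tie-breakers after priority (longest uptime, then lowest chassis ID) are an assumption that should be checked against the switch management guide for your release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chassis:
    chassis_id: int
    priority: int   # configured chassis priority; higher wins
    uptime_s: int   # seconds since boot (hypothetical values below)


def elect_master(members):
    """Pick the master after a full-stack reboot.

    Highest priority wins. The tie-breakers (longest uptime, then
    lowest chassis ID) are assumed, not taken from this thread.
    """
    return max(members, key=lambda c: (c.priority, c.uptime_s, -c.chassis_id))


# The situation from the first post: blank spare at priority 200,
# surviving unit 2 at priority 100.
as_installed = [Chassis(1, 200, 300), Chassis(2, 100, 86_400)]
print(elect_master(as_installed).chassis_id)  # 1

# Same stack, but with the spare given a lower priority than unit 2.
lower_spare = [Chassis(1, 50, 300), Chassis(2, 100, 86_400)]
print(elect_master(lower_spare).chassis_id)  # 2
```

In this model the blank spare wins the election purely because of its priority of 200, regardless of how long unit 2 has been up. The second case only shows what the model predicts if the spare had been given a lower priority; it is not verified on real hardware.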

Re: Problem reforming a VFL stack

Posted: 13 Mar 2024 03:04
by silvio
Hi fripouille,
Yes, this shouldn't happen. The master (your chassis 2) should send the config to the new device. The most important thing is that the uptime of the new switch (chassis 1) should be longer than the uptime of the current master (ch 2).
BR Silvio