Best Practice: Restarting the switchd Service on a Cumulus MLAG Pair Switch
Sometimes we need to restart the switchd service on one switch of a Cumulus MLAG pair, or to test cutting the peerlink between the two switches (during UAT or a similar activity). If we do not know which switch in the pair is primary, we can run into trouble: hosts can be disconnected from the MLAG pair when we restart switchd.
So before restarting switchd (sudo systemctl restart switchd.service) on one switch of the MLAG pair, first check the CLAG status to confirm which switch is currently primary.
You can use this command: net show clag
The output of net show clag shows the CLAG status of the switch we are logged in to and of its peer. In this example, the priority of this switch (our priority) is set to 1000 and the priority of the peer switch is set to 2000. The lower priority value wins the election, so this switch is primary and the peer switch is secondary.
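The role check described above can be scripted. The following is a minimal sketch, not an official tool: the function name clag_local_role is hypothetical, and it assumes the NCLU output contains a line of the form "Our Priority, ID, and Role: <priority> <mac> <role>". Verify that against the output on your own release before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: read `net show clag` text on stdin and print the
# local MLAG role ("primary" or "secondary").
# Assumption: the output has a line like
#   Our Priority, ID, and Role: 1000 44:38:39:00:00:11 primary
# where the role is the last whitespace-separated field.
clag_local_role() {
    awk '/Our Priority, ID, and Role:/ { print $NF; exit }'
}
```

On a switch it would be used as: net show clag | clag_local_role. If the expected line is missing, the function prints nothing, which a caller should treat as "role unknown".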
If you want to restart switchd.service on the primary switch (after confirming the CLAG status as described earlier), or cut the peerlink between the pair, you must first turn the primary switch into the secondary switch. The primary switch controls the pair, including traffic from the hosts below it, the CLAG MAC, and so on. If you restart the service on the switch that holds the primary role anyway, traffic from the hosts below cannot pass through either the primary or the secondary switch.
You can think of it this way: the brain of the pair (the primary switch) is busy restarting its switching process, but the body is still active. Because of that behaviour, the secondary switch does not take over the primary role. Host traffic is disrupted because the hosts keep sending traffic to both switches while the primary switch is not ready to accept it.
You can change the role of a switch by changing its priority value. The switch with the lower value becomes primary, so give the current primary a higher value than the secondary switch:
- In Cumulus v4.x, use this command: net add interface peerlink.4094 clag priority (value_higher_than_the_secondary_switch)
- In Cumulus v5.x, use this command: nv set mlag priority (value_higher_than_the_secondary_switch)
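Both commands only stage the change: NCLU needs net commit and NVUE needs nv config apply before the new priority takes effect. The sketch below prints the full sequence for review instead of running it. The function name mlag_priority_cmds is hypothetical, and the value 3000 in the usage note is only an example chosen to be higher than the peer's 2000 from the earlier output.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the commands that set this
# switch's MLAG priority value.
#   $1 = Cumulus Linux major version (4 or 5)
#   $2 = new priority value (a higher value makes this switch secondary)
mlag_priority_cmds() {
    version=$1
    priority=$2
    case "$version" in
        4)
            # NCLU: stage the change, review it, then commit it
            echo "net add interface peerlink.4094 clag priority $priority"
            echo "net pending"
            echo "net commit"
            ;;
        5)
            # NVUE: stage the change, then apply it
            echo "nv set mlag priority $priority"
            echo "nv config apply"
            ;;
        *)
            echo "unsupported version: $version" >&2
            return 1
            ;;
    esac
}
```

For example, mlag_priority_cmds 5 3000 prints the two NVUE commands. After running them on the switch, check the CLAG status again to confirm that the roles have actually swapped before restarting switchd.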
When you restart switchd.service on the secondary switch, traffic from the hosts to the switches is not interrupted, because the primary switch is still in control.
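To avoid restarting switchd on the primary by accident, the restart can be wrapped in a guard that only proceeds when the local role is secondary. This is a sketch under assumptions: restart_switchd_if_secondary is a hypothetical name, it relies on net show clag (NCLU, Cumulus v4.x), and it assumes the "Our Priority, ID, and Role:" line ends with the role.

```shell
#!/bin/sh
# Hypothetical guard: restart switchd only if this switch is MLAG secondary.
# Assumption: `net show clag` prints a line such as
#   Our Priority, ID, and Role: 3000 44:38:39:00:00:11 secondary
restart_switchd_if_secondary() {
    # Take the last field of the "Our Priority..." line as the local role
    role=$(net show clag | awk '/Our Priority, ID, and Role:/ { print $NF; exit }')
    if [ "$role" = "secondary" ]; then
        sudo systemctl restart switchd.service
    else
        # Primary, or role could not be determined: do nothing
        echo "refusing to restart switchd: local role is '${role:-unknown}'" >&2
        return 1
    fi
}
```

The function refuses both when the switch is primary and when the role cannot be read, so an unexpected output format fails safe instead of triggering a restart.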
Note that if the primary switch itself is restarted, for example with sudo reboot or by pulling the power cord, the HA mechanism of the pair still works. This means you do not need to worry about traffic from the hosts below the pair when the primary switch is rebooted or powered off.