Redundant switches
We're having a nice challenge here at the office. We currently have 2 cabinets with one switch each, and 2 uplinks in total for the 2 cabinets. The switches are linked to each other with a single Ethernet cable so everything keeps working when one of the uplinks fails. The switches have no redundant power, and that's where the trouble starts.
We're currently looking at buying two new switches to replace the current ones, which are quite old. We'd like to get to a situation where the two switches are fully redundant, so we can lose one of them without noticing anything. Without redundant power supplies... (they cost about the same as a switch, which is roughly 1300 euros)
So, our idea was to put one 48-port switch into each cabinet and connect all servers to both switches. But we would like each server to have one IP address handling the traffic, and that IP should keep working even if one switch drops (so either both interfaces would need to listen on that IP, or there should be some fail-over). We know ucarp, but that would still result in downtime (though minimal). We've been looking at bonding two interfaces together and we're investigating that right now.
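For reference, a minimal sketch of what active-backup bonding could look like on a Debian-style system. Everything here (interface names, addresses, the ifenslave-style stanza) is an illustrative assumption, not the actual setup:

```
# /etc/network/interfaces (sketch, assumes the ifenslave package is installed)
auto bond0
iface bond0 inet static
    address 192.0.2.10          # the single service IP
    netmask 255.255.255.0
    gateway 192.0.2.1
    bond-slaves eth0 eth1       # eth0 -> switch A, eth1 -> switch B
    bond-mode active-backup     # one active slave; no switch-side support needed
    bond-miimon 100             # check link state every 100 ms
    bond-primary eth0           # prefer eth0 when it is up
```

In active-backup mode the switches need no special configuration at all, which is why it is the usual choice for dual-homing a server to two independent switches.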
My question is, is there any other way to achieve what we want (perhaps at the switch level)? The switch we're looking at is the HP A5120-48G EI.
Comments
Buy a 3rd switch and keep it as a cold spare (configured and ready). Accept that once after 5-6 years one of the switches will fail and there will be some downtime while you turn on the spare switch and rewire everything to it.
Not many switches support multi-chassis LAG. And the added complexity might result in more downtime overall, not less.
That's a plan, but we would still need to have someone plug all the cables. ucarp is faster (I believe we get 10 seconds of downtime max. on our MooseFS cluster).
You can do failover (bonding) in Linux when connecting one node to multiple switches.
Yep, that's what we're trying. I'm just wondering if there's anything else we could try (perhaps at the switch level).
This: http://en.wikipedia.org/wiki/MC-LAG
I don't know this specific HP switch (I am not very familiar with HP), but I suggest buying switches with some kind of hitless failover functionality, such as Brocade switches. That combined with failover on the server connection to the switches, like an active-backup bond or STP.
vrrp.
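A VRRP failover like this is commonly done with keepalived; below is a sketch in which the interface name, router ID, and addresses are all illustrative assumptions:

```
# /etc/keepalived/keepalived.conf (sketch)
vrrp_instance VI_1 {
    state MASTER          # the peer would use BACKUP
    interface eth0
    virtual_router_id 51  # must match on both VRRP peers
    priority 150          # higher priority wins the master election
    advert_int 1          # advertise every second
    virtual_ipaddress {
        192.0.2.10/24     # the shared service IP
    }
}
```

Like ucarp, VRRP still implies a short failover window (a few advertisement intervals before the backup takes over), so it trades a little downtime for simplicity.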
The only way I can think of to solve it would be if all servers have at least 2 NICs and you can team each pair.
Then connect one wire to each switch and enable spanning tree.
If you look closely there are some cheap options with redundant PSUs.
Do you need Layer3? this makes it really expensive.
If no, or only very basic, get a Force10 S25 (not the S25P, which has 24x SFP and only 4x Ethernet). These have redundant PSUs (though not hot-swappable) and good hardware (made in Taiwan, not China). FTOS is also very similar to Cisco's IOS, except for the much better VLAN handling.
There are 2 for sale on eBay, lightly used, for 500US$ each:
http://www.ebay.com/itm/FORCE10-S25-01-GE-24V-24x-10-100-1000-RJ45-PORTS-4x-SHARED-SFP-L3-SWITCH-USED-/251211177975?pt=US_Network_Switches&hash=item3a7d5a5ff7
You also need a stacking module; I recommend the 40G one as it supports the entire backplane and a 2x10G XFP module:
http://www.ebay.com/itm/Force10-S50-01-24G-1S-24G-STACKING-MODULE-S50N-S50V-S25-S25V-QUANTITY-/290832461624?pt=US_Network_Switch_Modules&hash=item43b6f72f38
170$ each; you might need to source a longer cable (~100US$).
(You can add 2 expansion cards to an S25, so you could add a second 40G stacking card to each and keep stacking (the backplane becomes shared over 3 switches), keep a 3rd card as a redundant spare, or connect them with 2 cables for additional capacity/redundancy.)
A 10G module, if you need one, costs around 400$ (2x XFP per module, optics not included; SX/550m is around 130$ each, LX/10km 500$+ depending on wavelength and range).
Then configure them like your current ones; you might need some VLAN magic for the failover, and et voila, redundant everything for not even 2000$ total.
Funny enough, the much less capable HP 2810s cost like 50 euros more and they are plain Layer 2. So this is a fantastic offer for such a switch.
We're specifically looking for new switches with warranty. It's not something I get to decide, unfortunately, otherwise I'd have picked up two Force10s already. Heard a lot of good things about them. Thanks for the great suggestion anyway.
Thanks everybody for the suggestions so far. I'll look into all of them. If anyone has another idea, let me know!
One thing to note about failover: the switch has to completely crap out (link down) for the OS to switch from primary to secondary. If the switch merely stops forwarding packets, nothing happens. I somehow managed to make this happen once; it required the DC to pull the power on the failed switch.
If you require really low downtime, Juniper/Cisco both should have virtual chassis switches that support LACP across two members. LACP sends link state packets to verify the link all the time, so this would lower any potential downtime.
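The "link up but not forwarding" failure mode mentioned above is exactly what the Linux bonding driver's ARP monitor is meant to catch: instead of only watching carrier state (miimon), it periodically ARPs a target and fails over when replies stop. A sketch with assumed interface names and a hypothetical gateway as the probe target:

```
# sysfs sketch for ARP-monitored active-backup bonding (run as root, bond down)
modprobe bonding
echo active-backup > /sys/class/net/bond0/bonding/mode
echo 1000 > /sys/class/net/bond0/bonding/arp_interval        # probe every 1000 ms
echo 192.0.2.1 > /sys/class/net/bond0/bonding/arp_ip_target  # e.g. the gateway
ip link set eth0 master bond0   # eth0 -> switch A
ip link set eth1 master bond0   # eth1 -> switch B
ip link set bond0 up
```

Note that miimon and ARP monitoring are mutually exclusive per bond, and the ARP target must be reachable through both switches for the probe to be meaningful.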
The easy setup, as suggested, is a stack of two switches; then ask your upstream provider to EtherChannel/LAG (LACP) the 2 links.
The alternative would be to set up a proper spanning tree configuration between all the switches involved to prevent loops:
isp-switch-a <---> YOUR-switch-a <---> YOUR-switch-B <---> isp-switch-b
You can then connect your servers to both your switches with active/passive bonding/teaming.
Force10 is available new as well (paid 1700EUR per S25P, should be much less on a normal one) and the warranty is handled by Dell...
Interesting. I'll give our Dell supplier a call this week to see if they can give me a price. Thanks!
[Offtopic]
Just curious, does anyone know if the Force10 S25N support ingress rate limiting with CIR/CBS?
[/Offtopic]
@eLohkCalb I haven't used these switches, but usually ingress rate limiting is supported even by the almost braindead managed switches. Seems it is the egress limiting that is more complex to implement. And the Force10 should not be braindead at all.
Yes it is (in+egress), requires L3 image.
We're using Nexus 2000 Fabric Extenders (ToR) and 5000s with vPC+ here and it works very well.
@rds100, I do have trouble believing those marketing-oriented datasheets, and sometimes the technical documents fail to clarify that point as well.
My issue with this rate limiting is that I usually need to restrict the outgoing traffic from servers, so on the switch side ingress matters more than egress. Some of these switches implement it using a single rate with color-blind metering, and that falls short in practice.
@William, thanks for the info.