Wednesday, December 23, 2015

When to ECMP over Edge HA

I’ve been around IT long enough to remember when Data Center Network designs resembled those of Campus Networks. The two design methodologies started to diverge when it became apparent that, given the same outage duration, the negative impact to the business was greater at the Data Center than at the Campus. A Campus user had to wait 50 seconds for a page to load? The user probably didn’t even notice. A Data Center web server couldn’t serve webpages for 50 seconds? Somebody’s job might be on the chopping block.

Today’s rule of thumb is that unplanned Data Center Network disruption must be kept to a minimum. With that goal in mind, many of the Network protocols have been tweaked to provide fast Network re-convergence in the event Network components fail. For example, Spanning Tree (STP) gave way to Rapid Spanning Tree (RSTP), and recovery time went from about 45 seconds to a few seconds.

So where am I going with this? Well, NSX has a component called the NSX Edge. The NSX Edge is a Virtual Appliance that does Network Function Virtualization (NFV). One of the nice features of the NSX Edge is that you can deploy it in Active/Passive pairs. The two NSX Edges will have identical configuration, with the Passive NSX Edge monitoring the up-state of the Active NSX Edge, and the Active NSX Edge feeding state tables and session information back to the Passive. If the Active NSX Edge goes down, the Passive NSX Edge takes over the role of Active. This feature of the NSX Edge is called Edge High Availability, or HA (not to be confused with vSphere HA, which is a different concept).

Edge HA supports a dead-timer (Declare Dead Time) as small as 6 seconds (although VMware’s latest version, V3, of the NSX Reference Design Guide encourages you to not go below 9 seconds). Assuming an HA dead-timer of 6 seconds, plus a second or two for the Passive to become fully functional as the Active, you can expect the NFV service to be unavailable for less than 10 seconds during an unplanned outage.

Some of the NFV services the NSX Edge supports are routing protocols (OSPF, BGP, IS-IS), NAT, Load Balancer and stateful L3/4 Firewall. I like to think of these services as asymmetrical and symmetrical. An asymmetrical service would be routing. Traffic can come in one NSX Edge and return via a different path. Symmetrical services, such as NAT or Load Balancer, require that traffic return via the same NSX Edge, as there is some mapping table or session state that needs to be consulted for the traffic to go on its way.

For example, NSX Edge Cabaña performs a destination NAT on traffic from Brugal destined for Virtual Machine Piña Colada. If Piña Colada’s return traffic goes up its default gateway, NSX Edge Hamaca, then there will be no one to reverse the destination NAT that Cabaña did. When Brugal receives the traffic, it will drop it, as it won’t recognize the source IP as one it is communicating with.


By the way, I strongly discourage you from using the design in the above image.
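The Cabaña/Hamaca example can be sketched in a few lines of Python. This is only a toy model, not how an NSX Edge is implemented: the `Edge` class, the `client_accepts` helper, and all IP addresses are hypothetical and exist purely to show why a destination NAT needs the return traffic to pass through the Edge that holds the mapping.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """Toy model of an Edge that keeps its own private destination-NAT table."""
    name: str
    dnat: dict = field(default_factory=dict)  # public IP -> real server IP

    def forward_in(self, src, dst):
        """Inbound packet: rewrite the destination if a DNAT rule matches."""
        return src, self.dnat.get(dst, dst)

    def forward_out(self, src, dst):
        """Return packet: restore the public source IP, but only if this Edge owns the mapping."""
        reverse = {real: public for public, real in self.dnat.items()}
        return reverse.get(src, src), dst


# Hypothetical addresses, for illustration only.
BRUGAL = "198.51.100.10"    # the client
VIP = "203.0.113.5"         # the address Brugal believes it is talking to
PINA_COLADA = "10.0.0.20"   # the real address of the Virtual Machine

cabana = Edge("Cabaña", dnat={VIP: PINA_COLADA})
hamaca = Edge("Hamaca")     # has no NAT state at all


def client_accepts(reply_src):
    """Brugal only accepts replies from the address it sent traffic to."""
    return reply_src == VIP


# Inbound: Brugal -> VIP, translated by Cabaña to Piña Colada.
_, translated_dst = cabana.forward_in(BRUGAL, VIP)

# Return path through the same Edge (symmetrical) versus the default gateway (asymmetrical).
reply_via_cabana, _ = cabana.forward_out(PINA_COLADA, BRUGAL)
reply_via_hamaca, _ = hamaca.forward_out(PINA_COLADA, BRUGAL)

print("via Cabaña accepted:", client_accepts(reply_via_cabana))
print("via Hamaca accepted:", client_accepts(reply_via_hamaca))
```

The reply that leaves through Hamaca still carries Piña Colada’s real address as its source, so Brugal has no session that matches it.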

Let’s talk a bit about routing. The NSX Edge supports low keepalive and dead timers for its routing protocols (keeping in line with low recovery time for Data Centers). In the case of OSPF, you can go with a 1-second hello and a 3-second dead timer. So after 3 seconds, the OSPF neighbor adjacency gets removed and its entries get flushed from the routing table. The NSX Edge also supports having up to 8 different paths in the routing table for the same network. This is called Equal Cost Multipathing (ECMP). This ECMP support works for OSPF, BGP and IS-IS. In the case of BGP, the NSX Edge kind of disregards the BGP standard and adds multiple paths based on the number of BGP Peers it has advertising the same network.

So a question that I’ve been asked a few times is this one: Should I deploy two NSX Edges running a routing protocol and use ECMP or should I deploy a single NSX Edge with Edge HA?

Well, the answer depends on what other NFV services that NSX Edge is providing. If the NSX Edge is providing a symmetrical service, my advice is to go with HA. If the NSX Edge is only providing an asymmetrical service (routing), then go with multiple NSX Edges and turn on ECMP in the upstream and downstream NSX Edges or routers.
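As a rough sketch, that rule of thumb can be written as a small function. The service names and the `recommend` helper are made up for illustration. Treating the stateful firewall as a symmetrical service is an assumption on my part, based on it keeping session state like NAT and the Load Balancer do.

```python
# Services that need return traffic to cross the same Edge.
# NAT and Load Balancer come from the text; "firewall" is an assumption (it is stateful).
SYMMETRICAL = {"nat", "load_balancer", "firewall"}
# Services that tolerate traffic returning via a different path.
ASYMMETRICAL = {"routing"}


def recommend(services):
    """Return 'edge-ha' or 'ecmp' for a set of service names running on one Edge."""
    services = {s.lower() for s in services}
    unknown = services - SYMMETRICAL - ASYMMETRICAL
    if unknown:
        raise ValueError(f"unclassified services: {sorted(unknown)}")
    if services & SYMMETRICAL:
        # Any stateful, symmetrical service pins flows to one Edge: use Active/Passive.
        return "edge-ha"
    # Routing only: deploy multiple Edges and enable ECMP upstream and downstream.
    return "ecmp"


print(recommend({"routing"}))
print(recommend({"routing", "nat"}))
```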

To explain the logic behind it, have a look at the diagram below. NSX Edges Cabaña and Hamaca are running OSPF with timers of 1 and 3 on the internal side. The NSX Edge Barcelo is doing L3/4 stateful firewalling as well as running OSPF with ECMP enabled.


Traffic coming from Piña Colada will be load shared (ECMP) by Barcelo between Cabaña and Hamaca. If either Cabaña or Hamaca goes down, Barcelo will detect the event in about 3 seconds, remove all routes from the downed NSX Edge in its routing table, and send all traffic via the other NSX Edge. And by the way, when the NSX Edge goes down, only about half the traffic gets affected.
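To see where the "about half" figure comes from, here is a minimal simulation of per-flow load sharing. It assumes Barcelo picks a next hop by hashing each flow, which is a common ECMP technique but not necessarily the algorithm the NSX Edge uses. The `pick_next_hop` function, the flow tuples, and the addresses are all hypothetical.

```python
import hashlib

MAX_ECMP_PATHS = 8  # the Edge supports up to 8 paths for the same network


def pick_next_hop(flow, next_hops):
    """Hash a flow tuple onto one of the equal-cost next hops (generic sketch, not NSX's algorithm)."""
    if not 1 <= len(next_hops) <= MAX_ECMP_PATHS:
        raise ValueError("need between 1 and 8 equal-cost paths")
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]


# 1000 made-up flows from Piña Colada: (src IP, dst IP, protocol, src port, dst port).
flows = [("10.0.0.20", "198.51.100.10", 6, 40000 + i, 443) for i in range(1000)]

# Both Edges are up: Barcelo shares the flows between them.
before = {flow: pick_next_hop(flow, ["Cabaña", "Hamaca"]) for flow in flows}

# Cabaña fails: only the flows that were hashed onto it are disrupted.
affected = [flow for flow, hop in before.items() if hop == "Cabaña"]
share_affected = len(affected) / len(flows)

# After the OSPF dead timer expires, Cabaña's routes are removed and every flow uses Hamaca.
after = {flow: pick_next_hop(flow, ["Hamaca"]) for flow in flows}

print(f"share of flows affected by the failure: {share_affected:.0%}")
```

With a reasonably uniform hash the affected share hovers around 50%, and the flows that were already on Hamaca never notice the failure.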

Alternatively, if instead of two NSX Edges, Cabaña and Hamaca, you had just one with Edge HA, then all traffic would be affected when that NSX Edge goes down. Assuming an Edge HA dead timer of 6 seconds and an OSPF dead timer of 3 seconds, it would be just under 10 seconds before traffic flow is restored.
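Putting the two designs side by side makes the difference concrete. The sketch below uses the timers from the text and a made-up "traffic-seconds" figure (outage duration multiplied by the share of traffic affected). Adding the HA and OSPF dead timers together for the HA case is my reading of the numbers above, so treat it as an assumption rather than measured behavior.

```python
OSPF_DEAD_SECONDS = 3   # OSPF dead timer from the text
HA_DEAD_SECONDS = 6     # Edge HA Declare Dead Time from the text


def traffic_seconds(outage_seconds, traffic_share):
    """Hypothetical impact metric: how long the outage lasts times how much traffic it hits."""
    return outage_seconds * traffic_share


# Two Edges with ECMP: about half the flows are down until the OSPF dead timer expires.
ecmp_impact = traffic_seconds(OSPF_DEAD_SECONDS, 0.5)

# One Edge with HA: all flows are down for the HA dead timer plus the OSPF dead timer (assumed additive).
ha_outage_seconds = HA_DEAD_SECONDS + OSPF_DEAD_SECONDS
ha_impact = traffic_seconds(ha_outage_seconds, 1.0)

print(f"ECMP:    {OSPF_DEAD_SECONDS}s outage, impact {ecmp_impact} traffic-seconds")
print(f"Edge HA: {ha_outage_seconds}s outage, impact {ha_impact} traffic-seconds")
```

Under these assumptions the HA design takes three times longer to recover and disrupts twice as much traffic while it does.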

.elver
