Trust but Verify: How to Monitor SD-WAN

by Alec Pinkham

October 11, 2018

SD-WAN is a popular topic on every blog and news site and at every conference today, but much of the chatter deals with the implementation benefits. As ONUG’s own Nick Lippis mentions in his post about the top 5 topics IT will be talking about, SD-WAN is entering a new generation that’s focused on scale, security, and performance. With this new generation, IT needs a new understanding of how to monitor SD-WAN effectively for the long term, bearing in mind the nuances of the technology and why traditional methods won’t suffice.

True End-to-End Monitoring

It’s clear that device-based metrics can’t be gathered from the WAN, because IT doesn’t own the devices. That remains true with SD-WAN, which is why so many companies implementing these solutions rely on performance metrics from the vendor. These typically include bandwidth, latency, loss, and jitter statistics for the WAN connections. If the SD-WAN solution provides this data, then why do you need additional monitoring? Because the vendor’s metrics fail to include the LAN connections between users and the SD-WAN device.

Depending on the organization, the LAN can be straightforward or immensely complicated. As companies grow, so does the number of network tools, devices, and skeletons stashed in IT’s closet. Voice and data traffic needs to traverse all of this internal infrastructure before it even reaches the SD-WAN connection. When problems arise, it is common to blame the newly implemented SD-WAN for poor performance, but, in all fairness, neither the vendor nor you may be able to see the LAN metrics needed to troubleshoot where an issue stems from. This complexity requires a combination of packet, flow, path, and synthetic monitoring to cover the entire network path, from source to destination.

Active Monitoring

A popular use case for SD-WAN is migrating from private networks such as MPLS to bonded commercial internet connections. One important factor that many new customers don’t realize is that, compared to MPLS, the commercial internet is very dynamic. SD-WAN relies on accurate sizing information to send traffic. Looking at the bandwidth that the ISP provides is one thing, but measuring the end-to-end capacity between offices will likely tell a different story. SD-WAN does not send traffic for performance testing; it measures performance on the traffic from end-users, which means IT has to wait for a problem to be experienced before it can even detect it. With continuous monitoring, IT can see poor performance trends and be proactive.

Unfortunately, most SD-WAN configurations have static input fields for bandwidth, in which IT enters what was paid for instead of what is currently available. This value is often used to calculate QoS and TCP segment size, so when utilized capacity spikes or available capacity falls, the SD-WAN logic is left working from a number that no longer matches the link.
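To make the gap between "what was paid for" and "what is currently available" concrete, here is a minimal Python sketch. It assumes you already have an end-to-end capacity measurement from some monitoring tool; the link names, the numbers, and the 20 percent review threshold are all illustrative assumptions, not values from any SD-WAN product.

```python
from dataclasses import dataclass


@dataclass
class LinkCapacity:
    name: str                # hypothetical link label
    configured_mbps: float   # value typed into the SD-WAN bandwidth field
    measured_mbps: float     # latest end-to-end capacity measurement


def capacity_shortfall(link: LinkCapacity) -> float:
    """Fraction of the configured bandwidth that is not currently available."""
    if link.configured_mbps <= 0:
        raise ValueError("configured_mbps must be positive")
    shortfall = (link.configured_mbps - link.measured_mbps) / link.configured_mbps
    # A link measuring above its configured value has no shortfall.
    return max(0.0, shortfall)


def links_to_review(links, threshold=0.2):
    """Names of links whose shortfall exceeds the (assumed) threshold."""
    return [link.name for link in links if capacity_shortfall(link) > threshold]


# Illustrative data only.
links = [
    LinkCapacity("branch-a-isp1", configured_mbps=100.0, measured_mbps=95.0),
    LinkCapacity("branch-a-isp2", configured_mbps=100.0, measured_mbps=60.0),
]
print(links_to_review(links))
```

Run on a schedule, a check like this could flag links whose static setting has drifted far from reality, so the configured value can be revisited before QoS decisions are based on it.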

When looking at the monitoring provided by SD-WAN vendors, you’ll also likely note that all traffic is reported as one value, but the difference between voice and data traffic can be stark. Network issues that affect the large packets of data traffic may leave small voice packets unaffected, or vice versa. Actively testing the performance of both allows IT to see early indications of issues and identify which applications will be impacted.
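As a rough illustration of reporting voice and data separately rather than as one value, the Python sketch below summarizes two sets of synthetic probe results. It does not send any packets; it assumes a probing tool has already produced round-trip times for a small-packet (voice-like) profile and a large-packet (data-like) profile. The sample values are made up, and the jitter figure is a simplified mean of consecutive differences, not the RFC 3550 calculation.

```python
from statistics import mean


def summarize_probe(rtts_ms):
    """Summarize one probe run.

    rtts_ms: round-trip times in milliseconds, with None for a lost probe.
    """
    if not rtts_ms:
        raise ValueError("need at least one probe result")
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    latency_ms = mean(received) if received else None
    # Simplified jitter: mean absolute change between consecutive replies.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter_ms = mean(diffs) if diffs else 0.0
    return {"loss_pct": loss_pct, "latency_ms": latency_ms, "jitter_ms": jitter_ms}


# Illustrative samples only: one run per traffic profile.
results = {
    "voice (small packets)": [20.0, 22.0, 21.0, 23.0],
    "data (large packets)": [30.0, None, 50.0, 40.0],
}
for profile, rtts in results.items():
    print(profile, summarize_probe(rtts))
```

Keeping the two profiles apart is the point: in this made-up sample the data profile shows loss and high jitter while the voice profile looks healthy, a difference that a single blended number would hide.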

Be Wary of the Hub & Spoke

One final constraint of SD-WAN network monitoring is the fact that many deployments are implemented in a hub and spoke design that uses tunnels to route traffic through central infrastructure. As expected, this can create hairpin turns, where traffic travels along a route to a central location only to return on a similar route to a location close to the origin. While content delivery networks can compound this problem, there are also additional hops between the SD-WAN cloud point and the application. The main issue is that all of this is invisible to the user, and the metrics shown by SD-WAN vendors need to be aggregated and correlated with other data to fully comprehend the end-to-end path.

This issue is not limited to SD-WAN; it’s also seen in Cloud Access Security Brokers (CASBs). Monitoring through and outside of SD-WAN or CASB services may allow IT end-users to identify these hairpins and watch them closely.
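One way to put a number on a hairpin is to compare the distance traffic actually travels with the straight-line distance between origin and destination. The Python sketch below does this with the haversine formula. It assumes you can attach approximate coordinates to the branch, the hub, and the application, which in practice depends on hop geolocation that is often imprecise; the locations and coordinates here are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def path_stretch(waypoints):
    """Ratio of distance travelled along the waypoints to the direct distance."""
    if len(waypoints) < 2:
        raise ValueError("need at least an origin and a destination")
    direct = haversine_km(waypoints[0], waypoints[-1])
    travelled = sum(haversine_km(p, q) for p, q in zip(waypoints, waypoints[1:]))
    if direct == 0:
        return math.inf if travelled > 0 else 1.0
    return travelled / direct


# Approximate, illustrative coordinates: a Boston branch reaching an
# application near New York through a hub near Chicago.
branch = (42.36, -71.06)
hub = (41.88, -87.63)
app = (40.71, -74.01)

print(round(path_stretch([branch, hub, app]), 1))
print(round(path_stretch([branch, app]), 1))
```

A stretch close to 1 suggests a fairly direct path, while a large value is a hint that traffic is being hauled to a central location and back. The cutoff that counts as "too large" is a judgment call for each network.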

Trust but Verify

Over the past few years, the IT market has seen software-defined everything, but SD-WAN has been a solid thread through all of the noise, remaining singular in purpose: optimize connections over the infrastructure you don’t own. Whether you are employing public, hybrid, or private clouds, the networks that IT is managing every day are getting more complex. These are just a few of the reasons you’ll want to validate the performance and implementation of SD-WAN to ensure the best experience for you and your end-users.

Author's Bio

Alec Pinkham

Director of Product Marketing, AppNeta