Starting this week, the M&A and SD-WAN 2.0 Working Group calls will include both vendors and IT executives. To bring everyone in the ONUG Community up to date, here are summaries of what IT executives in each working group have been discussing in the first three calls of this year. Anyone who would like to participate in either working group is encouraged to join. The next step is to prioritize the key use cases to concentrate on and then flesh out detailed functional requirements for each.
While there are certainly many outstanding issues related to managing private networks and data centers, the M&A Working Group focus for 2019 is hybrid multi-cloud environments with workloads executing in multiple public and private clouds. This mirrors the focus of the SDSS Working Group, which should lead to synergies in terms of creating common testbed infrastructure for integrating the reference solutions for both groups.
Although there are different types of workloads, for 2019 the M&A group agreed to focus specifically on container-based workloads, since these have become the de facto standard building block for microservices.
Application topology mapping:
Distributing microservices implemented as containers across a hybrid multi-cloud environment makes it difficult to see where these workloads are executing and what "application pathways" connect them through the underlying network infrastructure. It is also necessary to identify workload dependencies on the underlying compute and network infrastructure. There is an acute need for monitoring tools that can do this automatically and track changes to the topology and interdependencies in real time. Graphical visualizations would eliminate the need to map topologies manually in Visio, which is cumbersome and static.
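To make the idea concrete, here is a minimal sketch of automatic topology mapping: a monitoring agent that has observed which service called which, and where, can build a dependency graph and answer "what does this workload depend on?" without manual Visio diagrams. All service and location names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical call records observed by a monitoring agent:
# (caller, callee, callee location). All names are illustrative.
observed_calls = [
    ("checkout", "payments", "aws-us-east-1"),
    ("checkout", "inventory", "on-prem-dc1"),
    ("payments", "fraud-check", "azure-westus"),
]

def build_topology(calls):
    """Build a service dependency graph plus a service -> location map."""
    deps = defaultdict(set)
    location = {}
    for caller, callee, loc in calls:
        deps[caller].add(callee)
        location.setdefault(callee, loc)
    return dict(deps), location

def transitive_deps(deps, service, seen=None):
    """All downstream services a given service depends on, directly or not."""
    if seen is None:
        seen = set()
    for child in deps.get(service, ()):
        if child not in seen:
            seen.add(child)
            transitive_deps(deps, child, seen)
    return seen

deps, where = build_topology(observed_calls)
```

Re-running `build_topology` as new call records stream in is what lets the map track topology changes in real time rather than going stale like a static diagram.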
There is a need to monitor not only the host infrastructure supporting container workloads but also provide monitoring for individual containers. Container-specific visibility is a big gap that needs to be addressed.
Extracting metrics, normalizing and tagging data:
Monitoring requires instrumentation to generate metrics and efficient protocols for collecting data. This data needs to be normalized by data type to account for variations in data formats across multiple implementations and environments. It is also critical that normalized monitoring data be tagged appropriately before it is analyzed and stored. Proper tagging of metric data makes it possible to rapidly perform the necessary correlations between data collected from different parts of the hybrid multi-cloud infrastructure.
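As an illustrative sketch of this pipeline, the snippet below normalizes CPU metrics from two hypothetical sources (one reporting percentages, one reporting fractions, under different field names) into a common schema, then attaches correlation tags before the record would be stored or analyzed. The source names, field names, and tags are all assumptions for demonstration.

```python
# Normalize raw metrics from two hypothetical sources into one schema,
# then tag them BEFORE analysis/storage so records from different parts
# of the hybrid multi-cloud can be correlated later.
def normalize(raw, source):
    """Map source-specific field names and units onto a common schema."""
    if source == "cloud_a":      # e.g. {"ts": ..., "cpu_pct": 37.5}
        return {"timestamp": raw["ts"], "metric": "cpu_util",
                "value": raw["cpu_pct"]}
    if source == "on_prem":      # e.g. {"time": ..., "cpu": 0.42} (fraction)
        return {"timestamp": raw["time"], "metric": "cpu_util",
                "value": raw["cpu"] * 100}
    raise ValueError(f"unknown source: {source}")

def tag(record, **tags):
    """Attach correlation tags (site, environment, workload) up front."""
    return {**record, "tags": tags}

sample = tag(normalize({"ts": 1700000000, "cpu_pct": 37.5}, "cloud_a"),
             site="aws-us-east-1", env="prod", workload="checkout")
```

Because every record carries the same schema and tags at ingest time, downstream queries can group by `site` or `workload` across clouds without per-source special cases.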
Scaling the monitoring infrastructure:
Monitoring in a hybrid multi-cloud environment places a premium on the scalability of the underlying monitoring infrastructure. More metrics are being collected, at higher volumes, from many different domains across the environment. Streaming telemetry is essential for real-time analytics but can overload ingest at the analytics engine and storage repository.
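One common way to keep streaming telemetry from overwhelming the ingest path is to buffer samples and hand them to the analytics engine in fixed-size batches. The sketch below is a minimal, assumed implementation of that idea; `sink` stands in for whatever the real analytics/storage ingest API would be.

```python
# Minimal sketch of smoothing streaming-telemetry ingest: buffer incoming
# samples and deliver them in fixed-size batches so bursts do not overload
# the analytics engine or storage repository. Names are illustrative.
class BatchingIngest:
    def __init__(self, sink, batch_size=100):
        self.sink = sink          # callable that accepts a list of samples
        self.batch_size = batch_size
        self.buffer = []

    def push(self, sample):
        self.buffer.append(sample)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []

batches = []
ingest = BatchingIngest(batches.append, batch_size=3)
for i in range(7):
    ingest.push(i)
ingest.flush()   # deliver any partial final batch
```

A production version would also flush on a timer and apply backpressure when the sink falls behind, but the batching principle is the same.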
Single pane of glass for multiple dashboards:
Everyone deals with the "swivel chair problem" of too many screens. It is better to provide access to multiple dashboards via a single screen that can be used by multiple operators for different use cases. For example, integrating the tools for monitoring external-facing systems with internal infrastructure monitoring would simplify correlating issues external users are experiencing with the internal problems that are their root cause. There are many other examples.
Microservices depend on other microservices in the hybrid multi-cloud infrastructure but also on external services such as DNS and other services provided by third parties (such as public cloud providers). Quality of user experience becomes a function not just of the performance of each individual microservice but of the end-to-end performance of all microservices and external services strung together. Poor performance can violate customer SLAs, so it is necessary to be able to pinpoint the offending microservice or external service provider.
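As a toy illustration of pinpointing the offender: given per-hop latencies for a single request chained through several microservices and one external service, end-to-end latency is the sum of the hops, and the slowest hop is the first candidate for SLA trouble. The service names and latency figures below are hypothetical.

```python
# Per-hop latencies (ms) for one request through a chain of microservices
# plus a third-party external service. All values are illustrative.
hop_latency_ms = {
    "frontend": 12.0,
    "checkout": 30.0,
    "payments": 45.0,
    "external-dns": 180.0,   # external third-party service
}

# End-to-end latency is what the user (and the SLA) actually experiences.
end_to_end_ms = sum(hop_latency_ms.values())

# The slowest hop is the first place to look when the SLA is at risk.
worst_hop = max(hop_latency_ms, key=hop_latency_ms.get)
```

In practice these per-hop numbers would come from distributed tracing rather than a hand-written dictionary, but the attribution logic is the same: without per-hop visibility, only the 267 ms total is observable and the external provider's contribution is invisible.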
Active / synthetic monitoring:
The philosophy for external services is "trust but verify". Continuous active / synthetic monitoring of these services can be used to check their health and flag issues before they manifest as a degraded user experience.
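A synthetic probe boils down to periodically exercising the service and judging the result against a health budget. The sketch below is a hedged, minimal version: `check` stands in for a real HTTP request or DNS lookup against the external provider, and the latency budget is an assumed threshold.

```python
import time

# Illustrative synthetic probe: run a check (in practice an HTTP request or
# DNS lookup against the external service), measure latency, and classify
# the result so issues are flagged before users notice them.
def probe(check, latency_budget_ms=200.0):
    start = time.monotonic()
    ok = check()
    elapsed_ms = (time.monotonic() - start) * 1000.0
    if not ok:
        return {"status": "down", "latency_ms": elapsed_ms}
    if elapsed_ms > latency_budget_ms:
        return {"status": "degraded", "latency_ms": elapsed_ms}
    return {"status": "healthy", "latency_ms": elapsed_ms}

# Stand-in check for demonstration; a real probe would hit the service.
result = probe(lambda: True)
```

The "degraded" state is the interesting one: the service still answers, but a trend of near-budget latencies is exactly the early warning that lets operators act before SLAs are breached.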
SD-WAN monitoring issues:
SD-WAN specific monitoring issues include tracking connection utilization and congestion for traffic engineering and capacity planning. There is also the need to map and track the virtual connections across the SD-WAN to dependencies in the underlying network infrastructure. SD-WAN monitoring metrics also need to be tagged properly in order to perform the necessary correlations.
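For the utilization-tracking piece, a minimal sketch: from byte counters sampled over an interval (the kind of data SNMP or streaming telemetry would supply), compute per-link utilization and flag links over a capacity-planning threshold. The link names, counter values, and 70% threshold are all assumptions for illustration.

```python
# Sketch of SD-WAN link-utilization tracking for traffic engineering and
# capacity planning. Counter values and link names are illustrative.
def utilization_pct(bytes_delta, interval_s, capacity_bps):
    """Percent utilization from a byte-counter delta over a sample interval."""
    return (bytes_delta * 8) / (interval_s * capacity_bps) * 100.0

links = {
    # link: (bytes transferred in interval, interval seconds, capacity bps)
    "branch1-hub": (600_000_000, 60, 100_000_000),
    "branch2-hub": (60_000_000, 60, 100_000_000),
}

# Flag links above an assumed 70% capacity-planning threshold.
congested = [name for name, (b, t, cap) in links.items()
             if utilization_pct(b, t, cap) > 70.0]
```

Tagging each sample with the link's underlying circuit and provider (per the tagging discussion above) is what lets this utilization data be correlated with underlay-network events during troubleshooting.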
Multi-vendor SD-WAN integration (not interoperability):
IT executives acknowledge the reality that SD-WAN multi-vendor interoperability is not happening (yet), but they still have to deal with the following issues:
The group discussed strategies for SD-WAN interworking or integration that involve:
SD-WAN integration into hybrid multi-cloud environments:
A major focus of the SD-WAN 2.0 Working Group this year will be integration of SD-WANs into hybrid multi-cloud environments. There is a whole host of issues that need to be addressed, including:
This involves addressing issues at multiple layers, spanning both SD-WAN and the cloud:
And of course there are security requirements that span all three layers.
SD-WAN security and compliance issues:
The group has discussed a number of security-related issues that need to be addressed when moving from a private network (based on MPLS) to an SD-WAN. There are potential security gaps that arise and pitfalls that may go unnoticed. Some are purely a function of adopting an SD-WAN; others result from integrating SD-WANs with public cloud services. Here are some of the issues discussed:
SD-WAN deployment best practices:
The group would also like to help develop a set of operational best practices for SD-WANs, along these lines: