by Nick Lippis
The two largest barriers to entry for the software-defined revolution are skills and tools. The Software-Defined Enterprise will not become a reality unless there are new monitoring and analytics tools as well as new infrastructure DevOps engineers with the skills to put those tools to work. At ONUG, we’ve seen the lack of uptick in data center overlay deployments for this very reason; there is currently no visibility into underlay/overlay interaction, and especially into trouble-event causality. And don’t expect VMware, Docker, Amazon, Microsoft, Cisco, et al. to provide comprehensive software-defined management tools either, as they focus on delivering tools specifically for their own offerings. So as you can see, IT is forced to integrate a wide set of tools to deliver lifecycle management for the Software-Defined Enterprise. Still, there is some really good news here too. A large number of vendors offer tools with machine learning and AI integrated or on the roadmap. Companies like SolarWinds, NetScout, InfluxDB, Moogsoft, ServiceNow, SignalFx, Sensu, Splunk, Elastic, Wavefront, Nagios, New Relic, Gigamon, xMatters, PagerDuty, and others are not tied to cloud or enterprise infrastructure providers.
While the challenge of integrating a wide set of tools to deliver monitoring and analytics for workload management across private and public cloud software-defined infrastructure is daunting, the reality is that it’s the only choice. According to data from the ONUG Community, the number of endpoints most large enterprises support is in the 10K-100K range and will grow either moderately or, as 30% of the community indicated, significantly or explosively. IoT will push more enterprises into that significant-growth category.
Factor in that, by and large, there is a lack of end-to-end visibility across existing IT infrastructure caused by monitoring silos, and a lack of complete monitoring visibility across the IT stack thanks to monitoring gaps. Now think of a software-defined appliance like a load balancer that is sandwiched between workloads and infrastructure. This appliance is the glue in the application dependency map, and if it starts to fail, it can trigger a chain reaction: error logs flood the log database until its disks fill, crashing the database, which in turn crashes the VM controllers, ADC controllers, and load balancer, taking workloads offline until a resolution is reached. This happens all the time!
This scenario is easy to understand but tricky to identify, as it impacts the entire application dependency map, which is not fully visible thanks to gaps in software-defined management tools. But wait, it gets worse: consider a workload that is distributed between private and public clouds where there are no consistent tools. Or a workload hosted within a container that experiences performance degradation, only for the container to be killed and re-instantiated in another locale. Multiply that last scenario by 10, 100, 1,000, or 10,000, and its management is beyond human processing; we are into the domain of machine learning and AI.
So what’s the good news? There is a lot, actually. First, at ONUG Fall 2016, the 750 attendees indicated that they spend well over $1.5B on infrastructure tools; the broader market is significantly larger still. Moreover, in their keynote sessions, Chris Drumgoole, CTO at GE, and Gene Sun, Vice President, Global Network and Communication Services at FedEx, both said that they are using tools to transition workloads to cloud-based infrastructure. In short, they will start to fund new tools that support monitoring, analytics, machine learning, and AI for software-defined infrastructure management at the expense of older tools. The key point here is that existing spend on old tools is being reallocated to a new set of tools. The interpretation: limited new budget needed!
Now, while integrating a wide set of tools – some open source, some closed source, cloud-based and/or enterprise vendor-based – is difficult, the big upside is that control and options shift toward enterprise IT. That is, if cloud-based software-defined infrastructure management is built with third-party tools that are not tied to infrastructure products and services, then enterprise IT gains leverage, choice, and the option to swap infrastructure products/modules/services in and out over time.
A few specific industry standards would go a long way toward enabling IT engineers to build their monitoring and analytics 2.0 infrastructure. First is industry agreement on the type of state information that each physical infrastructure device and virtual module should contain – a catalog of state information stored in every piece of the infrastructure. This could be as simple as on/off or link up/down, each with a time stamp. Once software-defined infrastructure contains state information, there must then be a way to extract that information and store it in a data lake. In other words, there needs to be industry agreement on an open state format so that when state information is extracted from, say, a Cisco, Pluribus, or HPE switch and transmitted to an enterprise data lake, the format is the same. With a common format for the transfer and storage of state information, a massive data lake of time-stamped state information can be used to run analytics. In fact, an entire new infrastructure analytics market can emerge to leverage the techniques, tools, and skill sets gained from big data analytics.
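To make the idea concrete, here is a minimal sketch of what such a vendor-neutral state record could look like. The field names and catalog values here are purely illustrative assumptions, not an actual ONUG or vendor schema; the point is that records from any switch vendor land in the data lake in one common, time-stamped shape.

```python
import json
import time

def make_state_record(vendor, device_id, component, state):
    """Normalize one device state observation into a hypothetical common format."""
    return {
        "vendor": vendor,          # e.g. "cisco", "pluribus", "hpe"
        "device_id": device_id,    # switch or virtual-module identifier
        "component": component,    # e.g. "port-1/1", "power-supply-0"
        "state": state,            # simple catalog value: "up" or "down"
        "timestamp": time.time(),  # epoch seconds, for time-series analytics
    }

# Records from different vendors share one schema, so the data lake can
# ingest them uniformly, e.g. as newline-delimited JSON.
records = [
    make_state_record("cisco", "sw-42", "port-1/1", "up"),
    make_state_record("hpe", "sw-07", "port-2/3", "down"),
]
lake_payload = "\n".join(json.dumps(r, sort_keys=True) for r in records)
print(lake_payload)
```

Because every record carries the same keys regardless of vendor, analytics jobs downstream can query state transitions across the whole fabric without per-vendor parsing logic.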
Another industry standard or understanding should focus on monitoring – specifically, on how to move traffic to analytics engines so as to eliminate monitoring gaps and silos. In short, an open traffic monitoring format is needed for the software-defined management market to emerge.
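The silo problem above can be illustrated with a small sketch: each monitoring tap reports flows in its own native shape, and a thin normalization layer maps them onto one shared schema before they reach the analytics engine. The tool names and field names here are hypothetical, chosen only to show the mapping idea.

```python
def normalize_flow(source_tool, raw):
    """Map a tool's native flow fields onto a hypothetical shared schema."""
    field_maps = {
        # native field name -> common field name, per monitoring silo
        "netflow_tap":  {"src": "src_ip", "dst": "dst_ip", "octets": "bytes"},
        "cloud_mirror": {"from": "src_ip", "to": "dst_ip", "size": "bytes"},
    }
    mapping = field_maps[source_tool]
    return {common: raw[native] for native, common in mapping.items()}

# The same flow seen by two different silos converges on one record,
# so the analytics engine needs no per-tool parsing logic.
flow_a = normalize_flow("netflow_tap", {"src": "10.0.0.1", "dst": "10.0.0.2", "octets": 1500})
flow_b = normalize_flow("cloud_mirror", {"from": "10.0.0.1", "to": "10.0.0.2", "size": 1500})
print(flow_a == flow_b)  # identical records despite different sources
```

An agreed-upon open format would, in effect, standardize the right-hand side of that mapping so every vendor ships it natively rather than each enterprise maintaining translation glue.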
These three industry standards or understandings – state catalog, state format, and traffic monitoring format – promise to usher in a new era in software-defined management that unleashes the agility enterprises seek to compete in the digital transformation age, while giving IT teams more control and options around how they build infrastructure. The really good news is that this work is being done in the ONUG Monitoring and Analytics 2.0 initiative with wide support from the vendor and IT executive community.
Nick Lippis is an authority on corporate computer networking. He has designed some of the largest computer networks in the world. He has advised many Global 2000 firms on network strategy, architecture, equipment, services and implementation, including Hughes Aerospace, Barclays Bank, Kaiser Permanente, Eastman Kodak Company, Federal Deposit Insurance Corporation (FDIC), Liberty Mutual, Schering-Plough, Sprint, WorldCom, Cisco Systems, Nortel Networks and a wide range of other equipment suppliers and service providers.
Mr. Lippis is uniquely positioned to observe, analyze and comment on computer networking industry trends and developments. At Lippis Enterprises, Inc., Nick works with entrepreneurs evaluating new business opportunities in enterprise networking and serves as an independent investor and advisor.