Perspectives From Inside the Trenches: Real-World Network Automation

Use case narratives have long served as invaluable guidebooks. Especially in the digital world, where technology changes rapidly, tapping the minds of those who are in the trenches yields immeasurable insights. This is exactly what happened at ONUG’s conference last fall in NYC. The keynote session focused on large-scale automation made possible by Gluware. The panel consisted of Salvatore Rannazzisi, Principal Network Architect at Merck, and Kevin Carney, retired Principal Network Architect at Mastercard, and was moderated by Jeff Gray, CEO and Co-Founder of Gluware.

This fireside chat focused on real-world uses, the state of automation, options for driving global scale automation, as well as the business benefits of taking on large-scale automation. If you weren’t there, here’s what you missed.

Identifying the Problem in Network Infrastructure

The networking world has gone from early Ethernet, where you just plugged in a tap and life emerged, to hundred-gigabit virtual networks. These technologies have enabled all the things we use every day, including the Internet and our cell phones. While the infrastructure has changed, however, some things have not. For example, engineers still walk up to a router, type in an IP address and make changes manually. More than 90 percent of issues result from manual error. We’ve been following this same process for more than 40 years with the same result. That’s called insanity.

The same inefficient process permeates industries. Rannazzisi admitted, “We take all this time getting a standard approved, but by the time it gets published, the standard is already old.” There was simply no way to roll a standard out to production, across tens of thousands of devices, in a reasonable amount of time. As a result, his company was always “out of compliance.”

Some teams tried scripting, legacy tools that used tertiary copying of a configuration, or even expensive Perl scripting. None of these methods was effective.

How Automation Provides a Solution

How does automation improve network lifecycle management? That was the question presented to Kevin Carney. First, he said, we must get rid of the manual piece. Automation is going to enable us to provide customers, both external and internal, with a more stable environment, so they can do their business. Second, automation is critical to security. Most of the time, an individual logs in to a router, switch, firewall, etc. to put in a standard configuration. You just hope everyone involved does it right and you get a golden configuration across everything, front to back.

Carney described this scenario. You have a really smart team, but turnover leaves you with a situation where “you’ve drained the swamp and the alligators are all over.” You’ve suddenly got 10 people making changes, and they finally get the smart idea that there needs to be an access control list. But they forget about it. Automation can recognize that, because it checks devices against the standard configuration and puts the access control list back immediately. It’s critical to give the right people the right access.
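The check-and-restore behavior Carney describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Gluware works: the `GOLDEN` dictionary, the `MGMT-ACCESS` name, and the Cisco IOS-style lines are hypothetical, and the parser assumes a simple one-level indented configuration.

```python
# Hypothetical golden standard: section header -> required child lines.
GOLDEN = {
    "ip access-list standard MGMT-ACCESS": [" permit 10.0.0.0 0.0.0.255"],
    "line vty 0 4": [" access-class MGMT-ACCESS in"],
}


def parse_sections(config):
    """Group indented lines under the unindented header line above them."""
    sections = {}
    current = None
    for raw in config.splitlines():
        line = raw.rstrip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and top-level comment markers
        if not line.startswith(" "):
            current = line
            sections.setdefault(current, set())
        elif current is not None:
            sections[current].add(line)
    return sections


def remediation_commands(running, golden):
    """Return the lines needed to restore any missing golden settings."""
    have = parse_sections(running)
    commands = []
    for header, children in golden.items():
        missing = [c for c in children if c not in have.get(header, set())]
        if header not in have or missing:
            commands.append(header)
            commands.extend(missing)
    return commands


# Example: someone removed the access class from the VTY lines.
drifted = """\
ip access-list standard MGMT-ACCESS
 permit 10.0.0.0 0.0.0.255
line vty 0 4
 transport input ssh
"""
print(remediation_commands(drifted, GOLDEN))
```

For the drifted example, the function returns the `line vty 0 4` header followed by the missing `access-class` line, which is the minimum needed to put the access control list back in force.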

The next area that can reap the benefits of automation is auditing. Most industries, including the government, must provide an audit to someone. Whether it’s annual, biannual, quarterly or monthly, it’s a difficult process. Numerous devices must be checked against a standard, which is very labor-intensive. However, automation can grab the configuration and check it against a golden configuration on a daily basis. So, when the audit comes up, the work is already done. Labor is reduced and a lot of people are happy. Remediation is a key piece of reaping the benefits of automation, both in terms of audits and troubleshooting.
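A daily check against a golden configuration might look roughly like the following sketch, which uses Python's standard `difflib` module. The device names, the sample configuration lines, and the `audit_device` and `audit_fleet` helpers are assumptions made for illustration; a real audit would pull configurations from the devices themselves.

```python
import difflib


def audit_device(name, running, golden):
    """Compare one device's configuration with the golden configuration."""
    diff = list(
        difflib.unified_diff(
            golden.splitlines(),
            running.splitlines(),
            fromfile="golden",
            tofile=name,
            lineterm="",
        )
    )
    # An empty diff means the device matches the standard exactly.
    return {"device": name, "compliant": not diff, "diff": diff}


def audit_fleet(configs, golden):
    """Audit every device and return one report row per device."""
    return [audit_device(n, c, golden) for n, c in sorted(configs.items())]


# Hypothetical sample data standing in for collected configurations.
GOLDEN_CONFIG = "ntp server 192.0.2.10\nlogging host 192.0.2.20\n"
FLEET = {
    "rtr-a": "ntp server 192.0.2.10\nlogging host 192.0.2.20\n",
    "rtr-b": "ntp server 192.0.2.10\n",
}

report = audit_fleet(FLEET, GOLDEN_CONFIG)
for row in report:
    print(row["device"], "OK" if row["compliant"] else "DRIFT")
```

Running a report like this every day means the evidence an auditor asks for already exists, and each non-empty `diff` points at the lines that need remediation.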

Real-World Applications

Merck is far along in its network automation. The company has focused on getting common services across its numerous devices, making compliance much more effective. It started with routers and has since moved to code motions on routers. Currently, it has automation models for every type of use case for every type of switch in its environment, including manufacturing and isolated security networks. The team is all in on a full layer 2 audit, including network access control.

Rannazzisi explained that they are now performing network access control (NAC) efficiently, including its plethora of commands. If someone gets approval for a NAC exception, they can simply put a description on that interface and run the automation model again.
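One way to picture a description-driven exception is the sketch below. It is a guess at the general pattern rather than Merck's or Gluware's actual model: the `NAC-EXCEPTION` tag, the `Interface` class, and the two Cisco IOS-style 802.1X lines are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical marker an engineer adds to the description once an
# exception has been approved.
NAC_EXCEPTION_TAG = "NAC-EXCEPTION"

# Illustrative access-control lines applied to every non-exempt port.
NAC_COMMANDS = [
    " authentication port-control auto",
    " dot1x pae authenticator",
]


@dataclass
class Interface:
    name: str
    description: str = ""


def render_interface(intf):
    """Build interface lines, skipping NAC when the description is tagged."""
    lines = [f"interface {intf.name}"]
    if intf.description:
        lines.append(f" description {intf.description}")
    if NAC_EXCEPTION_TAG not in intf.description.upper():
        lines.extend(NAC_COMMANDS)
    return lines


user_port = Interface("GigabitEthernet1/0/1", "User port")
printer_port = Interface("GigabitEthernet1/0/2", "Printer NAC-EXCEPTION")

for port in (user_port, printer_port):
    print("\n".join(render_interface(port)))
```

Because the exception lives in the interface description, rerunning the same model is enough to apply it, and the description doubles as a record of why the port differs from the standard.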

Time savings is the most tangible benefit. Rannazzisi’s example was that of getting a change done. Like many organizations, Merck used managed service providers (MSPs). However, global changes required a statement of work, and often a wait of six to nine months while the MSP ramped up its staff to perform the job, then accessed each device to make the change manually. Merck has cut that time by 90 percent, and it doesn’t pay an MSP to do the work.

“Someone just asked me to make a change for load balancing on port channels, and it was literally a five-minute change,” bragged Rannazzisi. It took longer for the process to catch up to the automation. 

In addition to saving time, the company is now in compliance. Rannazzisi admitted they were always out of compliance and failing audits before Gluware because of the paper standards paradox. They’d make a paper standard change and simply not have the resources to go and do it. 

How to Move Forward

How can companies that want to move forward do it? Carney spoke about the complexity of change. It’s going to require labor hours and varying skill sets. Plus, you must consider security, both in flight and at rest. Companies can choose from many methods to move forward. Carney described them this way.

  • Command-line scripting has been used for decades. Even if you think you may be sophisticated, it’s still a manual process, open for human error.
  • Roll Your Own, also called formal scripting, involves choosing a scripting language, getting a catalog and creating playbooks. While there can be some advantages, it is still ultimately a manual process that requires a lot of work. You must create a playbook for every device and all of its varying functions. Once the playbooks are created, you’ll also have to maintain them. As for skill levels, you may have to take your network engineers off one project to learn a new language, and it will take time for them to gain enough experience to be truly effective. You must also think about the security of the databases and the security of the scripts. It’s really just a “pseudo automated way.”
  • An automated approach, like Gluware, is going to require a learning curve, and companies must be prepared for that. The difference is that the knowledge gained to create configurations is transferable. Your team will continue to use the skills they accumulate. The GUI format makes it easy to train others to do small tasks based on their role, making the whole team more efficient. With an automated approach, when vendors change their iOS login or add features, all you have to do is add a parameter to your config. That’s it. Your database is the source of all knowledge. It is fed by other applications and it feeds other applications.
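The idea that a change amounts to adding a parameter to data, rather than editing each device, can be illustrated with a minimal Python sketch. It is not Gluware's implementation: the `INVENTORY` records, the `DEFAULTS` dictionary, the hostnames, and the Cisco-style `port-channel load-balance` line are hypothetical stand-ins for a real source-of-truth database.

```python
from string import Template

# One template shared by all devices; values come from data.
TEMPLATE = Template(
    "hostname $hostname\n"
    "port-channel load-balance $lb_method\n"
)

# Fleet-wide defaults: changing a value here changes every device
# that does not override it.
DEFAULTS = {"lb_method": "src-dst-ip"}

# Hypothetical per-device records, as a database might supply them.
INVENTORY = [
    {"hostname": "sw-nyc-01"},
    {"hostname": "sw-lon-01", "lb_method": "src-dst-mac"},
]


def render(device):
    """Merge defaults with per-device overrides and fill the template."""
    params = {**DEFAULTS, **device}
    return TEMPLATE.substitute(params)


for device in INVENTORY:
    print(render(device))
```

In this arrangement the engineer edits one value in the data and regenerates the configurations, instead of maintaining a separate script or playbook per device.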

“Changing How We Do Business”

That’s how Carney described automation. All these years, we’ve gone from network to network, but never changed how we manage the infrastructure. Automation is changing that, and it’s significant. Rannazzisi described it as “intelligent copying and pasting.” Automation enables you to remediate issues very quickly and stay in compliance.

ONUG is hosting another informative event on building, running and securing the enterprise cloud. ONUG Digital Live will take place May 6 and 7. Register by April 1 to reserve your complimentary pass. Don’t miss it! Learn more here or contact us.

Author's Bio

Stephen Collins

Working Group CTO, ONUG