Automation in the Age of the Remote Workforce: Terracon and Gluware Show the Way
Throughout the years at ONUG, Gluware has consistently shown how it partners with customers. It is unique in that its customers advocate for it in public, even though getting internal approval to do so is a difficult process. Time and time again, Gluware’s customers do it anyway, because they are delighted with the outcome and because Gluware delivers what it promises.
Jamie Hughes, Infrastructure Architect at Terracon, delivered a keynote during ONUG Digital Live this past May with Jeff Gray, Gluware CEO. Automation is a top five issue across all ONUG Community members. Gluware is the leader in Intelligent Network Automation, delivering an Intent-Based orchestration engine that empowers Network Operations to successfully automate and orchestrate mission-critical networks at scale. Gluware is a proven automation platform at some of the largest corporate brands, such as Mastercard and Merck.
In this keynote, Jamie presents the new reality that the pandemic brought us; according to ONUG Community members, this way of working is the new norm. To support a highly distributed workforce, a company’s infrastructure needs to be elastic. Terracon, being an engineering firm, is constantly adding and removing sites. Automating that kind of elastic infrastructure is something nearly all large enterprises are developing strategies to address.
As businesses reassess and rationalize their current projects, processes, real estate and IT spend, two things come into focus: 1) buyer and supplier partnerships are fundamentally important, even as these relationships are changing, and 2) digital transformation projects and staffing are by far winning the funding battles. In short, now more than ever, IT buyers and suppliers rely tightly on each other to solve problems as corporations quickly downscale real estate and serve customers, partners and suppliers digitally.
Terracon and Gluware show how good partners work with each other to deliver a positive outcome. Terracon needed a solution to automate remote site deployment and change management that was multi-vendor, left an audit trail for compliance assurance, assured the remote site was configured correctly and kept track of inventory. Jamie and his team chose Gluware to automate their remote workforce elastic infrastructure, and in the process, they saved time and money and gained better security. But the biggest payoff is that Terracon is more digital, agile and flexible in serving its customers, suppliers and partners.
Nick Lippis, ONUG Co-Founder/Co-Chair
Jeff Gray, Gluware CEO and Co-Founder
Jamie Hughes, Terracon Infrastructure Architect
Jeff Gray, Gluware CEO & Co-Founder
Welcome everyone to the Gluware Terracon keynote, “Automating Business Continuity: Network Life Before, During and After COVID-19.”
I’m Jeff Gray, CEO and Co-founder of Gluware, and I’m excited to introduce Jamie Hughes of Terracon.
We have a very relevant and timely case study that Jamie will be sharing today. Jamie is one of the most talented and capable network architects that I’ve had the pleasure to come across. He found a critical issue in his company about a year ago and he set out to find an automation platform that would work for his needs and for Terracon’s needs. And he got his hands dirty; he got very involved; he made his decision and he executed. And Terracon is leading the industry with its automation efforts and reaping the benefits. Now, Terracon was reaping the benefits before the pandemic, but now that the pandemic has hit, those efforts are paying off even more, and Jamie will discuss that.
It is my honor to introduce Jamie Hughes of Terracon to present this keynote.
Thanks for that introduction, Jeff. This is Jamie Hughes with Terracon.
We’re an engineering firm that’s employee-owned. We have more than 5,000 people across the United States, serving all 50 states. Our growth patterns are 50% through acquisition and 50% organic. One of the challenges that presents is we’re always spinning up new sites, spinning down sites, and moving, so there’s a lot of change, and that leads into why we were looking to automate our network.
I’m fairly new at Terracon, so one of the things I was trying to get to know is, “How does the company work?” We operate in all 50 states, and one of the challenges that presents is that we don’t have IT staff at those sites. They’re more centrally located in the Kansas City area. We don’t have remote hands at all the sites, so we need to perform all those functions remotely.
Some of the challenges that we were presented with are
Some of the top requirements we had:
At the start of this, these are a lot of the issues that Terracon was facing:
Terracon’s Evaluation-Decision Process. We looked at several different things:
DIY (Do-It-Yourself) was pretty much out the window from the start.
One of the reasons was we just don’t have the resources to apply to learning new skills like Python, databases, and all the platforms, plus the ongoing maintenance. When a new piece of hardware is delivered, we need to QA it to onboard it. We just don’t have the resources to perform that kind of task.
Then we looked at other tool sets that can help us do this automation. A lot of them were limited to one vendor.
They also lacked validation, which we would have needed to build ourselves to make sure we made all these changes correctly.
That led us to Gluware.
How does Gluware work?
Basically, it’s broken up into multiple parts. The nice thing about this is you can take your own current configuration (what your standard is, or where you want to be if you haven’t met that standard yet) and work from that. No coding is required: you don’t have to learn Python or all that fancy stuff. We use the CLI that we want from other devices to allow us to do that.
The pre and post checks. The nice thing is you can actually take your configuration and ask, “What is this change going to do when I apply it?” You can actually go out and have it do a live connect to the real production device. It won’t make any changes. It just pulls down the configuration, takes the changes you’re looking to make, and then spits out what config changes are required for that device.
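The pre-check idea described above can be illustrated with a small Python sketch. This is not Gluware’s implementation or API; it is a minimal example that assumes the running configuration has already been pulled from the device as plain text. The function name `preview_changes` and the sample configuration lines are hypothetical. It only reports which lines a change would add or remove and never touches a device.

```python
import difflib


def preview_changes(running_config, intended_config):
    """Report the lines a change would add or remove, without applying anything."""
    running = [line.rstrip() for line in running_config.strip().splitlines()]
    intended = [line.rstrip() for line in intended_config.strip().splitlines()]

    added, removed = [], []
    for entry in difflib.ndiff(running, intended):
        # ndiff prefixes each line with a two-character code:
        # "+ " only in intended, "- " only in running, "  " unchanged.
        if entry.startswith("+ "):
            added.append(entry[2:])
        elif entry.startswith("- "):
            removed.append(entry[2:])
    return {"add": added, "remove": removed}


# Hypothetical sample data: the running config as pulled from a device,
# and the standard we want it to match.
RUNNING = """
hostname site-rtr-01
ip access-list standard MGMT
 permit 10.0.0.0 0.0.0.255
"""

INTENDED = """
hostname site-rtr-01
ip access-list standard MGMT
 permit 10.0.0.0 0.0.0.255
 permit 10.0.1.0 0.0.0.255
"""

print(preview_changes(RUNNING, INTENDED))
```

A plain line diff like this ignores where a line sits in the configuration hierarchy, so a real tool would need to be context-aware; the sketch only shows the "preview, don’t apply" pattern.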
What’s the result of that going to look like to automate a policy at scale? That piece is fantastic because now I can take my QoS policy and, if we do one of those acquisitions and I’m going to add five new routers, I can just take that and push that configuration out to those five new sites. It really sped up the deployment of moves and changes in our operation.
The other fantastic thing is that it works in brownfield. The problem with a lot of the tools I looked at was that almost all of them required you to start with a rebuild (of the entire configuration), and nobody’s network looks like that.
Some of the initial use-cases and quick wins.
The device manager was a huge one. We were able to get a lot of our devices in there and get inventory on what versions of software they’re running, and that was really helpful. Do they have SmartNet on them? Do we need to renew that this year or next year? When is it expiring?
Configuration drift and audit. To me, that was a really powerful component, because what it let us do is use that resource to go and find out what kind of errors we have in the environment. I noticed that, for example, our operations team was spending a lot of time troubleshooting access lists. So one of the first pieces we automated was the access list: do we have a golden standard, and how can we push it out to devices and waste less of people’s time?
The OS manager. One of the fantastic things about this is we had a lot of different versions of operating systems in our environment. This has really helped us cut that down. As we deploy those new sites, we have a new standard. It’s really easy to do. And the nice thing is the operations teams are willing to do that, because now they literally hook a device up, push the configuration out to upgrade the operating system, and away we go. It’s really simple.
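A drift audit against a golden standard can be sketched in a few lines of Python. This is an illustration only, not how Gluware performs its audit; the function name `audit_acl`, the device names, and the access list entries are all hypothetical. For each device it reports entries that are missing, entries that should not be there, and whether the shared entries appear in a different order than the standard (order matters in an access list).

```python
def audit_acl(golden, device_acls):
    """Compare each device's ACL entries with the golden standard.

    Returns a report containing only the devices that have drifted.
    """
    report = {}
    for device, entries in device_acls.items():
        missing = [line for line in golden if line not in entries]
        extra = [line for line in entries if line not in golden]
        # Entries present on both sides should appear in the same order.
        shared_on_device = [line for line in entries if line in golden]
        shared_in_golden = [line for line in golden if line in entries]
        out_of_order = shared_on_device != shared_in_golden
        if missing or extra or out_of_order:
            report[device] = {
                "missing": missing,
                "extra": extra,
                "out_of_order": out_of_order,
            }
    return report


# Hypothetical golden standard and device data.
GOLDEN_ACL = [
    "permit tcp 10.0.0.0 0.0.255.255 any eq 22",
    "permit udp 10.0.0.0 0.0.255.255 any eq 161",
    "deny ip any any log",
]

DEVICE_ACLS = {
    "kc-rtr-01": list(GOLDEN_ACL),
    "tx-rtr-07": [
        "permit tcp 10.0.0.0 0.0.255.255 any eq 22",
        "permit ip any any",
        "deny ip any any log",
    ],
}

for name, findings in audit_acl(GOLDEN_ACL, DEVICE_ACLS).items():
    print(name, findings)
```

Devices that match the standard are left out of the report, so an empty result means no drift was found.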
The most powerful component of this system is the Config Modeling. We can take those pieces of the CLI config we have and break them down into their component parts. The great thing about this is it really makes it easy to get into the solution and start automating things and feel like you’re getting somewhere. But you’re not trying to model the whole config. Once you can break it down into its component pieces, you can really move that bar forward in your environment.
Completed automation projects.
Over the last year, these are several of the projects we’ve worked on automating in our environment. One of the first things, where we took some little piece, was fixing our access list. Some of the bigger projects we worked on were redoing our QoS. Since we were doing all this stuff manually, it hadn’t been looked at in a while. We were missing applications in there; things had changed or moved in and weren’t getting classified correctly any longer.
First, we sat down and figured out, “Where do you want to be?” And then, basically, we took it from there: “Okay, here’s where we want to be.” We built a policy in the tool, pushed that out, and then, since that was one of the first major things we had tried, we went back and audited ourselves: “What does this look like?” I was really impressed by the precision we got from that automation exercise. Before, 20% of our sites wouldn’t function correctly. After doing this, we literally audited every device, and they were all correct. It was just fantastic!
The other piece was the site-to-site VPN failover for MPLS. What was happening was that when our MPLS circuits went down, our employees lost connectivity to the data center. In this case, we were able to use automation to speed up those deployments and get that out faster. Our employees, instead of twiddling their thumbs when a circuit goes down, fail over to the internet circuit or VPN tunnel and get back to our data center. It’s a huge win for us. Some of the sites could have been down for hours in an outage; now they didn’t even notice.
The other piece that’s really nice is the vulnerability remediation. A lot of times, we get a security vulnerability notice and it says you have to be on this specific operating system and you also need to have these commands, or not have these commands, on your device. With Gluware, I can go take that inventory management and find out how many devices are on that version of code. And then I can go run an audit against something like, “Do they have this command line in there?” It makes it really easy to figure out if I have devices that are vulnerable and if I need to do something. Or maybe I don’t need to do anything at all.
And then the firewall deployment, one of the last things we’ve been working on here. Some of my team members have had some time during the pandemic and they’ve actually gone in and automated the full firewall deployment, so now we can take one of those devices and completely automate the deployment of it, which is a huge win for us.
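The two-step check described above (first filter the inventory by OS version, then audit for a specific command) can be sketched as follows. This is an assumption-laden illustration rather than Gluware’s API: the `Device` class, the `find_exposed` function, the version strings, and the trigger command are all hypothetical and do not refer to a real advisory.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    os_version: str
    config_lines: list = field(default_factory=list)


def find_exposed(devices, affected_versions, trigger_command):
    """Return names of devices on an affected version that also carry the trigger command."""
    exposed = []
    for device in devices:
        on_affected_version = device.os_version in affected_versions
        has_trigger = any(
            line.strip() == trigger_command for line in device.config_lines
        )
        if on_affected_version and has_trigger:
            exposed.append(device.name)
    return exposed


# Hypothetical inventory; versions and commands are illustrative only.
INVENTORY = [
    Device("kc-rtr-01", "16.9.4", ["hostname kc-rtr-01", "ip http server"]),
    Device("tx-rtr-07", "16.9.4", ["hostname tx-rtr-07", "no ip http server"]),
    Device("co-rtr-03", "17.3.5", ["hostname co-rtr-03", "ip http server"]),
]

print(find_exposed(INVENTORY, {"16.9.4"}, "ip http server"))
```

In this sample only the first device meets both conditions; the second is on the affected version without the command, and the third has the command on an unaffected version, which matches the "maybe I don’t need to do anything at all" outcome.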
Speaking of that, the pandemic presented several challenges.
Some of those were our inability to travel, changed network traffic patterns, users who aren’t at the site anymore and are working from home, and supply chain challenges.
So, automation helped us in several ways with this.
Normally, when we were deploying a new site for another company we bought, we needed to travel. We would send our employees out to those sites to spin up the network equipment. Well, now we’re unable to travel, so “How do we resolve that?”, because that work just doesn’t stop.
What we started doing is, with Gluware, pushing the configuration out through the out-of-band modem. It allowed us to kind of switch our strategy. Instead of shipping all the items to our corporate office, we could dropship the equipment out there, use the out-of-band modems, push those configurations out, get the device online, and then upgrade its operating system right at the site. Then we can just hire local support to go out and cable it up for us.
The other thing it allowed for was that, with the change in traffic patterns, we found that some of our QoS wasn’t as effective or needed to be put in different places. For example, we had a huge expansion of people working remotely from home, so now we needed QoS on our internet edge. We were able to take those modular policies in the tool and, with just a couple of hours of modeling and changing them to fit our environment for the internet edge, push that out. Then the nice thing is, later on, we could tweak that for our data center. Now, with that being modular, I can go add an application, and I know that when I push it out to all my data center devices, it’s automatically going to have that updated config on there.
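The benefit of a modular policy can be shown with a short Python sketch: application classes are defined once, and every policy rendered from them picks up a newly added application automatically. This is only an illustration of the idea, not Gluware’s modeling format. The function `render_policy`, the class and policy names, and the Cisco-style CLI it emits are assumptions made for the example.

```python
def render_policy(policy_name, app_classes):
    """Render one QoS policy as CLI-style text from shared application classes."""
    lines = []
    for app in app_classes:
        lines.append(f"class-map match-any {app['name']}")
        lines.append(f" match {app['match']}")
    lines.append(f"policy-map {policy_name}")
    for app in app_classes:
        lines.append(f" class {app['name']}")
        lines.append(f"  {app['action']}")
    return "\n".join(lines)


# Hypothetical shared application classes, defined once and reused everywhere.
APP_CLASSES = [
    {"name": "VOICE", "match": "dscp ef", "action": "priority percent 20"},
    {"name": "VIDEO-CONF", "match": "dscp af41", "action": "bandwidth percent 30"},
]

# Add an application in one place...
APP_CLASSES.append(
    {"name": "REMOTE-DESKTOP", "match": "dscp af21", "action": "bandwidth percent 10"}
)

# ...and every policy rendered from the shared list includes it.
for policy in ("INTERNET-EDGE-OUT", "DATACENTER-OUT"):
    print(render_policy(policy, APP_CLASSES))
    print()
```

Keeping the application list separate from the policy name is the design choice that lets the same definitions serve both the internet edge and the data center.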
The other thing it allowed us to do was change the routing in the environment. Now that people are home, in a lot of cases a site’s internet circuit had more upload bandwidth than its MPLS circuit. We can flip the routing around on those circuits on the fly to direct traffic in the most efficient manner.
And then here’s some of the benefits Terracon realized in our automation environment.
Some of the lessons that Terracon learned out of this was
And with that, I’d like to turn it over to Jeff Gray again.
Jamie, thank you so much for sharing your story with the ONUG community. Because of your vision, Terracon is reaping a tremendous amount of benefits, and you were very early, and you’ve executed further than many companies out there in the industry. I want to thank you for your partnership, and I want to thank you once again for sharing the story.
And with the benefits that Jamie and other customers have realized, we have decided to invest in others during this pandemic.
I have two announcements to make here at ONUG Digital Live. One is Gluware has now partnered with Microsoft and Gluware is now available in the Microsoft Azure Marketplace. We are now delivering a 30-Day Free Pilot-to-Production Trial Offer.
This is much more than just testing software. This is approximately a $25,000 value and it includes software, support, design, and training for qualified customers. We want to be able to share the benefits of automation in the same way that Terracon tested and rolled to production. Now, you can do this much faster because, within Microsoft Azure, you can download Gluware directly into your Azure tenant, spin up in minutes, and get automated. We want to invest to support that because it delivers a lot of benefits for customers, especially in a time like this. Please apply in the lower right-hand corner for our business continuity offer. Gluware will work with you, support you and partner with you.
And with that, thank you for your time and stay safe.