Monitoring and Analytics: February 15, 2017 - Initiatives Workshops Notes

by Karlo Zatylny, SolarWinds

Below are my notes from the meeting. They are somewhat scattered because the conversation jumped around:

  • Open Traffic Monitoring Format
  • Where to focus traffic analytics?
    • WAN
      • SDWAN is becoming more popular but should not dominate our methodology
    • LAN
    • Data center
    • Enterprise
  • How do we use different technologies to provide insight into the application
    • Flow, interface utilization, etc. need to tie back to an overall metric that focuses on the performance of the target application infrastructure
    • How do the statistics allow the person monitoring the network to be proactive in debugging problems?
      • Should know the problem before the user calls
      • Should be able to describe the impact zone
  • Define a framework that focuses on the application and is agnostic to network infrastructure (LAN, WAN, Hybrid, data center)
  • Analysis should apply to each layer from carrier to enterprise to SMB
  • How do different layers of networks interact with monitoring and analytics
    • How does the enterprise speak a common analytics language with the carrier, using common analytics terms and methods to accurately describe issues that cross from one network into another?
  • Is there a way for carriers and customers to communicate without disclosing information that is closed to the public
    • Does analytics play a role in defining a language that facilitates troubleshooting problems across networks?
  • What is the output of ONUG M&A group?
    • Use cases?
    • Framework?
  • What are the common threads and technologies used today that can be used for analytics?
  • Use the term “service centric” as opposed to “application” so that we include technology like LDAP and other services that are not necessarily tied to a single end-user application
  • If I am troubleshooting:
    • What is the information needed?
    • What analysis could be done on that information to give better insight to drive to action?
    • Are there historical issues that share similar traits that could be used for analysis?
  • Output should be in the format:
    • Problem Statement
    • Use Cases
    • Data available
    • Analytics possible
    • Desired plausible actions
    • Cause – Resolution description
  • How do we progress from “my application is slow/broken” to finding root cause, to identifying a resolution?
    • Understand application/service topology and dependencies
    • Understand what metrics describe specific applications and their behavior
    • What analytics can point to anomalous behavior?
    • No need to distinguish hardware vs software metrics but their relationships need to be understood
  • What are the symptoms of common problems caused by common configuration issues?
    • What does BGP flapping often manifest as?
    • What symptoms do route configuration issues often have?
  • How can we be predictive, knowing when green is about to go to yellow?
    • Can predictive analytics forecast when a green status is about to move to yellow/red?
  • How do different levels of depth play a role?
    • Response time → NetFlow → DPI
  • Can we create tools and protocols to examine real time tracing of an application?
    • This would segment the problem
  • With a view on the future, how do we help shape future tools with the right monitoring and analytics, so that users get a method for identifying, troubleshooting, and fixing service issues?
  • TODO Action before March 1st: Come up with a skeleton document that can be used to collaborate and unify the working set of ideas
    • Put something out there for everyone to comment
    • What is the format that we need to use?
  • TODO Action before March 1st: Write use cases. Volunteers are needed for writing them:
    • Please fill in your name and specific area. I heard one person on the call but don’t know who it was
    • Format: use case, specific problem statement, associated man-hours
  • Areas of focus
    • Data center
    • SDWAN
    • Well-documented use cases for presentation in April
    • How to collaborate moving forward