Wednesday, March 29, 2017

Data for DevOps: Part III

In the first and second parts of this series, we introduced the idea that data analytics can significantly improve IT outcomes and facilitate the transition to DevOps practices. We described the three data types available to IT departments (Operations Data, Monitoring Data, and Event Data), and we went into more detail about how to use Operations Data and Monitoring Data in the form of continuous streams of metrics. This post focuses on the value of a special category of Monitoring Data: Event Data.

Diving Deeper: Event Data

At a fundamental level, we can understand the job of IT operations teams as delivering on operational specifications such as discoverability, availability, and persistence. In practice, this might mean meeting a specification for page-load times for a web application. Most IT departments already use monitoring data to automatically trigger alarms, alerting operations staff to issues as soon as they arise.

While this is a considerable improvement over waiting for user complaints, analyzing event logs alongside monitoring streams can help solve two major issues with this approach, false alarms and reactive response, and opens up a third possibility, self-healing:

  • False alarms – Tuning any automatic alert is difficult. IT departments often find themselves balancing alerts sensitive enough to identify issues quickly against the risk of flooding their systems with false positives. Analysis of the monitoring streams that preceded various event types can help target alerts at specific event types. Multi-variable analytics methods can also allow for more robust, and even self-learning, alerts.
  • Reactive response – While simple, rules-based alerts that, for example, trigger when a metric crosses a given threshold certainly expedite IT department response times, they remain inherently reactive. The systems still fail or, depending on the alert design, come close to failing. Analytics methods that seek to model historical data and understand system behavior prior to events provide the opportunity to develop predictive models that allow for proactive attention. These analytical methods can identify behavior patterns that have evolved into incidents in the past, providing a warning to perform service activities prior to a failure occurring.
  • Self-healing – The move from reactive to proactive service attention enables significant improvement in IT operations. Perhaps the most exciting prospect of this type of analysis, though, is the ability to remove the need for service attention altogether. By coupling the data-driven failure analysis enabled by Monitoring and Event Data with Operations Data such as service logs, IT organizations can begin to automate responses to specific behavior patterns, improving performance and efficiency.
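
To make the contrast between reactive and proactive alerts concrete, here is a minimal Python sketch. It is an illustration only, not a method from this series: the function names, the page-load samples, the 600 ms limit, and the use of a simple linear trend as the "predictive model" are all assumptions chosen for brevity. A real deployment would more likely fit a model to the historical streams that preceded past events.

```python
from statistics import mean


def threshold_alert(samples, limit):
    """Reactive rule: fire only once the latest sample has crossed the limit."""
    return samples[-1] > limit


def slope(samples):
    """Least-squares slope of the samples against their index (units per step)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    x_bar = (n - 1) / 2
    y_bar = mean(samples)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(samples))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def predictive_alert(samples, limit, horizon):
    """Proactive rule: fire if the fitted linear trend, extended `horizon`
    steps past the latest sample, would cross the limit."""
    projected = samples[-1] + slope(samples) * horizon
    return projected > limit


# Hypothetical page-load times in milliseconds, one sample per interval.
page_load_ms = [410, 430, 455, 470, 500, 520]

reactive = threshold_alert(page_load_ms, limit=600)
proactive = predictive_alert(page_load_ms, limit=600, horizon=5)
print(f"reactive alert: {reactive}, proactive alert: {proactive}")
```

With these made-up samples the threshold rule stays quiet because the latest value is still under the limit, while the trend-based rule fires because the projected value five steps ahead exceeds it. That gap is the window in which staff, or an automated response, could act before a failure.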

Outside of direct operational improvements, Event Data also provide retrospective information for evaluating performance metrics. For example, event logs can allow a department to track the number of times a resource was unavailable when requested. Understanding patterns in these logs and correlating them with other operational changes provides an objective measure of the value of DevOps interventions.
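
As a rough sketch of that kind of retrospective measure, the Python snippet below counts "unavailable" events per ISO week. The log format (a list of date, resource, and event-type tuples), the event names, and the sample entries are hypothetical; real event logs would need to be parsed into a similar shape first.

```python
from collections import Counter
from datetime import date

# Hypothetical event log entries: (day, resource, event_type).
events = [
    (date(2017, 3, 1), "db", "unavailable"),
    (date(2017, 3, 2), "cache", "unavailable"),
    (date(2017, 3, 3), "db", "restored"),
    (date(2017, 3, 7), "db", "unavailable"),
]


def unavailable_per_week(log):
    """Count 'unavailable' events, keyed by (ISO year, ISO week)."""
    counts = Counter()
    for day, _resource, kind in log:
        if kind == "unavailable":
            iso_year, iso_week, _weekday = day.isocalendar()
            counts[(iso_year, iso_week)] += 1
    return counts


weekly = unavailable_per_week(events)
for (year, week), count in sorted(weekly.items()):
    print(f"{year}-W{week:02d}: {count} unavailable event(s)")
```

Comparing these weekly counts before and after an operational change gives a simple, objective starting point for judging whether the change helped.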

To the extent that DevOps philosophy and practices aim to transform IT departments into proactive entities that can efficiently deliver services across a business, data analytics offers an obvious tool for designing and prioritizing new processes and practices. In the coming months, we will examine the ideas discussed above in more detail.

About Arti Garg

As a Principal Consultant at Datapipe, Arti strives to ensure that data and analytics solutions complement and support each Datapipe client's existing processes and culture. Arti has a deep understanding of how any new solution, technical or otherwise, must align with an organization's mission, culture, and values. She writes about her experience helping enterprises identify gaps in their existing processes and ways that data-driven software solutions can effectively close them.
