Sunday, February 19, 2017

Data for DevOps: Part III

In the first and second parts of this series, we introduced the idea that data analytics can significantly improve IT outcomes and facilitate the transition to DevOps practices. We described the three data types available to IT departments (Operations Data, Monitoring Data, and Event Data), and we went into more detail about how to use Operations Data and Monitoring Data in the form of continuous streams of metrics. This post focuses on the value of a special category of Monitoring Data: Event Data.

Diving Deeper: Event Data

At a fundamental level, we can understand the job of IT operations teams as delivering on operational specifications such as discoverability, availability, and persistence. In practice, this might mean meeting a specification for page-load times for a web application. Most IT departments already use monitoring data to trigger alarms automatically, alerting operations staff to issues as soon as they arise.

While this is a considerable improvement over waiting for user complaints, analyzing event logs alongside monitoring streams can help solve two major issues with this approach and open up a further opportunity:

  • False alarms – Tuning any automatic alert is difficult. IT departments must balance alerts sensitive enough to identify issues quickly against the risk of flooding their systems with false positives. Analyzing the monitoring streams that preceded each type of event can help tailor alerts to specific event types. Multi-variable analytics methods can also allow for more robust, and even self-learning, alerts.
  • Reactive response – While simple, rules-based alerts that, for example, trigger when a metric crosses a given threshold certainly expedite IT department response times, they remain inherently reactive. The systems still fail or, depending on the alert design, come close to failing. Analytics methods that seek to model historical data and understand system behavior prior to events provide the opportunity to develop predictive models that allow for proactive attention. These analytical methods can identify behavior patterns that have evolved into incidents in the past, providing a warning to perform service activities prior to a failure occurring.
  • Self-healing – The move from reactive to proactive service attention enables significant improvement in IT operations. Perhaps the most exciting prospect of these types of analyses, though, is the ability to remove the need for service attention altogether. By coupling the data-driven failure analysis enabled by Monitoring and Event Data with Operations Data such as service logs, IT organizations can begin to automate responses to specific behavior patterns, improving performance and efficiency.
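
The difference between a reactive, rules-based alert and a proactive one can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the page-load samples, the 1000 ms limit, the five-sample horizon, and the function names are hypothetical, and a straight-line trend stands in for the richer predictive models described above.

```python
from statistics import mean


def threshold_alert(samples, limit):
    """Reactive rule: fire only once the latest sample exceeds the limit."""
    return samples[-1] > limit


def slope(samples):
    """Least-squares slope of evenly spaced samples (units per sample)."""
    n = len(samples)
    x_bar = (n - 1) / 2
    y_bar = mean(samples)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(samples))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def predictive_alert(samples, limit, horizon):
    """Proactive rule: fire if the fitted linear trend would carry the
    metric past the limit within `horizon` future samples."""
    if len(samples) < 2:
        return False  # not enough history to estimate a trend
    projected = samples[-1] + slope(samples) * horizon
    return projected > limit


# Hypothetical page-load times in milliseconds, oldest first.
page_load_ms = [810, 845, 870, 905, 940, 980]

print(threshold_alert(page_load_ms, limit=1000))              # False
print(predictive_alert(page_load_ms, limit=1000, horizon=5))  # True
```

In this made-up series the threshold rule stays silent because the metric has not yet crossed the limit, while the trend-based rule warns that it is on course to do so. A real deployment would likely replace the linear fit with a model trained on the monitoring streams that preceded past events.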

Outside of direct operational improvements, Event Data also provide retrospective information for evaluating performance metrics. For example, event logs can allow a department to track the number of times a resource was unavailable when requested. Understanding patterns in these logs and correlating them with other operational changes provides an objective measure of the value of DevOps interventions.
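
As a rough sketch of this kind of retrospective measurement, the Python snippet below compares the weekly rate of "resource unavailable" events before and after an operational change. The event log, the event-type names, the intervention date, and the helper function are all hypothetical and exist only to illustrate the idea.

```python
from datetime import date

# Hypothetical event log: (day, event_type) pairs.
events = [
    (date(2017, 1, 3), "resource_unavailable"),
    (date(2017, 1, 5), "resource_unavailable"),
    (date(2017, 1, 9), "deploy"),
    (date(2017, 1, 10), "resource_unavailable"),
    (date(2017, 1, 13), "resource_unavailable"),
    (date(2017, 1, 20), "deploy"),
    (date(2017, 1, 24), "resource_unavailable"),
]


def events_per_week(events, event_type, start, end):
    """Average number of matching events per week in [start, end)."""
    count = sum(
        1 for day, kind in events
        if kind == event_type and start <= day < end
    )
    weeks = (end - start).days / 7
    return count / weeks


# Assumed date on which a DevOps intervention took effect.
intervention = date(2017, 1, 16)

before = events_per_week(events, "resource_unavailable",
                         date(2017, 1, 2), intervention)
after = events_per_week(events, "resource_unavailable",
                        intervention, date(2017, 1, 30))

print(before)  # 2.0
print(after)   # 0.5
```

A lower rate after the change is only suggestive on its own. Correlating it with other operational changes in the same period, as described above, is what turns the count into a credible measure.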

To the extent that DevOps philosophy and practices aim to transform IT departments into proactive entities that can efficiently deliver services across a business, data analytics offers an obvious tool for designing and prioritizing new processes and practices. In the coming months, we will examine the ideas discussed above in more detail.

About Arti Garg

As a Principal Consultant at Datapipe, Arti strives to ensure that data and analytics solutions complement and support the existing processes and culture of every Datapipe client. Arti has a deep understanding of how any new solution, technical or otherwise, must align with an organization's mission, culture, and values. She writes about her experience helping enterprises identify gaps in their existing processes and the ways that data-driven software solutions can effectively close them.
