Uptime Monitoring – Calculating Cost and ROI

May 11, 2017 | Heroix Staff

Why:

As an Information Technology professional, you have in all likelihood experienced an IT disruption that has affected application and server uptime. Information Technology is very much a metrics-driven profession.

We often think of the technical metrics related to server uptime and application performance, but we can’t lose sight of the financial metrics, as almost everything IT does impacts the bottom line.

When IT is operating at 99.999%, system uptime isn’t given a thought. Much like a refrigerator lightbulb, the expectation is “always on”. It is only when there are application availability failures that IT faces close scrutiny.

Ultimately, measuring system uptime and the associated costs is about showing IT’s value to the business. The principal reasons for monitoring and measuring uptime are:

  1. Providing an objective view of the health of applications from an end user’s perspective
  2. Showing that IT is at the top of its game, proactive and aware of potential problem areas
  3. Showing that IT is impactful to the organization

How:

The goal is to quantify the cost of “Uptime” from both a business and an IT perspective. We want to arrive at an ROI, here “R”eturn “O”n “I”nformation Technology, by categorizing the business and IT cost centers and then deriving a formula for each that can be easily incorporated into a spreadsheet.

IT’s value to the business can be calculated by quantifying:

  1. The cost to the business of system uptime failures
  2. The percentage of failures IT can prevent
  3. The cost of employing an effective IT monitoring strategy

First, we will want to identify the Business cost centers so that we can measure and quantify the potential impact to the business of all system uptime related issues. The bigger the impact, the more value IT brings to the business.

Second, we will want to look at IT’s cost centers so that we can measure and quantify the resources required of IT to deliver maximum uptime to the business.

Quantifying Business Costs

As IT professionals, we can’t lose sight of the fact that IT’s primary purpose is to support the business goals of an organization.

Quantifying the impact of downtime to a business/organization is essential to understanding the ecosystem that IT is functioning in and how/where it can add value.

When looking at business-related cost centers we want to not only quantify the potential cost to the business of IT failures that compromise uptime, but also to determine how much is preventable with an effective IT monitoring strategy.

Lost Profits – Current Sales, deferred sales, and missed sales 

Business models vary from organization to organization; however, associating downtime with lost profit or service can be accomplished with a few fundamental questions.

If orders/services can’t be fulfilled immediately, how much of the business is permanently lost? If orders/services are deferred, how does that impact the organization’s bottom line? What is the impact of missed sales? Will customers come back?

This sample calculation serves as an excellent starting point:

P= profit per business hour = (Annual Profit / Number of business hours per year)

D= downtime hours per year

Lost profits= P * D

Note: If your organization is a nonprofit, the above formula still works. Instead of focusing on profit, assign a monetary value to your nonprofit’s cause.

C= cause value per business hour = (Annual cause value / Number of business hours per year)

D= downtime hours per year

Lost value = C * D
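As a rough illustration, the lost profit (or lost cause value) formula above can be sketched in a few lines of Python. The function name and the input figures are hypothetical and chosen only to make the arithmetic easy to follow; substitute your own annual profit or cause value, business hours, and downtime hours.

```python
def lost_value_per_year(annual_value: float,
                        business_hours_per_year: float,
                        downtime_hours_per_year: float) -> float:
    """Return value-per-business-hour multiplied by downtime hours (P * D or C * D)."""
    if business_hours_per_year <= 0:
        raise ValueError("business_hours_per_year must be positive")
    value_per_hour = annual_value / business_hours_per_year  # P (or C)
    return value_per_hour * downtime_hours_per_year          # P * D (or C * D)


# Hypothetical inputs: $2,080,000 annual profit, 2,080 business hours
# per year, and 12 hours of downtime per year.
lost_profits = lost_value_per_year(2_080_000, 2_080, 12)
print(f"Lost profits: ${lost_profits:,.2f}")
```

The same function serves the nonprofit case: pass the annual cause value in place of annual profit.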

Lost Employee productivity

How many employees are unable to perform their job and for how long?

N= number of non-productive employees

C= cost per hour of a non-productive employee including salary and benefits

D= annualized downtime hours per non-productive employee

Lost Employee productivity = N * C * D

Recovery Costs – Additional resources required to remove backlog

What does recovery cost your company or organization each year? This includes additional staff required to remove backlogs and satisfy queued orders.

N= number of recovery employees

C= cost per hour of a recovery employee including salary and benefits

R= annualized number of recovery hours per recovery employee

Recovery Cost Per Year= N * C * R
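Lost employee productivity and recovery cost share the same shape: a headcount, an hourly cost, and an annualized number of hours. A minimal sketch, assuming a single hypothetical helper named `labor_cost` and made-up input figures, might look like this:

```python
def labor_cost(employees: int, cost_per_hour: float, hours_per_employee: float) -> float:
    """Generic N * C * hours calculation shared by both formulas."""
    return employees * cost_per_hour * hours_per_employee


# Hypothetical inputs, for illustration only.
lost_productivity = labor_cost(employees=25, cost_per_hour=40.0, hours_per_employee=10.0)  # N * C * D
recovery_cost = labor_cost(employees=4, cost_per_hour=50.0, hours_per_employee=40.0)       # N * C * R

print(f"Lost employee productivity: ${lost_productivity:,.2f}")
print(f"Recovery cost per year:     ${recovery_cost:,.2f}")
```

Keeping one helper for both cases makes it harder for the two spreadsheet rows to drift apart in how they treat hours and hourly cost.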

Penalties – Associated with Service Level Agreements and compliance issues

Depending on the nature of your business, you may be subject to penalties for downtime, noncompliance, or even lost data.

D= downtime hours per year

P= penalty/fine amount per hour

Penalties= D * P

Quantifying Information Technology Costs

IT Engineering – The time spent measuring and quantifying uptime

If you’re a glass-half-full person, you’ll use the term uptime; if you’re a glass-half-empty person or have experienced significant outages, you (and especially your constituents) are most likely more comfortable with the term downtime.

Whether you are manually logging application behavior, using IT monitoring technology to monitor the applications and supporting IT infrastructure, or synthetically emulating end user activity, showing that applications are available and functioning is the most objective way to quantify uptime and show the level of service IT is delivering.

N= number of IT Engineers

C= cost per hour for an IT Engineer including salary and benefits

T= annualized time spent measuring uptime per IT Engineer

IT Monitoring for Uptime Cost = N * C * T

IT Engineering – The time spent regularly collecting, collating, and analyzing application and IT infrastructure performance

Correlating uptime and downtime with performance issues is critical to identifying the early symptoms of IT problems and using the diagnosis to anticipate and prevent future disruptions. When the IT infrastructure is on premises, measurement and diagnosis are straightforward. However, as more workloads migrate to the cloud and IT organizations move to hybrid configurations, collecting and reporting performance-based metrics becomes increasingly complicated.

N= number of IT Engineers

C= cost per hour for an IT Engineer including salary and benefits

T= annualized time spent collecting, collating, and analyzing application performance per IT Engineer

IT Performance Monitoring Cost = N * C * T

IT Engineering – The time and resources spent to diagnose and recover from system and application failures

Once a failure is diagnosed by IT, consideration needs to be given to the time and other IT resources needed to fix the problem. For example: Was there a data loss that required a recovery? Were additional resources called in that required supervision? Was a technology change required?

N= number of IT Engineers

C= cost per hour for an IT Engineer including salary and benefits

R= annualized number of recovery hours per IT Engineer

IT Recovery Cost = N * C * R
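The three IT Engineering formulas above can be rolled up into one annual figure. The sketch below is one possible way to do that; the `EngineeringActivity` class, the headcounts, hourly rates, and hours are all hypothetical placeholders rather than recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineeringActivity:
    name: str
    engineers: int             # N
    cost_per_hour: float       # C, including salary and benefits
    hours_per_engineer: float  # T or R, annualized

    @property
    def annual_cost(self) -> float:
        """N * C * T (or N * C * R for recovery)."""
        return self.engineers * self.cost_per_hour * self.hours_per_engineer


# Hypothetical inputs, for illustration only.
activities = [
    EngineeringActivity("Uptime monitoring", 1, 60.0, 130.0),
    EngineeringActivity("Performance monitoring", 2, 50.0, 100.0),
    EngineeringActivity("Recovery", 2, 50.0, 100.0),
]

for activity in activities:
    print(f"{activity.name:<24} ${activity.annual_cost:>10,.2f}")

total_engineering_cost = sum(a.annual_cost for a in activities)
print(f"{'Total engineering':<24} ${total_engineering_cost:>10,.2f}")
```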

IT Engineering – Hardware costs

While the cost to replace a failed piece of hardware is easily quantifiable, the cost of misallocated hardware resources can be a bit more involved. 

Take under-allocated resources as an example: at the very least applications will perform sub-optimally, and at the very worst they will fail altogether. The most common misallocations involve virtual resources. Often these problems can be fixed relatively easily by moving or redistributing existing IT resources.

Over-allocated resources won’t adversely affect uptime; however, there will be additional expenses related to power consumption, annual maintenance contracts, and other costs associated with the purchase of unneeded hardware.

C= Cost of extra hardware including: power, maintenance, and depreciation

Q= Quantity of Hardware

Hardware costs = C * Q

IT Engineering –  Software costs

This covers the cost of any software used to monitor uptime and application performance, including annual license and support fees.

Software costs = Annual licensing and support/maintenance
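The hardware and software figures can be sketched the same way. This example assumes that C is the sum of annual power, maintenance, and depreciation for one unit of extra hardware, which is one reasonable reading of the definition above; the function names and dollar amounts are hypothetical.

```python
def hardware_cost(power: float, maintenance: float, depreciation: float, quantity: int) -> float:
    """Hardware costs = C * Q, with C assumed to be the per-unit annual total."""
    cost_per_unit = power + maintenance + depreciation  # C (assumed to be a simple sum)
    return cost_per_unit * quantity                     # C * Q


def software_cost(annual_license: float, annual_support: float) -> float:
    """Software costs = annual licensing plus support/maintenance."""
    return annual_license + annual_support


# Hypothetical inputs, for illustration only.
hardware = hardware_cost(power=300.0, maintenance=700.0, depreciation=1_500.0, quantity=2)
software = software_cost(annual_license=2_000.0, annual_support=500.0)

print(f"Hardware costs: ${hardware:,.2f}")
print(f"Software costs: ${software:,.2f}")
```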

ROI - “R”eturn “O”n “I”nformation Technology

When calculating an ROI, we want to show how impactful IT is to the organization by measuring how much Information Technology, through its actions, is able to increase uptime (glass ½ full) or decrease downtime (glass ½ empty).

What would happen if IT were able to decrease downtime by 75%? What would that mean to the organization in terms of cost savings? Your mileage will vary, but pay close attention to the monetary savings and impact IT can have on the organization. The more “expensive” disruptions in availability are to an organization, the bigger the opportunity for IT to be impactful.
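To get a feel for the "what if" question, it can help to sweep a few reduction percentages against a fixed annual downtime cost. The sketch below uses the $168,000 business-cost subtotal from the sample calculator in this post as the baseline; the function name and the particular percentages swept are illustrative assumptions.

```python
def savings_from_reduction(annual_downtime_cost: float, reduction_pct: float) -> float:
    """Savings if monitoring prevents reduction_pct percent of downtime cost."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    return annual_downtime_cost * reduction_pct / 100


annual_downtime_cost = 168_000.0  # business-cost subtotal from the sample calculator

# Illustrative sweep; the percentages are arbitrary sample points.
for pct in (25, 50, 75, 90):
    saved = savings_from_reduction(annual_downtime_cost, pct)
    print(f"{pct:>3}% reduction -> ${saved:,.2f} saved per year")
```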

Downloadable Sample ROI Calculator

Uptime Monitoring - Calculating “R”eturn “O”n “I”nformation Technology

| Item | Amount |
| --- | ---: |
| Business Costs | |
| Lost Profits – Current sales, deferred sales, and missed sales | $150,000.00 |
| Lost Employee productivity | $10,000.00 |
| Recovery Costs | $8,000.00 |
| Penalties | $0.00 |
| Sub-total Business Costs | $168,000.00 |
| Percentage cost reduction with effective IT Monitoring | 80% |
| Business Savings Associated with IT Monitoring | $134,400.00 |
| IT Costs | |
| Engineering - Uptime Monitoring | $7,800.00 |
| Engineering - Application Performance Monitoring | $10,000.00 |
| Engineering - Recovery Costs | $10,000.00 |
| Hardware Costs | $5,000.00 |
| Software Costs | $2,500.00 |
| Total IT Costs | $35,300.00 |
| “R”eturn “O”n “I”nformation Technology | 380.74% |
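The sample calculator is simple enough to reproduce in code, which is a handy way to check the spreadsheet. The sketch below uses the figures from the sample above; the variable names are hypothetical. Note that the 380.74% figure is business savings divided by total IT costs. A conventional net ROI, which subtracts the IT costs from the savings first, is computed alongside it for comparison.

```python
business_costs = {
    "Lost profits": 150_000.00,
    "Lost employee productivity": 10_000.00,
    "Recovery costs": 8_000.00,
    "Penalties": 0.00,
}
it_costs = {
    "Engineering - uptime monitoring": 7_800.00,
    "Engineering - application performance monitoring": 10_000.00,
    "Engineering - recovery costs": 10_000.00,
    "Hardware costs": 5_000.00,
    "Software costs": 2_500.00,
}
reduction_pct = 80  # percentage cost reduction with effective IT monitoring

business_subtotal = sum(business_costs.values())
business_savings = business_subtotal * reduction_pct / 100
total_it_costs = sum(it_costs.values())

# Ratio used in the sample calculator: savings relative to IT spend.
return_on_it_pct = business_savings / total_it_costs * 100
# Conventional net ROI, shown for comparison only.
net_roi_pct = (business_savings - total_it_costs) / total_it_costs * 100

print(f"Sub-total business costs: ${business_subtotal:,.2f}")
print(f"Business savings:         ${business_savings:,.2f}")
print(f"Total IT costs:           ${total_it_costs:,.2f}")
print(f"Return on IT:             {return_on_it_pct:.2f}%")
print(f"Net ROI:                  {net_roi_pct:.2f}%")
```

With these inputs the return on IT comes out at 380.74%, matching the sample, while the net ROI is 280.74%. Which of the two you report depends on the convention your finance team expects.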

 

Want to learn more?

Download our How to Monitor IT-Driven Business Services Whitepaper and learn about five essential principles you can use to ensure that business-focused application and system monitoring will succeed as planned.

 
