Correlate your events: You’re not monitoring in a vacuum
July 10, 2007
Customers and prospects often ask what correlated events are and why they should use them. I usually explain that without correlating your events, you’re monitoring in isolation. If you monitor application, system, and network events without comparing them to each other, how can you really know what impact the individual components are having on your business? For example, if a router goes down or a database experiences delays, will it affect a customer-facing website? To understand how entire business processes are being affected, you need to be able to correlate alerts and events.
First off, you need to know the root cause of your problem. Does a response time problem on your e-commerce site originate in the application server, the back-end database, the network, or the web server? If a server is unreachable, is the server itself down, or is there a problem with the router? Knowing the underlying causes of these problems saves you from having to solve them by “trial and error” and greatly reduces the time it takes to resolve the issue.
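The server-versus-router question above can be sketched in a few lines of Python. This is only an illustration, not how Longitude or any particular product works: the `DEPENDS_ON` map, the node names, and the `root_causes` function are all hypothetical, and the topology is assumed to have no dependency loops.

```python
# Hypothetical topology: each node maps to the upstream device it depends on.
# Assumed to be acyclic; None means the node has no upstream dependency.
DEPENDS_ON = {
    "web01": "router-a",
    "db01": "router-a",
    "router-a": None,
}


def root_causes(down, depends_on=DEPENDS_ON):
    """Return the down nodes that are not explained by a down upstream node."""
    roots = set()
    for node in down:
        parent = depends_on.get(node)
        masked = False
        # Walk up the dependency chain looking for an upstream failure.
        while parent is not None:
            if parent in down:
                masked = True
                break
            parent = depends_on.get(parent)
        if not masked:
            roots.add(node)
    return roots
```

If `web01`, `db01`, and `router-a` all report as unreachable, the sketch points at `router-a` alone, so you investigate one device instead of three. If only `web01` is unreachable, the server itself is the likely culprit.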
Second, if you can pinpoint each problem, you can understand the nature, severity, duration and potential remedies of the issue. With this specific information you can outline instructions on how to respond.
Finally, correlating events enables you to escalate problems based on how persistent a given situation is. If an application fails, you could set it to restart and see whether that solves the issue before notifying staff. Or you could generate an alert when an expected set of events is missing. For example, if expected data files are not created, a file transfer application may be malfunctioning. Likewise, a lack of CPU utilization could indicate a problem for certain applications.
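Both ideas, restarting before notifying and alerting on something that should have happened but did not, can be sketched roughly as follows. The function names and the `check`, `restart`, and `notify` callbacks are hypothetical placeholders for whatever your monitoring tool provides; this is a minimal sketch, not a product API.

```python
def handle_failure(check, restart, notify, max_restarts=1):
    """Try restarting before notifying staff; return the action taken.

    check, restart and notify are caller-supplied callables (hypothetical):
    check() returns True when the application is healthy again.
    """
    for _ in range(max_restarts):
        restart()
        if check():
            return "recovered"
    # The problem persisted through every restart attempt, so escalate.
    notify()
    return "escalated"


def missing_events(expected, seen):
    """Return the expected items (for example, data files) that never showed up."""
    return sorted(set(expected) - set(seen))
```

For instance, `missing_events(["orders.csv", "stock.csv"], ["orders.csv"])` returns `["stock.csv"]`, which could be the trigger for an alert about a stalled file transfer.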
These are just a few examples of how you could use correlated events in your monitoring. If you think about your business-critical applications and the types of conditions that matter at your site, you will probably find situations where correlated events will save you time and trouble.
For more information about how Longitude can help you achieve your business goals, go to http://www.heroix.com/agentless/agentless_performance_monitoring_main.htm.
Posted by: Chris Smith, Senior Technical Engineer