>> Spikes and Sudden Performance Bottlenecks
September 24, 2008
Figure 1. Longitude Report showing CPU spike
Figure 2. Longitude Report showing CPU spike with annotations
Every now and then, our Tech Support line receives a question from a customer or prospect about how to deal with spikes in resource usage and the corresponding performance bottlenecks. Typically, they have either generated a graph showing a spike visually (with CPU usage being a common culprit), or they have noticed an event or received an email alert about a sudden bottleneck. They need to find out what caused the spike and how to respond.
Whenever you see a sudden change like the one shown (Figure 1), first think about what day and time of day the spike occurred and whether it may have been an expected activity. Are there regular production processes that are resource intensive, such as month-end processing that might be done overnight at the beginning of the following month? Does the spike correspond to a planned maintenance window on the system in question? What about periodic virus updates, patch applications, reboots, or backup processes? If the answer to any of these questions is “yes,” then the quickest way to identify the problem may be to pick up the phone and verify the situation with a colleague.
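If you keep a list of scheduled jobs and maintenance windows, the first check above can be partly automated. The following is a minimal sketch of that idea, not a Longitude feature: the schedule entries, the `EXPECTED_ACTIVITIES` name, and the `expected_causes` function are all hypothetical and stand in for whatever your own site actually runs.

```python
from datetime import datetime, time

# Hypothetical schedule of expected resource-intensive activities.
# Each entry: (label, set of weekday numbers or None for every day, start, end).
# Weekday numbers follow datetime.weekday(): Monday is 0, Sunday is 6.
EXPECTED_ACTIVITIES = [
    ("nightly backup", None, time(1, 0), time(3, 0)),
    ("weekly virus scan", {6}, time(22, 0), time(23, 30)),
]


def in_window(t, start, end):
    """Return True if time t falls inside [start, end], allowing windows that cross midnight."""
    if start <= end:
        return start <= t <= end
    return t >= start or t <= end


def expected_causes(spike_at):
    """List the scheduled activities whose window contains the spike timestamp.

    Simplification: the weekday test uses the spike's own date, so a window
    that crosses midnight is matched against the day on which the spike fell.
    """
    matches = []
    for label, weekdays, start, end in EXPECTED_ACTIVITIES:
        if weekdays is not None and spike_at.weekday() not in weekdays:
            continue
        if in_window(spike_at.time(), start, end):
            matches.append(label)
    return matches


if __name__ == "__main__":
    # A spike at 2:00 AM on a Wednesday overlaps the assumed backup window.
    print(expected_causes(datetime(2008, 9, 24, 2, 0)))
```

An empty result does not rule out an expected activity; it only means nothing in your list matched, so a quick call to a colleague may still be the fastest check.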
Once you find the cause, we recommend documenting it by annotating the report. Simply click on the graph at the time in question and enter your notes; Longitude then stores them as part of the report and marks your annotation with a blue pin (see Figure 2).
Figure 3. Longitude Event Monitor showing memory spike
Figure 4. Longitude Event Monitor showing drill down on memory spike
If a virus scan or other “aha” cause isn’t forthcoming, it’s time to drill down. In Figures 1 and 2, Longitude shows a red icon for each CPU event that occurred during the time period shown. Click on any event to view it in Longitude’s Event Monitor and drill down as needed.
When you see an event in the Event Monitor, click on it, and then follow the links provided to see details such as the rule that triggered the event or the processes using the most resources. Figures 3 and 4 show what it looks like to view and drill down from a memory event.
Usually, drilling down from an event to see the heavy resource users is all that’s needed to identify the source of the spike. What you do from there will depend on what you find. You may need to reallocate resources, look into application-specific performance issues, or merely document the situation and move on.
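The idea behind "seeing the heavy resource users" is simply ranking processes by how much of the resource they consumed during the spike. The sketch below illustrates that ranking on made-up sample data; it is not how Longitude implements its drill-down, and the `top_consumers` function and the process names are hypothetical.

```python
from collections import defaultdict


def top_consumers(samples, n=3):
    """Rank processes by total CPU percentage across the samples.

    samples: iterable of (process_name, cpu_percent) pairs collected
    during the spike. Returns the n heaviest users as (name, total)
    pairs, heaviest first.
    """
    totals = defaultdict(float)
    for name, cpu_percent in samples:
        totals[name] += cpu_percent
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    # Made-up readings taken at two moments during a spike.
    readings = [
        ("backup_agent", 62.5),
        ("db_server", 20.0),
        ("web_server", 5.0),
        ("db_server", 10.0),
        ("backup_agent", 30.0),
    ]
    for name, total in top_consumers(readings, n=2):
        print(name, total)
```

Summing across samples favors processes that stayed busy for the whole spike; if short bursts matter more in your environment, ranking by the maximum reading would be a reasonable alternative.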
