Managing and Monitoring SharePoint Server 2010: Resolving Bottlenecks

In our previous article, “Monitoring SharePoint 2010: Configuring the Usage Database”, we discussed the available performance counters and what they tell you. In this article we’ll discuss bottlenecks: how to detect them and how to resolve them.

In general terms, bottlenecks are the result of insufficient resources to service transaction requests. Bottlenecks can be related to physical hardware, the operating system, or an application. More often than not, though, you will find that a bottleneck is caused by ‘homegrown’ code or third-party solutions, so reviewing custom code can yield better results than simply adding more hardware. Another common cause is an incorrectly configured server or farm. Bottlenecks can also result from inefficiently designed data structures that require more resources than necessary.

For a system administrator, it is essential to manage bottlenecks by constantly monitoring performance. When you identify a performance issue, you must assess the best resolution for removing the bottleneck. The performance counters and other performance monitoring applications are key when analyzing problems.
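
As a minimal sketch of how counter samples might be gathered for this kind of monitoring, the Python below builds a command line for the Windows typeperf utility and parses the CSV text it prints. The counter path, the host name WEB01, the sample values, and the helper names typeperf_command and parse_typeperf_csv are illustrative assumptions; the sample text only mimics the general shape of typeperf output and was not captured from a real farm.

```python
import csv
import io


def typeperf_command(counter, samples=5, interval_s=1):
    """Build an argument list for typeperf (-sc = sample count, -si = interval in seconds)."""
    return ["typeperf", counter, "-sc", str(samples), "-si", str(interval_s)]


def parse_typeperf_csv(text):
    """Return the numeric samples from the first counter column of typeperf CSV text.

    Rows that are too short, the header row, and non-numeric values are skipped.
    """
    values = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue
        try:
            values.append(float(row[1]))
        except ValueError:
            continue  # header row or a missing sample
    return values


# Illustrative output shape only (hypothetical host name and values).
SAMPLE = r'''"(PDH-CSV 4.0)","\\WEB01\Processor(_Total)\% Processor Time"
"01/10/2011 10:00:00.000","72.5"
"01/10/2011 10:00:01.000","88.0"
"01/10/2011 10:00:02.000","91.5"
'''

cpu = parse_typeperf_csv(SAMPLE)
average_cpu = sum(cpu) / len(cpu)
```

On a Windows server, the list returned by typeperf_command could be passed to subprocess.run and the captured standard output handed to parse_typeperf_csv. Averaging several samples, rather than reacting to a single reading, helps avoid chasing short spikes.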

Physical Bottleneck Resolution

Physical bottlenecks can stem from processor, memory, disk, or network contention. To resolve a bottleneck successfully, you have to identify the exact nature of the issue and then make a change that mitigates it.

Once you have determined that a bottleneck is not caused by a misconfiguration, inefficient custom code or third-party solutions, or an inefficient solution implementation, you may have to resolve it by changing hardware or system configuration. The following table identifies problem thresholds and possible resolution options. Some of the options suggest hardware upgrades or modifications.

| Object | Counter | Problem | Resolution Options |
| --- | --- | --- | --- |
| Processor | % Processor Time | Over 75-85% | Upgrade processor; increase number of processors; add additional server(s) |
| Disk | Avg. Disk Queue Length | Gradually increasing; system not in a steady state and queue is backing up | Increase number or speed of disks; change array configuration to stripe; move some data to an alternative server |
| Disk | % Idle Time | Greater than 90% | Increase number of disks; move data to an alternative disk or server |
| Disk | % Free Space | Less than 30% | Increase number of disks; move data to an alternative disk or server |
| Memory | Available Mbytes | Less than 2 GB on a Web server | Add memory. Note: SQL Server available memory will be low, by design, and does not always indicate a problem. |
| Memory | Cache Faults/sec | Greater than 1 | Add memory; increase cache speed or size if possible; move data to an alternative disk or server |
| Memory | Pages/sec | Greater than 10 | Add memory |
| Paging File | % Used and % Used Peak | The server paging file, sometimes called the swap file, holds “virtual” memory addresses on disk. Page faults occur when a process has to stop and wait while required “virtual” resources are retrieved from disk into memory. These will be more frequent if the physical memory is inadequate. | Add memory |
| Network Interface | Total Bytes/sec | Over 40-50% of network capacity. This is the rate at which data is sent and received via the network interface card. | Investigate further by monitoring Bytes Received/sec and Bytes Sent/sec; reassess network interface card speed; check number, size, and usage of memory buffers |
| Process | Working Set | Greater than 80% of total memory | Add memory |
| Process | % Processor Time | Over 75-85% | Increase number of processors; redistribute workload to additional servers |
| ASP.NET | Application Pool Recycles | Several per day, causing intermittent slowness | Make sure that you have not implemented settings that automatically recycle the application pool unnecessarily throughout the day |
| ASP.NET | Requests Queued | Hundreds or thousands of requests queued | Implement additional Web servers. The default maximum for this counter is 5,000, and you can change this setting in the Machine.config file. |
| ASP.NET | Request Wait Time | As the number of wait events increases, users will experience degraded page rendering performance | Implement additional Web servers |
| ASP.NET | Requests Rejected | Greater than 0 | Implement additional Web servers |
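
The thresholds listed above lend themselves to a simple automated check. The Python below is a rough sketch under stated assumptions, not a definitive implementation: the reading names, the functions network_utilization_pct, breached, and is_backing_up, and the sample readings are all hypothetical. Where the thresholds are given as a range (75-85%, 40-50%), the lower bound is used, and "gradually increasing" queue length is approximated here by a positive least-squares slope over recent samples.

```python
# Thresholds taken from the counters listed above. Where a range is given
# (75-85%, 40-50%), the lower bound is used here as an assumption.
THRESHOLDS = {
    "processor_time_pct": ("above", 75.0),
    "free_space_pct": ("below", 30.0),
    "available_mbytes_web": ("below", 2048.0),  # 2 GB on a Web server
    "cache_faults_per_sec": ("above", 1.0),
    "pages_per_sec": ("above", 10.0),
    "network_utilization_pct": ("above", 40.0),
    "working_set_pct_of_memory": ("above", 80.0),
    "requests_rejected": ("above", 0.0),
}


def network_utilization_pct(total_bytes_per_sec, link_bits_per_sec):
    """Convert Total Bytes/sec into a percentage of link capacity."""
    return total_bytes_per_sec * 8 * 100 / link_bits_per_sec


def breached(readings):
    """Return the sorted names of readings that cross their threshold."""
    hits = []
    for name, value in readings.items():
        direction, limit = THRESHOLDS[name]
        if (direction == "above" and value > limit) or (
            direction == "below" and value < limit
        ):
            hits.append(name)
    return sorted(hits)


def is_backing_up(queue_samples, min_slope=0.0):
    """Least-squares slope test for a gradually increasing disk queue."""
    n = len(queue_samples)
    if n < 2:
        return False
    mean_x = (n - 1) / 2
    mean_y = sum(queue_samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(queue_samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den > min_slope


# Hypothetical readings for one Web server.
readings = {
    "processor_time_pct": 88.0,
    "free_space_pct": 42.0,
    "available_mbytes_web": 1500.0,
    "pages_per_sec": 4.0,
    # 60 MB/s on a 1 Gbit/s link
    "network_utilization_pct": network_utilization_pct(60_000_000, 1_000_000_000),
}
problems = breached(readings)
queue_growing = is_backing_up([1.0, 2.0, 2.0, 3.0, 5.0])
```

With these sample readings, problems names the processor, available memory, and network readings, each of which maps back to the resolution options for that counter. A check like this only flags candidates; as noted earlier, configuration and custom code should be ruled out before hardware is changed.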

Hopefully this will help you understand and resolve bottlenecks in your SharePoint environment. In our next article we’ll discuss how to create a custom Monitis monitor to track your SharePoint 2010 performance and identify possible bottlenecks.