Hosting Partners  |  About Us  |  Blog  |  Legal  |  Portal Login

The Planet Blog

 
Posts Tagged ‘database’

George Govantes“What counters are important when troubleshooting SQL Server with Windows’ Perfmon (Performance Monitor)?”

As a database administrator architect, I can’t tell you how many times I’ve gotten that question from Windows administrators and new database administrators, and it’s a pretty daunting question to answer. When a SQL instance is installed on a server, there may be more than a thousand new performance counters added to Perfmon and to the Performance Monitor section of the Reliability and Performance Monitor in Windows 2008, so where does one start?

When it comes to performance tuning and troubleshooting, SQL Server has not changed much over the years. In a nutshell, it is all about CPU, memory and I/O. To reduce the time it takes to determine the bottleneck in a given environment, we should isolate the counters to these specific areas.

Of the many performance counters that can be selected when troubleshooting a SQL Server, these key indicators can help to quickly isolate bottlenecks and direct your investigation to the appropriate resources for corrective action:

At the Operating System Level

CPU Counter

Server Work Queues > Queue Lengths
Description: Queue Length is the current length of the server work queue for this CPU.
Value: A sustained queue length greater than twice the number of CPU cores might indicate processor congestion.

Memory Counter

Memory > Pages/Sec
Description: A page fault occurs when the operating system (OS) cannot find the requested information in its physical memory, forcing the OS to seek the information at the disk level. A soft page fault is when a page is found elsewhere in the physical memory, and a hard fault requires disk access. Most processors can handle large numbers of soft faults without significant consequence. However, hard faults, which require disk access, can cause significant delays.
Value: This value should stay below 20-25 pages per second.

I/O Counters

Physical Disk > Average Read Queue Length
Physical Disk > Average Write Queue Length
Description: Avg. Disk Queue Length is the average number of read or write requests that were queued for the selected disk during the sample interval.
Value: The value for this counter should always be under 2. This is the most reliable counter to use when the SQL instance is using external SAN storage for its databases.

Determining Application Resource Usage

Process Counter:

From Perfmon, select “Process” then select the application in question – in this case, sqlservr.
SQL Perfmon
This will add all of the operating system counters associated with the application.
SQL Perfmon
For SQL Server, the counters usually reflect information about the internal operations of the SQL instance that may not show up as a problem at the operating system level. For example, a customer might call to report that the server hangs for several moments throughout the day, but there are no errors in the application or system logs, and the OS Performance counters show all counters within normal limits.

These SQL performance counters can provide deeper insight into how the database engine is working against the customer’s application to help in your investigation.

SQL Server

SQL Server: General Statistics > User Connections
Description: Number of users connected to the system.
Value: This number will vary between applications on a given server.

SQL Server

SQL Server: Locks > Lock Waits/Sec
Description: Number of lock requests that could not be satisfied immediately and required the caller to wait before being granted the lock.
Value: This number should be near 0. If the value is greater than 0 for a sustained period, the “system hanging” as described by the customer is most likely due to locking and blocking issues within the database.

SQL Server

SQL Server: Buffer Manager > Buffer cache hit ratio
Description: The percentage of database pages that were found in physical memory. SQL Server stores information inside of 8KB data pages.
Value: The buffer pool cache hit ratio should be 100%. Values below 90% indicate that SQL Server is experiencing memory pressure.

SQL Server

SQL Server: Buffer Manager > Page life expectancy
Description: The number of seconds that a database page remains in physical memory.
Value: Values under 300 seconds indicate that the SQL instance is experiencing memory pressure.

SQL Server

SQL Server: Buffer Manager > Page reads/sec
SQL Server: Buffer Manager > Page writes/sec
Description: These values show the number of database reads and writes requested.
Value: Values for these counters will vary between database applications, but this information is useful when determining if SQL Server is the primary application using the disk. If the Buffer Manager page read-writes are low but disk-queue lengths are high, there might be a disk bottleneck. If the Page read-writes are higher than normal, a memory shortage is likely to exist.

SQL Troubleshooting Summary Chart

Counter Preferred Value
Server Work Queues > Queue Lengths < 2 per processor
Memory > Pages /Sec < 20 page faults per second per processor
Physical Disk > Average Read Queue Length
Physical Disk > Average Write Queue Length
< 2 x the number of spindles
SQL Server: General Statistics > User Connections Varies
SQL Server: Locks > Lock Waits/Sec < 500MS
SQL Server: Buffer Manager > Buffer cache hit ratio > 95
SQL Server: Buffer Manager > Page life expectancy > 300
SQL Server: Buffer Manager > Page reads/sec Value should be used to confirm I/O or memory bottlenecks
SQL Server: Buffer Manager > Page writes/sec Value should be used to confirm I/O or memory bottlenecks

The performance counter suggested may not help resolve complex situations that involve application-level performance issues, but in my experience, it is a great starting point.

-George

Matthew BoehmLet’s face it — we’ve all seen some bad code in our time. During my eight-plus years of using PHP, I’ve inherited several legacy systems, and when reading over the previous programmer’s code, I often sit back and say, “What the heck was this guy thinking?”

Combine a programmer’s bad code with his or her lack of understanding about how databases work, and you’ve got one inefficient, unresponsive Web site.

The best example of this that I’ve encountered was one programmer’s attempt simply to get the number of rows in a table with a particular column value. This was his solution:

$rowCount = 0;
$res = mysql_query("SELECT * FROM users_activity");
while($row = mysql_fetch_row($res))
{
  if($row[‘activity] == ‘LOGIN')
  {
    $rowCount++;
  }
}

I hope all of you are cringing as badly as I did when I came across it the first time. For those of you that don’t understand why this is horribly inefficient, I’ll EXPLAIN.

MySQL has an often overlooked but extremely powerful built-in query analyzer that allows you to take any SELECT SQL statement, preface it with the keyword EXPLAIN, and get an analysis of the query with lots of information on how to optimize it.

Let’s use the above SQL statement on a table with over 4 million rows. Here’s the schema:

CREATE TABLE `users_activity` (
  `userId` int(11) NOT NULL,
  `activity` varchar(30) NOT NULL DEFAULT 'LOGIN',
  `notes` varchar(200) DEFAULT NULL,
  `timestamp` datetime NOT NULL DEFAULT '1970-01-01 00:00:00',
  KEY `userId_2` (`userId`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8

The first concern we have is that the query has no WHERE clause. Not having this clause requires MySQL to scan the entire table (full_table_scan), which usually involves reading the entire table from disk into memory, and if you only need to do this once, you just wasted all that memory. With EXPLAIN, MySQL can show us exactly how many rows it checks:

>EXPLAIN SELECT * FROM users_activity
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users_activity
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4060232
        Extra:

MySQL will have to examine 4,060,232 rows to satisfy this query. Let’s add a WHERE clause and see how that helps:

>EXPLAIN SELECT * FROM users_activity WHERE activity = 'LOGIN'
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users_activity
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4060232
        Extra: Using where

Hmm … Well, that didn’t change the number of rows evaluated, did it? The query optimizer still had to look at all the rows, one by one, to determine if the activity column was equal to ‘LOGIN’. Even with our WHERE clause, this query is still almost as bad as the original author’s PHP code. Why did MySQL have to do this? To answer that, take a look at the NULL value returned for possible_keys. This result shows that MySQL was unable to find an index for the column to help narrow down the search. Let’s add an index and see what happens:

>ALTER TABLE `users_activity` ADD INDEX (`activity`);

>EXPLAIN SELECT * FROM users_activity WHERE activity = 'LOGIN'
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users_activity
         type: ref
possible_keys: activity
          key: activity
      key_len: 92
          ref: const
         rows: 422436
        Extra: Using where

Very nice. We immediately reduced the number of rows that MySQL has to examine from 4,060,232 to 422,436. An 89 percent reduction in virtually no time at all.

This is just one of many scenarios in which using EXPLAIN can help you write better SQL statements. You can turn on the log_queries_not_using_indexes option to find out which SQL statements are not using indexes. By using this log and EXPLAIN, you can enhance the performance of your MySQL server by leaps and bounds.

For further information on EXPLAIN, visit http://dev.mysql.com/doc/refman/5.1/en/using-explain.html

-Matthew

 
 

Dedicated Servers

Managed Hosting

Colocation

Business Solutions

Why The Planet?

Contact Us