Database troubleshooting is not for the faint of heart, especially when it comes to troubleshooting locking and blocking problems. Sometimes the elephant in the room is actually a rhino, a problem that seems like something else that can make troubleshooting all the more difficult and frustrating. Other times, the answer you’re looking for is staring you right in the face. This is the case with SQL Server 2008 R2, which has a known locking issue you can resolve by simply applying the latest service pack or update.
Sometimes the operations you perform cause your databases to have locking or blocking problems. Locking and blocking are unavoidable. They occur in every relational database management system (RDBMS), and SQL Server 2008 R2 is no exception.
Blocking occurs when a single logged-on database session, or server process ID (SPID), holds a lock on a specific resource, and a second SPID tries to acquire a conflicting lock on that same resource. Typically, resource locks don’t last long. When the first user releases the lock, the second server SPID can lock the resource to continue processing a query.
The Nitty Gritty of Locking
SQL Server 2008 R2 uses a granular approach to locking. It will select the appropriate locking level based on the number of records affected and other ongoing activities.
By default, locks are escalated from smaller row-level and page-level locks, to larger table-level locks as needed to improve performance. While escalation is generally a good thing, it also can cause problems, like when a Service Set Identifier (SSID) locks an entire table and prevents other SSIDs from working with that table.
You can configure options for row- and page-level locking. These are enabled by default on indexes. SQL Server 2008 R2 also supports table partitioning. Because partitions divide data into separate objects, partitioning tables can improve overall performance.
The types of queries can also determine the duration of locks. When a query isn’t executed within a transaction and lock hints aren’t used, locks for SELECT statements are held only while the resource is being read—not for the duration of the query. Locks for INSERT, UPDATE and DELETE statements are held for the duration of the query. This helps to ensure data consistency and lets SQL Server roll back the query, if necessary.
When a query is executed within a transaction, three key factors determine the duration of a lock:
- The type of query
- The isolation level of the transaction
- The presence or absence of lock hints
Locking and blocking are normal characteristics of an RDBMS; however, they can degrade performance when an SPID holds locks on resources for extended periods of time. Performance is also affected when an SPID holds locks on resources and fails to release them.
The former blocking problem can resolve itself over time, as the SPID will eventually release the locks. The resulting performance degradation is still very real. The latter type of blocking problem can cause severe performance degradation, but is—fortunately—relatively easy to find if you monitor SQL Server for locking and blocking problems.
SQL Server Activity Monitor can help you identify locking and blocking issues. Closely watch the Wait Time, Wait Type, Wait Resource and Blocked By values for listed processes.
Most of the Activity Monitor process information comes from the following dynamic management views:
You can get a clearer picture of locking and blocking with the sys.dm_tran_locks view. This returns information about active lock requests that have been granted or are waiting to be granted. The sys.dm_exec_connections, sys.dm_exec_sessions and sys.dm_exec_requests give you information about active connections, sessions and requests, respectively.
Zero in on sys.dm_exec_requests (described in detail in the MSDN Library). A request with a “sleeping” status has completed execution, and is likely awaiting a command from the application. A request with a “running” or “runnable” status is currently processing a query. A request with a “suspended” status is waiting for a lock, latch or other event.
The wait_type column, as the name implies, returns the type of wait. If the value is greater than zero, the SPID is currently waiting. Look to the wait_time and wait_resource columns for more information. If the request is blocked, wait_time shows the duration in milliseconds. The wait_resource shows the resource for which the SPID is waiting. Note also that blocking_session_id shows the ID of the session that is blocking the request, or a negative value with information about the blocking resource owner.
Poorly written front-end applications can cause a variety of blocking problems. If an application fails to properly manage nested transaction levels, you can have a blocking situation where the request wait_type is zero, its status is sleeping, and its open_transaction_count is greater than zero.
Most likely, this application had a query timeout or issued a cancel without also issuing the required number of ROLLBACK and/or COMMIT statements. This leaves locks active and blocks other SPIDs. SQL Server won’t correct this. The application has to be written to properly handle nested transactions.
If an application fails to completely fetch all result rows, you can have a blocking situation where a request wait_type is zero, the status is runnable, and the open_transaction_count is greater than zero. Most likely, this means the application didn’t fetch all result rows and left locks on tables, blocking other SPIDs. If possible, you should configure the application to ensure it fetches all results.
When the problem is on the back end, you can resolve blocking problems by ensuring SQL Server creates the appropriate indexes on your databases. You may also want to perform query tuning as necessary. Indexes can reduce the number of records a query accesses. They do this by reducing the number of lookups the database engine has to perform. To help you find tables and columns needing indexes, use the following dynamic management views:
It’s important to keep in mind that indexes can slow data modification (as discussed in “How Online Index Operations Work”). Therefore, when you create indexes, make sure they’re being used with the sys.dm_db_ index_usage_stats view.
Indexing doesn’t necessarily cause long-term table locks. With online indexing, only an Intent Share (IS) lock is held on the source table during the main phase of the index operation. This lets queries or updates continue. Generally, a Shared lock is held on the source object for a short time at the start (and in some cases, the end) of the index operation.
When blocking is caused by a normally running query with a long execution time, you can look for ways to optimize the query. One of the best ways to do this is to minimize the amount of resources locked in a transaction. Do this by making transactions as small as possible, remove non-critical processing and reduce to what you actually need. You may want to consider splitting a transaction into multiple queries that affect fewer rows, but only if this doesn’t affect the desired outcome. Design transactions to use the correct transaction isolation level and consider whether you can use locking hints to reduce locking while still getting the correct results.
Dealing with Deadlock
A special type of blocking problem called a deadlock occurs when two or more SPIDs have locks on separate objects, and each wants a lock on the other’s object. Here, each SPID waits for the other to release the lock, but it doesn’t happen. You can minimize deadlocks, but not avoid them completely. Fortunately, the SQL Server lock manager automatically checks for deadlocks. When lock manager detects a deadlock, it does three things:
- Chooses one of the SPIDs to kill
- Generates a 1205 error message that’s sent to the originating client
- Kills the SPID to free the locked resource and allow the other SPIDs to continue
The Activity Monitor and sys.dm_tran_locks view can also help you identify deadlocks. In Activity Monitor, look at the Wait Time (elapsed wait time shown in milliseconds), Wait Type (shows whether processing is waiting or not) and Wait Resource (if waiting, shows the resource for which processing is waiting).
Front-end applications should be designed to handle deadlocks. A well-designed front-end application will trap the 1205 error, reconnect to SQL Server and then re-submit the transaction. A well-designed front-end application will also watch for distributed deadlocks that can occur when one side of the resource request is at the client level and the other side is at the server level.
The application can then resolve distributed deadlocks using query timeouts or bound connections. Here, it will break a distributed deadlock when a query timeout occurs. A client could also have multiple connections that bind into a single transaction space so the connections do not block each other.
That’s it in a nutshell. That’s how locking works and how you find and resolve blocking problems. Most of the time, SQL Server can resolve many blocking problems, including deadlocks. You can also eliminate or reduce blocking problems with good design. Start with well-designed queries and savvy front-end applications. Then extend this to the back end by fine-tuning.