16. January 2009 04:59
So I was doing a lecture on sql server query optimization at UCLA yesterday, and one of the topics I was covering was statiscs. As we know, statistics is information that the sql server collects about the distribution of data in columns and indexes, and they are used by the query optimizer to determine the best execution plan. If the database option to update statistics automatically is on, which is the default, statistics will be updated as soon as the need is discovered. In general, if your table has over 500 rows in it, and at least 500 plus 20% of the rowcount rows have been modified, statistics are out of date. Therefore, when that happens, and a user does a select, sql server will first update the statistics to get the best execution plan, and then work on returning your query results. While that sounds good in theory, in practice, it kills your query performance because wating for statistics to update when you're trying to get data returned to your application is very costly. I talked about being proactive in updating statistics with good maintenance plans, etc. I also talked about the possibility of turning on the option of updating statistics asynchronously on the database. What that will do, is alow your query to return data using the old execution plan, while the statistics are updating on the different thread.
As I was discussing these issues, a great question came up. Is updating statistics sufficient to make sure that our stored procedures use the latest and greatest execution plan? The answer is, not necessarily. Before SQL Server 2005, the execution plan for a stored procedure was saved during initial compile, and if you changed your table structure or added new indexes to a table after that compile, you would have to make sure to recompile your stored procedure in order for it to generate a new and better execution plan. Since the 2005 version, there has been a slight improvement to when the optimization of your procedure's execution plan occurs. Now, the optimization happens automatically when the first time a stored procedure is executed right after SQL Server services are restarted. It will also happen automatically if an underlying table used by the stored procedure changes. However, if you add a new index from which a stored procedure might benefit, and you do not want to wait till the SQL Server is restarted for your stored procedure to reap the benefit of your new index. In this case, you might want to force the stored procedure to recompile the next time it executes, and for that you can use the sp_recompile system stored procedure.