[SQL Server] 统计信息创建后不再更新

来源:互联网 发布:linux wifi 配置 编辑:程序博客网 时间:2024/05/14 17:35

       统计信息,对于SQL Server数据库引擎使用最优的查询计划,起到重要的作用。 但如果未在 column上创建索引(或自动创建的索引),则统计信息可能在创建以后,不会再次更新,系统将有可能使用性能较差的查询计划。

默认创建一个数据库时,“自动创建统计信息”,“自动更新统计信息”两个选项都是打开的。

 

 

确保“统计信息自动更新”的方法:在对应的列上创建索引.

手动更新统计信息的方法:  UPDATE STATISTICS TableName WITH FULLSCAN, ALL

另外两种方法更新统计信息的方法,是“强制查询计划重编译”, 以及清空缓存。(见下文最后英语部分。)

 

原文地址:http://blog.sqlauthority.com/2011/06/02/sql-server-puzzle-statistics-are-not-updated-but-are-created-once/

http://blog.sqlauthority.com/2011/06/17/sql-server-solution-puzzle-statistics-are-not-updated-but-are-created-once/

 

 

原文内容:

I am running this test on SQL Server 2008 R2. Here is the quick scenario about my setup.

  • Create Table
  • Insert 1000 Records
  • Check the Statistics
  • Now insert 10 times more 10,000 indexes
  • Check the Statistics – it will be NOT updated

Note: Auto Update Statistics and Auto Create Statistics for database is TRUE

Expected Result – Statistics should be updated – SQL SERVER – When are Statistics Updated – What triggers Statistics to Update

Now the question is why the statistics are not updated?

The common answer is – we can update the statistics ourselves using

UPDATE STATISTICS TableName WITH FULLSCAN, ALL

However, the solution I am looking is where statistics should be updated automatically based on algorithm mentioned here.

Now the solution is to ____________________.

Vinod Kumar is not allowed to take participate over here as he is the one who has helped me to build this puzzle.

I will publish the solution on next week. Please leave a comment and if your comment consist valid answer, I will publish with due credit.

Here is the script to reproduce the scenario which I mentioned.

-- Execution Plans Difference
-- Create Sample Database
CREATE DATABASE SampleDB
GO
USE SampleDB
GO
-- Create Table
CREATE TABLE ExecTable (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
-- Insert One Thousand Records
-- INSERT 1
INSERT INTO ExecTable (ID,FirstName,LastName,City)
SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Display statistics of the table - none listed
sp_helpstats N'ExecTable', 'ALL'
GO
-- Select Statement
SELECT FirstName, LastName, City
FROM ExecTable
WHERE City  = 'New York'
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO

-- Replace your Statistics over here
-- NOTE: Replace your _WA_Sys with stats from above query
DBCC SHOW_STATISTICS('ExecTable', _WA_Sys_00000004_7D78A4E7);
GO
--------------------------------------------------------------
-- Round 2
-- Insert Ten Thousand Records
-- INSERT 2
INSERT INTO ExecTable (ID,FirstName,LastName,City)
SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Select Statement
SELECT FirstName, LastName, City
FROM ExecTable
WHERE City  = 'New York'
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO
-- Replace your Statistics over here
-- NOTE: Replace your _WA_Sys with stats from above query
DBCC SHOW_STATISTICS('ExecTable', _WA_Sys_00000004_7D78A4E7);
GO
-- You will notice that Statistics are still updated with 1000 rows
-- Clean up Database
DROP TABLE ExecTable
GO
USE MASTER
GO
ALTER DATABASE SampleDB
SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO
DROP DATABASE SampleDB
GO

 

 

When I created non clustered index on the column city, it also created statistics on the same column with same name as index. When we populate the data in the column the index is update – resulting execution plan to be invalided – this leads to the statistics to be updated in next execution of SELECT. This behavior does not happen on Heap or column where index is auto created. If you explicitly update the index, often you can see the statistics are updated as well. You can see this is for sure happening if you follow the tell of John Sansom.

John Sansom‘s suggestion:

That was fun!

Although the column statistics are invalidated by the time the second select statement is executed, the query is not compiled/recompiled but instead the existing query plan is reused.

It is the “next” compiled query against the column statistics that will see that they are out of date and will then in turn instantiate the action of updating statistics.

You can see this in action by forcing the second statement to recompile.

SELECT FirstName, LastName, City
FROM ExecTable
WHERE City = ‘New York’ option(RECOMPILE)   -- 在生产环境下不能使用这种方式,严重影响性能
GO

Kevin Cross also have another suggestion:

I agree with John. It is reusing the Execution Plan. Aside from OPTION(RECOMPILE), clearing the Execution Plan Cache before the subsequent tests will also work.

i.e., run this before round 2:
————————————————————–
– Clear execution plan cache before next test
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
————————————————————–

Nice puzzle!

Kevin

As this was puzzle John and Kevin both got the correct answer, there was no condition for answer to be part of best practices. I know John and he is finest DBA around – his tremendous knowledge has always impressed me. John and Kevin both will agree that clearing cache either using DBCC FREEPROCCACHE and recompiling each query every time is for sure not good advice on production server. It is correct answer but not best practice.

By the way, if you have better solution or have better suggestion please advise. I am open to change my answer and publish further improvement to this solution. On very separate note, I like to have clustered index on my Primary Key, which I have not mentioned here as it is out of the scope of this puzzle.

 

 

原创粉丝点击