Update statistics during production?

  • Hello,

    I found that when I am optimizing a real-time query, one that reads data inserted, for example, a few minutes ago or during the current week (that is, after the statistics update that ran the previous weekend), there is sometimes a big difference compared to data that is already covered by the statistics. It depends on whether it is a simple query or a more difficult one with joins.

    My question is: will I avoid this if I update statistics during the week, during production? Maybe during break times, for example every 2 or 6 hours?

    Or will the real-time query sometimes get a bad execution plan, bad logical reads, bad estimated rows, etc., because of stale statistics?

    Of course, for some result queries it would be helpful, because users can check data from the current week (which is not yet in the statistics), so there could be a real benefit.

    For example, if I run the same query against older historical data with a RECOMPILE hint, I get a good execution plan, good estimated rows, good logical reads, etc.

    At the moment I run it on Saturday at 22:15, in off-hours, with Ola Hallengren's script and the parameters below. It takes about 1-2 hours.

    EXECUTE msdb.dbo.IndexOptimize
        @databases = 'database',
        @FragmentationLow = NULL,
        @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_OFFLINE',
        @FragmentationHigh = 'INDEX_REBUILD_OFFLINE',
        @FragmentationLevel1 = 5,
        @FragmentationLevel2 = 25,
        @UpdateStatistics = 'ALL',
        @OnlyModifiedStatistics = 'Y',
        @StatisticsSample = 100,
        @LogToTable = 'Y';

    I am just curious.

    Thanks

  • Example:

    TABLE1 has xx xxx rows.

    TABLE2 has xx xxx xxx rows, and for each row in TABLE1 there are xx rows in TABLE2.

    Query:

    SELECT *
    FROM TABLE1 AS A WITH (NOLOCK)
    LEFT JOIN TABLE2 AS B WITH (NOLOCK)
           ON A.1PKCOLUMN = B.1PKCOLUMN
          AND A.2PKCOLUMN = B.2PKCOLUMN
    WHERE A.1PKCOLUMN = 'DATE'
      AND A.2PKCOLUMN > 'SEQ'
    OPTION (RECOMPILE)

    If the join touches data that is in the statistics, OR if I force a HASH JOIN:

    Table 'TABLE1'. Scan count 1, logical reads 55
    Table 'TABLE2'. Scan count 1, logical reads 423
    (54635 row(s) affected)
    CPU time = 563 ms, elapsed time = 2259 ms.

    If the join touches data that is not in the statistics:

    Table 'TABLE2'. Scan count 1360, logical reads 614720
    Table 'TABLE1'. Scan count 1, logical reads 81
    (58394 row(s) affected)
    CPU time = 24797 ms, elapsed time = 25973 ms.

    After this testing I tried forcing a HASH JOIN; the result was like the first query, with a Workfile reported in the messages.

    And after I ran UPDATE STATISTICS, both queries with the join behaved like the first one.
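
    For reference, the forced hash join mentioned above can be reproduced with a query hint; a sketch using the placeholder names from this example (OPTION (HASH JOIN) constrains every join in the statement):

    ```sql
    -- Sketch with the placeholder names TABLE1/TABLE2/1PKCOLUMN/2PKCOLUMN
    -- from this thread. OPTION (HASH JOIN) forces the optimizer to use a
    -- hash join for all joins in the statement, bypassing the (possibly
    -- wrong) nested-loops choice driven by stale statistics.
    SELECT *
    FROM TABLE1 AS A WITH (NOLOCK)
    LEFT JOIN TABLE2 AS B WITH (NOLOCK)
           ON A.1PKCOLUMN = B.1PKCOLUMN
          AND A.2PKCOLUMN = B.2PKCOLUMN
    WHERE A.1PKCOLUMN = 'DATE'
      AND A.2PKCOLUMN > 'SEQ'
    OPTION (HASH JOIN, RECOMPILE);
    ```

    A join hint like this is a workaround, not a fix: it removes the optimizer's choice for every data distribution, which is why keeping the statistics fresh is usually preferable.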

  • You might find that you identify certain tables or certain indexes that need more frequent statistics updates. When you do, you might want to schedule updates on those specific objects, separate from your weekly statistics updates. It really depends on how fast your statistics age and how much load updating the statistics will put on the server. Depending on the server and the load, I've had all statistics updated once a day. I even, in an extreme case, had one table that had statistics getting updated every 20 minutes (it was a bad design and this was a crutch to keep it going). There are no hard and fast rules here.
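
    A targeted update on one object, separate from the weekly job, might look like this (a minimal sketch; dbo.Orders and IX_Orders_Date are hypothetical placeholder names, not objects from this thread):

    ```sql
    -- Hypothetical example: refresh one hot statistic more often than the
    -- weekly maintenance, e.g. from a SQL Server Agent job.
    UPDATE STATISTICS dbo.Orders IX_Orders_Date WITH SAMPLE 25 PERCENT;

    -- Or refresh every statistic on the table with the default sample:
    UPDATE STATISTICS dbo.Orders;
    ```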

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (2/24/2016)


    You might find that you identify certain tables or certain indexes that need more frequent statistics updates. When you do, you might want to schedule updates on those specific objects, separate from your weekly statistics updates. It really depends on how fast your statistics age and how much load updating the statistics will put on the server. Depending on the server and the load, I've had all statistics updated once a day. I even, in an extreme case, had one table that had statistics getting updated every 20 minutes (it was a bad design and this was a crutch to keep it going). There are no hard and fast rules here.

    Like you wrote, maybe it will be good to identify those tables that need more frequent updates, instead of doing it for all tables.

    Today I was thinking about this, and I think that even once per day would help, because almost all the time I filter on the first column of the PK, which is a date. The row count in the statistics may not be accurate, but the date will be in the histogram, so it should help. I will test it.

    And you are right, sometimes you have to do something that is not based on good design...

    Thanks for suggestion.

  • And one question.

    Thanks to a few problems in the past, we bought new storage, so our I/O is fine; in my opinion it is even over-provisioned.

    Normally, can a statistics update cause any performance problems during production? (Only a simple sampled update, without FULLSCAN.)

    I tried it now on a few tables and I cannot see any effect during production. We work in real time, so even a small delay would tell me immediately that something had happened...

    Can it cause locks or anything like that? For example for UPDATE/INSERT/DELETE?

  • Nothing is ever 100%, but no, in most cases you're only going to see some additional I/O, CPU & memory during a stats update. It doesn't put locks on the table, so you shouldn't see outright contention that would lead to blocking like that. The real impact on most systems is that after the stats are updated you're going to get recompiles on plans in cache as they get called and the stats update has marked them as invalid. That can cause some waits while the recompile occurs.
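
    To judge how fast statistics age, one option is to check the modification counters before deciding on a schedule. A sketch, assuming SQL Server 2008 R2 SP2 / 2012 SP1 or later (dbo.Orders is a placeholder table name):

    ```sql
    -- Sketch: how stale is each statistic on one table? Helps decide whether
    -- a mid-week update is worth the extra I/O and CPU.
    SELECT s.name,
           sp.last_updated,
           sp.rows,
           sp.rows_sampled,
           sp.modification_counter   -- column modifications since last update
    FROM sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE s.object_id = OBJECT_ID('dbo.Orders');
    ```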


    Ah, good note about the recompiles, I didn't know about that. Thanks.

    So the effect can be almost like after a server restart and first load, or after manually clearing all plans from memory.

    I will try updating stats once per day during the week, when the first row for the day arrives.

    I don't have a big problem with this, but I want to try some new suggestions and ideas to improve our server...

    During the week we are at about 20-35% CPU, about 100 GB of 172 GB memory, and with the new storage I/O is also not a problem.

    Thank you

  • tony28 (2/24/2016)


    I tried it now on a few tables and I cannot see any effect during production. We work in real time, so even a small delay would tell me immediately that something had happened...

    Can it cause locks or anything like that? For example for UPDATE/INSERT/DELETE?

    I hope you understand that the query you provided in the first post is not only updating the statistics but also rebuilding the indexes, which will definitely cause locks on the tables.


    Alex Suprun

  • Alexander Suprun (2/24/2016)


    tony28 (2/24/2016)


    I tried it now on a few tables and I cannot see any effect during production. We work in real time, so even a small delay would tell me immediately that something had happened...

    Can it cause locks or anything like that? For example for UPDATE/INSERT/DELETE?

    I hope you understand that the query you provided in the first post is not only updating the statistics but also rebuilding the indexes, which will definitely cause locks on the tables.

    Of course, I will run it without the rebuilds; those parameters will be NULL, based on the article on Ola's website.

    Thank you for your interest.

  • Alexander Suprun (2/24/2016)


    tony28 (2/24/2016)


    I tried it now on a few tables and I cannot see any effect during production. We work in real time, so even a small delay would tell me immediately that something had happened...

    Can it cause locks or anything like that? For example for UPDATE/INSERT/DELETE?

    I hope you understand that the query you provided in the first post is not only updating the statistics but also rebuilding the indexes, which will definitely cause locks on the tables.

    Good catch. I was only talking about statistics updates. Index rebuilds are a different story.


  • Grant Fritchey (2/24/2016)


    Alexander Suprun (2/24/2016)


    tony28 (2/24/2016)


    I tried it now on a few tables and I cannot see any effect during production. We work in real time, so even a small delay would tell me immediately that something had happened...

    Can it cause locks or anything like that? For example for UPDATE/INSERT/DELETE?

    I hope you understand that the query you provided in the first post is not only updating the statistics but also rebuilding the indexes, which will definitely cause locks on the tables.

    Good catch. I was only talking about statistics updates. Index rebuilds are a different story.

    Yes, that's it. I also tried an offline rebuild and saw some delay on production, a little surprise for the workers :Whistling::cool:

    I tried it now during production:

    EXECUTE msdb.dbo.IndexOptimize
        @databases = 'database',
        @FragmentationLow = NULL,
        @FragmentationMedium = NULL,
        @FragmentationHigh = NULL,
        @UpdateStatistics = 'ALL',
        @OnlyModifiedStatistics = 'Y',
        @LogToTable = 'Y';

    first with a few smaller databases, without problems; it took about 1 minute each,

    and then with the main database; it took about 15 minutes and also looked problem-free...

    CPU sometimes increased a little, by about 5-15% (meaning to about 25-40%), but mostly it stayed around 25-30% as before.

    So I will try to run it once per day, watch for side effects during the update, and hope it helps with these real-time queries...

    And I am just curious: I saw EQ_ROWS in the statistics details. That count is based on some numbers, right? Some averages? So I think it can be helpful for me...

    Thanks for the help...

  • tony28 (2/24/2016)


    And I am just curious: I saw EQ_ROWS in the statistics details. That count is based on some numbers, right? Some averages? So I think it can be helpful for me...

    Thanks for the help...

    Now you're into the statistics. It builds something called a histogram that contains details of the distribution of data. The EQ_ROWS shows how many rows equal the value in the upper bound of the step (RANGE_HI_KEY). See https://msdn.microsoft.com/en-us/library/hh510179%28v=sql.110%29.aspx for information on the properties of statistics.
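
    The histogram can be inspected directly; a sketch using the placeholder names from the earlier posts (TABLE1 and its PK index name are placeholders):

    ```sql
    -- Sketch: show the histogram for the statistics on TABLE1's primary key.
    -- Each row is one histogram step: RANGE_HI_KEY is the step's upper bound,
    -- EQ_ROWS the estimated rows equal to that value, RANGE_ROWS the rows
    -- strictly inside the step, and AVG_RANGE_ROWS the average per distinct
    -- value within the step.
    DBCC SHOW_STATISTICS ('TABLE1', PK_TABLE1) WITH HISTOGRAM;
    ```

    Note the histogram is built only on the statistic's leading column, which is why a stale histogram on the date column hurts queries on the newest dates.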

    So with this query there is no problem, and updating statistics helps...

    SELECT *
    FROM TABLE1 AS A WITH (NOLOCK)
    LEFT JOIN TABLE2 AS B WITH (NOLOCK)
           ON A.1PKCOLUMN = B.1PKCOLUMN
          AND A.2PKCOLUMN = B.2PKCOLUMN
    WHERE A.1PKCOLUMN = 'DATE'
      AND A.2PKCOLUMN > 'SEQ'
    OPTION (RECOMPILE)

    But I found one issue. Sometimes I need to use a predicate like this:

    A.1PKCOLUMN + A.2PKCOLUMN > 'DATE' + 'SEQ'

    SELECT *
    FROM TABLE1 AS A WITH (NOLOCK)
    LEFT JOIN TABLE2 AS B WITH (NOLOCK)
           ON A.1PKCOLUMN = B.1PKCOLUMN
          AND A.2PKCOLUMN = B.2PKCOLUMN
    WHERE A.1PKCOLUMN >= 'DATE'
      AND A.1PKCOLUMN + A.2PKCOLUMN > 'DATE' + 'SEQ'
    OPTION (RECOMPILE)

    And even when statistics were updated for both tables, the result was the same.

    The data in the table: the date is the same, and SEQ rows are still being inserted, for example every 30 seconds to 2 minutes.

    Table 'TABLE2'. Scan count 1201, logical reads 5194,
    Table 'TABLE1'. Scan count 1, logical reads 72

    I tried updating these tables again with a manual UPDATE STATISTICS TABLE1... TABLE2,

    and after this the result is:

    Table 'TABLE1'. Scan count 1, logical reads 78,
    Table 'TABLE2'. Scan count 1, logical reads 421

    I found one difference in the execution plan from before the manual update (I don't have it saved).

    The plan for old data, and the plan now after the manual update, have a Seek Predicate on the PK of TABLE2 using only 1PKCOLUMN.

    In the problem plan the Seek Predicate used both 1PKCOLUMN and 2PKCOLUMN...

    But now it is OK... what was it?

    EDIT:

    I tried a few more tables and now it looks OK without a manual update... is there any delay before the statistics take effect? I think not. Or what could it be?

    Will try again tomorrow...
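
    One side note on the predicate itself: a comparison on a concatenation like A.1PKCOLUMN + A.2PKCOLUMN > 'DATE' + 'SEQ' is not sargable, so the optimizer cannot seek on it directly. A sketch of a logically equivalent rewrite, assuming 1PKCOLUMN is a fixed-width string (otherwise concatenation ordering and column-by-column ordering can differ):

    ```sql
    -- Hypothetical rewrite using the placeholder names from this thread:
    -- decompose the concatenated comparison so the optimizer can seek on
    -- the leading PK column instead of evaluating an expression per row.
    SELECT *
    FROM TABLE1 AS A WITH (NOLOCK)
    LEFT JOIN TABLE2 AS B WITH (NOLOCK)
           ON A.1PKCOLUMN = B.1PKCOLUMN
          AND A.2PKCOLUMN = B.2PKCOLUMN
    WHERE (A.1PKCOLUMN > 'DATE'
           OR (A.1PKCOLUMN = 'DATE' AND A.2PKCOLUMN > 'SEQ'))
    OPTION (RECOMPILE);
    ```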

    So the problem looks like the execution plan from the first executions is still being used, even though I use OPTION (RECOMPILE).

    I also tried DBCC FREESYSTEMCACHE('SQL Plans'), but the result was the same...

    With different tables that have almost the same data it is OK, but there I didn't execute the query before the stats update.

    Any suggestion how to prevent this?

    But Grant Fritchey wrote that after a statistics update the execution plan is recompiled.

    I have to wait until next week for more testing.

    Thanks

    EDIT: for example, it helps to change WHERE A.1PKCOLUMN >= '20160225' (instead of =), or to change A.1PKCOLUMN to B.1PKCOLUMN...

    If I change it back to exactly what it was before the statistics update...

    hmm

    SELECT *
    FROM TABLE1 AS A
    LEFT JOIN TABLE2 AS B
           ON A.1PKCOLUMN = B.1PKCOLUMN
          AND A.2PKCOLUMN = B.2PKCOLUMN
    WHERE A.1PKCOLUMN = 'DATE'
      AND A.1PKCOLUMN + A.2PKCOLUMN > 'DATE+SEQ'
    OPTION (RECOMPILE)

    SELECT *
    FROM TABLE1 AS A
    LEFT JOIN TABLE2 AS B
           ON A.1PKCOLUMN = B.1PKCOLUMN
          AND A.2PKCOLUMN = B.2PKCOLUMN
    WHERE A.1PKCOLUMN = 'DATE'
      AND A.1PKCOLUMN + B.2PKCOLUMN > 'DATE+SEQ'
    OPTION (RECOMPILE)

    --or

    SELECT *
    FROM TABLE3 AS A
    LEFT JOIN TABLE4 AS B
           ON A.1PKCOLUMN = B.1PKCOLUMN
          AND A.2PKCOLUMN = B.2PKCOLUMN
    WHERE A.1PKCOLUMN = 'DATE'
      AND A.1PKCOLUMN + A.2PKCOLUMN > 'DATE+SEQ'
    OPTION (RECOMPILE)

    If I change anything in the first query (the one executed before the stats update), every time I get a result like the second one:

    (11020 row(s) affected)
    Table 'TABLE2'. Scan count 321, logical reads 1563,
    Table 'TABLE1'. Scan count 1, logical reads 78,

    (11020 row(s) affected)
    Table 'Worktable'. Scan count 0, logical reads 0,
    Table 'TABLE1'. Scan count 1, logical reads 78,
    Table 'TABLE2'. Scan count 1, logical reads 683,

    or

    Table 'Worktable'. Scan count 0, logical reads 0,
    Table 'TABLE3'. Scan count 1, logical reads 78,
    Table 'TABLE4'. Scan count 1, logical reads 683,

Viewing 15 posts - 1 through 15 (of 35 total)
