Regarding update stats

  • Hi All,

    We have a big database, ~5 TB. We update the stats with a full scan once a week, and it runs for more than a day. Since the maintenance window is limited, we exit if it takes too long. My question is: what is the best approach to updating stats for fairly large, highly volatile tables? Do we need to break all the small tables out into a separate job and update stats on the big tables in another job? Please give some ideas. I even tried increasing MAXDOP to 6 and changing the sample to 70, but with 70 percent sampling we saw performance issues during that week, so I had to revert to 100 percent sampling.

    We are also doing an exercise of cleaning up all the backup tables created by the application team as part of their weekly agile releases, and archiving the LOG tables that have data more than 3 years old. Please provide some input on updating stats for large, highly volatile tables and how to deal with them. Please share some of your real-world experiences.

    -- Below is the code scheduled inside a SQL Agent job:

    EXECUTE [SQLDBA_utils].[dbo].[IndexOptimize]
        @Databases = 'dbname',
        @MaxDOP = 4,
        @UpdateStatistics = 'ALL',
        @StatisticsSample = 100,
        @OnlyModifiedStatistics = 'Y',
        @TimeLimit = 86400,  -- 24 hours in seconds; exit if it runs longer
        @LogToTable = 'Y';


    Regards,

    Sam

  • IndexOptimize: are you only updating statistics, or also doing index maintenance? (You may want to skip index maintenance.)

    The name resembles Ola's.


    https://dba.stackexchange.com/questions/197532/ola-index-and-update-stats

    There is a third scenario, where you can update only statistics, using the following code (from Ola's site):

    EXECUTE dbo.IndexOptimize
        @Databases = 'USER_DATABASES',
        @FragmentationLow = NULL,
        @FragmentationMedium = NULL,
        @FragmentationHigh = NULL,
        @UpdateStatistics = 'ALL',
        @OnlyModifiedStatistics = 'Y';

    Is that 5 TB constantly refreshing? If you have stale or read-only data, you might consider partitioning.

    SQL Server 2014 introduced incremental statistics for partitioned tables: https://techcommunity.microsoft.com/t5/core-infrastructure-and-security/incremental-statistics-my-new-favorite-thing/ba-p/371171
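
    A rough sketch of what that can look like, assuming a hypothetical [dbo].[BigLog] table that is already partitioned on a date column (the partition number is made up):

    -- Hypothetical sketch: incremental statistics on a partitioned table.
    -- Requires SQL Server 2014 or later.
    ALTER DATABASE [dbname] SET AUTO_CREATE_STATISTICS ON (INCREMENTAL = ON);

    -- Rebuild the table's statistics once as incremental.
    UPDATE STATISTICS [dbo].[BigLog] WITH FULLSCAN, INCREMENTAL = ON;

    -- Afterwards, refresh only the partition(s) that actually changed.
    UPDATE STATISTICS [dbo].[BigLog] WITH RESAMPLE ON PARTITIONS (42);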

  • Agreed.  Partition the data if you need to update stats on a very large table and you can separate active data from inactive data.
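
    A minimal sketch of that idea, with hypothetical names, partitioning by year so that closed-out years end up in their own, effectively read-only, partitions:

    -- Hypothetical sketch: yearly partitions to separate active from inactive data.
    CREATE PARTITION FUNCTION pf_YearRange (date)
        AS RANGE RIGHT FOR VALUES ('2021-01-01', '2022-01-01', '2023-01-01');

    CREATE PARTITION SCHEME ps_YearRange
        AS PARTITION pf_YearRange ALL TO ([PRIMARY]);

    -- The large table would then be (re)created on ps_YearRange(SomeDateColumn),
    -- so stats work can focus on the current partition.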

    SQL DBA, SQL Server MVP (07, 08, 09). A socialist is someone who will give you the shirt off *someone else's* back.

  • Thanks Scott. Some vendor apps do not support partitioning. Any other options?

  • Sam, you didn't answer Jo's point above.

    Are you actually using IndexOptimize correctly?

    Have you read Ola's documentation to verify your job is actually only doing statistics maintenance and not also index maintenance?

    Have you modified the default Ola code to override @FragmentationMedium and @FragmentationHigh to NULL? (Unlike @FragmentationLow, their defaults are not NULL.)

    If not, the code you provided will be doing index maintenance as well.

    Look at examples C & D: https://ola.hallengren.com/sql-server-index-and-statistics-maintenance.html

    You can see that, for statistics-only maintenance, Ola tells you to pass the fragmentation low/medium/high values as well.
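
    Put differently, a statistics-only version of your job would need to look something like this (your original parameters, plus the three fragmentation parameters explicitly nulled out):

    EXECUTE [SQLDBA_utils].[dbo].[IndexOptimize]
        @Databases = 'dbname',
        @FragmentationLow = NULL,     -- no index maintenance at all
        @FragmentationMedium = NULL,
        @FragmentationHigh = NULL,
        @MaxDOP = 4,
        @UpdateStatistics = 'ALL',
        @OnlyModifiedStatistics = 'Y',
        @StatisticsSample = 100,
        @TimeLimit = 86400,
        @LogToTable = 'Y';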


  • Compress the data.  Look at both page compression and COMPRESS/DECOMPRESS for large char columns.
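
    For example (table and column names are hypothetical; COMPRESS/DECOMPRESS need SQL Server 2016+):

    -- Page-compress an existing table.
    ALTER TABLE dbo.BigTable REBUILD WITH (DATA_COMPRESSION = PAGE);

    -- Store a large character payload GZip-compressed...
    DECLARE @bigText nvarchar(max) = N'...a very large document...';
    INSERT dbo.Docs (DocId, Body) VALUES (1, COMPRESS(@bigText));

    -- ...and cast it back to the original type on the way out.
    SELECT CAST(DECOMPRESS(Body) AS nvarchar(max)) AS Body FROM dbo.Docs;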

    SQL DBA, SQL Server MVP (07, 08, 09). A socialist is someone who will give you the shirt off *someone else's* back.

  • I have the same question.  It sounds as if you are rebuilding indexes.

    Second, these scripts perform an update statistics on the table with a full scan. A table typically has many statistics created on it; I would suggest updating each statistic individually, and only if it needs it. That is far less overhead and is usually much faster (see the example after the query below).

    This query can give you a view of the statistics and their usage. The PctMod column is somewhat misleading: if a single row is updated 100 times, it reports 100 in the modification_counter column.

    -- Statistics and their modification activity for the current database.
    -- Note: PctMod can exceed 100, since modification_counter counts every
    -- modification, even repeated updates to the same row.
    SELECT DB_NAME(DB_ID()) AS DatabaseName
          ,S.name AS SchemaName
          ,O.name AS TableName
          ,O.object_id
          ,STAT.name AS StatName
          ,CONVERT(numeric(18, 2),
               ROUND(
                   CONVERT(numeric(18, 2), STATPROP.modification_counter)
                   / CASE
                         WHEN STATPROP.rows = 0 THEN 1.00
                         ELSE CONVERT(numeric(18, 2), STATPROP.rows)
                     END * 100
               , 2)
           ) AS PctMod
          ,STATPROP.rows
          ,STATPROP.modification_counter
          ,STATPROP.last_updated
    FROM sys.objects O
    INNER JOIN sys.schemas S ON O.schema_id = S.schema_id
    INNER JOIN sys.stats STAT ON STAT.object_id = O.object_id
    CROSS APPLY sys.dm_db_stats_properties(STAT.object_id, STAT.stats_id) STATPROP
    WHERE S.name <> 'sys';
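
    Then, for anything the query flags, update just that statistic (hypothetical table and statistic names):

    -- Refresh a single statistic instead of every stat on the table.
    UPDATE STATISTICS dbo.EMP NCL_EMP_c2 WITH FULLSCAN;

    -- Or, on a very large table, an explicit smaller sample.
    UPDATE STATISTICS dbo.EMP NCL_EMP_c2 WITH SAMPLE 10 PERCENT;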

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Thanks everyone. Until now, I was under the impression that it was only updating stats.

    One question came to mind: what sampling percent is used if I don't specify any value?

    If I don't specify the value, will the same sample percent be used for all tables, or will the sample differ per table based on each table's row count? I checked the CommandLog; it doesn't show me the sample percentage.

    SELECT * FROM [master].[dbo].[CommandLog];

    Command
    UPDATE STATISTICS [dbo].[DEPT] [CL_DEPT_c1]
    UPDATE STATISTICS [dbo].[DEPT] [NCL_DEPT_c2]
    UPDATE STATISTICS [dbo].[EMP] [CL_EMP_c1]
    UPDATE STATISTICS [dbo].[EMP] [NCL_EMP_c2]
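
    The closest thing I can see after the run is comparing rows_sampled to rows in sys.dm_db_stats_properties, which at least shows the effective sample that was used:

    SELECT OBJECT_NAME(ST.object_id) AS TableName,
           ST.name AS StatName,
           SP.rows,
           SP.rows_sampled,  -- effective sample percent = rows_sampled / rows * 100
           SP.last_updated
    FROM sys.stats ST
    CROSS APPLY sys.dm_db_stats_properties(ST.object_id, ST.stats_id) SP
    WHERE OBJECT_NAME(ST.object_id) IN ('DEPT', 'EMP');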

    Regards,

    Sam

  • Another question: how do I tell which tables need FULLSCAN and which don't? Sometimes we can't afford to run 100% on each and every statistic, and even 10% is a high number for a large table. Any ideas on the approach for updating stats on those large, highly volatile tables?

    Regards,

    Sam

  • ScottPletcher wrote:

    Compress the data.  Look at both page compression and COMPRESS/DECOMPRESS for large char columns.

    As a bit of a sidebar, be aware that page compression causes index rebuilds to take substantially longer. You'll also get a whole lot more fragmentation (which also means a whole lot more page splits, waits, and log-file activity) on what used to be fixed-width datatypes if you're in the habit of doing inserts of NULL followed by updates to those formerly fixed-width columns.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
