Index Theories on adding new columns (Not Nullable)

  Background: We support databases on client machines.  Some of those DBs come back to us so we can run internal upgrade tests to make sure query performance after the upgrade is the same as before the upgrade.

    Ticket: After the upgrade, query performance is lagging on a DB.  After defragmenting/reorganizing the indexes and updating statistics, the issue is still there.  The query (complex, 2K lines) went from 20 seconds to 60 seconds even though the data stayed the same.  Something changed, because the execution plan is completely different.  Looking at the usual suspect tables, the fragmentation is under 1%.

    My Theory: Since I am at my wits' end, all I can think is that defragmenting is not going to work when a table has had a column added.  Two tables used in these queries had columns removed and added.  The added column is not nullable.  We use the MS default fill factor.  As the data moves on the pages, the index is getting corrupted.  The only solution I can think of is dropping the indexes, including the PK, and recreating them, which basically rebuilds the index structures, similar to what happens when you save a change from table design view in SSMS.  I am curious if you know of any articles discussing this; I can't seem to search effectively.  Before I make this a standard for development going forward, I need to know whether my theory is somewhat correct.

  • What do you mean by "As the data moves on the pages the index is getting corrupted"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Also, what was added and removed? These are columns, not fields, just to be sure we understand each other.

    Perhaps you can better explain with some code what happened?

  • I do not think I can edit the original post.  I am talking about columns in the table.

    These are scripts we run through a program to alter existing tables, i.e. Upgrade.sql is the script that holds the DDL.

    -- Drop the old column if it still exists
    IF COL_LENGTH('[Foo].[TableA]', 'ColumnB') IS NOT NULL
    BEGIN
        ALTER TABLE [Foo].[TableA] DROP COLUMN [ColumnB];
    END

    -- Add the new non-nullable column if it is not already there
    IF COL_LENGTH(N'[Foo].[TableB]', 'ColumnAdd') IS NULL
    BEGIN
        ALTER TABLE [Foo].[TableB] ADD [ColumnAdd] [bit] NOT NULL CONSTRAINT [DF_ColumnAdd] DEFAULT (0);
    END

    Here is what I mean by the data in the index being corrupted.  I have Index A on the column Foo.TableB.TableBId.  The index has statistics built on it that know where that data lives within the pages (correct me if I am wrong).  When I add a column, in this case a bit, it will force the data to grow, unlike a nullable column.  When it adds the byte for the bit, the data moves, and the index needs to be rebuilt in order to understand that the historical data changed.  The statistics will be rebuilt with the index and correctly point to the data.  What I think I am seeing is that I added a column and the index is not fragmented, but the statistics on the index are wrong, which masks the issue.  I can't defrag because it is under 5%.  In a mouse analogy, I moved SQL's cheese.  Could MS not have accounted for this behavior?
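    A quick way to sanity-check that theory is to look at when the statistics on the table were last updated and how many modifications they have seen since.  A minimal sketch (Foo.TableB is the placeholder table name from the script above):

    -- Statistics freshness for every stats object on the table
    SELECT  s.name AS stats_name,
            sp.last_updated,
            sp.rows,
            sp.rows_sampled,
            sp.modification_counter
    FROM    sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE   s.object_id = OBJECT_ID(N'Foo.TableB');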

  • 5% doesn't prevent you from rebuilding an index.  The settings you pass to your defrag software might.

    The term "corrupted" is the wrong term here.  The index is not being corrupted.  I might becoming more fragmented but that's not "corruption".

    You generally shouldn't need to rebuild the index you speak of if it's only got 5% logical fragmentation.

    Rebuilding an index is a very expensive way to rebuild statistics.  Just rebuild the statistics on the index in question and the mouse will find the SQL cheese just fine.  Index "A" won't actually be affected unless it's rebuilt to include the new column as a key, it contains an INCLUDE, or it's the Clustered Index.

    In any case, if you add a non-nullable column, it's very likely that you'll have a shedload of page splits on the Clustered Index.  5% logical fragmentation on a very large index won't matter so much, but the Page Density might.  You should check the page fullness (avg_page_space_used_in_percent) for the index using sys.dm_db_index_physical_stats.  Since a non-nullable BIT column frequently only adds 1 byte to each row (unless it has to create another byte in the row header), you might not have a problem with the Clustered Index because of this.  Then again, you might have some huge problems that will require you to rebuild the Clustered Index.
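    For example, a minimal sketch of both checks (Foo.TableB and IX_TableB_TableBId are stand-ins for your actual table and index names):

    -- Page density and logical fragmentation for every index on the table
    SELECT  i.name AS index_name,
            ips.avg_fragmentation_in_percent,
            ips.avg_page_space_used_in_percent,
            ips.page_count
    FROM    sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'Foo.TableB'), NULL, NULL, 'SAMPLED') AS ips
    INNER JOIN sys.indexes AS i
            ON i.object_id = ips.object_id AND i.index_id = ips.index_id;

    -- Rebuilding statistics is far cheaper than rebuilding the index itself
    UPDATE STATISTICS Foo.TableB IX_TableB_TableBId WITH FULLSCAN;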

    All of that notwithstanding, I can't imagine any of this causing the problem you described of "The query (complex 2K lines) went from 20 seconds to 60 seconds when the data stayed the same".  That would imply that the performance issue occurred before you added the column (which would make the data NOT be the same).

    So when did the performance issue actually occur?  Before or after your column changes?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  Thank you very much for these responses; they are helping me with next steps.  I have a baseline DB that, when we upgrade (several DDL scripts), gets the unresponsive query.  I was checking the usual suspects when I decided it was this table that was causing the issue.  The execution plan was a skinny line on baseline, and a fat line post upgrade.  It went from a seek to a scan.  What I can do to test this further is take the baseline and run each script to see which one changed the performance.  You did answer my question above.  I think it is the clustered index, and I bet the other indexes have an INCLUDE on the clustered index.  We do a lot of table changes in a version because of additional features.  I really want to know what is causing this specific issue.  I will test more tomorrow and give some feedback on the post.  Thanks again.

  This is one of those times where the Query Store can really help you (maybe).

    If you can find that query in the Query Store (maybe as a regressed query), then look at the multiple plans for it... you might be able to use the "force plan" option.

    But as Jeff said, rebuilding the indexes probably isn't the right way to go.

    You still have sp_updatestats.
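    A minimal sketch of both options (the query_id and plan_id values are placeholders you would read from the regressed-queries report or sys.query_store_plan):

    -- Refresh statistics database-wide; only stats that need it get updated
    EXEC sys.sp_updatestats;

    -- Or pin the known-good plan for the regressed query in Query Store
    EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 7;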

    MVDBA

  • I'll also add that if the original was taking 20 seconds, there's a very good chance that it needs a little rework to make it more performant and use fewer resources in the process.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  I know I am a bit late in replying.  My theory is correct in this case.  When the column was added, the indexes became fragmented (existing indexes can't find records), so the execution plan changed.  This caused the slowness.  The data did not change beyond the column being added; no DML was done to this table during the upgrade.

    Steps:

    1. Restore the database and confirm the slow-running query.
    2. Remove the FKs referencing the PK and drop the index.  I also dropped the other indexes used to read the data.
    3. Recreate the indexes (all of them); the query performed as fast as prior to the upgrade.
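    (For reference, a rebuild in place usually accomplishes the same thing without having to drop the FKs first; a minimal sketch, using the dbo.[Table] name from the definitions below:)

    -- Rebuild every index on the table in place (PK included); FKs stay intact
    ALTER INDEX ALL ON [dbo].[Table] REBUILD;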

    I did test removing one or the other index (PK vs. IX) to see what happened.  It seemed to not matter as much as I was hoping.  When I did the restore I had the query returning in 1.16, and then after rebuilding it was 10 seconds.  I did notice something about this table that I am now questioning.  The PK AutoId is not the clustered index; one of the parent int FK indexes is, and it is not unique.  Could the issue be that the clustered index is not unique?

    CREATE CLUSTERED INDEX [IX_Table_Field1] ON [dbo].[Table]
    (
        [Parent_Id] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO

    vs.

    ALTER TABLE [dbo].[Table] ADD CONSTRAINT [PK_Table] PRIMARY KEY NONCLUSTERED
    (
        [Table_Id] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO
  • Uniqueness isn't something you need for the clustered index. The choice of clustering key is usually something that is increasing and not often changing, like an id or date field that grows. This is separate from the PK. The PK is for uniqueness.

    In terms of the issues, you need to do some execution plan analysis to see what might be used and not used and what's affecting this.

    When you test, I assume you run these a few times? The first time might load data into RAM, so it will usually be slower. It's more interesting to see the reads (and writes) and execution plan.
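    A minimal sketch of capturing the reads and timings while you test (run the problem query between the two statements):

    SET STATISTICS IO, TIME ON;
    -- ... run the problem query here ...
    SET STATISTICS IO, TIME OFF;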

  I disagree; whenever possible, you want to make the clustered index unique yourself.  Yes, SQL will always force it to be unique anyway (with a hidden uniquifier), but that often leads to performance loss in all sorts of queries where you wouldn't suspect it.

    Also, since row compression is available for your version of SQL (2016), you should strongly consider that as well.  ROW compression is usually a given; PAGE is often valuable, but you should verify it first.  The only time you wouldn't do that is if you are severely CPU constrained.

    You should also explicitly specify FILLFACTOR.  Again, the proper value varies, but that's actually a good reason to explicitly specify it: it makes you consider the best FF for that specific index.

    Putting this all into action for your situation, you might try:

    CREATE UNIQUE CLUSTERED INDEX [Table__CL_Parent_Id] ON [dbo].[Table]
    (
        [Parent_Id], [Table_Id]
    ) WITH ( DATA_COMPRESSION = PAGE /*ROW*/, FILLFACTOR = 98, SORT_IN_TEMPDB = ON ) ON [PRIMARY]
    GO

    SQL DBA, SQL Server MVP (07, 08, 09)  A socialist is someone who will give you the shirt off *someone else's* back.

  • Scott,

    I just had a meeting, and that is what I am going to try next: make the index include the PK Auto Id.  That would make it unique and hopefully will correct the situation we are seeing.  I will look at the fill factor and compression to see if those are something we can use.

  Hoping I can pull you guys back into this discussion, because I was partially wrong.  It seems like the index is not the issue.  I recreated the index and the timing was still bad.  What I have narrowed it down to is a rogue statistic.

    In the baseline I have indexes/auto stats based on Structures_Id.  I only have one statistic on that column.  Here are the stats, ordered:

    -- ***Structures_Id*** is the column I want you to specifically see
    StatsID  StatsName                        StatsColID  ColumnName
    1        IX_MII_Items_TransactionsID      1           Transactions_Id
    2        PK_MII_Items                     1           Items_Id
    3        IX_MII_Items_LaborCostSchemesID  1           LaborCostSchemesId
    4        IX_MII_Items_Structures_Id       1           ***Structures_Id***
    5        _WA_Sys_00000016_1E05700A        1           FormulasId
    6        _WA_Sys_00000020_1E05700A        1           MaterialCostCategoriesId
    7        _WA_Sys_0000000D_1E05700A        1           Sets_Id
    8        _WA_Sys_00000003_1E05700A        1           ItemType
    9        _WA_Sys_0000001C_1E05700A        1           AggregateSets_Id
    10       _WA_Sys_00000014_1E05700A        1           UserQuantity
    11       _WA_Sys_00000013_1E05700A        1           WasteQuantity
    12       _WA_Sys_00000006_1E05700A        1           CalculatedQuantity
    13       _WA_Sys_00000021_1E05700A        1           WorkStationsId
    14       _WA_Sys_00000012_1E05700A        1           ApplicationCategoriesId
    15       _WA_Sys_00000001_1E05700A        1           ItemsId
    16       _WA_Sys_0000000B_1E05700A        1           RecordVersion
    17       _WA_Sys_00000029_1E05700A        1           IsPackageItem
    18       _WA_Sys_00000028_1E05700A        1           IsSpecialOrder
    19       _WA_Sys_0000002C_1E05700A        1           ChangeoverValueId

    In the upgraded database I have an additional Structures_Id WA stat

    StatsID  StatsName                        StatsColID  ColumnName
    1        IX_MII_Items_TransactionsID      1           Transactions_Id
    2        PK_MII_Items                     1           Items_Id
    3        _WA_Sys_0000000C_1E05700A        1           ***Structures_Id***
    4        IX_MII_Items_LaborCostSchemesID  1           LaborCostSchemesId
    5        _WA_Sys_00000016_1E05700A        1           FormulasId
    6        _WA_Sys_00000020_1E05700A        1           MaterialCostCategoriesId
    7        _WA_Sys_0000000D_1E05700A        1           Sets_Id
    8        _WA_Sys_00000003_1E05700A        1           ItemType
    9        _WA_Sys_0000001C_1E05700A        1           AggregateSets_Id
    10       _WA_Sys_00000014_1E05700A        1           UserQuantity
    11       _WA_Sys_00000013_1E05700A        1           WasteQuantity
    12       _WA_Sys_00000006_1E05700A        1           CalculatedQuantity
    13       _WA_Sys_00000021_1E05700A        1           WorkStationsId
    14       _WA_Sys_00000012_1E05700A        1           ApplicationCategoriesId
    15       _WA_Sys_00000001_1E05700A        1           ItemsId
    16       _WA_Sys_0000000B_1E05700A        1           RecordVersion
    17       _WA_Sys_00000029_1E05700A        1           IsPackageItem
    18       _WA_Sys_00000028_1E05700A        1           IsSpecialOrder
    19       _WA_Sys_0000002C_1E05700A        1           ChangeoverValueId
    20       IX_MII_Items_Structures_Id       1           ***Structures_Id***

    I am assuming the query hits the _WA statistic on Structures_Id before it hits the IX index on Structures_Id, and the query loses significant performance.  When I run -

    DROP STATISTICS BusinessMgmt.Items._WA_Sys_0000000C_1E05700A

    The performance is back to normal.  Here is where I am hoping you can help.

    These upgrades happen on customer machines, and I can't analyze each one of them.  Is there a way I can make sure this does not happen?  I know I can turn off auto-created stats on specific tables, but I would prefer not to.  I would like to have a utility that detects columns that are in auto-generated statistics and removes those statistics if the columns are also covered by index-based stats.  I am thinking of using a query like this -

    SELECT  s.stats_id AS StatsID,
            s.name AS StatsName,
            sc.stats_column_id AS StatsColID,
            c.name AS ColumnName
    FROM    sys.stats s
    INNER JOIN sys.stats_columns sc
            ON s.object_id = sc.object_id AND s.stats_id = sc.stats_id
    INNER JOIN sys.columns c
            ON sc.object_id = c.object_id AND sc.column_id = c.column_id
    WHERE   OBJECT_NAME(s.object_id) = 'items'
    ORDER BY s.stats_id, sc.column_id;
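    One direction I am considering for the cleanup is to extend that into something like this (an untested sketch, using the BusinessMgmt schema from the DROP STATISTICS above): find auto-created stats whose leading column matches the leading key column of an index on the same table, and emit DROP STATISTICS commands for them.

    -- Auto-created (_WA) stats that duplicate the leading key column of an index
    SELECT  'DROP STATISTICS '
            + QUOTENAME(SCHEMA_NAME(o.schema_id)) + '.'
            + QUOTENAME(o.name) + '.'
            + QUOTENAME(s.name) AS drop_command
    FROM    sys.stats s
    INNER JOIN sys.objects o
            ON o.object_id = s.object_id
    INNER JOIN sys.stats_columns sc
            ON sc.object_id = s.object_id AND sc.stats_id = s.stats_id AND sc.stats_column_id = 1
    WHERE   s.auto_created = 1
    AND     EXISTS (SELECT 1
                    FROM   sys.index_columns ic
                    WHERE  ic.object_id = s.object_id
                    AND    ic.key_ordinal = 1
                    AND    ic.column_id = sc.column_id);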

    Last question, which I am sure will result in "it depends": why would SQL create a stat when an index clearly already has the column in it?

     

    Thanks ahead of time if you managed to read this far.

  SQL uses statistics to estimate the cardinality and distribution of data in a table to figure out which index or query plan to use.

    If your data changes in volume in any given table, then new stats will be created for the query plans as they change over time, even if you have the "perfect index".

    There is a certain amount of "estimation" on SQL's part - perhaps try using an index hint or a forced plan.
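    For example, a minimal sketch of an index hint (the table, index, and column names are taken from the stats listing above; the query and value are purely illustrative):

    -- Force the optimizer to use the Structures_Id index for this query
    SELECT  i.Items_Id
    FROM    BusinessMgmt.Items AS i WITH (INDEX (IX_MII_Items_Structures_Id))
    WHERE   i.Structures_Id = 42;   -- 42 is a placeholder value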

    MVDBA

  All those _WA stats are system-created, and they do get created, as Mike noted, in response to queries. If the statistics tell the optimizer that your index may, or may not, be useful, then that is factored in. You can specifically create statistics, and sometimes using a full sample size gets you different results. Analyzing the execution plan and the statistics (DBCC SHOW_STATISTICS) will help you see if there is some skew in the stats vs. the data.
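    For example (a sketch; the object and statistic names come from the listings above):

    -- Compare the rogue auto-created stat against the index-based one
    DBCC SHOW_STATISTICS ('BusinessMgmt.Items', '_WA_Sys_0000000C_1E05700A');
    DBCC SHOW_STATISTICS ('BusinessMgmt.Items', 'IX_MII_Items_Structures_Id');

    -- Or refresh the index statistic with a full sample
    UPDATE STATISTICS BusinessMgmt.Items IX_MII_Items_Structures_Id WITH FULLSCAN;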

    You also have to look at the queries behind these execution plans. If more data comes back, or more columns, it's possible you hit a tipping point where SQL thinks a scan might be better than a seek. This is more analysis than stats, and really goes into looking at what changed in the execution plan.

    Grant has a good book that might help you, and perhaps you can restructure queries.

    https://www.sqlservercentral.com/books/sql-server-execution-plans-second-edition-by-grant-fritchey

    In the short term, you could add an Agent job that just removes these stats (identify them by column, not by name) on clients and have that run daily, hourly, etc. However, I'd dig into what is actually broken.

     
