Index Theories on adding new columns (Not Nullable)

  • JKSQL

    SSCrazy Eights

    Points: 8312

    Background: We have databases we support on client machines.  Some of those DBs come back to us so we can run internal upgrade tests to make sure query performance released in upgrade is the same as before upgrade.

    Ticket: After upgrade query performance is lagging on a DB.  After defragmenting/reorganizing/updating statistic indexes the issue is still there.  The query (complex 2K lines) went from 20 seconds to 60 seconds when the data stayed the same.  Something changed because the execution path is completely different.  looking at the usual suspect tables the fragmentation is under 1%.

    My Theory: Since I am at whits end all I can think is defragmenting is not going to work when a table has a column added.  I have two tables that are used in these queries that had fields removed and added.  The added field is not nullable.  We use the MS default fill factor.  As the data moves on the pages the index is getting corrupted.  The only solution I can think of is dropping indexes including the PK and recreating them.  This will basically rebuild the structures of the index.  Similar to what happens when in table design view using SSMS.  I am curious if you know of any articles discussing this.  I can't seem to search effectively.  Before I make this a standard for development going forward I need to know if my theory is somewhat correct.

  • Jeff Moden

    SSC Guru

    Points: 994647

    What do you mean by "As the data moves on the pages the index is getting corrupted"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 715809

    Also, what was added and removed? These are columns, not fields, just to be sure we understand each other.

    Perhaps you can better explain with some code what happened?

  • JKSQL

    SSCrazy Eights

    Points: 8312

    I do not think I can edit the original post.  I am talking about columns in the table.

    These are scripts we run through a program to alter existing tables.  ie Upgrade.sql is the script that hold the DDL

    IF col_length('[Foo].[TableA]','ColumnB') IS NOT NULL
    BEGIN
    ALTER TABLE [Foo].[TableA] DROP COLUMN [ColumnB];
    END

    IF col_length(N'[Foo].[TableB]', 'ColumnAdd') IS NULL
    BEGIN
    ALTER TABLE [Foo].[TableB] ADD [ColumnAdd] [bit] NOT NULL CONSTRAINT [DF_ColumnAdd] DEFAULT (0);
    END

    Here is what I mean by the data in the index corrupted.  I have Index A it is on a column Foo.TableB.TableBId.  The index has built statistics where it knows where that data lives within the pages (Correct me if I am wrong).  When I add a column in this case a Bit it will force the data to grow unlike if the column is nullable.  When it adds the byte for the bit the data moves and the index needs to be rebuilt in order to understand the historical data changed.  The statistics will be rebuilt with the index and correctly point to the data.  What I think I am seeing is that i added a column and the index is not fragmented.  The statistics on the index are wrong so it masks the issue. I can't defrag because it is under 5%.  In a mouse analogy I moved the SQLs cheese.  Could MS not have accounted for this behavior?

  • Jeff Moden

    SSC Guru

    Points: 994647

    5% doesn't prevent you from rebuilding an index.  The settings you pass to your defrag software might.

    The term "corrupted" is the wrong term here.  The index is not being corrupted.  I might becoming more fragmented but that's not "corruption".

    You generally shouldn't need to rebuild the index you speak of if it's only got 5% logical fragmentation.

    Rebuilding an index is a very expensive way to rebuild statistics.  Just rebuild the statistics on the index in question and the mouse will find the SQL cheese just fine.  Index "A" won't actually be affected unless it's rebuilt to include the new column as a key, it contains an INCLUDE, or it's the Clustered Index.

    In any case, if you add a non-nullable column,  it's very likely that you'll have a shedload of page splits on the Clustered Index.  5% logical fragmentation on a very large index won't matter so much but the Page Density might.  You should check the average_percent_of_page_fullness for the index using sys.dm_db_index_physical_stats.  Since a non-nullable BIT column fequently only adds 1 byte to each row (unless it has to create another byte in the row header), you might not have a problem with the Clustered Index because of this.  Then again, you might have some huge problems that will require you to rebuild the Clustered Index.

    All of that notwithstanding, I can't imagine any of this causing the problem you described of "The query (complex 2K lines) went from 20 seconds to 60 seconds when the data stayed the same".  That would imply that the performance issue occurred before you added the column (which would make the data NOT be the same).

    So when did the performance issue actually occur?  Before or after your column changes?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • JKSQL

    SSCrazy Eights

    Points: 8312

    Thank you very much for these responses they are helping me  with next steps.  I have a baseline DB that when we upgrade (Several DDL scripts) gets the unresponsive query.  I was doing usual suspects when I decided it was this table that was causing the issue.  The execution plan was a skinny line on baseline, and a fat line on post upgrade.  It went from a seek to a scan.  What I can do to test this further is actually take the baseline and run each script to see which changed the performance.  You did answer my question above.  I think it is the clustered index and I bet the other indexes have an include on the clustered index.  We do a lot of table changes in a version because of additional features.  I really want to know what is causing this specific issue.  I will test more tomorrow and give some feedback on the post.  Thanks again

  • MVDBA

    SSC-Insane

    Points: 20775

    this is one of those times where the query store can really help you (maybe)

    if you can find that query in the query store (maybe as a regressed query) then look at the multiple plans for it.... you might be able to use the "force plan" option

    but as jeff said, rebuilding the indexes probably isn't the right way to go

    you still have sp_updatestats

    MVDBA

  • Jeff Moden

    SSC Guru

    Points: 994647

    I'll also add that if the original was taking 20 seconds, there's a very good chance that it needs a little rework to make it more performant and use fewer resources in the process.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • JKSQL

    SSCrazy Eights

    Points: 8312

    I know I am a bit late in replying.  My theory is correct in this case.  When the column was added the indexes became fragmented (Existing indexes can't find records) so the execution path changed.  This caused the slowness.  The data did not change beyond the column being added.  No DML was done to this table upon upgrade.

    steps:

    1. restore database and confirm slow running query
    2. Remove FKs referencing PK and drop the index.  I also dropped the other indexes used to read the data
    3. recreated the indexes (all of them) and query performed as fast as prior to the upgrade.

    I did test removing one or the other index (PK vs IX) to see what happened.  It seemed to actually not matter as much as I was hoping.  When I did the restore I had the query returning in 1.16 and then rebuilding it was 10 seconds.  I did notice something about this table that I am now questioning.  The PK AutoId is not the clustered index.  One of the parent int FKs indexes is.  It is not unique.  Could that be the issue that the clustered index is not unique?

    CREATE CLUSTERED INDEX [IX_Table_Field1] ON [dbo].[Table]
    (
    [Parent_Id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO


    vs

    ALTER TABLE [dbo].[Table] ADD CONSTRAINT [PK_Table] PRIMARY KEY NONCLUSTERED
    (
    [Table_Id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO
  • Steve Jones - SSC Editor

    SSC Guru

    Points: 715809

    Uniqueness isn't something you need for the clustered index. The choice of clustering key is usually something that is increasing and not often changing, like an id or date field that grows. This is separate from the PK. The PK is for uniqueness.

    In terms of the issues, you need to do some execution plan analysis to see what might be used and not used and what's affecting this.

    When you test, I assume you run these a few times? The first time might load data into RAM, so it will usually be slower. It's more interesting to see the reads (and writes) and execution plan.

  • ScottPletcher

    SSC Guru

    Points: 98115

    I disagree; when at all possible, you want to make the clustered index unique yourself whenever possible.  Yes, SQL will always force it to be unique anyway, but that often leads to performance loss in all sorts of queries where you wouldn't suspect it.

    Also, since row compression is available for your version of SQL (2016), you should definitely strongly consider that as well.  ROW compression is usually a given, PAGE is often valuable but you should verify it first.  The only time you wouldn't do that is if you are severely CPU constrained.

    You should also explicitly specify FILLFACTOR.  Again, the proper value varies, but that's actually a good reason to explicitly specify it: it makes you consider the best FF for that specific index.

    Putting this all into action for your situation, you might try:

    CREATE UNIQUE CLUSTERED INDEX [Table__CL_Parent_Id] ON [dbo].[Table]

    (

    [Parent_Id] , Table_Id

    ) WITH ( DATA_COMPRESSION = PAGE /*ROW*/, FILLFACTOR = 98, SORT_IN_TEMPDB = ON ) ON [PRIMARY]

    GO

    SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them.

  • JKSQL

    SSCrazy Eights

    Points: 8312

    Scott,

    I just had a meeting and that is what I am going to try next is to make the index have the PK Auto Id in it.  That would make it unique and hopefully will correct the situation we are seeing.  I will look at the fill factor and compression to see if that is something we can use.

Viewing 12 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply