Potential issue with runaway index growth

  • Hi,I'm pretty new to Azure but less new to SQL Server. We have an Azure SQL database with one table of about 300000 rows. So not huge. But, whilst the data appears to be only a few GB in size, the index on the Primary Key (which is just an int column) has grown to in excess of 50GB. If I insert the data from the table into a new one and recreate the index on there I do not see the index grow to anywhere near that size.

    Index maintenance using the Ola Hallengren procedures is in place.

    Has anyone experienced similar behavior in the past and found a way of decreasing the size of the index?

    Many thanks in advance.

    Alex.

  • Here's a question.  Why is this a problem?

    If the primary key is clustered, it IS the table.

    This may be normal.  If you could provide the DDL so we can see the table structure, that may help in diagnosing the cause.

    Is this index being rebuilt regularly by the maintenance scripts? Have you captured the fragmentation of the indexes on this table?

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • If your primary key is not the clustered index, this is extremely concerning. If it is, how are you measuring the table size as compared to the index size? As was already said, if that PK defines the clustered index, that is the table (plus the B+ tree of the index, hence why a clustered index will be bigger overall than the just the data stored in it).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • alexhdavies wrote:

    Hi,I'm pretty new to Azure but less new to SQL Server. We have an Azure SQL database with one table of about 300000 rows. So not huge. But, whilst the data appears to be only a few GB in size, the index on the Primary Key (which is just an int column) has grown to in excess of 50GB. If I insert the data from the table into a new one and recreate the index on there I do not see the index grow to anywhere near that size. Index maintenance using the Ola Hallengren procedures is in place. Has anyone experienced similar behavior in the past and found a way of decreasing the size of the index? Many thanks in advance. Alex.

    Yes... there are actually a couple of things that can cause this and the problem(s) can be exacerbated by doing the wrong type of index maintenance (and, yes... following the "Best Practice" 10/30% rules for index maintenance is actually one of the worst things you can do to a system).

    Please post the CREATE TABLE statement along with any constraints and indexes (post the CREATE INDEX/CONSTRAINT statements) that are on the table and maybe we can hone it down for you.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi All

    The table structure is as below. Whilst i get that a clustered index is the data, the thing I'm struggling to get my head round is why if I script out and recreate the table as a new table, and then insert the same data into it, why doesn't the clustered index on that new table match the same size as the original one?

    Regards why this is a problem, initially there were concerns that this was going to continue to grow day by day and incur increasing costs within Azure. We are also in a position whereby we have to migrate this from one Azure environment to another (long story), and exporting/importing the database when this table was only 1/3 of its current size took 18 hours, so I'd potentially be expecting to have to ask the business to agree to several days worth of downtime to do the migration.


    CREATE TABLE [dbo].[InvestigationForms](

    [Id] [int] IDENTITY(1,1) NOT NULL,

    [InvestigationId] [int] NOT NULL,

    [FormId] [int] NOT NULL,

    [InitialFormJson] [nvarchar](max) NOT NULL,

    [FormJson] [nvarchar](max) NOT NULL,

    [WrPublishedJson] [nvarchar](max) NULL,

    [FiPublishedJson] [nvarchar](max) NULL,

    [DateCreated] [datetime2](7) NOT NULL,

    [UserCreated] [varchar](128) NOT NULL,

    [DateModified] [datetime2](7) NULL,

    [UserModified] [varchar](128) NULL,

    [DateQC] [datetime2](7) NULL,

    [UserQC] [varchar](128) NULL,

    PRIMARY KEY CLUSTERED

    (

    [Id] ASC

    )WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]

    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    GO

    ALTER TABLE [dbo].[InvestigationForms] WITH CHECK ADD CONSTRAINT [FK_InvestigationForms_Form] FOREIGN KEY([FormId])

    REFERENCES [dbo].[Forms] ([Id])

    GO

    ALTER TABLE [dbo].[InvestigationForms] CHECK CONSTRAINT [FK_InvestigationForms_Form]

    GO

    ALTER TABLE [dbo].[InvestigationForms] WITH CHECK ADD CONSTRAINT [FK_InvestigationForms_Investigation] FOREIGN KEY([InvestigationId])

    REFERENCES [dbo].[Investigations] ([Id])

    ON DELETE CASCADE

    GO

    ALTER TABLE [dbo].[InvestigationForms] CHECK CONSTRAINT [FK_InvestigationForms_Investigation]

    GO


     

  • alexhdavies wrote:

    why if I script out and recreate the table as a new table, and then insert the same data into it, why doesn't the clustered index on that new table match the same size as the original one?


    Answer: fragmentation of original index  OR  different fill factor

  • Also, you can implement compression (page or row) and get indexes smaller + less I/O and wiser RAM usage.

    Thanks to Kalen Delaney for the ported version of standard proc :

    https://www.sqlserverinternals.com/blog/2018/6/6/creating-my-own-spestimatedatacompressionsavings

     

     

  • Yeah, I'm with Andrey. Something along those lines could be likely. However, at a fundamental level, comparing the on-premises version of SQL Server to Azure SQL Database at any level is fraught with issues. ASD is updated and advanced far beyond your local version of SQL Server, even if you've got the latest CU installed. That can lead to the occasional weirdness when comparing.

    How are you querying the size?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Andrey wrote:

    Also, you can implement compression (page or row) and get indexes smaller + less I/O and wiser RAM usage. Thanks to Kalen Delaney for the ported version of standard proc : https://www.sqlserverinternals.com/blog/2018/6/6/creating-my-own-spestimatedatacompressionsavings    

    Just remember that compression doesn't work on out of row lobs or overflows.  There may be nearly zero advantage to trying to compress this table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • alexhdavies wrote:

    Hi All The table structure is as below. Whilst i get that a clustered index is the data, the thing I'm struggling to get my head round is why if I script out and recreate the table as a new table, and then insert the same data into it, why doesn't the clustered index on that new table match the same size as the original one?

    First of all, are both of these tables in Azure?  And how long does it take to create that second table, as you mentioned?

    I think I know all the various things that are causing the issue you're seeing but need to know a bit more.  Since you're using OLA's code for Index Maintenance, can you provide a listing of all the index maintenance done on this table for the last 6 months?

    Also, do LOBs in Azure default to "In-Row" (if they fit) or "Out of Row" (even if they do fit)?

    As for the problem being fragmentation... yes, I agree but not for the reasons most people think.  For example, a part of the reason why I want to see the logs from Ola's maintenance is because I think your table is probably "stuck" on doing REORGANIZE and that's very bad because REORGANIZE is one of the worst things you can do to a Clustered Index even if it does contain in-row LOBs.  The 10%/30% thing for index maintenance isn't actually a "Best Practice" even though millions of people think it is.

    And, finally, can the NULLability of the variable width columns be changed to NOT NULL and have a default applied even if the default is a string of spaces?  And can we change the table to force all of the LOBs to be "Out of Row"?

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply