Indexing strategy on a highly transient table

  • Hi,
    I'm just looking for a disscussion about the best indexing strategy for a table I have that's in a constant state of flux, one minute we can have thousands if not tens of thousands of records in it and soon afterwards we can have just a handful.
    To give you and idea of the situation, we have a production line which is being constantly monitored, more often that not some of these values from a single line need investigating by an operator, we need to ensure that once that operator has acquired them that no other operator can take those specific rows, hence they get the ID of the MonitoredValue and their own OperatorID added to the table.
    The monitoring data is held  in a table called MonitoredValue that is forever increasing, currently in excess of 50 million rows.
    Whilst in that table the MonitoredValue isn't returned to the application for others to select (there are other parts of the application where it can be "taken" but I think that over complicates this post)
    Once the operator has finished they release THEIR rows (based on OperatorID) and it becomes accessible to other operators.
    There is a future requirement that an Operator can release just some of their rows.
    Here is my current table schema:
    CREATE TABLE dbo.MonitoredValueInUseBy
        (
            MonitoredValueInUseByID bigint IDENTITY(1,1) NOT NULL CONSTRAINT PK_MonitoredValueInUseBy PRIMARY KEY,
            OperatorID int NULL CONSTRAINT fk_MonitoredValueInUseBy_Operator FOREIGN KEY REFERENCES dbo.Operator (OperatorID),
            LineID int NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_Line FOREIGN KEY REFERENCES dbo.Line (LineID),
            MonitoredValueID bigint NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_MonitoredValue FOREIGN KEY REFERENCES dbo.MonitoredValue (MonitorValueID),
            IsTechnical bit NOT NULL,
            RaisedDateTime datetime NOT NULL,
            BreachDateTime datetime NOT NULL,
            RowVersion timestamp NULL
        )
    Obviously I have an index on MonitoredValueInUseByID and that is the CLUSTERED INDEX.
    I have a composite Non Clustered Index on OperatorID and MonitoredValueID called ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID
    ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID was created to support quick access back to the MonitoredValues table for all the rows a specific Operator has "locked".
    Index stats show that this index is being accessed.
    Now I'm wondering if that index is a help or a hindrance.  I'm certainly getting the odd bottle neck on this table causing blocking through to the MonitoredValue table.
    As I mentioned, this table is constantly being emptied and refilled,  is that too much traffic for the index to be useful ?  Conversely when the table is near full how effectively could I give the operator a view of their selected rows.
    I inherited this table so I don't mind dropping the RowVersion column if I just need to make the table narrower, the BIGINTs are non-negotiable though.
    I'd be very interested to hear the community's opinions on this, hence the post.
    Regards
    Giles

  • giles.clapham - Thursday, October 5, 2017 10:43 AM

    Hi,
    I'm just looking for a disscussion about the best indexing strategy for a table I have that's in a constant state of flux, one minute we can have thousands if not tens of thousands of records in it and soon afterwards we can have just a handful.
    To give you and idea of the situation, we have a production line which is being constantly monitored, more often that not some of these values from a single line need investigating by an operator, we need to ensure that once that operator has acquired them that no other operator can take those specific rows, hence they get the ID of the MonitoredValue and their own OperatorID added to the table.
    The monitoring data is held  in a table called MonitoredValue that is forever increasing, currently in excess of 50 million rows.
    Whilst in that table the MonitoredValue isn't returned to the application for others to select (there are other parts of the application where it can be "taken" but I think that over complicates this post)
    Once the operator has finished they release THEIR rows (based on OperatorID) and it becomes accessible to other operators.
    There is a future requirement that an Operator can release just some of their rows.
    Here is my current table schema:
    CREATE TABLE dbo.MonitoredValueInUseBy
        (
            MonitoredValueInUseByID bigint IDENTITY(1,1) NOT NULL CONSTRAINT PK_MonitoredValueInUseBy PRIMARY KEY,
            OperatorID int NULL CONSTRAINT fk_MonitoredValueInUseBy_Operator FOREIGN KEY REFERENCES dbo.Operator (OperatorID),
            LineID int NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_Line FOREIGN KEY REFERENCES dbo.Line (LineID),
            MonitoredValueID bigint NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_MonitoredValue FOREIGN KEY REFERENCES dbo.MonitoredValue (MonitorValueID),
            IsTechnical bit NOT NULL,
            RaisedDateTime datetime NOT NULL,
            BreachDateTime datetime NOT NULL,
            RowVersion timestamp NULL
        )
    Obviously I have an index on MonitoredValueInUseByID and that is the CLUSTERED INDEX.
    I have a composite Non Clustered Index on OperatorID and MonitoredValueID called ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID
    ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID was created to support quick access back to the MonitoredValues table for all the rows a specific Operator has "locked".
    Index stats show that this index is being accessed.
    Now I'm wondering if that index is a help or a hindrance.  I'm certainly getting the odd bottle neck on this table causing blocking through to the MonitoredValue table.
    As I mentioned, this table is constantly being emptied and refilled,  is that too much traffic for the index to be useful ?  Conversely when the table is near full how effectively could I give the operator a view of their selected rows.
    I inherited this table so I don't mind dropping the RowVersion column if I just need to make the table narrower, the BIGINTs are non-negotiable though.
    I'd be very interested to hear the community's opinions on this, hence the post.
    Regards
    Giles

    My first thought here is whether or not you have AUTO_UPDATE_STATISTICS turned on.   If it's truly volatile, such that the quantity of data is up or down more than the threshold for auto update of statistics to trigger, then it seems at the lowest level that such would make the indexes so configured would at least be useful.   However, at 50 million rows, that might mean you'd need to drop or add 5 to 10 million rows before stats would update, potentially causing bad execution plans due to stale statistics.   I don't recall what the threshold is, or if it's configurable, and someone with more knowledge will need to weigh in on that.   The first thing I'd try to do, though, would be to at least update statistics to at least give you a well-known starting point from which further measurements can be compared to a baseline measurement.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Adding to what Steve said, I think that this also falls under the "it depends" pile.  Having up to date statistics is important for sure, but knowing if indexes are helping or hurting is hard to say from just looking at one table definition.

    It depends on what the queries are doing when they look at the table.  For example, a "SELECT * FROM dbo.MonitoredValueInUseBy" will not likely use the index when it searches the table.  But toss a where clause on things that are in the index and you will get better performance.

    That being said, an insert is a fully blocking operation which means that while an insert is occurring, no other operations can be performed on the table (including SELECTs... in most cases).  It will also depend on if the majority of the operations are inserts and deletes or if they are selects.  Indexes will help SELECT, but hinder INSERT/DELETE as it needs to insert into the main table and into each index.

    What I would do is start by looking at your blocking.  If you see a lot of blocking (which I am expecting), look at the operations that are causing the blocking and go from there.

    EDIT - First, I'd do what Steve suggested (update statistics) and get the baseline.  Afterwards, I'd look for blocking.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Don't have time to give this much thought this morning but I must get one VERY important point out: I would NOT rely on auto-update stats to handle this particular table. I would strongly consider setting up a SQL Agent job to update all statistics on this table on a VERY frequent basis. I have done that for clients as frequently as every 10 minutes. This may merit that or even more frequently.

    Good luck with it - sounds like an interesting scenario (which I don't get much of any more)!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks Kevin, and thanks to Steve and bmg002 too.
    You advise to update all statisitics on this table on a very frequent basis.  By "This" table do you mean the transient dbo.MonitoredValueInUseBy or the large dbo.MonitoredValue table.
    Reading what Steve and bmg002 have said about Statistics I would imagine that the transient table is updated very frequently where as I would understand if the 50million rows table had become stale. (I'm waiting on getting some stats on the Statistics from the prodution system).
    Giles

  • If the data in the table is changing frequently, the statistics should be updated frequently.

    Basically the statistics are used to determine which plan should be used when performing operations on the table.  If the statistics say that there are 50 rows and there are 50 million rows, the execution plan is going to be way off.  If the statistics say there are 500 rows and there are 502 rows, the execution plan will be fairly accurate.

    So depending on how long it is taking to update the statistics on the table, it might not hurt to do both, but if you only have time to do one, do the one that has the most changing data.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • giles.clapham - Friday, October 6, 2017 7:17 AM

    Thanks Kevin, and thanks to Steve and bmg002 too.
    You advise to update all statisitics on this table on a very frequent basis.  By "This" table do you mean the transient dbo.MonitoredValueInUseBy or the large dbo.MonitoredValue table.
    Reading what Steve and bmg002 have said about Statistics I would imagine that the transient table is updated very frequently where as I would understand if the 50million rows table had become stale. (I'm waiting on getting some stats on the Statistics from the prodution system).
    Giles

    Transient...

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • giles.clapham - Thursday, October 5, 2017 10:43 AM

    Obviously I have an index on MonitoredValueInUseByID and that is the CLUSTERED INDEX.
    I have a composite Non Clustered Index on OperatorID and MonitoredValueID called ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID
    ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID was created to support quick access back to the MonitoredValues table for all the rows a specific Operator has "locked".
    Index stats show that this index is being accessed.
    Now I'm wondering if that index is a help or a hindrance.  I'm certainly getting the odd bottle neck on this table causing blocking through to the MonitoredValue table.
    ...
    I'd be very interested to hear the community's opinions on this, hence the post.
    Regards
    Giles

    "Obviously ... CLUSTERED INDEX."

    No, No, NONever slap a clustering identity column on a/every table just because you "always" do. 

    If you do lookups on the table (almost) always by, say, OperatorID and MonitoredValueID, then cluster the table on those first.  Add identity to insure a unique clustering key, as with the amount of delete activity on this table, you don't want that many ghost rows hanging around.  As to potential fragmentation, (1) presumably the rows for a given OperatorID and (set of) MonitoredValueIDs are added at the same time, or very close; and (2) with all the deletes going on, free pages/space should be available relatively close to the page(s) being INSERTed into anyway.

    CREATE TABLE dbo.MonitoredValueInUseBy
    (
    MonitoredValueInUseByID bigint IDENTITY(1,1) NOT NULL CONSTRAINT uq_MonitoredValueInUseBy_MonitoredValueInUseByID UNIQUE NONCLUSTERED,
    OperatorID int NULL CONSTRAINT fk_MonitoredValueInUseBy_Operator FOREIGN KEY REFERENCES dbo.Operator (OperatorID),
    LineID int NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_Line FOREIGN KEY REFERENCES dbo.Line (LineID),
    MonitoredValueID bigint NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_MonitoredValue FOREIGN KEY REFERENCES dbo.MonitoredValue (MonitorValueID),
    IsTechnical bit NOT NULL,
    RaisedDateTime datetime NOT NULL,
    BreachDateTime datetime NOT NULL,
    RowVersion timestamp NULL,
    CONSTRAINT CL_MonitoredValueInUseBy UNIQUE CLUSTERED ( OperatorID, MonitoredValueID, MonitoredValueInUseByID ) WITH ( FILLFACTOR = 99 )

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher - Tuesday, October 10, 2017 11:46 AM

    giles.clapham - Thursday, October 5, 2017 10:43 AM

    Obviously I have an index on MonitoredValueInUseByID and that is the CLUSTERED INDEX.
    I have a composite Non Clustered Index on OperatorID and MonitoredValueID called ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID
    ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID was created to support quick access back to the MonitoredValues table for all the rows a specific Operator has "locked".
    Index stats show that this index is being accessed.
    Now I'm wondering if that index is a help or a hindrance.  I'm certainly getting the odd bottle neck on this table causing blocking through to the MonitoredValue table.
    ...
    I'd be very interested to hear the community's opinions on this, hence the post.
    Regards
    Giles

    "Obviously ... CLUSTERED INDEX."

    No, No, NONever slap a clustering identity column on a/every table just because you "always" do. 

    If you do lookups on the table (almost) always by, say, OperatorID and MonitoredValueID, then cluster the table on those first.  Add identity to insure a unique clustering key, as with the amount of delete activity on this table, you don't want that many ghost rows hanging around.  As to potential fragmentation, (1) presumably the rows for a given OperatorID and (set of) MonitoredValueIDs are added at the same time, or very close; and (2) with all the deletes going on, free pages/space should be available relatively close to the page(s) being INSERTed into anyway.

    CREATE TABLE dbo.MonitoredValueInUseBy
    (
    MonitoredValueInUseByID bigint IDENTITY(1,1) NOT NULL CONSTRAINT uq_MonitoredValueInUseBy_MonitoredValueInUseByID UNIQUE NONCLUSTERED,
    OperatorID int NULL CONSTRAINT fk_MonitoredValueInUseBy_Operator FOREIGN KEY REFERENCES dbo.Operator (OperatorID),
    LineID int NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_Line FOREIGN KEY REFERENCES dbo.Line (LineID),
    MonitoredValueID bigint NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_MonitoredValue FOREIGN KEY REFERENCES dbo.MonitoredValue (MonitorValueID),
    IsTechnical bit NOT NULL,
    RaisedDateTime datetime NOT NULL,
    BreachDateTime datetime NOT NULL,
    RowVersion timestamp NULL,
    CONSTRAINT CL_MonitoredValueInUseBy UNIQUE CLUSTERED ( OperatorID, MonitoredValueID, MonitoredValueInUseByID ) WITH ( FILLFACTOR = 99 )

    I probably missed it due to being exhausted and in a hurry, but if there isn't a need for the identity at all wouldn't it be far more efficient to just let the uniquefier handle row pointers for the NONunique clustered index? That would save ALL the costs (numerous) and storage of maintaining the second unique nonclustered index.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru - Tuesday, October 10, 2017 12:58 PM

    ScottPletcher - Tuesday, October 10, 2017 11:46 AM

    giles.clapham - Thursday, October 5, 2017 10:43 AM

    Obviously I have an index on MonitoredValueInUseByID and that is the CLUSTERED INDEX.
    I have a composite Non Clustered Index on OperatorID and MonitoredValueID called ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID
    ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID was created to support quick access back to the MonitoredValues table for all the rows a specific Operator has "locked".
    Index stats show that this index is being accessed.
    Now I'm wondering if that index is a help or a hindrance.  I'm certainly getting the odd bottle neck on this table causing blocking through to the MonitoredValue table.
    ...
    I'd be very interested to hear the community's opinions on this, hence the post.
    Regards
    Giles

    "Obviously ... CLUSTERED INDEX."

    No, No, NONever slap a clustering identity column on a/every table just because you "always" do. 

    If you do lookups on the table (almost) always by, say, OperatorID and MonitoredValueID, then cluster the table on those first.  Add identity to insure a unique clustering key, as with the amount of delete activity on this table, you don't want that many ghost rows hanging around.  As to potential fragmentation, (1) presumably the rows for a given OperatorID and (set of) MonitoredValueIDs are added at the same time, or very close; and (2) with all the deletes going on, free pages/space should be available relatively close to the page(s) being INSERTed into anyway.

    CREATE TABLE dbo.MonitoredValueInUseBy
    (
    MonitoredValueInUseByID bigint IDENTITY(1,1) NOT NULL CONSTRAINT uq_MonitoredValueInUseBy_MonitoredValueInUseByID UNIQUE NONCLUSTERED,
    OperatorID int NULL CONSTRAINT fk_MonitoredValueInUseBy_Operator FOREIGN KEY REFERENCES dbo.Operator (OperatorID),
    LineID int NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_Line FOREIGN KEY REFERENCES dbo.Line (LineID),
    MonitoredValueID bigint NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_MonitoredValue FOREIGN KEY REFERENCES dbo.MonitoredValue (MonitorValueID),
    IsTechnical bit NOT NULL,
    RaisedDateTime datetime NOT NULL,
    BreachDateTime datetime NOT NULL,
    RowVersion timestamp NULL,
    CONSTRAINT CL_MonitoredValueInUseBy UNIQUE CLUSTERED ( OperatorID, MonitoredValueID, MonitoredValueInUseByID ) WITH ( FILLFACTOR = 99 )

    I probably missed it due to being exhausted and in a hurry, but if there isn't a need for the identity at all wouldn't it be far more efficient to just let the uniquefier handle row pointers for the NONunique clustered index? That would save ALL the costs (numerous) and storage of maintaining the second unique nonclustered index.

    One would normally think so.  And I'm probably one of the few people here who normally doesn't mind a nonunique clustering key.  In most situations, this doesn't cause an issue.

    But in this specific case, the issue, as I see it, is with how SQL actually implements the nonunique clustering key.  To determine the next uniquifier value to use, SQL keeps the previous high-value uniquifier row as a "ghost" row, if SQL "thinks" it needs to.  And given the description of activities for this table, I'm afraid it would result in a large number, or perhaps even a very large number, of ghost rows having to be retained.  Given that, it might be best to keep the identity in the table and use it to guarantee a unique key, relieving SQL of the need to keep any ghost rows.  

    You could almost certainly get away with not having a separate nonclustered index on identity.  I probably would not create another index on such a table in  my shop, but not having it tends to freak some people out.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Do you have a link for the ghost record creation for keeping track of the uniquefier value? I thought I had one but cannot find it.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks Scott, the Clustered Index was chosen to be on the Identity Column due to the large number of new rows being inserted into the table.  What you've said about fragmetation and the free pages in a table such as this makes an interesting point.

  • That makes sense, sorry about the rant.  It's just that I've seen far too many database "designs" where every table got an identity clustering key slapped on it before the rest of the columns were even known.

    The pernicious myth that an identity should (almost) "always" be the clustering key has probably damaged overall database performance more than anything else, even more than some violations of normal forms.

    Besides, a bit of fragmentation is not all that terrible, particularly versus the gains from being able to process rows in contiguous blocks.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher - Tuesday, October 10, 2017 11:46 AM

    giles.clapham - Thursday, October 5, 2017 10:43 AM

    Obviously I have an index on MonitoredValueInUseByID and that is the CLUSTERED INDEX.
    I have a composite Non Clustered Index on OperatorID and MonitoredValueID called ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID
    ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID was created to support quick access back to the MonitoredValues table for all the rows a specific Operator has "locked".
    Index stats show that this index is being accessed.
    Now I'm wondering if that index is a help or a hindrance.  I'm certainly getting the odd bottle neck on this table causing blocking through to the MonitoredValue table.
    ...
    I'd be very interested to hear the community's opinions on this, hence the post.
    Regards
    Giles

    "Obviously ... CLUSTERED INDEX."

    No, No, NONever slap a clustering identity column on a/every table just because you "always" do. 

    If you do lookups on the table (almost) always by, say, OperatorID and MonitoredValueID, then cluster the table on those first.  Add identity to insure a unique clustering key, as with the amount of delete activity on this table, you don't want that many ghost rows hanging around.  As to potential fragmentation, (1) presumably the rows for a given OperatorID and (set of) MonitoredValueIDs are added at the same time, or very close; and (2) with all the deletes going on, free pages/space should be available relatively close to the page(s) being INSERTed into anyway.

    CREATE TABLE dbo.MonitoredValueInUseBy
    (
    MonitoredValueInUseByID bigint IDENTITY(1,1) NOT NULL CONSTRAINT uq_MonitoredValueInUseBy_MonitoredValueInUseByID UNIQUE NONCLUSTERED,
    OperatorID int NULL CONSTRAINT fk_MonitoredValueInUseBy_Operator FOREIGN KEY REFERENCES dbo.Operator (OperatorID),
    LineID int NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_Line FOREIGN KEY REFERENCES dbo.Line (LineID),
    MonitoredValueID bigint NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_MonitoredValue FOREIGN KEY REFERENCES dbo.MonitoredValue (MonitorValueID),
    IsTechnical bit NOT NULL,
    RaisedDateTime datetime NOT NULL,
    BreachDateTime datetime NOT NULL,
    RowVersion timestamp NULL,
    CONSTRAINT CL_MonitoredValueInUseBy UNIQUE CLUSTERED ( OperatorID, MonitoredValueID, MonitoredValueInUseByID ) WITH ( FILLFACTOR = 99 )

    Heh.... No, NO, NO!!!  Using the Clustered index in the manner you suggest will actually slow things down because the clustered index is wider than a non-Clustered index.  Page splits on the Clustered Index are VERY time consuming not to mention that every non-Clustered index will now be wider than if you put the Clustered Index on the ID column.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Wednesday, October 11, 2017 10:28 PM

    ScottPletcher - Tuesday, October 10, 2017 11:46 AM

    giles.clapham - Thursday, October 5, 2017 10:43 AM

    Obviously I have an index on MonitoredValueInUseByID and that is the CLUSTERED INDEX.
    I have a composite Non Clustered Index on OperatorID and MonitoredValueID called ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID
    ix_MonitoredValueInUseBy_OperatorID_MonitoredValueID was created to support quick access back to the MonitoredValues table for all the rows a specific Operator has "locked".
    Index stats show that this index is being accessed.
    Now I'm wondering if that index is a help or a hindrance.  I'm certainly getting the odd bottle neck on this table causing blocking through to the MonitoredValue table.
    ...
    I'd be very interested to hear the community's opinions on this, hence the post.
    Regards
    Giles

    "Obviously ... CLUSTERED INDEX."

    No, No, NONever slap a clustering identity column on a/every table just because you "always" do. 

    If you do lookups on the table (almost) always by, say, OperatorID and MonitoredValueID, then cluster the table on those first.  Add identity to insure a unique clustering key, as with the amount of delete activity on this table, you don't want that many ghost rows hanging around.  As to potential fragmentation, (1) presumably the rows for a given OperatorID and (set of) MonitoredValueIDs are added at the same time, or very close; and (2) with all the deletes going on, free pages/space should be available relatively close to the page(s) being INSERTed into anyway.

    CREATE TABLE dbo.MonitoredValueInUseBy
    (
    MonitoredValueInUseByID bigint IDENTITY(1,1) NOT NULL CONSTRAINT uq_MonitoredValueInUseBy_MonitoredValueInUseByID UNIQUE NONCLUSTERED,
    OperatorID int NULL CONSTRAINT fk_MonitoredValueInUseBy_Operator FOREIGN KEY REFERENCES dbo.Operator (OperatorID),
    LineID int NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_Line FOREIGN KEY REFERENCES dbo.Line (LineID),
    MonitoredValueID bigint NOT NULL CONSTRAINT fk_MonitoredValueInUseBy_MonitoredValue FOREIGN KEY REFERENCES dbo.MonitoredValue (MonitorValueID),
    IsTechnical bit NOT NULL,
    RaisedDateTime datetime NOT NULL,
    BreachDateTime datetime NOT NULL,
    RowVersion timestamp NULL,
    CONSTRAINT CL_MonitoredValueInUseBy UNIQUE CLUSTERED ( OperatorID, MonitoredValueID, MonitoredValueInUseByID ) WITH ( FILLFACTOR = 99 )

    Heh.... No, NO, NO!!!  Using the Clustered index in the manner you suggest will actually slow things down because the clustered index is wider than a non-Clustered index.  Page splits on the Clustered Index are VERY time consuming not to mention that every non-Clustered index will now be wider than if you put the Clustered Index on the ID column.

    I don't think so.  Presuming that the MonitoredValueID is ascending, there shouldn't be that many page splits.  Even if there are some, given the number of deletions, there should be free pages within the same extent or very near by.

    As to non-clus indexes, the point is to avoid them here.  If you always come in by OperatorID -- with or without other key values -- there's no need for a non-clus index.  Which, btw, take up separate buffer space from the main table, and greatly contribute to potential deadlocks in these types of situations, since SQL often initially "thinks" it can use a non-clus index for a DELETE/INSERT task, then switches to a table scan instead.in its final plan.

    Often such proper clustering also greatly reduces join overhead, particularly by allowing merge joins, which can be extraordinarily efficient.  That is probably not much of a factor in this case, as I don't see this table being joined to other tables that often.

    At any rate, this should make an interesting test case.  I would expect this table to do much better with a more customized clustering than generic default to identity.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply