Clustered Index suggestion requested

  • I have inherited a system which has very few tables with indexes. One table has 18K rows, expected to grow to 25K by year end, row size is 140 bytes. Def is below:

    CREATE TABLE [dbo].[CEOData](

    [StatisticsDate] [datetime] NULL,

    [Seq] [int] NULL,

    [Format] [int] NULL,

    [Stats] [varchar](50) NULL,

    [Actual] [numeric](18, 2) NULL,

    [Budget] [numeric](18, 2) NULL,

    [Variance] [numeric](18, 2) NULL,

    [PriorYear] [numeric](18, 2) NULL

    )

    This table is never updated, 40 rows inserted daily, queried daily by StatisticsDate.

    StatisticsDate + Stats would be unique.

    I've researched the criteria for clustered and non-clustered indexes ( http://www.sqlskills.com/blogs/kimberly/guids-as-primary-keys-andor-the-clustering-key/ and http://www.sqlservercentral.com/blogs/aschenbrenner/2010/08/31/unique-and-non_2D00_unique-non_2D00_clustered-indexes-on-a-unique-clustered-index/ ), and if I were to create a unique clustered index on StatisticsDate + Stats, it would be unique, static and ever-increasing. But it would not be narrow. The Seq and Format columns are strictly for use in a report (neither is unique at all).
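    Concretely, the index I have in mind would be something along these lines (just a sketch; the index name is a placeholder, and it assumes I also change the two key columns to NOT NULL):

    -- Unique clustered index keyed on the daily query column plus the tie-breaker column
    CREATE UNIQUE CLUSTERED INDEX IX_CEOData_StatisticsDate_Stats
        ON dbo.CEOData (StatisticsDate, Stats);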

    The SP which inserts into this table is slow, mostly because of the number of tables and views the data is pulled from. The report that is generated from this table is quite slow. I feel certain that having an index would help with that. This table is not accessed by end-users.

    The part that I am hung up on is the 'narrow' aspect of a good clustered index. Should the fact that StatisticsDate + Stats would be up to 60 bytes prevent me from using it as a clustered index?

    Thanks for your suggestions.

  • A few things to consider:

    I have inherited a system which has very few tables with indexes. One table has 18K rows, expected to grow to 25K by year end, row size is 140 bytes.

    It's good to have a primary key and a clustered index on every table; this is generally considered a best practice. That said, 25K is not a lot of records, so if you have a lot to get done I personally would not worry too much about this.

    CREATE TABLE [dbo].[CEOData](

    [StatisticsDate] [datetime] NULL,

    [Seq] [int] NULL,

    [Format] [int] NULL,

    [Stats] [varchar](50) NULL,

    [Actual] [numeric](18, 2) NULL,

    [Budget] [numeric](18, 2) NULL,

    [Variance] [numeric](18, 2) NULL,

    [PriorYear] [numeric](18, 2) NULL

    )

    Not to get off topic, but if you have control over the table design, you would probably benefit from setting these columns to NOT NULL unless they MUST be NULLable.
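    For example, something like this (a rough sketch; it assumes the existing rows contain no NULLs in these columns, otherwise the ALTER will fail, and the remaining columns could be handled the same way):

    -- Make the two columns that would form the clustering key mandatory
    ALTER TABLE dbo.CEOData ALTER COLUMN StatisticsDate datetime NOT NULL;
    ALTER TABLE dbo.CEOData ALTER COLUMN Stats varchar(50) NOT NULL;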

    This table is never updated, 40 rows inserted daily, queried daily by StatisticsDate.

    StatisticsDate + Stats would be unique.... if I were to create a unique clustered index on StatisticsDate + Stats, it would be unique, static and ever-increasing. But it would not be narrow.

    You would have to change those columns to NOT NULL first. That's not an ideal clustering key, but based on this design and the number of rows that you are dealing with, I think it would be fine.
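    If you also want the primary key mentioned above, then instead of a plain unique clustered index you could use a clustered primary key constraint on those two columns, something like this (a sketch; the constraint name is just a placeholder, and it requires both columns to be NOT NULL):

    -- Clustered primary key on the column combination that is unique per day
    ALTER TABLE dbo.CEOData
        ADD CONSTRAINT PK_CEOData PRIMARY KEY CLUSTERED (StatisticsDate, Stats);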

    The SP which inserts into this table is slow, mostly because of the number of tables and views the data is pulled from.

    Adding indexes will only make those inserts slower. Just an FYI.

    The report that is generated from this table is quite slow. I feel certain that having an index would help with that. This table is not accessed by end-users.

    An index may not help. If you only have 18K records in there and the report is slow, it's probably not due to a lack of indexes on the table. What kind of reporting software are you using? How are the reports accessing the table (e.g. via stored procedure, ad-hoc SQL query)?

    The part that I am hung up on is the 'narrow' aspect of a good clustered index. Should the fact that StatisticsDate + Stats would be up to 60 bytes prevent me from using it as a clustered index?

    Thanks for your suggestions.

    In this case that might be your best option, and not a terrible one considering the small number of rows you are dealing with. For a better suggestion, can you post some examples of the queries that are run against this table?

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

Don't get too hung up on the "narrow" aspect of what makes a good index key. Instead of "narrow", think "as narrow as practical." Then, suddenly, your key structure is not at all out of whack. You're only talking about a datetime and a varchar(50), so roughly 60 bytes at most as you calculated, and likely less in practice. It'll be fine. Since your primary access path is through the date field, making this the clustered key is a great idea. You should see very good index seeks to retrieve 40 rows out of 20K rows. And, if you were filtering on date before without an index, you've likely been scanning all 20K rows, so this should be a great improvement.
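    For example, a daily query of this shape (just an illustration, assuming the report pulls one day's rows by StatisticsDate as described above; the variable name and date are made up) should turn into a clustered index seek:

    DECLARE @ReportDate datetime = '20240115'; -- illustrative date only

    -- Seek on the leading key column of the new clustered index
    SELECT [StatisticsDate], [Seq], [Format], [Stats], [Actual], [Budget], [Variance], [PriorYear]
    FROM dbo.CEOData
    WHERE [StatisticsDate] = @ReportDate;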

    Get an execution plan before you implement the index and save it. Also capture some average execution times and reads, again before you implement the index. You don't want to just throw the index on there and then not know whether, or how well, it worked. Then you can compare before and after.
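    Something along these lines will capture the reads and timings for that baseline (a minimal sketch; in SSMS you can also turn on "Include Actual Execution Plan" for the same run):

    SET STATISTICS IO ON;    -- logical and physical reads per table
    SET STATISTICS TIME ON;  -- CPU and elapsed time

    -- run the report query here, e.g. the SELECT by StatisticsDate shown above

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;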

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning
