Descending Clustered Index

  • I have an audit table with approx 9 million rows in it (it's not my table)

    At present there is a clustered index on the key (unique identifier) and i want to change this so that the clustered index is on date (as most of the queries will scan across or between dates)

    What i was thinking was that if i added a clustered index (after removing the one on the key) and set the field option to desc then i would hit the most recent records first, as they would be at the 'front' of the disk storage.

    Would this be faster if i was querying on recent values (i.e. withi a month) ???

     

    cheers

    dbgeezer

  • I had a similar issue once and I found that there is no real way of speeding up the query through the index.

    If you select * from the table concerned where the identity is the clsutered index and the coulumndate is between or greater than a date you will see that it still uses the clustered index anyway.

    I don't see why anything at the beginning would be picked up first. If it knows about the index it will go straight to it. I stand corrected though.

    Anyway, the table I had had over 50 million rows. I got around it by creating a field made up of month and year and encouraged queries for a period rather than a date range. I found that 30 second queries ran in a couple of seconds.

    Hope this helps you out.


    ------------------------------
    The Users are always right - when I'm not wrong!

  • Assuming your audit table is going to be continually updated, making a DESCENDING clustered index on date will see SQL Server doing more work; everytime you add a row, SQL Server will be forced to reorganise every row in the table so your new row is at the front, slow work for a large table.

    If profiling this table shows that the majority of queries and the slowest of queries are date range queries, than ASCENDING clustered index on date may see better performance. When you add new rows, SQL Server will stick them on the end of the table, which will mean fast updates, and fast(er) selects on the ranges.

    Only if you insert or update a row with a date < max(date) will SQL Server need to reorganise rows.

    Once you've made the clustered index change, use profiler to monitor your workload again to check perfomance.  Use Query Analyzer with Show Execution Plan to see if SQL Server used the clustered index for hints.

     

     


    Julian Kuiters
    juliankuiters.id.au

  • Anyone else out there correct me if I'm wrong.

    Julian, I think you misunderstand the way a Clustered index works.  During an index reorg, or build you may be correct.

    During an insert the new data will be placed wherever there is room.  In a clustered index, the rows within a data page are physicaly in order.  But the "data pages" may not be in physical order. The higher level indexes will resolve to the data page.  Which functions as the leave level (lowest node) in a clustered index.

    Some data may be moved around to populate the new pages resulting from splits.  If you insert 2/15/04 and there is already a data page with a 2/1 and 2/30 the new row will be inserted into that page. Assuming there is room.  But if the max data is 5/1 and you insert 6/1 a new page may be created wherever there is free space.  The higher level indexes resolve the physical out of sequence.

    From your description, there would never be a need to reorg a clustered index. I see them get mildly fragmented on a regular basis.

    Finally, I need to do some digging as the little light bulb just went on.  I believe SQL Server does it natively, for UDB(DB2) there is an option on the index create statement to support bidrectional queries.  That said the same index can be used to search in Asc or Desc order.  At least on UDB, I'll see if I can find any info on SQL Servers equivalant.


    KlK

  • Grr.. I did a big reply and just lost it.

    Yes SQL Server organises the rows inside the data pages in physical order, and with no BOL info to the contrary can assume that the data pages are not in physical order. In that case it would not matter if the clustered index was ASC or DESC.

    IF you are adding rows outside the existing date range, then SQL Server would tack them on the end / start of the existing data pages till they are full, and just add a new data page as needed.

    The performance hit would come when you whack dates in the middle. When an index data page becomes full, SQL Server splits the page, moving 50% of the data to the new page, adding another Intermediate Level data page. This is also where fragmentation begins. So during your INSERT/UPDATE there is the increased hit of the page split. But then subsequent SELECTs using the index have deeper index pages to dig through. A index reorganise or drop/create would see SQL Server put all the index pages as one level deep, but this isn't something that you would want to be doing all the time.

    SQL Server doesn't seem to support bi-directional indexes, but I imagine it would be as efficient in searching forward or backwards though indexes.

    Do you agree?

     


    Julian Kuiters
    juliankuiters.id.au

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply