• SQL Kiwi (1/31/2012)


    peter-757102 (1/31/2012)


    Can you explain (or have a link that does a good job at it), why one would want to go that route over clustered storage in that scenario?

    Not offhand, no. It's not something I have looked at for years - but I am aware that some DW methodologies advocate heaps. It may have something to do with requiring sorted order for no apparent benefit, but we're on the ragged edge of my recollection here.

    I did check some on the net myself (long live Google)....and this is what I get from it:

    1. Nonclustered indexes on heaps have direct "pointers" to their respective record data in the heap, avoiding a secondary scan trough a clustered index. On large tables that are heavily read and where records are never changed, this could be a big speed factor.

    2. It is easier to append new data and use many different search patterns using multiple indexes (a clustered index is only optimal on one, the rest becomes expensive).

    It might add flexibility in segregating inactive and active data too i think. Keeping the active set small, makes things fast ofcourse.

    Some testing on my side is needed to figure out when to use and when not too and what problems I encounter.

    Currenly all cubes I worked with are relativly easy managable in size, around 110 GB for the biggest, but with very limited access pattens, so clustering worked fine there. In other scenarios with much smaller datasets I also use clustereing, supported with one index in some cases and that worked fine too. But as usual, as the dataset grows, issues and wrong choices become more aparent.