• My understanding is the the Clustered index defines how the data is physically stored on the drive and changes to values in the indexed columns could force the data to be re-written to a different location. There must be some small overhead to store the metadata about the columns in the CI; however if you have a CI on fields that are not sequentially inserted then the space required will be larger than the data because of page fill % and page splits. E.g. if your orders use a CI in date order, then orders are generally added sequentially so are added to the end of the index. Page fill can be very high (95% plus) and page splits will be rare. If your CI is by customer then page splits will be more common and page fill will need to be set much lower (e.g. 70%) and there will be more demand to re-index to clean up the leaf mapping.

    Think if the indexes literally as an index card box. You can only get 10 cards in a box.

    BOB/1001/7th Jun

    BOB/1250/9th Jul

    BOB/1251/10th Jul

    CHRIS/1055/8th Jun

    CHRIS/1056/10th Jun

    JOHN/1032/7th Jun

    JOHN/1033/7th Jun

    JOHN/1078/10th Jun

    JOHN/1083/11th Jun

    JOHN/1122/20th Jun

    Now if John places a new order, that's fine, it will go into a new box as this box is full

    If Chris places a new order, we can't insert it into the current box, but we need to do some maintenance. We will start a new box, put John' orders in the new box and add Chris's order to box 1. The space required to store 11 records is the same as the space to store 20 records. There is a significant cost involved in introducing a new box and moving records to it. Overall we will usually have some spare space. Lets say that 800 records are actually spread across 10 boxes which represents a page fill of 80%. We could 'reindex' and shuffle all the records up so that they take less space, but as soon as a new record needs to be inserted, we have the overhead of creating a new box. Instead we could look at our performance and make a decision that housekeeping (while the system is quiet) will create new boxes whenever a box is more that 60% full so that record inserts rarely require us to make a new box on-the-fly

    Alternatively if the records were indexed by order number or date (and time) then we would only ever be adding records to the end of the index so the cost of introducing a new box is significantly lower.

    I'm not sure if this answers your question; the amount of space required for the CI is dependent on the page fill % and the nature of the insert/update activity on the CI fields. The allocated space will [almost] always be larger than the actual data. Non CI indexes suffer from the same issue but will only hold the data for the index (and covering) fields.