What is the Formula to estimate\forecast space for non-clustered index on a table

  • Hello,

    What is the best way to forecast\estimate space for non-clustered index on a table?

    Example :

    Table name : Test123

    Row : 170000000

    Reserved : 18000000 KB

    Data : 70000000 KB

    Index: 40000000 KB

    Note: Test123 already has clustered index and 2 non clustered indexes.

  • Books Online has very specific formulas for how to do that, under:

    "Estimating the Size of a Nonclustered Index"

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • LearnSQL!!! (4/22/2015)


    Hello,

    What is the best way to forecast\estimate space for non-clustered index on a table?

    Example :

    Table name : Test123

    Row : 170000000

    Reserved : 18000000 KB

    Data : 70000000 KB

    Index: 40000000 KB

    Note: Test123 already has clustered index and 2 non clustered indexes.

    The problem with trying to come up with a formula is that it won't take into account things like variable width columns, compression, or fragmentation.

    You have to estimate based on sampling accumulated storage and row counts. So, let N equal the number of rows you want to estimate for.

    N * (Current Index KB / Current Rows)

    So 1 indexed row = ((40000000 * 1024) / 170000000) = ~241 bytes, and 1 billion rows would require about 241,000,000,000 bytes (241 GB) for index storage.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • With an existing table, you already know what the length of the columns will be, whether varchar or not. The formula provides you with the additional overhead SQL requires to create and manage a table/row.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (4/22/2015)


    With an existing table, you already know what the length of the columns will be, whether varchar or not. The formula provides you with the additional overhead SQL requires to create and manage a table/row.

    Forecasting sounds like a need to estimate the total size of the index size for a given row volume or future date. Estimating what percentage of the index is meta-data doesn't seem useful to know.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (4/22/2015)


    ScottPletcher (4/22/2015)


    With an existing table, you already know what the length of the columns will be, whether varchar or not. The formula provides you with the additional overhead SQL requires to create and manage a table/row.

    Forecasting sounds like a need to estimate the total size of the index size for a given row volume or future date. Estimating what percentage of the index is meta-data doesn't seem useful to know.

    What percentage of the index isn't useful, but the total bytes needed are required to produce an accurate estimate.

    Yes, it's only an estimate. It's impossible to be precise until the index is actually created, and even then a few mods to the table rows would change the final size.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (4/22/2015)


    Eric M Russell (4/22/2015)


    ScottPletcher (4/22/2015)


    With an existing table, you already know what the length of the columns will be, whether varchar or not. The formula provides you with the additional overhead SQL requires to create and manage a table/row.

    Forecasting sounds like a need to estimate the total size of the index size for a given row volume or future date. Estimating what percentage of the index is meta-data doesn't seem useful to know.

    What percentage of the index isn't useful, but the total bytes needed are required to produce an accurate estimate.

    Yes, it's only an estimate. It's impossible to be precise until the index is actually created, and even then a few mods to the table rows would change the final size.

    Maybe we can estimate the overhead on an (empty) index, but how is that useful, because it can't be used for forecasting the size of a filled index in production? The index data size over time depends on the distribution of large vs. small values (for varchar columns), page fill, fragmentation, and compression ratio.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (4/22/2015)


    ScottPletcher (4/22/2015)


    Eric M Russell (4/22/2015)


    ScottPletcher (4/22/2015)


    With an existing table, you already know what the length of the columns will be, whether varchar or not. The formula provides you with the additional overhead SQL requires to create and manage a table/row.

    Forecasting sounds like a need to estimate the total size of the index size for a given row volume or future date. Estimating what percentage of the index is meta-data doesn't seem useful to know.

    What percentage of the index isn't useful, but the total bytes needed are required to produce an accurate estimate.

    Yes, it's only an estimate. It's impossible to be precise until the index is actually created, and even then a few mods to the table rows would change the final size.

    Maybe we can estimate the overhead on an (empty) index, but how is that useful, because it can't be used for forecasting the size of a filled index in production? The index data size over time depends on the distribution of large vs. small values (for varchar columns), page fill, fragmentation, and compression ratio.

    I give up. But people do estimate index sizes all the time, particularly in large companies. There's separate storage people, and I have to tell them, in advance, how much space I need to add to accommodate a table's indexes.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply