how to find the lowest usage of a database

  • Jeff Moden - Wednesday, November 14, 2018 10:45 PM

    WhiteLotus - Wednesday, November 14, 2018 8:49 PM

    Jeff Moden - Tuesday, November 13, 2018 6:35 PM

    Wondering why data type is your biggest concern ?                                  

    Couple of reasons...
    1.  You've underestimated the size the index will be.  Each row of the non-clustered index will be 15 bytes consisting of a 7 byte row header, 4 bytes for the column being indexed, and 4 bytes for the Clustered Index that will automatically be added.  That means that your index will actually be 24GB (15 bytes * 1.6 Billion rows).
    2.  I've recently (about the last year) been working with indexes and index maintenance a whole lot and was concerned with "Expansive Updates" on this fairly large index, which would immediately and severely fragment your index, not to mention the affects on performance, log file activity, a low page densities that occur due to the bad page splits that "Expansive Updates" cause especially if the page splits occur in a "skewed" fashion.  These effects are especially pronounced on NCI's because of how many rows per page there usually are due to a narrow index row width.

    Since you have the Standard Edition, you already know you can't use the "ONLINE" option and that explains why you're looking for a "quiet time".  One way to reduce the time to build such an index is to create or rebuild the index while in the BULK LOGGED Recovery Model.  Index creation or rebuilding is minimally logged when in the BULK LOGGED model and your doing it in an OFFLINE manner.  I recently rebuilt a 146GB Clustered Index in about 12 minutes using that method. A 24GB index will take a lot less.  I also used the SORT_IN_TEMPDB option but your mileage may vary there because of memory and not knowing how fast your TempDB drive is compared to drive your database is on.

    I'm also concerned with what the column for the index being built is actually used for.  If it's an ever-increasing value and so is the CI, then you can set the FILL FACTOR for the index to 100 and likely never have to defrag it in the next 20 years.  If the column for the index isn't ever-increasing (inserted into the index out of order) or if the values in the column are updated or the values are inserted in large batches with a narrow scope on the column (Siloed Index is what I call that), then you may or may not be able to take advantage of any FILL FACTOR to reduce page splits.  You'll have to check later to figure that out.

    If the index already exists and isn't a unique index being used for an FK, you can also avoid some unwanted expansion of the MDF file by disabling the index before your rebuild it.

    Don't forget to shift back to the FULL Recovery Model if that's where you started.

    • I am wondering if we create index in a table whichtype of lock that might happen ?

    • Previously I thought if we have query withselect statement and where criteria then Non clustered index with included columnwould be the best index regardless the table size .

      So apparently it is not true ?

    On your first question, all user access to the underlying table in the ONLINE = OFF mode is impossible when....
    1. Creating, rebuilding, or dropping a Clustered Index.
    2. Rebuilding or dropping a Non-Clustered Index.

    Still on your first question, Inserts, Updates, and Deletes on the underlying table are blocked when Rebuilding or Dropping a Non-Clustered Index.  However, reads are not blocked and so you can still select against the underlying table.

    On your second question, I'm wondering where the question comes from and so I'm not sure of what your actually looking for.  I'm especially confused on how anything I may have said led you to believe that a covering index (which is what you're talking about) isn't beneficial especially on large tables.    A narrow Non-clustered Index (NCI from here on) that covers a query will be faster (sometimes significantly faster) than a wider Clustered Index (CI from here on) simply because fewer pages will be read thanks to the more narrow rows of the NCI (which is a form of table itself).  For large tables, I usually use the CI to control/reduce fragmentation, which frequently benefits queries with temporal criteria (seek first, range scan the rest).  Of course, as with anything else in SQL Server, "It Depends" on the shape of the data and the usage of the data.

    The caveat with covering indexes is that you have to remember that all NCIs are a duplication of data and such indexes can quickly grow to be larger than the original data whether the original data is in a heap or a CI (table).  You also have to remember that the key for the CI (or RID if it's a heap) is also stored in every NCI.  That's also why I said "usually" when it comes to me using the CI to control fragmentation.  There are exceptions depending on the data.

    I did create index (size 4 GB ) using Full Model and yes it took a while ( more than 10 mins and was still running ) so I cancelled it and changed it into Bulk Logged model and yes it's faster ( 7mins completed ) .Then changed it back to FULL .  Thank you for your advice !!
    But if compared with your index creation ( 146 GB in 12 mins ) , Mine is a lot slower . Not sure why ...

    Related Cover Index , At first I want to put it into 1.6 billion records of table ( This index has 1 key column and 10 included columns , total size : 40 GB ) , query performance is a lot faster , on the other hand There will be a lot of INSERT too , the consideration is ..it will slow down the insert performance so I decided to create Non clustered index with 1 key column - size 16 GB . Query performance is still fast but not as fast as using that cover index
    I decided to apply Non clustered index as it has more benefit in this case .
    Please let me know your thought ...

  • WhiteLotus - Thursday, November 15, 2018 5:15 PM

    Jeff Moden - Wednesday, November 14, 2018 10:45 PM

    WhiteLotus - Wednesday, November 14, 2018 8:49 PM

    Jeff Moden - Tuesday, November 13, 2018 6:35 PM

    Wondering why data type is your biggest concern ?                                  

    Couple of reasons...
    1.  You've underestimated the size the index will be.  Each row of the non-clustered index will be 15 bytes consisting of a 7 byte row header, 4 bytes for the column being indexed, and 4 bytes for the Clustered Index that will automatically be added.  That means that your index will actually be 24GB (15 bytes * 1.6 Billion rows).
    2.  I've recently (about the last year) been working with indexes and index maintenance a whole lot and was concerned with "Expansive Updates" on this fairly large index, which would immediately and severely fragment your index, not to mention the affects on performance, log file activity, a low page densities that occur due to the bad page splits that "Expansive Updates" cause especially if the page splits occur in a "skewed" fashion.  These effects are especially pronounced on NCI's because of how many rows per page there usually are due to a narrow index row width.

    Since you have the Standard Edition, you already know you can't use the "ONLINE" option and that explains why you're looking for a "quiet time".  One way to reduce the time to build such an index is to create or rebuild the index while in the BULK LOGGED Recovery Model.  Index creation or rebuilding is minimally logged when in the BULK LOGGED model and your doing it in an OFFLINE manner.  I recently rebuilt a 146GB Clustered Index in about 12 minutes using that method. A 24GB index will take a lot less.  I also used the SORT_IN_TEMPDB option but your mileage may vary there because of memory and not knowing how fast your TempDB drive is compared to drive your database is on.

    I'm also concerned with what the column for the index being built is actually used for.  If it's an ever-increasing value and so is the CI, then you can set the FILL FACTOR for the index to 100 and likely never have to defrag it in the next 20 years.  If the column for the index isn't ever-increasing (inserted into the index out of order) or if the values in the column are updated or the values are inserted in large batches with a narrow scope on the column (Siloed Index is what I call that), then you may or may not be able to take advantage of any FILL FACTOR to reduce page splits.  You'll have to check later to figure that out.

    If the index already exists and isn't a unique index being used for an FK, you can also avoid some unwanted expansion of the MDF file by disabling the index before your rebuild it.

    Don't forget to shift back to the FULL Recovery Model if that's where you started.

    • I am wondering if we create index in a table whichtype of lock that might happen ?

    • Previously I thought if we have query withselect statement and where criteria then Non clustered index with included columnwould be the best index regardless the table size .

      So apparently it is not true ?

    On your first question, all user access to the underlying table in the ONLINE = OFF mode is impossible when....
    1. Creating, rebuilding, or dropping a Clustered Index.
    2. Rebuilding or dropping a Non-Clustered Index.

    Still on your first question, Inserts, Updates, and Deletes on the underlying table are blocked when Rebuilding or Dropping a Non-Clustered Index.  However, reads are not blocked and so you can still select against the underlying table.

    On your second question, I'm wondering where the question comes from and so I'm not sure of what your actually looking for.  I'm especially confused on how anything I may have said led you to believe that a covering index (which is what you're talking about) isn't beneficial especially on large tables.    A narrow Non-clustered Index (NCI from here on) that covers a query will be faster (sometimes significantly faster) than a wider Clustered Index (CI from here on) simply because fewer pages will be read thanks to the more narrow rows of the NCI (which is a form of table itself).  For large tables, I usually use the CI to control/reduce fragmentation, which frequently benefits queries with temporal criteria (seek first, range scan the rest).  Of course, as with anything else in SQL Server, "It Depends" on the shape of the data and the usage of the data.

    The caveat with covering indexes is that you have to remember that all NCIs are a duplication of data and such indexes can quickly grow to be larger than the original data whether the original data is in a heap or a CI (table).  You also have to remember that the key for the CI (or RID if it's a heap) is also stored in every NCI.  That's also why I said "usually" when it comes to me using the CI to control fragmentation.  There are exceptions depending on the data.

    I did create index (size 4 GB ) using Full Model and yes it took a while ( more than 10 mins and was still running ) so I cancelled it and changed it into Bulk Logged model and yes it's faster ( 7mins completed ) .Then changed it back to FULL .  Thank you for your advice !!
    But if compared with your index creation ( 146 GB in 12 mins ) , Mine is a lot slower . Not sure why ...

    Related Cover Index , At first I want to put it into 1.6 billion records of table ( This index has 1 key column and 10 included columns , total size : 40 GB ) , query performance is a lot faster , on the other hand There will be a lot of INSERT too , the consideration is ..it will slow down the insert performance so I decided to create Non clustered index with 1 key column - size 16 GB . Query performance is still fast but not as fast as using that cover index
    I decided to apply Non clustered index as it has more benefit in this case .
    Please let me know your thought ...

    hey Jeff
    last night I added a non clustered index into a table - 1.6 billion records ( it takes 1 hour ) .
    and when I run the query yes it improves the performance( from 2 mins into 20 secs only )  but when I run again the same query this afternoon , I notice the performance went back to original state which is 2 mins long ( no one remove the index )
    How come ? Have you experienced such a thing ??

    thank you

Viewing 2 posts - 16 through 16 (of 16 total)

You must be logged in to reply to this topic. Login to reply