Similar to yesterday’s post, I have a follow-up for another Pragmatic Works Training on the T’s session that I delivered last month. This was is titled 5 SQL Server Indexing Myths. In the session, I discussed discussed five common indexing myths that I’ve come across over the years. The abstract for the sessions is:
There are many “best practices” around that help people decide how to index databases. Having these practices can help alleviate the time that it takes to design an indexing strategy for a database. These practices can be of great use, except when they are wrong. Join us in this session, as we discuss some common myths associated with indexes and then dive into the myths to demonstrate how they can be debunked. At the end of the session, you’ll know a few more things about indexes and leave armed with scripts that can help you debunk these myths on your own.
Webcast Recording & Materials
If you missed the webcast, you’ll can watch the recording here. For those that want to download the slide deck, you can click the following link – . Also, if you just want to flip through the slide deck again, here you go.
Q: Is it possible to get a quick list of the 5 myths that are going to be covered, so people know whether to stay or go.
Sorry, no. That would ruin the surprise, hopefully you stuck around.
Q: Does indexing even matter now with Hekaton in place?
Absolutely. In-memory OLTP (hekaton) only satisfies some use cases to data needs. Its great for high throughput OLTP tables. That doesn’t necessarily make it a good fit for reference tables, slowly changing OLTP, tables requiring LOB data types, or nearly all data warehouse solutions.
Q: Has SQL integrated Foxpro’s RUSHMORE technology – is there automatic multi-field optimization even though the indices are discretely indexed?
Stumped the presenter here, I’ve never had the opportunity to work with FoxPro. From what I can tell this technology was only implemented in FoxPro.
Q: How do you recommend finding the indexes that can use or benefit from tuning fill factor?
I look at sys.dm_db_index_operational_stats and sys.dm_db_index_physical_stats. The first dynamic management object provides information on the number of allocations; which relates to the rate in which page splits are occurring. Then look at the second dynamic management object to determine the indexes that are fragmented. After identification, then look at the rate in which these indexes need to be rebuilt and adjust the fill factor on those that are rebuilt most often.
Q: How does INCLUDE statement change the explain plans versus having the index contain all the columns?
INCLUDE will change how the access methods for SQL Server will retrieve the data from the included columns and it changes size and shape of the index; which can result in a change in the optimal plan and objects for solving the query.
Q: How many % fill factor normally do you recommend on the index?
Leave fill factor at 0, outside of this value, there is no other value to represents the best default amount of extra space to leave in a table.
Q: I have some very complex queries doing multiple table joins and the optimizer is not making use of existing indexes… should I be adding query hints or could this be a statistics issue?
It could be a statistics issue, the other thing, and probably more likely, the reads required to leverage the index would result in more IO that using the other indexes that are used. A third possibility would be poor parameter sniffing
Q: Do you think it is better to have many single column non-clustered indexes or a fewer number of multi-column non-clustered indexes?
It’s not so something that a single pattern can fit all tables and workloads. In a general sense, non-clustered indexes with multiple or included columns can often provide a better performance lift over a large number of single column indexes. Of course, YMMV.
Q: Index ordering depends on query join or SELECT or ORDER BY or HAVING clause criteria which will be most effective?
This can be a fairly complex answer. To answer simply, though, all of these types of operations help determine whether
Q: Inserting into empty clustered indexed table gives minimal logging (is that correct?) If Yes, what about if target table is empty clustered partitioned table – does it give same minimal logging ?
Insert operation can be minimally logged on inserts into empty clustered indexes, with simple or bulk-logged recovery models. My understanding is that inserts will be minimally logged as well for partitioned tables, as long as the requirements are met. I haven’t verified this, though.
Q: Should the partitioned column and first column in a clustered index be same for optimal performance, or can we have them as different columns?
No, I usually start with the partitioned column last in the clustering key. Since the partitioning key is already driving partition elimination, the other columns can be more useful for additional reductions.
Q: Is there a performance impact with indexes on SSAS cube processing or queries with heap vs. clustered indexes?
Yes, depending on the data model and the types of measures and facts being built, the choices between heaps and clustered indexes will have an impact. Look at how often the cubes are rebuilt and the auxiliary drill through workload that will hit the data warehouse to determine which provides the best performance.
Q: On a typical database with 50% – 50% DML, what kind of fill factor do you recommend?
Start at 100, or 0, and work your way down. I generally will drop by 10 for tables and indexes that end up with nightly rebuilds and 5 for those that need nightly reorgs. Since the DML and data types determine the rate of fragmentation, there is no right general number besides not adding any fill factor to start with.
Q: Is it better to create indexes with lower fill factor and rebuild them with greater fill if necessary? or should we have greater fill during creation?
Yes, start at 100, or 0, and then work your way down, as needed, based on the frequency of rebuilds and reorganizations.
Q: We have seen some indexes getting fragmented few minutes after we rebuild/reorganize them. What could be the reason?
The main reason would be that the rows in the table are variable length and volatile in those lengths. After the rebuilds and reorganizations, the rows are being modified and page splits are occurring. A secondary reason could be an database shrink, it that is happening, just stop doing that immediately. There are other reasons, but these are where I’d start looking.
Q: When do we make primary key as non clustered? Do we need to do that in Partitioned table (partition based on calculated column)?
That is a good example where the clustered index and the primary key would not be the same. Look for these types of scenarios, along with parent-child relationships, such as order tables and order detail tables, where the parent table would be the primary data access path for both tables over the detail tables primary key.
Q: When you say Heap, is that a non-clustered index?
No, a heap is the structure that a table uses when it does not have a clustered index. A non-clustered index is a sorted structure that can access specific values within columns within either a heap or clustered index.
Q: Will your example code be available on your blog?
Yes, they are in this blog post.
Q: If you build a table with a fill factor, does it override the instance default?
Table level fill factor will always override the instance settings for fill factor, whether the value is higher or lower.
Q: Would an index rebuild, physically order the primary clustered index?
The index rebuild will resort the pages physically in the database, it will not physically sort the data on the pages. This shouldn’t be considered an issue. Getting to the data is the primary concern and once the page is accessed, there it is.
Q: Would the seek work effectively without the column indexes?
If I understand this question right, seeks can only occur with clustered or non-clustered indexes. With a heap, the rows are unsorted, so to find any specific row, t
Q: When a table is partitioned – To keep the size of the indexs small. Is it Ok to skip the column on which the table is partitioned out of the index list of columns? Would the seek work without effectively without the column indexes?
You can skip the partitioning column for the non-clustered indexes, but then those indexes wouldn’t be partitioned and the ability to partition switch would be hampered.
Thanks for Attending
Thanks to those that attended the webcast. If you have any questions on the webcast or suggestions for improvements, please leave a comment on this post.