Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

What is proper way of indexing on a table which contain large amount of data Expand / Collapse
Author
Message
Posted Tuesday, February 5, 2013 11:12 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 9:13 AM
Points: 40,609, Visits: 37,070
Yes you can. No there isn't.


Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1416010
Posted Thursday, February 7, 2013 1:56 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, May 30, 2014 3:50 AM
Points: 99, Visits: 176
Hi Gail,
I read all the articles listed in your link. This is really very good information.
All the requirement & best practice are good. But I have still a question in my mind.

In my case we have more than 10 columns.
I think, we cannot define a specific sequence for non clustered index which will best suite our dynamic query.
E.g: If we have:

FirstName, LastName, Address, City, State, Zip, Phone, Email

eight columns which takes part in our dynamic query, then how can we decide the exact sequence of where clause. If I consider index (lastname, firstname, ..........), it will only work for those queries which will contain where clause like this:
LastName = 'text' and firstname = 'text' and ...............

but what about those queries which does not contain lastname. In that case, this non clustered index will become useless. Will I have to create 8 * 8 = 64 indexes (or any other number count. Sorry, I am not too good in permutation/combination).
Thanks to Microsoft who gave us the opportunity to create 999 non clustered indexes.

I just want to raise a question: How can we say that this specific sequence of columns will be best suited when all columns have equal priority? sometimes, I will not pass lastname & sometimes city or postal code or zip also.
& this situation will become more worse when number of columns will increase.

Post #1417344
Posted Thursday, February 7, 2013 2:22 PM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 9:13 AM
Points: 40,609, Visits: 37,070
As Grant said earlier

Other than that, I'd say recognize the access patterns and then build indexes to support them.


No, you absolutely do not want to create indexes for every theoretical access pattern. Create indexes for the access patterns that are used heavily, just a couple. Then monitor and look for queries that run often and aren't efficient enough, consider another index or two.

http://sqlinthewild.co.za/index.php/2011/11/11/sql-university-advanced-indexing-indexing-strategies/



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1417354
Posted Friday, February 8, 2013 12:21 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, May 30, 2014 3:50 AM
Points: 99, Visits: 176
thanks, it sounds great. :)

but now I am afraid because now I will have to ENJOY my weekend in the office again.

again, thanks a ton mate.
this time, I have learnt a lot.
Post #1417481
Posted Friday, February 8, 2013 2:07 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, May 30, 2014 3:50 AM
Points: 99, Visits: 176
I was just reading the article:
http://www.sql-server-performance.com/2007/t-sql-where/2/

There is a recommendation given below:

"If you have a WHERE clause that includes expressions connected by two or more AND operators, SQL Server will evaluate them from left to right in the order they are written. This assumes that no parenthesis have been used to change the order of execution. Because of this, you may want to consider one of the following when using AND:

Locate the least likely true AND expression first. This way, if the AND expression is false, the clause will end immediately, saving time.

If both parts of an AND expression are equally likely being false, put the least complex AND expression first. This way, if it is false, less work will have to be done to evaluate the expression.

You may want to consider using Query Analyzer or Management Studio to look at the execution plans of your queries to see which is best for your situation. [2000, 2005, 2008] Updated 2-4-2009 "

Check the above bold line.

So, if we put least likely true column very first (left most where clause), then it is just opposite which you have asked.
You advised that we should put "most selective" column first.

Now I am in dilemma. What to do, what not to do.
Post #1417525
Posted Friday, February 8, 2013 2:28 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 9:13 AM
Points: 40,609, Visits: 37,070
Anuj Rathi (2/8/2013)
"If you have a WHERE clause that includes expressions connected by two or more AND operators, SQL Server will evaluate them from left to right in the order they are written. This assumes that no parenthesis have been used to change the order of execution. Because of this, you may want to consider one of the following when using AND:

Locate the least likely true AND expression first. This way, if the AND expression is false, the clause will end immediately, saving time.

If both parts of an AND expression are equally likely being false, put the least complex AND expression first. This way, if it is false, less work will have to be done to evaluate the expression.

You may want to consider using Query Analyzer or Management Studio to look at the execution plans of your queries to see which is best for your situation. [2000, 2005, 2008] Updated 2-4-2009 "


That's a load of complete hogswash.

Operators are not evaluated from left to right, if an expression if false the evaluation does not end.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1417532
Posted Friday, February 8, 2013 4:56 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, May 30, 2014 3:50 AM
Points: 99, Visits: 176
Once again, I have a situation where my way is again blocked.

As I said earlier in this post that I have to implement indexing on a group of tables. Before raising my question, I just want explain my situation again.
In my table (as I earlier explained), there are 2 bit type columns.
1. first one is for record Status. Record is active or not.
2. second one is for record state. We don't delete any record physically. We just change the state flag from 1 to 0.

3. there are few foreign key columns, contains int ID, which are referenced other base tables. Suppose record_cat, record_subcat1 & record_subcat2 column.
record_cat will also be included in most of the queries.

Now, my question is:

As I will have to include both status & state columns in each & every query. But these columns will return a very large number of rows. because approx 70% of records are active.

record_state: 85%
record_status: 78%

record_cat: 60%
record_subcat1: 40%
record_subcat2 : 20%
so,If I create a non clustered index like: record_state, record_status, record_cat, record_subcat1, record_subcat2

then first two columns will produce a large number of rows & then this is the responsibility of record_cat, record_subcat1 and record_subcat2 to filter out the data.

but if I consider record_subcat2, record_subcat1, record_cat, record_status, record_state, this sequence is also a covering index but in just a opposite scenario. means record_subcat2 just filter out most of the records and so on.

But If i consider your your recommendation (most selective column first) then record_state & record_status should be first.

please give me an advice.

Post #1417603
Posted Friday, February 8, 2013 5:31 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 10:51 AM
Points: 14,201, Visits: 28,530
But state and status are not the most selective, they are the least selective.

Selectivity is how unique the data is. The most selective column would be the column with the most unique values in the table for the columns you are considering for that index. Meaning, you don't always put the single most unique column as the first column, but the most unique column of the ones you are currently choosing from for this index (god I hope that's clear).

So no, you probably don't want to have those bit columns as the first columns of your index. But, you probably will want to regularly have them in the index. Whether they should be part of the key or just an inclusion column, only testing will tell (I lean towards part of the key, but I'd test both).


----------------------------------------------------
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." Theodore Roosevelt
The Scary DBA
Author of:
SQL Server Query Performance Tuning
and
SQL Server Execution Plans

Product Evangelist for Red Gate Software
Post #1417626
Posted Friday, February 8, 2013 6:00 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 9:13 AM
Points: 40,609, Visits: 37,070
Selectivity should not be the primary indicator for choosing index columns. Putting the most selective column first would be stupid if nothing filters by it. The decision as to column order in indexes should be what queries filter by that column. Selectivity is a secondary consideration.

I've linked this blog post again, it's discussed in there or in the posts linked to.
http://sqlinthewild.co.za/index.php/2011/11/11/sql-university-advanced-indexing-indexing-strategies/



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1417647
Posted Friday, February 8, 2013 6:27 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 10:51 AM
Points: 14,201, Visits: 28,530
GilaMonster (2/8/2013)
Selectivity should not be the primary indicator for choosing index columns. Putting the most selective column first would be stupid if nothing filters by it. The decision as to column order in indexes should be what queries filter by that column. Selectivity is a secondary consideration.

I've linked this blog post again, it's discussed in there or in the posts linked to.
http://sqlinthewild.co.za/index.php/2011/11/11/sql-university-advanced-indexing-indexing-strategies/


Exactly. It's only a guideline, a starting point, the first test, if it makes sense.


----------------------------------------------------
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." Theodore Roosevelt
The Scary DBA
Author of:
SQL Server Query Performance Tuning
and
SQL Server Execution Plans

Product Evangelist for Red Gate Software
Post #1417663
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse