﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / SQL Server 2008 / SQL Server 2008 - General  / What is proper way of indexing on a table which contain large amount of data / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Wed, 22 May 2013 21:49:13 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>thanks John,this is really a very good &amp; simple approach to catch all the search patterns. I didn't thought about such logging table. I will definitely follow this approach.I more thing I have noticed today. Our Ex DBA has created all columns using some prefix. E.g.: catID for CategoryID, catvalue for CategoryName &amp; then made a view FOR EACH TABLE where he has defined fully qualified names (means CategoryID, CategoryName) for table columns.His point was: "I don't want to disclose ACTUAL column names of any table to UI developers &amp; hence we will not use actual column names in UI application. This will reduce the sql injection. So, in SELECT operation, we will use VIEW not table directly."But my opinion is different. We should create view only for those complex queries which we use in multiple places. because if we create a VIEW(which contains only a table to map the column name) for each table, this will create an extra overhead &amp; processing while executing the query.If I talk about SQL Injection, we have used stored procedure &amp; created a separate login/user &amp; GRANTED ONLY EXECUTE permission on a specific SCHEMA which contains only UI application related stored procedures.Then why are we putting extra load to MAP columns name?</description><pubDate>Fri, 08 Feb 2013 21:47:16 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>To change directions a bit, I am facing these same issues in the system I am working on.  There are a number of places in the UI where the users can perform a "catch all" query.  The original code simply times out. Some of these have as many as 90 different parameters that may or may not be used. The first thing I did was to create a logging table that stored the values of all of the parameters, with an identity and a created time stamp. At the beginning of each proc, I insert the values of all of the parameters into this table. I was able to capture exactly what the users were searching on, and then try to do some analysis.One of the searches indicated that there were 7 values used in 70% of the searches. I test if all 7 are in place, and wrote the query, and the proper indexes, to make this fast. No dynamic SQL!  If any of the other parameters were used, I used dynamic SQL and was able to tune these easier. This was the easy one!The hard one contained 92 different parameters.  In capturing the values of the parameters being used, the pattern showed that the searches were similar according to customer location. One location did one search all the time, another did something else. It turned into 4 different "base" searches. In this case, there were 4 separte procedures written, all using dynamic SQL, that covered these 4 bases. Again, I was able to divide and conquer becasue I captured the actual parameters being used in the searches. The primary thing this showed me was that there were some parms that were rarely used, which allowed me to completly ignore them in my efforts. Now, MOST of the searches are very fast, and the users are happy.   The "odd" searches take time to complete, but the users understand this. I do monitor the parameter usage to see if it may be changing. Good luck.</description><pubDate>Fri, 08 Feb 2013 12:02:02 GMT</pubDate><dc:creator>Michael L John</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Sounds mostly reasonable, yes.</description><pubDate>Fri, 08 Feb 2013 08:09:20 GMT</pubDate><dc:creator>Grant Fritchey</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Thank you Fritchey, Gail,Now, I am much satisfied. Finally, this is the conclusion that I will check those columns which takes part in the query mostly &amp; which column(s) are (mostly) responsible to populate the output, which column filters data to make the output as narrow as possible (in all possible/mostly queries) &amp; then I will try to make a some/few unified/homogeneous on the basis of above consideration.Am I going to a right direction?</description><pubDate>Fri, 08 Feb 2013 07:56:45 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>[quote][b]Anuj Rathi (2/8/2013)[/b][hr]record_cat is foreign key for record_subcat1 &amp; record_subcat1 is foreign key for record_subcat2.So, as per your explanation, record_subcat2 is most selective column in our index &amp;record_subcat2, record_subcat1, record_cat, ..........is preferred sequence.[/quote]All of which is completely, absolutely, totally irrelevant.What queries filter on those columns and what columns are filtered on mostly? That is what you need for indexing. Putting an index on the most selective column or in the order of selectivity is a total waste of time if the queries are not going to use them. Selectivity is something you consider, not the only thing that determines index column ordering.[url]http://sqlinthewild.co.za/index.php/2011/11/11/sql-university-advanced-indexing-indexing-strategies/[/url]and the posts that it links to.</description><pubDate>Fri, 08 Feb 2013 06:56:40 GMT</pubDate><dc:creator>GilaMonster</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>There just aren't very many hard and fast rules for how to put this together. Yes, if a given column isn't used that is in an index key, that might make it less likely for that index to be used for other queries. But you have to test to be sure.As to having the exact same columns in the leading edge on multiple queries, I would be extremely hesitant there. It can lead to good indexes not being recognized and used by the queries that need it.</description><pubDate>Fri, 08 Feb 2013 06:46:23 GMT</pubDate><dc:creator>Grant Fritchey</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>yes, this is clear now. Thank you for this good explanation.I am afraid that I again asking a question. But before asking again, I have to explain my case.record_cat is foreign key for record_subcat1 &amp; record_subcat1 is foreign key for record_subcat2.So, as per your explanation, record_subcat2 is most selective column in our index &amp;record_subcat2, record_subcat1, record_cat, ..........is preferred sequence.This will work fine If we search record in subcategory2. But if we search records in subcategory1 (category will also be there),now subcategory2 will not be present &amp; because subcategory2 is not present in where clause, then above index will become useless because it checks the first column in index column sequence.Second case: in absence of subcategory2, if we consider those records where subcategory2 is NULL like belowsubcategory2 is null and subcategory1 = x and category = y   (on the basis of above index sequence)in this case, this query will become non SARGable.My main motive is: how many (minimum) index should I create &amp; what should be the sequence of columns.I have tested all situations &amp; these are similar to above. Tuning adviser suggesting status &amp; state columns as very first column for every query which I tested &amp; for rest of the columns, tuning adviser suggests a different index according to sequence of columns in where clause (few situations I explained above).</description><pubDate>Fri, 08 Feb 2013 06:41:14 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>[quote][b]GilaMonster (2/8/2013)[/b][hr]Selectivity should not be the primary indicator for choosing index columns. Putting the most selective column first would be stupid if nothing filters by it. The decision as to column order in indexes should be what queries filter by that column. Selectivity is a secondary consideration.I've linked this blog post again, it's discussed in there or in the posts linked to.[url]http://sqlinthewild.co.za/index.php/2011/11/11/sql-university-advanced-indexing-indexing-strategies/[/url][/quote]Exactly. It's only a guideline, a starting point, the first test, if it makes sense.</description><pubDate>Fri, 08 Feb 2013 06:27:12 GMT</pubDate><dc:creator>Grant Fritchey</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Selectivity should not be the primary indicator for choosing index columns. Putting the most selective column first would be stupid if nothing filters by it. The decision as to column order in indexes should be what queries filter by that column. Selectivity is a secondary consideration.I've linked this blog post again, it's discussed in there or in the posts linked to.[url]http://sqlinthewild.co.za/index.php/2011/11/11/sql-university-advanced-indexing-indexing-strategies/[/url]</description><pubDate>Fri, 08 Feb 2013 06:00:29 GMT</pubDate><dc:creator>GilaMonster</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>But state and status are not the most selective, they are the least selective.Selectivity is how unique the data is. The most selective column would be the column with the most unique values in the table for the columns you are considering for that index. Meaning, you don't always put the single most unique column as the first column, but the most unique column of the ones you are currently choosing from for this index (god I hope that's clear).So no, you probably don't want to have those bit columns as the first columns of your index. But, you probably will want to regularly have them in the index. Whether they should be part of the key or just an inclusion column, only testing will tell (I lean towards part of the key, but I'd test both).</description><pubDate>Fri, 08 Feb 2013 05:31:41 GMT</pubDate><dc:creator>Grant Fritchey</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Once again, I have a situation where my way is again blocked.As I said earlier in this post that I have to implement indexing on a group of tables. Before raising my question, I just want explain my situation again.In my table (as I earlier explained), there are 2 bit type columns.1. first one is for record Status. Record is active or not.2. second one is for record state. We don't delete any record physically. We just change the state flag from 1 to 0.3. there are few foreign key columns, contains int ID, which are referenced other base tables. Suppose record_cat, record_subcat1 &amp; record_subcat2 column. record_cat will also be included in most of the queries.Now, my question is:As I will have to include both status &amp; state columns in each &amp; every query. But these columns will return a very large number of rows. because approx 70% of records are active.record_state: 85%record_status: 78%record_cat: 60%record_subcat1: 40%record_subcat2 : 20%so,If I create a non clustered index like: record_state, record_status, record_cat, record_subcat1, record_subcat2then first two columns will produce a large number of rows &amp; then this is the responsibility of record_cat, record_subcat1 and record_subcat2 to filter out the data.but if I consider record_subcat2, record_subcat1, record_cat, record_status, record_state, this sequence is also a covering index but in just a opposite scenario. means record_subcat2 just filter out most of the records and so on.But If i consider your your recommendation (most selective column first) then record_state &amp; record_status should be first.please give me an advice. </description><pubDate>Fri, 08 Feb 2013 04:56:00 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>[quote][b]Anuj Rathi (2/8/2013)[/b][hr]"If you have a WHERE clause that includes expressions connected by two or more AND operators, SQL Server will evaluate them from left to right in the order they are written. This assumes that no parenthesis have been used to change the order of execution. Because of this, you may want to consider one of the following when using AND:[b]Locate the least likely true AND expression first. This way, if the AND expression is false, the clause will end immediately, saving time.[/b]If both parts of an AND expression are equally likely being false, put the least complex AND expression first. This way, if it is false, less work will have to be done to evaluate the expression.You may want to consider using Query Analyzer or Management Studio to look at the execution plans of your queries to see which is best for your situation. [2000, 2005, 2008] Updated 2-4-2009 "[/quote]That's a load of complete hogswash. Operators are not evaluated from left to right, if an expression if false the evaluation does not end.</description><pubDate>Fri, 08 Feb 2013 02:28:00 GMT</pubDate><dc:creator>GilaMonster</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>I was just reading the article:[url=http://www.sql-server-performance.com/2007/t-sql-where/2/]http://www.sql-server-performance.com/2007/t-sql-where/2/[/url]There is a recommendation given below:"If you have a WHERE clause that includes expressions connected by two or more AND operators, SQL Server will evaluate them from left to right in the order they are written. This assumes that no parenthesis have been used to change the order of execution. Because of this, you may want to consider one of the following when using AND:[b]Locate the least likely true AND expression first. This way, if the AND expression is false, the clause will end immediately, saving time.[/b]If both parts of an AND expression are equally likely being false, put the least complex AND expression first. This way, if it is false, less work will have to be done to evaluate the expression.You may want to consider using Query Analyzer or Management Studio to look at the execution plans of your queries to see which is best for your situation. [2000, 2005, 2008] Updated 2-4-2009 "Check the above bold line.So, if we put least likely true column very first (left most where clause), then it is just opposite which you have asked.You advised that we should put "most selective" column first.Now I am in dilemma. What to do, what not to do.</description><pubDate>Fri, 08 Feb 2013 02:07:37 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>thanks, it sounds great. :)but now I am afraid because now I will have to ENJOY my weekend in the office again.  :( :(again, thanks a ton mate.this time, I have learnt a lot.</description><pubDate>Fri, 08 Feb 2013 00:21:27 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>As Grant said earlier[quote]Other than that, I'd say recognize the access patterns and then build indexes to support them.[/quote]No, you absolutely do not want to create indexes for every theoretical access pattern. Create indexes for the access patterns that are used heavily, just a couple. Then monitor and look for queries that run often and aren't efficient enough, consider another index or two.[url]http://sqlinthewild.co.za/index.php/2011/11/11/sql-university-advanced-indexing-indexing-strategies/[/url]</description><pubDate>Thu, 07 Feb 2013 14:22:52 GMT</pubDate><dc:creator>GilaMonster</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Hi Gail,I read all the articles listed in your link. This is really very good information.All the requirement &amp; best practice are good. But I have still a question in my mind. In my case we have more than 10 columns. I think, we cannot define a specific sequence for non clustered index which will best suite our dynamic query. E.g: If we have:FirstName, LastName, Address, City, State, Zip, Phone, Emaileight columns which takes part in our dynamic query, then how can we decide the exact sequence of where clause. If I consider index (lastname, firstname, ..........), it will only work for those queries which will contain where clause like this:LastName = 'text' and firstname = 'text' and ...............but what about those queries which does not contain lastname. In that case, this non clustered index will become useless. Will I have to create 8 * 8 = 64 indexes (or any other number count. Sorry, I am not too good in permutation/combination). Thanks to Microsoft who gave us the opportunity to create 999 non clustered indexes.I just want to raise a question: How can we say that this specific sequence of columns will be best suited when all columns have equal priority? sometimes, I will not pass lastname &amp; sometimes city or postal code or zip also.&amp; this situation will become more worse when number of columns will increase.</description><pubDate>Thu, 07 Feb 2013 13:56:11 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Yes you can. No there isn't.</description><pubDate>Tue, 05 Feb 2013 11:12:02 GMT</pubDate><dc:creator>GilaMonster</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Hi Gail,I have changed my one stored procedure. As you have the code block which I posted. Now that block will look like this:IF @inbCityID IS NOT NULL AND @inbCityID &amp;gt; 0BEGIN	SET @SqlCondition = @SqlCondition + N' AND vaa.CityID = @inbCityID';ENDI have just one question. When I will call the sp_executesql &amp; will pass the parameters, Can I pass all parameters (approx 14) regardless of query.Means in above code block, if @inbCityID is 0 then above condition will not be included in our final sql string. Can I pass my all parameters (including @inbCityID) because we don't know that this condition will become true or false.I have checked by executing the procedure (&amp; passed all parameters) where only one condition was true, SP is running without any error.I have just asked because of any performance consideration. Is there any performance impact if I pass more parameters. I have included all parameters in params definition. I googled but didn't find any answer, that's why I am asking.</description><pubDate>Tue, 05 Feb 2013 10:44:08 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Gail, thanks for the link,</description><pubDate>Tue, 05 Feb 2013 09:40:45 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Yes, absolutely yes.As you have it, it's a security risk (SQL injection) and subject to poor plan caching and reuse.</description><pubDate>Tue, 05 Feb 2013 09:38:04 GMT</pubDate><dc:creator>GilaMonster</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Hi Fritchey,We have made the SQL like this,IF @inbCityID IS NOT NULL AND @inbCityID &amp;gt; 0BEGIN	SET @SqlCondition = @SqlCondition + ' AND vaa.CityID = ' + CONVERT(VARCHAR(50), @inbCityID);END(this is only a block of code)&amp; Finally we use below code:EXECUTE SP_EXECUTESQL @FinalSql;Should I change above structure to parametrized structure &amp; then I should pass the parameters to SP_EXECUTESQL.</description><pubDate>Tue, 05 Feb 2013 09:37:11 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>[url]http://sqlinthewild.co.za/index.php/2011/11/11/sql-university-advanced-indexing-indexing-strategies/[/url]</description><pubDate>Tue, 05 Feb 2013 09:29:54 GMT</pubDate><dc:creator>GilaMonster</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>You've got two options there. Well, actually, you have to do both. First, talk to the business people to understand what they're actually going to do, not what they're asking for. Second, capture the queries as they come in and match that with what you know about the structure and their access methods. But the main thing is talking to the business to understand what they're doing.</description><pubDate>Tue, 05 Feb 2013 09:25:33 GMT</pubDate><dc:creator>Grant Fritchey</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Hi Gail, Fritchey,Thanks for your reply. Fritchey, thank you very much. I have found my first approach which I have to do.We are not using sp_executesql. So, this is my step. I am going to change the process. Now, I am afraid because developers have created all SPs using sql string. We haven't used parameters."Other than that, I'd say recognize the access patterns and then build indexes to support them. In most circumstances, compound keys and included columns work much better than single column indexes, but it really, really depends on exactly what the T-SQL code looks like."can you please let me know how can I do this approach?</description><pubDate>Tue, 05 Feb 2013 09:12:37 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>[quote][b]Grant Fritchey (2/5/2013)[/b][hr]Other than that, I'd say recognize the access patterns and then build indexes to support them. In most circumstances, compound keys and included columns work much better than single column indexes, but it really, really depends on exactly what the T-SQL code looks like.[/quote]This.You analyse common patterns and index to support those.</description><pubDate>Tue, 05 Feb 2013 08:53:55 GMT</pubDate><dc:creator>GilaMonster</dc:creator></item><item><title>RE: What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>This is a gigantic topic.I'm not going to be able to give you specific guidance based on a generic problem as you've outlined. Sorry.It sounds like you're creating a "report on anything" page and you're finding, as everyone does, that it's very hard to optimize those. The fact is, in most cases, there are patterns to what the users are going to access. Seldom is it true that they will search on anything. Generally there are related columns that will always be used for specific types of searches. You need to identify these patterns, then index accordingly.Putting a single index on each column is an extremely poor choice. It is likely that the majority of those indexes will never be used by any search criteria. You will still have to maintain them through all inserts, updates &amp; deletes, adding significantly to your overhead.Dynamic t-sql is the way to go with this type of query. Just make sure you're using sp_executesql and parameters and not merely building a sql string and executing it. The first method ensures at least some plan reuse. The second ensures none.Other than that, I'd say recognize the access patterns and then build indexes to support them. In most circumstances, compound keys and included columns work much better than single column indexes, but it really, really depends on exactly what the T-SQL code looks like.</description><pubDate>Tue, 05 Feb 2013 08:45:17 GMT</pubDate><dc:creator>Grant Fritchey</dc:creator></item><item><title>What is proper way of indexing on a table which contain large amount of data</title><link>http://www.sqlservercentral.com/Forums/Topic1415845-391-1.aspx</link><description>Hi experts, I have to optimize a database. I have almost done everything(modified stored procedures, implemented index also). But I am not satisfied &amp; I am bit confused at this moment due to some sections.There are few tables (contains millions of rows)  which take part in public search. These tables also takes part in INSERT &amp; UPDATE operations.I have performed maximum possible normalization &amp; cannot break my tables any more.There is a main table &amp; several related tables (so called child tables). Main table has 18 columns. 10 columns are numeric (int, bigint &amp; numeric(18,2)), 4 columns are varchar (128), 2 columns are datetime and 2 columns are bit. Some columns are nullable also.Search is based on different conditions &amp; user can select one or more search conditions.We are using dynamic query in our stored procedures. All columns can take part in "where" clause.Now my questions are:1. Should I create index on every columns? if yes, there will be 18 index (1 clustered &amp; 17 non clustered). Is this correct approach?2. Should I create covering index with all non clustered index or Is it sufficient to create covering index with only one non clustered index (on that column which always takes part in where clause).Please suggest me proper approach.Thanks in advance !</description><pubDate>Tue, 05 Feb 2013 07:44:48 GMT</pubDate><dc:creator>Anuj Rathi</dc:creator></item></channel></rss>