September 8, 2010 at 1:44 pm
Dear all
I created a table with this fields:
ID bigint not null
fname varchar(50) not null
lname varchar(50) not null
and with this indexes:
clustered index on ID with fill factor of 100
non clustered index on fname
and the problem:
When I write query like this:
SELECT * FROM table WHERE fname='john'
every thing is good but if write a query like this (with arabic or persian texts)
SELECT * FROM table WHERE fname='علی'
SQL Server doesn't use index and do a full table scan!
Why? (my database collation is Persian_100_CI_AI and I'm using SQL 2008 R2)
Thanks anyway
September 11, 2010 at 2:35 pm
Indexed column and string used in WHERE clause must use the same collation. Otherwise index will not be selected - server will do full scan and compare values row by row.
I think it is SSMS problem, not server itself.
Try:
SELECT * FROM table WHERE fname='???' COLLATE Persian_100_CI_AI
to enforce the same collation to be used to represent '???' string in WHERE clause.
Regards,
Slawek
September 12, 2010 at 5:08 am
emdadgar2 (9/8/2010)
Dear allI created a table with this fields:
ID bigint not null
fname varchar(50) not null
lname varchar(50) not null
and with this indexes:
clustered index on ID with fill factor of 100
non clustered index on fname
and the problem:
When I write query like this:
SELECT * FROM table WHERE fname='john'
every thing is good but if write a query like this (with arabic or persian texts)
SELECT * FROM table WHERE fname='???'
SQL Server doesn't use index and do a full table scan!
Why? (my database collation is Persian_100_CI_AI and I'm using SQL 2008 R2)
Thanks anyway
Not sure, when I tried this I got the index seek for both. Could you post your full table definition and sample data?
September 12, 2010 at 9:43 am
You may also want to have a look at the xml plan for your query.
( SET SHOWPLAN_XML ON)
Search for implicit conversions as they may cause an index not to be used !
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
September 12, 2010 at 12:49 pm
Also I observed that default collation for the database is significant for the part written in single quotes..
September 12, 2010 at 1:59 pm
Really thanks
I tested all of them!
But sql server (2008 R2 - x64) doesn't use index on about 6 millions of records,
but when records omitted to under 1 millions, use index!!!!
I can force sql to use index with hints but I don't release why?
September 12, 2010 at 2:21 pm
It is all about statistics and index selectivity.
You issued:
SELECT * FROM table WHERE fname='???'
It means that for every row found in index sever must perform lookup to clustered index to retrieve two other columns.
Apparently '???' lands in bucket having many values.
For table having 1 mil records SQL calculates (using statistics) that cost (time spent) doing index seek + lookups will be still better than doing full scan. With 6+ mil row table it calculates that there will be so many hits in index, and so many lookups that it is better to do full scan (lookups are very expensive, and if more than 3-4% of table is going to be returned server selects full scan instead index seek + lookups)
You may:
1. Update your statistics for index based on fname
2. Create covering index on fname, including two other columns (which will result in duplicating your data)
3. make index starting from fname clustered, while still leaving column 1 as primary key.
4. Use Full Text index to index fname, lname columns, and use CONTAINS or CONTAINSTABLE predicates
(SELECT * FROM table WHERE CONTAINS(fname, '???')
I would personally use 3rd option, providing that most of your queries will be searching for fname, and you application issuing SELECT * cannot be changed to use different query
If application logic can be changed the best in my opinion will be 4, however mixing texts in English and Persian in one full-text index can be tricky and can lead to unpredictable results.
September 13, 2010 at 1:48 am
Slawek Guzek (9/12/2010)
It is all about statistics and index selectivity.You issued:
SELECT * FROM table WHERE fname='???'
It means that for every row found in index sever must perform lookup to clustered index to retrieve two other columns.
Apparently '???' lands in bucket having many values.
For table having 1 mil records SQL calculates (using statistics) that cost (time spent) doing index seek + lookups will be still better than doing full scan. With 6+ mil row table it calculates that there will be so many hits in index, and so many lookups that it is better to do full scan (lookups are very expensive, and if more than 3-4% of table is going to be returned server selects full scan instead index seek + lookups)
You may:
1. Update your statistics for index based on fname
2. Create covering index on fname, including two other columns (which will result in duplicating your data)
3. make index starting from fname clustered, while still leaving column 1 as primary key.
4. Use Full Text index to index fname, lname columns, and use CONTAINS or CONTAINSTABLE predicates
(SELECT * FROM table WHERE CONTAINS(fname, '???')
I would personally use 3rd option, providing that most of your queries will be searching for fname, and you application issuing SELECT * cannot be changed to use different query
If application logic can be changed the best in my opinion will be 4, however mixing texts in English and Persian in one full-text index can be tricky and can lead to unpredictable results.
Too many thanks, and :
For solution 1: My statics is updated and table reindexed, but full scan occurred
For solution 2:Database size is not matter, but Does duplicating data with covering indexes affect speed (slow down)?
For solution 3: I don't understand!!!:w00t:
For solution 4: It's good but when I want to query "John%" in this sample, fulltext don't cover item no.2:
data sample:
no. 1: Jane
no. 2: John Smith
no. 3: Johnnie
no. 4: Bill
and I want to get both no.1,2
Thanks for your time
September 13, 2010 at 1:56 am
I guess you mean you want no2 and no3 ! (both startin with john)
btw what's the distribution of your value '???'. In what kind of percentage does it occur ?
If more than a certain percentage sql may switch to a scan operation as well.
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
September 13, 2010 at 2:33 am
Hi,
Ad 2:) It does. Inserts will be slower, but how much slower - I can't tell. Probably you will not even notice difference with single or dozens of inserts.
Ad 3:)
The idea to reorder logical layout of your data in the table, and keep it ordered in order defined by fname, not ID column.
SQL will not have to make any lookups to retrieve data when using fname based index, because all data will be stored on fname index pages.
CREATE TABLE dbo.Table (
ID bigint NOT NULL PRIMARY KEY NONCLUSTERED ,
fname varchar](50) NOT NULL,
lname varchar](50) NOT NULL
)
CREATE UNIQUE CLUSTERED INDEX MyClustered Idx ON dbo.Table
(
fname ASC,
ID ASC
)
Re-load content to the table above and run your query.
Ad 4) "It's good but when I want to query "John%" in this sample, fulltext don't cover item no.2:"
Why ?
"John%" = SELECT * FROM Table WHERE CONTAINS (fname, '"john*"')
Regards,
Slawek
Viewing 10 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply