Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Concat columns in Where clause Expand / Collapse
Author
Message
Posted Tuesday, November 12, 2013 12:08 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, July 24, 2014 6:10 PM
Points: 98, Visits: 297
Hello everyone,
I'm searching to see how performance works on columns concat in the where clause.
What are the search / performance ramifications?
Any suggestions?

WHERE
[column1] + [column2]
Post #1513585
Posted Tuesday, November 12, 2013 12:12 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 3:28 PM
Points: 13,103, Visits: 11,933
Jonathan Marshall (11/12/2013)
Hello everyone,
I'm searching to see how performance works on columns concat in the where clause.
What are the search / performance ramifications?
Any suggestions?

WHERE
[column1] + [column2]


That would become a nonSARGable predicate. The value will have to be calculated for every single row in the result set. To make things even worse you will likely end up wrapping each column with an ISNULL because if either column is NULL the concatenation will be null. Perhaps if you can provide more details we can help come up with a solution that will work and be fast.


_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1513587
Posted Tuesday, November 12, 2013 12:20 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, July 24, 2014 6:10 PM
Points: 98, Visits: 297
Very interesting I did not think of that at all.

The query was passed on to me to call from a package.
I will have to ask the creator more questions but it seems as they are trying to create unique values by not only using the concat in the WHERE clause but also in the SELECT statement.

The statements are used to update and insert into look-up tables.
Post #1513589
Posted Tuesday, November 12, 2013 12:21 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 12:06 AM
Points: 7,040, Visits: 12,958
Your question is incomplete since ther WHERE clause is missing the "= something" part. But that's just as a side note...

If my limited knowledge is correct, you'll end up with an index or table scan since there's a function applied to both columns (either string concatenation or to sum both values).
Here's a simple example:
WHERE [column1] + [column2] ='abc'

In order to figure out which rows would qualify SQL Server would need to compare all values of column1 that would either be empty (but not NULL), 'a', 'ab', or 'abc' together with each value of column2 either being empty (but not NULL), 'a', 'ab', or 'abc' and finally eliminate all rows not matching [column1] + [column2] ='abc'. It's even more complicated if we're talking about numeric values...
I guess, SQL Server would simply perform a table or index scan.

My preferred solution would be a computed, persisted, indexed column added to the table (if the number of queries and /or executions using such a where clause is significant). It depends

PS: I'm sure Gail (or any other one of the gurus available) will provide a much more reliable answer...




Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #1513590
Posted Tuesday, November 12, 2013 12:29 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, July 24, 2014 6:10 PM
Points: 98, Visits: 297
The queries are updating lookup tables that do not have a value by using a subquery to the lookup table.
I like that idea of using a computed column as these will be scanning through lots of data.

WHERE [column1] + [column2] NOT IN
(SELECT [column3] + [column4] from dbo.table order by [column1] + [column2])
Post #1513594
Posted Tuesday, November 12, 2013 12:30 PM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Friday, July 25, 2014 3:19 AM
Points: 4,320, Visits: 6,113
In addition to what others have said another ramification of what you seek (pun intended!!) to do will also prevent the optimizer from getting good estimates on rows to be hit. So you will also wind up with bad query plans to boot.

Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1513595
Posted Tuesday, November 12, 2013 12:47 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 12:06 AM
Points: 7,040, Visits: 12,958
Jonathan Marshall (11/12/2013)
The queries are updating lookup tables that do not have a value by using a subquery to the lookup table.
I like that idea of using a computed column as these will be scanning through lots of data.

WHERE [column1] + [column2] NOT IN
(SELECT [column3] + [column4] from dbo.table order by [column1] + [column2])


Step 1: remove the ORDER BY, it's useless.
Step 2: add a computed, persisted, indexed column on your source table
Step 3: add a computed, persisted, indexed column on your lookup table
Step 4: final query:
WHERE tableY.computedColumns1and2 NOT IN
(SELECT table.computedColumns3and4 from dbo.table)

Unfortunately, you'll usually still end up with an index scan on the large table since each and every value needs to be verified (except for the source table holding values of 1 .. 9 and the lookup table only holds 1 .. 2 (as a basic example), then there might be a chance of an index seek...

My approach would be an index as narrow as possible with just the computed column being a key column and the other values required being used as included columns.




Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #1513602
Posted Tuesday, November 12, 2013 12:48 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, July 24, 2014 6:10 PM
Points: 98, Visits: 297
Wow that is not good as the size of data will get quite large.
Thank you for the feedback. I will have to continue to research the options as far as a computed columns etc....

I will update with chosen path or changes.


Thanks again....

Jonathan
Post #1513603
Posted Tuesday, November 12, 2013 12:49 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Wednesday, July 23, 2014 9:00 AM
Points: 1,595, Visits: 4,585
Jonathan Marshall (11/12/2013)
The queries are updating lookup tables that do not have a value by using a subquery to the lookup table.
I like that idea of using a computed column as these will be scanning through lots of data.

WHERE [column1] + [column2] NOT IN
(SELECT [column3] + [column4] from dbo.table order by [column1] + [column2])

Now I understand why you're wanting concatenate columns in WHERE clause.

First, I don't understand why you have an ORDER BY clause in that sub-select. I'm surprised that would even parse.

When I have a similar situation to select where two or more columns in one set don't intersect with two or more columns in another set, I typically do an outer join. You can experiment to see which performs better.

select ...
from TableA
left join TableB on column1 = column3 and column2 = column4
where column3 is null;

Also, if basically all you need is to return column1 and column2 from one set where not contained in column3 and column4 of another set, then you can use the EXCEPT clause. It's essentially the reverse of UNION clause, because you're taking 2nd set away from 1st set rather than unionizing them.

SELECT column1, column2 from TableA
EXCEPT
SELECT column3, column4 from TableB;


If these sets will be large, then you'll probably benefit from indexing (column1, column2) and (column3, column4).
Post #1513604
Posted Tuesday, November 12, 2013 12:56 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Wednesday, July 23, 2014 9:00 AM
Points: 1,595, Visits: 4,585
Jonathan Marshall (11/12/2013)
The queries are updating lookup tables that do not have a value by using a subquery to the lookup table.
I like that idea of using a computed column as these will be scanning through lots of data.

WHERE [column1] + [column2] NOT IN
(SELECT [column3] + [column4] from dbo.table order by [column1] + [column2])

If updating one table from another, then consider MERGE statement.
Post #1513606
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse