Need help with Select Where clause using parameter equal blank or in a list

Question

Need help with Select Where clause using parameter equal blank or in a list

jplankenhorn

Ten Centuries

Points: 1026
More actions
July 17, 2013 at 10:06 am

#280538

I have a stored procedure that passes in several parameters that can be blank or a list of selected filter values. For each parameter, I split the values and insert into their own temp tables. This all works fine. When I get to my Select statement, I want to select all records if the parameter that was passed is blank or filter the select if the parameter was not blank. Here is an example of what I am dfoing: of what I am doing:
create procedure XYZ
@param1 varchar(50), -- passing in 2 values
@param2 varchar(50), -- passing in 2 values
@param3 varchar(50) -- passing in a blank string
as
create table #Param1Values
(Param1Valueint)
insert #Param1Values
select *
from SplitToInt(@param1)
--Repeat above for other two parameters
select *
from ABC
where (@param1 = '' or (colA in (select Param1Value from #Param1Values)) and
(@param2 = '' or (colB in (select Param2Value from #Param2Values)) and
(@param3 = '' or (colC in (select Param3Value from #Param3Values))
This all works, however, I found the timing for the query to fluctuate immensely when I alter the comparison of the second parameter. The query takes about 1:20 to run when coded as above. If I take out the "@param2 = '' or", the query runs in 11 seconds. I don't see this kind of adjustment if I take out the same portion of the first parameter. That only bring it down another 2 seconds. I have considered populating the temp table with all possible values, if the parameter is blank. The difference between that and just removing the whole second line(to simulate an All filter) is just 10 seconds. I also tried putting an index on the temp table, but that actually took longer.
I do know that there are a lot more distinct values in colB then there are in the other two. Also, the primary table of the query does not have any indexes set up for any of the columns in the query. I haven't tried setting those up yet to see what kind of performance improvement I might get.
Any ideas as to why there is such a big impact with this one column? Is there a better approach that I should be taking?
Thanks.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

kapil_kk SSC-Insane Points: 21316 More actions · Answer 1

Can you please some sample data that you are trying to insert?

_______________________________________________________________
To get quick answer follow this link:
http://www.sqlservercentral.com/articles/Best+Practices/61537/

jplankenhorn Ten Centuries Points: 1026 More actions · Answer 2

colA contains division numbers, with values of 10, 20, 30, etc.

colB contains GL Account Numbers with values of 3160, 3180, etc.

colC contains Process Codes with string values like apet, arss , etc

I am passing in 10,20 for the first parameter and 3160,3180 into the second. The third is blank.

The population of the temp tables works quickly and correctly.

The select query is only a simple example of a much more complicated query that has multiple joins.

Sean Lange SSC Guru Points: 286573 More actions · Answer 3

jplankenhorn (7/17/2013)
I have a stored procedure that passes in several parameters that can be blank or a list of selected filter values. For each parameter, I split the values and insert into their own temp tables. This all works fine. When I get to my Select statement, I want to select all records if the parameter that was passed is blank or filter the select if the parameter was not blank. Here is an example of what I am dfoing: of what I am doing:
create procedure XYZ
@param1 varchar(50), -- passing in 2 values
@param2 varchar(50), -- passing in 2 values
@param3 varchar(50) -- passing in a blank string
as
create table #Param1Values
(Param1Valueint)
insert #Param1Values
select *
from SplitToInt(@param1)
--Repeat above for other two parameters
select *
from ABC
where (@param1 = '' or (colA in (select Param1Value from #Param1Values)) and
(@param2 = '' or (colB in (select Param2Value from #Param2Values)) and
(@param3 = '' or (colC in (select Param3Value from #Param3Values))
This all works, however, I found the timing for the query to fluctuate immensely when I alter the comparison of the second parameter. The query takes about 1:20 to run when coded as above. If I take out the "@param2 = '' or", the query runs in 11 seconds. I don't see this kind of adjustment if I take out the same portion of the first parameter. That only bring it down another 2 seconds. I have considered populating the temp table with all possible values, if the parameter is blank. The difference between that and just removing the whole second line(to simulate an All filter) is just 10 seconds. I also tried putting an index on the temp table, but that actually took longer.
I do know that there are a lot more distinct values in colB then there are in the other two. Also, the primary table of the query does not have any indexes set up for any of the columns in the query. I haven't tried setting those up yet to see what kind of performance improvement I might get.
Any ideas as to why there is such a big impact with this one column? Is there a better approach that I should be taking?
Thanks.

I see a couple of potential issues here. First is your splitter. Delimited splitters can be VERY slow depending on how they are written. Another issue, why bother parsing the parameter into a table so you can join to it and then discard it? Your function already has it in a table. The last couple issues revolve around multiple execution paths. What you have done is created a situation where you can and will get suboptimal execution plans because of the way you are handling the where predicates.

Gail's blog explain the situation in a nutshell. http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/[/url]

I don't think your query is really a catch all type of query. Understanding why that works will explain why you are seeing fluctuations in execution time. I think your execution times at way too slow (11 seconds is a LONG time for a query if somebody is waiting for it or the row count is not in the millions). I think if you can post the actual code for your proc and your function along with some ddl for the tables and sample data we can make this thing scream.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

jplankenhorn Ten Centuries Points: 1026 More actions · Answer 4

Thanks for the link. That article was very informative. My query does pretty much fall into his Catch-All classification. This query is being used to populate a report based on a set of parameters that the user chooses, for example they may want to limit the report to a specific set of divisions or all of them.

The splitting routine was one I found online and was recommended as a very fast one that uses xml to parse the values. I have not found any problems in my queries that would indicate this as a problem.

The database that I am using is loaded from a progress database using SSIS. I use this database for reporting as it is much easier to work with and runs faster than the Progress db. That being said, I don't have many indexes on the tables at this point. I pretty much add them as I find issues. The tables in this query still need that optimizing. I'me sure I can lop a lot off the time just doing that.

I was primarily understanding why the query takes so long one way and not the other, especially, when compared to the same code against a different field in the table. Does it take longer because the table has more values? If so, why when the temp table is the one being looked at and it only contains 2 values? By the way, I can leave the "@param2= ''" code in and replace the "(select Param2Value from #Param2Values)" clause with (3160,3180) and it runs in 9 seconds.

Before I jump to using Dynamic SQL, something I just have a personal preference against doing, I may try breaking the query into two chunks. There is a table in the large query that contains the a single row per GL Div/Account. I I only need one column from that table, so I may use the above logic against it and create a temp table that can then be joined to my primary select table instead of using the original. If I am not happy with those results, I may go to dynamic SQL or just go with what I have after I see what the index changes does for me.

Sean Lange SSC Guru Points: 286573 More actions · Answer 5

I would suggest that using XML is not going to be as fast as the one found by following the link in my signature about splitting strings. That article has an xml splitter, plus a few others. Then the author tests the performance of all the various ways. You might be surprised. I would highly recommend at least perusing that article. It may not be that your splitter is an issue today but the tally based splitter will eliminate that issue for certain.

As for breaking this into chunks, I think that is an excellent approach. You still have some potential issues with plans that don't work as well based on current parameters. Here is another awesome article from Gail on this topic. http://sqlinthewild.co.za/index.php/2009/09/15/multiple-execution-paths/[/url]

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

jplankenhorn Ten Centuries Points: 1026 More actions · Answer 6

Thanks for the recommendations. I really appreciate them. I will be checking out all of those links.

I tweaked my code to prefilter using the method I said. I did have to modify it a bit and couldn't eliminate the other table completely, but the query does run in the 8 second range.

Now, on to the indexes 🙂

jplankenhorn Ten Centuries Points: 1026 More actions · Answer 7

Just an update on what I did. I split the query into a base SP and two sub-SPs. Since I had two parameter sets that would never be used together, query within a post date range or a list of periods, I decided to make a sub-SP out of each.

The base SP takes the parameter strings and loads them into a set of table types. If a parameter is blank, the table type is loaded with the entire available list of values. It then calls the appropriate sub-SP.

The sub-SPs now join to the passed in table types. The Period SP doesn't even need a where clause.

I have also added appropriate indexes to all the tables that are used. MY small sample from earlier now takes about 7 seconds. A large test that used to take a little over 2 minutes now takes 30 seconds. I am very happy with these results.

Thanks again for the help.

Sean Lange SSC Guru Points: 286573 More actions · Answer 8

You are welcome. Glad I was able to help you find a solution and thanks for letting me know.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/