Using a Variable for an IN Predicate


jimmyzimms
Forum Newbie

Group: General Forum Members
Points: 6 Visits: 14
Bruce W Cassidy (3/1/2013)
jimmyzimms (3/1/2013)
A standardized XML type containing the set (think something installed in your schema collection) would be far easier to use and debug, plus SQL Server is a stud at using XML.

Reporting Services will only automatically generate a comma-delimited string for multi-value parameters; it won't generate XML. So I would have to write a parser in Reporting Services to express the parameters in XML before passing them to T-SQL where I then have to break the XML down again.

Neither of those is actually difficult, but I don't see the value in spending the extra time, every time I call a stored procedure from SQL Server Reporting Services, to ensure that any multi-value parameters are encoded in XML.

In this case, it would make the interface between reporting services and SQL Server harder to use or debug.


Wow, that sucks (about SSRS). The point is still valid, though, and I'd suggest that in general you keep the XML array as the "native" input parameter and have an SSRS wrapper function that deals with the delimited-string coercion. This is a fairly common implementation of an anti-corruption layer at work here.

That's a bit of good trivia there, Bruce. Thanks!
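For anyone following along, the "XML array as native input parameter" idea can be sketched roughly like this — all the names here (@ids, dbo.Orders, CustomerID) are illustrative, not from the thread:

```sql
-- Hypothetical sketch: an XML "array" parameter shredded with nodes()/value().
CREATE PROCEDURE dbo.GetOrdersByCustomer
    @ids xml  -- e.g. N'<ids><id>3</id><id>7</id><id>12</id></ids>'
AS
BEGIN
    SET NOCOUNT ON;

    SELECT o.*
    FROM dbo.Orders AS o
    WHERE o.CustomerID IN
    (
        -- shred each <id> element into an int row
        SELECT x.id.value('.', 'int')
        FROM @ids.nodes('/ids/id') AS x(id)
    );
END;
```

An SSRS-facing wrapper procedure would then only need to convert the comma-delimited string into this XML shape before calling the "native" procedure.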
bmayhew
Forum Newbie

Group: General Forum Members
Points: 9 Visits: 50
I agree with the UDF approach. We use two UDFs, one for splitting character data and one for splitting ints. Comes in very handy, and it's only done once.
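A splitter UDF of the kind mentioned here might look like the following inline table-valued function — a hedged sketch in the pre-2016 tally-table style (on SQL Server 2016+ the built-in STRING_SPLIT covers this); all names are illustrative:

```sql
-- Illustrative splitter: returns one int row per delimited item.
CREATE FUNCTION dbo.SplitInts (@list varchar(8000), @delim char(1))
RETURNS TABLE
AS
RETURN
    WITH Tally(n) AS
    (
        -- a numbers list long enough to cover the string
        SELECT TOP (LEN(@list))
               ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM sys.all_columns AS a
        CROSS JOIN sys.all_columns AS b
    )
    SELECT Item = CAST(SUBSTRING(@list, n,
                       CHARINDEX(@delim, @list + @delim, n) - n) AS int)
    FROM Tally
    -- keep only positions that start an item (position 1 or just after a delimiter)
    WHERE SUBSTRING(@delim + @list, n, 1) = @delim;
```

Usage: `SELECT Item FROM dbo.SplitInts('3,7,12', ',');` — a character version is the same shape minus the CAST.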
rd8202
Forum Newbie

Group: General Forum Members
Points: 6 Visits: 17
Instead of the CTE, can't you just put the CTE's SELECT statement inside the WHERE clause?
Bruce W Cassidy
Right there with Babe

Group: General Forum Members
Points: 787 Visits: 1033
jimmyzimms (3/1/2013)
I'd suggest that in general you keep the XML array as the "native" input parameter and have an SSRS wrapper function that deals with the delimited-string coercion. This is a fairly common implementation of an anti-corruption layer at work here.

The delimited strings are assembled directly by SSRS; I don't see the value in an "anti-corruption layer", sorry. If SSRS is not building parameters correctly, that means a significant part of the .NET layer must be corrupt, and no XML parsing is going to fix that (especially since the most likely way to build the XML is to use .NET functionality to turn the array of parameters into XML).

Instead, I plug the stored procedure directly into SSRS as a data source and handle any parameters within the stored procedure (which I would also have to do if they were XML). Whereas if I were passing parameters as XML, I would be adding additional work, reducing the abstraction (you have to go through preparation steps prior to calling a stored procedure, which means you have to know about parameter formats, which is a step down in abstraction), and gaining... what, exactly?
Bruce W Cassidy
Right there with Babe

Group: General Forum Members
Points: 787 Visits: 1033
rd8202 (3/1/2013)
Instead of the CTE, can't you just put the CTE's SELECT statement inside the WHERE clause?


You can do it that way; it's really just a syntactic change. There's a slight difference in the query plan (as the article shows -- it does cover both approaches), but SQL Server ends up doing much the same work. You're exchanging a filter (the IN clause) for a join (the CTE), and both are pretty optimal operations.
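The two equivalent shapes look like this — a sketch with illustrative names (dbo.Orders, CustomerID) and a hypothetical splitter function standing in for whatever produces the value list:

```sql
DECLARE @list varchar(8000) = '3,7,12';

-- Form 1: the value list in a CTE, joined to the main query.
WITH ids AS
(
    SELECT Item FROM dbo.SplitInts(@list, ',')  -- hypothetical splitter
)
SELECT o.*
FROM dbo.Orders AS o
JOIN ids
  ON ids.Item = o.CustomerID;

-- Form 2: the same SELECT moved directly into the WHERE clause.
SELECT o.*
FROM dbo.Orders AS o
WHERE o.CustomerID IN (SELECT Item FROM dbo.SplitInts(@list, ','));
```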

Where there's a difference is where you decide to use a temporary table. I'll do that for two reasons: either I'm expecting more than a handful of values, or I'm reusing the list of values in subsequent queries. For the first, I can create a temporary table with a primary key so SQL Server can do an index merge for the join, and that seems to work pretty well.
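The temp-table variant can be sketched as follows (illustrative names again; the splitter function is hypothetical) — the point is that the list is materialized once, keyed, and then reusable across several queries:

```sql
DECLARE @list varchar(8000) = '3,7,12';

-- Materialize the list once, with a primary key to give the
-- optimizer an index on the join column.
CREATE TABLE #ids (CustomerID int NOT NULL PRIMARY KEY);

INSERT INTO #ids (CustomerID)
SELECT DISTINCT Item
FROM dbo.SplitInts(@list, ',');  -- hypothetical splitter

-- First (of possibly several) queries reusing the keyed list.
SELECT o.*
FROM dbo.Orders AS o
JOIN #ids AS i
  ON i.CustomerID = o.CustomerID;
```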
fregatepllada
Valued Member

Group: General Forum Members
Points: 66 Visits: 322
Sean Lange (3/1/2013)
Jeff Moden (3/1/2013)
Sean Lange (3/1/2013)
Irozenberg 1347 (3/1/2013)
An elementary sub-query or EXISTS clause would radically improve performance. If you like fancy coding, you could pass XML as a parameter and use XQuery.


You should read the article about the splitter. It will blow the doors off an XML parser. You can find the article by following the link in my signature about splitting strings. The article is great but the ensuing discussion (although incredibly lengthy) is even better.


Be careful now. He's not talking about passing a CSV and using an XML parser to split it. He's talking about passing actual XML and parsing it, which is actually very fast.


Ahh I guess I misread that. Blush

It is fast on the SQL side of things, but I have never really been convinced that applications gain much benefit, because of all the extra data that has to be passed.
Thank you Jeff!
IMHO, CSV data in a database is a pure hack - some developers just cut corners to speed up the first cut and leave before it causes HUGE maintenance problems, especially with reporting.
Last year I had to deal with a data migration project where a salesforce.com legacy system had been implemented by storing comma-separated values inside table cells. I used a CSV split function based on XML to transform a column that contained comma-separated strings into a new table with an ID and atomic items. A pure job-creation scheme!
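The XML-based CSV split mentioned here is a well-known trick; a hedged sketch (hypothetical table and column names, and it assumes the values contain no XML-special characters like & or <, which a real migration must handle):

```sql
-- Turn each row's CSV column into one row per atomic item, keyed by ID,
-- by rewriting the string as an XML fragment and shredding it.
SELECT s.ID,
       Item = x.i.value('.', 'varchar(100)')
FROM dbo.LegacyTable AS s
CROSS APPLY (SELECT CAST('<i>' + REPLACE(s.CsvColumn, ',', '</i><i>')
                         + '</i>' AS xml)) AS v(doc)
CROSS APPLY v.doc.nodes('/i') AS x(i);
```

The result set can be inserted directly into the new (ID, Item) table.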
Jeff Moden
SSC-Forever

Group: General Forum Members
Points: 45418 Visits: 39942
Irozenberg 1347 (3/1/2013)
Thank you Jeff!
IMHO, CSV data in a database is a pure hack - some developers just cut corners to speed up the first cut and leave before it causes HUGE maintenance problems, especially with reporting.
Last year I had to deal with a data migration project where a salesforce.com legacy system had been implemented by storing comma-separated values inside table cells. I used a CSV split function based on XML to transform a column that contained comma-separated strings into a new table with an ID and atomic items. A pure job-creation scheme!


You're welcome.

Just to expose my opinion a bit...

There are two types of CSV...

Comma Separated Values - This is the "true" CSV and will always have quotes around anything that is character based. It's extremely useful but has the same problem as XML... if a quote is embedded in the actual data, it needs to be "escaped" in some fashion. Most people and programs don't format this type of file correctly, including Excel.

Comedy Separated Values - This is what most people think of when you say CSV. The only rules are that commas separate values and commas cannot be embedded in the values. It's a real PITA unless those rules are followed explicitly. Otherwise, it's pretty easy to make and use.

Both types of CSV are easy for humans to understand and read which is why they both gained favor in the past.

If it were me, I'd go back to the old ways and use things like ASCII characters 28 through 31 as they were meant to be along with the CrLf characters to help make it a bit easier to troubleshoot.

Shifting gears, I think XML is probably the worst hack of them all. I just can't imagine why anyone would have written such a dirty, bloated, PITA to use data transmission format even if we didn't have to move the data to databases but especially since databases and spreadsheets DO make the world go'round. Consider the following.

1. The tags cause extreme bloat. If communications were slower than they are, no one in their right mind would transmit so much useless, repetitive data as that contained in the tags.

2. The tags don't actually do what they were advertised to do in the very beginning. There are things like OPENXML in SQL Server that will correctly shred even the most complex (even if not perfectly "well formed") XML hierarchical data into an "edge" table, but nothing to tell the human what the structure is, especially when it comes to nested levels (think of an indented list of tags). The human has to figure that out and then write code to shred it correctly.

3. The data format doesn't follow just one standard. It follows at least two: one for entity data and one for element data, both of which are just row data at different levels.

4. XML data sucks for relational databases because multiple entity data (tables) are nested in a hierarchy. Ask most relational database programmers what their least favorite subject is and chances are good that they'll say "hierarchies".

5. Unless you build a separate style sheet, XML doesn't do much to tell you what the datatype of each "cell" (whether it be entity or element data) is. That's another shortcoming, IMHO.

6. All of that was done for the sake of supposed human readability and, I've got to tell you, if you have XML with more than about three levels (entities), a human is going to have real difficulty shredding it in the mind well enough to actually write queries against it.

7. The data violates every rule of normalization there is if you store it in a database in an unshredded format. Holy shades of old IMS systems!

One of my favorite ways to transmit data in the old days included a combination of the ASCII characters 28-31 with CrLf just for readability and the old DBase III table format. Each file had a header that named the columns and identified the datatype much like a CREATE TABLE statement. It was database friendly because each file formed an entity (table). It was incredibly compact because there were no tags and the separators weren't something that most humans could type via a keyboard which meant you didn't have to worry about things like embedded quotes or separators.

The only good thing about XML, IMHO, is that it is so bloated and so resource intensive to shred that people HAD to make communications and CPUs faster just to handle XML and its equally bloated father, HTML. Who in their right mind would use a 4 character starting tag and a 5 character ending tag just to identify a "cell" in a table for a single digit? Bleah!!! Sick What the hell were they thinking?
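The OPENXML "edge table" shredding mentioned in point 2 can be sketched like this (illustrative document; without a WITH clause, OPENXML returns the raw hierarchy, and working out the structure is left to the human):

```sql
DECLARE @doc xml =
    N'<orders><order id="1"><line>A</line></order></orders>';
DECLARE @h int;

-- parse the document and get a handle to it
EXEC sys.sp_xml_preparedocument @h OUTPUT, @doc;

-- no schema clause: an "edge" table of every node in the document,
-- with parent pointers instead of any readable structure
SELECT id, parentid, nodetype, localname, [text]
FROM OPENXML(@h, '/');

-- release the parsed document
EXEC sys.sp_xml_removedocument @h;
```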

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we don't actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesn't mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Bruce W Cassidy
Right there with Babe

Group: General Forum Members
Points: 787 Visits: 1033
All I can say to that is that it's a good thing that SQL isn't intended to be a parser, and thus we don't have to worry about handling CSV or XML data in the database... :-P
fregatepllada
Valued Member

Group: General Forum Members
Points: 66 Visits: 322
Bruce W Cassidy (3/2/2013)
All I can say to that is that it's a good thing that SQL isn't intended to be a parser, and thus we don't have to worry about handling CSV or XML data in the database... :-P
Bruce, I wish you were right, BUT it's like the joke about THEORY and PRACTICE: in theory they should be the same, but in practice they are different.
Please note that the original meaning of CSV was CHARACTER-SEPARATED VALUES (a more generic case than COMMA-SEPARATED values).
Now, back to PRACTICE - the craziest thing in the legacy system was "a feature" where multiple characters had been used in a single CSV (for instance, both , and ; had been used in a single string). Anybody remember the definition of a shortcut? Yes, the best one - the longest path between two points!
In REAL life we have to deal with uncertainties, one of them being the LACK of data integrity in legacy systems. Always expect the unexpected, but be grateful to previous developers - they ultimately created a JOB for you!