SQLServerCentral is supported by Red Gate Software Ltd.
Using a Variable for an IN Predicate
Posted Friday, March 1, 2013 11:13 AM
Forum Newbie


Group: General Forum Members
Last Login: Monday, September 9, 2013 8:15 PM
Points: 2, Visits: 6
Bruce W Cassidy (3/1/2013)
jimmyzimms (3/1/2013)
A standardized xml type containing the set (think something installed in your schemas collection) would be far easier to use and debug plus SQL server is a stud at using XML.

Reporting Services will only automatically generate a comma-delimited string for multi-value parameters; it won't generate XML. So I would have to write a parser in Reporting Services to express the parameters in XML before passing them to T-SQL where I then have to break the XML down again.

Neither of those is actually difficult, but I don't see the value in spending the extra time, every time I call a stored procedure from SQL Server Reporting Services, to ensure that any multi-value parameters are encoded in XML.

In this case, it would make the interface between reporting services and SQL Server harder to use or debug.


Wow, that sucks (about SSRS). The point is still valid, though. I'd suggest that in general you keep the XML array as the "native" input parameter and have an SSRS wrapper function that deals with the delimited-string coercion. This is a fairly common implementation of an anti-corruption layer at work.

That's a good bit of trivia there, Bruce. Thanks!
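For anyone curious what the XML-array approach being discussed looks like on the T-SQL side, here's a minimal sketch. The procedure, table, and element names are all hypothetical, used only to illustrate shredding an XML parameter with `nodes()`/`value()`:

```sql
-- Hypothetical stored procedure taking an XML "array" parameter,
-- e.g. @CustomerIDs = '<ids><id>3</id><id>7</id></ids>'.
CREATE PROCEDURE dbo.GetOrdersByCustomer
    @CustomerIDs xml
AS
BEGIN
    SELECT o.OrderID, o.CustomerID
    FROM dbo.Orders AS o
    WHERE o.CustomerID IN
    (
        -- Shred each <id> element into a row and cast it to int
        SELECT ids.id.value('.', 'int')
        FROM @CustomerIDs.nodes('/ids/id') AS ids(id)
    );
END;
```

The caller (or, as suggested above, an SSRS-facing wrapper) would be responsible for building that XML from whatever native form the parameters arrive in.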
Post #1425652
Posted Friday, March 1, 2013 11:48 AM
Forum Newbie


Group: General Forum Members
Last Login: Thursday, December 5, 2013 9:53 AM
Points: 7, Visits: 50
I agree with the UDF approach. We use two UDFs: one for splitting character data, one for splitting integers. It comes in very handy, and the work is only done once.
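For readers who haven't used this pattern, here's roughly what calling such a splitter looks like. This assumes a table-valued function shaped like the `DelimitedSplit8K` splitter discussed elsewhere in the thread, returning `(ItemNumber, Item)`; the table and column names are illustrative:

```sql
-- Assumed splitter: dbo.DelimitedSplit8K(@String, @Delimiter)
-- returning (ItemNumber, Item). Table names are hypothetical.
DECLARE @List varchar(8000) = '101,205,316';

SELECT t.*
FROM dbo.SomeTable AS t
JOIN dbo.DelimitedSplit8K(@List, ',') AS s
  ON t.ID = CAST(s.Item AS int);  -- int split via cast; a dedicated
                                  -- integer splitter skips the cast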
Post #1425675
Posted Friday, March 1, 2013 1:16 PM
Forum Newbie


Group: General Forum Members
Last Login: Thursday, September 18, 2014 3:47 PM
Points: 3, Visits: 9
Instead of the CTE, can't you just put the CTE's SELECT statement inside the WHERE clause?
Post #1425709
Posted Friday, March 1, 2013 1:47 PM


Right there with Babe


Group: General Forum Members
Last Login: Tuesday, July 22, 2014 4:12 PM
Points: 758, Visits: 1,031
jimmyzimms (3/1/2013)
I'd suggest that in general you keep the XML array as the "native" input parameter and have an SSRS wrapper function that deals with the delimited-string coercion. This is a fairly common implementation of an anti-corruption layer at work.

The delimited strings are assembled directly by SSRS; I don't see the value in an "anti-corruption layer", sorry. If SSRS is not building parameters correctly, that means that a significant part of the .Net layer must be corrupt, and no XML parsing is going to fix that (especially since the thing you're most likely to do to build the XML is use the .Net functionality to turn the array of parameters into XML).

Instead, I plug the stored procedure directly into SSRS as a data source and handle any parameters within the stored procedure (which I would also have to do if it were XML). Whereas if I were passing parameters as XML, I would be adding extra work and reducing the abstraction (you have to go through preparation steps prior to calling the stored procedure, which means you have to know about parameter formats — a step down in abstraction), and gaining... what, exactly?
Post #1425721
Posted Friday, March 1, 2013 1:52 PM


Right there with Babe


Group: General Forum Members
Last Login: Tuesday, July 22, 2014 4:12 PM
Points: 758, Visits: 1,031
rd8202 (3/1/2013)
Instead of the CTE, can't you just put the CTE's SELECT statement inside the WHERE clause?


You can do it that way; it's really just a syntactic change. There's a slight difference in the query plan (as the article shows -- it does cover both approaches), but SQL Server ends up doing much the same work. You're exchanging a filter (the IN clause) for a join (the CTE), and both are pretty optimal operations.

Where there's a difference is where you decide to use a temporary table. I'll do that for two reasons: either I'm expecting more than a handful of values, or I'm reusing the list of values in subsequent queries. For the first, I can create the temporary table with a primary key so SQL Server can do an index merge for the join, and that seems to work pretty well.
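A minimal sketch of that temp-table variant, assuming a `DelimitedSplit8K`-style splitter as discussed in the thread (the splitter and table names are assumptions, not part of the original post):

```sql
-- Split the list once into a keyed temp table, then join.
DECLARE @List varchar(8000) = '101,205,316';

CREATE TABLE #SelectedIDs (ID int NOT NULL PRIMARY KEY);

INSERT INTO #SelectedIDs (ID)
SELECT DISTINCT CAST(s.Item AS int)     -- DISTINCT guards the PK
FROM dbo.DelimitedSplit8K(@List, ',') AS s;

-- The primary key gives the optimizer an index on the list side,
-- and #SelectedIDs can be reused by subsequent queries in the batch.
SELECT t.*
FROM dbo.SomeTable AS t
JOIN #SelectedIDs AS sel
  ON sel.ID = t.ID;
```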
Post #1425729
Posted Friday, March 1, 2013 5:04 PM
SSC Rookie


Group: General Forum Members
Last Login: Tuesday, September 16, 2014 1:38 AM
Points: 34, Visits: 161
Sean Lange (3/1/2013)
Jeff Moden (3/1/2013)
Sean Lange (3/1/2013)
Irozenberg 1347 (3/1/2013)
An elementary sub-query or EXISTS clause would radically improve performance.
If you like fancy coding, you could pass XML as a parameter and use XQuery.


You should read the article about the splitter. It will blow the doors off an XML parser. You can find the article by following the link in my signature about splitting strings. The article is great but the ensuing discussion (although incredibly lengthy) is even better.


Be careful now. He's not talking about passing a CSV and using an XML parser to split it. He's talking about passing actual XML and parsing it, which is actually very fast.


Ahh I guess I misread that.

It is fast on the SQL side of things, but I have never really been convinced that applications gain much benefit, because of all the extra data that has to be passed.
Thank you Jeff!
IMHO, CSV data in a database is a pure hack - some developers just cut corners to speed up the first cut, and leave before it causes a HUGE maintenance problem, especially with reporting.
Last year I had to deal with a data migration project where a salesforce.com legacy system had been implemented by storing comma-separated values inside table cells. I used a CSV split function based on XML to transform a column that contained comma-separated strings into a new table with an ID and atomic items. A pure job-creation scheme!
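For reference, the XML-based split of a comma-separated column usually looks something like the following sketch. The table and column names are hypothetical, and note the caveat in the comments: the raw data must not contain XML-special characters (`&`, `<`, `>`) unless they are escaped first:

```sql
-- Turn a CSV column into one row per item by rewriting the string as
-- XML elements and shredding it with nodes(). Breaks if CSVColumn
-- contains unescaped &, < or >.
SELECT src.ID,
       x.item.value('.', 'varchar(100)') AS Item
FROM (
    SELECT ID,
           CAST('<i>' + REPLACE(CSVColumn, ',', '</i><i>') + '</i>' AS xml) AS ItemsXml
    FROM dbo.LegacyTable
) AS src
CROSS APPLY src.ItemsXml.nodes('/i') AS x(item);
```

The result set (ID, Item) can then be inserted into a properly normalized table.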
Post #1425791
Posted Saturday, March 2, 2013 10:20 AM


SSC-Dedicated


Group: General Forum Members
Last Login: Yesterday @ 10:28 PM
Points: 35,215, Visits: 31,666
Irozenberg 1347 (3/1/2013)
Thank you Jeff!
IMHO, CSV data in a database is a pure hack - some developers just cut corners to speed up the first cut, and leave before it causes a HUGE maintenance problem, especially with reporting.
Last year I had to deal with a data migration project where a salesforce.com legacy system had been implemented by storing comma-separated values inside table cells. I used a CSV split function based on XML to transform a column that contained comma-separated strings into a new table with an ID and atomic items. A pure job-creation scheme!


You're welcome.

Just to expose my opinion a bit...

There are two types of CSV...

Comma Separated Values - This is the "true" CSV and will always have quotes around anything that is character based. It's extremely useful but has the same problem as XML... if a quote is embedded in the actual data, it needs to be "escaped" in some fashion. Most people and programs don't format this type of file correctly, including Excel.

Comedy Separated Values - This is what most people think of when you say CSV. The only rules are that commas separate values and that commas cannot be embedded in the values. If those rules aren't followed explicitly, it's a real PITA; otherwise, it's pretty easy to make and use.

Both types of CSV are easy for humans to understand and read which is why they both gained favor in the past.

If it were me, I'd go back to the old ways and use things like ASCII characters 28 through 31 as they were meant to be along with the CrLf characters to help make it a bit easier to troubleshoot.

Shifting gears, I think XML is probably the worst hack of them all. I just can't imagine why anyone would have written such a dirty, bloated, PITA-to-use data transmission format even if we didn't have to move the data to databases, but especially since databases and spreadsheets DO make the world go 'round. Consider the following.

1. The tags cause extreme bloat. If communications were any slower than they are, no one in their right mind would transmit so much useless, repetitive data as that contained in the tags.

2. The tags don't actually do what they were advertised to do in the very beginning. There are things like OPENXML in SQL Server that will correctly shred even the most complex (even if not perfectly "well formed") XML hierarchical data into an "edge" table, but nothing to tell the human what the structure is, especially when it comes to nested levels (think of an indented list of tags). The human has to figure that out and then write code to shred it correctly.

3. The data format doesn't follow just one standard. It follows at least two: one for entity data and one for element data, both of which are just row data at different levels.

4. XML data sucks for relational databases because multiple entity data (tables) are nested in a hierarchy. Ask most relational database programmers what their least favorite subject is and chances are good that they'll say "hierarchies".

5. Unless you build a separate style sheet, XML doesn't do much to tell you what the datatype of each "cell" (whether it be entity or element data) is. That's another shortcoming, IMHO.

6. All of that was done for the sake of supposed human readability and, I've got to tell you, if you have XML with more than about 3 levels (entities), a human is going to have a good deal of difficulty shredding it mentally well enough to actually write queries against it.

7. The data violates every rule of normalization there is if you store it in a database in an unshredded format. Holy shades of old IMS systems!
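To illustrate the "edge" table mentioned in point 2 above, here's a small sketch using the legacy OPENXML API. When OPENXML is called without a WITH clause, it returns the raw edge-table representation (id, parentid, nodetype, localname, text, and so on), leaving the human to reconstruct the hierarchy; the sample XML is hypothetical:

```sql
-- OPENXML with no WITH clause returns the raw "edge" table,
-- one row per node, with parent pointers instead of structure.
DECLARE @hdoc int,
        @xml  nvarchar(max) =
            N'<orders><order id="1"><item>7</item></order></orders>';

EXEC sys.sp_xml_preparedocument @hdoc OUTPUT, @xml;

SELECT * FROM OPENXML(@hdoc, '/');   -- edge-table output

EXEC sys.sp_xml_removedocument @hdoc;  -- always release the handle
```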

One of my favorite ways to transmit data in the old days included a combination of the ASCII characters 28-31 with CrLf just for readability and the old DBase III table format. Each file had a header that named the columns and identified the datatype much like a CREATE TABLE statement. It was database friendly because each file formed an entity (table). It was incredibly compact because there were no tags and the separators weren't something that most humans could type via a keyboard which meant you didn't have to worry about things like embedded quotes or separators.

The only good thing about XML, IMHO, is that it is so bloated and so resource-intensive to shred that people HAD to make communications and CPUs faster just to handle XML and its equally bloated father, HTML. Who in their right mind would use a 4-character starting tag and a 5-character ending tag just to identify a "cell" in a table for a single digit? Bleah!!! What the hell were they thinking?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1425869
Posted Saturday, March 2, 2013 1:22 PM


Right there with Babe


Group: General Forum Members
Last Login: Tuesday, July 22, 2014 4:12 PM
Points: 758, Visits: 1,031
All I can say to that is that it's a good thing that SQL isn't intended to be a parser, and thus we don't have to worry about handling CSV or XML data in the database...
Post #1425895
Posted Sunday, March 3, 2013 6:17 AM
SSC Rookie


Group: General Forum Members
Last Login: Tuesday, September 16, 2014 1:38 AM
Points: 34, Visits: 161
Bruce W Cassidy (3/2/2013)
All I can say to that is that it's a good thing that SQL isn't intended to be a parser, and thus we don't have to worry about handling CSV or XML data in the database...
Bruce, I wish you were right, BUT it's like the joke about THEORY and PRACTICE: they should be the same (in theory), but in practice they are different.
Please note that the original meaning of CSV was CHARACTER-SEPARATED VALUES (a more generic case than COMMA-SEPARATED values).
Now, back to PRACTICE - the craziest thing in the legacy system was "a feature" where multiple delimiter characters had been used in a single CSV (for instance, both , and ; in the same string). Anybody remember the definition of a shortcut? Yes, the best one - the longest path between 2 points!
In REAL life we have to deal with uncertainties; one of them is the LACK of data integrity in legacy systems. Always expect the unexpected, but be grateful to previous developers - they ultimately created a JOB for you!
Post #1425937