Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««5253545556»»»

Tally OH! An Improved SQL 8K “CSV Splitter” Function Expand / Collapse
Author
Message
Posted Friday, February 22, 2013 1:24 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Monday, May 26, 2014 10:11 AM
Points: 452, Visits: 820
mister.magoo (2/22/2013)
Usman Butt (2/22/2013)
L' Eomot Inversé (2/22/2013)
mister.magoo (2/22/2013)
What bothered me was that feeling that "someone should have thought of this ages ago, shouldn't they"?
Always makes me nervous when that is the case...

That feeling was bothering me too - I can't understand why I didn't spot the opportunity the first time I looked at the code, let alone why Jeff or Paul or another of the top experts who contribute so much to SQLServerCentral didn't do it long ago.
Looks as if you're the only one who's awake around here, Mr M.


IIRC, Jeff Moden already knows it. But I guess the conversation we had somewhere else was something like that he likes to use the default collation. (I may be wrong. So I sincerely apologize in advance in case I said anything/everything wrong).

But in my case, the binary collated splitter version is already in production for comma delimeted strings



And to supplement my earlier post, please read posts from here onwards. I guess both of our top experts Jeff Moden and Paul White already knows it very well?
Post #1423239
Posted Friday, February 22, 2013 2:02 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 5:36 PM
Points: 8,545, Visits: 9,033
Usman Butt (2/22/2013)
And to supplement my earlier post, please read posts from here onwards. I guess both of our top experts Jeff Moden and Paul White already knows it very well?

Ouch. Less than 6 months ago and I already forgot!


Tom
Post #1423260
Posted Tuesday, February 26, 2013 3:48 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Yesterday @ 4:55 AM
Points: 323, Visits: 2,198
Using a binary comparisons is a good speedup for specific cases and good to be aware of and certainly worth mentioning in bold with its benefits. What it is not however is a good in a generic solution that people can copy paste and directly use without unexpected side-effects. Collations do not exist for nothing and most of the searches in a database will be case insensitve for valid reasons.
Post #1423980
Posted Tuesday, February 26, 2013 3:53 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 9:52 AM
Points: 7,107, Visits: 13,462
peter-757102 (2/26/2013)
Using a binary comparisons is a good speedup for specific cases and good to be aware of and certainly worth mentioning in bold with its benefits. What it is not however is a good in a generic solution that people can copy paste and directly use without unexpected side-effects. Collations do not exist for nothing and most of the searches in a database will be case insensitve for valid reasons.


The collation changes to DS8K suggested by Mister Magoo are all internal to the function and are not inherited by (or retained by) the output columns. Good point though - the output of multiple nested REPLACE accelerated by applying a binary collation will retain the new collation cast and could affect subsequent operations.


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1423985
Posted Tuesday, February 26, 2013 4:32 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 8:28 PM
Points: 3,609, Visits: 5,219
ChrisM@Work (2/26/2013)
peter-757102 (2/26/2013)
Using a binary comparisons is a good speedup for specific cases and good to be aware of and certainly worth mentioning in bold with its benefits. What it is not however is a good in a generic solution that people can copy paste and directly use without unexpected side-effects. Collations do not exist for nothing and most of the searches in a database will be case insensitve for valid reasons.


The collation changes to DS8K suggested by Mister Magoo are all internal to the function and are not inherited by (or retained by) the output columns. Good point though - the output of multiple nested REPLACE accelerated by applying a binary collation will retain the new collation cast and could affect subsequent operations.


Does this mean though, that internally the BIN collation sequence should be either CS or CI, depending on what the default is for the database?

I always use Latin1_General_BIN and now I'm not sure if that is CS or CI!



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1423998
Posted Tuesday, February 26, 2013 4:46 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 9:52 AM
Points: 7,107, Visits: 13,462
dwain.c (2/26/2013)
ChrisM@Work (2/26/2013)
peter-757102 (2/26/2013)
Using a binary comparisons is a good speedup for specific cases and good to be aware of and certainly worth mentioning in bold with its benefits. What it is not however is a good in a generic solution that people can copy paste and directly use without unexpected side-effects. Collations do not exist for nothing and most of the searches in a database will be case insensitve for valid reasons.


The collation changes to DS8K suggested by Mister Magoo are all internal to the function and are not inherited by (or retained by) the output columns. Good point though - the output of multiple nested REPLACE accelerated by applying a binary collation will retain the new collation cast and could affect subsequent operations.


Does this mean though, that internally the BIN collation sequence should be either CS or CI, depending on what the default is for the database?

I always use Latin1_General_BIN and now I'm not sure if that is CS or CI!


It's neither case nor accent sensitive but there's evidence here that it's smirk-sensitive:
Latin1_General_BIN_:s
Latin1_General_BIN_:]


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1424004
Posted Tuesday, February 26, 2013 5:11 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 4:21 PM
Points: 1,780, Visits: 5,644
ChrisM@Work (2/26/2013)
dwain.c (2/26/2013)
ChrisM@Work (2/26/2013)
peter-757102 (2/26/2013)
Using a binary comparisons is a good speedup for specific cases and good to be aware of and certainly worth mentioning in bold with its benefits. What it is not however is a good in a generic solution that people can copy paste and directly use without unexpected side-effects. Collations do not exist for nothing and most of the searches in a database will be case insensitve for valid reasons.


The collation changes to DS8K suggested by Mister Magoo are all internal to the function and are not inherited by (or retained by) the output columns. Good point though - the output of multiple nested REPLACE accelerated by applying a binary collation will retain the new collation cast and could affect subsequent operations.


Does this mean though, that internally the BIN collation sequence should be either CS or CI, depending on what the default is for the database?

I always use Latin1_General_BIN and now I'm not sure if that is CS or CI!


It's neither case nor accent sensitive but there's evidence here that it's smirk-sensitive:
Latin1_General_BIN_:s
Latin1_General_BIN_:]


Surely you mean that Latin1_General_BIN is case sensitive (CS) and accent sensitive (AS) ?

Binary collation will definitely find 'a' and 'A' to be different and 'á' to be different to 'a'.

That is part of the reason it is quicker for CHARINDEX and equality testing - it doesn't have to compare 'a' to 'A','á','Á' etc... it is either 'a' or NOT 'a'.

I do agree that it is smirk sensitive though :o) != :O)


MM


  • MMGrid Addin
  • MMNose Addin


  • Forum Etiquette: How to post Reporting Services problems
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • How to Post Performance Problems - by Gail Shaw

  • Post #1424015
    Posted Tuesday, February 26, 2013 5:15 AM


    SSCertifiable

    SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

    Group: General Forum Members
    Last Login: Today @ 9:52 AM
    Points: 7,107, Visits: 13,462
    mister.magoo (2/26/2013)
    ChrisM@Work (2/26/2013)
    dwain.c (2/26/2013)
    ChrisM@Work (2/26/2013)
    peter-757102 (2/26/2013)
    Using a binary comparisons is a good speedup for specific cases and good to be aware of and certainly worth mentioning in bold with its benefits. What it is not however is a good in a generic solution that people can copy paste and directly use without unexpected side-effects. Collations do not exist for nothing and most of the searches in a database will be case insensitve for valid reasons.


    The collation changes to DS8K suggested by Mister Magoo are all internal to the function and are not inherited by (or retained by) the output columns. Good point though - the output of multiple nested REPLACE accelerated by applying a binary collation will retain the new collation cast and could affect subsequent operations.


    Does this mean though, that internally the BIN collation sequence should be either CS or CI, depending on what the default is for the database?

    I always use Latin1_General_BIN and now I'm not sure if that is CS or CI!


    It's neither case nor accent sensitive but there's evidence here that it's smirk-sensitive:
    Latin1_General_BIN_:s
    Latin1_General_BIN_:]


    Surely you mean that Latin1_General_BIN is case sensitive (CS) and accent sensitive (AS) ?

    Binary collation will definitely find 'a' and 'A' to be different and 'á' to be different to 'a'.

    That is part of the reason it is quicker for CHARINDEX and equality testing - it doesn't have to compare 'a' to 'A','á','Á' etc... it is either 'a' or NOT 'a'.

    I do agree that it is smirk sensitive though :o) != :O)


    Oops! Smirk in me eyes yes of course. "binary" is a bit of a giveaway.


    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
    Exploring Recursive CTEs by Example Dwain Camps
    Post #1424019
    Posted Tuesday, February 26, 2013 5:12 PM


    Hall of Fame

    Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

    Group: General Forum Members
    Last Login: Yesterday @ 8:28 PM
    Points: 3,609, Visits: 5,219
    ChrisM@Work (2/26/2013)
    mister.magoo (2/26/2013)
    ChrisM@Work (2/26/2013)
    dwain.c (2/26/2013)
    ChrisM@Work (2/26/2013)
    peter-757102 (2/26/2013)
    Using a binary comparisons is a good speedup for specific cases and good to be aware of and certainly worth mentioning in bold with its benefits. What it is not however is a good in a generic solution that people can copy paste and directly use without unexpected side-effects. Collations do not exist for nothing and most of the searches in a database will be case insensitve for valid reasons.


    The collation changes to DS8K suggested by Mister Magoo are all internal to the function and are not inherited by (or retained by) the output columns. Good point though - the output of multiple nested REPLACE accelerated by applying a binary collation will retain the new collation cast and could affect subsequent operations.


    Does this mean though, that internally the BIN collation sequence should be either CS or CI, depending on what the default is for the database?

    I always use Latin1_General_BIN and now I'm not sure if that is CS or CI!


    It's neither case nor accent sensitive but there's evidence here that it's smirk-sensitive:
    Latin1_General_BIN_:s
    Latin1_General_BIN_:]


    Surely you mean that Latin1_General_BIN is case sensitive (CS) and accent sensitive (AS) ?

    Binary collation will definitely find 'a' and 'A' to be different and 'á' to be different to 'a'.

    That is part of the reason it is quicker for CHARINDEX and equality testing - it doesn't have to compare 'a' to 'A','á','Á' etc... it is either 'a' or NOT 'a'.

    I do agree that it is smirk sensitive though :o) != :O)


    Oops! Smirk in me eyes yes of course. "binary" is a bit of a giveaway.


    Doh! (Homer Simpson moment)



    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
    Since random numbers are too important to be left to chance, let's generate some!
    Learn to understand recursive CTEs by example.
    Splitting strings based on patterns can be fast!
    Post #1424288
    Posted Wednesday, February 27, 2013 1:51 AM


    SSCertifiable

    SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

    Group: General Forum Members
    Last Login: Today @ 9:52 AM
    Points: 7,107, Visits: 13,462
    dwain.c (2/26/2013)
    ChrisM@Work (2/26/2013)
    mister.magoo (2/26/2013)
    ChrisM@Work (2/26/2013)
    dwain.c (2/26/2013)
    ChrisM@Work (2/26/2013)
    peter-757102 (2/26/2013)
    Using a binary comparisons is a good speedup for specific cases and good to be aware of and certainly worth mentioning in bold with its benefits. What it is not however is a good in a generic solution that people can copy paste and directly use without unexpected side-effects. Collations do not exist for nothing and most of the searches in a database will be case insensitve for valid reasons.


    The collation changes to DS8K suggested by Mister Magoo are all internal to the function and are not inherited by (or retained by) the output columns. Good point though - the output of multiple nested REPLACE accelerated by applying a binary collation will retain the new collation cast and could affect subsequent operations.


    Does this mean though, that internally the BIN collation sequence should be either CS or CI, depending on what the default is for the database?

    I always use Latin1_General_BIN and now I'm not sure if that is CS or CI!


    It's neither case nor accent sensitive but there's evidence here that it's smirk-sensitive:
    Latin1_General_BIN_:s
    Latin1_General_BIN_:]


    Surely you mean that Latin1_General_BIN is case sensitive (CS) and accent sensitive (AS) ?

    Binary collation will definitely find 'a' and 'A' to be different and 'á' to be different to 'a'.

    That is part of the reason it is quicker for CHARINDEX and equality testing - it doesn't have to compare 'a' to 'A','á','Á' etc... it is either 'a' or NOT 'a'.

    I do agree that it is smirk sensitive though :o) != :O)


    Oops! Smirk in me eyes yes of course. "binary" is a bit of a giveaway.


    Doh! (Homer Simpson moment)


    About once a week, mate. Now I'm an old git I can blame it on declining memory. When I was a sprightly youngster it was more embarrassing.


    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
    Exploring Recursive CTEs by Example Dwain Camps
    Post #1424397
    « Prev Topic | Next Topic »

    Add to briefcase «««5253545556»»»

    Permissions Expand / Collapse