Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««6364656667»»»

Tally OH! An Improved SQL 8K “CSV Splitter” Function Expand / Collapse
Author
Message
Posted Tuesday, June 10, 2014 2:45 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Tuesday, December 23, 2014 12:10 PM
Points: 13,330, Visits: 12,829
timgapinski-841846 (6/10/2014)
CREATE FUNCTION permission denied in database 'tempdb'.

not sure why i do not have permission over the tempdb but i could not get started with these much needed tools because of this error above.

i have permission over all other tables.

any suggestions?



Create a new database and work in there.


_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1579418
Posted Wednesday, June 25, 2014 10:50 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 4:23 PM
Points: 35,821, Visits: 32,494
Paul White (5/15/2014)
Jeff Moden (5/15/2014)
I'd not seen that happen in my testing but know that such a thing could happen. Thanks for the heads up and the patch. When I get my head above water, I'll see if I can incorporate it. Thanks!

Yes it was always a calculated risk. It seems the new cardinality estimator in 2014 results in a plan favouring a table spool, which effectively caches a single NEWID call. The semantics of NEWID are a bit of a mess, but that's a whole other discussion.

Please don't incorporate the hack fix into your fine script - it is undocumented and unsupported after all. No doubt some revision to the script will be necessary, but please try to avoid 8690. Cheers.


I've finally had a bit of time to spend on this. I can't get the symptoms you spoke of to repeat themselves. Not sure why you may be having the problem.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1586223
Posted Friday, June 27, 2014 12:46 AM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Monday, December 22, 2014 12:15 AM
Points: 9,928, Visits: 11,207
Jeff Moden (6/25/2014)
I've finally had a bit of time to spend on this. I can't get the symptoms you spoke of to repeat themselves. Not sure why you may be having the problem.

Make sure you're on SQL Server 2014 and using the 120 model cardinality estimator.




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #1586724
Posted Wednesday, July 9, 2014 2:46 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, November 19, 2014 3:36 PM
Points: 21, Visits: 318
It seems that Paul is right, which explains why it appears that the memory optimized tally tables splitter beat CLR functions at first blush. (This isn't true sadly). Adding the traceflag removes the caching advantage.
Post #1590950
Posted Wednesday, August 6, 2014 12:48 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, October 9, 2014 12:54 PM
Points: 5, Visits: 344
I have a CSV file that I bulk loaded into a single column staging table.

I need a way to parse proper csv files that are comma delimited with optional quote text qualifiers
as the data may have embedded commas.
i.e. Jason,"900 N. May ST., #5"

using the Jeff's splitter it would get split into 3 columns i.e. Jason as column 1 900 N. May ST. as column 2 and #5 as column 3
I need it to keep the address together when there are quotes text qualifiers .

I was able to get the function from this article to work on my dev machine "Reaping the benefits of the Window functions in T-SQL"
http://www.sqlservercentral.com/articles/SQL+Server+2012/106903

But I need to deploy this to a server running 2008 and that function uses syntax that is only valid in 2012.

I would appreciate any help.
Post #1600064
Posted Wednesday, August 6, 2014 1:02 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 1:54 PM
Points: 2,549, Visits: 7,158
Jason-368451 (8/6/2014)
I have a CSV file that I bulk loaded into a single column staging table.

I need a way to parse proper csv files that are comma delimited with optional quote text qualifiers
as the data may have embedded commas.
i.e. Jason,"900 N. May ST., #5"

using the Jeff's splitter it would get split into 3 columns i.e. Jason as column 1 900 N. May ST. as column 2 and #5 as column 3
I need it to keep the address together when there are quotes text qualifiers .

I was able to get the function from this article to work on my dev machine "Reaping the benefits of the Window functions in T-SQL"
http://www.sqlservercentral.com/articles/SQL+Server+2012/106903

But I need to deploy this to a server running 2008 and that function uses syntax that is only valid in 2012.

I would appreciate any help.


Have you looked at Phil's article, The TSQL of CSV: Comma-Delimited of Errors? He explores some alternatives there.

I haven't created a 2K8 version of the CSV parsing code in the article as it uses a running total for marking delimiters active/inactive, quite inefficient without the window functions.
Post #1600068
Posted Wednesday, August 6, 2014 5:51 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 4:23 PM
Points: 35,821, Visits: 32,494
Jason-368451 (8/6/2014)
I have a CSV file that I bulk loaded into a single column staging table.

I need a way to parse proper csv files that are comma delimited with optional quote text qualifiers
as the data may have embedded commas.
i.e. Jason,"900 N. May ST., #5"

using the Jeff's splitter it would get split into 3 columns i.e. Jason as column 1 900 N. May ST. as column 2 and #5 as column 3
I need it to keep the address together when there are quotes text qualifiers .

I was able to get the function from this article to work on my dev machine "Reaping the benefits of the Window functions in T-SQL"
http://www.sqlservercentral.com/articles/SQL+Server+2012/106903

But I need to deploy this to a server running 2008 and that function uses syntax that is only valid in 2012.

I would appreciate any help.


The "easy" way I handle this at work is to run it through Import-CSV and pipe it to Export-CSV to another file and use BCP with a format file to do the imports based on the compound delimiters of "," and the leading and trailing quote. It's nasty fast.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1600170
Posted Wednesday, August 6, 2014 11:53 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Sunday, December 14, 2014 10:06 PM
Points: 379, Visits: 1,991
Jason-368451 (8/6/2014)
I have a CSV file that I bulk loaded into a single column staging table.

I need a way to parse proper csv files that are comma delimited with optional quote text qualifiers
as the data may have embedded commas.
i.e. Jason,"900 N. May ST., #5"

using the Jeff's splitter it would get split into 3 columns i.e. Jason as column 1 900 N. May ST. as column 2 and #5 as column 3
I need it to keep the address together when there are quotes text qualifiers .

I was able to get the function from this article to work on my dev machine "Reaping the benefits of the Window functions in T-SQL"
http://www.sqlservercentral.com/articles/SQL+Server+2012/106903

But I need to deploy this to a server running 2008 and that function uses syntax that is only valid in 2012.

I would appreciate any help.


Hey there. Another option is to use SQL CLR. I wrote a library called SQL# that contains a Stored Proc named SQL#.String_SplitResultIntoFields. This proc takes a single field from a result set and splits it. The split expression can accept Regular Expressions . Meaning, you can follow the example here and split your imported data in a way that accounts for both embedded commas and embedded text-qualifiers. The only thing that RegEx expression doesn't do is remove the text qualifiers from the field, but it might be possible to update the RegEx to do that. To be fair, this proc is in the Full version, not the Free version, but it does handle your situation quite cleanly. There is another proc called File_SplitIntoFields which reads directly from the import file so you can bypass BCP and its format files, but that option does not handle embedded newlines (but that might not be a problem anyway given those would probably break your existing import).

Take care,
Solomon..

EDIT:
The condensed RegEx from the link noted above is:
(?x)\s*,\s*(?=(?:[^"]*"[^"]*")*[^"]*$)






SQL# - http://www.SQLsharp.com/
Post #1600333
Posted Wednesday, September 3, 2014 12:24 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Sunday, October 12, 2014 4:00 PM
Points: 341, Visits: 751
I'm using the splitter (just downloaded the zip file) and noticed something odd. Dunno if its just me, but check this out:
select * from dbo.[DelimitedSplit8K](' ', ' ') split -- input is one space, delimiter is one space

yields:
ItemNumber	Item
1
2


So, I get two rows back from the splitter when I only expect one. What would need to change to handle this situation correctly? Or, is there a better way to split when the delimiter is a blank?

Oh, fudge! I get it. It's two null strings separated by a blank.

My bad!!
Post #1610217
Posted Wednesday, September 3, 2014 1:07 PM


SSChasing Mays

SSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing Mays

Group: General Forum Members
Last Login: Yesterday @ 7:28 PM
Points: 651, Visits: 3,015
gbritton1 (9/3/2014)
I'm using the splitter (just downloaded the zip file) and noticed something odd. Dunno if its just me, but check this out:
select * from dbo.[DelimitedSplit8K](' ', ' ') split -- input is one space, delimiter is one space

yields:
ItemNumber	Item
1
2


So, I get two rows back from the splitter when I only expect one. What would need to change to handle this situation correctly? Or, is there a better way to split when the delimiter is a blank?


It's two 0-length values separated by a blank. You will always get the number of delimiters+1.

DECLARE @txt1 varchar(100) = ',,',
@txt2 varchar(10) = ' ',
@txt3 varchar(10) = ' ';

select ItemNumber, Item, item_length = LEN(item)
from dbo.[DelimitedSplit8K](@txt1, ',') split; -- this will get you three blanks

select ItemNumber, Item, item_length = LEN(item)
from dbo.[DelimitedSplit8K](@txt2, ' ') split; -- this will get you 2 blanks

select ItemNumber, Item, item_length = LEN(item)
from dbo.[DelimitedSplit8K](@txt3, ' ') split; -- this will get you 2 3 blanks



Edit: noticed your edit after posting my reply. Also fixed a type in my code comments.


-- Alan Burstein



Read this article for best practices on asking questions.
Need to split a string? Try this (Jeff Moden)
Need a pattern-based string spitter? Try this (Dwain Camps)

"I can't stress enough the importance of switching from a 'sequential files' mindset to 'set-based' thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code. " -- Itzek Ben-Gan 2001

My blog
Post #1610229
« Prev Topic | Next Topic »

Add to briefcase «««6364656667»»»

Permissions Expand / Collapse