SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Split input string into multicolumn - multirows


Split input string into multicolumn - multirows

Author
Message
dwain.c
dwain.c
SSCertifiable
SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)

Group: General Forum Members
Points: 7209 Visits: 6431
Here's another way:


DECLARE @MyString VARCHAR(8000) =
'30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|'

SELECT col1=MAX(CASE ItemNumber%4 WHEN 1 THEN Item END)
,col2=MAX(CASE ItemNumber%4 WHEN 3 THEN Item END)
FROM PatternSplitCM(@MyString, '[0-9]')
WHERE [Matched] = 1
GROUP BY ItemNumber/4




PatternSplitCM is described here: http://www.sqlservercentral.com/articles/String+Manipulation/94365/

Not sure it will be faster than using DelimitedSplit8K but since you're calling that function twice, it might give it a run for its money.

Since this is posted in a SQL 2005 forum, you'll need to note that the Tally Table CTE used in PatternSplitCM in the article requires SQL 2008. However it is easily replaced by an Itzik Ben-Gan style CTE tally table like the one that appears (I think) in PatternSplitQU (also in the article).

Edit: Revised (simplified) my vector of attack slightly.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
murthyvs
murthyvs
SSC-Enthusiastic
SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)

Group: General Forum Members
Points: 134 Visits: 86
Hi - the solution works as expected. Is there anyway to increase the size of the input string to 100,000. Its only limited to 8,000.
Thanks much!
dwain.c
dwain.c
SSCertifiable
SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)SSCertifiable (7.2K reputation)

Group: General Forum Members
Points: 7209 Visits: 6431
Jeff has said (he can correct me if I'm wrong) that DelimitedSplit8K is optimized for VARCHAR(8000) so change it to VARCHAR(MAX) with that in mind.

I think although I don't have rigid testing results that PatternSplitCM is not significantly affected if you change to VARCHAR(MAX).


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
murthyvs
murthyvs
SSC-Enthusiastic
SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)

Group: General Forum Members
Points: 134 Visits: 86
Yes, I have tried it. I have observed performance degradation. Query runs much slower with VARCHAR(MAX).
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)

Group: General Forum Members
Points: 84893 Visits: 41069
murthyvs (12/14/2012)
Hi - the solution works as expected. Is there anyway to increase the size of the input string to 100,000. Its only limited to 8,000.
Thanks much!


Unless your example is just a striking coincidence, I've seen such a thing before and I'd like to make a totally alternate suggestion if what I suspect is actually true.

It would appear that both fields in the string are sequential in their own right. If that is true (as it was with what I've seen before), then it will save you a huge amount of aggravation, the server a huge amount of processing time and I/O, your network an unnecessary blurb of 100K bytes for each usage, and the loss of feeling in your left arm when everyone jumps on you for resource usage, if you were to send just a starting value for each field and the number of field-pairs and let SQL very quickly generate even millions of pairs more quickly than you could ever transmit them.

For example, your example has fields that start with 30 and 38469 respectively and then both increment by 1 for 9 pairs of fields. Using the method I proposed above, you would pass only those 3 parameters over the pipe and then let SQL Server have at it. That would be capable of generating millions of rows if you needed it an it would happen a whole lot quicker than trying to send a million such field pairs over the pipe. For example, here code that does what I said o a million rows. I believe you'll find it to be remarkably quick.

--===== These would be parameters in a stored procedure
DECLARE @F1Start INT,
@F2Start INT,
@Pairs INT
;
SELECT @F1Start = 30,
@F2Start = 38469,
@Pairs = 1000000
;
--===== This would do the deed as previously described.
WITH
cteTally AS
(
SELECT TOP (@Pairs)
N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
)
SELECT Field1 = @F1Start + N,
Field2 = @F2Start + N
INTO #Temp
FROM cteTally
;




--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)

Group: General Forum Members
Points: 84893 Visits: 41069
dwain.c (12/14/2012)
Jeff has said (he can correct me if I'm wrong) that DelimitedSplit8K is optimized for VARCHAR(8000) so change it to VARCHAR(MAX) with that in mind.

I think although I don't have rigid testing results that PatternSplitCM is not significantly affected if you change to VARCHAR(MAX).


It is, indeed, optimized for 8K. You also have to remember that VARCHAR(MAX) doesn't like to be joined to even by a Tally Table and will cause at least a 2 to 1 slowdown just by changing the function variables from VARCHAR(8000) to VARCHAR(MAX) even if the data stays under 8K.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
murthyvs
murthyvs
SSC-Enthusiastic
SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)

Group: General Forum Members
Points: 134 Visits: 86
Hi Jeff - Sorry those things are not pairs/dont increment by one. I just made that example in excel.
murthyvs
murthyvs
SSC-Enthusiastic
SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)

Group: General Forum Members
Points: 134 Visits: 86
Yes, DelimitedSplit8K is miraculously optimized for VARCHAR(8000). It runs very fast!
murthyvs
murthyvs
SSC-Enthusiastic
SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)SSC-Enthusiastic (134 reputation)

Group: General Forum Members
Points: 134 Visits: 86
Hi - Quick question. Do you know why VARCHAR(MAX) doesnt work in this query?

It returns the following error:

Msg 240, Level 16, State 1, Line 110
Types don't match between the anchor and the recursive part in column "rowstrt" of recursive query "cte".

Thanks!

nigelrivett (12/11/2012)
This will do something like that.
It caters for any number of fields - just amend the final select for the number required

declare @s varchar(8000)
select @s = '30;38469;1|31;38470;1|32;38471;1|33;38472;1|34;38473;1|35;38474;1|36;38475;1|37;38476;1|38;38477;1|'
select @s = '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|'

declare @rowterm varchar(1) = '|'
declare @fieldterm varchar(1) = ';'

;with cte as
(
select rowstrt = 1, rowend = charindex(@rowterm,@s)-1, seq = 1
union all
select rowstrt = rowend+2, rowend = charindex(@rowterm,@s,rowend+2)-1, seq = seq + 1
from cte where charindex(@rowterm,@s,rowend+2)<>0
) ,
cte2 as
(
select s = substring(@s, rowstrt,rowend-rowstrt+1), seq
from cte
) ,
cte3 as
(
select fldstrt = 1, fldend = charindex(@fieldterm,s)-1, seq, s, fldseq = 1 from cte2
union all
select fldstrt = fldend+2,
fldend = case when charindex(@fieldterm,s,fldend+2) <> 0 then charindex(@fieldterm,s,fldend+2)-1 else len(s) end,
seq, s, fldseq = fldseq+1
from cte3 where fldend < len(s)
)
select s1.s, s2.s, s3.s
from
(select seq, s = substring(s,fldstrt,fldend-fldstrt+1) from cte3 where fldseq = 1) s1
join (select seq, s = substring(s,fldstrt,fldend-fldstrt+1) from cte3 where fldseq = 2) s2 on s1.seq = s2.seq
left join (select seq, s = substring(s,fldstrt,fldend-fldstrt+1) from cte3 where fldseq = 3) s3 on s1.seq = s3.seq

Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)

Group: General Forum Members
Points: 84893 Visits: 41069
Yes... both variables and any/all string literals must be defined as VARCHAR(MAX) including the ones that hold the single character delimiters. It's one of the great joys of recursive CTE's. :-)

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search