SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Split string using Tally Table


Split string using Tally Table

Author
Message
Eric Mamet
Eric  Mamet
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1585 Visits: 893
Like a few others, I got it right a bit by chance by looking at the number of dots.

Frankly, do we need such a complicated statement to learn something?
I don't think so...

By the way, I did not learn anything with that question!
Oleg Netchaev
Oleg Netchaev
SSCommitted
SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)

Group: General Forum Members
Points: 1773 Visits: 1813
Jeff Moden (7/2/2010)
BWAA-HAA!!!! First, this is NOT a good example of Tally Table code. It doesn't use a Tally Table and it doesn't use anything that could be called efficient. Anyone who uses a recursive CTE to generate Tally numbers just doesn't know what hidden RBAR is. :-P

I agree with what someone else has stated... this code should be used as an example of how NOT to write code and how NOT to accomplish a split. Same goes for any example in this thread that uses a recursive CTE to do the split. ;-)

I hope no one actually copies the code to use as a split function but I will say it was very interesting in how many "extra" things were added to the code in an attempt at obfuscation. Heh... maybe it should be titled "job security methods". Hehe

The urge to submit a post was burning me for a couple of days, but I decided to wait, because I was afraid that I will not be able to find the way to type what I think about the script in QotD without sounding rude, so I am glad that Jeff, Hugo and vk-kirov stood up to it. I agree that this is one of the finest examples how NOT TO DO something. There are some valid reasons to use recursive CTE, for example to write BOM queries in 2005 (HierarchyID is not yet available), but these reasons do not extend to creating a tally table, specifically with @text declared as varchar(500) but the recursion level restricted to 32767 (???).

with tally (number) as
(
select
top (len(@text))
row_number() over (order by [object_id]) number
from sys.objects
)



should do the trick right?

Using xml to split delimited values is relatively cheap and not difficult to understand (Does not include the job security though :heheSmile. Here is one example based on simply replacing the delimiter with closing and opening the node of arbitrary name and appending opening node to the head and closing to the tail:

use AdventureWorks;
go

declare @delimiter char(1);
declare @text nvarchar(500);
declare @xml xml;

-- set variable values
select
@delimiter = '.',
@text = 'This t-sql will split these sentences into rows.' +
'How many rows will be returned?.' + 'M.a.y.b.e..n.o.n.e.?',
@xml = '<r>' + replace(@text, @delimiter, '</r><r>') + '</r>';

-- here is the "split" in all its glory
select
item.value('text()[1]', 'varchar(100)') single_item
from @xml.nodes('//r') R(item);



The above happily returns desired output:

single_item
-----------------------------------------------
This t-sql will split these sentences into rows
How many rows will be returned?
M
a
y
b
e
NULL
n
o
n
e
?


Oleg
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)

Group: General Forum Members
Points: 84453 Visits: 41061
Oleg Netchaev (7/6/2010)
Jeff Moden (7/2/2010)
BWAA-HAA!!!! First, this is NOT a good example of Tally Table code. It doesn't use a Tally Table and it doesn't use anything that could be called efficient. Anyone who uses a recursive CTE to generate Tally numbers just doesn't know what hidden RBAR is. :-P

I agree with what someone else has stated... this code should be used as an example of how NOT to write code and how NOT to accomplish a split. Same goes for any example in this thread that uses a recursive CTE to do the split. ;-)

I hope no one actually copies the code to use as a split function but I will say it was very interesting in how many "extra" things were added to the code in an attempt at obfuscation. Heh... maybe it should be titled "job security methods". Hehe

The urge to submit a post was burning me for a couple of days, but I decided to wait, because I was afraid that I will not be able to find the way to type what I think about the script in QotD without sounding rude, so I am glad that Jeff, Hugo and vk-kirov stood up to it. I agree that this is one of the finest examples how NOT TO DO something. There are some valid reasons to use recursive CTE, for example to write BOM queries in 2005 (HierarchyID is not yet available), but these reasons do not extend to creating a tally table, specifically with @text declared as varchar(500) but the recursion level restricted to 32767 (???).

with tally (number) as
(
select
top (len(@text))
row_number() over (order by [object_id]) number
from sys.objects
)



should do the trick right?

Using xml to split delimited values is relatively cheap and not difficult to understand (Does not include the job security though :heheSmile. Here is one example based on simply replacing the delimiter with closing and opening the node of arbitrary name and appending opening node to the head and closing to the tail:

use AdventureWorks;
go

declare @delimiter char(1);
declare @text nvarchar(500);
declare @xml xml;

-- set variable values
select
@delimiter = '.',
@text = 'This t-sql will split these sentences into rows.' +
'How many rows will be returned?.' + 'M.a.y.b.e..n.o.n.e.?',
@xml = '<r>' + replace(@text, @delimiter, '</r><r>') + '</r>';

-- here is the "split" in all its glory
select
item.value('text()[1]', 'varchar(100)') single_item
from @xml.nodes('//r') R(item);



The above happily returns desired output:

single_item
-----------------------------------------------
This t-sql will split these sentences into rows
How many rows will be returned?
M
a
y
b
e
NULL
n
o
n
e
?


Oleg


Hi Oleg,

There are, indeed, a number of ways to create a Tally table including the one you used here and a similar one on another post of yours. But you do have to be careful. sys.Objects can contain very few objects. On a brand new database in 2005, it will only contain about 47 objects and even squaring that number will only return 2209 rows. Instead, my recommendation is to refer to Master.sys.All_Columns which will have at least 4000 rows in it on a full installation.

So far as splits go, using one form of Tally Table or another to split smaller items (like INTs) is usually faster than trying to split such items with XML (I'm working on the tests to show you that on the other thread). Splitting larger items such as sentences is generally done faster (depending on the size) with the XML split. In either case, it's generally preferable to use an iTVF (inline Table Valued Function) over a stored procedure to add the utility of being able to pass a whole column to be split and not just a single variable.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Oleg Netchaev
Oleg Netchaev
SSCommitted
SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)

Group: General Forum Members
Points: 1773 Visits: 1813
Jeff,

Thank you very much for your suggestions. You make a very good point about the iTVF, it is definitely more useful than the proc. In my defense about the tally, I can say that my decision about a good-enough-for-qotd-scope tally script is based on the following assumption:

Someone playing with the script either:

"Got a job" :-) and thus has the development copy of the prod database with the sufficient number of records in sys.objects

"Don't got a job" w00t and thus has AdventureWorks database with the sufficient number of records in sys.objects (about 1800 records, 3.24 mln if cross joined).

iTVF works very well with cross apply, with which I cannot play at work as we are still at compat 80 2005, so I have to constantly restate my cross apply statements as joins (when I can) when moving the test scripts between the databases at work and AdventureWorks at home.

Oleg
SQLRNNR
SQLRNNR
SSC-Dedicated
SSC-Dedicated (31K reputation)SSC-Dedicated (31K reputation)SSC-Dedicated (31K reputation)SSC-Dedicated (31K reputation)SSC-Dedicated (31K reputation)SSC-Dedicated (31K reputation)SSC-Dedicated (31K reputation)SSC-Dedicated (31K reputation)

Group: General Forum Members
Points: 31788 Visits: 18550
thanks for the question



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw

TomThomson
TomThomson
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14156 Visits: 12197
Well, I got it wrong - after looking briefly at the code and deciding there was now way I was going to try to make sense of such an ill-formated mess, feeling flabbergasted at the misuse of a recursive CTE to generate a tally table, and concluding that the where clause indicated that all I had to do was count the dots and add 1 unless it was a trck question and the code was going to return an error instead of any rows (and there was no way I was going to syntax-check anything with that layout, so assume no error) I counted the dots wrong (by leaving out the firstline - I had scrolled it off th top of my window and forgot about it).

I hate code with unneeded complexity, especially when it's layed out so as to obscure its structure. I hate code that burns CPU cycles as if they were going out of fashion. It's a good question in that it does demonstrate that the select list is irrelevant when counting rows. It's a bad question in that it demonstrates a way that code should never be written.

Tom

michael.kaufmann
michael.kaufmann
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1321 Visits: 1082
jcrawf02 (7/2/2010)

Not sure why this one is tripping folks up, if you just count the delimiters, you can see it will return 13 pieces?

...anyway...good question, I'm sure it's introduced some folks to this idea that haven't seen it before.


That's the point I don't get--when counting the delimiters of the @text variable, I'll only come up with 12.
What am I missing here?

Thanks
michael.kaufmann
michael.kaufmann
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1321 Visits: 1082
michael.kaufmann (7/13/2010)
jcrawf02 (7/2/2010)

Not sure why this one is tripping folks up, if you just count the delimiters, you can see it will return 13 pieces?

...anyway...good question, I'm sure it's introduced some folks to this idea that haven't seen it before.


That's the point I don't get--when counting the delimiters of the @text variable, I'll only come up with 12.
What am I missing here?


Just found out--was only looking at the left of the delimiter... ;-)
TomThomson
TomThomson
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14156 Visits: 12197
michael.kaufmann (7/13/2010)
jcrawf02 (7/2/2010)

Not sure why this one is tripping folks up, if you just count the delimiters, you can see it will return 13 pieces?

...anyway...good question, I'm sure it's introduced some folks to this idea that haven't seen it before.


That's the point I don't get--when counting the delimiters of the @text variable, I'll only come up with 12.
What am I missing here?

Thanks


You have to add 1: there's a piece before each delimiter and another piece after the last delimiter.

Tom

Hugo Kornelis
Hugo Kornelis
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10787 Visits: 11966
michael.kaufmann (7/13/2010)
jcrawf02 (7/2/2010)

Not sure why this one is tripping folks up, if you just count the delimiters, you can see it will return 13 pieces?

...anyway...good question, I'm sure it's introduced some folks to this idea that haven't seen it before.


That's the point I don't get--when counting the delimiters of the @text variable, I'll only come up with 12.
What am I missing here?

Thanks

Both the text before the first delimiter and the text after the last delimiter will be a row, as each bit of text between two delimiters. That results in #rows = #delimiters + 1.

You can see this in the WHERE clause:
WHERE (...) = @StringDelimiter -- Twelve rows, for the twelve delimiters.
OR Number - 1 = LEN(@Text) -- One row, for the end of the text.

If the last character is a delimiter (that is, if the text had ended in a dot instead of a question mark), the number of rows would be equal to the number of delimiters, with the "last" row matching both of the OR'ed conditions.


Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search