Display all column information in a single cell? Got the result, but not able to understand the concept

  • Hi

    My objective is to display all the values of a particular column in a single cell.

    tablename:

    Iden   Code
      2    BB
      4    BB
      4    CH

    The result should be something like: BB,BB,CH.

    I used the following to achieve it:

    DECLARE @FINAL VARCHAR(40)
    SELECT @FINAL = ''
    SELECT @FINAL = @FINAL + Code + ',' FROM tablename

    But I can't understand the concept, or how this works.

    Can anyone explain it?

  • It's called a custom aggregate (I think). It executes the expression once for each row in the result set. You can do it with FOR XML PATH() too.

    Take a look at this thread:

    http://www.sqlservercentral.com/Forums/Topic462961-145-1.aspx
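    For reference, here is a minimal sketch of the FOR XML PATH() approach applied to the original question (table and column names are taken from the original post; row order is not guaranteed without an ORDER BY):

    ```sql
    -- Build one comma-separated string from the Code column.
    -- The subquery emits ',BB,BB,CH'; STUFF() removes the leading comma.
    SELECT STUFF(
               (SELECT ',' + Code
                  FROM dbo.tablename
                   FOR XML PATH('')),
               1, 1, '') AS CsvString;
    ```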

  • metoseeu (3/6/2008)


    My objective is to display all the values of a particular column in a single cell. [...] I used the following to achieve it:

    DECLARE @FINAL VARCHAR(40)
    SELECT @FINAL = ''
    SELECT @FINAL = @FINAL + Code + ',' FROM tablename

    But I can't understand the concept, or how this works. Can anyone explain it?

    This is a bit of an ugly trick. You declare a variable, and then use that variable inside the SELECT statement. It exploits the fact that, in this case, SQL Server will "process" every row that is to be returned, one row at a time. For every row, the statement reuses the previous value of the variable and appends the current row's value to it.
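    Annotating the original snippet may make the row-by-row accumulation clearer. This is only a sketch; the ISNULL pattern used here to avoid a stray separator also appears later in this thread:

    ```sql
    -- @FINAL starts as NULL; ISNULL() suppresses the separator on the
    -- first row. On every subsequent row, @FINAL still holds the result
    -- built from the previous rows, so values accumulate left to right.
    DECLARE @FINAL VARCHAR(8000);

    SELECT @FINAL = ISNULL(@FINAL + ',', '') + Code
      FROM dbo.tablename;   -- table/column names from the original post

    SELECT @FINAL;          -- e.g. BB,BB,CH (row order is not guaranteed)
    ```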

    A nicer and faster solution, which works on SQL Server 2005 and later, is the one Robert linked to.

    Regards,

    Andras


    Andras Belokosztolszki, MCPD, PhD
    GoldenGate Software

  • Ugly trick? why?

    Custom aggregates are an extremely useful feature. While lists can be built with "FOR XML PATH", some things can't, such as a binary OR (not a sum) over a list of integers. I love it, and I wish I had something like this on other servers.

    Also, such loops run extremely fast compared to cursors.
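    Robert's bitwise-OR case is a good illustration of something FOR XML PATH cannot express directly. A sketch, using a hypothetical table dbo.Flags with an INT column FlagValue:

    ```sql
    -- Fold a bitwise OR across all rows using the same variable trick.
    DECLARE @AllFlags INT;
    SELECT @AllFlags = 0;

    SELECT @AllFlags = @AllFlags | FlagValue
      FROM dbo.Flags;    -- hypothetical table and column names

    SELECT @AllFlags;    -- the OR of every FlagValue in the table
    ```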

  • A nicer and faster solution that works on SQL Server 2005 and later is one that Robert has linked to

    Which of the many solutions are you referring to? There are several on that thread.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Robert (3/8/2008)


    Ugly trick? why?

    Custom aggregates are an extremely useful feature. While lists can be built with "FOR XML PATH", some things can't, such as a binary OR (not a sum) over a list of integers. I love it, and I wish I had something like this on other servers.

    Also, such loops run extremely fast compared to cursors.

    I agree... the FOR XML PATH solution isn't much faster than the custom aggregate in the presence of an index that benefits both methods. And things like a million-row running total certainly couldn't be done in 7 seconds without the custom-aggregate method.

    --Jeff Moden



  • I agree... the FOR XML PATH solution isn't much faster than the custom aggregate in the presence of an index beneficial to both methods. And things like a million row running total certainly couldn't be done in 7 seconds without using the custom aggregate method.

    I have to disagree with this. While the SELECT method is adequate for smaller tables, it does not come close to FOR XML PATH at generating delimited strings. I did a few speed tests with statistics on and found that XML was far superior at creating a row string.

    The code for the test table:

    /****** Object: Table [dbo].[Tally] Script Date: 02/06/2008 12:14:21 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    CREATE TABLE [dbo].[Numbers](
        [Nbr] [int] NOT NULL
        CONSTRAINT [PK_Tally_N12] PRIMARY KEY CLUSTERED
        (
            [Nbr] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
                IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
                ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 100) ON [PRIMARY]
    ) ON [PRIMARY]

    INSERT INTO Numbers (Nbr) VALUES (0);

    WHILE (SELECT MAX(Nbr) FROM Numbers) < 1000000
        INSERT INTO Numbers
        SELECT Nbr + (SELECT MAX(Nbr) + 1 FROM Numbers)
          FROM Numbers;

    The code used to run the tests. Note: I just changed the WHERE clause to specify the number of rows. Additionally, I ran each statement separately so as not to skew the results.

    --select method
    DECLARE @var VARCHAR(MAX)
    SELECT @var = ''
    SET STATISTICS IO ON
    SET STATISTICS TIME ON
    SELECT @var = @var + CAST(LTRIM(RTRIM(Nbr)) AS VARCHAR(10)) + ','
      FROM Numbers
     WHERE Nbr <= 10000
    SET STATISTICS TIME OFF
    SET STATISTICS IO OFF

    --for xml path method
    DECLARE @var VARCHAR(MAX)
    SELECT @var = ''
    SET STATISTICS IO ON
    SET STATISTICS TIME ON
    SELECT CAST(LTRIM(RTRIM(Nbr)) AS VARCHAR(10)) + ','
      FROM Numbers
     WHERE Nbr <= 10000
       FOR XML PATH('')
    SET STATISTICS TIME OFF
    SET STATISTICS IO OFF

    Test 1 - 1,000,000 rows

    Select method: I killed the process after 5 minutes.

    FOR XML PATH:

    (1000001 row(s) affected)
    Table 'Numbers'. Scan count 1, logical reads 1615, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    SQL Server Execution Times: CPU time = 844 ms, elapsed time = 1060 ms.

    Test 2 - 65,000 rows

    Select method:

    SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms.
    Table 'Numbers'. Scan count 1, logical reads 108, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    SQL Server Execution Times: CPU time = 9079 ms, elapsed time = 9070 ms.

    FOR XML PATH:

    (65536 row(s) affected)
    (65001 row(s) affected)
    Table 'Numbers'. Scan count 1, logical reads 108, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    SQL Server Execution Times: CPU time = 31 ms, elapsed time = 126 ms.

    Test 3 - 10,000 rows

    Select method:

    Table 'Numbers'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    SQL Server Execution Times: CPU time = 250 ms, elapsed time = 258 ms.

    FOR XML PATH:

    (10001 row(s) affected)
    Table 'Numbers'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    SQL Server Execution Times: CPU time = 16 ms, elapsed time = 87 ms.

    As you can see, the XML method outperformed the SELECT method in every instance while a clustered index was present. I believe FOR XML PATH is the new and improved method for creating column strings.

  • I understand the SELECT method is still a good solution and completely backward compatible. I also understand that you probably will not need a string of this magnitude, but why use a more expensive method when a faster one is available?

  • Adam Haines (3/8/2008)


    I understand the SELECT method is still a good solution and completely backward compatible. I also understand that you probably will not need a string of this magnitude, but why use a more expensive method when a faster one is available?

    Heh... because you forgot... 😀 Code is not the only thing to consider where performance is concerned.

    And, let's do an example where the SELECT method actually beats the XML method... on a million-row example similar to yours... by the way... compare the way you built your Tally table against the way I built my test table... on my box, your method produces 1 column in about 15 seconds... mine produces 3 columns in about 9 seconds... "why use a more expensive method when a faster one is available?" :hehe: (Please don't get mad... I'm just having a little fun...)

    Here's the code to set up the test table...

    --drop table TestData
    --===== Create and populate test table.
    --      Column "RowNum" contains unique row numbers
    --      Column "SomeID" contains non-unique numbers
    --      (should be 1/400th of the row count to allow about 400 concatenated items per SomeID).
    --      Column "SomeCode" has a range of "AA" to "ZZ" non-unique 2 character strings
    SELECT TOP 1000000 --<<<LOOK! CHANGE THIS NUMBER TO CHANGE THE NUMBER OF ROWS!
           RowNum   = IDENTITY(INT,1,1),
           SomeID   = ABS(CHECKSUM(NEWID()))%2500+1, --<<<LOOK! CHANGE THIS NUMBER TO 1/400th THE ROW COUNT
           SomeCode = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
                    + CHAR(ABS(CHECKSUM(NEWID()))%26+65)
      INTO dbo.TestData
      FROM Master.dbo.SysColumns t1,
           Master.dbo.SysColumns t2

    --===== A table is not properly formed unless a Primary Key has been assigned
    ALTER TABLE dbo.TestData
      ADD PRIMARY KEY CLUSTERED (RowNum)

    --===== Add the index to support both types of queries
    CREATE INDEX IXC_TestData_Cover1 ON dbo.TestData (SomeID, SomeCode)

    Here's the concatenation function that Robert is talking about using with the SELECT...

    CREATE FUNCTION dbo.fnConcatTest (@SomeID INT)
    RETURNS VARCHAR(8000) AS
    BEGIN
        DECLARE @Return VARCHAR(8000)
        SELECT @Return = ISNULL(@Return+',','') + SomeCode
          FROM dbo.TestData
         WHERE SomeID = @SomeID
        RETURN @Return
    END

    ...and, here's the test code... I put the concatenation function first so that the XML can benefit from any stats the SELECT may generate... just to be fair 😉

    --===== Suppress the auto-display of rowcounts for appearance
    SET NOCOUNT ON

    --===== Test and time the "Ugly trick" method
    PRINT REPLICATE('=',80)
    PRINT '"Ugly trick" code...'
    SET STATISTICS TIME ON
    SELECT SomeID, dbo.fnConcatTest(SomeID) AS CSVString
      FROM dbo.TestData
     GROUP BY SomeID
    SET STATISTICS TIME OFF

    --===== Test and time the XML method
    PRINT REPLICATE('=',80)
    PRINT 'XML code...'
    SET STATISTICS TIME ON
    SELECT t1.SomeID,
           STUFF(
               (SELECT ',' + t2.SomeCode
                  FROM TestData t2
                 WHERE t2.SomeID = t1.SomeID
                   FOR XML PATH('')),
               1, 1, '') AS CsvString
      FROM TestData t1
     GROUP BY t1.SomeID
    SET STATISTICS TIME OFF
    PRINT REPLICATE('=',80)

    And here are the results...

    ================================================================================
    "Ugly trick" code...

    SQL Server Execution Times:
    CPU time = 3125 ms, elapsed time = 4289 ms.
    ================================================================================
    XML code...

    SQL Server Execution Times:
    CPU time = 4219 ms, elapsed time = 5273 ms.
    ================================================================================

    And, check it out... which code is actually easier to read?

    Still think the "ugly trick" method is just for small tables? :Whistling:

    --Jeff Moden



  • Almost forgot... here's the output when STATISTICS IO is ON... "Ugly trick" method beats the hell out of the XML method... 😀

    ================================================================================
    "Ugly trick" code...

    Table 'TestData'. Scan count 1, logical reads 1985, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    ================================================================================
    XML code...

    Table 'TestData'. Scan count 2501, logical reads 11463, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    ================================================================================

    Now... what were people saying about the "ugly trick" method? :w00t:

    --Jeff Moden



  • One more thing... you left an extra comma at the end of your two outputs 😉

    --Jeff Moden



  • (Please don't get mad... I'm just having a little fun...)

    Not at all. :) I have actually been looking for this piece of code for some time.

    One more thing... you left an extra comma at the end of your two outputs

    Yeah, I got lazy; these can be removed by adding a simple CASE expression. Additionally, the SELECT method originally posted adds the extra comma too. 😛

    DECLARE @FINAL VARCHAR(40)
    SELECT @FINAL = ''
    SELECT @FINAL = @FINAL + Code + ',' FROM tablename

    😉
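    For what it's worth, one CASE-free way to drop that trailing comma is to trim it after the loop finishes. A sketch (it assumes the table contains at least one row, since LEFT() with a negative length raises an error):

    ```sql
    DECLARE @FINAL VARCHAR(40);
    SELECT @FINAL = '';
    SELECT @FINAL = @FINAL + Code + ',' FROM tablename;

    -- Remove the last character (the trailing comma):
    SELECT LEFT(@FINAL, LEN(@FINAL) - 1);
    ```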

    Also, I think the performance hit is directly correlated to the manner in which you are using FOR XML. I have only used this technique to create a string from a set of data, not to process each row. FOR XML will perform substantially worse used this way because it is supposed to convert a set of data into XML. I believe, in the scenario you provided, it would be better to create a derived table and inner join the IDs together rather than process the XML for each row.

    It seems the example I demonstrated was a traditional use, while your example showed how the method can perform worse under CERTAIN conditions. I still believe FOR XML will outperform the SELECT method 9 times out of 10 under traditional circumstances. I do know one thing for sure, and that is how not to use FOR XML PATH.

    -Adam

  • That's exactly what I was talking about... you asked why I would use something that I knew was slower... I knew that for the application I wanted, it would be faster. And, that's the key to writing a lot of performance code... "A Developer must not guess... a Developer must KNOW."

    That's why folks like you and me test things and why we hardly ever deal in absolutes... we know that something that causes huge problems when used in one manner, is a "performance god-send" when used in an ever so slightly different manner. Another example is finding ranges of missing ID's in a column. Everyone knows that correlated sub-queries usually suck the life blood out of a query... but when used properly for certain situations, there's (almost) nothing faster.

    (Heh... had to avoid the absolute of "nothing". 😛)

    FOR XML will perform substantially worse used this way because it is supposed to convert a set of data into XML.

    By the way, that's exactly what it does, and the "set" of data isn't for each row... it's for the "distinct" data formed by the GROUP BY.

    --Jeff Moden



    Touché, friend. :)

    That's exactly what I was talking about... you asked why I would use something that I knew was slower... I knew that for the application I wanted, it would be faster. And, that's the key to writing a lot of performance code... "A Developer must not guess... a Developer must KNOW."

    That's why folks like you and me test things and why we hardly ever deal in absolutes... we know that something that causes huge problems when used in one manner, is a "performance god-send" when used in an ever so slightly different manner. Another example is finding ranges of missing ID's in a column. Everyone knows that correlated sub-queries usually suck the life blood out of a query... but when used properly for certain situations, there's (almost) nothing faster.

    (Heh... had to avoid the absolute of "nothing". )

    Well said.

    By the way, that's exactly what it does, and the "set" of data isn't for each row... it's for the "distinct" data formed by the GROUP BY.

    Yep. This is exactly what I meant. It runs once for each row returned by the main query. In my tests I was running FOR XML one time.

    Yep. This is exactly what I meant. It runs once for each row returned by the main query. In my tests I was running FOR XML one time.

    Perfect... just didn't want you to think that I instantiated a triangular join or anything like that.

    Both methods I used create a bit of a correlated subquery. This is one of those times when a correlated subquery isn't so bad. Process a million rows in about 4 seconds? Not bad for either method. I certainly wouldn't send either method back to a developer for correction after a code review.

    --Jeff Moden



Viewing 15 posts - 1 through 15 (of 23 total)
