Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Combine different rows based on sequence Expand / Collapse
Author
Message
Posted Monday, October 15, 2012 5:26 PM
SSChasing Mays

SSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing Mays

Group: General Forum Members
Last Login: Today @ 3:22 PM
Points: 646, Visits: 3,776
Hello All,

I have some really ugly data coming in from several files. I am using SSIS to import into a single column staging table and I am incrementing a counter field so I believe I can select the data from my staging table matching the sequence of the data in the file.

Alas the sequence of the data matters.

See all my DDL below and proposed solution below.

In one scenario the 'first' two records need to be combined into one row. The self join technique works. The are other other scenarios where the first three records need to be combined. I guess I could do three joins but I would like to do something more dynamic if possible, not dynamic like dynamic sql, but rather define a parameter that defines how many rows to combine.

DECLARE @ComboFactor int --The number of rows to combine

I think I can use my parameter and maybe row_number and partition, but that is where I am stuck since my skills with these newer functions are really minimal.

Thanks if you can help.

CREATE TABLE #Staging
( RowID int,
RowData varchar(4)
)

INSERT INTO #Staging (RowID,RowData) VALUES (1,'ab')
INSERT INTO #Staging (RowID,RowData) VALUES (2,'cd')
INSERT INTO #Staging (RowID,RowData) VALUES (3,'ef')
INSERT INTO #Staging (RowID,RowData) VALUES (4,'gh')
INSERT INTO #Staging (RowID,RowData) VALUES (5,'hi')
INSERT INTO #Staging (RowID,RowData) VALUES (6,'jk')
INSERT INTO #Staging (RowID,RowData) VALUES (7,'lm')
INSERT INTO #Staging (RowID,RowData) VALUES (8,'no')
INSERT INTO #Staging (RowID,RowData) VALUES (9,'pq')

SELECT
*
FROM #Staging


SELECT
T1.RowData + T2.RowData
FROM #Staging T1
INNER JOIN #Staging T2
ON T1.RowID +1 = T2.RowID
WHERE
(t1.RowID % 2) <> 0


DROP TABLE #Staging
Post #1372993
Posted Monday, October 15, 2012 6:25 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 6:12 PM
Points: 3,609, Visits: 5,220
How about something like this?

DECLARE @ComboFactor INT = 3 --The number of rows to combine 

;WITH Groups AS (
SELECT RowID, RowData
,rn=(ROW_NUMBER() OVER (ORDER BY RowID)-1) / @ComboFactor
FROM #Staging)
SELECT (
SELECT RowData + ''
FROM Groups b
WHERE a.rn = b.rn FOR XML PATH(''))
FROM Groups a
GROUP BY rn





My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1373001
Posted Monday, October 15, 2012 7:07 PM
SSChasing Mays

SSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing Mays

Group: General Forum Members
Last Login: Today @ 3:22 PM
Points: 646, Visits: 3,776
Interesting. Clearly I need to study up on FOR XML PATH. Thanks, I'll continue to review.

Post #1373007
Posted Monday, October 15, 2012 7:25 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 6:12 PM
Points: 3,609, Visits: 5,220
You're welcome. I forgot something though.

WHERE a.rn = b.rn ORDER BY RowID FOR XML PATH(''))

Need to make sure the rows are concatenated in the right order.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1373010
Posted Tuesday, October 16, 2012 1:17 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 1:33 PM
Points: 1, Visits: 48
The example code above showed this statement:

SELECT RowData + ''
FROM Groups b
WHERE a.rn = b.rn FOR XML PATH('')

Why is the + '' required? When I remove this then I get an result that looks like XML instead of a concatenated string.

If I Cast Rowdata to a varchar then the + '' is not required.

Thanks for the explanation.
Post #1373490
Posted Tuesday, October 16, 2012 6:18 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 6:12 PM
Points: 3,609, Visits: 5,220
russellolson (10/16/2012)
The example code above showed this statement:

SELECT RowData + ''
FROM Groups b
WHERE a.rn = b.rn FOR XML PATH('')

Why is the + '' required? When I remove this then I get an result that looks like XML instead of a concatenated string.

If I Cast Rowdata to a varchar then the + '' is not required.

Thanks for the explanation.


The XML tag is generated because if you remove the + '' you are now formatting a named column into the string, hence the XML tag. With the + '', it is an unnamed column so hence no XML tag.

CASTing rowdata has the same effect - producing an unnamed column.

I tend to use the former because it is fewer keystrokes. I suppose I should be more scientific and base my choice on performance, but I confess I've never tried it.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1373587
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse