Handle NULLs

  • Please someone tell me what I did wrong:

    You had a typo in your solution (used hobby1 twice from the first table), but that wouldn't effect the performance.

    But doesnt this build a test case with no nulls in it? Not sure if the performance will change at all , but if we're going to measure these solutions there should be nulls to be stripped out.

  • Nevyn (7/7/2011)


    Please someone tell me what I did wrong:

    You had a typo in your solution (used hobby1 twice from the first table), but that wouldn't effect the performance.

    <edit>I think I found it, but I agree, it probably did not affect performance. That said, there is a better effort later in this post :-)</edit>

    But doesnt this build a test case with no nulls in it? Not sure if the performance will change at all , but if we're going to measure these solutions there should be nulls to be stripped out.

    A fair point...I am not sure how or if it should measurably affect performance but I changed it up in the code sample below.

    Regarding these two requirements mikes84:

    1) Do you want duplicates shown in the results?

    2) Does order matter?

    Using our two tables hobby1 and hobby2 here is an example that should answer both questions:

    If...

    hobby1.hobby1 = tennis

    hobby1.hobby2 = cricket

    hobby2.hobby1 is null

    hobby2.hobby2 = tennis

    ...what should the result set look like?

    Here is another set of code with more varied test data, including nulls, as well as a Union/Crosstab method, a variation of the Union All/Crosstab, which removes duplicates but does not offer ordering based on column name.

    > APPLY method - removes dups, can be ordered by column name

    > String manipulation - preserves dups, can be ordered by column name

    > Unpivot/Pivot - preserves dups, can be ordered by column name

    > Union All/Crosstab - preserves dups, can be ordered by column name

    > Union/Crosstab - removes dups, cannot be ordered by column name

    > Parsename - preserves dups, can be ordered by column name

    SET NOCOUNT ON ;

    GO

    SET STATISTICS TIME OFF ;

    PRINT REPLICATE('-', 75) + '

    build temp tables'

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby1')

    AND type IN ( N'U' ) )

    DROP TABLE hobby1 ;

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby2')

    AND type IN ( N'U' ) )

    DROP TABLE hobby2 ;

    CREATE TABLE hobby1

    (

    id INT IDENTITY(1, 1),

    hobby1 VARCHAR(36),

    hobby2 VARCHAR(36)

    ) ;

    CREATE TABLE hobby2

    (

    id INT,

    hobby1 VARCHAR(36),

    hobby2 VARCHAR(36)

    ) ;

    INSERT INTO dbo.hobby1

    (

    hobby1,

    hobby2

    )

    SELECT TOP ( 250 )

    CAST(NEWID() AS VARCHAR(36)),

    NULL

    FROM master.sys.columns c1

    CROSS JOIN master.sys.columns c2

    UNION ALL

    SELECT TOP ( 250 )

    NULL,

    CAST(NEWID() AS VARCHAR(36))

    FROM master.sys.columns c1

    CROSS JOIN master.sys.columns c2

    UNION ALL

    SELECT TOP ( 250 )

    CAST(NEWID() AS VARCHAR(36)),

    CAST(NEWID() AS VARCHAR(36))

    FROM master.sys.columns c1

    CROSS JOIN master.sys.columns c2

    UNION ALL

    SELECT TOP ( 250 )

    NULL,

    NULL

    FROM master.sys.columns c1

    CROSS JOIN master.sys.columns c2 ;

    INSERT INTO dbo.hobby2

    (

    id,

    hobby1,

    hobby2

    )

    SELECT id,

    NULL,

    CAST(NEWID() AS VARCHAR(36))

    FROM dbo.hobby1

    WHERE id <= 150

    UNION ALL

    SELECT id,

    hobby2,

    CAST(NEWID() AS VARCHAR(36))

    FROM dbo.hobby1

    WHERE id BETWEEN 151 AND 550

    UNION ALL

    SELECT id,

    NULL,

    CAST(NEWID() AS VARCHAR(36))

    FROM dbo.hobby1

    WHERE id BETWEEN 551 AND 950

    UNION ALL

    SELECT id,

    hobby1,

    CAST(NEWID() AS VARCHAR(36))

    FROM dbo.hobby1

    WHERE id BETWEEN 951 AND 1000 ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    APPLY method'

    SELECT 'APPLY method'

    SET STATISTICS TIME ON ;

    -- now transform it by moving all nulls to the end without RBAR

    SELECT t.id,

    c1toc4 AS newc1,

    newc2,

    newc3,

    newc4

    FROM (

    SELECT h1.id AS id,

    h1.hobby1 AS c1,

    h1.hobby2 AS c2,

    h2.hobby1 AS c3,

    h2.hobby2 AS c4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t -- first CROSS APPLY is just to provide shortcuts for the

    -- various COALESCE statements

    CROSS APPLY (

    SELECT c1toc4 = COALESCE(c1, c2, c3, c4),

    c2toc4 = COALESCE(c2, c3, c4),

    c3toc4 = COALESCE(c3, c4)

    ) cx

    -- second CROSS APPLY figures out what goes in the second column

    -- we know the 1st column is always c1toc4 - i.e. the first non-null

    -- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4

    -- if it matches either or those we need to either take column 4 or

    -- null (if col4 matches c1toc4 we have already used it for new column 1)

    CROSS APPLY (

    SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4,

    c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))

    ) cx2

    -- and this is for the third column

    CROSS APPLY (

    SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4),

    NULLIF(NULLIF(c4, newc2), c1toc4))

    ) cx3

    -- and this is for the fourth column

    CROSS APPLY (

    SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)

    ) cx4 ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    String manipulation'

    SELECT 'String manipulation'

    SET STATISTICS TIME ON ;

    SELECT t.id,

    RTRIM(SUBSTRING(t.s, 1, 36)),

    RTRIM(SUBSTRING(t.s, 37, 36)),

    RTRIM(SUBSTRING(t.s, 73, 36)),

    RTRIM(SUBSTRING(t.s, 109, 36))

    FROM (

    SELECT h1.id,

    ISNULL(h1.hobby1 + SPACE(36 - LEN(h1.hobby1)), '') +

    ISNULL(h1.hobby2 + SPACE(36 - LEN(h1.hobby2)), '') + ISNULL(h2.hobby1 + SPACE(36

    - LEN(h2.hobby1)),'') + ISNULL(h2.hobby2

    + SPACE(36- LEN(h2.hobby2)),'') s

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Unpivot/Pivot'

    SELECT 'Unpivot/Pivot'

    SET STATISTICS TIME ON ;

    WITH OneTableCTE

    AS (

    SELECT a.id,

    a.hobby1 AS 'h1',

    a.hobby2 AS 'h2',

    b.hobby1 AS 'h3',

    b.hobby2 AS 'h4'

    FROM hobby1 a

    INNER JOIN hobby2 b ON a.id = b.id

    ),

    UnPivotCTE

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER ( PARTITION BY id ORDER BY hobbyorder )

    AS fixedorder

    FROM OneTableCTE UNPIVOT ( hobby FOR hobbyorder IN ( h1, h2, h3, h4 ) )

    AS Normalized

    )

    SELECT OneTableCTE.id,

    [1],

    [2],

    [3],

    [4]

    FROM (

    SELECT id,

    hobby,

    fixedorder

    FROM UnPivotCTE

    ) AS Undone PIVOT ( MIN(hobby) FOR fixedorder IN ( [1], [2], [3], [4] ) )

    AS PivotTable

    RIGHT OUTER JOIN OneTableCTE ON PivotTable.id = OneTableCTE.id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Union All/Cross tab'

    SELECT 'Union All/Cross tab'

    SET STATISTICS TIME ON ;

    WITH cte(id, hobby, hobby_num)

    AS (

    SELECT h.id,

    h.hobby1,

    1

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby1,

    2

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby1,

    3

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby1

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby2,

    4

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    ),

    cte2(id, hobby, row_num)

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobby_num)

    FROM cte

    WHERE hobby IS NOT NULL

    )

    SELECT id,

    MAX(CASE WHEN row_num = 1 THEN hobby

    END) AS hobby1,

    MAX(CASE WHEN row_num = 2 THEN hobby

    END) AS hobby2,

    MAX(CASE WHEN row_num = 3 THEN hobby

    END) AS hobby3,

    MAX(CASE WHEN row_num = 4 THEN hobby

    END) AS hobby4

    FROM cte2

    GROUP BY id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Union/Cross tab'

    SELECT 'Union/Cross tab'

    SET STATISTICS TIME ON ;

    WITH cte(id, hobby)

    AS (

    SELECT h.id,

    h.hobby1

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION

    SELECT h.id,

    h.hobby2

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION

    SELECT h.id,

    h.hobby1

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby1

    WHERE id = h.id )

    UNION

    SELECT h.id,

    h.hobby2

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    ),

    cte2(id, hobby, row_num)

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER (PARTITION BY id ORDER BY id)

    FROM cte

    WHERE hobby IS NOT NULL

    )

    SELECT id,

    MAX(CASE WHEN row_num = 1 THEN hobby

    END) AS hobby1,

    MAX(CASE WHEN row_num = 2 THEN hobby

    END) AS hobby2,

    MAX(CASE WHEN row_num = 3 THEN hobby

    END) AS hobby3,

    MAX(CASE WHEN row_num = 4 THEN hobby

    END) AS hobby4

    FROM cte2

    GROUP BY id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Parsename'

    SELECT 'Parsename'

    SET STATISTICS TIME ON ;

    SELECT h1.id,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 1)) AS Hobby1,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 2)) AS Hobby2,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 3)) AS Hobby3,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 4)) AS Hobby4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id ;

    SET STATISTICS TIME OFF ;

    GO

    Common results on my machine:

    ---------------------------------------------------------------------------

    build temp tables

    ---------------------------------------------------------------------------

    APPLY method

    SQL Server Execution Times:

    CPU time = 94 ms, elapsed time = 260 ms.

    ---------------------------------------------------------------------------

    String manipulation

    SQL Server Execution Times:

    CPU time = 15 ms, elapsed time = 191 ms.

    ---------------------------------------------------------------------------

    Unpivot/Pivot

    SQL Server Execution Times:

    CPU time = 15 ms, elapsed time = 248 ms.

    ---------------------------------------------------------------------------

    Union All/Cross tab

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 127 ms.

    ---------------------------------------------------------------------------

    Union/Cross tab

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 131 ms.

    ---------------------------------------------------------------------------

    Parsename

    SQL Server Execution Times:

    CPU time = 15 ms, elapsed time = 137 ms.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • mister.magoo (7/7/2011)


    mikes84 (7/6/2011)


    I have a situation where I'm querying two tables, each that has up to two hobbies listed. The first table has hobby1 and hobby2, and the second table has hobby3 and hobby4. It's possible for there to be values in any of the fields. So, it could look like:

    hobby1, hobby2, hobby3, hobby4

    basketball, tennis, NULL, NULL

    or

    hobby1, hobby2, hobby3, hobby4

    NULL, NULL, basketball, tennis

    However, it could also look like:

    hobby1, hobby2, hobby3, hobby4

    basketball, NULL, tennis, NULL

    Ideally, I want it to look like the first example, where the values are brought to the left. I could use COALESCE in the first and second examples and have it work, but it wouldn't work on the third example. Any ideas how I could get something like the third example to look like the first one?

    Mike

    Hi Orlando (and other contributors),

    I don't understand the test data you are using... perhaps it's me but the question was about shuffling values into the leftmost columns where there were nulls between them.

    The fact that the APPLY method was removing dups was lost on me initially. I did not break anything, it was working as you had intended per the initial post all along.

    The test data you generate has no NULLS and duplicate entries.

    As far as I can see, the CROSS APPLY method I provided is the only one that is correctly processing this data into the two leftmost columns and removing duplicate entries (the removal of duplicates may not be a stated requirement, but it suits me to include it and it makes sense as you wouldn't want duplicates anyway)...

    Or am I completely missing the point?

    I agree with you. I think what you have is a practical solution...why not de-dup? and why not preserve the order of the columns as they are examined per the written query?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I decided to build a large amount of test data that more closely matches the OP's given conditions. I'll be back with a possible solution soon. Not to worry about the 100,000 rows in each table... this runs very, very fast. Also note that the column names of the Hobby Names match the OP's.

    --=====================================================================================================================

    -- Create and populate the test tables. This is NOT a part of the solution.

    --=====================================================================================================================

    --===== Declare and preset some obviously named local variables

    DECLARE @RowsInFirstTable INT,

    @RowsInSecondTable INT

    ;

    SELECT @RowsInFirstTable = 100000,

    @RowsInSecondTable = 100000

    ;

    --===== Conditionally drop the test tables to make reruns in SSMS easier

    IF OBJECT_ID('tempdb..#HobbyName','U') IS NOT NULL DROP TABLE #HobbyName; --Used to build a lot of test data

    IF OBJECT_ID('tempdb..#Hobby1','U') IS NOT NULL DROP TABLE #Hobby1;

    IF OBJECT_ID('tempdb..#Hobby2','U') IS NOT NULL DROP TABLE #Hobby2;

    --===== Create a list of HobbyNames so we can randomly populate the test tables.

    -- The NULL's are simply part of the possible things that could be a hobby.

    SELECT HobbyNameID = IDENTITY(INT,1,1),

    HobbyName = CAST(d.HobbyName AS VARCHAR(15))

    INTO #HobbyName

    FROM (

    SELECT NULL UNION ALL

    SELECT NULL UNION ALL

    SELECT 'BaseBall' UNION ALL

    SELECT 'Basketball' UNION ALL

    SELECT 'Tennis' UNION ALL

    SELECT 'FootBall' UNION ALL

    SELECT 'Hockey' UNION ALL

    SELECT 'Rugby' UNION ALL

    SELECT 'Golf' UNION ALL

    SELECT 'Bowling'

    ) d (HobbyName)

    ;

    ALTER TABLE #HobbyName ADD PRIMARY KEY CLUSTERED (HobbyNameID) WITH FILLFACTOR = 100

    ;

    --===== Create the "first" table and populate it on the fly

    SELECT TOP (@RowsInFirstTable)

    Hobby1ID = IDENTITY(INT,1,1),

    Hobby1 = (SELECT TOP 1 HobbyName FROM #HobbyName WHERE HobbyNameID = ABS(CHECKSUM(NEWID()))%10+1),

    Hobby2 = (SELECT TOP 1 HobbyName FROM #HobbyName WHERE HobbyNameID = ABS(CHECKSUM(NEWID()))%10+1)

    INTO #Hobby1

    FROM sys.all_columns ac1,

    sys.all_columns ac2

    ;

    --===== Create the "second" table and populate it on the fly

    SELECT TOP (@RowsInSecondTable)

    Hobby2ID = IDENTITY(INT,1,1),

    Hobby3 = (SELECT TOP 1 HobbyName FROM #HobbyName WHERE HobbyNameID = ABS(CHECKSUM(NEWID()))%10+1),

    Hobby4 = (SELECT TOP 1 HobbyName FROM #HobbyName WHERE HobbyNameID = ABS(CHECKSUM(NEWID()))%10+1)

    INTO #Hobby2

    FROM sys.all_columns ac1,

    sys.all_columns ac2

    ;

    --===== Add PK's to both tables

    ALTER TABLE #Hobby1 ADD PRIMARY KEY CLUSTERED (Hobby1ID) WITH FILLFACTOR = 100;

    ALTER TABLE #Hobby2 ADD PRIMARY KEY CLUSTERED (Hobby2ID) WITH FILLFACTOR = 100;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • For what it's worth, here's my rendition of how this problem might be solved. It doesn't preserve dupes and it sorts the HobbyNames on each row alphabetically. As always, details are in the code. As a sidebar, it only "dips" each table once and there is no join. I also did my own form of unpivot simply because I don't care for the UNPIVOT syntax. 😀 I'll let Orlando test it on his machine rather than make any performance claims. (Orlando, the test data above is set for 100,000 rows... change it {just a couple of variable entries} to whatever you were using for the other's code. Thanks)

    --=====================================================================================================================

    -- Now, solve the problem. The details are, of course, in the comments.

    -- This method only "dips" each table once.

    --=====================================================================================================================

    WITH

    cteEnumerate AS

    ( --=== This CTE creates an EAV stacked Table removing dupes and NULLs in the process.

    -- Then it numbers (enumerates) the item's "column numbers" for reassembly in the outer SELECT coming up.

    -- As a pleasent side effect, the HobbyNames are sorted in alphabetical order.

    -- This step is a form of "pre-aggregation"

    SELECT ColNum = ROW_NUMBER() OVER (PARTITION BY eav.HobbyID ORDER BY eav.HobbyName),

    eav.HobbyID,

    eav.HobbyName

    FROM ( --=== This does the EAV "stacking", removal of dupes, removal of NULLs as combined info from both tables

    SELECT --===== Unpivot the "first" hobby table removing nulls and dupes

    DISTINCT

    HobbyID = h1.Hobby1ID,

    HobbyName = upvt.HobbyName

    FROM #Hobby1 h1

    CROSS APPLY ( --=== This does a high speed "Unpivot" of the first table

    SELECT Hobby1 UNION ALL

    SELECT Hobby2

    ) upvt (HobbyName)

    WHERE upvt.HobbyName > '' --Is NOT NULL and is NOT BLANK

    UNION -- We use union to get rid of dupes between tables ----------------------------------------------------

    SELECT --===== Unpivot the "second" hobby table removing nulls and dupes

    DISTINCT

    HobbyID = h2.Hobby2ID,

    HobbyName = upvt.HobbyName

    FROM #Hobby2 h2

    CROSS APPLY ( --=== This does a high speed "Unpivot" of the second table

    SELECT Hobby3 UNION ALL

    SELECT Hobby4

    ) upvt (HobbyName)

    WHERE upvt.HobbyName > '' --Is NOT NULL and is NOT BLANK

    ) eav

    ) --=== Now, we reassemble the data using a high-speed cross tab.

    -- For convenience of anticipated queries, we also return the number of hobbies for each "HobbyID"

    SELECT HobbyID,

    Hobby1 = MAX(CASE WHEN ColNum = 1 THEN HobbyName ELSE '' END),

    Hobby2 = MAX(CASE WHEN ColNum = 2 THEN HobbyName ELSE '' END),

    Hobby3 = MAX(CASE WHEN ColNum = 3 THEN HobbyName ELSE '' END),

    Hobby4 = MAX(CASE WHEN ColNum = 4 THEN HobbyName ELSE '' END),

    HobbyCount = COUNT(*)

    FROM cteEnumerate

    GROUP BY HobbyID

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have some re-thinking to do and some more learning to do from your work Jeff...I had suspicions before but it is abundantly clear to me now just how bad my approach is to conjuring up test data :laugh:

    Jeff, I modded your test data build script so we could run the existing solutions against the concrete tables we were using before. It just seemed easier than modding all the solution scripts but that means there is a risk that I munged your code in the process but I think I did OK.

    > APPLY method - removes dups, can be ordered by column name

    > String manipulation - preserves dups, can be ordered by column name

    > Unpivot/Pivot - preserves dups, can be ordered by column name

    > Union All/Crosstab - preserves dups, can be ordered by column name, does not return rows where all hobbies are null

    > Union/Crosstab - removes dups, cannot be ordered by column name, does not return rows where all hobbies are null

    > Parsename - preserves dups, can be ordered by column name

    > Jeff's Rendition - removes dups, cannot be ordered by column name, does not return rows where all hobbies are null

    Here is the complete set of tests:

    SET NOCOUNT ON ;

    GO

    SET STATISTICS TIME OFF ;

    GO

    PRINT REPLICATE('-', 75) + '

    build temp tables'

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby1')

    AND type IN ( N'U' ) )

    DROP TABLE hobby1 ;

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby2')

    AND type IN ( N'U' ) )

    DROP TABLE hobby2 ;

    CREATE TABLE hobby1

    (

    id INT IDENTITY(1, 1),

    hobby1 VARCHAR(36),

    hobby2 VARCHAR(36)

    ) ;

    CREATE TABLE hobby2

    (

    id INT IDENTITY(1, 1),

    hobby1 VARCHAR(36),

    hobby2 VARCHAR(36)

    ) ;

    --===== Add PK's to both tables

    ALTER TABLE dbo.hobby1 ADD PRIMARY KEY CLUSTERED (id) WITH FILLFACTOR = 100 ;

    ALTER TABLE dbo.hobby2 ADD PRIMARY KEY CLUSTERED (id) WITH FILLFACTOR = 100 ;

    GO

    --=====================================================================================================================

    -- Create and populate the test tables. This is NOT a part of the solution.

    --=====================================================================================================================

    --===== Declare and preset some obviously named local variables

    DECLARE @RowsInFirstTable INT,

    @RowsInSecondTable INT ;

    SELECT @RowsInFirstTable = 100000,

    @RowsInSecondTable = 100000 ;

    IF OBJECT_ID(N'tempdb..#HobbyName') IS NOT NULL

    DROP TABLE #HobbyName;

    --===== Create a list of HobbyNames so we can randomly populate the test tables.

    -- The NULL's are simply part of the possible things that could be a hobby.

    SELECT HobbyNameID = IDENTITY( INT,1,1),

    HobbyName = CAST(d.HobbyName AS VARCHAR(15))

    INTO #HobbyName

    FROM (

    SELECT NULL

    UNION ALL

    SELECT NULL

    UNION ALL

    SELECT 'BaseBall'

    UNION ALL

    SELECT 'Basketball'

    UNION ALL

    SELECT 'Tennis'

    UNION ALL

    SELECT 'FootBall'

    UNION ALL

    SELECT 'Hockey'

    UNION ALL

    SELECT 'Rugby'

    UNION ALL

    SELECT 'Golf'

    UNION ALL

    SELECT 'Bowling'

    ) d ( HobbyName ) ;

    ALTER TABLE #HobbyName ADD PRIMARY KEY CLUSTERED (HobbyNameID) WITH FILLFACTOR = 100 ;

    --===== Create the "first" table and populate it on the fly

    INSERT INTO dbo.hobby1

    (

    hobby1,

    hobby2

    )

    SELECT TOP ( @RowsInFirstTable )

    Hobby1 = (

    SELECT TOP 1

    HobbyName

    FROM #HobbyName

    WHERE HobbyNameID = ABS(CHECKSUM(NEWID())) % 10 + 1

    ),

    Hobby2 = (

    SELECT TOP 1

    HobbyName

    FROM #HobbyName

    WHERE HobbyNameID = ABS(CHECKSUM(NEWID())) % 10 + 1

    )

    FROM sys.all_columns ac1,

    sys.all_columns ac2 ;

    --===== Create the "second" table and populate it on the fly

    INSERT INTO dbo.hobby2

    (

    hobby1,

    hobby2

    )

    SELECT TOP ( @RowsInSecondTable )

    (

    SELECT TOP 1

    HobbyName

    FROM #HobbyName

    WHERE HobbyNameID = ABS(CHECKSUM(NEWID())) % 10 + 1

    ),

    (

    SELECT TOP 1

    HobbyName

    FROM #HobbyName

    WHERE HobbyNameID = ABS(CHECKSUM(NEWID())) % 10 + 1

    )

    FROM sys.all_columns ac1,

    sys.all_columns ac2 ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    APPLY method'

    SELECT 'APPLY method'

    SET STATISTICS TIME ON ;

    -- now transform it by moving all nulls to the end without RBAR

    SELECT t.id,

    c1toc4 AS newc1,

    newc2,

    newc3,

    newc4

    FROM (

    SELECT h1.id AS id,

    h1.hobby1 AS c1,

    h1.hobby2 AS c2,

    h2.hobby1 AS c3,

    h2.hobby2 AS c4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t -- first CROSS APPLY is just to provide shortcuts for the

    -- various COALESCE statements

    CROSS APPLY (

    SELECT c1toc4 = COALESCE(c1, c2, c3, c4),

    c2toc4 = COALESCE(c2, c3, c4),

    c3toc4 = COALESCE(c3, c4)

    ) cx

    -- second CROSS APPLY figures out what goes in the second column

    -- we know the 1st column is always c1toc4 - i.e. the first non-null

    -- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4

    -- if it matches either or those we need to either take column 4 or

    -- null (if col4 matches c1toc4 we have already used it for new column 1)

    CROSS APPLY (

    SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4,

    c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))

    ) cx2

    -- and this is for the third column

    CROSS APPLY (

    SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4),

    NULLIF(NULLIF(c4, newc2), c1toc4))

    ) cx3

    -- and this is for the fourth column

    CROSS APPLY (

    SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)

    ) cx4 ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    String manipulation'

    SELECT 'String manipulation'

    SET STATISTICS TIME ON ;

    SELECT t.id,

    RTRIM(SUBSTRING(t.s, 1, 36)),

    RTRIM(SUBSTRING(t.s, 37, 36)),

    RTRIM(SUBSTRING(t.s, 73, 36)),

    RTRIM(SUBSTRING(t.s, 109, 36))

    FROM (

    SELECT h1.id,

    ISNULL(h1.hobby1 + SPACE(36 - LEN(h1.hobby1)), '') +

    ISNULL(h1.hobby2 + SPACE(36 - LEN(h1.hobby2)), '') + ISNULL(h2.hobby1 + SPACE(36

    - LEN(h2.hobby1)),'') + ISNULL(h2.hobby2

    + SPACE(36- LEN(h2.hobby2)),'') s

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Unpivot/Pivot'

    SELECT 'Unpivot/Pivot'

    SET STATISTICS TIME ON ;

    WITH OneTableCTE

    AS (

    SELECT a.id,

    a.hobby1 AS 'h1',

    a.hobby2 AS 'h2',

    b.hobby1 AS 'h3',

    b.hobby2 AS 'h4'

    FROM hobby1 a

    INNER JOIN hobby2 b ON a.id = b.id

    ),

    UnPivotCTE

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER ( PARTITION BY id ORDER BY hobbyorder )

    AS fixedorder

    FROM OneTableCTE UNPIVOT ( hobby FOR hobbyorder IN ( h1, h2, h3, h4 ) )

    AS Normalized

    )

    SELECT OneTableCTE.id,

    [1],

    [2],

    [3],

    [4]

    FROM (

    SELECT id,

    hobby,

    fixedorder

    FROM UnPivotCTE

    ) AS Undone PIVOT ( MIN(hobby) FOR fixedorder IN ( [1], [2], [3], [4] ) )

    AS PivotTable

    RIGHT OUTER JOIN OneTableCTE ON PivotTable.id = OneTableCTE.id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Union All/Cross tab'

    SELECT 'Union All/Cross tab'

    SET STATISTICS TIME ON ;

    WITH cte(id, hobby, hobby_num)

    AS (

    SELECT h.id,

    h.hobby1,

    1

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby2,

    2

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby1,

    3

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby1

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby2,

    4

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    ),

    cte2(id, hobby, row_num)

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobby_num)

    FROM cte

    WHERE hobby IS NOT NULL

    )

    SELECT id,

    MAX(CASE WHEN row_num = 1 THEN hobby

    END) AS hobby1,

    MAX(CASE WHEN row_num = 2 THEN hobby

    END) AS hobby2,

    MAX(CASE WHEN row_num = 3 THEN hobby

    END) AS hobby3,

    MAX(CASE WHEN row_num = 4 THEN hobby

    END) AS hobby4

    FROM cte2

    GROUP BY id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Union/Cross tab'

    SELECT 'Union/Cross tab'

    SET STATISTICS TIME ON ;

    WITH cte(id, hobby)

    AS (

    SELECT h.id,

    h.hobby1

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION

    SELECT h.id,

    h.hobby2

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION

    SELECT h.id,

    h.hobby1

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby1

    WHERE id = h.id )

    UNION

    SELECT h.id,

    h.hobby2

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby1

    WHERE id = h.id )

    ),

    cte2(id, hobby, row_num)

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER (PARTITION BY id ORDER BY id)

    FROM cte

    WHERE hobby IS NOT NULL

    )

    SELECT id,

    MAX(CASE WHEN row_num = 1 THEN hobby

    END) AS hobby1,

    MAX(CASE WHEN row_num = 2 THEN hobby

    END) AS hobby2,

    MAX(CASE WHEN row_num = 3 THEN hobby

    END) AS hobby3,

    MAX(CASE WHEN row_num = 4 THEN hobby

    END) AS hobby4

    FROM cte2

    GROUP BY id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Parsename'

    SELECT 'Parsename'

    SET STATISTICS TIME ON ;

    SELECT h1.id,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 1)) AS Hobby1,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 2)) AS Hobby2,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 3)) AS Hobby3,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 4)) AS Hobby4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Jeff''s Rendition'

    SELECT 'Jeff''s Rendition'

    SET STATISTICS TIME ON ;

    --=====================================================================================================================

    -- Now, solve the problem. The details are, of course, in the comments.

    -- This method only "dips" each table once.

    --=====================================================================================================================

    --=====================================================================================================================

    -- Now, solve the problem. The details are, of course, in the comments.

    -- This method only "dips" each table once.

    --=====================================================================================================================

    WITH

    cteEnumerate AS

    ( --=== This CTE creates an EAV stacked Table removing dupes and NULLs in the process.

    -- Then it numbers (enumerates) the item's "column numbers" for reassembly in the outer SELECT coming up.

    -- As a pleasent side effect, the HobbyNames are sorted in alphabetical order.

    -- This step is a form of "pre-aggregation"

    SELECT ColNum = ROW_NUMBER() OVER (PARTITION BY eav.HobbyID ORDER BY eav.HobbyName),

    eav.HobbyID,

    eav.HobbyName

    FROM ( --=== This does the EAV "stacking", removal of dupes, removal of NULLs as combined info from both tables

    SELECT --===== Unpivot the "first" hobby table removing nulls and dupes

    DISTINCT

    HobbyID = h1.id,

    HobbyName = upvt.HobbyName

    FROM dbo.hobby1 h1

    CROSS APPLY ( --=== This does a high speed "Unpivot" of the first table

    SELECT hobby1 UNION ALL

    SELECT hobby2

    ) upvt (HobbyName)

    WHERE upvt.HobbyName > '' --Is NOT NULL and is NOT BLANK

    UNION -- We use union to get rid of dupes between tables ----------------------------------------------------

    SELECT --===== Unpivot the "second" hobby table removing nulls and dupes

    DISTINCT

    HobbyID = h2.id,

    HobbyName = upvt.HobbyName

    FROM dbo.hobby2 h2

    CROSS APPLY ( --=== This does a high speed "Unpivot" of the second table

    SELECT hobby1 UNION ALL

    SELECT hobby2

    ) upvt (HobbyName)

    WHERE upvt.HobbyName > '' --Is NOT NULL and is NOT BLANK

    ) eav

    ) --=== Now, we reassemble the data using a high-speed cross tab.

    -- For convenience of anticipated queries, we also return the number of hobbies for each "HobbyID"

    SELECT HobbyID,

    Hobby1 = MAX(CASE WHEN ColNum = 1 THEN HobbyName ELSE '' END),

    Hobby2 = MAX(CASE WHEN ColNum = 2 THEN HobbyName ELSE '' END),

    Hobby3 = MAX(CASE WHEN ColNum = 3 THEN HobbyName ELSE '' END),

    Hobby4 = MAX(CASE WHEN ColNum = 4 THEN HobbyName ELSE '' END),

    HobbyCount = COUNT(*)

    FROM cteEnumerate

    GROUP BY HobbyID

    ;

    SET STATISTICS TIME OFF ;

    GO

    And here is a typical set of results on my machine after checking about a dozen or so runs:

    ---------------------------------------------------------------------------

    build temp tables

    ---------------------------------------------------------------------------

    APPLY method

    SQL Server Execution Times:

    CPU time = 17799 ms, elapsed time = 18257 ms.

    ---------------------------------------------------------------------------

    String manipulation

    SQL Server Execution Times:

    CPU time = 921 ms, elapsed time = 2009 ms.

    ---------------------------------------------------------------------------

    Unpivot/Pivot

    SQL Server Execution Times:

    CPU time = 1546 ms, elapsed time = 2138 ms.

    ---------------------------------------------------------------------------

    Union All/Cross tab

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 796 ms, elapsed time = 1804 ms.

    ---------------------------------------------------------------------------

    Union/Cross tab

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 780 ms, elapsed time = 1967 ms.

    ---------------------------------------------------------------------------

    Parsename

    SQL Server Execution Times:

    CPU time = 1466 ms, elapsed time = 1968 ms.

    ---------------------------------------------------------------------------

    Jeff's Rendition

    SQL Server Execution Times:

    CPU time = 1435 ms, elapsed time = 2181 ms.

    Edit 7/8/2011 7:20 Mtn: corrected property of Jeff's rendition, it "cannot be ordered by column name"

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Wow! that shows how slow my isnull/nullif/apply method is!

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • A question which i hope someone might be able to answer about the "Union/Cross tab" solution.

    Okay so lets say we have a result looking like 1, "Tennis", Null, "Golf", "Bowling".

    This first CTE will convert this to

    1 "Tennis"

    1 null

    1 "Golf"

    1 "Bowling"

    The second CTE adds a rownumber to each row order by id. So it should create

    1, 1, "Tennis"

    1, 2, null

    1, 3, Golf"

    1, 4, "Bowling"

    But how can we be sure it actually does that? There is no order by that forces it to come in that order. Is there a feature to CTEs so that they always come in a certain way without the order by?

    Have worked like 30min with CTEs (still stuck with 2000) so all very new 🙂

    /T

  • Regarding these two requirements mikes84:

    1) Do you want duplicates shown in the results?

    2) Does order matter?

    Using our two tables hobby1 and hobby2 here is an example that should answer both questions:

    If...

    hobby1.hobby1 = tennis

    hobby1.hobby2 = cricket

    hobby2.hobby1 is null

    hobby2.hobby2 = tennis

    ...what should the result set look like?

    1) I don't want duplicates in the results

    2) Order does matter. The order is: hobby1.hobby1, hobby1.hobby2, hobby2.hobby1, hobby2.hobby2

    If any hobbies are null, the next non-null hobby should be in its place.

    In the case of the sample you provided, it would be

    tennis, cricket, tennis, null

    I can't remember if your XML-based solution met both of these requirements.

    Jeff, I think your solution addresses requirement #1, but not #2, since it's ordering by hobby name alphabetically? Do you know how it could be modified to order in the aforementioned way?

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • Modding my code so it should do what you now want.

    SELECT t.id,

    RTRIM(SUBSTRING(t.s, 1, 36)),

    RTRIM(SUBSTRING(t.s, 37, 36)),

    RTRIM(SUBSTRING(t.s, 73, 36)),

    RTRIM(SUBSTRING(t.s, 109, 36))

    FROM (

    SELECT h1.id,

    ISNULL(h1.hobby1 + SPACE(36 - LEN(h1.hobby1)), '') +

    case when h1.hobby2 in (h1.hobby1) then '' else ISNULL(h1.hobby2 + SPACE(36 - LEN(h1.hobby2)), '') end +

    case when h2.hobby1 in (h1.hobby1, h1.hobby2) then '' else ISNULL(h2.hobby1 + SPACE(36 - LEN(h2.hobby1)),'') end +

    case when h2.hobby2 in (h1.hobby1, h1.hobby2, h2.hobby1) then '' else ISNULL(h2.hobby2 + SPACE(36 - LEN(h2.hobby2)),'') end s

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t ;

    Think it works atleast 😀

    /T

  • I think you misunderstood the question, Mike. They are asking if the same hobby (Tennis in the example) appears twice in the four possible hobby records whether you want one of them eliminated.

  • mikes84 (7/8/2011)


    Regarding these two requirements mikes84:

    1) Do you want duplicates shown in the results?

    2) Does order matter?

    Using our two tables hobby1 and hobby2 here is an example that should answer both questions:

    If...

    hobby1.hobby1 = tennis

    hobby1.hobby2 = cricket

    hobby2.hobby1 is null

    hobby2.hobby2 = tennis

    ...what should the result set look like?

    1) I don't want duplicates in the results

    2) Order does matter. The order is: hobby1.hobby1, hobby1.hobby2, hobby2.hobby1, hobby2.hobby2

    If any hobbies are null, the next non-null hobby should be in its place.

    In the case of the sample you provided, it would be

    tennis, cricket, tennis, null

    I can't remember if your XML-based solution met both of these requirements.

    Jeff, I think your solution addresses requirement #1, but not #2, since it's ordering by hobby name alphabetically? Do you know how it could be modified to order in the aforementioned way?

    From your description I would think you wanted:

    tennis, cricket, null, null

    Please confirm.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • tommyh (7/8/2011)


    A question which i hope someone might be able to answer about the "Union/Cross tab" solution.

    Okay so lets say we have a result looking like 1, "Tennis", Null, "Golf", "Bowling".

    This first CTE will convert this to

    1 "Tennis"

    1 null

    1 "Golf"

    1 "Bowling"

    The second CTE adds a rownumber to each row order by id. So it should create

    1, 1, "Tennis"

    1, 2, null

    1, 3, Golf"

    1, 4, "Bowling"

    But how can we be sure it actually does that? There is no order by that forces it to come in that order. Is there a feature to CTEs so that they always come in a certain way without the order by?

    We can't 😀 You cannot guarantee the order inside a derived table or CTE <edit>unless you use TOP but that does not apply in this problem case</edit>. Only the outermost query can ask for and be guaranteed ordering. That's why the "Union/Crosstab" method has the property "cannot be ordered by column name". The "Union All/Crosstab" assigns a value to each row based on the column it came from and it uses that for later pivoting so it can guarantee column ordering.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Jeff Moden (7/7/2011)


    For what it's worth, here's my rendition of how this problem might be solved. It doesn't preserve dupes and it sorts the HobbyNames on each row alphabetically. As always, details are in the code. As a sidebar, it only "dips" each table once and there is no join. I also did my own form of unpivot simply because I don't care for the UNPIVOT syntax. 😀 I'll let Orlando test it on his machine rather than make any performance claims. (Orlando, the test data above is set for 100,000 rows... change it {just a couple of variable entries} to whatever you were using for the other's code. Thanks)

    Cool stuff, Jeff!

    So do you ever use PIVOT and/or UNPIVOT, or are workarounds almost always better/faster?

  • Here is the code including the latest string manipulation approach from tommyh and a variation of the "Union All/Crosstab" approach I put together, both of which remove dups yet still allow for a column order:

    SET STATISTICS TIME OFF ;

    GO

    SET NOCOUNT ON ;

    GO

    PRINT REPLICATE('-', 75) + '

    build temp tables'

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby1')

    AND type IN ( N'U' ) )

    DROP TABLE hobby1 ;

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby2')

    AND type IN ( N'U' ) )

    DROP TABLE hobby2 ;

    CREATE TABLE hobby1

    (

    id INT IDENTITY(1, 1) NOT NULL,

    hobby1 VARCHAR(36),

    hobby2 VARCHAR(36)

    ) ;

    CREATE TABLE hobby2

    (

    id INT IDENTITY(1, 1) NOT NULL,

    hobby1 VARCHAR(36),

    hobby2 VARCHAR(36)

    ) ;

    GO

    --===== Add PK's to both tables

    ALTER TABLE dbo.hobby1 ADD PRIMARY KEY CLUSTERED (id) WITH FILLFACTOR = 100 ;

    ALTER TABLE dbo.hobby2 ADD PRIMARY KEY CLUSTERED (id) WITH FILLFACTOR = 100 ;

    GO

    --============================================================================

    -- Create and populate the test tables. This is NOT a part of the solution.

    --============================================================================

    --===== Declare and preset some obviously named local variables

    DECLARE @RowsInFirstTable INT,

    @RowsInSecondTable INT ;

    SELECT @RowsInFirstTable = 200000,

    @RowsInSecondTable = 200000 ;

    IF OBJECT_ID(N'tempdb..#HobbyName') IS NOT NULL

    DROP TABLE #HobbyName;

    --===== Create a list of HobbyNames so we can randomly populate the test tables.

    -- The NULL's are simply part of the possible things that could be a hobby.

    SELECT HobbyNameID = IDENTITY( INT,1,1),

    HobbyName = CAST(d.HobbyName AS VARCHAR(15))

    INTO #HobbyName

    FROM (

    SELECT NULL

    UNION ALL

    SELECT NULL

    UNION ALL

    SELECT 'BaseBall'

    UNION ALL

    SELECT 'Basketball'

    UNION ALL

    SELECT 'Tennis'

    UNION ALL

    SELECT 'FootBall'

    UNION ALL

    SELECT 'Hockey'

    UNION ALL

    SELECT 'Rugby'

    UNION ALL

    SELECT 'Golf'

    UNION ALL

    SELECT 'Bowling'

    ) d ( HobbyName ) ;

    ALTER TABLE #HobbyName ADD PRIMARY KEY CLUSTERED (HobbyNameID) WITH FILLFACTOR = 100 ;

    --===== Create the "first" table and populate it on the fly

    INSERT INTO dbo.hobby1

    (

    hobby1,

    hobby2

    )

    SELECT TOP ( @RowsInFirstTable )

    Hobby1 = (

    SELECT TOP 1

    HobbyName

    FROM #HobbyName

    WHERE HobbyNameID = ABS(CHECKSUM(NEWID())) % 10 + 1

    ),

    Hobby2 = (

    SELECT TOP 1

    HobbyName

    FROM #HobbyName

    WHERE HobbyNameID = ABS(CHECKSUM(NEWID())) % 10 + 1

    )

    FROM sys.all_columns ac1,

    sys.all_columns ac2 ;

    --===== Create the "second" table and populate it on the fly

    INSERT INTO dbo.hobby2

    (

    hobby1,

    hobby2

    )

    SELECT TOP ( @RowsInSecondTable )

    (

    SELECT TOP 1

    HobbyName

    FROM #HobbyName

    WHERE HobbyNameID = ABS(CHECKSUM(NEWID())) % 10 + 1

    ),

    (

    SELECT TOP 1

    HobbyName

    FROM #HobbyName

    WHERE HobbyNameID = ABS(CHECKSUM(NEWID())) % 10 + 1

    )

    FROM sys.all_columns ac1,

    sys.all_columns ac2 ;

    GO

    ------------------------------------------------------------------------------------

    /*

    PRINT REPLICATE('-', 75) + '

    APPLY method (no dups, ordered by column)'

    SELECT 'APPLY method (no dups, ordered by column)'

    SET STATISTICS TIME ON ;

    -- now transform it by moving all nulls to the end without RBAR

    SELECT t.id,

    c1toc4 AS newc1,

    newc2,

    newc3,

    newc4

    FROM (

    SELECT h1.id AS id,

    h1.hobby1 AS c1,

    h1.hobby2 AS c2,

    h2.hobby1 AS c3,

    h2.hobby2 AS c4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t -- first CROSS APPLY is just to provide shortcuts for the

    -- various COALESCE statements

    CROSS APPLY (

    SELECT c1toc4 = COALESCE(c1, c2, c3, c4),

    c2toc4 = COALESCE(c2, c3, c4),

    c3toc4 = COALESCE(c3, c4)

    ) cx

    -- second CROSS APPLY figures out what goes in the second column

    -- we know the 1st column is always c1toc4 - i.e. the first non-null

    -- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4

    -- if it matches either or those we need to either take column 4 or

    -- null (if col4 matches c1toc4 we have already used it for new column 1)

    CROSS APPLY (

    SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4,

    c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))

    ) cx2

    -- and this is for the third column

    CROSS APPLY (

    SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4),

    NULLIF(NULLIF(c4, newc2), c1toc4))

    ) cx3

    -- and this is for the fourth column

    CROSS APPLY (

    SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)

    ) cx4 ;

    SET STATISTICS TIME OFF ;

    GO

    */

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    String manipulation (dups, ordered by column)'

    SELECT 'String manipulation (dups, ordered by column)'

    SET STATISTICS TIME ON ;

    SELECT t.id,

    RTRIM(SUBSTRING(t.s, 1, 36)),

    RTRIM(SUBSTRING(t.s, 37, 36)),

    RTRIM(SUBSTRING(t.s, 73, 36)),

    RTRIM(SUBSTRING(t.s, 109, 36))

    FROM (

    SELECT h1.id,

    ISNULL(h1.hobby1 + SPACE(36 - LEN(h1.hobby1)), '') +

    ISNULL(h1.hobby2 + SPACE(36 - LEN(h1.hobby2)), '') + ISNULL(h2.hobby1 + SPACE(36

    - LEN(h2.hobby1)),'') + ISNULL(h2.hobby2

    + SPACE(36- LEN(h2.hobby2)),'') s

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    String manipulation (no dups, ordered by column)'

    SELECT 'String manipulation (no dups, ordered by column)'

    SET STATISTICS TIME ON ;

    SELECT t.id,

    RTRIM(SUBSTRING(t.s, 1, 36)),

    RTRIM(SUBSTRING(t.s, 37, 36)),

    RTRIM(SUBSTRING(t.s, 73, 36)),

    RTRIM(SUBSTRING(t.s, 109, 36))

    FROM (

    SELECT h1.id,

    ISNULL(h1.hobby1 + SPACE(36 - LEN(h1.hobby1)), '') +

    case when h1.hobby2 in (h1.hobby1) then '' else ISNULL(h1.hobby2 +

    SPACE(36 - LEN(h1.hobby2)), '') end +

    case when h2.hobby1 in (h1.hobby1, h1.hobby2) then '' else ISNULL(h2.hobby1 +

    SPACE(36 - LEN(h2.hobby1)),'') end +

    case when h2.hobby2 in (h1.hobby1, h1.hobby2, h2.hobby1) then '' else

    ISNULL(h2.hobby2 + SPACE(36 - LEN(h2.hobby2)),'') end s

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Unpivot/Pivot (dups, ordered by column)'

    SELECT 'Unpivot/Pivot (dups, ordered by column)'

    SET STATISTICS TIME ON ;

    WITH OneTableCTE

    AS (

    SELECT a.id,

    a.hobby1 AS 'h1',

    a.hobby2 AS 'h2',

    b.hobby1 AS 'h3',

    b.hobby2 AS 'h4'

    FROM hobby1 a

    INNER JOIN hobby2 b ON a.id = b.id

    ),

    UnPivotCTE

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER ( PARTITION BY id ORDER BY hobbyorder )

    AS fixedorder

    FROM OneTableCTE UNPIVOT ( hobby FOR hobbyorder IN ( h1, h2, h3, h4 ) )

    AS Normalized

    )

    SELECT OneTableCTE.id,

    [1],

    [2],

    [3],

    [4]

    FROM (

    SELECT id,

    hobby,

    fixedorder

    FROM UnPivotCTE

    ) AS Undone PIVOT ( MIN(hobby) FOR fixedorder IN ( [1], [2], [3], [4] ) )

    AS PivotTable

    RIGHT OUTER JOIN OneTableCTE ON PivotTable.id = OneTableCTE.id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Union All/Cross tab (dups, ordered by column)'

    SELECT 'Union All/Cross tab (dups, ordered by column)'

    SET STATISTICS TIME ON ;

    WITH cte(id, hobby, hobby_num)

    AS (

    SELECT h.id,

    h.hobby1,

    1

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby2,

    2

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby1,

    3

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby1

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby2,

    4

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    ),

    cte2(id, hobby, row_num)

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobby_num)

    FROM cte

    WHERE hobby IS NOT NULL

    )

    SELECT id,

    MAX(CASE WHEN row_num = 1 THEN hobby

    END) AS hobby1,

    MAX(CASE WHEN row_num = 2 THEN hobby

    END) AS hobby2,

    MAX(CASE WHEN row_num = 3 THEN hobby

    END) AS hobby3,

    MAX(CASE WHEN row_num = 4 THEN hobby

    END) AS hobby4

    FROM cte2

    GROUP BY id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Union All/Cross tab (no dups, ordered by column)'

    SELECT 'Union All/Cross tab (no dups, ordered by column)'

    SET STATISTICS TIME ON ;

    WITH cte ( id, hobby, hobby_num )

    AS (

    SELECT h.id,

    h.hobby1,

    1

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    AND h.hobby1 IS NOT NULL

    UNION ALL

    SELECT h.id,

    h.hobby2,

    2

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    AND h.hobby2 IS NOT NULL

    UNION ALL

    SELECT h.id,

    h.hobby1,

    3

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby1

    WHERE id = h.id )

    AND h.hobby1 IS NOT NULL

    UNION ALL

    SELECT h.id,

    h.hobby2,

    4

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    AND h.hobby2 IS NOT NULL

    ),

    cte2 ( id, hobby, hobby_num, hobby_partition_num )

    AS (

    -- partition so we can omit dups later

    SELECT id,

    hobby,

    hobby_num,

    ROW_NUMBER() OVER ( PARTITION BY id, hobby ORDER BY hobby, hobby_num )

    FROM cte

    ),

    cte3 ( id, hobby, row_num )

    AS (

    -- remove dups

    SELECT id,

    hobby,

    ROW_NUMBER() OVER ( PARTITION BY id ORDER BY hobby_num )

    FROM cte2

    WHERE hobby_partition_num = 1

    )

    SELECT id,

    MAX(CASE WHEN row_num = 1 THEN hobby

    END) AS hobby1,

    MAX(CASE WHEN row_num = 2 THEN hobby

    END) AS hobby2,

    MAX(CASE WHEN row_num = 3 THEN hobby

    END) AS hobby3,

    MAX(CASE WHEN row_num = 4 THEN hobby

    END) AS hobby4

    FROM cte3

    GROUP BY id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Union/Cross tab (no dups, not ordered by column)'

    SELECT 'Union/Cross tab (no dups, not ordered by column)'

    SET STATISTICS TIME ON ;

    WITH cte(id, hobby)

    AS (

    SELECT h.id,

    h.hobby1

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION

    SELECT h.id,

    h.hobby2

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION

    SELECT h.id,

    h.hobby1

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby1

    WHERE id = h.id )

    UNION

    SELECT h.id,

    h.hobby2

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby1

    WHERE id = h.id )

    ),

    cte2(id, hobby, row_num)

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER (PARTITION BY id ORDER BY id)

    FROM cte

    WHERE hobby IS NOT NULL

    )

    SELECT id,

    MAX(CASE WHEN row_num = 1 THEN hobby

    END) AS hobby1,

    MAX(CASE WHEN row_num = 2 THEN hobby

    END) AS hobby2,

    MAX(CASE WHEN row_num = 3 THEN hobby

    END) AS hobby3,

    MAX(CASE WHEN row_num = 4 THEN hobby

    END) AS hobby4

    FROM cte2

    GROUP BY id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Parsename (dups, ordered by column)'

    SELECT 'Parsename (dups, ordered by column)'

    SET STATISTICS TIME ON ;

    SELECT h1.id,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 1)) AS Hobby1,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 2)) AS Hobby2,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 3)) AS Hobby3,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 4)) AS Hobby4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Jeff''s Rendition (no dups, not ordered by column)'

    SELECT 'Jeff''s Rendition (no dups, not ordered by column)'

    SET STATISTICS TIME ON ;

    --======================================================================

    -- Now, solve the problem. The details are, of course, in the comments.

    -- This method only "dips" each table once.

    --======================================================================

    --======================================================================

    -- Now, solve the problem. The details are, of course, in the comments.

    -- This method only "dips" each table once.

    --======================================================================

    WITH

    cteEnumerate AS

    ( --=== This CTE creates an EAV stacked Table removing dupes and NULLs in the process.

    -- Then it numbers (enumerates) the item's "column numbers" for reassembly in the

    -- outer SELECT coming up.

    -- As a pleasent side effect, the HobbyNames are sorted in alphabetical order.

    -- This step is a form of "pre-aggregation"

    SELECT ColNum = ROW_NUMBER() OVER (PARTITION BY eav.HobbyID ORDER BY eav.HobbyName),

    eav.HobbyID,

    eav.HobbyName

    FROM (

    --=== This does the EAV "stacking", removal of dupes, removal of NULLs as combined info

    -- from both tables

    SELECT --===== Unpivot the "first" hobby table removing nulls and dupes

    DISTINCT

    HobbyID = h1.id,

    HobbyName = upvt.HobbyName

    FROM dbo.hobby1 h1

    CROSS APPLY ( --=== This does a high speed "Unpivot" of the first table

    SELECT hobby1 UNION ALL

    SELECT hobby2

    ) upvt (HobbyName)

    WHERE upvt.HobbyName > '' --Is NOT NULL and is NOT BLANK

    UNION -- We use union to get rid of dupes between tables -----------------------------

    SELECT --===== Unpivot the "second" hobby table removing nulls and dupes

    DISTINCT

    HobbyID = h2.id,

    HobbyName = upvt.HobbyName

    FROM dbo.hobby2 h2

    CROSS APPLY ( --=== This does a high speed "Unpivot" of the second table

    SELECT hobby1 UNION ALL

    SELECT hobby2

    ) upvt (HobbyName)

    WHERE upvt.HobbyName > '' --Is NOT NULL and is NOT BLANK

    ) eav

    ) --=== Now, we reassemble the data using a high-speed cross tab.

    -- For convenience of anticipated queries, we also return the number of hobbies for each "HobbyID"

    SELECT HobbyID,

    Hobby1 = MAX(CASE WHEN ColNum = 1 THEN HobbyName ELSE '' END),

    Hobby2 = MAX(CASE WHEN ColNum = 2 THEN HobbyName ELSE '' END),

    Hobby3 = MAX(CASE WHEN ColNum = 3 THEN HobbyName ELSE '' END),

    Hobby4 = MAX(CASE WHEN ColNum = 4 THEN HobbyName ELSE '' END),

    HobbyCount = COUNT(*)

    FROM cteEnumerate

    GROUP BY HobbyID

    ;

    SET STATISTICS TIME OFF ;

    GO

    Typical results when processing 100,000 rows in each table:

    ---------------------------------------------------------------------------

    build temp tables

    ---------------------------------------------------------------------------

    String manipulation (dups, ordered by column)

    SQL Server Execution Times:

    CPU time = 891 ms, elapsed time = 1875 ms.

    ---------------------------------------------------------------------------

    String manipulation (no dups, ordered by column)

    SQL Server Execution Times:

    CPU time = 1031 ms, elapsed time = 1506 ms.

    ---------------------------------------------------------------------------

    Unpivot/Pivot (dups, ordered by column)

    SQL Server Execution Times:

    CPU time = 1218 ms, elapsed time = 1641 ms.

    ---------------------------------------------------------------------------

    Union All/Cross tab (dups, ordered by column)

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 625 ms, elapsed time = 1618 ms.

    ---------------------------------------------------------------------------

    Union All/Cross tab (no dups, ordered by column)

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 1000 ms, elapsed time = 2384 ms.

    ---------------------------------------------------------------------------

    Union/Cross tab (no dups, not ordered by column)

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 610 ms, elapsed time = 1582 ms.

    ---------------------------------------------------------------------------

    Parsename (dups, ordered by column)

    SQL Server Execution Times:

    CPU time = 1062 ms, elapsed time = 1513 ms.

    ---------------------------------------------------------------------------

    Jeff's Rendition (no dups, not ordered by column)

    SQL Server Execution Times:

    CPU time = 1156 ms, elapsed time = 1785 ms.

    Typical results when processing 200,000 rows in each table:

    ---------------------------------------------------------------------------

    build temp tables

    ---------------------------------------------------------------------------

    String manipulation (dups, ordered by column)

    SQL Server Execution Times:

    CPU time = 1485 ms, elapsed time = 2860 ms.

    ---------------------------------------------------------------------------

    String manipulation (no dups, ordered by column)

    SQL Server Execution Times:

    CPU time = 2125 ms, elapsed time = 3133 ms.

    ---------------------------------------------------------------------------

    Unpivot/Pivot (dups, ordered by column)

    SQL Server Execution Times:

    CPU time = 2157 ms, elapsed time = 3166 ms.

    ---------------------------------------------------------------------------

    Union All/Cross tab (dups, ordered by column)

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 1281 ms, elapsed time = 2864 ms.

    ---------------------------------------------------------------------------

    Union All/Cross tab (no dups, ordered by column)

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 2000 ms, elapsed time = 4827 ms.

    ---------------------------------------------------------------------------

    Union/Cross tab (no dups, not ordered by column)

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 1250 ms, elapsed time = 2863 ms.

    ---------------------------------------------------------------------------

    Parsename (dups, ordered by column)

    SQL Server Execution Times:

    CPU time = 2141 ms, elapsed time = 2931 ms.

    ---------------------------------------------------------------------------

    Jeff's Rendition (no dups, not ordered by column)

    SQL Server Execution Times:

    CPU time = 2297 ms, elapsed time = 3598 ms.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Viewing 15 posts - 31 through 45 (of 92 total)

    You must be logged in to reply to this topic. Login to reply