Handle NULLs

  • Both are brilliant!

    The string-based solution rocks...how simple!

    I'll throw one more approach out there using XML just because it seemed interesting when I thought of it. There is a chance it could be improved but I am no XML guru...the way it stands now it does not compete with either of the others.

    Here are all three solutions with slight mods so they all use the same temp tables.

    SET STATISTICS TIME OFF

    SET NOCOUNT ON ;

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-',75) + '

    build temp tables'

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby1')

    AND type IN ( N'U' ) )

    DROP TABLE hobby1 ;

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby2')

    AND type IN ( N'U' ) )

    DROP TABLE hobby2 ;

    CREATE TABLE hobby1

    (

    id INT,

    hobby1 VARCHAR(20),

    hobby2 VARCHAR(20)

    ) ;

    CREATE TABLE hobby2

    (

    id INT,

    hobby1 VARCHAR(20),

    hobby2 VARCHAR(20)

    ) ;

    INSERT INTO dbo.hobby1

    ( id, hobby1, hobby2 )

    VALUES ( 1, 'basketball', 'tennis' ),

    ( 2, 'basketball', NULL ),

    ( 3, NULL, NULL ),

    ( 4, NULL, 'basketball' ),

    ( 5, 'tennis', NULL ) ;

    INSERT INTO dbo.hobby2

    ( id, hobby1, hobby2 )

    VALUES ( 1, 'football', 'rugby' ),

    ( 2, 'football', NULL ),

    ( 3, NULL, NULL ),

    ( 4, NULL, 'football' ),

    ( 5, 'rugby', NULL ) ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-',75) + '

    get data as xml'

    SET STATISTICS TIME ON ;

    DECLARE @xml XML = (

    SELECT h1.id AS id,

    h1.hobby1 AS 'hobbies/hobby1',

    h1.hobby2 AS 'hobbies/hobby2',

    h2.hobby1 AS 'hobbies/hobby3',

    h2.hobby2 AS 'hobbies/hobby4'

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    FOR

    XML PATH('person'),

    TYPE,

    ROOT('people')

    ) ;

    --SELECT @xml

    SET STATISTICS TIME OFF ;

    PRINT 'xml parsing'

    SET STATISTICS TIME ON ;

    SELECT pref.value('(id/text())[1]', 'int') AS id,

    pref.value('(./hobbies//.[1]/text())[1]', 'varchar(50)') AS h1,

    pref.value('(./hobbies//.[1]/text())[2]', 'varchar(50)') AS h2,

    pref.value('(./hobbies//.[1]/text())[3]', 'varchar(50)') AS h3,

    pref.value('(./hobbies//.[1]/text())[4]', 'varchar(50)') AS h4

    FROM @xml.nodes('/people/person') AS People ( pref )

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-',75) + '

    APPLY method'

    SET STATISTICS TIME ON ;

    -- now transform it by moving all nulls to the end without RBAR

    SELECT t.id,

    c1toc4 AS newc1,

    newc2,

    newc3,

    newc4

    FROM (

    SELECT h1.id AS id,

    h1.hobby1 AS c1,

    h1.hobby2 AS c2,

    h2.hobby1 AS c3,

    h2.hobby2 AS c4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t -- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements

    CROSS APPLY (

    SELECT c1toc4 = COALESCE(c1, c2, c3, c4),

    c2toc4 = COALESCE(c2, c3, c4),

    c3toc4 = COALESCE(c3, c4)

    ) cx

    -- second CROSS APPLY figures out what goes in the second column

    -- we know the 1st column is always c1toc4 - i.e. the first non-null

    -- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4

    -- if it matches either or those we need to either take column 4 or null (if col4 matches c1toc4 we have already used it for new column 1)

    CROSS APPLY (

    SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4, c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))

    ) cx2

    -- and this is for the third column

    CROSS APPLY (

    SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4), NULLIF(NULLIF(c4, newc2), c1toc4))

    ) cx3

    -- and this is for the fourth column

    CROSS APPLY (

    SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)

    ) cx4 ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-',75) + '

    String manipulation'

    SET STATISTICS TIME ON ;

    SELECT t.id,

    RTRIM(SUBSTRING(t.s, 1, 20)),

    RTRIM(SUBSTRING(t.s, 21, 20)),

    RTRIM(SUBSTRING(t.s, 41, 20)),

    RTRIM(SUBSTRING(t.s, 61, 20))

    FROM (

    SELECT h1.id,

    ISNULL(h1.hobby1 + SPACE(20 - LEN(h1.hobby1)), '') +

    ISNULL(h1.hobby2 + SPACE(20 - LEN(h1.hobby2)), '') +

    ISNULL(h2.hobby1 + SPACE(20 - LEN(h2.hobby1)), '') +

    ISNULL(h2.hobby2 + SPACE(20 - LEN(h2.hobby2)), '') s

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t ;

    SET STATISTICS TIME OFF ;

    GO

    On a quiet system these are the results I was seeing consistently:

    ---------------------------------------------------------------------------

    build temp tables

    ---------------------------------------------------------------------------

    get data as xml

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    xml parsing

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 3 ms.

    ---------------------------------------------------------------------------

    APPLY method

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 2 ms.

    ---------------------------------------------------------------------------

    String manipulation

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • The string manipulation is probably the fastest.

    I wanted to try the UNPIVOT/PIVOT just for the heck of it.

    Using Opc.Three's test setup, here is what I did first:

    ;WITH OneTableCTE AS

    (

    SELECTa.id,

    a.hobby1 AS 'h1',

    a.hobby2 AS 'h2',

    b.hobby1 AS 'h3',

    b.hobby2 AS 'h4'

    FROM

    hobby1 a

    INNER JOIN hobby2 b

    ON a.id = b.id

    ), UnPivotCTE AS

    (

    SELECTid,

    hobby,

    ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobbyorder) AS fixedorder

    FROM OneTableCTE

    UNPIVOT (hobby for hobbyorder in (h1,h2,h3,h4)) AS Normalized

    )

    SELECTid,

    [1],

    [2],

    [3],

    [4]

    FROM (SELECT id, hobby, fixedorder FROM UnPivotCTE) AS Undone

    PIVOT

    (

    min(hobby)

    FOR fixedorder IN ([1],[2],[3],[4])

    ) AS PivotTable;

    It works pretty well and is pretty fast about it. My machine was running a bit too quick so I didn't get a good comparison to the string method speed (might try later at home on a slower system).

    The problem with this is that it doesn't handle the "all hobbies are null" case. It just excludes that ID on the unpivot. To get around this, I added in an outer join back to the original tables.

    The string way is certainly more elegant, but this was a lot of fun.

    Here is the finished code for all 4 methods:

    SET STATISTICS TIME OFF

    SET NOCOUNT ON ;

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-',75) + '

    build temp tables'

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby1')

    AND type IN ( N'U' ) )

    DROP TABLE hobby1 ;

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby2')

    AND type IN ( N'U' ) )

    DROP TABLE hobby2 ;

    CREATE TABLE hobby1

    (

    id INT,

    hobby1 VARCHAR(20),

    hobby2 VARCHAR(20)

    ) ;

    CREATE TABLE hobby2

    (

    id INT,

    hobby1 VARCHAR(20),

    hobby2 VARCHAR(20)

    ) ;

    INSERT INTO dbo.hobby1

    ( id, hobby1, hobby2 )

    VALUES ( 1, 'basketball', 'tennis' ),

    ( 2, 'basketball', NULL ),

    ( 3, NULL, NULL ),

    ( 4, NULL, 'basketball' ),

    ( 5, 'tennis', NULL ) ;

    INSERT INTO dbo.hobby2

    ( id, hobby1, hobby2 )

    VALUES ( 1, 'football', 'rugby' ),

    ( 2, 'football', NULL ),

    ( 3, NULL, NULL ),

    ( 4, NULL, 'football' ),

    ( 5, 'rugby', NULL ) ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-',75) + '

    get data as xml'

    SET STATISTICS TIME ON ;

    DECLARE @xml XML = (

    SELECT h1.id AS id,

    h1.hobby1 AS 'hobbies/hobby1',

    h1.hobby2 AS 'hobbies/hobby2',

    h2.hobby1 AS 'hobbies/hobby3',

    h2.hobby2 AS 'hobbies/hobby4'

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    FOR

    XML PATH('person'),

    TYPE,

    ROOT('people')

    ) ;

    --SELECT @xml

    SET STATISTICS TIME OFF ;

    PRINT 'xml parsing'

    SET STATISTICS TIME ON ;

    SELECT pref.value('(id/text())[1]', 'int') AS id,

    pref.value('(./hobbies//.[1]/text())[1]', 'varchar(50)') AS h1,

    pref.value('(./hobbies//.[1]/text())[2]', 'varchar(50)') AS h2,

    pref.value('(./hobbies//.[1]/text())[3]', 'varchar(50)') AS h3,

    pref.value('(./hobbies//.[1]/text())[4]', 'varchar(50)') AS h4

    FROM @xml.nodes('/people/person') AS People ( pref )

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-',75) + '

    APPLY method'

    SET STATISTICS TIME ON ;

    -- now transform it by moving all nulls to the end without RBAR

    SELECT t.id,

    c1toc4 AS newc1,

    newc2,

    newc3,

    newc4

    FROM (

    SELECT h1.id AS id,

    h1.hobby1 AS c1,

    h1.hobby2 AS c2,

    h2.hobby1 AS c3,

    h2.hobby2 AS c4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t -- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements

    CROSS APPLY (

    SELECT c1toc4 = COALESCE(c1, c2, c3, c4),

    c2toc4 = COALESCE(c2, c3, c4),

    c3toc4 = COALESCE(c3, c4)

    ) cx

    -- second CROSS APPLY figures out what goes in the second column

    -- we know the 1st column is always c1toc4 - i.e. the first non-null

    -- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4

    -- if it matches either or those we need to either take column 4 or null (if col4 matches c1toc4 we have already used it for new column 1)

    CROSS APPLY (

    SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4, c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))

    ) cx2

    -- and this is for the third column

    CROSS APPLY (

    SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4), NULLIF(NULLIF(c4, newc2), c1toc4))

    ) cx3

    -- and this is for the fourth column

    CROSS APPLY (

    SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)

    ) cx4 ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-',75) + '

    String manipulation'

    SET STATISTICS TIME ON ;

    SELECT t.id,

    RTRIM(SUBSTRING(t.s, 1, 20)),

    RTRIM(SUBSTRING(t.s, 21, 20)),

    RTRIM(SUBSTRING(t.s, 41, 20)),

    RTRIM(SUBSTRING(t.s, 61, 20))

    FROM (

    SELECT h1.id,

    ISNULL(h1.hobby1 + SPACE(20 - LEN(h1.hobby1)), '') +

    ISNULL(h1.hobby2 + SPACE(20 - LEN(h1.hobby2)), '') +

    ISNULL(h2.hobby1 + SPACE(20 - LEN(h2.hobby1)), '') +

    ISNULL(h2.hobby2 + SPACE(20 - LEN(h2.hobby2)), '') s

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-',75) + '

    Unpivot/Pivot'

    SET STATISTICS TIME ON ;

    ;WITH OneTableCTE AS

    (

    SELECTa.id,

    a.hobby1 AS 'h1',

    a.hobby2 AS 'h2',

    b.hobby1 AS 'h3',

    b.hobby2 AS 'h4'

    FROM

    hobby1 a

    INNER JOIN hobby2 b

    ON a.id = b.id

    ), UnPivotCTE AS

    (

    SELECTid,

    hobby,

    ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobbyorder) AS fixedorder

    FROM OneTableCTE

    UNPIVOT (hobby for hobbyorder in (h1,h2,h3,h4) ) AS Normalized

    )

    SELECTOneTableCTE.id,

    [1],

    [2],

    [3],

    [4]

    FROM (SELECT id, hobby, fixedorder FROM UnPivotCTE) AS Undone

    PIVOT

    (

    min(hobby)

    FOR fixedorder IN ([1],[2],[3],[4])

    ) AS PivotTable

    RIGHT OUTER JOIN OneTableCTE ON PivotTable.id = OneTableCTE.id

    SET STATISTICS TIME OFF ;

    GO

  • I'd love to hear from the OP again 🙂 Are you out there mikes84?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Yeah, I'm not sure if he thought he'd get 6 or 7 different working answers.

  • Yeah, I sure wasn't expecting so many answers. I started to revisit the problem again. What do you think of this (also not that pretty):

    SELECT

    ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 1)), '') AS Hobby1,

    ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 2)), '') AS Hobby2,

    ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 3)), '') AS Hobby3,

    ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(hobby1, ' ') + ISNULL(hobby2, ' ') + ISNULL(hobby3, ' ') + ISNULL(hobby4, ' '), ' ', '.')), 4)), '') AS Hobby4

    FROM <tables> etc.

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • Actually that won't work. I think some derivative of that would though.

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • OK, how about this?

    ;WITH data(hobby1,hobby2,hobby3,hobby4)

    AS

    (

    SELECT 'basketball',NULL,NULL,NULL UNION ALL

    SELECT NULL,'basketball',NULL,NULL UNION ALL

    SELECT NULL,NULL,'basketball',NULL UNION ALL

    SELECT NULL,NULL,NULL,'basketball' UNION ALL

    SELECT 'basketball','tennis',NULL,NULL UNION ALL

    SELECT 'basketball',NULL,'tennis',NULL UNION ALL

    SELECT 'basketball',NULL,NULL,'tennis' UNION ALL

    SELECT NULL,'basketball','tennis',NULL UNION ALL

    SELECT NULL,'basketball',NULL,'tennis' UNION ALL

    SELECT NULL,NULL,'basketball','tennis' UNION ALL

    SELECT 'basketball','tennis','cricket',NULL UNION ALL

    SELECT 'basketball','tennis',NULL,'cricket' UNION ALL

    SELECT 'basketball',NULL,'tennis','cricket' UNION ALL

    SELECT NULL,'basketball','tennis','cricket' UNION ALL

    SELECT 'basketball','tennis','cricket','curling'

    )

    SELECT *

    INTO #temp

    FROM data d1

    SELECT REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +

    CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +

    CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +

    CASE WHEN hobby4 IS NULL

    THEN '' ELSE hobby4 END), ' ', '.')), 1)) AS Hobby1,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +

    CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +

    CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +

    CASE WHEN hobby4 IS NULL THEN '' ELSE hobby4 END), ' ', '.')), 2)) AS Hobby2,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +

    CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +

    CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +

    CASE WHEN hobby4 IS NULL THEN '' ELSE hobby4 END), ' ', '.')), 3)) AS Hobby3,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN hobby1 IS NULL THEN '' ELSE hobby1 + ' ' END +

    CASE WHEN hobby2 IS NULL THEN '' ELSE hobby2 + ' ' END +

    CASE WHEN hobby3 IS NULL THEN '' ELSE hobby3 + ' ' END +

    CASE WHEN hobby4 IS NULL THEN '' ELSE hobby4 END), ' ', '.')), 4)) AS Hobby4

    FROM #temp

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • What nerve did these nulls hit, that sop many people are spending so much time to develop and perfect so many solutions?

  • keep in mind any string parsing relies on finding a pattern that isn't used. I personally wouldn't be banking on space as a delimiter.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I upped the ante to see how each technique would really do when presented with more than 5 rows of data in each table and in the process I broke the APPLY method (see the NULL in h3 and h4 now) AND the PARSENAME method (nothing coming back at all in any column). Further to that the Union All/Cross-tab method is taking more CPU but less time than the elegant String Manipulation method :unsure:

    Please someone tell me what I did wrong:

    SET NOCOUNT ON;

    GO

    SET STATISTICS TIME OFF ;

    PRINT REPLICATE('-', 75) + '

    build temp tables'

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby1')

    AND type IN (N'U') )

    DROP TABLE hobby1 ;

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'hobby2')

    AND type IN (N'U') )

    DROP TABLE hobby2 ;

    CREATE TABLE hobby1

    (

    id INT IDENTITY(1, 1),

    hobby1 VARCHAR(36),

    hobby2 VARCHAR(36)

    ) ;

    CREATE TABLE hobby2

    (

    id INT,

    hobby1 VARCHAR(36),

    hobby2 VARCHAR(36)

    ) ;

    INSERT INTO dbo.hobby1

    (

    hobby1,

    hobby2

    )

    SELECT DISTINCT TOP (1000)

    CAST(NEWID() AS VARCHAR(36)),

    CAST(NEWID() AS VARCHAR(36))

    FROM master.sys.columns c1

    CROSS JOIN master.sys.columns c2 ;

    INSERT INTO dbo.hobby2

    (

    id,

    hobby1,

    hobby2

    )

    SELECT id,

    hobby2,

    hobby1

    FROM dbo.hobby1 ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    APPLY method'

    SELECT REPLICATE('-', 75) + '

    APPLY method'

    SET STATISTICS TIME ON ;

    -- now transform it by moving all nulls to the end without RBAR

    SELECT t.id,

    c1toc4 AS newc1,

    newc2,

    newc3,

    newc4

    FROM (

    SELECT h1.id AS id,

    h1.hobby1 AS c1,

    h1.hobby2 AS c2,

    h2.hobby1 AS c3,

    h2.hobby2 AS c4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t -- first CROSS APPLY is just to provide shortcuts for the various COALESCE statements

    CROSS APPLY (

    SELECT c1toc4 = COALESCE(c1, c2, c3, c4),

    c2toc4 = COALESCE(c2, c3, c4),

    c3toc4 = COALESCE(c3, c4)

    ) cx

    -- second CROSS APPLY figures out what goes in the second column

    -- we know the 1st column is always c1toc4 - i.e. the first non-null

    -- so here we compare the first non-null in cols 2-4 with c1toc4 and c3toc4

    -- if it matches either or those we need to either take column 4 or

    -- null (if col4 matches c1toc4 we have already used it for new column 1)

    CROSS APPLY (

    SELECT newc2 = COALESCE(NULLIF(COALESCE(NULLIF(c2toc4,

    c1toc4), c3toc4), c1toc4), NULLIF(c4, c1toc4))

    ) cx2

    -- and this is for the third column

    CROSS APPLY (

    SELECT newc3 = COALESCE(NULLIF(NULLIF(c3toc4, newc2), c1toc4),

    NULLIF(NULLIF(c4, newc2), c1toc4))

    ) cx3

    -- and this is for the fourth column

    CROSS APPLY (

    SELECT newc4 = NULLIF(NULLIF(NULLIF(c4, newc3), newc2), c1toc4)

    ) cx4 ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    String manipulation'

    SELECT REPLICATE('-', 75) + '

    String manipulation'

    SET STATISTICS TIME ON ;

    SELECT t.id,

    RTRIM(SUBSTRING(t.s, 1, 36)),

    RTRIM(SUBSTRING(t.s, 37, 36)),

    RTRIM(SUBSTRING(t.s, 73, 36)),

    RTRIM(SUBSTRING(t.s, 109, 36))

    FROM (

    SELECT h1.id,

    ISNULL(h1.hobby1 + SPACE(36 - LEN(h1.hobby1)), '') +

    ISNULL(h1.hobby2 + SPACE(36 - LEN(h1.hobby2)), '') + ISNULL(h2.hobby1 + SPACE(36

    - LEN(h2.hobby1)),'') + ISNULL(h2.hobby2

    + SPACE(36- LEN(h2.hobby2)),'') s

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id

    ) t ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Unpivot/Pivot'

    SELECT REPLICATE('-', 75) + '

    Unpivot/Pivot'

    SET STATISTICS TIME ON ;

    ;

    WITH OneTableCTE

    AS (

    SELECT a.id,

    a.hobby1 AS 'h1',

    a.hobby2 AS 'h2',

    b.hobby1 AS 'h3',

    b.hobby2 AS 'h4'

    FROM hobby1 a

    INNER JOIN hobby2 b ON a.id = b.id

    ),

    UnPivotCTE

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobbyorder) AS fixedorder

    FROM OneTableCTE UNPIVOT ( hobby FOR hobbyorder IN (h1, h2, h3, h4) )

    AS Normalized

    )

    SELECT OneTableCTE.id,

    [1],

    [2],

    [3],

    [4]

    FROM (

    SELECT id,

    hobby,

    fixedorder

    FROM UnPivotCTE

    ) AS Undone PIVOT ( MIN(hobby) FOR fixedorder IN ([1], [2], [3], [4]) ) AS PivotTable

    RIGHT OUTER JOIN OneTableCTE ON PivotTable.id = OneTableCTE.id

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Union All/Cross tab'

    SELECT REPLICATE('-', 75) + '

    Union All/Cross tab'

    SET STATISTICS TIME ON ;

    WITH cte(id, hobby, hobby_num)

    AS (

    SELECT h.id,

    h.hobby1,

    1

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby1,

    2

    FROM dbo.hobby1 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby1,

    3

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby1

    WHERE id = h.id )

    UNION ALL

    SELECT h.id,

    h.hobby2,

    4

    FROM dbo.hobby2 h

    WHERE EXISTS ( SELECT *

    FROM dbo.hobby2

    WHERE id = h.id )

    ),

    cte2(id, hobby, row_num)

    AS (

    SELECT id,

    hobby,

    ROW_NUMBER() OVER (PARTITION BY id ORDER BY hobby_num)

    FROM cte

    WHERE hobby IS NOT NULL

    )

    SELECT id,

    MAX(CASE WHEN row_num = 1 THEN hobby

    END) AS hobby1,

    MAX(CASE WHEN row_num = 2 THEN hobby

    END) AS hobby2,

    MAX(CASE WHEN row_num = 3 THEN hobby

    END) AS hobby3,

    MAX(CASE WHEN row_num = 4 THEN hobby

    END) AS hobby4

    FROM cte2

    GROUP BY id ;

    SET STATISTICS TIME OFF ;

    GO

    ------------------------------------------------------------------------------------

    PRINT REPLICATE('-', 75) + '

    Parsename'

    SELECT REPLICATE('-', 75) + '

    Parsename'

    SET STATISTICS TIME ON ;

    SELECT h1.id,

    ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +

    ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),

    1)), '') AS Hobby1,

    ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +

    ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),

    2)), '') AS Hobby2,

    ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +

    ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),

    3)), '') AS Hobby3,

    ISNULL(REVERSE(PARSENAME(REVERSE(REPLACE(ISNULL(h1.hobby1, ' ') +

    ISNULL(h1.hobby2, ' ') + ISNULL(h2.hobby1, ' ') + ISNULL(h2.hobby2, ' '), ' ', '.')),

    4)), '') AS Hobby4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id ;

    SET STATISTICS TIME OFF ;

    GO

    Results on my machine:

    ---------------------------------------------------------------------------

    build temp tables

    ---------------------------------------------------------------------------

    APPLY method

    SQL Server Execution Times:

    CPU time = 63 ms, elapsed time = 232 ms.

    ---------------------------------------------------------------------------

    String manipulation

    SQL Server Execution Times:

    CPU time = 15 ms, elapsed time = 115 ms.

    ---------------------------------------------------------------------------

    Unpivot/Pivot

    SQL Server Execution Times:

    CPU time = 32 ms, elapsed time = 270 ms.

    ---------------------------------------------------------------------------

    Union All/Cross tab

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 79 ms.

    ---------------------------------------------------------------------------

    Parsename

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 170 ms.

    EDIT: add some line breaks in code window

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • mikes84, could you edit your post so the code has some line breaks in it? It's making the page reallllly wide.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc, what about this for the PARSENAME method?

    SELECT h1.id,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 1)) AS Hobby1,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 2)) AS Hobby2,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 3)) AS Hobby3,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 4)) AS Hobby4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id ;

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • mikes84 (7/7/2011)


    opc, what about this for the PARSENAME method?

    SELECT h1.id,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 1)) AS Hobby1,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 2)) AS Hobby2,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 3)) AS Hobby3,

    REVERSE(PARSENAME(REVERSE(REPLACE(RTRIM(

    CASE WHEN h1.hobby1 IS NULL THEN '' ELSE h1.hobby1 + ' ' END +

    CASE WHEN h1.hobby2 IS NULL THEN '' ELSE h1.hobby2 + ' ' END +

    CASE WHEN h2.hobby1 IS NULL THEN '' ELSE h2.hobby1 + ' ' END +

    CASE WHEN h2.hobby2 IS NULL THEN '' ELSE h2.hobby2 END), ' ', '.')), 4)) AS Hobby4

    FROM dbo.hobby1 h1

    JOIN dbo.hobby2 h2 ON h1.id = h2.id ;

    The result set looks much fuller now 🙂

    Perf results:

    Parsename

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 884 ms.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • mikes84 (7/6/2011)


    I have a situation where I'm querying two tables, each that has up to two hobbies listed. The first table has hobby1 and hobby2, and the second table has hobby3 and hobby4. It's possible for there to be values in any of the fields. So, it could look like:

    hobby1, hobby2, hobby3, hobby4

    basketball, tennis, NULL, NULL

    or

    hobby1, hobby2, hobby3, hobby4

    NULL, NULL, basketball, tennis

    However, it could also look like:

    hobby1, hobby2, hobby3, hobby4

    basketball, NULL, tennis, NULL

    Ideally, I want it to look like the first example, where the values are brought to the left. I could use COALESCE in the first and second examples and have it work, but it wouldn't work on the third example. Any ideas how I could get something like the third example to look like the first one?

    Mike

    Hi Orlando (and other contributors),

    I don't understand the test data you are using... perhaps it's me but the question was about shuffling values into the leftmost columns where there were nulls between them.

    The test data you generate has no NULLS and duplicate entries.

    As far as I can see, the CROSS APPLY method I provided is the only one that is correctly processing this data into the two leftmost columns and removing duplicate entries (the removal of duplicates may not be a stated requirement, but it suits me to include it and it makes sense as you wouldn't want duplicates anyway)...

    Or am I completely missing the point?

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • mister.magoo (7/7/2011)


    . . . As far as I can see, the CROSS APPLY method I provided is the only one that is correctly processing this data into the two leftmost columns and removing duplicate entries (the removal of duplicates may not be a stated requirement, but it suits me to include it and it makes sense as you wouldn't want duplicates anyway)...

    One should NOT remoive duplicates unless it is an explicit requirement.

  • Viewing 15 posts - 16 through 30 (of 92 total)

    You must be logged in to reply to this topic. Login to reply