Is there an joiner much like the 8kSplitter?

  • I'm a fan of Jeff's 8k Splitter function, works a charm, but we have an application which feeds in a single column with a list of comma seperated values.

    Currently a developer has written a function with a cursor looping round building the string, but it is shockingly bad on performance.

    So just wondering if there is an inverse of the splitter which is speedy.

    This is the function code, table and column names changed as its senstive information

    DECLARE @sid INT

    DECLARE @descr varchar(255)

    DECLARE @item varchar(255)

    DECLARE cDetailsCursor CURSOR

    READ_ONLY

    FOR

    select csg.CID

    from table1 csg

    JOIN table2 map

    ON map.ID = csg.ID

    WHERE map.SID = @sid

    union

    select cs.CID

    FROM table3 cs

    where cs.SID = @sid

    OPEN cDetailsCursor

    SET @descr = ''

    FETCH NEXT FROM cDetailsCursor INTO @item

    WHILE (@@fetch_status <> -1)

    BEGIN

    IF (@@fetch_status <> -2)

    BEGIN

    IF Len(@descr) = 0

    BEGIN

    SET @descr = @item

    END

    ELSE

    BEGIN

    SET @descr = @descr + ', ' + @item

    END

    END

    FETCH NEXT FROM cDetailsCursor INTO @item

    END

    CLOSE cDetailsCursor

    DEALLOCATE cDetailsCursor

    RETURN(@descr)

    The cursor brings back a list of INT's

    eg

    1

    2

    3

    4

    5

    6

    and outputs

    '1, 2, 3, 4, 5, 6'

    The select outputs at the moment 4286808 rows without the where clauses, there can be any number of CID's assigned to a SID so I am thinking it might have to be a dynmaic unpivot as one SID might have 10 CID's and another may have 3000000 CID's.

  • Anthony

    Just declare a variable and concatenate in one single SELECT statement, something like this (not tested):

    SET @MyVariable = ''

    SELECT @MyVariable = @MyVariable + ', ' + MyColumn

    FROM MyTable

    ORDER BY MyKeyColumn

    -- Remove leading comma

    SET @MyVariable = RIGHT(@MyVariable ,LEN(@MyVariable )-1)

    SELECT @MyVariable

    John

    Edit: I know I haven't declared the variable - something in the websweeper here won't allow that statement.

  • CROSSTAB and FOR XML PATH would be the most popular around here, there are loads of examples on ssc. FWIW there's a similar thread here.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks Gents, been one of them days and nothing gone right so far, so my heads not in the game.

    Gone with John's responce as its the simpler to impliment, just changed the -1 to a -2 to remove the comma and the space.

    Cheers again gents.

  • Here is another way to it:

    WITH CTE AS

    (

    SELECT

    name as TableName,

    object_id

    FROM sys.tables WHERE schema_id = schema_id('dbo')

    )

    SELECT TableName,

    CommaList = STUFF((

    SELECT ',' + c.name

    FROM sys.columns c

    WHERE c.object_id = CTE.object_id

    ORDER BY c.column_id

    FOR XML PATH(''),TYPE).value('.','varchar(max)'),1,1,'')

    FROM CTE

    ORDER BY Tablename;

  • anthony.green (7/26/2012)


    Thanks Gents, been one of them days and nothing gone right so far, so my heads not in the game.

    Gone with John's responce as its the simpler to impliment, just changed the -1 to a -2 to remove the comma and the space.

    Cheers again gents.

    if you are going to be using it for a function i would have a look at lynn's STUFF with FOR XML PATH as that could be put together into a iTVF. get an even bigger performance boost over the cursor.

    EDIT: after looking at who i was responding to i am willing to hazard a guess that anthony knows that all ready.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • anthony.green (7/26/2012)


    Thanks Gents, been one of them days and nothing gone right so far, so my heads not in the game.

    Gone with John's responce as its the simpler to impliment, just changed the -1 to a -2 to remove the comma and the space.

    Cheers again gents.

    Be sure to handle nulls appropriately (something STUFF does for you). If you don't any one null entry in the table will NULL out the entire list.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Thanks all.

    With John's solution the view complete its run in 7 minutes 7 seconds and didnt max out TempDB.

    I have had to change Lynn's solution slightly

    DECLARE @descr VARCHAR(MAX) = '', @sid INT = 10

    ;WITH CTE AS

    (

    SELECT

    csg.CID

    FROM

    table1 csg

    JOIN

    table2 map

    ON

    map.PID = csg.PID

    WHERE

    map.SID = @sid

    UNION

    SELECT

    cs.CID

    FROM

    table3 cs

    WHERE

    cs.SID = @sid

    ),

    CTE2 AS(

    SELECT CommaList = STUFF((

    SELECT ', ' + CONVERT(VARCHAR,c.CID)

    FROM CTE c

    ORDER BY c.ID

    FOR XML PATH(''),TYPE).value('.','varchar(max)'),1,2,'')

    FROM CTE

    )

    SELECT @descr = CommaList FROM CTE2 GROUP BY CommaList

    RETURN (@descr)

    Wrapped the STUFF into another CTE due to it duplicating the row for how ever many CID's there are, so that can do a group by to just return the 1 distinct row.

    Currently running it based on Lynn's solution.

    Matt, luckly enough the columns in the select are primary keys so I dont need to worry about encountering a NULL value, but thanks for the reminder.

    Edit, doh forgot to change the return type, thats now been done to return a table so its iTVF and not scalar.

  • John's solution just run again took 8mins 8 seconds

    Lynn's solution as used in a iTVF took 18 minutes 26 seconds.

  • anthony.green (7/27/2012)


    John's solution just run again took 8mins 8 seconds

    Lynn's solution as used in a iTVF took 18 minutes 26 seconds.

    How many rows are you working with? It would be interesting to see how the cross-tab method compares.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • anthony.green (7/27/2012)


    Thanks all.

    With John's solution the view complete its run in 7 minutes 7 seconds and didnt max out TempDB.

    I have had to change Lynn's solution slightly

    <snip>

    Do me a favour and try Lynn's solution again like this: -

    DECLARE @descr VARCHAR(MAX) = '', @sid INT = 10;

    WITH CTE AS (

    SELECT csg.CID

    FROM table1 csg

    INNER JOIN table2 map ON map.PID = csg.PID

    WHERE map.SID = @sid

    UNION

    SELECT cs.CID

    FROM table3 cs

    WHERE cs.SID = @sid)

    SELECT @descr = STUFF((SELECT ', ' + CONVERT(VARCHAR, c.CID)

    FROM CTE c

    WHERE c.CID = a.CID

    ORDER BY c.ID

    FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 2, '')

    FROM CTE a

    RETURN (@descr)

    You removed the WHERE clause in the subquery, which made the optimiser have to do a lot more work than it needed to.

    Scratch that. Can you provide DDL? I'd be interested to test the performance since I've always found FOR XML to be faster at string concatenation.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • The CTE within the function unfiltered on a SID brings back over 4.2 million rows,

  • Not exactly sure what is happening, could you post the fdl for the table, some sample data, and your code so I can play a little?

  • Oh, and the expected results based on the sample data.

    Thanks.

  • Lynn Pettis (7/27/2012)


    Oh, and the expected results based on the sample data.

    Thanks.

    Lynn, just out of curiosity - because I don't know and "a developer should always know" - do you mind if I take the cross-tab version?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 15 posts - 1 through 15 (of 56 total)

You must be logged in to reply to this topic. Login to reply