STUFF XML PATH + Partition By

  • Hello,

    I need to concatenate multiple rows into one based on a condition:

    DC, UPC, MASTERCHAINNAME

    DC01, 9780061491900, TARGET

    DC01, 9780061491900, WALMART

    DC01, 9780061491900, COTSCO

    So I want it to be

    DC, UPC, MASTERCHAINNAME

    DC01, 9780061491900, TARGET, WALMART, COSTCO

    I read it is possible to do it with the help of STUFF XML PATH function, which does concat all fields, but it doesn't group them therefore I get all the chains in one column and it doesn't look if it applies to that DC and UPC#. I need a function like PARTITION BY that can filter MASTERCHAINNMAES by DC & UPC.

    Issue_01

     

    SELECT DC, UPC, stuff((select ', ' + MASTERCHAINNAME 
    from pre_filtering
    for xml path ('')
    ), 1, 2, '') as MASTERCHAINNAME
    FROM pre_filtering
    ORDER BY DC, UPC

     

    Right now it just gives me all chain names and that is wrong (without looking at UNIT# (UPC) and DC.

    Does someone know how this function can work?

    • This topic was modified 1 year, 2 months ago by  JeremyU. Reason: SQL
    • This topic was modified 1 year, 2 months ago by  JeremyU.
  • Create some test data in a consumable format (you should have done this):

    DROP TABLE IF EXISTS #pre_filtering;

    CREATE TABLE #pre_filtering (
    DC VARCHAR(50),
    UPC VARCHAR(50),
    MASTERCHAINNAME VARCHAR(50)
    );

    INSERT INTO #pre_filtering (DC, UPC, MASTERCHAINNAME)
    VALUES ('DC01', '9780061491900', 'TARGET'),
    ('DC01', '9780061491900', 'WALMART'),
    ('DC01', '9780061491900', 'COSTCO')
    ;

    With SQL 2019 you can use STRING_AGG instead:

    SELECT DC, 
    UPC,
    STRING_AGG(MASTERCHAINNAME, ',') as MASTERCHAINNAME
    FROM #pre_filtering
    GROUP BY DC, UPC
    ORDER BY DC, UPC
    ;

     

  • I have SSMS v18.5.1. When I tried to do it, it gave me an error namely 'STRING_AGG' is not a recognized built-in function name.

  • JeremyU wrote:

    I have SSMS v18.5.1. When I tried to do it, it gave me an error namely 'STRING_AGG' is not a recognized built-in function name.

    It's the version of SQL Server not the version of SSMS you need.

    Try: SELECT @@VERSION

  • You need a GROUP BY:

    SELECT DC, UPC, stuff((select ', ' + MASTERCHAINNAME 
    from pre_filtering b
    where b.DC = a.DC
    and b.UPC = a.UPC
    for xml path ('')
    ), 1, 2, '') as MASTERCHAINNAME
    FROM pre_filtering a
    GROUP BY DC, UPC
    ORDER BY DC, UPC
    ;

    • This reply was modified 1 year, 2 months ago by  Jonathan AC Roberts. Reason: Added where b.DC = a.DC and b.UPC = a.UPC
  • Jonathan it is Microsoft SQL Server 2016

  • JeremyU wrote:

    Jonathan it is Microsoft SQL Server 2016

    You asked the question in the SQL 2019 forum.

    Try the code in the last comment I made.

  • Jonathan thanks for your help. I appreciate it.

    There is an issue here. It just prints all the chain names and doesn't differentiate that some UPCs are not available or available for different locations.

    Issue_04

     

    Whereas this unit is available only at those locations

    Issue_06

     

    I don't know if you want to see my whole query, but this is it:

    I am just trying to list in which CHAINS items are available, but the code as it is right now  just lists all the chains without looking at DC and UPC (item#).

    ;with pre_filtering AS (
    SELECT distinct a.[DC], a.[UPC]
    ,b.[STARTDATE] AS 'MIN START POG DATE'
    ,c.[ENDDATE] AS 'MAX START POG DATE'
    ,a.[MASTERCHAINNAME]
    FROM [snapshot] a
    LEFT OUTER JOIN (SELECT distinct UPC, MASTERCHAINNAME, STARTDATE, ROW_NUMBER() OVER ( PARTITION BY UPC, MASTERCHAINNAME ORDER BY STARTDATE ASC) RN FROM [snapshot]) b
    ON a.UPC=b.UPC
    AND a.[MASTERCHAINNAME]=b.[MASTERCHAINNAME]
    LEFT OUTER JOIN (SELECT distinct UPC, MASTERCHAINNAME, ENDDATE, ROW_NUMBER() OVER ( PARTITION BY UPC, MASTERCHAINNAME ORDER BY ENDDATE DESC) RN FROM [snapshot]) c
    ON a.UPC=c.UPC
    AND a.[MASTERCHAINNAME]=c.[MASTERCHAINNAME]
    WHERE b.RN='1' and c.RN='1'

    )


    SELECT DC, UPC, MIN([MIN START POG DATE]) OVER (PARTITION BY UPC) AS 'MIN START POG DATE'
    ,MAX([MAX START POG DATE]) OVER (PARTITION BY UPC) AS 'MAX START POG DATE'

    ,stuff((select ', ' + MASTERCHAINNAME
    from pre_filtering
    for xml path ('')
    ), 1, 2, '') as MASTERCHAINNAME -- doesn't understand that some UPCs + DCs are not connected to MASTERCHAINNAMES
    FROM pre_filtering
    GROUP BY DC, UPC, [MIN START POG DATE], [MAX START POG DATE]
    ORDER BY DC, UPC
    ;


     

     

     

  • Try this:

    ;WITH pre_filtering AS 
    (
    SELECT DISTINCT
    a.[DC],
    a.[UPC],
    b.[STARTDATE] AS 'MIN START POG DATE',
    c.[ENDDATE] AS 'MAX START POG DATE',
    a.[MASTERCHAINNAME]
    FROM [snapshot] a
    LEFT JOIN (SELECT DISTINCT
    UPC,
    MASTERCHAINNAME,
    STARTDATE,
    ROW_NUMBER() OVER(PARTITION BY UPC, MASTERCHAINNAME ORDER BY STARTDATE ASC) RN
    FROM [snapshot]
    ) b
    ON a.UPC = b.UPC
    AND a.[MASTERCHAINNAME] = b.[MASTERCHAINNAME]
    LEFT JOIN (SELECT DISTINCT
    UPC,
    MASTERCHAINNAME,
    ENDDATE,
    ROW_NUMBER() OVER(PARTITION BY UPC, MASTERCHAINNAME ORDER BY ENDDATE DESC) RN
    FROM [snapshot]
    ) c ON a.UPC = c.UPC
    AND a.[MASTERCHAINNAME] = c.[MASTERCHAINNAME]
    WHERE b.RN = '1'
    AND c.RN = '1'
    )
    SELECT a.DC,
    a.UPC,
    MIN(a.[MIN START POG DATE]) OVER(PARTITION BY a.UPC) AS 'MIN START POG DATE',
    MAX(a.[MAX START POG DATE]) OVER(PARTITION BY a.UPC) AS 'MAX START POG DATE',
    STUFF((SELECT ', ' + b.MASTERCHAINNAME
    FROM pre_filtering b
    WHERE b.DC = a.DC
    AND b.UPC = a.UPC
    FOR xml PATH('')
    ), 1, 2, '') AS MASTERCHAINNAME -- doesn't understand that some UPCs + DCs are not connected to MASTERCHAINNAMES
    FROM pre_filtering a
    GROUP BY DC, UPC, [MIN START POG DATE], [MAX START POG DATE]
    ORDER BY DC, UPC
    ;
  • Something must be incorrect. It just keeps running and running. There are only around 249K rows I believe and it is running for pretty long time already and no output.

  • Try this instead, it splits it into 2 separate statements:

    IF OBJECT_ID('tempdb..#pre_filtering') IS NOT NULL
    DROP TABLE #pre_filtering;

    ;WITH pre_filtering AS
    (
    SELECT DISTINCT
    a.[DC],
    a.[UPC],
    b.[STARTDATE] AS 'MIN START POG DATE',
    c.[ENDDATE] AS 'MAX START POG DATE',
    a.[MASTERCHAINNAME]
    FROM [snapshot] a
    LEFT JOIN (SELECT DISTINCT
    UPC,
    MASTERCHAINNAME,
    STARTDATE,
    ROW_NUMBER() OVER(PARTITION BY UPC, MASTERCHAINNAME ORDER BY STARTDATE ASC) RN
    FROM [snapshot]
    ) b
    ON a.UPC = b.UPC
    AND a.[MASTERCHAINNAME] = b.[MASTERCHAINNAME]
    LEFT JOIN (SELECT DISTINCT
    UPC,
    MASTERCHAINNAME,
    ENDDATE,
    ROW_NUMBER() OVER(PARTITION BY UPC, MASTERCHAINNAME ORDER BY ENDDATE DESC) RN
    FROM [snapshot]
    ) c ON a.UPC = c.UPC
    AND a.[MASTERCHAINNAME] = c.[MASTERCHAINNAME]
    WHERE b.RN = '1'
    AND c.RN = '1'
    )
    SELECT *
    INTO #pre_filtering
    FROM pre_filtering
    ;

    SELECT a.DC,
    a.UPC,
    MIN(a.[MIN START POG DATE]) OVER(PARTITION BY a.DC, a.UPC) AS 'MIN START POG DATE',
    MAX(a.[MAX START POG DATE]) OVER(PARTITION BY a.DC, a.UPC) AS 'MAX START POG DATE',
    STUFF((SELECT ', ' + b.MASTERCHAINNAME
    FROM #pre_filtering b
    WHERE b.DC = a.DC
    AND b.UPC = a.UPC
    FOR xml PATH('')
    ), 1, 2, '') AS MASTERCHAINNAME -- doesn't understand that some UPCs + DCs are not connected to MASTERCHAINNAMES
    FROM #pre_filtering a
    GROUP BY DC, UPC, [MIN START POG DATE], [MAX START POG DATE]
    ORDER BY DC, UPC
    ;
  • If you want to make it even faster you can add an index to the temporary table:

    IF OBJECT_ID('tempdb..#pre_filtering') IS NOT NULL
    DROP TABLE #pre_filtering;

    ;WITH pre_filtering AS
    (
    SELECT DISTINCT
    a.[DC],
    a.[UPC],
    b.[STARTDATE] AS 'MIN START POG DATE',
    c.[ENDDATE] AS 'MAX START POG DATE',
    a.[MASTERCHAINNAME]
    FROM [snapshot] a
    LEFT JOIN (SELECT DISTINCT
    UPC,
    MASTERCHAINNAME,
    STARTDATE,
    ROW_NUMBER() OVER(PARTITION BY UPC, MASTERCHAINNAME ORDER BY STARTDATE ASC) RN
    FROM [snapshot]
    ) b
    ON a.UPC = b.UPC
    AND a.[MASTERCHAINNAME] = b.[MASTERCHAINNAME]
    LEFT JOIN (SELECT DISTINCT
    UPC,
    MASTERCHAINNAME,
    ENDDATE,
    ROW_NUMBER() OVER(PARTITION BY UPC, MASTERCHAINNAME ORDER BY ENDDATE DESC) RN
    FROM [snapshot]
    ) c ON a.UPC = c.UPC
    AND a.[MASTERCHAINNAME] = c.[MASTERCHAINNAME]
    WHERE b.RN = '1'
    AND c.RN = '1'
    )
    SELECT *
    INTO #pre_filtering
    FROM pre_filtering
    ;

    CREATE NONCLUSTERED INDEX IX_#pre_filtering_1 ON #pre_filtering(DC, UPC)
    INCLUDE (MASTERCHAINNAME, [MIN START POG DATE], [MAX START POG DATE])
    ;

    SELECT a.DC,
    a.UPC,
    MIN(a.[MIN START POG DATE]) OVER(PARTITION BY a.DC, a.UPC) AS 'MIN START POG DATE',
    MAX(a.[MAX START POG DATE]) OVER(PARTITION BY a.DC, a.UPC) AS 'MAX START POG DATE',
    STUFF((SELECT ', ' + b.MASTERCHAINNAME
    FROM #pre_filtering b
    WHERE b.DC = a.DC
    AND b.UPC = a.UPC
    FOR xml PATH('')
    ), 1, 2, '') AS MASTERCHAINNAME -- doesn't understand that some UPCs + DCs are not connected to MASTERCHAINNAMES
    FROM #pre_filtering a
    GROUP BY DC, UPC, [MIN START POG DATE], [MAX START POG DATE]
    ORDER BY DC, UPC
    ;

     

  • Jonathan thank you!! It works. I appreciate your help!!

  • NM.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • NM.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply