string_agg issue

  • I have a string_agg line in my code that is bringing back hundreds of duplicates in my service_cd field. Any ideas? Thx!:

    mbr_name     codes

    abc                   971, 971, 971, 971..............

    xyz                   978, 978, 978, 978..............

    select distinct
    s.MBR_NAME
    , string_agg(cast(s.SERVICE_CD as varchar(max)), ', ') as codes
    from MyReporting.dbo.EPISODE as s
    where s.EPISODE_TYPE in ( x2, x3 )
    group by s.MBR_NAME
    , s.SERVICE_CD
  • Your DISTINCT is applied to the result of the STRING_AGG, not to the elements it contains.

    Remove the dupes in a CTE and select your STRING_AGG from that.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • What is the datatype of SERVICE_CD? CASTing every element to VARCHAR(MAX) sounds like madness.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • It's varchar(15). Casting as varchar(max) was the seemingly only way to avoid this error:

    "STRING_AGG aggregation result exceeded the limit of 8000 bytes. Use LOB types to avoid result truncation."

  • DaveBriCam wrote:

    It's varchar(15). Casting as varchar(max) was the seemingly only way to avoid this error:

    "STRING_AGG aggregation result exceeded the limit of 8000 bytes. Use LOB types to avoid result truncation."

    After getting rid of the dupes, hopefully the error goes away without the need for a CAST.

    • This reply was modified 4 months, 2 weeks ago by  Phil Parkin.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Maybe like this?:

    select 
    s.MBR_NAME
    , string_agg(cast(s.SERVICE_CD as varchar(max)), ', ') as codes
    from (
    select distinct MBR_NAME, SERVICE_CD
    FROM MyReporting.dbo.EPISODE
    where EPISODE_TYPE in ( x2, x3 )
    ) as s
    group by s.MBR_NAME

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Thx! I prefer, for now, to look for a subquery since my *real* SQL already has one CTE (I've used 20+ before). I should not have oversimplified my code, but below is what I'm really up against. I tried the below and the duplicates persist

    declare @startdate date = '2023-08-01';
    declare @enddate date = '2023-08-31';

    with CTE_SARids
    as -- this is for collecting all the 'SAR' numbers for a date range along with the patients
    (select distinct
    DH.ENC_IDN
    , DH.MBR_IDN
    , DH.HIST_ADDED_DATE
    from MyReporting.dbo.V_MODEL_EPISODE_DECISION_HIST as DH
    where DH.HIST_ADDED_DATE
    between @startdate and @enddate
    group by DH.ENC_IDN
    , DH.MBR_IDN
    , DH.HIST_ADDED_DATE)

    -- Bringing back rest of needed data
    select SAR.ENC_IDN as 'SAR ID'
    , M.INDIVIDUALIZATION_NUMBER as 'Patient ID'
    , M.MBR_NAME as 'Patient Name'
    , DH.CURRENT_DECISION as 'Status'

    , string_agg(cast(S.SERVICE_CD as varchar(max)), ', ') as 'PC Codes'

    , convert(date, DH.HIST_ADDED_DATE, 101) as 'Initial Submission Date'
    , convert(date, DH.LAST_CHANGED_DATE, 101) as 'Effective Date'
    from CTE_SARids as SAR
    inner join MyReporting.dbo.V_MODEL_EPISODES as e
    on SAR.ENC_IDN = e.ENC_IDN

    inner join
    (
    select distinct
    SERVICE_CD
    , ENC_IDN
    , SVC_EXTN_IDN
    from MyReporting.dbo.V_MODEL_EPISODE_SERVICES
    ) as S

    on S.ENC_IDN = e.ENC_IDN
    inner join MyReporting.dbo.V_MODEL_EPISODE_PROVIDERS as P
    on P.ENC_IDN = e.ENC_IDN
    inner join MyReporting.dbo.V_MODEL_AUTH_SERVICE_EXTN_HIST as EH
    on EH.ENC_IDN = S.ENC_IDN
    and EH.SVC_EXTN_IDN = S.SVC_EXTN_IDN
    inner join MyReporting.dbo.V_MODEL_EPISODE_DECISION_HIST as DH
    on DH.ENC_IDN = S.ENC_IDN
    and DH.SVC_EXTN_IDN = S.SVC_EXTN_IDN
    inner join MyReporting.dbo.V_MODEL_EPISODE_COVERAGE as EC
    on EC.ENC_IDN = S.ENC_IDN
    inner join MyReporting.dbo.V_MODEL_MEMBERS as M
    on M.MBR_IDN = e.MBR_IDN
    inner join JivaReporting.dbo.V_MODEL_MBR_COVERAGE as MC
    on MC.MBR_IDN = e.MBR_IDN
    and MC.ID_TYPE_CD = 'alt'
    where e.EPISODE_TYPE in ( 'Behavioral Health Inpatient', 'Behavioral Health Outpatient', 'Inpatient', 'OutPatient' )
    and cast(DH.UPDATED_DATE as date)
    between @startdate and @enddate
    and cast(e.RECEIVED_DATE as date) > '3/31/2023'
    and DH.CURRENT_DECISION <> '-'
    and DH.CURRENT_DECISION in ( 'Approved', 'Denied' )
    and S.SERVICE_CD in ( '97151', '97152', '97153', '97154', '97155', '97156', '97157' )
    group by M.INDIVIDUALIZATION_NUMBER
    , M.MBR_NAME
    , SAR.ENC_IDN
    , DH.HIST_ADDED_DATE
    , EC.CLIENT_NAME
    , DH.LAST_CHANGED_DATE
    , DH.CURRENT_DECISION
    , S.SERVICE_CD
    , SAR.ENC_IDN;






    :

  • Wow, don't be offended, but what a mess! It might be time to think about a divide and conquer approach. Build up your result sets individually and then assemble them at the end to output what you need, as this will be much easier to debug.

    Also, in the CTE, you are doing a SELECT DISTINCT and a GROUP BY, on the same columns. Remove the GROUP BY and you'll get the same results.

     

    • This reply was modified 4 months, 2 weeks ago by  Phil Parkin.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • select SAR.ENC_IDN                                          as 'SAR ID'
    , M.INDIVIDUALIZATION_NUMBER as 'Patient ID'
    , M.MBR_NAME as 'Patient Name'
    , DH.CURRENT_DECISION as 'Status'

    , S.SERVICE_CDS as 'PC Codes' --<<--

    , convert(date, DH.HIST_ADDED_DATE, 101) as 'Initial Submission Date'
    , convert(date, DH.LAST_CHANGED_DATE, 101) as 'Effective Date'
    from ...
    inner join --<<--
    (
    select ENC_IDN,
    string_agg(SERVICE_CD, ', ') AS SERVICE_CDS
    from
    (
    select distinct
    SERVICE_CD
    , ENC_IDN
    from MyReporting.dbo.V_MODEL_EPISODE_SERVICES
    where S.SERVICE_CD in ( '97151', '97152', '97153', '97154', '97155', '97156', '97157' )
    ) as S0
    group by ENC_IDN
    ) as S
    on S.ENC_IDN = e.ENC_IDN
    ...
    /* shouldn't need nor want this
    group by M.INDIVIDUALIZATION_NUMBER --<<--
    , M.MBR_NAME
    , SAR.ENC_IDN
    , DH.HIST_ADDED_DATE
    , EC.CLIENT_NAME
    , DH.LAST_CHANGED_DATE
    , DH.CURRENT_DECISION
    , S.SERVICE_CD
    , SAR.ENC_IDN;
    */

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Thanks Scott! That worked

  • Great.  Glad it helped, and that the abbreviated way I posted it made sense.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply