Returning unique row from multiple in a table variable

  • I am building a process where I have a few places that have multiple rows with the same (non-unique) key. In one area, I can use the field that is different to return a single row, but I have a database structure where for another part of the query, I am getting multiple rows back and the data is useless to determine which row to return.

    In business logic, I have a single invoice number that has possibly several payors being billed. There is a payor ranking I can use to determine which row to return. The other area has no supplemental fields I can use, This business logic is only a shade away from the first example above, but it differs per invoice by who the guarantor is, and the different fields are telephone number, address bits and pieces and the guarantor name. The report I am building does not need a specific row, but I can return only one per invoice.

    So far, I have retrieved the raw data into a table variable, used ROW_NUMBER() OVER PARTITION to return the count of rows, but am over my head at getting just the maximum row from that. I have read examples on the web on both retrieving a single row and on using CTEs. What I have so far is creating a table variable and adding a row number column to hold the ROW_NUMBER OVER PARTITION results. When I try to feed the CTE to the temp table, I run into problems.

    DECLARE @guarantorTABLE

    (

    vst_int_idINTEGER,

    ivo_ext_idCHAR(23),

    row_numINTEGER,

    ent_typ_cdCHAR(1),

    fst_nmCHAR(15),

    mid_nmCHAR(15),

    lst_nmCHAR(25),

    adr_str_1CHAR(35),

    adr_str_2CHAR(35),

    cty_nmCHAR(35),

    cod_dtl_dsCHAR(40),

    zip_cdCHAR(10),

    phn_ara_cdCHAR(4),

    phn_exc_noCHAR(4),

    phn_lcl_noCHAR(7)

    )

    --INSERT INTO @guarantor

    ;WITH CTE AS

    (

    SELECT

    invoice.vst_int_id,

    invoice.ivo_ext_id,

    ROW_NUMBER() OVER

    (

    PARTITION BY

    invoice.ivo_ext_id

    ORDER BY

    invoice.vst_int_id,

    invoice.ivo_ext_id

    )AS row_num,

    TSM015.ent_typ_cd,

    TSM040g.fst_nm,

    TSM040g.mid_nm,

    TSM040g.lst_nm,

    TSM020g.adr_str_1,

    TSM020g.adr_str_2,

    TSM020g.cty_nm,

    TSM180gs.cod_dtl_ds,

    TSM020g.zip_cd,

    TSM060.phn_ara_cd,

    TSM060.phn_exc_no,

    TSM060.phn_lcl_no

    FROM

    @invoiceAS invoice

    LEFT OUTER JOIN

    @visitAS visit-- link to guarantor

    ON

    visit.vst_int_id = invoice.vst_int_id

    LEFT OUTER JOIN

    TPM350_VISIT_GUARANTORAS TPM350-- link to guarantor

    ON

    TPM350.vst_int_id = invoice.vst_int_id

    LEFT OUTER JOIN

    TSM015_ENTITYAS TSM015-- link to guarantor

    ON

    TSM015.ent_int_id = TPM350.guar_int_id

    LEFT OUTER JOIN

    TSM040_PERSON_HDRAS TSM040g-- guarantor name

    ON

    TSM040g.psn_int_id = TPM350.guar_int_idAND

    TSM040g.psn_int_id = visit.psn_int_id

    LEFT OUTER JOIN

    TSM021_ENT_ADRAS TSM021-- link

    ON

    TSM021.pri_fg= 'Y'AND-- Primary address flag

    TSM021.psn_int_id = TSM040g.psn_int_id

    LEFT OUTER JOIN

    TSM060_PHONEAS TSM060-- patient phone

    ON

    TSM060.phn_int_id = TSM040g.psn_int_id

    LEFT OUTER JOIN

    TSM061_ENT_PHNAS TSM061-- guarantor phone

    ON

    TSM061.phn_int_id = TSM060.phn_int_id

    LEFT OUTER JOIN

    TSM021_ENT_ADRAS TSM021g-- link to guarantor address

    ON

    TSM021.pri_fg= 'Y'AND-- original is TSM021.pri_fg

    TSM040g.psn_int_id = TSM021g.psn_int_id

    LEFT OUTER JOIN

    TSM020_ADDRESSAS TSM020g-- guarantor address

    ON

    TSM020g.adr_int_id = TSM021g.adr_int_id

    LEFT OUTER JOIN

    TSM180_MST_COD_DTLAS TSM180gs-- guarantor state code

    ON

    TSM180gs.cod_dtl_int_id = TSM020g.ste_cd

    -- The need here is to have but one row per invoice id removing duplicates caused by multiple payors on the invoice.

    GROUP BY

    invoice.vst_int_id,

    invoice.ivo_ext_id,

    TSM015.ent_typ_cd,

    TSM040g.fst_nm,

    TSM040g.mid_nm,

    TSM040g.lst_nm,

    TSM020g.adr_str_1,

    TSM020g.adr_str_2,

    TSM020g.cty_nm,

    TSM180gs.cod_dtl_ds,

    TSM020g.zip_cd,

    TSM060.phn_ara_cd,

    TSM060.phn_exc_no,

    TSM060.phn_lcl_no

    )

    INSERT INTO @guarantor

    SELECT

    *

    FROM

    CTE

    WHERE

    row_num = MAX(row_num)

    I do not want to filter out the row from the table variable because the SQL that follows this is a huge JOIN on all the pieces of data that runs slowly enough as it is. I just haven't figured out the way to make it work at temp table or table variable level.

  • jim.snyder (2/22/2016)


    I am building a process where I have a few places that have multiple rows with the same (non-unique) key. In one area, I can use the field that is different to return a single row, but I have a database structure where for another part of the query, I am getting multiple rows back and the data is useless to determine which row to return.

    In business logic, I have a single invoice number that has possibly several payors being billed. There is a payor ranking I can use to determine which row to return. The other area has no supplemental fields I can use, This business logic is only a shade away from the first example above, but it differs per invoice by who the guarantor is, and the different fields are telephone number, address bits and pieces and the guarantor name. The report I am building does not need a specific row, but I can return only one per invoice.

    So far, I have retrieved the raw data into a table variable, used ROW_NUMBER() OVER PARTITION to return the count of rows, but am over my head at getting just the maximum row from that. I have read examples on the web on both retrieving a single row and on using CTEs. What I have so far is creating a table variable and adding a row number column to hold the ROW_NUMBER OVER PARTITION results. When I try to feed the CTE to the temp table, I run into problems.

    ...I do not want to filter out the row from the table variable because the SQL that follows this is a huge JOIN on all the pieces of data that runs slowly enough as it is. I just haven't figured out the way to make it work at temp table or table variable level.

    First, welcome to the SSC forums!

    Note the link in my signature line about the best way to get help on this forum. Would it be possible for you to provide:

    1. The DDL and sample data for a few of your tables (you do not need all of them, just the temp variables and perhaps one of the permanent tables required by your query).

    2. Some easily consumable sample data

    3. A query that shows what your desired output would be

    Something like this (note that I can just copy/paste the code below and run it locally to understand what you have and what you are looking to accomplish)...

    USE tempdb

    GO

    -- Here's my DDL

    IF OBJECT_ID('TXXX') IS NOT NULL DROP TABLE TXXX;

    CREATE TABLE TXXX (XXID int primary key, XX varchar(100));

    DECLARE @table1 TABLE (C1 int primary key);

    DECLARE @table2 TABLE (C1 int primary key, X1 int);

    -- Here's some sample values

    INSERT INTO @table1 VALUES (1),(2),(3);

    INSERT INTO @table2 VALUES (1,2),(2,3),(3,3);

    INSERT INTO TXXX VALUES (1,3),(2,2),(3,2);

    -- here's my current query

    SELECT t1.C1, t.XXID

    FROM

    @table1 t1

    JOIN @table2 t2 ON t1.C1 = t2.X1

    JOIN TXXX t ON t2.X1 = t.XX

    -- What I'm looking for is something like this (only the C1 value with the highest XXID, no duplicate combinations of C1 and XXID)

    SELECT 2 AS C1 ,3 AS XXID UNION ALL SELECT 3,1;

    Edit: fixed IFCode

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Thanks, Alan. This is hospital data and I am bound by HIPAA and PMI to not give account numbers or personal information out. I understand (went to the link in your sig) what you are talking about, but short of describing the problem more, theoretical is the best I can do.

  • jim.snyder (2/22/2016)


    Thanks, Alan. This is hospital data and I am bound by HIPAA and PMI to not give account numbers or personal information out. I understand (went to the link in your sig) what you are talking about, but short of describing the problem more, theoretical is the best I can do.

    I worked for HCSC for awhile and know more than I care to about HIPAA and PMI. By sample data I mean completely fictitious.. instead of a real account number you can use stuff like 'XXXABC123' and stuff like "john doe1" for names. like 3-5 rows of data for each table involved would be enough.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Change: WHERE row_num = MAX(row_num)

    To: WHERE row_num = 1

    This is a quick guess based on what you've posted, without any testing. But I think MAX (row_num) will be returning the number of guarantors on the invoice with the largest number of guarantors, and every invoice won't have that many so won't have a matching row number. Every invoice should have a row number 1, however.

  • Much appreciated. Monday and the coffee had not hit yet. I remember figuring that out last week and don't know for the life of me why it was so difficult today.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply