Distinct values from 2 columns

  • I want to get the distinct values from this table as row no 3 and 10 are the same .I just want to display one line

  • mnr123 - Sunday, July 16, 2017 1:14 PM

    I want to get the distinct values from this table as row no 3 and 10 are the same .I just want to display one line

    You'd use the DISTINCT keyword in your query.

    SELECT DISTINCT source_database_id, target_database_id
      FROM dbo.YourTable;

  • not sure if I am on the right track ..

    DECLARE @yourtable TABLE
    (source_database_id VARCHAR(20),
    target_database_id VARCHAR(50)
    );
    INSERT INTO @yourtable
    VALUES
    ('WDICENTRAL', 'WDIFINAL'),('WDIFINAL', 'WDICENTRAL'),
    ('GDFADVWKG', 'WDIFINAL'),('SOCQNT', 'WDICENTRAL')
    ;

    SELECT DISTINCT source_database_id, target_database_id
    FROM @YourTable;

    WITH cte
      AS (SELECT source_database_id,
          target_database_id,
          rn = ROW_NUMBER() OVER(PARTITION BY CAST(CHECKSUM(source_database_id) AS BIGINT) + CAST(CHECKSUM(target_database_id) AS BIGINT) ORDER BY source_database_id)
       FROM @YourTable)

    SELECT source_database_id,
       target_database_id
    FROM cte
    WHERE rn = 1;

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • @alan.B

    Yeah, that's how I understood the requirement as well.

    Another option:

    SELECT DISTINCT
    source_database_id=CASE WHEN source_database_id<target_database_id THEN source_database_id ELSE target_database_id END
    ,target_database_id=CASE WHEN source_database_id<target_database_id THEN target_database_id ELSE source_database_id END
    FROM @yourtable;

    The trick with this one is that for pairs of servers that have only one row, the returned values might not match the original (i.e., source and target might be swapped).

    I only bothered mentioning it because it's a smidge faster, and from the stated requirements it seems that which particular server is the source and which is the target doesn't matter much, since we're eliminating an arbitrary row when a row exists for both source/target combinations.

    Cheers!

  • Just to mention that at some point, preserving source data may have value, here's a way to keep it intact, while still accomplishing the same goal.  Note the differences and the fact that for larger datasets, this may not be ideal from a performance perspective:
    DECLARE @yourtable AS TABLE (
        row_id INT IDENTITY(1,1),
        source_database_id VARCHAR(20),
        target_database_id VARCHAR(50)
    );
    INSERT INTO @yourtable (source_database_id, target_database_id)
        VALUES    ('GDFADVWKG', 'WDIFINAL'),
                ('SOCQNT', 'WDICENTRAL'),
                ('WDICENTRAL', 'WDIFINAL'),
                ('WDIFINAL', 'DataVisual'),
                ('WDIFINAL', 'EDSTATS'),
                ('WDIFINAL', 'GFDDP'),
                ('WDIFINAL', 'IDA14RMS'),
                ('WDIFINAL', 'ISDB_2009'),
                ('WDIFINAL', 'MDG'),
                ('WDIFINAL', 'WDICENTRAL'),
                ('WDIWKG', 'WDIFINAL');

    WITH SORTED_LIST AS (

        SELECT row_id,
            CASE
                WHEN source_database_id < target_database_id THEN source_database_id
                ELSE target_database_id
            END AS database_id_1,
            CASE
                WHEN source_database_id < target_database_id THEN 'SOURCE'
                ELSE 'TARGET'
            END AS database_type_1,
            CASE
                WHEN target_database_id > source_database_id THEN target_database_id
                ELSE source_database_id
            END AS database_id_2,
            CASE
                WHEN target_database_id > source_database_id THEN 'TARGET'
                ELSE 'SOURCE'
            END AS database_type_2
        FROM @yourtable
    ),
        DISTINCT_LIST AS (

            SELECT SL.database_id_1, SL.database_id_2, MIN(row_id) AS MIN_ROW, MAX(row_id) AS MAX_ROW
            FROM SORTED_LIST AS SL
            GROUP BY SL.database_id_1, SL.database_id_2
    ),
        FINAL_LIST AS (

            SELECT    DL.database_id_1, SL.database_type_1,
                    DL.database_id_2, SL.database_type_2,
                    SL.row_id
            FROM DISTINCT_LIST AS DL
                INNER JOIN SORTED_LIST AS SL
                    ON DL.MIN_ROW = SL.row_id
    )
    SELECT    CASE FL.database_type_1 WHEN 'SOURCE' THEN database_id_1 ELSE database_id_2 END AS source_database_id,
            CASE FL.database_type_1 WHEN 'TARGET' THEN database_id_1 ELSE database_id_2 END AS target_database_id,
            FL.row_id
    FROM FINAL_LIST AS FL
    ORDER BY FL.row_id;

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • I like Jacob Wilkins's solution. And I'd go a step further and suggest you consider enforcing a rule that ColumnA <= ColumnB in your source table too if possible--if you have control over what goes into it. That will make things like this a lot easier in future.


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply