T-SQL find unique records

  • Dear friends,

    I have another problem to discuss with you.

    Find unique rows from below logistics table, assumption: hyd to bang and bang to hyd as one record etc.

    From To Distance

    hyd banglore 550

    hyd vizag 550

    banglorehyd 550

    hyd pune 650

    I have implemented a cursor solution but the interviewer is looking for single query using JOINs or other concept.

    Please do the needful.

  • If this is an interview, then 'the needful' means "admit you don't know".

    Eddie Wuerch
    MCM: SQL

  • How would you solve this on paper? If you explain what you'd do, or make an attempt, we are happy to help. However, as Eddie noted, we aren't looking to do work for you.

  • Hi all,

    Yes, its an interview question.

    As already mentioned in my post, I have implemented the solution using Cursors but the interviewer looking for solution using single SQL query which I don't know.

    Please help.

  • Eddie Wuerch (6/11/2016)


    If this is an interview, then 'the needful' means "admit you don't know".

    It's a curious expression isn't it. Never used in US or UK English, it seems to be specific for parts of India.

    durga.palepu (6/13/2016)


    Hi all,

    Yes, its an interview question.

    As already mentioned in my post, I have implemented the solution using Cursors but the interviewer looking for solution using single SQL query which I don't know.

    Please help.

    There are numerous ways of performing this in a set-based manner. I can think of four while reading the question. Most of them depend on conditionally swapping the [from] and [to] columns so that [from] is less than [to] (or vice versa) thus allowing the use of either ROW_NUMBER or an aggregate. I think you could even incorporate the conditional swap into ROW_NUMBER.

    Here’s something to get you started:

    ROP TABLE #Logistics

    ;WITH Logistics ([From], [To], Distance) AS (

    SELECT 'hyd', 'banglore', 550 UNION ALL

    SELECT 'hyd', 'vizag', 550 UNION ALL

    SELECT 'banglore', 'hyd', 550 UNION ALL

    SELECT 'hyd', 'pune', 650)

    SELECT * INTO #Logistics FROM Logistics

    SELECT

    [From] = CASE WHEN [From] < [To] THEN [From] ELSE [To] END,

    [To] = CASE WHEN [To] > [From] THEN [To] ELSE [From] END,

    Distance

    FROM #Logistics

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks for the quick response!!

    However this solution is not serving the purpose.

    meanwhile I have tried and my solution is as followed, the problem with solution is unique record(s) from duplicate records are missing.

    select l1.c_From, l1.c_To, l1.c_Distance

    from #logistics l1

    where not exists

    ( select 1

    from #logistics l2 where l2.c_From = l1.c_To and l2.c_To = l1.c_From);

    Result:

    c_Fromc_To c_Distance

    hyd vizag 550

    hyd pune 650

  • durga.palepu (6/13/2016)


    Thanks for the quick response!!

    However this solution is not serving the purpose.

    meanwhile I have tried and my solution is as followed, the problem with solution is unique record(s) from duplicate records are missing.

    select l1.c_From, l1.c_To, l1.c_Distance

    from #logistics l1

    where not exists

    ( select 1

    from #logistics l2 where l2.c_From = l1.c_To and l2.c_To = l1.c_From);

    Result:

    c_Fromc_To c_Distance

    hyd vizag 550

    hyd pune 650

    The code I posted isn't a full solution: "Here’s something to get you started". Converting it into a solution is a trivial matter. Your proposed solution completely ignores all of the hints I gave you. All of the information you need to create a working solution is there. Have another look at my code and the output and imagine what effects you might observe if you were to incorporate either ROW_NUMBER or GROUP BY into the query.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks, that's awesome solution.

    Your approach to provide hints so that I would work on actual solution is excellent so that I could learn my self.

    Here follow's my solution implementations:

    using GROUP BY:

    SELECT

    CASE WHEN c_From < c_To then c_From else c_To end as c_from

    ,CASE WHEN c_to > c_From then c_to else c_from end as c_to

    ,c_Distance

    FROM #logistics

    GROUP BY CASE WHEN c_From < c_to then c_from else c_to end

    , CASE WHEN c_to > c_from then c_to else c_from end

    , c_distance;

    using ROW_NUMBER():

    with ctelogistics

    as

    (

    select c_from,c_to,c_Distance, ROW_NUMBER() over(partition by CASE WHEN c_From < c_To then c_From else c_To end

    ,CASE WHEN c_to > c_From then c_to else c_from end order by c_from) as rn

    from #logistics

    )

    select c_From,c_to,c_Distance

    from ctelogistics

    where rn=1;

  • That's the ticket! Nice job.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thank you!!

  • durga.palepu (6/13/2016)


    Thanks, that's awesome solution.

    Your approach to provide hints so that I would work on actual solution is excellent so that I could learn my self.

    Here follow's my solution implementations:

    using GROUP BY:

    SELECT

    CASE WHEN c_From < c_To then c_From else c_To end as c_from

    ,CASE WHEN c_to > c_From then c_to else c_from end as c_to

    ,c_Distance

    FROM #logistics

    GROUP BY CASE WHEN c_From < c_to then c_from else c_to end

    , CASE WHEN c_to > c_from then c_to else c_from end

    , c_distance;

    using ROW_NUMBER():

    with ctelogistics

    as

    (

    select c_from,c_to,c_Distance, ROW_NUMBER() over(partition by CASE WHEN c_From < c_To then c_From else c_To end

    ,CASE WHEN c_to > c_From then c_to else c_from end order by c_from) as rn

    from #logistics

    )

    select c_From,c_to,c_Distance

    from ctelogistics

    where rn=1;

    Another way:

    SELECT [From], [To], Distance

    FROM

    (

    SELECT

    [From] = CASE WHEN [From] < [To] THEN [From] ELSE [To] END,

    [To] = CASE WHEN [To] > [From] THEN [To] ELSE [From] END,

    Distance

    FROM #Logistics

    ) x

    GROUP BY [From], [To], Distance;

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply