usage of DISTINCT OPERATOR

  • Hi All,

    Is there a better way to rewrite below code without using DISTINCT operator. Other thing, is that I don't see any duplicates inside the table. But not sure in future if we get duplicate combination. that could be reason why some developer has written such type of code. Can we re-write this code in a better way??

    select distinct top (1)
         [x].[DateDrawn]
         ,[x].[EquipmentLife]
         ,[x].[LubricantLife]
         ,[x].[DateRegistered]
         ,[x].[SampleID]
         FROM [LubeAnalyst].[Sample] AS [x]
       WHERE [x].[SampleStatusID] = 1

    Thanks, 

    Sam

  • vsamantha35 - Thursday, December 21, 2017 1:37 PM

    Hi All,

    Is there a better way to rewrite below code without using DISTINCT operator. Other thing, is that I don't see any duplicates inside the table. But not sure in future if we get duplicate combination. that could be reason why some developer has written such type of code. Can we re-write this code in a better way??

    select distinct top (1)
         [x].[DateDrawn]
         ,[x].[EquipmentLife]
         ,[x].[LubricantLife]
         ,[x].[DateRegistered]
         ,[x].[SampleID]
         FROM [LubeAnalyst].[Sample] AS [x]
       WHERE [x].[SampleStatusID] = 1

    Thanks, 

    Sam

    Since the 1 record from the TOP(1) is necessarily unique, the DISTINCT keyword is superfluous and is ignored.

    Also note that this query is non-deterministic, because no sort order has been specified.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen - Thursday, December 21, 2017 1:49 PM

    vsamantha35 - Thursday, December 21, 2017 1:37 PM

    Hi All,

    Is there a better way to rewrite below code without using DISTINCT operator. Other thing, is that I don't see any duplicates inside the table. But not sure in future if we get duplicate combination. that could be reason why some developer has written such type of code. Can we re-write this code in a better way??

    select distinct top (1)
         [x].[DateDrawn]
         ,[x].[EquipmentLife]
         ,[x].[LubricantLife]
         ,[x].[DateRegistered]
         ,[x].[SampleID]
         FROM [LubeAnalyst].[Sample] AS [x]
       WHERE [x].[SampleStatusID] = 1

    Thanks, 

    Sam

    Since the 1 record from the TOP(1) is necessarily unique, the DISTINCT keyword is superfluous and is ignored.

    Also note that this query is non-deterministic, because no sort order has been specified.

    Drew

    You are right drew... my bad.. the sql stmt is as follows :

    SELECT DISTINCT TOP (1) [x].[DateDrawn]
            ,[x].[EquipmentLife]
            ,[x].[LubricantLife]
            ,[x].[DateRegistered]
            ,[x].[SampleID]
        FROM [LubeAnalyst].[Sample] AS [x]
        WHERE ([x].[SampleStatusID] = 1)
            --AND ([x].[ComponentID] = @__componentId_0)
        ORDER BY [x].[DateDrawn] DESC
            ,[x].[EquipmentLife] DESC
            ,[x].[LubricantLife] DESC
            ,[x].[DateRegistered] DESC
            ,[x].[SampleID]

  • Are any of those columns the primary key for the table?
    I just noticed the table is Sample and you have a SampleID. If SampleID is the PK you don't need distinct.

    Sue

  • The distinct is still completely unnecessary, as there is no way whatsoever that a TOP(1) could return duplicate rows.

    That said, removing the DISTINCT will not change the query's performance, as distinct will be getting done as part of the sort.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • This is almost the exact same query from another post, it just has a DISTINCT TOP (1) added to it.

  • Thank you all for the suggestions. SampleID is the PK.

  • From the code alone posted so far, it's a bit difficult to make out what you're actually trying to do.  From the description and a couple of hints in the code, it sounds like you're simply trying to get the lastest row that meets a condition for each SampleID.  If that's true, something like the following should do the trick for you.  The key is the RowNum column that is created in the CTE.


       WITH cteEnumerate AS
    (
     SELECT  RowNum = ROW_NUMBER() OVER (PARTITION BY SampleID, ORDER BY DateDrawn DESC)
            ,DateDrawn
            ,EquipmentLife
            ,LubricantLife
            ,DateRegistered
            ,SampleID
       FROM LubeAnalyst.Sample
      WHERE SampleStatusID = 1
     --AND (x.ComponentID = @__componentId_0)
    )
     SELECT  DateDrawn
            ,EquipmentLife
            ,LubricantLife
            ,DateRegistered
            ,SampleID
       FROM cteEnumerate
      WHERE RowNum = 1
    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply