Unexpected behaviour of query

  • Hi All,

    I am using below query in sql server to find out top 5 amount and date by customer.

    I have used union for the same. But it gives different counts when i ran this query again and again. Sometimes it shows count as 69071 and sometimes it 69072 with same query.

    Can you please help me ?

    Query:

    SELECT COUNT(*)

    FROM EMEA_SERVICEMAX.Servicemax.SERVICE_CONTRACTS_STG c

    where (CONTRACT_STATUS_DESCRIPTION !='Active')

    and erp_id not in ( select

    a.ERP_ID

    from

    (select * ,ROW_NUMBER() over (Partition BY customer_number order by ANNUAL_CONTRACT_PRICE desc) as Rank

    from EMEA_SERVICEMAX.Servicemax.SERVICE_CONTRACTS_STG where CONTRACT_START_DATE >= GETDATE()-730

    and CONTRACT_STATUS_DESCRIPTION not in ('Free','Canceled','Cancelled')

    ) a

    where a.rank <=5

    union all

    select

    b.ERP_ID

    from

    (select *,ROW_NUMBER() over (Partition BY customer_number order by CONTRACT_START_DATE desc) as Rank

    from EMEA_SERVICEMAX.Servicemax.SERVICE_CONTRACTS_STG where CONTRACT_START_DATE >= GETDATE()-730

    and CONTRACT_STATUS_DESCRIPTION not in ('Free','Canceled','Cancelled')

    ) b

    where b.rank <=5

    )

  • Without sample data it's not easy to tell, however, you're using ROW_NUMBER which can be non-deterministic and can give inconsistent results.

    Try using RANK or DENSE_RANK instead of ROW_NUMBER

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • I agree that the problem's going to be connected to your row_number() and the fact that the field used in the Order by section of the OVER clause i.e. ANNUAL_CONTRACT_PRICE and/or CONTRACT_START_DATE are not unique for a customer.

    If you can add the other elements of the primary key into the 2 Order By's, then you should be OK. (I think!)

  • I was thinking it's the GETDATE()-730 part. That's non-deterministic and might explain the different results as it's part of your where clause.

  • My primary key for table is erp_id and service_contract_number. I have added the erp_id in order by clause. but my top records shows wrong.

    e.g. Top records by customer with annual_contract_price.

    select

    ERP_ID ,CUSTOMER_NUMBER,ANNUAL_CONTRACT_PRICE,Rank

    from

    (select csa.* ,ROW_NUMBER() over (Partition BY csa.customer_number order by csa.ANNUAL_CONTRACT_PRICE desc) as Rank

    from EMEA_SERVICEMAX.Servicemax.SERVICE_CONTRACTS_STG csa where csa.CONTRACT_START_DATE >= GETDATE()-730

    and csa.CONTRACT_STATUS_DESCRIPTION not in ('Free','Canceled','Cancelled')) a

    where a.rank <=5 and a.CUSTOMER_NUMBER='110000053'

    Result:

    ERP_IDCUSTOMER_NUMBERANNUAL_CONTRACT_PRICERank

    C10003890-axq2011-1781100000531963.941

    C10003778-axq2011-178110000053720.132

    C10003775-axq2011-178110000053639.393

    C10003886-axq2011-178110000053591.874

    C10003847-axq2011-178110000053579.685

    By Adding erp_id in order by clause:

    result is defferent.

    select

    ERP_ID ,CUSTOMER_NUMBER,ANNUAL_CONTRACT_PRICE,Rank

    from

    (select csa.* ,ROW_NUMBER() over (Partition BY csa.customer_number order by

    csa.erp_id,csa.ANNUAL_CONTRACT_PRICE desc) as Rank

    from EMEA_SERVICEMAX.Servicemax.SERVICE_CONTRACTS_STG csa where csa.CONTRACT_START_DATE >= GETDATE()-730

    and csa.CONTRACT_STATUS_DESCRIPTION not in ('Free','Canceled','Cancelled')) a

    where a.rank <=5 and a.CUSTOMER_NUMBER='110000053'

    Result :

    ERP_IDCUSTOMER_NUMBERANNUAL_CONTRACT_PRICERank

    C10003731-axq2011-178110000053348 1

    C10003752-axq2011-178110000053559.842

    C10003753-axq2011-178110000053536.133

    C10003754-axq2011-178110000053545.164

    C10003755-axq2011-178110000053457.455

    Plz help me?

  • Try changing

    select csa.* ,ROW_NUMBER() over (Partition BY csa.customer_number order by

    csa.erp_id,csa.ANNUAL_CONTRACT_PRICE desc) as Rank

    to

    select csa.* ,ROW_NUMBER() over (Partition BY csa.customer_number order by

    csa.ANNUAL_CONTRACT_PRICE, csa.erp_id desc) as Rank

    So that if you have a draw on the ANNUAL_CONTRACT_PRICE, then the erp_id will ensure consistency in the ordering.

  • -- This looks correct:

    SELECT

    ERP_ID,

    customer_number,

    ANNUAL_CONTRACT_PRICE,

    [Rank]

    FROM (

    SELECT

    ERP_ID,

    customer_number,

    ANNUAL_CONTRACT_PRICE,

    ROW_NUMBER() OVER (PARTITION BY customer_number ORDER BY ANNUAL_CONTRACT_PRICE DESC) AS [Rank]

    FROM EMEA_SERVICEMAX.Servicemax.SERVICE_CONTRACTS_STG

    WHERE CONTRACT_START_DATE >= GETDATE()-730

    AND CONTRACT_STATUS_DESCRIPTION NOT IN ('Free','Canceled','Cancelled')

    ) a

    WHERE a.[rank] <=5

    AND a.CUSTOMER_NUMBER = '110000053'

    -- Test it:

    SELECT TOP 5 *

    FROM EMEA_SERVICEMAX.Servicemax.SERVICE_CONTRACTS_STG

    WHERE CONTRACT_START_DATE >= GETDATE()-730

    AND CONTRACT_STATUS_DESCRIPTION NOT IN ('Free','Canceled','Cancelled')

    AND a.CUSTOMER_NUMBER = '110000053'

    ORDER BY ANNUAL_CONTRACT_PRICE DESC

    -- Suggestions:

    -- Use a variable for the cutoff date and remove the time component

    SELECT @CutoffDate = DATEADD(day,DATEDIFF(day,0,GETDATE()-730),0)

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply