Slow query.

  • My query

    select *

    from

    service_type_dim

    where

    service_type_code not in

    (

    select distinct service_type_code

    from service_type_dim

    inner join revenue_fact

    on

    service_type_dim.service_type_key = revenue_fact.service_type_key

    )

    It has been running 34 min and it is still running.

    Schema:

    --------

    CREATE TABLE [dbo].[REVENUE_FACT](

    [TIME_KEY] [tinyint] NOT NULL,

    [CUST_KEY] [int] NOT NULL,

    [CUR_ACCT_KEY] [int] NOT NULL,

    [CURRENCY_KEY] [smallint] NOT NULL,

    [SERVICE_TYPE_KEY] [smallint] NOT NULL,

    [BRANCH_KEY] [smallint] NOT NULL,

    [AUDIT_KEY] [tinyint] NOT NULL,

    [CUST_AGRMNT_KEY] [int] NOT NULL,

    [TRN_METRIC] [numeric](11, 2) NOT NULL,

    [RATE_USED] [numeric](8, 3) NOT NULL,

    [BILLED_REVENUE_ORGNL_CRNCY_AMT] [money] NOT NULL,

    [BILLED_REVENUE_AMT] [money] NOT NULL,

    [LAST_UPD_TS] [datetime] NOT NULL,

    [MP_TOT_SALE_AMT] [money] NULL,

    [MP_GROSS_REVENUE_AMT] [money] NULL,

    [MP_INTRCHG_PAID_AMT] [money] NULL,

    [MP_INTRCHG_RECD_AMT] [money] NULL,

    [STD_FEE_AMT_CALCULATION] [money] NULL,

    [FULL_COST_AMT] [money] NULL,

    [SUB_CUST_kEY] [int] NULL,

    [DIRECT_COST_AMT] [money] NULL,

    [PROFIT_ABOVE_TARGET] [money] NULL,

    [PROFIT_ABOVE_FULL_COST] [money] NULL,

    [PROFIT_ABOVE_DIRECT_COST] [money] NULL,

    [EMP_KEY] [smallint] NULL,

    [PRODUCT_REV_AMT] [money] NULL,

    [PRODUCT_REV_ORGNL_CRNCY_AMT] [money] NULL,

    [STD_FEE_AMT_CAD] [money] NULL

    ) ON [PROD_REVENUE_DATA]

    CREATE TABLE [dbo].[SERVICE_TYPE_DIM](

    [SERVICE_TYPE_KEY] [dbo].[SURROGATE_KEY] IDENTITY(50,1) NOT NULL,

    [SERVICE_TYPE_CODE] [smallint] NOT NULL,

    [SERVICE_TYPE_NAME] [char](60) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [SERVICE_NAME] [char](60) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [PRODUCT_CODE] [char](3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [PRODUCT_NAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [SERVICE_TYPE_REC_EFF_DATE] [datetime] NOT NULL,

    [SERVICE_TYPE_REC_END_DATE] [datetime] NOT NULL,

    [LAST_UPD_TS] [datetime] NOT NULL,

    [SERVICE_LAUNCH_DATE] [datetime] NULL,

    [SERVICE_TERM_DATE] [datetime] NULL,

    [PRODUCT_LAUNCH_DATE] [datetime] NULL,

    [PRODUCT_TERM_DATE] [datetime] NULL,

    [FEE_TYPE_NAME] [varchar](30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [FEE_UOM_DESC] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [FEE_CURRENCY_CODE] [varchar](4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [STD_FEE_AMT] [numeric](11, 6) NOT NULL CONSTRAINT [DF_SERVICE_TYPE_DIM_STD_FEE_AMT] DEFAULT (0),

    [DIRECT_COST_AMT] [numeric](11, 6) NOT NULL CONSTRAINT [DF_SERVICE_TYPE_DIM_DIRECT_COST_AMT] DEFAULT (0),

    [INDIRECT_COST_AMT] [numeric](11, 6) NOT NULL CONSTRAINT [DF_SERVICE_TYPE_DIM_INDIRECT_COST_AMT] DEFAULT (0),

    [COST_UOM_DESC] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [COST_CURRENCY_CODE] [varchar](4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [FULL_COST_AMT] [numeric](11, 6) NOT NULL CONSTRAINT [DF_SERVICE_TYPE_DIM_FULL_COST_AMT] DEFAULT (0),

    [CORE_OR_NONCORE] [char](2) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [KPI_flag] [char](1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    CONSTRAINT [PK__SERVICE_TYPE_DIM__195694DD] PRIMARY KEY CLUSTERED

    (

    [SERVICE_TYPE_KEY] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 99) ON [PROD_DIMENSION_DATA]

    ) ON [PROD_DIMENSION_DATA]

    REVENUE_FACT has no PK,indexes.

    It is 59 million records.

    SERVICE_TYPE_DIM has 14,000 records

    I need to find which SERVICE_TYPE_CODE's

    do not exist in REVENUE_FACT

  • It would perform better if you correlated the sub query or used a left join.

    SELECT *

    FROM dbo.service_type_dim std

    WHERE std.service_type_code not in

    (

    SELECT DISTINCT std2.service_type_code

    FROM service_type_dim std2

    INNER JOIN revenue_fact rf

    ON std2.service_type_key = rf.service_type_key

    AND std.service_type_code = std2.service_type_code

    )

    OR

    SELECT std.*

    FROM dbo.service_type_dim std

    LEFT JOIN revenue_fact rf

    ON std2.service_type_key = rf.service_type_key

    WHERE rf.service_type_key IS NULL

    Gary Johnson
    Sr Database Engineer

  • Guys,

    I found what the problem was.

    The query was running too long because

    a heavy insert was being done in REVENUE_FACT at that moment from some SSIS package.

    I switched to Production server and this query gave me the right results

    in 32 sec:

    select

    distinct

    PRODUCT_CODE,

    SERVICE_NAME,

    SERVICE_TYPE_CODE,

    SERVICE_TYPE_NAME,

    FEE_CURRENCY_CODE,

    STD_FEE_AMT,

    FEE_UOM_DESC,

    COST_CURRENCY_CODE,

    DIRECT_COST_AMT,

    INDIRECT_COST_AMT as Overhead,

    FULL_COST_AMT,

    SERVICE_TYPE_REC_END_DATE

    from

    service_type_dim

    where

    service_type_code not in

    (

    select distinct service_type_code

    from service_type_dim

    inner join revenue_fact

    on

    service_type_dim.service_type_key = revenue_fact.service_type_key

    )

    and

    SERVICE_TYPE_REC_END_DATE = '9999-12-31'

    order by

    service_type_code

  • riga1966 (9/11/2008)


    Guys,

    I found what the problem was.

    The query was running too long because

    a heavy insert was being done in REVENUE_FACT at that moment from some SSIS package.

    I switched to Production server and this query gave me the right results

    in 32 sec:

    select

    distinct

    PRODUCT_CODE,

    SERVICE_NAME,

    SERVICE_TYPE_CODE,

    SERVICE_TYPE_NAME,

    FEE_CURRENCY_CODE,

    STD_FEE_AMT,

    FEE_UOM_DESC,

    COST_CURRENCY_CODE,

    DIRECT_COST_AMT,

    INDIRECT_COST_AMT as Overhead,

    FULL_COST_AMT,

    SERVICE_TYPE_REC_END_DATE

    from

    service_type_dim

    where

    service_type_code not in

    (

    select distinct service_type_code

    from service_type_dim

    inner join revenue_fact

    on

    service_type_dim.service_type_key = revenue_fact.service_type_key

    )

    and

    SERVICE_TYPE_REC_END_DATE = '9999-12-31'

    order by

    service_type_code

    Not sure if it matters much, but you can drop the distinct from the subquery in the NOT IN clause.

    😎

  • riga1966 (9/10/2008)


    My query

    select *

    from

    service_type_dim

    where

    service_type_code not in

    (

    select distinct service_type_code

    from service_type_dim

    inner join revenue_fact

    on

    service_type_dim.service_type_key = revenue_fact.service_type_key

    )

    In many cases a "not exists" outperforms your in-list.

    It's certainly worth the test !

    select *

    from service_type_dim A

    where not exists

    ( select *

    from service_type_dim D

    inner join revenue_fact F

    on D.service_type_key = F.service_type_key

    and D.service_type_code = A.service_type_code

    )

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Looks like ALZDBA version

    is the best when you compare the speed:

    Total execution time250

    Wait time on server replies0

    My version:

    Total execution time 343

    Wait time on server replies328

    By the way, what is the most accurate way to compare performance?

    I'm using Client Statistics and focus on

    Total execution time, Wait time on server replies

    but not sure it;s a good method.

    Is "Wait time on server replies" shows you a pure server time execution?

    I mean I want to ignore Network and stuff. I just want to see how much time it took for a query to execute on the server.

  • I always use

    set statistics io on

    set statistics time on

    to get a grip rearding what SQLServer has to do for my query.

    That in combination with the graphical execution plan opens a whole new world.

    (and its XML content)

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • ALZDBA,

    I think when you choose "Include Client Statistics"

    in QA it does the same thing as

    set statistics io on

    set statistics time on

    But the important question is:

    How to read the results?

    What is the property that shows the exact execution time on the server?

    Excluding Network/Client overhead?

  • Euhm.... no. It does not produce the same result as show client stats.

    Test it and you'll see it reports on object base (stats io).

    You can then compare that to what you supposed it was going to do and how you supposed it would handle it.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply