September 10, 2008 at 1:56 pm
My query
select *
from
service_type_dim
where
service_type_code not in
(
select distinct service_type_code
from service_type_dim
inner join revenue_fact
on
service_type_dim.service_type_key = revenue_fact.service_type_key
)
It has been running 34 min and it is still running.
Schema:
--------
CREATE TABLE [dbo].[REVENUE_FACT](
[TIME_KEY] [tinyint] NOT NULL,
[CUST_KEY] [int] NOT NULL,
[CUR_ACCT_KEY] [int] NOT NULL,
[CURRENCY_KEY] [smallint] NOT NULL,
[SERVICE_TYPE_KEY] [smallint] NOT NULL,
[BRANCH_KEY] [smallint] NOT NULL,
[AUDIT_KEY] [tinyint] NOT NULL,
[CUST_AGRMNT_KEY] [int] NOT NULL,
[TRN_METRIC] [numeric](11, 2) NOT NULL,
[RATE_USED] [numeric](8, 3) NOT NULL,
[BILLED_REVENUE_ORGNL_CRNCY_AMT] [money] NOT NULL,
[BILLED_REVENUE_AMT] [money] NOT NULL,
[LAST_UPD_TS] [datetime] NOT NULL,
[MP_TOT_SALE_AMT] [money] NULL,
[MP_GROSS_REVENUE_AMT] [money] NULL,
[MP_INTRCHG_PAID_AMT] [money] NULL,
[MP_INTRCHG_RECD_AMT] [money] NULL,
[STD_FEE_AMT_CALCULATION] [money] NULL,
[FULL_COST_AMT] [money] NULL,
[SUB_CUST_kEY] [int] NULL,
[DIRECT_COST_AMT] [money] NULL,
[PROFIT_ABOVE_TARGET] [money] NULL,
[PROFIT_ABOVE_FULL_COST] [money] NULL,
[PROFIT_ABOVE_DIRECT_COST] [money] NULL,
[EMP_KEY] [smallint] NULL,
[PRODUCT_REV_AMT] [money] NULL,
[PRODUCT_REV_ORGNL_CRNCY_AMT] [money] NULL,
[STD_FEE_AMT_CAD] [money] NULL
) ON [PROD_REVENUE_DATA]
CREATE TABLE [dbo].[SERVICE_TYPE_DIM](
[SERVICE_TYPE_KEY] [dbo].[SURROGATE_KEY] IDENTITY(50,1) NOT NULL,
[SERVICE_TYPE_CODE] [smallint] NOT NULL,
[SERVICE_TYPE_NAME] [char](60) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[SERVICE_NAME] [char](60) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[PRODUCT_CODE] [char](3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[PRODUCT_NAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[SERVICE_TYPE_REC_EFF_DATE] [datetime] NOT NULL,
[SERVICE_TYPE_REC_END_DATE] [datetime] NOT NULL,
[LAST_UPD_TS] [datetime] NOT NULL,
[SERVICE_LAUNCH_DATE] [datetime] NULL,
[SERVICE_TERM_DATE] [datetime] NULL,
[PRODUCT_LAUNCH_DATE] [datetime] NULL,
[PRODUCT_TERM_DATE] [datetime] NULL,
[FEE_TYPE_NAME] [varchar](30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[FEE_UOM_DESC] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[FEE_CURRENCY_CODE] [varchar](4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[STD_FEE_AMT] [numeric](11, 6) NOT NULL CONSTRAINT [DF_SERVICE_TYPE_DIM_STD_FEE_AMT] DEFAULT (0),
[DIRECT_COST_AMT] [numeric](11, 6) NOT NULL CONSTRAINT [DF_SERVICE_TYPE_DIM_DIRECT_COST_AMT] DEFAULT (0),
[INDIRECT_COST_AMT] [numeric](11, 6) NOT NULL CONSTRAINT [DF_SERVICE_TYPE_DIM_INDIRECT_COST_AMT] DEFAULT (0),
[COST_UOM_DESC] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[COST_CURRENCY_CODE] [varchar](4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[FULL_COST_AMT] [numeric](11, 6) NOT NULL CONSTRAINT [DF_SERVICE_TYPE_DIM_FULL_COST_AMT] DEFAULT (0),
[CORE_OR_NONCORE] [char](2) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[KPI_flag] [char](1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
CONSTRAINT [PK__SERVICE_TYPE_DIM__195694DD] PRIMARY KEY CLUSTERED
(
[SERVICE_TYPE_KEY] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 99) ON [PROD_DIMENSION_DATA]
) ON [PROD_DIMENSION_DATA]
REVENUE_FACT has no PK,indexes.
It is 59 million records.
SERVICE_TYPE_DIM has 14,000 records
I need to find which SERVICE_TYPE_CODE's
do not exist in REVENUE_FACT
September 10, 2008 at 2:48 pm
It would perform better if you correlated the sub query or used a left join.
SELECT *
FROM dbo.service_type_dim std
WHERE std.service_type_code not in
(
SELECT DISTINCT std2.service_type_code
FROM service_type_dim std2
INNER JOIN revenue_fact rf
ON std2.service_type_key = rf.service_type_key
AND std.service_type_code = std2.service_type_code
)
OR
SELECT std.*
FROM dbo.service_type_dim std
LEFT JOIN revenue_fact rf
ON std2.service_type_key = rf.service_type_key
WHERE rf.service_type_key IS NULL
Gary Johnson
Sr Database Engineer
September 11, 2008 at 9:54 am
Guys,
I found what the problem was.
The query was running too long because
a heavy insert was being done in REVENUE_FACT at that moment from some SSIS package.
I switched to Production server and this query gave me the right results
in 32 sec:
select
distinct
PRODUCT_CODE,
SERVICE_NAME,
SERVICE_TYPE_CODE,
SERVICE_TYPE_NAME,
FEE_CURRENCY_CODE,
STD_FEE_AMT,
FEE_UOM_DESC,
COST_CURRENCY_CODE,
DIRECT_COST_AMT,
INDIRECT_COST_AMT as Overhead,
FULL_COST_AMT,
SERVICE_TYPE_REC_END_DATE
from
service_type_dim
where
service_type_code not in
(
select distinct service_type_code
from service_type_dim
inner join revenue_fact
on
service_type_dim.service_type_key = revenue_fact.service_type_key
)
and
SERVICE_TYPE_REC_END_DATE = '9999-12-31'
order by
service_type_code
September 11, 2008 at 10:25 am
riga1966 (9/11/2008)
Guys,I found what the problem was.
The query was running too long because
a heavy insert was being done in REVENUE_FACT at that moment from some SSIS package.
I switched to Production server and this query gave me the right results
in 32 sec:
select
distinct
PRODUCT_CODE,
SERVICE_NAME,
SERVICE_TYPE_CODE,
SERVICE_TYPE_NAME,
FEE_CURRENCY_CODE,
STD_FEE_AMT,
FEE_UOM_DESC,
COST_CURRENCY_CODE,
DIRECT_COST_AMT,
INDIRECT_COST_AMT as Overhead,
FULL_COST_AMT,
SERVICE_TYPE_REC_END_DATE
from
service_type_dim
where
service_type_code not in
(
select distinct service_type_code
from service_type_dim
inner join revenue_fact
on
service_type_dim.service_type_key = revenue_fact.service_type_key
)
and
SERVICE_TYPE_REC_END_DATE = '9999-12-31'
order by
service_type_code
Not sure if it matters much, but you can drop the distinct from the subquery in the NOT IN clause.
😎
September 11, 2008 at 11:32 am
riga1966 (9/10/2008)
My queryselect *
from
service_type_dim
where
service_type_code not in
(
select distinct service_type_code
from service_type_dim
inner join revenue_fact
on
service_type_dim.service_type_key = revenue_fact.service_type_key
)
In many cases a "not exists" outperforms your in-list.
It's certainly worth the test !
select *
from service_type_dim A
where not exists
( select *
from service_type_dim D
inner join revenue_fact F
on D.service_type_key = F.service_type_key
and D.service_type_code = A.service_type_code
)
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
September 11, 2008 at 1:41 pm
Looks like ALZDBA version
is the best when you compare the speed:
Total execution time250
Wait time on server replies0
My version:
Total execution time 343
Wait time on server replies328
By the way, what is the most accurate way to compare performance?
I'm using Client Statistics and focus on
Total execution time, Wait time on server replies
but not sure it;s a good method.
Is "Wait time on server replies" shows you a pure server time execution?
I mean I want to ignore Network and stuff. I just want to see how much time it took for a query to execute on the server.
September 11, 2008 at 2:15 pm
I always use
set statistics io on
set statistics time on
to get a grip rearding what SQLServer has to do for my query.
That in combination with the graphical execution plan opens a whole new world.
(and its XML content)
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
September 11, 2008 at 2:37 pm
ALZDBA,
I think when you choose "Include Client Statistics"
in QA it does the same thing as
set statistics io on
set statistics time on
But the important question is:
How to read the results?
What is the property that shows the exact execution time on the server?
Excluding Network/Client overhead?
September 12, 2008 at 12:13 am
Euhm.... no. It does not produce the same result as show client stats.
Test it and you'll see it reports on object base (stats io).
You can then compare that to what you supposed it was going to do and how you supposed it would handle it.
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
Viewing 9 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply