Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Why does this take so long! Expand / Collapse
Author
Message
Posted Sunday, September 30, 2012 6:05 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 3:19 PM
Points: 10, Visits: 60
Hi Everybody


This is my first posting in this forum, so please bear with me if I should get some thing wrong! I will eventually get there!

I have this query which does make use of some derived tables. My understanding is that this would result in a very efficient query but apparently things work a trifle differently. Could someone please advise why this query takes inordinately long time when the base data set in table "evd_NonNBV_ForChecking_BAU" contains any more that 20,000 records. This one contains a little over 46000 records.

This is the query (partial) :-

Select count(*) FRMP_RecordCount, b.RECORD_ID FRMP_Record_Id, b.METER_ID FRMP_Meter_Id, Convert(Varchar(10), a.FromDate, 103) FRMP_FromDate
, Convert(Varchar(10), a.ToDate, 103) FRMP_ToDate, Convert(Varchar(10), b.Billing_Start_Date, 103) FRMP_Billing_Start_Date, Convert(Varchar(10)
, b.Billing_End_Date, 103) FRMP_Billing_End_Date, NetAmount FRMP_NetAmount, a.Party FRMP_Party, b.PROVIDER FRMP_Provider

From (Select d.RECORD_ID, d.METER_ID, d.BILLING_START_DATE, d.BILLING_END_DATE, d.NET_AMOUNT As NetAmount, d.PROVIDER
From (Select METER_ID, BILLING_START_DATE, BILLING_END_DATE, PROVIDER
From NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU
Group By METER_ID, BILLING_START_DATE, BILLING_END_DATE, Provider
Having Sum(net_AMOUNT) <> 0) c
Inner Join
NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU d
On
c.METER_ID = d.METER_ID And c.BILLING_START_DATE = d.BILLING_START_DATE And c.BILLING_END_DATE = d.BILLING_END_DATE) b



--From #MyTemp_1 b

Left Join
NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a
On
a.NMI = b.METER_ID and (((b.BILLING_START_DATE between a.FromDate and a.ToDate) and (b.billing_end_date between a.FromDate and a.ToDate))
and a.Role = 'FRMP')

--Where a.NMI Is Null

Group By b.RECORD_ID, b.METER_ID, Convert(Varchar(10), a.FromDate, 103), Convert(Varchar(10), a.ToDate, 103), Convert(Varchar(10), b.Billing_Start_Date, 103)
, Convert(Varchar(10), a.ToDate, 103), Convert(Varchar(10), b.Billing_End_Date, 103), NetAmount, a.Party, b.PROVIDER


The question is :-
When this query is run with the derived table (as defined by table definition 'b' - shown in italics), the query takes a very long time to run - its been 16 minutes and its still going - but when this derived table is temporarily stored as a temp table or even a physical table, the query takes about 1 second. This is when the 'where clause' has been commented out. With the where clause included it takes even longer.


Any suggestions why is this happening. One more thing though (if this helps to know) is that when this derived table 'b' is inner joined with table 'a', it run very quickly, no problems there. This problem only shows up with left join. The query does complete but take about an hour when it should normally not take any more than a few seconds.

I have done this kind of left joins with large tables (with about 70 million records) and it hasn't taken this long ever. So what is happening with this one!


Any help with this one would be greatly appreciated.



Best regards



Deepak


Post #1366302
Posted Monday, October 1, 2012 1:20 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 3:21 PM
Points: 42,495, Visits: 35,566
Not enough information. Could be missing indexes, could be incorrect stats, could be a few other things

Please post table definitions, index definitions and execution plan, see http://www.sqlservercentral.com/articles/SQLServerCentral/66909/ for details

Cleaned up query:

SELECT  COUNT(*) FRMP_RecordCount ,
b.RECORD_ID FRMP_Record_Id ,
b.METER_ID FRMP_Meter_Id ,
CONVERT(VARCHAR(10), a.FromDate, 103) FRMP_FromDate ,
CONVERT(VARCHAR(10), a.ToDate, 103) FRMP_ToDate ,
CONVERT(VARCHAR(10), b.Billing_Start_Date, 103) FRMP_Billing_Start_Date ,
CONVERT(VARCHAR(10), b.Billing_End_Date, 103) FRMP_Billing_End_Date ,
NetAmount FRMP_NetAmount ,
a.Party FRMP_Party ,
b.PROVIDER FRMP_Provider
FROM ( SELECT d.RECORD_ID ,
d.METER_ID ,
d.BILLING_START_DATE ,
d.BILLING_END_DATE ,
d.NET_AMOUNT AS NetAmount ,
d.PROVIDER
FROM ( SELECT METER_ID ,
BILLING_START_DATE ,
BILLING_END_DATE ,
PROVIDER
FROM NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU
GROUP BY METER_ID ,
BILLING_START_DATE ,
BILLING_END_DATE ,
Provider
HAVING SUM(net_AMOUNT) <> 0
) c
INNER JOIN NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU d ON c.METER_ID = d.METER_ID
AND c.BILLING_START_DATE = d.BILLING_START_DATE
AND c.BILLING_END_DATE = d.BILLING_END_DATE
) b --From #MyTemp_1 b
LEFT JOIN NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a ON a.NMI = b.METER_ID
AND ( ( ( b.BILLING_START_DATE BETWEEN a.FromDate AND a.ToDate )
AND ( b.billing_end_date BETWEEN a.FromDate AND a.ToDate )
)
AND a.Role = 'FRMP'
)

--Where a.NMI Is Null
GROUP BY b.RECORD_ID ,
b.METER_ID ,
CONVERT(VARCHAR(10), a.FromDate, 103) ,
CONVERT(VARCHAR(10), a.ToDate, 103) ,
CONVERT(VARCHAR(10), b.Billing_Start_Date, 103) ,
CONVERT(VARCHAR(10), a.ToDate, 103) ,
CONVERT(VARCHAR(10), b.Billing_End_Date, 103) ,
NetAmount ,
a.Party ,
b.PROVIDER




Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1366339
Posted Monday, October 1, 2012 7:40 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 3:19 PM
Points: 10, Visits: 60
Hi Gail


Thanks for your response.

I don't know how would the table definition help but here they are :-

evd_NonNBV_ForChecking_BAU
([METER_ID] [varchar](10) NOT NULL,
[BILLING_START_DATE] [smalldatetime] NOT NULL,
[BILLING_END_DATE] [smalldatetime] NOT NULL,
[RECORD_ID] [int] IDENTITY(1,1) NOT NULL,
[NET_AMOUNT] [decimal](15, 2) NOT NULL,
[PROVIDER] [varchar](10) NOT NULL
)



MSATS_DATA_BAU_FINAL
([SequenceNumber] [numeric](18, 0) NULL,
[CreationDate] [datetime] NULL,
[MaintenanceDate] [datetime] NULL,
[RowStatus] [varchar](5) NULL,
[FromDate] [datetime] NULL,
[ToDate] [datetime] NULL,
[NMI] [char](10) NULL,
[Party] [varchar](30) NULL,
[Role] [varchar](20) NULL,
[CATS_FILE_ID] [int] NULL
)


There are no indexes on any of the tables

The execution plans for both options (one with temp table and the other with derived table) are attached herewith for your perusal. Also, attached is the spreadsheet showing the statistics and result output for the option with temp table. Will send the same when the query finishes execution with the derived tables.

Hope this would you in determining what is causing this inordinate delay in the query execution with the derived tables.



P.S The attachment has now (at 1.00 pm AEST) been updated to include the results and statistics for the option with derived tables as well.
This option took 75 minutes against less than a minute with temp table!!!!!!!!!!!!!!!!!!!!!



Best regards



Deepak


  Post Attachments 
Test Query Outputs.zip (3 views, 111.67 KB)
Post #1366847
Posted Tuesday, October 2, 2012 3:16 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 8:56 AM
Points: 7,137, Visits: 13,527
deepakagarwalathome (10/1/2012)
...There are no indexes on any of the tables...


This might have something to do with it. The first sentence of an excellent indexing article reads "Indexes are fundamental to database design".

Your query can be made more efficient by changing the structure slightly:

SELECT
COUNT(*) FRMP_RecordCount ,
b.RECORD_ID FRMP_Record_Id ,
b.METER_ID FRMP_Meter_Id ,
CAST(a.FromDate AS DATE) FRMP_FromDate ,
CAST(a.ToDate AS DATE) FRMP_ToDate ,
CAST(b.Billing_Start_Date AS DATE) FRMP_Billing_Start_Date ,
CAST(b.Billing_End_Date AS DATE) FRMP_Billing_End_Date ,
NetAmount FRMP_NetAmount ,
a.Party FRMP_Party ,
b.PROVIDER FRMP_Provider
FROM (
SELECT
RECORD_ID ,
METER_ID ,
BILLING_START_DATE ,
BILLING_END_DATE ,
NET_AMOUNT AS NetAmount ,
PROVIDER,
SUMnet_AMOUNT = SUM(net_AMOUNT) OVER(PARTITION BY METER_ID, BILLING_START_DATE, BILLING_END_DATE, Provider)
FROM NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU
) b

LEFT JOIN NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a
ON a.NMI = b.METER_ID
AND b.BILLING_START_DATE BETWEEN a.FromDate AND a.ToDate
AND b.billing_end_date BETWEEN a.FromDate AND a.ToDate
AND a.Role = 'FRMP'

WHERE b.SUMnet_AMOUNT <> 0

GROUP BY
b.RECORD_ID ,
b.METER_ID ,
CAST(a.FromDate AS DATE) FRMP_FromDate ,
CAST(a.ToDate AS DATE) FRMP_ToDate ,
CAST(b.Billing_Start_Date AS DATE) FRMP_Billing_Start_Date ,
CAST(b.Billing_End_Date AS DATE) FRMP_Billing_End_Date ,
b.NetAmount ,
a.Party ,
b.PROVIDER

But it's no substitute for correct indexing. The tables are only half the story.


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1366917
Posted Tuesday, October 2, 2012 10:21 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 3:19 PM
Points: 10, Visits: 60
Hi ChrisM@Work


Thanks a zillion for your response. That works fine. No problems.

A couple of questions though :-

First, whilst both 'Over Partiion By...' and the derived table sub-query do the same thing, is the former more efficient in its execution by an order of magnitude or it happened to be so in this one instance?
Second, I am attaching the full query that I now have (I haven't replaced the similar code in the first half of the query as the inner join is very quick - it was only the left join that was the problem - which has now been addressed - thanks to you) and I find that it work fine in about 9 seconds if the last 'Where...' clause after the table 'z' definition is not included. However, if you include this last 'Where...' clause' it takes a hell of a lot longer (10+ minutes). I can certainly populate a table with the dataset without the last 'Where...' clause and then extract those records where the condition of the last 'Where...' clause' is implemented. But I wanted to know is this happening. Any suggestions.

Further, out of interest, did the execution plans and the statistics that I had included give you any more indication as to why one option took 75+ minutes whilst the other took less than 10 seconds????????????



P.S. I have updated this post with the output and statistics for the two options - took 105 minutes with the last where clause included.


Best regards




Deepak


  Post Attachments 
MyQuery.txt (2 views, 7.67 KB)
Post #1367394
Posted Wednesday, October 3, 2012 9:51 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 8:56 AM
Points: 7,137, Visits: 13,527
deepakagarwalathome (10/2/2012)
Hi ChrisM@Work


Thanks a zillion for your response. That works fine. No problems.

A couple of questions though :-

First, whilst both 'Over Partiion By...' and the derived table sub-query do the same thing, is the former more efficient in its execution by an order of magnitude or it happened to be so in this one instance?

It’s more efficient because it only requires one read of the table. Aggregating then joining back requires two.


Second, I am attaching the full query that I now have (I haven't replaced the similar code in the first half of the query as the inner join is very quick - it was only the left join that was the problem - which has now been addressed - thanks to you) and I find that it work fine in about 9 seconds if the last 'Where...' clause after the table 'z' definition is not included. However, if you include this last 'Where...' clause' it takes a hell of a lot longer (10+ minutes). I can certainly populate a table with the dataset without the last 'Where...' clause and then extract those records where the condition of the last 'Where...' clause' is implemented. But I wanted to know is this happening. Any suggestions.

Lack of indexes and index stats – the optimizer can only make a poor guess with little to work on.
I’ve made a couple of amendments to your full query but there’s a way to go yet. There’s a lot of almost-duplicated reads of tables.


Further, out of interest, did the execution plans and the statistics that I had included give you any more indication as to why one option took 75+ minutes whilst the other took less than 10 seconds????????????

Not really. The execution plans are estimated – the actual plans are far more useful. If you can post them as attachments, that would be grand.


P.S. I have updated this post with the output and statistics for the two options - took 105 minutes with the last where clause included.


Best regards


Deepak


Here’s what I’ve got so far – test and see if it matches your existing output. Get the indexing done if you can.

/*
recommended indexes
NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU
METER_ID, RECORD_ID, BILLING_START_DATE, billing_end_date, provider

NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL
NMI, FromDate, ToDate, Role, Party
*/

-- SumNetAmount1 and SumNetAmount2 generate the same result set
; WITH SumNetAmount1 AS (
Select d.RECORD_ID, d.METER_ID, d.BILLING_START_DATE, d.BILLING_END_DATE, d.NET_AMOUNT As NetAmount, d.PROVIDER
From (
Select METER_ID, BILLING_START_DATE, BILLING_END_DATE, PROVIDER
From NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU
Group By METER_ID, BILLING_START_DATE, BILLING_END_DATE, Provider
Having Sum(net_AMOUNT) <> 0
) c
Inner Join NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU d
on c.METER_ID = d.METER_ID
And c.BILLING_START_DATE = d.BILLING_START_DATE
And c.BILLING_END_DATE = d.BILLING_END_DATE
),
SumNetAmount2 AS (
Select RECORD_ID, METER_ID, BILLING_START_DATE, BILLING_END_DATE, NET_AMOUNT AS NetAmount, PROVIDER,
SUMNET_AMOUNT = SUM(net_AMOUNT) OVER(PARTITION BY METER_ID, BILLING_START_DATE, BILLING_END_DATE, Provider)
From NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU
)

select z.FRMP_RecordCount, z.FRMP_Record_Id, z.FRMP_Meter_Id, z.FRMP_FromDate, z.FRMP_ToDate, z.FRMP_Billing_Start_Date, z.FRMP_Billing_End_Date
, z.FRMP_NetAmount, z.FRMP_Party, z.FRMP_Provider
, z.MDP_RecordCount, z.MDP_Record_Id, z.MDP_Meter_Id, z.MDP_FromDate, z.MDP_ToDate, z.MDP_Billing_Start_Date, z.MDP_Billing_End_Date
, z.MDP_NetAmount, z.MDP_Party, z.MDP_Provider
, z.StatusComment

From (Select FRMP_RecordCount, FRMP_Record_Id, FRMP_Meter_Id, FRMP_FromDate, FRMP_ToDate, FRMP_Billing_Start_Date, FRMP_Billing_End_Date
, FRMP_NetAmount, FRMP_Party, FRMP_Provider
, MDP_RecordCount, MDP_Record_Id, MDP_Meter_Id, MDP_FromDate, MDP_ToDate, MDP_Billing_Start_Date, MDP_Billing_End_Date
, MDP_NetAmount, MDP_Party, MDP_Provider
, Case When z1.FRMP_Party Is Not Null Then Case When z2.MDP_Party Is Not Null Then 'VALID - TRU FRMP, CORRECT MDP' Else 'INVALID - TRU FRMP, INCORRECT MDP' End
Else 'INVALID - TRU NOT FRMP'
End 'StatusComment'
From ( -- z1
Select Count(*) FRMP_RecordCount, b.RECORD_ID FRMP_Record_Id, b.METER_ID FRMP_Meter_Id, Convert(Varchar(10), a.FromDate, 103) FRMP_FromDate
, Convert(Varchar(10), a.ToDate, 103) FRMP_ToDate, Convert(Varchar(10), b.Billing_Start_Date, 103) FRMP_Billing_Start_Date,
Convert(Varchar(10), b.Billing_End_Date, 103) FRMP_Billing_End_Date
, NetAmount FRMP_NetAmount, a.Party FRMP_Party, b.PROVIDER FRMP_Provider
From SumNetAmount1 b -- CTE

Inner Join NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a
On a.NMI = b.METER_ID
and (((b.BILLING_START_DATE between a.FromDate and a.ToDate)
and (b.billing_end_date between a.FromDate and a.ToDate))
and a.Role = 'FRMP')

Group By
b.RECORD_ID,
b.METER_ID,
Convert(Varchar(10), a.FromDate, 103),
Convert(Varchar(10), a.ToDate, 103),
Convert(Varchar(10), b.Billing_Start_Date, 103),
Convert(Varchar(10), b.Billing_End_Date, 103),
NetAmount, a.Party, b.PROVIDER
) z1

Full Outer Join
( -- z2
Select count(*) MDP_RecordCount, b.RECORD_ID MDP_Record_Id, b.METER_ID MDP_Meter_Id, Convert(Varchar(10), a.FromDate, 103) MDP_FromDate
, Convert(Varchar(10), a.ToDate, 103) MDP_ToDate, Convert(Varchar(10), b.Billing_Start_Date, 103) MDP_Billing_Start_Date, Convert(Varchar(10), b.Billing_End_Date, 103) MDP_Billing_End_Date
, NetAmount MDP_NetAmount, a.Party MDP_Party, b.PROVIDER MDP_Provider

From SumNetAmount1 b -- CTE

Inner Join NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a
On a.NMI = b.METER_ID
and (((b.BILLING_START_DATE between a.FromDate and a.ToDate)
and (b.billing_end_date between a.FromDate and a.ToDate))
and a.Role = 'MDP')

Where 1 = Case When Upper(b.PROVIDER) = 'SPA' and Left(a.party, 3) In ('GLO', 'SPA') Then 1
When Upper(b.PROVIDER) <> 'SPA' and Upper(b.PROVIDER) = Left(a.party, 3) Then 1
End
Group By
b.RECORD_ID,
b.METER_ID,
Convert(Varchar(10), a.FromDate, 103),
Convert(Varchar(10), a.ToDate, 103),
Convert(Varchar(10), b.Billing_Start_Date, 103),
Convert(Varchar(10), b.Billing_End_Date, 103),
NetAmount, a.Party, b.PROVIDER
) z2
On z1.FRMP_Record_Id = z2.MDP_Record_Id


Union All


Select FRMP_RecordCount, FRMP_Record_Id, FRMP_Meter_Id, FRMP_FromDate, FRMP_ToDate, FRMP_Billing_Start_Date, FRMP_Billing_End_Date
, FRMP_NetAmount, FRMP_Party, FRMP_Provider
, MDP_RecordCount, MDP_Record_Id, MDP_Meter_Id, MDP_FromDate, MDP_ToDate, MDP_Billing_Start_Date, MDP_Billing_End_Date
, MDP_NetAmount, MDP_Party, MDP_Provider
, Case When z1.FRMP_Party Is Not Null Then Case When z2.MDP_Party Is Not Null Then 'VALID - TRU FRMP, CORRECT MDP'
Else 'INVALID - TRU FRMP, INCORRECT MDP'
End
Else 'INVALID - TRU NOT FRMP'
End 'StatusComment'
From ( -- z1
Select count(*) FRMP_RecordCount, b.RECORD_ID FRMP_Record_Id, b.METER_ID FRMP_Meter_Id, Convert(Varchar(10), a.FromDate, 103) FRMP_FromDate
, Convert(Varchar(10), a.ToDate, 103) FRMP_ToDate, Convert(Varchar(10), b.Billing_Start_Date, 103) FRMP_Billing_Start_Date, Convert(Varchar(10), b.Billing_End_Date, 103) FRMP_Billing_End_Date
, NetAmount FRMP_NetAmount, a.Party FRMP_Party, b.PROVIDER FRMP_Provider

From SumNetAmount2 b --- ## CTE
LEFT JOIN NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a
On a.NMI = b.METER_ID
and b.BILLING_START_DATE BETWEEN a.FromDate AND a.ToDate
and b.billing_end_date BETWEEN a.FromDate AND a.ToDate
and a.Role = 'FRMP'

Where b.SUMnet_AMOUNT <> 0
and a.NMI Is Null

Group By
b.RECORD_ID,
b.METER_ID,
Convert(Varchar(10), a.FromDate, 103),
Convert(Varchar(10), a.ToDate, 103),
Convert(Varchar(10), b.Billing_Start_Date, 103),
Convert(Varchar(10), a.ToDate, 103),
Convert(Varchar(10), b.Billing_End_Date, 103),
NetAmount, a.Party, b.PROVIDER
) z1

Full Outer Join ( -- z2

Select count(*) MDP_RecordCount, b.RECORD_ID MDP_Record_Id, b.METER_ID MDP_Meter_Id, Convert(Varchar(10), a.FromDate, 103) MDP_FromDate, Convert(Varchar(10), a.ToDate, 103) MDP_ToDate
, Convert(Varchar(10), b.Billing_Start_Date, 103) MDP_Billing_Start_Date, Convert(Varchar(10), b.Billing_End_Date, 103) MDP_Billing_End_Date
, NetAmount MDP_NetAmount, a.Party MDP_Party, b.PROVIDER MDP_Provider

From SumNetAmount2 b --- ## CTE
LEFT JOIN NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a
On a.NMI = b.METER_ID
and b.BILLING_START_DATE BETWEEN a.FromDate AND a.ToDate
and b.billing_end_date BETWEEN a.FromDate AND a.ToDate
and a.Role = 'MDP'

Where b.SUMnet_AMOUNT <> 0
and a.NMI Is Null

and 1 = Case
When Upper(b.PROVIDER) = 'SPA' and Left(a.party, 3) In ('GLO', 'SPA') Then 1
When Upper(b.PROVIDER) <> 'SPA' and Upper(b.PROVIDER) = Left(a.party, 3) Then 1
End

Group By
b.RECORD_ID,
b.METER_ID,
Convert(Varchar(10), a.FromDate, 103),
Convert(Varchar(10), a.ToDate, 103),
Convert(Varchar(10), b.Billing_Start_Date, 103),
Convert(Varchar(10), a.ToDate, 103),
Convert(Varchar(10), b.Billing_End_Date, 103),
NetAmount, a.Party, b.PROVIDER
) z2
On z1.FRMP_Record_Id = z2.MDP_Record_Id

) z
Where z.FRMP_Record_Id Is Not Null



“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1367816
Posted Wednesday, October 3, 2012 5:17 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 3:19 PM
Points: 10, Visits: 60
Hi Chris


Thanks for that!I fully understand what you are trying to do with the CTE's. I have run the query with the CTE's.

The problem, however, remains the same, that is, it still executes the option without the last 'Where...' clause in around 10 seconds but with the last 'Where...' clause included, it takes aound the same 75 minutes.

I have attached the 'Actual Execution' plans for the two options for your perusal - I don't understand how to read them. These are for the queries as was sent to you yesterday (without the CTE definition - but as I said earlier that didn't make a hell of a lot of difference in the execution times at all).

Looking forward to hearing from you about what you find therein!




Best regards



Deepak
Post #1368108
Posted Thursday, October 4, 2012 5:13 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 8:56 AM
Points: 7,137, Visits: 13,527
deepakagarwalathome (10/3/2012)
Hi Chris


Thanks for that!I fully understand what you are trying to do with the CTE's. I have run the query with the CTE's.

The problem, however, remains the same, that is, it still executes the option without the last 'Where...' clause in around 10 seconds but with the last 'Where...' clause included, it takes aound the same 75 minutes.

I have attached the 'Actual Execution' plans for the two options for your perusal - I don't understand how to read them. These are for the queries as was sent to you yesterday (without the CTE definition - but as I said earlier that didn't make a hell of a lot of difference in the execution times at all).

Looking forward to hearing from you about what you find therein!




Best regards



Deepak


It takes 75 minutes because the optimiser is choosing a poor plan based on the information it has. It would be interesting though to see the execution plans with and without the WHERE clause. I'm afraid the attachment is of no value to me. What you should do is save an actual plan as a .sqlplan file and attach it to the post. Do this for the actual plans with and without the WHERE clause.

I've had another play with your query. There are four queries within it which read data, something like

q1 full outer join q2
union all
q3 full outer join q4

I've reconstructed the common elements of q1 and q2 as CTE1, and q3 and q4 as CTE2. The resulting query looks like this:

  • Reconstructed query with CTEs


  • ; WITH
    CTE1 AS ( -- used twice by part above UNION ALL
    SELECT
    b.RECORD_ID RECORD_ID,
    b.METER_ID Meter_Id,
    Convert(Varchar(10), a.FromDate, 103) FromDate,
    Convert(Varchar(10), a.ToDate, 103) ToDate,
    Convert(Varchar(10), b.Billing_Start_Date, 103) Billing_Start_Date,
    Convert(Varchar(10), b.Billing_End_Date, 103) Billing_End_Date,
    b.NetAmount NetAmount,
    SUMNET_AMOUNT = SUM(b.net_AMOUNT) OVER(PARTITION BY METER_ID, BILLING_START_DATE, BILLING_END_DATE, Provider),
    a.Party Party,
    b.PROVIDER Provider,
    a.[Role] [Role]
    FROM NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU b
    INNER JOIN NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a
    ON a.NMI = b.METER_ID
    AND b.BILLING_START_DATE between a.FromDate and a.ToDate
    AND b.billing_end_date between a.FromDate and a.ToDate
    ),
    CTE2 AS ( -- used twice by part below UNION ALL
    SELECT
    b.RECORD_ID RECORD_ID,
    b.METER_ID Meter_Id,
    Convert(Varchar(10), a.FromDate, 103) FromDate,
    Convert(Varchar(10), a.ToDate, 103) ToDate,
    Convert(Varchar(10), b.Billing_Start_Date, 103) Billing_Start_Date,
    Convert(Varchar(10), b.Billing_End_Date, 103) Billing_End_Date,
    b.NetAmount NetAmount,
    SUMNET_AMOUNT = SUM(b.net_AMOUNT) OVER(PARTITION BY METER_ID, BILLING_START_DATE, BILLING_END_DATE, Provider),
    a.Party Party,
    b.PROVIDER Provider,
    a.[Role] [Role]
    FROM NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU b
    LEFT JOIN NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a
    On a.NMI = b.METER_ID
    AND b.BILLING_START_DATE BETWEEN a.FromDate AND a.ToDate
    AND b.billing_end_date BETWEEN a.FromDate AND a.ToDate
    WHERE a.NMI IS NULL
    )

    select z.FRMP_RecordCount, z.FRMP_Record_Id, z.FRMP_Meter_Id, z.FRMP_FromDate, z.FRMP_ToDate, z.FRMP_Billing_Start_Date, z.FRMP_Billing_End_Date
    , z.FRMP_NetAmount, z.FRMP_Party, z.FRMP_Provider
    , z.MDP_RecordCount, z.MDP_Record_Id, z.MDP_Meter_Id, z.MDP_FromDate, z.MDP_ToDate, z.MDP_Billing_Start_Date, z.MDP_Billing_End_Date
    , z.MDP_NetAmount, z.MDP_Party, z.MDP_Provider
    , z.StatusComment

    From (
    Select
    FRMP_RecordCount, FRMP_Record_Id, FRMP_Meter_Id, FRMP_FromDate, FRMP_ToDate, FRMP_Billing_Start_Date, FRMP_Billing_End_Date,
    FRMP_NetAmount, FRMP_Party, FRMP_Provider,
    MDP_RecordCount, MDP_Record_Id, MDP_Meter_Id, MDP_FromDate, MDP_ToDate, MDP_Billing_Start_Date, MDP_Billing_End_Date,
    MDP_NetAmount, MDP_Party, MDP_Provider,
    CASE
    WHEN a1.FRMP_Party IS NOT NULL AND a2.MDP_Party IS NOT NULL THEN 'VALID - TRU FRMP, CORRECT MDP'
    WHEN a1.FRMP_Party IS NOT NULL AND a2.MDP_Party IS NULL THEN 'INVALID - TRU FRMP, INCORRECT MDP'
    ELSE 'INVALID - TRU NOT FRMP'
    END [StatusComment]
    FROM ( -- q1

    SELECT
    Count(*) FRMP_RecordCount,
    RECORD_ID FRMP_Record_Id,
    Meter_Id FRMP_Meter_Id,
    FromDate FRMP_FromDate,
    ToDate FRMP_ToDate,
    Billing_Start_Date FRMP_Billing_Start_Date,
    Billing_End_Date FRMP_Billing_End_Date,
    NetAmount FRMP_NetAmount,
    Party FRMP_Party,
    Provider FRMP_Provider
    FROM CTE1
    WHERE [Role] = 'FRMP'
    AND SUMNET_AMOUNT <> 0
    GROUP BY RECORD_ID, Meter_Id, FromDate, ToDate,
    Billing_Start_Date, Billing_End_Date, NetAmount, Party, Provider

    ) q1

    FULL OUTER JOIN ( -- q2

    SELECT
    Count(*) MDP_RecordCount,
    RECORD_ID MDP_Record_Id,
    Meter_Id MDP_Meter_Id,
    FromDate MDP_FromDate,
    ToDate MDP_ToDate,
    Billing_Start_Date MDP_Billing_Start_Date,
    Billing_End_Date MDP_Billing_End_Date,
    NetAmount MDP_NetAmount,
    Party MDP_Party,
    Provider MDP_Provider
    FROM CTE1
    WHERE [Role] = 'MDP'
    AND SUMNET_AMOUNT <> 0
    AND ( (Upper(PROVIDER) = Left(party, 3)) OR (Upper(PROVIDER) = 'SPA' and Left(party, 3) = 'GLO') )
    GROUP BY Record_Id, Meter_Id, FromDate, ToDate,
    Billing_Start_Date, Billing_End_Date, NetAmount, Party, Provider

    ) q2 ON q1.FRMP_Record_Id = q2.MDP_Record_Id

    UNION ALL

    Select FRMP_RecordCount, FRMP_Record_Id, FRMP_Meter_Id, FRMP_FromDate, FRMP_ToDate, FRMP_Billing_Start_Date, FRMP_Billing_End_Date
    , FRMP_NetAmount, FRMP_Party, FRMP_Provider
    , MDP_RecordCount, MDP_Record_Id, MDP_Meter_Id, MDP_FromDate, MDP_ToDate, MDP_Billing_Start_Date, MDP_Billing_End_Date
    , MDP_NetAmount, MDP_Party, MDP_Provider,
    CASE
    WHEN z1.FRMP_Party Is Not Null AND z2.MDP_Party Is Not Null Then 'VALID - TRU FRMP, CORRECT MDP'
    WHEN z1.FRMP_Party Is Not Null AND z2.MDP_Party Is Null Then 'INVALID - TRU FRMP, INCORRECT MDP'
    ELSE 'INVALID - TRU NOT FRMP'
    END [StatusComment]
    From ( -- q3

    SELECT
    count(*) FRMP_RecordCount,
    RECORD_ID FRMP_Record_Id,
    METER_ID FRMP_Meter_Id,
    Convert(Varchar(10), FromDate, 103) FRMP_FromDate,
    Convert(Varchar(10), ToDate, 103) FRMP_ToDate,
    Convert(Varchar(10), Billing_Start_Date, 103) FRMP_Billing_Start_Date,
    Convert(Varchar(10), Billing_End_Date, 103) FRMP_Billing_End_Date,
    NetAmount FRMP_NetAmount, Party FRMP_Party, PROVIDER FRMP_Provider
    FROM CTE2
    WHERE [Role] = 'FRMP'
    AND SUMNET_AMOUNT <> 0
    GROUP BY Record_Id, Meter_Id, FromDate, ToDate,
    Billing_Start_Date, Billing_End_Date, NetAmount, Party, Provider

    ) q3

    FULL OUTER JOIN ( -- q4

    SELECT
    count(*) MDP_RecordCount,
    RECORD_ID MDP_Record_Id,
    METER_ID MDP_Meter_Id,
    Convert(Varchar(10), FromDate, 103) MDP_FromDate,
    Convert(Varchar(10), ToDate, 103) MDP_ToDate,
    Convert(Varchar(10), Billing_Start_Date, 103) MDP_Billing_Start_Date,
    Convert(Varchar(10), Billing_End_Date, 103) MDP_Billing_End_Date,
    NetAmount MDP_NetAmount, Party MDP_Party, PROVIDER MDP_Provider
    FROM CTE2
    WHERE [Role] = 'MDP'
    AND SUMnet_AMOUNT <> 0
    AND ( (Upper(PROVIDER) = Left(party, 3)) OR (Upper(PROVIDER) = 'SPA' and Left(party, 3) = 'GLO') )
    GROUP BY Record_Id, Meter_Id, FromDate, ToDate,
    Billing_Start_Date, Billing_End_Date, NetAmount, Party, Provider

    ) q4 ON q3.FRMP_Record_Id = q4.MDP_Record_Id

    ) z
    Where z.FRMP_Record_Id Is Not Null

    You should test that this works and amend it if it doesn't generate the expected results.


    Having reconstructed the query to use those two CTE's, you can now see the wood through the trees. The query actually looks like this;

    -- get all matching rows between table a and table b 
    SELECT columns
    FROM NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU b
    INNER JOIN NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a ON a.matchingcolumn = b.matchingcolumn
    UNION ALL
    -- get all rows from table b where there's no match in table a
    SELECT columns
    FROM NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU b
    LEFT JOIN NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a ON a.matchingcolumn = b.matchingcolumn
    WHERE a.matchingcolumn IS NULL


    Now this is logically equivalent to a left join; get all rows from the left hand table whether or not they match on the right (get all rows from the lhs where they match, plus all rows from the lhs where they don't). This simplifies your overall query to the following:

  • Replace UNION with LEFT JOIN

  • ; WITH 
    CTE1 AS ( -- used twice
    SELECT
    b.RECORD_ID RECORD_ID,
    b.METER_ID Meter_Id,
    CAST(a.FromDate AS DATE) FromDate,
    CAST(a.ToDate AS DATE) ToDate,
    CAST(b.Billing_Start_Date AS DATE) Billing_Start_Date,
    CAST(b.Billing_End_Date AS DATE) Billing_End_Date,
    b.NetAmount NetAmount,
    SUMNET_AMOUNT = SUM(b.net_AMOUNT) OVER(PARTITION BY METER_ID, BILLING_START_DATE, BILLING_END_DATE, Provider),
    a.Party Party,
    b.PROVIDER Provider,
    a.[Role] [Role]
    FROM NETWORK_VALIDATION..evd_NonNBV_ForChecking_BAU b
    LEFT JOIN NETWORK_VALIDATION..MSATS_DATA_BAU_FINAL a
    On a.NMI = b.METER_ID
    AND b.BILLING_START_DATE BETWEEN a.FromDate AND a.ToDate
    AND b.billing_end_date BETWEEN a.FromDate AND a.ToDate
    )

    SELECT
    FRMP_RecordCount, FRMP_Record_Id, FRMP_Meter_Id, FRMP_FromDate, FRMP_ToDate, FRMP_Billing_Start_Date, FRMP_Billing_End_Date,
    FRMP_NetAmount, FRMP_Party, FRMP_Provider,
    MDP_RecordCount, MDP_Record_Id, MDP_Meter_Id, MDP_FromDate, MDP_ToDate, MDP_Billing_Start_Date, MDP_Billing_End_Date,
    MDP_NetAmount, MDP_Party, MDP_Provider,
    CASE
    WHEN a1.FRMP_Party IS NOT NULL AND a2.MDP_Party IS NOT NULL THEN 'VALID - TRU FRMP, CORRECT MDP'
    WHEN a1.FRMP_Party IS NOT NULL AND a2.MDP_Party IS NULL THEN 'INVALID - TRU FRMP, INCORRECT MDP'
    ELSE 'INVALID - TRU NOT FRMP'
    END [StatusComment]
    FROM ( -- q1
    SELECT
    Count(*) FRMP_RecordCount,
    RECORD_ID FRMP_Record_Id,
    Meter_Id FRMP_Meter_Id,
    FromDate FRMP_FromDate,
    ToDate FRMP_ToDate,
    Billing_Start_Date FRMP_Billing_Start_Date,
    Billing_End_Date FRMP_Billing_End_Date,
    NetAmount FRMP_NetAmount,
    Party FRMP_Party,
    Provider FRMP_Provider
    FROM CTE1
    WHERE [Role] = 'FRMP'
    AND SUMNET_AMOUNT <> 0
    AND RECORD_ID IS NOT NULL
    GROUP BY RECORD_ID, Meter_Id, FromDate, ToDate,
    Billing_Start_Date, Billing_End_Date, NetAmount, Party, Provider
    ) q1
    LEFT JOIN ( -- q2
    SELECT
    Count(*) MDP_RecordCount,
    RECORD_ID MDP_Record_Id,
    Meter_Id MDP_Meter_Id,
    FromDate MDP_FromDate,
    ToDate MDP_ToDate,
    Billing_Start_Date MDP_Billing_Start_Date,
    Billing_End_Date MDP_Billing_End_Date,
    NetAmount MDP_NetAmount,
    Party MDP_Party,
    Provider MDP_Provider
    FROM CTE1
    WHERE [Role] = 'MDP'
    AND SUMNET_AMOUNT <> 0
    AND ( (Upper(PROVIDER) = Left(party, 3)) OR (Upper(PROVIDER) = 'SPA' and Left(party, 3) = 'GLO') )
    GROUP BY Record_Id, Meter_Id, FromDate, ToDate,
    Billing_Start_Date, Billing_End_Date, NetAmount, Party, Provider
    ) q2 ON q1.FRMP_Record_Id = q2.MDP_Record_Id

    Remember where you had a full outer join between the two derived tables – followed by that pesky WHERE clause (Where z.FRMP_Record_Id Is Not Null)? The WHERE clause turns the full outer join into a left join. I think that’s all it does, but just in case I’ve included it in q1.

    Removing a full outer join from a query against tables with no indexes is likely to significantly improve performance – give it a shot and let us know how you get on. When you’re done (preferably before), you should concentrate on those indexes. Ask if you are unsure how to go about deciding which columns to use.



    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
    Exploring Recursive CTEs by Example Dwain Camps
    Post #1368303
    Posted Sunday, October 7, 2012 2:52 PM
    Grasshopper

    GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

    Group: General Forum Members
    Last Login: Tuesday, July 29, 2014 3:19 PM
    Points: 10, Visits: 60
    Hi Chris


    Sorry for this inordinate delay. I couldn't upload the attachments on Friday and I didn't have access to this data at home, so didn't make the post then.

    Here it is now though.

    In the attachment, I have included the query that I ran, the result sets, the statistics data and the actual execution plans for both the options as requested by you, the first without the last where clause and the second with it. This time around, it didn't take anywhere near as long as it did earlier although I did replace the derived table 'b' with your '...partition by... clause for both the queries of the overall union query. May be that was what was bringing it down with the where clause.

    As I have said earlier, I dont't know how to read the execution plan as yet (I am going to be spending time soon to be able to understand at least the very basics of it so that I can write slightly more efficient queries this point forward), but I am sure it would tell you a lot and you would be able to intrepret the same and advise as to where it was falling down.

    Shall look forward to hearing from you soon.

    Once again, sorry for not providing you with the relevant details earlier.




    Best regards



    Deepak
    I will test with your revision later tonight and advise.



    Deepak


      Post Attachments 
    Resutls and Statistics - latest - 20121008.zip (1 view, 7.53 MB)
    Post #1369577
    « Prev Topic | Next Topic »

    Add to briefcase

    Permissions Expand / Collapse