Identifying Start Dates not Aligned with End Dates on a Prior Row

  • Dwain Camps

    SSC Guru

    Points: 86893

    Comments posted to this topic are about the item Identifying Start Dates not Aligned with End Dates on a Prior Row


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.

  • starunit

    SSCommitted

    Points: 1807

    Nice article, Dwain.

I've been using the CTE approach, and hadn't thought about using CROSS APPLY - but it looks like that's a bit slower anyway. And I'll try to remember the LAG() function when we move to SQL 2012.

    Nevertheless, I'll bookmark your article for later reference.

    Thanks!

    Mark
    Just a cog in the wheel.

  • ken.trock

    SSCertifiable

    Points: 5147

    Interesting article; Dwain requires that, for a given product, the EffectiveStartDT of the next timeline matches the EffectiveEndDT of the previous. This helps to enforce no overlapping dates for a product. What's the price of that product on the transition date? The one on EffectiveStartDT of the later timeline.

    In our employee hierarchy, we (or our predecessors) have EffectiveEndDT always 1 day before the next EffectiveStartDT for that employee. Our users are locked into selecting data and joining back to the employee data with

    WHERE <dataDate> BETWEEN EffectiveStartDT AND EffectiveEndDT.

Here, there's an issue if they're reporting on a period of time including the transition day, where they would double the fact data. OTOH, we don't have all the nice integrity checking that Dwain has.

    Ken
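To make the double-count risk on the transition day concrete, here is a minimal sketch. It uses Python's sqlite3 and ISO date strings purely so it is runnable anywhere (the comparison semantics are the same as in SQL Server); the table and column names follow the thread, and the data is invented. On a boundary date shared by two rows, the inclusive BETWEEN matches both rows, while a half-open test matches exactly one:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ProductPrices (
    ProductID        INTEGER,
    EffectiveStartDT TEXT,
    EffectiveEndDT   TEXT,
    ProductPrice     REAL
);
-- Adjacent rows share the transition date 2015-02-01
INSERT INTO ProductPrices VALUES
    (1, '2015-01-01', '2015-02-01', 10.0),
    (1, '2015-02-01', '2015-03-01', 12.0);
""")

day = '2015-02-01'  # the transition date

# Inclusive BETWEEN matches both rows on the transition day (double count)
both = con.execute(
    "SELECT COUNT(*) FROM ProductPrices "
    "WHERE ? BETWEEN EffectiveStartDT AND EffectiveEndDT", (day,)
).fetchone()[0]

# A half-open comparison matches exactly one row
one = con.execute(
    "SELECT COUNT(*) FROM ProductPrices "
    "WHERE ? >= EffectiveStartDT AND ? < EffectiveEndDT", (day, day)
).fetchone()[0]

print(both, one)  # 2 1
```

This is the same half-open pattern Peter suggests later in the thread; a fact row dated on the boundary joins to two price rows under BETWEEN but only one under `>= start AND < end`.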

  • Eirikur Eiriksson

    SSC Guru

    Points: 182511

    Good article as always Dwain, thank you very much.

Some thoughts: first of all, the 1M test code only produced 100 rows until I adjusted the TOP clauses. The second thing has to do with the sorting implied by the LEAD; simplifying the work by date-diffing to 0 does improve the performance there - narrowing the "worktable", I guess, makes a difference. I did something similar recently which brought the execution time down by 40%.

    😎

SELECT ProductID, EffectiveStartDT, EffectiveEndDT, ProductPrice
FROM
(
    SELECT ProductID, EffectiveStartDT, EffectiveEndDT, ProductPrice
        ,d = LAG(EffectiveEndDT, 1) OVER
            (
                PARTITION BY ProductID ORDER BY DATEDIFF(DAY, 0, EffectiveStartDT)
            )
    FROM dbo.ProductPrices
) a
WHERE d IS NOT NULL AND d <> EffectiveStartDT;

SELECT ProductID, EffectiveStartDT, EffectiveEndDT, ProductPrice
FROM
(
    SELECT ProductID, EffectiveStartDT, EffectiveEndDT, ProductPrice
        ,d = LAG(EffectiveEndDT, 1) OVER
            (
                PARTITION BY ProductID ORDER BY CONVERT(BINARY(8), EffectiveStartDT, 0)
            )
    FROM dbo.ProductPrices
) a
WHERE d IS NOT NULL AND d <> EffectiveStartDT;

    Stats

    Without Datediff:

    SQL Server parse and compile time:

    CPU time = 3 ms, elapsed time = 3 ms.

    Table 'ProductPrices'. Scan count 5, logical reads 4713, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 4352 ms, elapsed time = 2156 ms.

    With Datediff:

    Table 'ProductPrices'. Scan count 5, logical reads 4713, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 4352 ms, elapsed time = 1878 ms.

  • Dwain Camps

    SSC Guru

    Points: 86893

    starunit (3/24/2015)


    Nice article, Dwain.

I've been using the CTE approach, and hadn't thought about using CROSS APPLY - but it looks like that's a bit slower anyway. And I'll try to remember the LAG() function when we move to SQL 2012.

    Nevertheless, I'll bookmark your article for later reference.

    Thanks!

    I'm not sure quite what you mean by the "CTE approach" because any use of a CTE is simply a derived table in disguise (unless the CTE is used in multiple places in the query). CTEs are just a generic way to construct a temporary row set for use in the current query.

    I would say, you've also got to consider the EXISTS approach. What I said about always "forgetting" to consider how to do a query using EXISTS as a filter is true. That's why I constantly have to kick myself when it turns out to be faster (and that is relatively often).
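For illustration, here is a sketch of what an EXISTS-style filter for the misaligned-start check might look like. This is a reconstruction, not the article's benchmarked code, and it runs against SQLite via Python's sqlite3 with invented sample data; the query shape translates directly to T-SQL:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ProductPrices (
    ProductID        INTEGER,
    EffectiveStartDT TEXT,
    EffectiveEndDT   TEXT,
    ProductPrice     REAL
);
INSERT INTO ProductPrices VALUES
    (1, '2015-01-01', '2015-02-01', 10.0),   -- first row for the product
    (1, '2015-02-15', '2015-03-01', 12.0);   -- starts after a gap
""")

# Flag rows that have some earlier row for the product (so they are not the
# first price) but no row whose end date lands exactly on their start date.
misaligned = con.execute("""
    SELECT p.ProductID, p.EffectiveStartDT
    FROM ProductPrices p
    WHERE EXISTS (
            SELECT 1 FROM ProductPrices e
            WHERE e.ProductID = p.ProductID
              AND e.EffectiveStartDT < p.EffectiveStartDT)
      AND NOT EXISTS (
            SELECT 1 FROM ProductPrices e
            WHERE e.ProductID = p.ProductID
              AND e.EffectiveEndDT = p.EffectiveStartDT)
""").fetchall()

print(misaligned)  # [(1, '2015-02-15')]
```

The NOT EXISTS is the filter doing the real work; the EXISTS merely excuses the first row of each product's timeline, which has no prior end date to align with.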

    Thanks for your "nice article" comment!



  • Dwain Camps

    SSC Guru

    Points: 86893

    ken.trock (3/24/2015)


    Interesting article; Dwain requires that, for a given product, the EffectiveStartDT of the next timeline matches the EffectiveEndDT of the previous. This helps to enforce no overlapping dates for a product. What's the price of that product on the transition date? The one on EffectiveStartDT of the later timeline.

    In our employee hierarchy, we (or our predecessors) have EffectiveEndDT always 1 day before the next EffectiveStartDT for that employee. Our users are locked into selecting data and joining back to the employee data with

    WHERE <dataDate> BETWEEN EffectiveStartDT AND EffectiveEndDT.

Here, there's an issue if they're reporting on a period of time including the transition day, where they would double the fact data. OTOH, we don't have all the nice integrity checking that Dwain has.

    Ken

    Hi Ken,

The transition date/time you refer to is the exact millisecond of transition to the new price. So at midnight on the effective date (the EffectiveStartDT) the new price is in effect.

    I believe there are a few people on this site (people that are real SQL experts) that would say use of BETWEEN on date/times is not the recommended approach. With a DATE type maybe. Certainly Joe Celko in his contiguous dates article seemed to suggest it was appropriate.

    While I don't happen to agree with it, that doesn't mean you can't get away with it. And you could probably even use the queries I provided with a little modification to handle the gaps/overlaps identification in your scenario.
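As a sketch of that "little modification" for Ken's convention (end date always one day before the next start), the LAG comparison just expects previous end + 1 day instead of strict equality - in T-SQL that would be DATEADD(DAY, 1, PrevEnd). The runnable demo below uses SQLite's date() function via Python (SQLite 3.25+ for window functions); the EmployeeTimeline table name and data are invented for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE EmployeeTimeline (
    EmployeeID       INTEGER,
    EffectiveStartDT TEXT,
    EffectiveEndDT   TEXT
);
INSERT INTO EmployeeTimeline VALUES
    (1, '2015-01-01', '2015-01-31'),  -- ends the day before the next start: OK
    (1, '2015-02-01', '2015-02-28'),
    (1, '2015-03-05', '2015-03-31');  -- prior row ends 02-28, so there's a gap
""")

# Same LAG pattern as the article, but the expectation is now
# "previous end + 1 day = this start" rather than strict equality.
gaps = con.execute("""
    SELECT EmployeeID, EffectiveStartDT
    FROM (
        SELECT EmployeeID, EffectiveStartDT,
               LAG(EffectiveEndDT) OVER (
                   PARTITION BY EmployeeID ORDER BY EffectiveStartDT) AS PrevEnd
        FROM EmployeeTimeline
    )
    WHERE PrevEnd IS NOT NULL
      AND date(PrevEnd, '+1 day') <> EffectiveStartDT
""").fetchall()

print(gaps)  # [(1, '2015-03-05')]
```

Overlaps fall out of the same test: a previous end on or after the current start also fails the "+1 day" equality, so both gaps and overlaps surface in one pass.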



  • Dwain Camps

    SSC Guru

    Points: 86893

    Eirikur Eiriksson (3/24/2015)


    Good article as always Dwain, thank you very much.

Some thoughts: first of all, the 1M test code only produced 100 rows until I adjusted the TOP clauses. The second thing has to do with the sorting implied by the LEAD; simplifying the work by date-diffing to 0 does improve the performance there - narrowing the "worktable", I guess, makes a difference. I did something similar recently which brought the execution time down by 40%.

    😎

SELECT ProductID, EffectiveStartDT, EffectiveEndDT, ProductPrice
FROM
(
    SELECT ProductID, EffectiveStartDT, EffectiveEndDT, ProductPrice
        ,d = LAG(EffectiveEndDT, 1) OVER
            (
                PARTITION BY ProductID ORDER BY DATEDIFF(DAY, 0, EffectiveStartDT)
            )
    FROM dbo.ProductPrices
) a
WHERE d IS NOT NULL AND d <> EffectiveStartDT;

SELECT ProductID, EffectiveStartDT, EffectiveEndDT, ProductPrice
FROM
(
    SELECT ProductID, EffectiveStartDT, EffectiveEndDT, ProductPrice
        ,d = LAG(EffectiveEndDT, 1) OVER
            (
                PARTITION BY ProductID ORDER BY CONVERT(BINARY(8), EffectiveStartDT, 0)
            )
    FROM dbo.ProductPrices
) a
WHERE d IS NOT NULL AND d <> EffectiveStartDT;

    Stats

    Without Datediff:

    SQL Server parse and compile time:

    CPU time = 3 ms, elapsed time = 3 ms.

    Table 'ProductPrices'. Scan count 5, logical reads 4713, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 4352 ms, elapsed time = 2156 ms.

    With Datediff:

    Table 'ProductPrices'. Scan count 5, logical reads 4713, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 4352 ms, elapsed time = 1878 ms.

    Hi Eirikur - Thanks for taking a look!

    I'm a little confused by your examples though. You're using DATEDIFF in one bit of code and then CONVERT in the ORDER BY of the other. Yet your performance results suggest with/without DATEDIFF. Does without DATEDIFF translate into the CONVERT version? Or are you comparing your first example with the SQL 2012 version included in the article?



  • Eirikur Eiriksson

    SSC Guru

    Points: 182511

    dwain.c (3/24/2015)


    Hi Eirikur - Thanks for taking a look!

    I'm a little confused by your examples though. You're using DATEDIFF in one bit of code and then CONVERT in the ORDER BY of the other. Yet your performance results suggest with/without DATEDIFF. Does without DATEDIFF translate into the CONVERT version? Or are you comparing your first example with the SQL 2012 version included in the article?

My blunder, copied the wrong code snips. :blush:

  • Eirikur Eiriksson

    SSC Guru

    Points: 182511

    Eirikur Eiriksson (3/25/2015)


    dwain.c (3/24/2015)


    Hi Eirikur - Thanks for taking a look!

    I'm a little confused by your examples though. You're using DATEDIFF in one bit of code and then CONVERT in the ORDER BY of the other. Yet your performance results suggest with/without DATEDIFF. Does without DATEDIFF translate into the CONVERT version? Or are you comparing your first example with the SQL 2012 version included in the article?

My blunder, copied the wrong code snips. :blush:

Not only did I post the wrong code snips but, even worse, I made an assumption which was wrong. Finally having time to look into it this morning, I realised that the difference in execution time is simply because the server chooses a parallel execution plan for the DATEDIFF; the work behind it is an order of magnitude greater than without it. Again, my bad.

    😎

  • Dwain Camps

    SSC Guru

    Points: 86893

    Hehe. I'm sure all will forgive you Eirikur!

    Better to have stumped you than for you to find my code-craft lacking.



  • Stuart Pearson

    Mr or Mrs. 500

    Points: 527

    Hi Dwain

    Nice article

    The resources link on the article appears to be pointing at the wrong file.

  • Peter Heller

    Right there with Babe

    Points: 790

    Compliments on your article.

If you defined the EffectiveEndDT as DiscontinuedDT with a non-NULL default, your queries would not have to use ISNULL. It would be sargable and make the query optimizer a little happier :-).

Also, I always try to avoid defining tables in dbo; I prefer to create a schema (which may in fact be owned by dbo) for a set of grouped SQL objects.

--
-- Setting a default value for the maximum EffectiveEndDT
--
EffectiveEndDT DATETIME NOT NULL DEFAULT ('2099-12-31')
DiscontinuedDT DATETIME NOT NULL DEFAULT ('2099-12-31')

DECLARE @GetPriceOn DATETIME = '2013-12-31';
--
-- EffectiveEndDT set to the default value '2099-12-31'
--
-- sargable
--
SELECT ProductID, EffectiveStartDT, EffectiveEndDT, ProductPrice
FROM dbo.ProductPrices
WHERE @GetPriceOn >= EffectiveStartDT AND @GetPriceOn < EffectiveEndDT;

    Thank you for sharing this article.

  • Dwain Camps

    SSC Guru

    Points: 86893

    Peter Heller (3/29/2015)


    Compliments on your article.

If you defined the EffectiveEndDT as DiscontinuedDT with a non-NULL default, your queries would not have to use ISNULL. It would be sargable and make the query optimizer a little happier :-).

Also, I always try to avoid defining tables in dbo; I prefer to create a schema (which may in fact be owned by dbo) for a set of grouped SQL objects.

--
-- Setting a default value for the maximum EffectiveEndDT
--
EffectiveEndDT DATETIME NOT NULL DEFAULT ('2099-12-31')
DiscontinuedDT DATETIME NOT NULL DEFAULT ('2099-12-31')

DECLARE @GetPriceOn DATETIME = '2013-12-31';
--
-- EffectiveEndDT set to the default value '2099-12-31'
--
-- sargable
--
SELECT ProductID, EffectiveStartDT, EffectiveEndDT, ProductPrice
FROM dbo.ProductPrices
WHERE @GetPriceOn >= EffectiveStartDT AND @GetPriceOn < EffectiveEndDT;

    Thank you for sharing this article.

Yes, you could. Oftentimes, people don't, though.



  • Peter Heller

    Right there with Babe

    Points: 790

The practical example is when the product price changes: the EffectiveEndDT of the last current price is updated with the current date, and a new row is created with its EffectiveStartDT equal to the previous end date. The new EffectiveEndDT will be NULL in your case and 12/31/2099 in mine.
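One way to script that workflow: close out the open-ended row and insert the new price sharing the boundary date, inside a single transaction so the timeline is never half-updated. This is a hedged sketch in Python/SQLite with an invented change_price helper, not production code; the '2099-12-31' sentinel stands in for the open end date:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE ProductPrices (
    ProductID INTEGER, EffectiveStartDT TEXT,
    EffectiveEndDT TEXT, ProductPrice REAL)""")
con.execute("INSERT INTO ProductPrices VALUES (1, '2015-01-01', '2099-12-31', 10.0)")

def change_price(con, product_id, change_date, new_price, sentinel='2099-12-31'):
    """Close the open-ended row at change_date, then insert the new price
    starting on that same boundary date with the sentinel end date."""
    with con:  # one transaction: both statements commit, or neither does
        con.execute(
            "UPDATE ProductPrices SET EffectiveEndDT = ? "
            "WHERE ProductID = ? AND EffectiveEndDT = ?",
            (change_date, product_id, sentinel))
        con.execute(
            "INSERT INTO ProductPrices VALUES (?, ?, ?, ?)",
            (product_id, change_date, sentinel, new_price))

change_price(con, 1, '2015-06-01', 12.5)
rows = con.execute(
    "SELECT EffectiveStartDT, EffectiveEndDT, ProductPrice "
    "FROM ProductPrices WHERE ProductID = 1 ORDER BY EffectiveStartDT").fetchall()
print(rows)
# [('2015-01-01', '2015-06-01', 10.0), ('2015-06-01', '2099-12-31', 12.5)]
```

Because each new row's start equals the previous row's end, the result satisfies exactly the alignment rule the article's queries check for.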

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply