June 2, 2016 at 10:14 am
Hello,
I am fairly new to T-SQL and new to the forum. I was given the query below and asked whether it could be optimized into something manageable. After trying to dissect and deconstruct it, I could not avoid the massive row counts the conditional joins were causing. The main purpose of this exercise is to add a rank order to each id so that we can answer questions such as how many IDs were in xyz status on any given day. Any thoughts or feedback would be greatly appreciated.
SELECT backbone.id,
       backbone.firstday,
       backbone.datekey,
       trans.logdate,
       trans.tostatus,
       DENSE_RANK() OVER (PARTITION BY backbone.id, backbone.datekey ORDER BY trans.logdate DESC) AS RankOrder
FROM (
        SELECT source.id, source.FirstDay, datekey.Datekey
        FROM (
                SELECT id, CAST(MIN(logdate) AS date) AS FirstDay
                FROM Prod.dbo.history
                GROUP BY id
             ) source
        CROSS JOIN (
                SELECT CAST(FullDateKey AS date) AS DateKey
                FROM Fin.dbo.Calendar
                WHERE fulldatekey <= '03-31-2016'
             ) datekey
        WHERE datekey.Datekey >= source.FirstDay
     ) backbone
LEFT JOIN Prod.dbo.history trans
       ON backbone.id = trans.id
      AND backbone.datekey >= trans.logdate
ORDER BY backbone.datekey, trans.logdate, RankOrder;
Thanks!
June 2, 2016 at 10:56 am
What I get from your query:
It's creating one row for each date in your calendar table, from the first logdate until '03-31-2016', for each id. Then it multiplies all of those rows by every history row with that id. Is that what you want?
For just one id with 2 rows and a year of history, it returns 733 rows. That sounds illogical to me.
Here's a test. I rearranged your query to make it easier for me to read, but the logic is identical.
CREATE TABLE #history(
id int,
logdate date,
tostatus varchar(10));
INSERT INTO #history VALUES(1, '20150331', 'Test'), (1, '20150401', 'Test2')
CREATE TABLE #Calendar( FullDateKey datetime);
WITH
E(n) AS(
SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(n)
),
E2(n) AS(
SELECT a.n FROM E a, E b
),
E4(n) AS(
SELECT a.n FROM E2 a, E2 b
),
cteTally(n) AS(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) n
FROM E4
)
INSERT INTO #Calendar
SELECT DATEADD( dd, n - 1, '2010')
FROM cteTally;
WITH source AS(
select id,
cast(Min(logdate) as date) FirstDay
From #history
group by id
),
datekey AS(
select cast(FullDateKey as Date) DateKey
From #Calendar
where fulldatekey <= '03-31-2016'
),
backbone AS(
select source.id,
source.FirstDay,
datekey.Datekey
from source
join datekey ON datekey.Datekey >= source.FirstDay
)
select
backbone.id
, backbone.firstday
, backbone.datekey
, trans.logdate
, trans.tostatus
, DENSE_RANK () OVER ( PARTITION by backbone.id, backbone.datekey ORDER BY trans.logdate DESC) RankOrder
from backbone
left join #history trans on backbone.id = trans.id
and backbone.datekey >= trans.logdate
order by backbone.datekey,
trans.logdate,
RankOrder;
GO
DROP TABLE #history, #Calendar;
Here's an explanation of why you're having performance problems (in case it wasn't clear enough): http://www.sqlservercentral.com/articles/T-SQL/61539/
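The article's point is easy to demonstrate on a toy example: a self-join on an inequality like `a.n >= b.n` (a "triangular" join) produces n(n+1)/2 rows, so the row count grows quadratically with the table size. A minimal sketch:

```sql
-- A 10-row tally self-joined on >= yields 10 * 11 / 2 = 55 rows;
-- a 1000-row tally would yield 500500. This is the same shape of join
-- as "backbone.datekey >= trans.logdate" in the original query.
WITH t(n) AS (
    SELECT TOP (10) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns
)
SELECT COUNT(*) AS triangular_rows  -- 55
FROM t a
JOIN t b
  ON a.n >= b.n;
```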
Please share sample data and expected output so we can see what you're trying to do. We could then help you correct the process.
June 2, 2016 at 10:59 am
I'm thinking this sounds like a classic rank/count-per-partition scenario, in which case it could be done with a fairly straightforward windowing function since you are on SQL 2014. I don't have time to work up a solution though. 🙁
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
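A rough sketch of what that windowing approach might look like (untested against the real tables, and assuming the same Prod.dbo.history and Fin.dbo.Calendar structures from the first post; the SeqID tiebreaker is taken from the sample data below):

```sql
WITH backbone AS (
    -- one row per id per calendar day from its first logdate onward
    SELECT s.id, s.FirstDay, CAST(c.FullDateKey AS date) AS datekey
    FROM (SELECT id, CAST(MIN(logdate) AS date) AS FirstDay
          FROM Prod.dbo.history
          GROUP BY id) s
    JOIN Fin.dbo.Calendar c
      ON CAST(c.FullDateKey AS date) >= s.FirstDay
     AND c.FullDateKey <= '20160331'
),
ranked AS (
    SELECT b.id, b.FirstDay, b.datekey, t.logdate, t.tostatus,
           ROW_NUMBER() OVER (PARTITION BY b.id, b.datekey
                              ORDER BY t.logdate DESC, t.SeqID DESC) AS rn
    FROM backbone b
    LEFT JOIN Prod.dbo.history t
           ON t.id = b.id
          AND t.logdate <= b.datekey
)
-- keep only the latest history row as of each day:
-- one row per id per day, i.e. the status in effect that day
SELECT id, FirstDay, datekey, logdate, tostatus
FROM ranked
WHERE rn = 1;
```

Note this still expands rows internally before the rn = 1 filter, but the output collapses to one status per id per day, which may be all the reporting actually needs.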
June 2, 2016 at 11:25 am
Luis C, that was exactly what I was trying to do and exactly the issue I ran into. Below are some sample data and the expected output. Hope this helps, and thanks again!
Prod.dbo.history
ID       SeqID  LogDate    PhaseNbr  FromStatus  ToStatus
9411651  1      6/4/2015   221       NULL        533
9411651  2      6/11/2015  221       533         922
9411651  3      6/11/2015  221       922         NULL
9411651  4      6/11/2015  216       NULL        937
9411651  5      6/15/2015  216       937         938
9411651  6      6/15/2015  216       938         526
9411651  7      6/16/2015  216       526         499
9411651  8      6/22/2015  216       499         529
ID       FirstDay
9140775  8/21/2013
9128040  3/26/2013
9105226  6/29/2012
9100981  7/13/2012
9123795  12/20/2012
9111060  4/12/2012
9475324  4/20/2016
9471079  3/30/2016
DateKey
11/13/2012
11/14/2012
11/15/2012
11/16/2012
11/17/2012
11/18/2012
11/19/2012
11/20/2012
Expected Output
ID       firstday  datekey    logdate    tostatus  RankOrder
9411651  6/4/2015  6/4/2015   NULL       NULL      1
9411651  6/4/2015  6/5/2015   6/4/2015   533       1
9411651  6/4/2015  6/6/2015   6/4/2015   533       1
9411651  6/4/2015  6/7/2015   6/4/2015   533       1
9411651  6/4/2015  6/8/2015   6/4/2015   533       1
9411651  6/4/2015  6/9/2015   6/4/2015   533       1
9411651  6/4/2015  6/10/2015  6/4/2015   533       1
9411651  6/4/2015  6/11/2015  6/4/2015   533       1
9411651  6/4/2015  6/12/2015  6/4/2015   533       4
9411651  6/4/2015  6/12/2015  6/11/2015  922       3
9411651  6/4/2015  6/12/2015  6/11/2015  NULL      2
9411651  6/4/2015  6/12/2015  6/11/2015  937       1
9411651  6/4/2015  6/13/2015  6/4/2015   533       4
9411651  6/4/2015  6/13/2015  6/11/2015  922       3
9411651  6/4/2015  6/13/2015  6/11/2015  NULL      2
9411651  6/4/2015  6/13/2015  6/11/2015  937       1
9411651  6/4/2015  6/14/2015  6/4/2015   533       4
9411651  6/4/2015  6/14/2015  6/11/2015  922       3
9411651  6/4/2015  6/14/2015  6/11/2015  NULL      2
9411651  6/4/2015  6/14/2015  6/11/2015  937       1
9411651  6/4/2015  6/15/2015  6/4/2015   533       4
9411651  6/4/2015  6/15/2015  6/11/2015  922       3
9411651  6/4/2015  6/15/2015  6/11/2015  NULL      2
9411651  6/4/2015  6/15/2015  6/11/2015  937       1
9411651  6/4/2015  6/16/2015  6/4/2015   533       6
9411651  6/4/2015  6/16/2015  6/11/2015  922       5
June 2, 2016 at 12:40 pm
DK13 (6/2/2016)
Luis C that was exactly what I was trying to do and exactly the issue I ran into. Below is some sample data and expected output. Hope this helps and thanks again!
How do you get to that output from that sample data?
Your original query would return 2343 rows. Is that correct? If it is, then there's not much to do for performance.
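If the goal is only the status in effect for each id on each day (rather than every prior history row per day), an OUTER APPLY that picks just the single latest row might sidestep the explosion entirely. A sketch, untested and assuming the same table structures and the SeqID column from the sample data as a tiebreaker:

```sql
SELECT s.id,
       s.FirstDay,
       d.DateKey,
       h.logdate,
       h.tostatus
FROM (SELECT id, CAST(MIN(logdate) AS date) AS FirstDay
      FROM Prod.dbo.history
      GROUP BY id) s
JOIN (SELECT CAST(FullDateKey AS date) AS DateKey
      FROM Fin.dbo.Calendar
      WHERE FullDateKey <= '20160331') d        -- unambiguous yyyymmdd literal
  ON d.DateKey >= s.FirstDay
OUTER APPLY (SELECT TOP (1) t.logdate, t.tostatus
             FROM Prod.dbo.history t
             WHERE t.id = s.id
               AND t.logdate <= d.DateKey
             ORDER BY t.logdate DESC, t.SeqID DESC) h;  -- latest row as of that day
```

With a supporting index on history (id, logdate DESC, SeqID DESC), each day's lookup becomes a single seek instead of a triangular join.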