Help with conditional joins

  • Hello,

    I am fairly new to T-SQL and new to the forum. I was given the below query to see if there was a way to optimize it to be manageable. After trying to dissect and deconstruct it I could not avoid the massive row counts the conditional joins were causing. The main purpose of this exercise was to be able to add a rank order to each id to be able to request stats such as how many IDs were in xyz status on any given day. Any thoughts or feedback would be greatly appreciated.

    select

    backbone.id

    ,backbone.firstday

    ,backbone.datekey

    , trans.logdate

    , trans.tostatus

    , DENSE_RANK () OVER ( PARTITION by backbone.id, backbone.datekey ORDER BY trans.logdate DESC) RankOrder

    from

    (select source.id,source.FirstDay, datekey.Datekey from

    (select id, cast(Min(logdate) as date) FirstDay From Prod.dbo.history

    group by id) source

    cross join

    (select cast(FullDateKey as Date) DateKey From Fin.dbo.Calendar where fulldatekey <= '03-31-2016') datekey

    where datekey.Datekey >= source.FirstDay) backbone

    left join Prod.dbo.history trans on backbone.id=trans.id and backbone.datekey >= trans.logdate

    order by backbone.datekey, trans.logdate,RankOrder

    Thanks!

  • What I get from your query.

    It's creating one row for each date in your calendar table from the first logdate until '03-31-2016', this for each id. Then multiply all those rows for each row with that id. Is that what you want?

    Just for one id with 2 rows and one year old, it returns 733 rows. That sounds illogical to me.

    Here's a test. I rearranged your query to make it easier for me to read, but the logic is identical.

    CREATE TABLE #history(

    id int,

    logdate date,

    tostatus varchar(10));

    INSERT INTO #history VALUES(1, '20150331', 'Test'), (1, '20150401', 'Test2')

    CREATE TABLE #Calendar( FullDateKey datetime);

    WITH

    E(n) AS(

    SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(n)

    ),

    E2(n) AS(

    SELECT a.n FROM E a, E b

    ),

    E4(n) AS(

    SELECT a.n FROM E2 a, E2 b

    ),

    cteTally(n) AS(

    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) n

    FROM E4

    )

    INSERT INTO #Calendar

    SELECT DATEADD( dd, n - 1, '2010')

    FROM cteTally;

    WITH source AS(

    select id,

    cast(Min(logdate) as date) FirstDay

    From #history

    group by id

    ),

    datekey AS(

    select cast(FullDateKey as Date) DateKey

    From #Calendar

    where fulldatekey <= '03-31-2016'

    ),

    backbone AS(

    select source.id,

    source.FirstDay,

    datekey.Datekey

    from source

    join datekey ON datekey.Datekey >= source.FirstDay

    )

    select

    backbone.id

    , backbone.firstday

    , backbone.datekey

    , trans.logdate

    , trans.tostatus

    , DENSE_RANK () OVER ( PARTITION by backbone.id, backbone.datekey ORDER BY trans.logdate DESC) RankOrder

    from backbone

    left join #history trans on backbone.id = trans.id

    and backbone.datekey >= trans.logdate

    order by backbone.datekey,

    trans.logdate,

    RankOrder;

    GO

    DROP TABLE #history, #Calendar;

    Here's an explanation on why you're having performance problems (if it wasn't clear enough): http://www.sqlservercentral.com/articles/T-SQL/61539/

    Please share sample data and expected output to know what you're trying to do. We could help you correct the process.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I'm thinking this sounds like a classic rank/count per-partition scenario, in which case I would think it could be done with a fairly straight-forward windowing function since you are on SQL 2014. I don't have time to work up a solution though. 🙁

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Luis C that was exactly what I was trying to do and exactly the issue I ran into. Below is some sample data and expected output. Hope this helps and thanks again!

    Prod.dbo.history

    ID SeqIDLogDatePhaseNbrFromStatus ToStatus

    941165116/4/2015221 NULL 533

    941165126/11/2015221 533 922

    941165136/11/2015221 922 NULL

    941165146/11/2015216 NULL 937

    941165156/15/2015216 937 938

    941165166/15/2015216 938 526

    941165176/16/2015216 526 499

    941165186/22/2015216 499 529

    ID FirstDay

    91407758/21/2013

    91280403/26/2013

    91052266/29/2012

    91009817/13/2012

    912379512/20/2012

    91110604/12/2012

    94753244/20/2016

    94710793/30/2016

    DateKey

    11/13/2012

    11/14/2012

    11/15/2012

    11/16/2012

    11/17/2012

    11/18/2012

    11/19/2012

    11/20/2012

    Expected Output

    ID firstday datekey logdate tostatusRankOrder

    94116516/4/20156/4/2015NULL NULL1

    94116516/4/20156/5/20156/4/20155331

    94116516/4/20156/6/20156/4/20155331

    94116516/4/20156/7/20156/4/20155331

    94116516/4/20156/8/20156/4/20155331

    94116516/4/20156/9/20156/4/20155331

    94116516/4/20156/10/20156/4/20155331

    94116516/4/20156/11/20156/4/20155331

    94116516/4/20156/12/20156/4/20155334

    94116516/4/20156/12/20156/11/20159223

    94116516/4/20156/12/20156/11/2015NULL2

    94116516/4/20156/12/20156/11/20159371

    94116516/4/20156/13/20156/4/20155334

    94116516/4/20156/13/20156/11/20159223

    94116516/4/20156/13/20156/11/2015NULL2

    94116516/4/20156/13/20156/11/20159371

    94116516/4/20156/14/20156/4/20155334

    94116516/4/20156/14/20156/11/20159223

    94116516/4/20156/14/20156/11/2015NULL2

    94116516/4/20156/14/20156/11/20159371

    94116516/4/20156/15/20156/4/20155334

    94116516/4/20156/15/20156/11/20159223

    94116516/4/20156/15/20156/11/2015NULL2

    94116516/4/20156/15/20156/11/20159371

    94116516/4/20156/16/20156/4/20155336

    94116516/4/20156/16/20156/11/20159225

  • DK13 (6/2/2016)


    Luis C that was exactly what I was trying to do and exactly the issue I ran into. Below is some sample data and expected output. Hope this helps and thanks again!

    Prod.dbo.history

    ID SeqIDLogDatePhaseNbrFromStatus ToStatus

    941165116/4/2015221 NULL 533

    941165126/11/2015221 533 922

    941165136/11/2015221 922 NULL

    941165146/11/2015216 NULL 937

    941165156/15/2015216 937 938

    941165166/15/2015216 938 526

    941165176/16/2015216 526 499

    941165186/22/2015216 499 529

    ID FirstDay

    91407758/21/2013

    91280403/26/2013

    91052266/29/2012

    91009817/13/2012

    912379512/20/2012

    91110604/12/2012

    94753244/20/2016

    94710793/30/2016

    DateKey

    11/13/2012

    11/14/2012

    11/15/2012

    11/16/2012

    11/17/2012

    11/18/2012

    11/19/2012

    11/20/2012

    Expected Output

    ID firstday datekey logdate tostatusRankOrder

    94116516/4/20156/4/2015NULL NULL1

    94116516/4/20156/5/20156/4/20155331

    94116516/4/20156/6/20156/4/20155331

    94116516/4/20156/7/20156/4/20155331

    94116516/4/20156/8/20156/4/20155331

    94116516/4/20156/9/20156/4/20155331

    94116516/4/20156/10/20156/4/20155331

    94116516/4/20156/11/20156/4/20155331

    94116516/4/20156/12/20156/4/20155334

    94116516/4/20156/12/20156/11/20159223

    94116516/4/20156/12/20156/11/2015NULL2

    94116516/4/20156/12/20156/11/20159371

    94116516/4/20156/13/20156/4/20155334

    94116516/4/20156/13/20156/11/20159223

    94116516/4/20156/13/20156/11/2015NULL2

    94116516/4/20156/13/20156/11/20159371

    94116516/4/20156/14/20156/4/20155334

    94116516/4/20156/14/20156/11/20159223

    94116516/4/20156/14/20156/11/2015NULL2

    94116516/4/20156/14/20156/11/20159371

    94116516/4/20156/15/20156/4/20155334

    94116516/4/20156/15/20156/11/20159223

    94116516/4/20156/15/20156/11/2015NULL2

    94116516/4/20156/15/20156/11/20159371

    94116516/4/20156/16/20156/4/20155336

    94116516/4/20156/16/20156/11/20159225

    How do you get to that output from that sample data?

    Your original query would return 2343 rows. Is that correct? If it is, then there's not much to do for performance.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply