Set based super challenge

  • Please read the attached txt file.

    The challenge is to summarize the data per patient but only as / if the consultant changes.

    This is giving me sleepless nights. I've tried ranking functions,self joins and iterative cte's to no avail

    Thanks for your help

  • Instead of asking people to imagine data in a table, if you provide that data in a readily consumable way, you will get much more help.

    Have a read of this : How to post code questions[/url]

    edit: weird things going on with the url...

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • I may have an idea on how to do this, I just don't have time to convert the info in your text file into DDL and DML statements to do the work. Please read the first article I reference below in my signature block regarding asking for help. If you convert your information in the text file into a readily consumable format (think cut/paste/execute in ssms), you will get much better answers.

    Also, it would help if you would post the code that you have already tried. It is possible that you may have been close and just need a bit more guidance to get the answer.

  • This doesn't look too challenging. Can you correct both your sample data and your expected output for EpisodeNo 10010? Cheers.

    For those of you looking for sample data to play with in advance of LotusNotes posting the corrected versions:

    DROP TABLE #Consultant;

    WITH Consultant (PatientID, EpisodeNo, Activity_Date, EpisodeType_from, EpisodeType_to, Description, Staff_from, Staff_to, Ward_from, Ward_to) AS (

    SELECT 1,10001,'01/04/2013 00:01', NULL,7,'Admitted',NULL,'Mr A', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 1,10001,'01/04/2013 08:01', 7,11,'Discharged','Mr A', 'Mr A', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 2,10002,'01/04/2013 00:01', NULL,7,'Admitted',NULL,'Mr A', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 2,10002,'02/04/2013 00:01', 7,10,'Transfer','Mr A', 'Mr A', 'Ward Z', 'Ward Y' UNION ALL

    SELECT 2,10002,'02/04/2013 00:01', 10,10,'Transfer','Mr A', 'Mr A', 'Ward Y', 'Ward X' UNION ALL

    SELECT 2,10002,'05/04/2013 00:01', 10,11,'Discharged','Mr A', 'Mr A', 'Ward X', 'Ward X' UNION ALL

    SELECT 3,10010,'01/05/2013 00:01', NULL,7,'Admitted', NULL,'Mr A', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 3,10010,'05/05/2013 00:01', 7,10,'Transfer','Mr A', 'Mr B', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 3,10010,'05/05/2013 08:01', 10,10,'Transfer','Mr B', 'Mr B', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 3,10010,'06/05/2013 00:01', 10,10,'Transfer','Mr B', 'Mr B', 'Ward Z', 'Ward Y' UNION ALL

    SELECT 3,10010,'06/05/2013 08:01', 10,10,'Transfer','Mr B', 'Mr C', 'Ward Y', 'Ward Y' UNION ALL

    SELECT 3,10010,'07/06/2013 00:01', 10,11,'Discharged','Mr C', 'Mr C', 'Ward Y', 'Ward Y' UNION ALL -- changed date

    SELECT 4,20100,'01/05/2013 00:01', NULL,7,'Admitted', NULL,'Mr A', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 4,20100,'05/05/2013 00:01', 7,10,'Transfer','Mr A', 'Mr B', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 4,20100,'05/05/2013 08:01', 10,10,'Transfer','Mr B', 'Mr B', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 4,20100,'06/05/2013 00:01', 10,10,'Transfer','Mr B', 'Mr B', 'Ward Z', 'Ward Y' UNION ALL

    SELECT 4,20100,'06/05/2013 08:01', 10,10,'Transfer','Mr B', 'Mr C', 'Ward Y', 'Ward Z'

    )

    SELECT * INTO #Consultant FROM Consultant

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Gosh, I'm surprised that with a title like this, there have been so few takers - and so little feedback from the OP. Maybe those sleepless nights have caught up. Here's a simple solution for starters.

    ;WITH OrderedData AS (

    SELECT *,

    RangeStart = CASE WHEN cur.Staff_from IS NULL OR cur.Staff_from <> cur.Staff_to THEN 1 ELSE 0 END,

    RangeEnd = CASE WHEN rn = cnt OR cur.Staff_from <> cur.Staff_to THEN 1 ELSE 0 END

    FROM (

    SELECT *,

    rn = ROW_NUMBER() OVER(PARTITION BY c.PatientID, c.EpisodeNo ORDER BY c.Activity_Date),

    cnt = COUNT(*) OVER(PARTITION BY c.PatientID, c.EpisodeNo)

    FROM #Consultant c

    ) cur

    )

    SELECT

    currow.PatientID,

    currow.EpisodeNo,

    [Episode_Start]= currow.Activity_Date,

    [Episode_End]= nextrow.Activity_Date,

    [Staff]= currow.Staff_to,

    [EpisodeType_from] = currow.EpisodeType_to,

    [EpisodeType_to] = nextrow.EpisodeType_to

    FROM OrderedData currow

    OUTER APPLY (

    SELECT TOP 1

    Activity_Date, EpisodeType_to

    FROM OrderedData i

    WHERE i.RangeEnd = 1

    AND i.PatientID = currow.PatientID

    AND i.EpisodeNo = currow.EpisodeNo

    AND i.Activity_Date > currow.Activity_Date

    ORDER BY i.Activity_Date

    ) nextrow

    WHERE currow.RangeStart = 1

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (12/11/2013)


    Gosh, I'm surprised that with a title like this, there have been so few takers - and so little feedback from the OP. Maybe those sleepless nights have caught up. Here's a simple solution for starters.

    I don't know what's going on with the OP but when I see that ChrisM has worked a problem, I figure there's little left for me to do and move on. πŸ˜‰

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you to all.

    In a fit of inspiration I have come up with my own solution. Although it gives a slightly different result set, I'd be interested in more optimal solutions. Correct me if I'm wrong but my solution appears to use less io and have a lower estimated subtree cost. I provide files with my solution and also the source data.

    Thanks - LotusNotes

    --Consultant Challenge

    -- Use <database>

    --Select * from dbo.consultant -- source table

    --set statistics io on

    with cte1 as

    (

    Select * from--have to do this to harden the ranking function columns

    --so can be used in a where clause

    (

    SELECT [PatientID]

    ,[EpisodeNo]

    ,[Activity_Date]

    --,[EpisodeType_from]

    --,[EpisodeType]

    ,[Description]

    ,[Staff_from]

    ,[Staff_to]

    ,row_number() over (partition by EpisodeNo order by Activity_Date) as Row_num

    ,case

    when [Description]='Admitted' or (Staff_from=Staff_to) then 0

    else row_number() over (partition by EpisodeNo order by Activity_Date)

    end as Staff_change-- any number >0 identifies a staff change

    --,[Ward_from]

    --,[Ward_to]

    FROM [dbo].[consultant]

    --order by EpisodeNo-- this may be a sticking point?

    ) as X

    where

    Row_num= 1-- admitted

    or Row_num=Staff_change--staff change

    or Description='Discharged'

    ) -- end cte1

    -- do select * first of all to see all the columns

    --select * from

    select

    a.PatientID

    ,a.EpisodeNo

    ,a.Activity_Date as 'From'

    ,b.Activity_Date as 'To'

    ,b.Staff_from as 'Consultant'

    ,case

    when b.Description='Discharged' then b.Description+' by '+b.Staff_to

    else b.Description+' to '+b.Staff_to

    end as 'Result'

    from

    cte1 a join cte1 b-- self join

    on (a.EpisodeNo=b.EpisodeNo)-- groups the rows correctly

    and a.Staff_to=b.Staff_from-- to correlate the rows correctly

    where

    a.Description <>'Discharged'-- gets rid of discharged row

  • Until you test it on a much larger set of rows, you don't actually know what's better. I can show you two solutions for a different problem where it looks like one solution will take 0% compared to the other which looks like it will take 100%. When you run the code, the diametric opposite happens. Even the actual execution plan shows the same "mistake".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/11/2013)


    ChrisM@Work (12/11/2013)


    Gosh, I'm surprised that with a title like this, there have been so few takers - and so little feedback from the OP. Maybe those sleepless nights have caught up. Here's a simple solution for starters.

    I don't know what's going on with the OP but when I see that ChrisM has worked a problem, I figure there's little left for me to do and move on. πŸ˜‰

    I feel the same way! πŸ˜€

    I did look at this thread yesterday and saw that Chris suggested there was something wrong with the sample data/results so I passed on it at the time.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • lotusnotes (12/11/2013)


    Thank you to all.

    In a fit of inspiration I have come up with my own solution. Although it gives a slightly different result set, I'd be interested in more optimal solutions. Correct me if I'm wrong but my solution appears to use less io and have a lower estimated subtree cost. I provide files with my solution and also the source data.

    Thanks - LotusNotes

    -- Your query loses a row from episode 20100.

    -- If [Description] can be reliably used to determine the start and end of an episode

    -- and [EpisodeNo] is unique to each patient as your code suggests,

    -- then my query can be changed to this:

    ;WITH OrderedData AS (

    SELECT *,

    RangeStart = CASE WHEN [Description] = 'Admitted' OR Staff_from <> Staff_to THEN 1 ELSE 0 END,

    RangeEnd = CASE WHEN [Description] = 'Discharged' OR Staff_from <> Staff_to THEN 1 ELSE 0 END

    FROM #Consultant

    )

    SELECT

    currow.PatientID,

    currow.EpisodeNo,

    [Episode_Start]= currow.Activity_Date,

    [Episode_End]= nextrow.Activity_Date,

    [Staff]= currow.Staff_to,

    [EpisodeType_from] = currow.EpisodeType_to,

    [EpisodeType_to] = nextrow.EpisodeType_to

    FROM OrderedData currow

    OUTER APPLY (

    SELECT TOP 1

    Activity_Date, EpisodeType_to

    FROM OrderedData i

    WHERE i.RangeEnd = 1

    --AND i.PatientID = currow.PatientID

    AND i.EpisodeNo = currow.EpisodeNo

    AND i.Activity_Date > currow.Activity_Date

    ORDER BY i.Activity_Date

    ) nextrow

    WHERE currow.RangeStart = 1

    -- Here's some sample data to test against:

    DROP TABLE #Consultant;

    WITH Consultant (PatientID, EpisodeNo, Activity_Date, EpisodeType_from, EpisodeType_to, Description, Staff_from, Staff_to, Ward_from, Ward_to) AS (

    SELECT 1,10001,'01/04/2013 00:01', NULL,7,'Admitted',NULL,'Mr A', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 1,10001,'01/04/2013 08:01', 7,11,'Discharged','Mr A', 'Mr A', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 2,10002,'01/04/2013 00:01', NULL,7,'Admitted',NULL,'Mr A', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 2,10002,'02/04/2013 00:01', 7,10,'Transfer','Mr A', 'Mr A', 'Ward Z', 'Ward Y' UNION ALL

    SELECT 2,10002,'02/04/2013 00:02', 10,10,'Transfer','Mr A', 'Mr A', 'Ward Y', 'Ward X' UNION ALL

    SELECT 2,10002,'05/04/2013 00:01', 10,11,'Discharged','Mr A', 'Mr A', 'Ward X', 'Ward X' UNION ALL

    SELECT 3,10010,'01/05/2013 00:01', NULL,7,'Admitted', NULL,'Mr A', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 3,10010,'05/05/2013 00:01', 7,10,'Transfer','Mr A', 'Mr B', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 3,10010,'05/05/2013 08:01', 10,10,'Transfer','Mr B', 'Mr B', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 3,10010,'06/05/2013 00:01', 10,10,'Transfer','Mr B', 'Mr B', 'Ward Z', 'Ward Y' UNION ALL

    SELECT 3,10010,'06/05/2013 08:01', 10,10,'Transfer','Mr B', 'Mr C', 'Ward Y', 'Ward Y' UNION ALL

    SELECT 3,10010,'07/06/2013 00:01', 10,11,'Discharged','Mr C', 'Mr C', 'Ward Y', 'Ward Y' UNION ALL -- changed date

    SELECT 4,20100,'01/05/2013 00:01', NULL,7,'Admitted', NULL,'Mr A', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 4,20100,'05/05/2013 00:01', 7,10,'Transfer','Mr A', 'Mr B', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 4,20100,'05/05/2013 08:01', 10,10,'Transfer','Mr B', 'Mr B', 'Ward Z', 'Ward Z' UNION ALL

    SELECT 4,20100,'06/05/2013 00:01', 10,10,'Transfer','Mr B', 'Mr B', 'Ward Z', 'Ward Y' UNION ALL

    SELECT 4,20100,'06/05/2013 08:01', 10,10,'Transfer','Mr B', 'Mr C', 'Ward Y', 'Ward Z'

    )

    SELECT * INTO #Consultant FROM Consultant

    CREATE UNIQUE CLUSTERED INDEX ucx_EpisodeNo_Activity_Date ON #Consultant (EpisodeNo, Activity_Date)

    -- For a real performance test, this sample data set should be scaled up significantly.

    -- I might give it a go over lunch.

    ----------------------------------------------------------------------------------------

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (12/12/2013)


    -- I might give it a go over lunch.

    ----------------------------------------------------------------------------------------

    I can see it now. Cornish pasty in one hand while the other furiously types in the 1M row test harness.

    Mmmm, mmmm! πŸ˜›


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks Chris. I think you have done enough already πŸ™‚

  • dwain.c (12/12/2013)


    ChrisM@Work (12/12/2013)


    -- I might give it a go over lunch.

    ----------------------------------------------------------------------------------------

    I can see it now. Cornish pasty in one hand while the other furiously types in the 1M row test harness.

    Mmmm, mmmm! πŸ˜›

    Good Lord no Dwain, I'd use one of Jeff's fancy shmancy tally tables!!

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • dwain.c (12/11/2013)


    Jeff Moden (12/11/2013)


    ChrisM@Work (12/11/2013)


    Gosh, I'm surprised that with a title like this, there have been so few takers - and so little feedback from the OP. Maybe those sleepless nights have caught up. Here's a simple solution for starters.

    I don't know what's going on with the OP but when I see that ChrisM has worked a problem, I figure there's little left for me to do and move on. πŸ˜‰

    I feel the same way! πŸ˜€

    I did look at this thread yesterday and saw that Chris suggested there was something wrong with the sample data/results so I passed on it at the time.

    That's 'cos I always pick the easy ones:-P

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Viewing 14 posts - 1 through 13 (of 13 total)

    You must be logged in to reply to this topic. Login to reply