Thank you to all.
In a fit of inspiration I have come up with my own solution. Although it gives a slightly different result set, I'd be interested in more optimal solutions. Correct me if I'm wrong but my solution appears to use less io and have a lower estimated subtree cost. I provide files with my solution and also the source data.
Thanks - LotusNotes
--Consultant Challenge
-- Use <database>
--Select * from dbo.consultant -- source table
--set statistics io on
with cte1 as
(
Select * from--have to do this to harden the ranking function columns
--so can be used in a where clause
(
SELECT [PatientID]
,[EpisodeNo]
,[Activity_Date]
--,[EpisodeType_from]
--,[EpisodeType]
,[Description]
,[Staff_from]
,[Staff_to]
,row_number() over (partition by EpisodeNo order by Activity_Date) as Row_num
,case
when [Description]='Admitted' or (Staff_from=Staff_to) then 0
else row_number() over (partition by EpisodeNo order by Activity_Date)
end as Staff_change-- any number >0 identifies a staff change
--,[Ward_from]
--,[Ward_to]
FROM [dbo].[consultant]
--order by EpisodeNo-- this may be a sticking point?
) as X
where
Row_num= 1-- admitted
or Row_num=Staff_change--staff change
or Description='Discharged'
) -- end cte1
-- do select * first of all to see all the columns
--select * from
select
a.PatientID
,a.EpisodeNo
,a.Activity_Date as 'From'
,b.Activity_Date as 'To'
,b.Staff_from as 'Consultant'
,case
when b.Description='Discharged' then b.Description+' by '+b.Staff_to
else b.Description+' to '+b.Staff_to
end as 'Result'
from
cte1 a join cte1 b-- self join
on (a.EpisodeNo=b.EpisodeNo)-- groups the rows correctly
and a.Staff_to=b.Staff_from-- to correlate the rows correctly
where
a.Description <>'Discharged'-- gets rid of discharged row