|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Thursday, February 14, 2013 2:04 PM
Points: 125,
Visits: 485
|
|
I have a use-case and I know a CTE is probably where I'm gonna go with this but I just can't get it right due to the logic and I thought I would throw it out here and see what you amazingly smart genius type folks could come up with. :) (figured I should butter you up a bit) I know I can do this with a cursor but this is going to need to work with millions of rows and the cursor is HORRIBLY slow (as expected but does work) NO CURSOR PLS OK so.... I have Schedules with a Type and a block of time allocated (in minutes) I have Request with a Type and a block of time requested (in minutes) In the order of the RequestID (the order they came in) I need to find the first available ScheduleID of the same type that has enough time remaining to handle the Request There can be multiple Requests that end up filling (or partially filling) a Schedule ie) Request 1 (15 minutes), Request 2 (30 minutes), Request 3 (15 Minutes) of Type A can all fit into Schedule 1 of Type A (that has 60 minutes allocated) Attached is a script with the setup of the tables and a small sample of data as well as a picture (I'm a picture person) of how logically the schedules and requests should fall.
How it should go:
RequestID 1 of Type A and length of 15 minutes gets slotted into Schedule 1 (Type A, 30 minutes)
RequestID 2 of Type B and length of 15 minutes gets slotted into Schedule 2 (Type B, 60 minutes)
RequestID 3 of Type A and length of 30 minutes gets slotted into Schedule 3 because Schedule 1 already has the first 15 minutes allocated to RequestID 1
RequestID 4 of Type B and length of 30 minutes get slotted into Schedule 2 because Schedule 2 still has 45 minutes remaining that can be allocated
RequestID 5 of Type C and length of 30 minutes get slotted into Schedule 4 (Type C, 45 minutes)
RequestID 6 of Type C and length of 15 minutes get slotted into Schedule 4 because Schedule 4 still has 15 minutes remaining that can be allocated
RequestID 7 of Type B and length of 30 minutes DOES NOT get a schedule because the only available schedule of type B only as 15 minutes left (due to RequestID 2 and RequestID 4)
tblScheduledRequests
Should look like this
RequestID ScheduleID RequestChunkStartTime
1, 1, 0
2, 2, 0
3, 3, 0
4, 2, 15
5, 4, 0
6, 4, 30
Keeping in mind the "RequestChunkStartTime" is when the RequestID will start in the ScheduleID (ScheduleLength)
The goal is to respect the RequestID order as priority.... Even though ScheduleID 2 (Type B) can be filled 100% with RequestID 4 and 7... we have to take RequestID 2 first, RequestID 4 second, and RequestID 7 doesn't get a ScheduleID
Colin http://benchmarkitconsulting.com
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 3:49 PM
Points: 5,244,
Visits: 7,063
|
|
If the sequential aspect is important (i.e. FIRST request must go to FIRST available schedule), then I think a cursor will be faster than any of the alternatives. But make it one cursor (for the requests). Don't use a cursor to find the first available schedule. And if it ruuns slow, check if the query to find the first available schedule is efficient. Having a column RemainingTime in the Schedules table can help, having an index on (ScheduleType, RemainingTime) may help even more - but check that the overhead of updating this column when adding a session to a schedule doesn't hurt performance more it helps the search query.
If, on the other hand, you are looking for an algorithm to pack as many sessions in the available schedules, read my series of blog posts about "bin packing" (google for "bin packing Hugo Kornelis") - especially part 5. It's not exactly the same, but hopefully close enough to give you some pointers. The trick I used to speed up the process is to use a loop that in each iteration assigns all available schedulers a session. Maybe you can work that into your situation? However, this will totally change the order in which sessions are assigned to schedulers, so you can't use this if the order matters.
Hugo Kornelis, SQL Server MVP Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Thursday, February 14, 2013 2:04 PM
Points: 125,
Visits: 485
|
|
yeah i have cursor code with the column (time remaining), etc just as you suggested and I'm just not a fan of it or it's peformance.
It's an interesting use-case and I'm going to spend some more time on it cause I just don't believe that a cursor is my only solution :)
Thanks for looking
Colin http://benchmarkitconsulting.com
|
|
|
|
|
Hall of Fame
       
Group: General Forum Members
Last Login: Today @ 6:52 PM
Points: 3,582,
Visits: 5,132
|
|
Colin, sorry I don't have time to dig into this more, but see if this sample code I have from one of my SQL Saturday sessions can point you in the right direction. NOTE: proper indexing can make this a LOT faster on large datasets:
-- Suppress data loading messages SET NOCOUNT ON;
DECLARE @Schedule table ( AppID int IDENTITY, AppTeam varchar(20), AppStart datetime, AppFinish datetime );
INSERT INTO @Schedule VALUES ( 'Start', NULL, '01/11/2007 09:00' ); INSERT INTO @Schedule VALUES ( 'Smith', '01/11/2007 09:00', '01/11/2007 09:30' ); INSERT INTO @Schedule VALUES ( 'Smith', '01/11/2007 10:00', '01/11/2007 10:15' ); INSERT INTO @Schedule VALUES ( 'Jones', '01/11/2007 11:00', '01/11/2007 12:00' ); INSERT INTO @Schedule VALUES ( 'Williams', '01/11/2007 12:00', '01/11/2007 14:45' ); INSERT INTO @Schedule VALUES ( 'Hsiao', '01/11/2007 15:30', '01/11/2007 16:00' ); INSERT INTO @Schedule VALUES ( 'Lopez', '01/11/2007 16:00', '01/11/2007 17:30' ); INSERT INTO @Schedule VALUES ( 'Green', '01/11/2007 17:30', '01/11/2007 18:30' ); INSERT INTO @Schedule VALUES ( 'Alphonso', '01/11/2007 20:00', '01/11/2007 20:30' ); INSERT INTO @Schedule VALUES ( 'End', '01/11/2007 21:00', NULL );
-- Determine the Length of Time Required DECLARE @ApptNeed int; SET @ApptNeed = 45; --SET @ApptNeed = 60; --comment out Lopez and run this one
--Find FIRST Available Time Slot ;WITH CTE AS ( SELECT *, RowNumber = ROW_NUMBER() OVER( ORDER BY AppStart ASC ) FROM @Schedule ) SELECT FirstApptAvail = min( a.AppFinish ) FROM CTE a INNER JOIN CTE b --be SURE you cover endpoints on self-joins like this!! ON a.RowNumber = b.RowNumber - 1 WHERE datediff( minute, a.AppFinish, b.AppStart) >= @ApptNeed;
Best,
Kevin G. Boles SQL Server Consultant SQL MVP 2007-2012 TheSQLGuru at GMail
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Thursday, February 14, 2013 2:04 PM
Points: 125,
Visits: 485
|
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Yesterday @ 12:39 PM
Points: 6,668,
Visits: 5,695
|
|
Colin,
Forgive me, but I'm a little confused on the scenario. Am I reading this correctly that your schedules table is not a distinct record set but will always have duplicate schedule types in it?
How do schedules get inserted into the table? What creates them? Or is there a set number of schedule types in tblSchedules?
Kevin's code probably already gave you the answer, but I'd like to look at this in detail myself. The answers to these questions would help me figure out my own approach to the problem.
Brandie Tarvin, MCITP Database Administrator, MCDBA, MCSA
Webpage: http://www.BrandieTarvin.net LiveJournal Blog: http://brandietarvin.livejournal.com/ On LinkedIn!, Google+, and Twitter.
Freelance Writer: Shadowrun Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 10:22 AM
Points: 1,037,
Visits: 7,695
|
|
If I'm understanding it, the requirement as specified sounds fundamentally row-by-row as you must know what the previous row in the sequence did in order to process the next row, so a recursive CTE is likely to be as bad or worse than a well optimised cursor.
As Hugo says, if there was a less rigid requirement for the priority of each request, then you could perform recursion/looping over sets rather than over rows, massively reducing the iteration and improving the performance.
I think Kevin's code helps with finding the first individual gap in a schedule, but not in fitting a batch of requests into a schedule in a specific priority order.
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Thursday, February 14, 2013 2:04 PM
Points: 125,
Visits: 485
|
|
Hi Brandie
1st thanks for reading :)
2nd I "simplified" the Schedule table down to 3 columns for the sake of trying not to confuse the goals (mission failed apparently hehehe)
the actual table would have a date column (ScheduleDateTime) and an FK to a ScheduleType table, etc
I tried to keep the column count down and through the example data and picture try and really just focus on the goal as the other columns are not relevant to the use-case and I thought they would confuse.
Colin http://benchmarkitconsulting.com
|
|
|
|
|
Old Hand
      
Group: General Forum Members
Last Login: 2 days ago @ 9:08 PM
Points: 301,
Visits: 1,130
|
|
Hi
Here's something that may work, I still need to test it over a larger set of data. It uses quirky updates and I haven't done a lot with these, so I may not have applied the rules correctly. Hopefully the process I have used is clear. I'm creating tables with cumulative start and end times for the schedules and requests. Then I determine apply a cumulative offset to requests where their start and end times span an end time for a schedule. Once that is done you can get the schedule based on the end request time + offset.
CREATE TABLE #wrkSchedules ( ScheduleType char(10) not null, ScheduleID int not null, ScheduleSeq int not null, ScheduleLengthMinutes int not null, ScheduleStart int, ScheduleEnd int ) ALTER TABLE #wrkSchedules ADD CONSTRAINT ws_pk PRIMARY KEY (ScheduleType, ScheduleSeq)
CREATE TABLE #wrkRequests ( RequestType char(10) not null, RequestID int not null, RequestSeq int not null, RequestLengthMinutes int not null, RequestStart int, RequestEnd int, Offset int, ScheduleID int ) ALTER TABLE #wrkRequests ADD CONSTRAINT wr_pk PRIMARY KEY (RequestType, RequestSeq)
INSERT INTO #wrkSchedules (scheduleType,ScheduleID,ScheduleLengthMinutes,ScheduleSeq) SELECT scheduleType,ScheduleID,ScheduleLengthMinutes, row_number() over (partition by scheduleType order by ScheduleID) ScheduleSeq FROM tblSchedules
INSERT INTO #wrkRequests (RequestType,RequestID,RequestLengthMinutes,RequestSeq) SELECT RequestType,RequestID,RequestLengthMinutes, row_number() over (partition by RequestType order by RequestID) RequestSeq FROM tblRequests
declare @s int = 0
update #wrkSchedules set @s = ScheduleEnd = case when ScheduleSeq = 1 then 0 else @s end + ScheduleLengthMinutes, ScheduleStart = @s - ScheduleLengthMinutes
select * from #wrkSchedules
update #wrkRequests set @s = RequestEnd = case when RequestSeq = 1 then 0 else @s end + RequestLengthMinutes, RequestStart = @s - RequestLengthMinutes
select * from #wrkRequests
declare @o int = 0
-- Altered to get Maximum ScheduleEnd update #wrkRequests set @o = offset = case when RequestStart = 0 then isnull((select MAX(ScheduleEnd) - RequestStart from #wrkSchedules w where w.scheduletype = requesttype and w.ScheduleEnd between RequestStart and RequestEnd - 1), 0) else @o + isnull((select MAX(ScheduleEnd) - RequestStart from #wrkSchedules w where w.scheduletype = requesttype and w.ScheduleEnd between RequestStart + @o and RequestEnd + @o - 1), 0) end select * from #wrkRequests
update wr set wr.ScheduleID = ws.ScheduleID from #wrkRequests wr inner join #wrkSchedules ws on wr.RequestType = ws.ScheduleType and (wr.requestStart + offset) between ws.ScheduleStart and ws.ScheduleEnd - 1 and (wr.requestEnd + offset) between ws.ScheduleStart + 1 and ws.ScheduleEnd
select * from #wrkRequests select * from #wrkSchedules
select RequestID, wr.ScheduleID, (RequestStart + offset) - ScheduleStart RequestChunkStartTime from #wrkRequests wr inner join #wrkSchedules ws on wr.ScheduleID = ws.ScheduleID Edited after running with data from mister.magoo's generator to handle requests the span more than a single schedule.
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Today @ 5:01 PM
Points: 10,990,
Visits: 10,545
|
|
Hi Colin,
It's quite easy to go mad trying to do this sort of thing in T-SQL today (future things like ordered aggregates might help, I don't know).
A cursor solution is straightforward, the only downside being that T-SQL cursors suck performance-wise. So, I would use a CLR cursor: the obvious procedural logic in a CLR stored procedure that reads from Schedules and Requests and returns matched output as a set that can be directly INSERT...EXEC'd into the Scheduled Requests table. If the output set is huge, you could even use bulk copy from the CLR code.
Paul White SQL Server MVP SQLblog.com @SQL_Kiwi
|
|
|
|