Grouping and Sorting challenge

  • Hello all -

    I have an interesting challenge. Imagine I own a pizza restaurant and I have delivery drivers (4). I also have a fixed number of delivery Routes for my restaurant and these routes have varying number of households on the them. I want to sort and assign the routes as evenly as I can to the drivers.

    My route data my look like this:

    [RouteID]_[RouteName]_[HouseHolds]___[DriverID]

    -----------------------------------------------------

    1________Route01_____700__________0

    2________Route02_____210__________0

    3________Route03_____111__________0

    4________Route04_____452__________0

    5________Route05_____236__________0

    6________Route06_____111__________0

    7________Route07_____300__________0

    8________Route08_____421__________0

    9________Route09_____1200_________0

    10_______Route10______525_________0

    Drivers:

    -------------------------

    1 - Bob

    2 - John

    3 - Ricky Bobby

    4 - Batman

    What I'm TRYING to is write a Stored Proc to loop though the records (routes) and assign them to the drivers as evenly as possible based on households (staying with whole integers). Make sense?

    I've tried a couple of different concepts but keep hitting 'writers block'.

    Any suggestions?

  • This is a classic "Bin Stacking" problem. So far as I know, there's not a set based way to pull one of these off in T-SQL. You have to use some form of RBAR to do this.

    As a bit of a sidebar, take a look at the first link in my signature line below under "Helpful Links". That'll tell you how to post "readily consumable data" to make it a bit easier on folks trying to help you. Not doing it that way is probably one of the big reasons why no one responded to your post any more quickly.

    Here's another way to post such example data. It's what I used to test this and I'm posting it so if someone has a better idea on this problem, it's easier for them to test with.

    One thing that you need to be made aware of is that I added a column to the Driver table. You should make such a change to your real table and make sure that both tables have an appropriate PK.

    --=====================================================================================================================

    -- Setup and populate the test tables

    --=====================================================================================================================

    --===== If the test tables already exist, drop them to make reruns easier in SSMS

    IF OBJECT_ID('tempdb..#Route' ,'U') IS NOT NULL DROP TABLE #Route;

    IF OBJECT_ID('tempdb..#Driver','U') IS NOT NULL DROP TABLE #Driver;

    --===== Build the test tables with the data provided.

    -- Note that a column has been added to the Drivers table.

    SELECT *

    INTO #Route

    FROM (

    SELECT 1 ,'Route01',700 ,0 UNION ALL

    SELECT 2 ,'Route02',210 ,0 UNION ALL

    SELECT 3 ,'Route03',111 ,0 UNION ALL

    SELECT 4 ,'Route04',452 ,0 UNION ALL

    SELECT 5 ,'Route05',236 ,0 UNION ALL

    SELECT 6 ,'Route06',111 ,0 UNION ALL

    SELECT 7 ,'Route07',300 ,0 UNION ALL

    SELECT 8 ,'Route08',421 ,0 UNION ALL

    SELECT 9 ,'Route09',1200,0 UNION ALL

    SELECT 10,'Route10',525 ,0

    ) d (RouteID, RouteName, HouseHolds, DriverID)

    ;

    SELECT *

    INTO #Driver

    FROM (

    SELECT 1,'Bob' ,0 UNION ALL

    SELECT 2,'John' ,0 UNION ALL

    SELECT 3,'Ricky Bobby' ,0 UNION ALL

    SELECT 4,'Batman' ,0

    ) d (DriverID, Name, TotalHouseholds) --<<< Added the TotalHouseholds column

    ;

    --===== Add the required PK's to the tables

    ALTER TABLE #Route ADD PRIMARY KEY CLUSTERED (RouteID);

    ALTER TABLE #Driver ADD PRIMARY KEY CLUSTERED (DriverID);

    Here's a classic cursor solution for a classic RBAR-only problem. As always, details are in the comments in the code. Of course, you'll need to change the table names for your implementation.

    --=====================================================================================================================

    -- Distribute the routes using a classic "Bin Fill".

    -- The basic premise is to cycle through the routes in descending order according to the number of households in

    -- the routes and assign the next value to the driver with the fewest number of households. That driver will

    -- change on every iteration until we run out of routes.

    -- What we end up with is the route assignments by driver in the route table and a total number of households

    -- for each driver in the driver table.

    --=====================================================================================================================

    --===== Environmental presets for performance of RBAR.

    SET NOCOUNT ON;

    --===== Reset the Route and Driver tables to make reruns possible.

    UPDATE #Route SET DriverID = 0;

    UPDATE #Driver SET TotalHouseholds = 0;

    --===== Declare the necessary obviously named variables.

    -- Because we're using a cursor instead of temp tables, we don't need any counters.

    DECLARE @Households INT

    ,@DriverID INT

    ;

    --===== Define the cursor from the Route table.

    -- Note that the number of households are in descending order.

    DECLARE BinFill CURSOR LOCAL FORWARD_ONLY

    FOR SELECT Households --This is all we need from the table. The cursor keeps track of everything else.

    FROM #Route

    ORDER BY HouseHolds DESC, RouteID ASC --TieBreaker

    FOR UPDATE OF DriverID --This is key to updating the table where the cursor is currently positioned "WHERE CURRENT OF"

    ;

    --===== Open the cursor so that we can use it.

    OPEN BinFill

    ;

    --===== Get the first row from the cursor

    FETCH NEXT FROM BinFill

    INTO @Households

    ;

    --===== Loop through the rows in the cursor

    WHILE @@FETCH_STATUS = 0

    BEGIN

    --===== Identify the driver with the fewest households for this iteration.

    -- The DriverID is the tie breaker just to be pedantic. ;-)

    SELECT TOP 1 @DriverID = DriverID FROM #Driver ORDER BY TotalHouseholds ASC, DriverID ASC

    ;

    --===== Add the # of Households value from the cursor row to the designated driver

    UPDATE d

    SET d.TotalHouseholds = d.TotalHouseholds + @Households --@Households came from the cursor row,

    FROM #Driver d

    WHERE d.DriverID = @DriverID

    ;

    --===== Update the Route table with the current DriverID according to the current position of the cursor.

    -- It's freakin' magic. ;-)

    UPDATE #Route

    SET DriverID = @DriverID

    WHERE CURRENT OF BinFill

    ;

    --===== Get the next row. If we're out of rows, the fetch status will change to -1 auto-magically

    FETCH NEXT FROM BinFill

    INTO @Households

    ;

    END

    ;

    --======== Properly close the cursor

    CLOSE BinFill;

    DEALLOCATE BinFill;

    GO

    --===== All done. Let's see the results

    SELECT * FROM #Driver;

    SELECT * FROM #Route;

    Last but not least, here are the results...

    DriverID Name TotalHouseholds

    ----------- ----------- ---------------

    1 Bob 1200

    2 John 1047

    3 Ricky Bobby 1035

    4 Batman 984

    (4 row(s) affected)

    RouteID RouteName HouseHolds DriverID

    ----------- --------- ----------- -----------

    1 Route01 700 2

    2 Route02 210 3

    3 Route03 111 4

    4 Route04 452 4

    5 Route05 236 2

    6 Route06 111 2

    7 Route07 300 3

    8 Route08 421 4

    9 Route09 1200 1

    10 Route10 525 3

    (10 row(s) affected)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Any feedback? Is there something missing?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • CELKO (8/20/2014)


    Google “bin packing” and “knapsack” problems. Ther is also a classic book availalbre as a PDF download. http://www.or.deis.unibo.it/knapsack.html. It has heavy math and awful FORTRAN.

    The bottom line is that you are screwed for the general case. This is an NP Complete problem.

    Also your specs need to be more precise as to how you determine the sizes; would 3 identical sized bags and 1 outlier be better than 4 near-matches? Etc.

    Since you have a limited problem of 4 knapsacks and 10 items. My best guess is first partition the 10 items into 2 subset of approximately equal weight. Whatever that means and assuming it is possible. Then partition each of those recursively and hand them out to the 4 drivers.

    This will require a table of 2^10 rows (1024) with all possible combinations of the weights and 0 in each column.

    CREATE TABLE Routes

    (partition_nbr INTEGER NOT NULL PRIMARY KEY,

    r01 INTEGER NOT NULL

    CHECK(r01 IN 0, 700),

    r02 INTEGER NOT NULL

    CHECK(r02 IN 0, 210)

    r10 INTEGER NOT NULL

    CHECK(r10 IN 0, 525));

    Load the routes (exercise for the student) then

    SELECT partition_nbr, (r01+r02+ ..+r10) AS wgt

    FROM Routes;

    Pick partitions closest to the mid point.

    Table with 2^10 rows? "Screwed in the general case"? I don't believe any of that is necessary. This is one of those places where a loop (please see the solution I posted above) actually does work quite well and it's self correcting based on the number of rows in the two tables. And, remember... I'm one of those that hates loops! 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • CELKO (8/21/2014)


    This is one of those places where a loop (please see the solution I posted above) actually does work quite well and it's self correcting based on the number of rows in the two tables. And, remember... I'm one of those that hates loops! 😉

    We are getting into an example of why "set-oriented programming can stink" that us RDBMS (functional, declarative, whatever) do not want to admit. By its nature, set-oriented programming must produce all valid answers.

    Well there goes my set oriented chess program idea.

  • CELKO (8/21/2014)


    This is one of those places where a loop (please see the solution I posted above) actually does work quite well and it's self correcting based on the number of rows in the two tables. And, remember... I'm one of those that hates loops! 😉

    We are getting into an example of why "set-oriented programming can stink" that us RDBMS (functional, declarative, whatever) do not want to admit. By its nature, set-oriented programming must produce all valid answers.

    That can be a very large set 🙁 An unusably large set. And it the real world we need only one answer to move forward. It does not matter which one, really.

    Remember stable marriages problem? At least one solution must exist. Solutions can favor men or women various degrees. So pick one 😉

    🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff -

    I just wanted to jump in and say Thanks! I went out of town last week so I was never able to follow up on this. I look forward to trying your solution in the next day or so! THANKS for the time!

  • patrickmcginnis59 10839 (8/21/2014)


    CELKO (8/21/2014)


    This is one of those places where a loop (please see the solution I posted above) actually does work quite well and it's self correcting based on the number of rows in the two tables. And, remember... I'm one of those that hates loops! 😉

    We are getting into an example of why "set-oriented programming can stink" that us RDBMS (functional, declarative, whatever) do not want to admit. By its nature, set-oriented programming must produce all valid answers.

    Well there goes my set oriented chess program idea.

    Heh... it IS called a "Chess Set". 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (8/25/2014)


    patrickmcginnis59 10839 (8/21/2014)


    CELKO (8/21/2014)


    This is one of those places where a loop (please see the solution I posted above) actually does work quite well and it's self correcting based on the number of rows in the two tables. And, remember... I'm one of those that hates loops! 😉

    We are getting into an example of why "set-oriented programming can stink" that us RDBMS (functional, declarative, whatever) do not want to admit. By its nature, set-oriented programming must produce all valid answers.

    Well there goes my set oriented chess program idea.

    Heh... it IS called a "Chess Set". 😛

    LOL select from positions where checkmate = 1. Unfortunately I'd probably use a scalar function and run out the game clock.

    Interestingly enough that reflects my actual game, either lose badly or run out of game clock.

  • RedBirdOBX (8/25/2014)


    Jeff -

    I just wanted to jump in and say Thanks! I went out of town last week so I was never able to follow up on this. I look forward to trying your solution in the next day or so! THANKS for the time!

    You bet. I'd be very interested in knowing how it works out. It's been a little over a week. Have you tried it, yet?

    The reason I ask is because this is the first useful cursor that I've ever written in nearly two decades of database programming. I did it this way because WHERE CURRENT OF seemed to be a great shortcut instead of using a While Loop although a While Loop would probably be more obvious and easier to troubleshoot than a Cursor. I also tried to make it faster by using the FORWARD_ONLY hint.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply