Create sum closest to an integer

  • Hi,

    I need to create a function that will check an int column from a table and has to generate the best combinations to create a sum closest to an integer given.

    For example:

    the integer given is 10

    the int column will have the following records : 5,6,3,4,3.

    The function should return:

    5,4 and 6,3

    Thank you for any help!

  • What you are describing is classically known as the "Bin packing problem". It is very complicated to say the least. There is no easy way to do this and in t-sql it will most certainly be very slow.

    https://en.wikipedia.org/wiki/Bin_packing_problem

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • SQL Server MVP Hugo Kornelis has written a couple of blog posts about bin packing, see http://sqlblog.com/blogs/hugo_kornelis/archive/tags/Bin+Packing/default.aspx

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • I love SQL bin-packing problems! Hate to disagree with you Sean but they don't need to be overly complicated.

    Before I suggest a solution, shouldn't the answer be 4, 6 (=10?)

    Here's one solution that might start to get a little slow if you have:

    - lot's of integers in your table AND

    - you want to consider solutions that sum more than 3, 4 or 5 of those.

    DECLARE @t TABLE (strcol CHAR(3));

    DECLARE @ValueOfInterest INT = 10;

    INSERT INTO @t (strcol)

    SELECT ' 5' UNION ALL SELECT ' 6' UNION ALL SELECT ' 3'

    UNION ALL SELECT ' 4' UNION ALL SELECT ' 3';

    -- Improved Combinations

    ;WITH UNIQUEnTuples (n, Tuples, ID, CSum) AS (

    SELECT DISTINCT 1, CAST(strcol AS VARCHAR(8000)), strcol, CAST(strcol AS INT)

    FROM @t

    UNION ALL

    SELECT 1 + n.n, t.strcol + ',' + n.Tuples, strcol, CSum+CAST(t.strcol AS INT)

    FROM UNIQUEnTuples n

    CROSS APPLY (

    SELECT strcol

    FROM @t t

    WHERE t.strcol < n.ID) t

    )

    SELECT TOP 1 Tuples, SumOfTuples=CSum

    FROM UNIQUEnTuples

    WHERE n <= 2 AND CSum <= @ValueOfInterest

    ORDER BY CSum DESC;

    This is actually a slightly improved version of some code I submitted a couple of years back: Generating n-Tuples with SQL[/url]. The improved version appears towards the end of the discussion thread.

    Note that the WHERE clause can be adjusted to account for the number of integers you want to sum and/or whether you allow the value to be within a plus or minus range, for example:

    WHERE n <= 3 AND CSum BETWEEN @ValueOfInterest - 1 AND @ValueOfInterest + 1

    Would give you up to 3 summed integers where the range is plus or minus 1 from your value of interest. Of course, you may want to select the TOP 5 in such a case and then later decide to narrow it down.

    Edit: Note also that if the number of integers to sum is unbounded, meaning you don't care how many sum to meet your target, you should put a CSUM <= @ValueOfInterest WHERE clause in the recursive leg of the rCTE so that on the result that exceeds your target, all further combinations are avoided.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Here's an example that processes 100 random integers up to 3-tuples with target value of 100. It takes less than 3 seconds to run on my box.

    DECLARE @t TABLE (strcol CHAR(4));

    DECLARE @ValueOfInterest INT = 100;

    INSERT INTO @t (strcol)

    --SELECT ' 5' UNION ALL SELECT ' 6' UNION ALL SELECT ' 3'

    --UNION ALL SELECT ' 4' UNION ALL SELECT ' 3';

    SELECT TOP 100 RIGHT('000' + CAST(ABS(CHECKSUM(NEWID())) % 200 AS VARCHAR(3)), 4)

    FROM sys.all_columns;

    SET STATISTICS TIME ON;

    -- Improved Combinations

    WITH UNIQUEnTuples (n, Tuples, ID, CSum) AS (

    SELECT DISTINCT 1, CAST(strcol AS VARCHAR(8000)), strcol, CAST(strcol AS INT)

    FROM @t

    UNION ALL

    SELECT 1 + n.n, t.strcol + ',' + n.Tuples, strcol, CSum+t.strcol

    FROM UNIQUEnTuples n

    CROSS APPLY (

    SELECT strcol

    FROM @t t

    WHERE t.strcol < n.ID) t

    WHERE CSum+t.strcol <= @ValueOfInterest

    )

    SELECT TOP 5 Tuples, SumOfTuples=CSum

    FROM UNIQUEnTuples

    WHERE n <= 3 AND CSum <= @ValueOfInterest

    ORDER BY CSum DESC

    OPTION (RECOMPILE);

    SET STATISTICS TIME OFF;

    Edit: Two notes (afterthoughts):

    - I added OPTION(RECOMPILE) because it might help to eliminate parameter sniffing in case you run this many times on tables of different sizes.

    - If you need to consider a solution (from your original problem) like 3+3, you need to remove the DISTINCT clause in the anchor leg of the query.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (7/30/2013)


    I love SQL bin-packing problems! Hate to disagree with you Sean but they don't need to be overly complicated.

    Before I suggest a solution, shouldn't the answer be 4, 6 (=10?)

    Here's one solution that might start to get a little slow if you have:

    - lot's of integers in your table AND

    - you want to consider solutions that sum more than 3, 4 or 5 of those.

    DECLARE @t TABLE (strcol CHAR(3));

    DECLARE @ValueOfInterest INT = 10;

    INSERT INTO @t (strcol)

    SELECT ' 5' UNION ALL SELECT ' 6' UNION ALL SELECT ' 3'

    UNION ALL SELECT ' 4' UNION ALL SELECT ' 3';

    -- Improved Combinations

    ;WITH UNIQUEnTuples (n, Tuples, ID, CSum) AS (

    SELECT DISTINCT 1, CAST(strcol AS VARCHAR(8000)), strcol, CAST(strcol AS INT)

    FROM @t

    UNION ALL

    SELECT 1 + n.n, t.strcol + ',' + n.Tuples, strcol, CSum+CAST(t.strcol AS INT)

    FROM UNIQUEnTuples n

    CROSS APPLY (

    SELECT strcol

    FROM @t t

    WHERE t.strcol < n.ID) t

    )

    SELECT TOP 1 Tuples, SumOfTuples=CSum

    FROM UNIQUEnTuples

    WHERE n <= 2 AND CSum <= @ValueOfInterest

    ORDER BY CSum DESC;

    This is actually a slightly improved version of some code I submitted a couple of years back: Generating n-Tuples with SQL[/url]. The improved version appears towards the end of the discussion thread.

    Note that the WHERE clause can be adjusted to account for the number of integers you want to sum and/or whether you allow the value to be within a plus or minus range, for example:

    WHERE n <= 3 AND CSum BETWEEN @ValueOfInterest - 1 AND @ValueOfInterest + 1

    Would give you up to 3 summed integers where the range is plus or minus 1 from your value of interest. Of course, you may want to select the TOP 5 in such a case and then later decide to narrow it down.

    Edit: Note also that if the number of integers to sum is unbounded, meaning you don't care how many sum to meet your target, you should put a CSUM <= @ValueOfInterest WHERE clause in the recursive leg of the rCTE so that on the result that exceeds your target, all further combinations are avoided.

    Dwain this is pretty cool but it doesn't provide the results the OP was looking for. This only returns the one combination that most closely meets the @ValueOfInterest. The OP was looking to have all the values put into groups. The OP stated they wanted back 5,4 and 6,3. Now I agree that doesn't seem like the most logical thing and the second value of 3 went missing. I would think that 6,4 : 5,3 : 3 would be the best fit. I can't some up with a way to make your awesome example return the best fit of all the values.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (7/31/2013)


    dwain.c (7/30/2013)


    I love SQL bin-packing problems! Hate to disagree with you Sean but they don't need to be overly complicated.

    Before I suggest a solution, shouldn't the answer be 4, 6 (=10?)

    Here's one solution that might start to get a little slow if you have:

    - lot's of integers in your table AND

    - you want to consider solutions that sum more than 3, 4 or 5 of those.

    DECLARE @t TABLE (strcol CHAR(3));

    DECLARE @ValueOfInterest INT = 10;

    INSERT INTO @t (strcol)

    SELECT ' 5' UNION ALL SELECT ' 6' UNION ALL SELECT ' 3'

    UNION ALL SELECT ' 4' UNION ALL SELECT ' 3';

    -- Improved Combinations

    ;WITH UNIQUEnTuples (n, Tuples, ID, CSum) AS (

    SELECT DISTINCT 1, CAST(strcol AS VARCHAR(8000)), strcol, CAST(strcol AS INT)

    FROM @t

    UNION ALL

    SELECT 1 + n.n, t.strcol + ',' + n.Tuples, strcol, CSum+CAST(t.strcol AS INT)

    FROM UNIQUEnTuples n

    CROSS APPLY (

    SELECT strcol

    FROM @t t

    WHERE t.strcol < n.ID) t

    )

    SELECT TOP 1 Tuples, SumOfTuples=CSum

    FROM UNIQUEnTuples

    WHERE n <= 2 AND CSum <= @ValueOfInterest

    ORDER BY CSum DESC;

    This is actually a slightly improved version of some code I submitted a couple of years back: Generating n-Tuples with SQL[/url]. The improved version appears towards the end of the discussion thread.

    Note that the WHERE clause can be adjusted to account for the number of integers you want to sum and/or whether you allow the value to be within a plus or minus range, for example:

    WHERE n <= 3 AND CSum BETWEEN @ValueOfInterest - 1 AND @ValueOfInterest + 1

    Would give you up to 3 summed integers where the range is plus or minus 1 from your value of interest. Of course, you may want to select the TOP 5 in such a case and then later decide to narrow it down.

    Edit: Note also that if the number of integers to sum is unbounded, meaning you don't care how many sum to meet your target, you should put a CSUM <= @ValueOfInterest WHERE clause in the recursive leg of the rCTE so that on the result that exceeds your target, all further combinations are avoided.

    Dwain this is pretty cool but it doesn't provide the results the OP was looking for. This only returns the one combination that most closely meets the @ValueOfInterest. The OP was looking to have all the values put into groups. The OP stated they wanted back 5,4 and 6,3. Now I agree that doesn't seem like the most logical thing and the second value of 3 went missing. I would think that 6,4 : 5,3 : 3 would be the best fit. I can't some up with a way to make your awesome example return the best fit of all the values.

    I agree that clarification is needed on what is meant by "best fit." That's why I asked if 4,6 shouldn't be the correct answer and offered some suggestions to modify the WHERE clause to address whatever that requirement is.

    Then there's also the question of whether 3,3,4 is one of the "best fit" solutions needed.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (7/31/2013)


    I agree that clarification is needed on what is meant by "best fit." That's why I asked if 4,6 shouldn't be the correct answer and offered some suggestions to modify the WHERE clause to address whatever that requirement is.

    Then there's also the question of whether 3,3,4 is one of the "best fit" solutions needed.

    And this is why I said this can get really complicated. Maybe the rule is to get the best fit (least number of bins) and always fill them as full as possible. If that were the case then 4,3,3 : 5 : 6 would be the best fit. The debate goes on and on and on...

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Slightly modified Dwain code to return exactly what OP has asked:

    DECLARE @t TABLE (strcol CHAR(3));

    DECLARE @ValueOfInterest INT = 10;

    INSERT INTO @t (strcol)

    SELECT ' 5' UNION ALL SELECT ' 6' UNION ALL SELECT ' 3'

    UNION ALL SELECT ' 4' UNION ALL SELECT ' 3';

    -- Improved Combinations

    ;WITH UNIQUEnTuples (n, Tuples, ID, CSum) AS (

    SELECT DISTINCT 1, CAST(strcol AS VARCHAR(8000)), strcol, CAST(strcol AS INT)

    FROM @t

    UNION ALL

    SELECT 1 + n.n, t.strcol + ',' + n.Tuples, strcol, CSum+CAST(t.strcol AS INT)

    FROM UNIQUEnTuples n

    CROSS APPLY (

    SELECT strcol

    FROM @t t

    WHERE t.strcol < n.ID) t

    )

    SELECT DISTINCT Tuples, SumOfTuples=CSum

    FROM UNIQUEnTuples

    WHERE n <= 2 AND CSum = (SELECT MAX(CSum) MS

    FROM UNIQUEnTuples

    WHERE n <= 2 AND CSum < @ValueOfInterest)

    ORDER BY CSum DESC;

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Thank you very much for all your replies!

    You were right about the rule to get the best fit and always fill the batches as full as possible. I am so sorry for not giving the correct specifications sooner on this issue.

    But, thank you again on all your replies!

  • I am sorry, I am not sure I understand the question.

    I need for the sum created, to be closest to a given number (10, in the example above), but not go over it.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply