sum every two number combination

Question

Post reply

sum every two number combination

Amy.G

SSCrazy

Points: 2305
More actions
September 24, 2012 at 11:29 am

#268235

Let's say your coworker gave you the results of a query, and noted he/she did not include SalesOrderID numbers 43674 and 44295. However, you run that query: (from AdventureWorks)
SELECT [SalesOrderID]
, a.City
,[TotalDue]
FROM [AdventureWorks].[Sales].[SalesOrderHeader] s
INNER JOIN AdventureWorks.Person.Address a
ON s.BillToAddressID = a.AddressID
WHERE City = 'Ottawa'
AND SalesOrderID NOT IN ('43674','44295')
And find that you're grand total for orders = $98689.28, whereas the results the coworker sent has as grand total in sales of $109276.67. So you know he/she made a mistake and excluded the wrong two SalesOrderIDs. Now, obviously, the solution is to call and ask them to open the query and verify the excuded IDs. That's exactly what I did, and issue is resolved. But I got it into my head that it would be cool to be able to somehow figure out the two excluded IDs by summing all two number combinations until you found the sum that matched the difference.
This is more of a pet project or brain teaser I've given myself, but I'm not getting very far. A gentle nudge in the right direction would be nice. 😉

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

patrickmcginnis59 SSCertifiable Points: 6483 More actions · Answer 1

A wild guess just for fun! Cue all smarter folks to line up and show how inefficient this query is!

SELECT TAB1.SalesOrderId AS ID1,

TAB1.TotalDue AS DUE1,

TAB2.SalesOrderId AS ID2,

TAB2.TotalDue AS DUE2

FROM SALESTABLE TAB1

CROSS JOIN

SALESTABLE TAB2

WHERE TAB1.SalesOrderID <> TAB2.SalesOrderID AND

TAB1.TotalDue + TAB2.Totaldue = @SPECIFIED_DIFFERENCE

Amy.G SSCrazy Points: 2305 More actions · Answer 2

You know, it worked. It tool 3.58 minutes, but it worked. 🙂

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 3

Lynn Pettis

SSC Guru

Points: 442467

September 24, 2012 at 12:31 pm

#1541566

Can TotalDue every be less than 0?

Luis Cazares SSC Guru Points: 183706 More actions · Answer 4

I don't have anything to test with, but I believe this could be better.

However, the problem is that it's still using a cartesian product to obtain the result.

SELECT TAB1.SalesOrderId AS ID1,

TAB1.TotalDue AS DUE1,

TAB2.SalesOrderId AS ID2,

TAB2.TotalDue AS DUE2

FROM SALESTABLE TAB1

JOIN SALESTABLE TAB2 ON TAB1.SalesOrderID < TAB2.SalesOrderID --Avoid half of the rows.

WHERE TAB1.TotalDue < @SPECIFIED_DIFFERENCE --This are useless unless you have negatives.

AND TAB2.TotalDue < @SPECIFIED_DIFFERENCE --Same as above.

AND TAB1.TotalDue + TAB2.Totaldue = @SPECIFIED_DIFFERENCE

I'm not sure if the conditions added will help but I'm sure the condition for the join must help the performance.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Amy.G SSCrazy Points: 2305 More actions · Answer 5

Lynn Pettis (9/24/2012)
Can TotalDue every be less than 0?

No. And there are no nulls.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 6

Luis Cazares (9/24/2012)
I don't have anything to test with, but I believe this could be better.
However, the problem is that it's still using a cartesian product to obtain the result.
SELECT TAB1.SalesOrderId AS ID1,
TAB1.TotalDue AS DUE1,
TAB2.SalesOrderId AS ID2,
TAB2.TotalDue AS DUE2
FROM SALESTABLE TAB1
JOIN SALESTABLE TAB2 ON TAB1.SalesOrderID < TAB2.SalesOrderID --Avoid half of the rows.
WHERE TAB1.TotalDue < @SPECIFIED_DIFFERENCE --This are useless unless you have negatives.
AND TAB2.TotalDue < @SPECIFIED_DIFFERENCE --Same as above.
AND TAB1.TotalDue + TAB2.Totaldue = @SPECIFIED_DIFFERENCE
I'm not sure if the conditions added will help but I'm sure the condition for the join must help the performance.

What you have here is what I was thinking. If the total due for each order can not be less than zero, then it makes sense to restrict the testing to orders where the total due for each order is less than or equal to the total difference. If you could have a negative total, then this check would not be worthwhile.

patrickmcginnis59 SSCertifiable Points: 6483 More actions · Answer 7

Or we could put an index on the totaldue column 😉

SELECT TAB1.SalesOrderId AS ID1,

TAB1.TotalDue AS DUE1,

TAB2.SalesOrderId AS ID2,

TAB2.TotalDue AS DUE2

FROM SALESTABLE TAB1

JOIN

SALESTABLE TAB2

ON TAB1.Totaldue = @SPECIFIED_DIFFERENCE - TAB2.Totaldue

Sean Lange SSC Guru Points: 286609 More actions · Answer 8

And of course if the exact rows matter this can get a LOT more challenging because there may be more than 1 pair where the sum is the total you are looking for.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Dwain Camps SSC Guru Points: 86978 More actions · Answer 9

Sean Lange (9/24/2012)
And of course if the exact rows matter this can get a LOT more challenging because there may be more than 1 pair where the sum is the total you are looking for.

I can do that! And I can examine triplet combinations (or deeper) as well!

DECLARE @Orders Table

(OrderID VARCHAR(4), Amount MONEY)

INSERT INTO @Orders

SELECT 2122, 5400

UNION ALL SELECT '2123', 5500

UNION ALL SELECT '2124', 1500

UNION ALL SELECT '2125', 5700

UNION ALL SELECT '2126', 4500

UNION ALL SELECT '2126', 1500

UNION ALL SELECT '2127', 5200

UNION ALL SELECT '2129', 1000

DECLARE @Amount MONEY = 7000

;WITH UNIQUEnTuples (n, Tuples, Amount) AS (

SELECT 1

-- Add brackets around the OrderID to make later string comparison unique

, CAST('[' + OrderID + ']' AS VARCHAR(max)) COLLATE Latin1_General_BIN

,Amount

FROM @Orders

UNION ALL

SELECT 1 + n.n, a.OrderID + n.Tuples, n.Amount + t.Amount

FROM @Orders t

CROSS APPLY (

SELECT '[' + t.OrderID + ']') a(OrderID)

JOIN UNIQUEnTuples n ON a.OrderID < n.Tuples

WHERE CHARINDEX(a.OrderID, n.Tuples) = 0 AND n < 3 AND

n.Amount + t.Amount <= @Amount

)

SELECT *

FROM UNIQUEnTuples

WHERE Amount = @Amount -- The sum you're intereseted in identifying

For an explanation of this approach, you can take a look at this article: http://www.sqlservercentral.com/articles/sql+n-Tuples/89809/

I've found that the COLLATE I added improves performance slightly. You can modify n < 3 to account for the depth of nTuples you want to examine.

Edit: Added the @Amount variable so I could use it to limit the rows examined in the recursive leg of the rCTE (potentially improves performance).

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Amy.G SSCrazy Points: 2305 More actions · Answer 10

Thank you for this solution. I had originally tried to do a recursive cte, but though I'm beginning to understand how they work, I haven't been able to succussfully create my own from scratch. The logic of pulling data from something that doesn't really exist yet is a little much for my linear brain to fully embrace. I'm trying though!

SomewhereSomehow SSCommitted Points: 1586 More actions · Answer 11

Hi!

If you are interested in how recursive cte is working, you may check out this blog post by Rob Farley

I am really sorry for my poor gramma. And I hope that value of my answers will outweigh the harm for your eyes.
Blog: http://somewheresomehow.ru[/url]
Twitter: @SomewereSomehow

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 12

dwain.c (9/24/2012)
Sean Lange (9/24/2012)
And of course if the exact rows matter this can get a LOT more challenging because there may be more than 1 pair where the sum is the total you are looking for.
I can do that! And I can examine triplet combinations (or deeper) as well!
...

When you wrote that excellent article Dwain, did you have any idea how often you would have to quote it? I think you would have said "about once a year". In practice it seems like once a week 😀

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Sean Lange SSC Guru Points: 286609 More actions · Answer 13

ChrisM@Work (9/25/2012)
dwain.c (9/24/2012)
Sean Lange (9/24/2012)
And of course if the exact rows matter this can get a LOT more challenging because there may be more than 1 pair where the sum is the total you are looking for.
I can do that! And I can examine triplet combinations (or deeper) as well!
...
When you wrote that excellent article Dwain, did you have any idea how often you would have to quote it? I think you would have said "about once a year". In practice it seems like once a week 😀

I knew that article from Dwain would have an answer. Of course my point got lost in the shuffle. The OP said something about knowing WHICH IDs made the sum. My point was that because of the possibility of multiple pairs getting the EXACT IDs would be impossible. Of course you would have to use Dwain's technique to find them. 😉

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Dwain Camps SSC Guru Points: 86978 More actions · Answer 14

ChrisM@Work (9/25/2012)
dwain.c (9/24/2012)
Sean Lange (9/24/2012)
And of course if the exact rows matter this can get a LOT more challenging because there may be more than 1 pair where the sum is the total you are looking for.
I can do that! And I can examine triplet combinations (or deeper) as well!
...
When you wrote that excellent article Dwain, did you have any idea how often you would have to quote it? I think you would have said "about once a year". In practice it seems like once a week 😀

Chris - It's interesting that you should say that. When I stumbled upon both the scripts in that article (entirely by accident) I just thought they were neat and banged out the article really quickly. I then later realized they were kinda like a solution looking for a problem. The shortfall of the article (which I remedied to a certain extent in the discussion thread) is that I didn't delve very deeply into their applicability.

And yes, I am a bit surprised at the number of times I've found an opportunity to use them. Apparently there aren't too many methods in SQL to arrive at all combinations like they do (at least not very easily).

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St