Arithmetic overflow error - double join?

  • Guys,

    I've created a temp table which has a bunch of IDs and dates, something dead simple like this:

    SELECT
        pg.ID,
        pg.Name,
        COUNT(pa.Log_ID) as A
    FROM tbla pg
    JOIN #log pa on pa.ID = pg.ID
    GROUP BY pg.id, pg.name

    Works just fine. If I try to join to the same table again (with a different alias) and do another count, it gives me an error:

    Msg 8115, Level 16, State 2, Line 1

    Arithmetic overflow error converting expression to data type int.

    SELECT
        pg.ID,
        pg.Name,
        COUNT(pa.Log_ID) as A,
        COUNT(pb.Log_ID) as B
    FROM tbla pg
    JOIN #log pa on pa.ID = pg.ID
    JOIN #log pb on pb.ID = pg.ID
    GROUP BY pg.id, pg.name

    My initial reason for this double join was thinking I could restrict based on date - so whilst my temp table has the data going back over 1 year I could join to it several times, restricting my data set on each join - e.g. once for the full year, once for 6 months, once for 3 months etc.

    I've not come across this before; does anyone know what might be causing it? I tried casting to a bigint, but it just ran for ages (well over 10 minutes, when the original query takes seconds).

    Thanks!

    With nearly 300 points I assume you have been around here a while. Please take a few minutes and read the article at the first link in my signature. Then post some DDL, sample data and desired output. Without these details we are guessing.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • I figured somebody might know the answer from the 'double join' side of things...

    Here's a couple of test tables with inserts:

    create table #logs
    (
        log_id int,
        id int,
        log_date datetime
    )

    INSERT INTO #Logs SELECT 27824224 as Log_ID, 94 as ID, '2012-12-20 20:39:07.433' as Log_Date
    INSERT INTO #Logs SELECT 27824212 as Log_ID, 151 as ID, '2012-12-20 20:38:57.130' as Log_Date
    INSERT INTO #Logs SELECT 28833771 as Log_ID, 117 as ID, '2013-01-30 16:36:54.047' as Log_Date
    INSERT INTO #Logs SELECT 28833854 as Log_ID, 119 as ID, '2013-01-30 16:38:00.673' as Log_Date
    INSERT INTO #Logs SELECT 28833690 as Log_ID, 121 as ID, '2013-01-30 16:35:23.073' as Log_Date
    INSERT INTO #Logs SELECT 28833733 as Log_ID, 121 as ID, '2013-01-30 16:36:02.000' as Log_Date
    INSERT INTO #Logs SELECT 28833871 as Log_ID, 121 as ID, '2013-01-30 16:38:13.247' as Log_Date
    INSERT INTO #Logs SELECT 28833900 as Log_ID, 121 as ID, '2013-01-30 16:38:50.350' as Log_Date
    INSERT INTO #Logs SELECT 28833970 as Log_ID, 121 as ID, '2013-01-30 16:40:10.247' as Log_Date
    INSERT INTO #Logs SELECT 28834134 as Log_ID, 121 as ID, '2013-01-30 16:42:20.503' as Log_Date
    INSERT INTO #Logs SELECT 28834217 as Log_ID, 121 as ID, '2013-01-30 16:43:37.447' as Log_Date
    INSERT INTO #Logs SELECT 28834329 as Log_ID, 121 as ID, '2013-01-30 16:45:01.253' as Log_Date
    INSERT INTO #Logs SELECT 28834465 as Log_ID, 121 as ID, '2013-01-30 16:46:39.210' as Log_Date
    INSERT INTO #Logs SELECT 28834544 as Log_ID, 121 as ID, '2013-01-30 16:47:47.623' as Log_Date
    INSERT INTO #Logs SELECT 28834577 as Log_ID, 121 as ID, '2013-01-30 16:48:13.480' as Log_Date
    INSERT INTO #Logs SELECT 28834589 as Log_ID, 121 as ID, '2013-01-30 16:48:21.643' as Log_Date
    INSERT INTO #Logs SELECT 28834639 as Log_ID, 121 as ID, '2013-01-30 16:48:49.837' as Log_Date
    INSERT INTO #Logs SELECT 28834679 as Log_ID, 122 as ID, '2013-01-30 16:49:21.230' as Log_Date

    create table #Items
    (
        id int,
        title varchar(50)
    )

    INSERT INTO #Items SELECT 94 as ID, 'Badger' as Title
    INSERT INTO #Items SELECT 151 as ID, 'Youngs' as Title
    INSERT INTO #Items SELECT 117 as ID, 'Fullers' as Title
    INSERT INTO #Items SELECT 119 as ID, 'Neame' as Title
    INSERT INTO #Items SELECT 121 as ID, 'Batemans' as Title
    INSERT INTO #Items SELECT 122 as ID, 'Adnams' as Title

    Now if I run this:

    SELECT
        t.ID,
        t.Title,
        COUNT(l.Log_ID) as c1
    FROM #Items t
    JOIN #logs l on l.ID = t.id
    GROUP BY t.id, t.title

    It works fine.

    If I run this:

    SELECT
        t.ID,
        t.Title,
        COUNT(l.Log_ID) as c1,
        COUNT(la.Log_ID) as c2
    FROM #Items t
    JOIN #logs l on l.ID = t.id
    JOIN #logs la on la.ID = t.id
    GROUP BY t.id, t.title

    It still works, in that it doesn't throw an error, but the data isn't as intended. I'd expect another column duplicating the results of column c1; instead I've created some kind of monster which produces a count of 169 for 'Batemans' rather than 13.

    There's something obvious I'm not spotting here which is kind of annoying.

  • Rob-350472 (4/15/2013)


    I figured somebody might know the answer from the 'double join' side of things...

    Here's a couple of test tables with inserts:

    It still works, in that it doesn't throw an error, but the data isn't as intended. I'd expect another column duplicating the results of column c1; instead I've created some kind of monster which produces a count of 169 for 'Batemans' rather than 13.

    There's something obvious I'm not spotting here which is kind of annoying.

    LOL, you didn't create a monster. It did exactly what you asked of it. First it joins to #logs, which has 13 rows for 'Batemans'. Then it joins those results to #logs a second time; this time there are 13 rows on each side and every row matches, so you get 169 (13 * 13).

    It would probably help you visualize what's happening if you removed the aggregates. Still not quite sure what you are trying to do here, but I can try to help.
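The 13 * 13 multiplication is easy to reproduce outside SQL Server. Here's a minimal sketch using SQLite from Python (made-up table and values, not the thread's data) showing that a second self-join squares the per-group row count:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE logs (log_id INTEGER, id INTEGER)")
# three log rows for item 121, so the correct count is 3
cur.executemany("INSERT INTO logs VALUES (?, ?)", [(1, 121), (2, 121), (3, 121)])

# one pass over the table: COUNT sees 3 rows
single = cur.execute(
    "SELECT COUNT(l.log_id) FROM logs l WHERE l.id = 121"
).fetchone()[0]

# self-join on id: every row pairs with every row, so COUNT sees 3 * 3 = 9
double = cur.execute(
    """SELECT COUNT(a.log_id)
       FROM logs a
       JOIN logs b ON b.id = a.id
       WHERE a.id = 121"""
).fetchone()[0]

print(single, double)  # 3 9
```

With the thread's 13 'Batemans' rows, the same pairing yields 169.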


  • What I'm trying to do is something I've not done before (perhaps this is why!), say you've created a temp table - a bit like my example - but you wanted to count how many times an item appeared in it over specific date ranges...

    I was thinking I could join to the table once to get 'all time', then join again with something along the lines of and Log_Date > dateadd(mm, -6, getdate()) to perform the count on a subset, and so on.

    In the past I'd used separate temp tables based on the time horizons, so say 4 temp tables rather than attempting 4 joins, or I'd done a union.

    I think the approach I'm trying to adopt basically isn't going to work - I blame it on an office that's far too warm - can't think straight! (Well, that's my excuse and I'm sticking to it.)

    The arithmetic overflow makes sense now though, I guess, because the actual temp table has 659,965 rows - the square of that is quite large!
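The arithmetic backs that up: COUNT returns an int, which tops out at 2,147,483,647, and the self-join's worst case is the row count squared. A quick check with the figure from the post:

```python
INT_MAX = 2**31 - 1   # SQL Server's int ceiling: 2,147,483,647
rows = 659_965        # temp table size mentioned above

worst_case = rows * rows  # upper bound on pairs a self-join can produce
print(f"{worst_case:,}")     # 435,553,801,225
print(worst_case > INT_MAX)  # True
```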

  • Rob-350472 (4/15/2013)


    What I'm trying to do is something I've not done before... say you've created a temp table, a bit like my example, but you wanted to count how many times an item appeared in it over specific date ranges...

    The arithmetic overflow makes sense now though, because the actual temp table has 659,965 rows, so the square of that is quite large!

    Maybe easiest in that case to use subselects? I will give you a hand if you need it, but I would need a few more details first. It certainly can be done fairly easily.

    Glad I was able to shed some light on what was going on at least. πŸ˜€


  • Indeed, quite why I didn't just do something like this:

    SELECT
        t.ID,
        t.Title,
        (SELECT count(l.log_id) FROM #logs l WHERE l.id = t.id) as AllTime,
        (SELECT count(l.log_id) FROM #logs l WHERE l.id = t.id and l.log_date >= DATEADD(mm, -3, getdate())) as Mnth
    FROM #Items t
    GROUP BY t.id, t.title

    Is beyond me, it's only Monday too!
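The correlated-subquery shape translates directly to other engines, which makes it easy to sanity-check. A SQLite sketch from Python (invented rows, and a fixed cutoff date standing in for DATEADD(mm, -3, getdate()), which SQLite lacks):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE items (id INTEGER, title TEXT)")
cur.execute("CREATE TABLE logs (log_id INTEGER, id INTEGER, log_date TEXT)")
cur.execute("INSERT INTO items VALUES (121, 'Batemans')")
cur.executemany("INSERT INTO logs VALUES (?, ?, ?)", [
    (1, 121, '2012-06-01'),  # old row: counted in AllTime only
    (2, 121, '2013-04-01'),  # recent rows: counted in both columns
    (3, 121, '2013-04-10'),
])

# each subquery is correlated on t.id, so the counts stay independent
row = cur.execute("""
    SELECT t.id, t.title,
           (SELECT COUNT(l.log_id) FROM logs l WHERE l.id = t.id) AS AllTime,
           (SELECT COUNT(l.log_id) FROM logs l
             WHERE l.id = t.id AND l.log_date >= '2013-01-01') AS Recent
    FROM items t
""").fetchone()
print(row)  # (121, 'Batemans', 3, 2)
```

No multiplication this time: the two counts come from separate scans rather than a joined row set.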

  • Rob-350472 (4/15/2013)


    Indeed, quite why I didn't just do something like this:

    ...

    Is beyond me, it's only Monday too!

    Probably don't need to use 2 subselects here. You can use your original query.

    SELECT
        t.ID,
        t.Title,
        COUNT(l.Log_ID) as AllTime,
        (SELECT count(l2.log_id) FROM #logs l2 WHERE l2.id = t.id and l2.log_date >= DATEADD(mm, -3, getdate())) as Mnth
    FROM #Items t
    JOIN #logs l on l.ID = t.id
    GROUP BY t.id, t.title

    You might find that using a CTE for the Months will end up performing a bit better.

    Something like this (untested of course because I don't have real data to work with):

    ;with MonthValues as
    (
        SELECT count(l.log_id) as Months, ID
        FROM #logs l
        WHERE l.log_date >= DATEADD(mm, -3, getdate())
        group by ID
    )
    SELECT
        t.ID,
        t.Title,
        COUNT(l.Log_ID) as AllTime,
        mv.Months
    FROM #Items t
    JOIN #logs l on l.ID = t.id
    join MonthValues mv on mv.ID = t.ID
    GROUP BY t.id, t.title, mv.Months
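One more option the thread doesn't mention: conditional aggregation, which gets both counts in a single pass over the log table by moving the date filter inside the aggregate. A SQLite sketch (invented rows and cutoff date):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE items (id INTEGER, title TEXT)")
cur.execute("CREATE TABLE logs (log_id INTEGER, id INTEGER, log_date TEXT)")
cur.execute("INSERT INTO items VALUES (121, 'Batemans')")
cur.executemany("INSERT INTO logs VALUES (?, ?, ?)", [
    (1, 121, '2012-06-01'),  # old row: AllTime only
    (2, 121, '2013-04-01'),  # recent rows: both columns
    (3, 121, '2013-04-10'),
])

# COUNT skips NULLs, so the CASE turns the date filter into a per-column count
row = cur.execute("""
    SELECT t.id, t.title,
           COUNT(l.log_id) AS AllTime,
           COUNT(CASE WHEN l.log_date >= '2013-01-01'
                      THEN l.log_id END) AS Recent
    FROM items t
    JOIN logs l ON l.id = t.id
    GROUP BY t.id, t.title
""").fetchone()
print(row)  # (121, 'Batemans', 3, 2)
```

Each extra time horizon is just another CASE column, so no further joins (and no risk of multiplying counts) are needed.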


    Performance may end up being important here; I've not decided. I may just pre-process the data, in which case it's not overly important. Thanks for the CTE example, I'll give that a look in a while - I've been meaning to investigate CTEs more, so even if it's not used for performance gains I'd like to play with it!

    Cheers!

  • Rob-350472 (4/15/2013)


    Works just fine, if I try to join to the same table again (with a different alias) and do another count it gives me an error:

    Msg 8115, Level 16, State 2, Line 1
    Arithmetic overflow error converting expression to data type int.

    ...

    I've not come across this before, does anyone know what might be causing it? I tried casting to a bigint but it just ran for ages (well over 10 mins when the original query takes seconds).

    Your query was counting more than 2,147,483,647 rows! If you're interested in a workaround (probably not), COUNT_BIG() would do the trick.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

    Aye, that makes sense; it explains why the BIGINT conversion didn't die, but ran for ages before I killed it.
