Premature casting on 0 rows insert

  • Hello,

    I have encountered some weird behaviour. Code that has been working for "eternities" suddenly started to fail.

    Alas, i couldn't recreate it on any other machines.

    I have written a sample script to illustrate the issue:

    CREATE TABLE #t_parcels (parcel_id INT, current_pos NUMERIC(28, 0), end_pos NUMERIC(28, 0))

    CREATE TABLE #t_orders (outorder INT, parcel_id INT)

    CREATE TABLE #t_missing_parcels (parcel_id INT, diff INT)

    INSERT INTO #t_parcels (parcel_id, current_pos, end_pos)

    SELECT1, 100000000000.0, 900000000000.0

    INSERT INTO #t_orders (outorder, parcel_id)

    SELECT1, 1

    UNION

    SELECT2, 1

    -- Find out if we have enough parcels to cover all orders

    INSERT INTO #t_missing_parcels (parcel_id, diff)

    SELECTt.parcel_id, t.end_pos - t.current_pos - COUNT(*)

    FROM#t_parcels t

    INNER JOIN #t_orders o

    ONo.parcel_id = t.parcel_id

    GROUP BY t.parcel_id, t.end_pos - t.current_pos

    HAVING t.end_pos - t.current_pos < COUNT(*)

    SELECT @@VERSION

    -- Microsoft SQL Server 2012 - 11.0.2100.60 (X64)

    -- Feb 10 2012 19:39:15

    -- Copyright (c) Microsoft Corporation

    -- Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1) (Hypervisor)

    The last insert crashed with :"Arithmetic overflow error converting expression to data type int", even though there are no rows that satisfies the condition!

    This is due to "diff" column having wrong datatype, BUT, the insert had no hits in the database. So how can inserting 0 rows crash with incorrect datatype?:w00t:

    I even copied the select so it was ran before the insert, in in that case, the SELECT completed successfully.

    When i changed datatype in the table, the error went away, but i'm still curious what led to the error.

    BR,

    Sigge

  • Just a hunch, suspect cached temp table might cause this.

    😎

  • Hi,

    Could you explain it a bit more? 🙂 Not sure i follow it exactly

  • Have you checked the execution plan? Are there any unexpected implicit conversions happening?

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Aha, it does something like that:

    [Expr1010] = Scalar Operator(CONVERT_IMPLICIT(int,#t.[parcelno_available] as [tpa].[parcelno_available]-[Expr1011],0))

    Right before the insert into the #t-table.

    and parcelno_available is well above INTs value.

    This cast doesn't happens when i just select the values.

    So, apparently SQL Server pre-casts the values, even though there are no rows to insert?

  • siggemannen (2/28/2015)


    Hi,

    Could you explain it a bit more? 🙂 Not sure i follow it exactly

    Have a look at this article by Paul White

    😎

  • The bahaviour is decoumented here

    https://msdn.microsoft.com/en-us/library/ms190309.aspx


    N 56°04'39.16"
    E 12°55'05.25"

  • siggemannen (2/28/2015)


    Hello,

    I have encountered some weird behaviour. Code that has been working for "eternities" suddenly started to fail.

    Alas, i couldn't recreate it on any other machines.

    I'm afraid that this is part of the ongoing farce resulting from treating the optimiser as a sacred cow. There's a Microsoft connect item that covers this bizarre behavious and a lot of other similar nonsense and Microsoft said way back 5 years ago that they hoped to address it in a future version, but nothing has happened yet. Of course teh SQL standard probably says that what should happen is undefined, as it was driven by people (IBM, Oracle, etc) with axes to grind and awfully ill-conceived implementations to support). It's the same problem as the issue raised in another Microsoft Connect item where MS said (again about 5 years ago) "Another workaround, which is what the server should be doing automatically, is to push the subquery down into a subselect" before closing the item because the insance behaviour is "By Design". Rules about evaluation order which efectively render the language free of any useful semantics which, without a rule that an error occurring on an execution branch which is not logically required should not cause a failure, were added to the standard so as to permit bizarre results.

    We are all suffering from IBM's decision to exclude both Codd and all its famous computer language experts from the design of SEQUEL (whch is what became SQL), and the absolute failure of ANSI and ISO to impose any decent language design on teh resulting mess.

    Tom

  • TomThomson (3/1/2015)


    siggemannen (2/28/2015)


    Hello,

    I have encountered some weird behaviour. Code that has been working for "eternities" suddenly started to fail.

    Alas, i couldn't recreate it on any other machines.

    I'm afraid that this is part of the ongoing farce resulting from treating the optimiser as a sacred cow. There's a Microsoft connect item that covers this bizarre behavious and a lot of other similar nonsense and Microsoft said way back 5 years ago that they hoped to address it in a future version, but nothing has happened yet. Of course teh SQL standard probably says that what should happen is undefined, as it was driven by people (IBM, Oracle, etc) with axes to grind and awfully ill-conceived implementations to support). It's the same problem as the issue raised in another Microsoft Connect item where MS said (again about 5 years ago) "Another workaround, which is what the server should be doing automatically, is to push the subquery down into a subselect" before closing the item because the insance behaviour is "By Design". Rules about evaluation order which efectively render the language free of any useful semantics which, without a rule that an error occurring on an execution branch which is not logically required should not cause a failure, were added to the standard so as to permit bizarre results.

    We are all suffering from IBM's decision to exclude both Codd and all its famous computer language experts from the design of SEQUEL (whch is what became SQL), and the absolute failure of ANSI and ISO to impose any decent language design on teh resulting mess.

    +1.

    Don Simpson



    I'm not sure about Heisenberg.

  • DonlSimpson (3/2/2015)


    TomThomson (3/1/2015)


    siggemannen (2/28/2015)


    Hello,

    I have encountered some weird behaviour. Code that has been working for "eternities" suddenly started to fail.

    Alas, i couldn't recreate it on any other machines.

    I'm afraid that this is part of the ongoing farce resulting from treating the optimiser as a sacred cow. There's a Microsoft connect item that covers this bizarre behavious and a lot of other similar nonsense and Microsoft said way back 5 years ago that they hoped to address it in a future version, but nothing has happened yet. Of course teh SQL standard probably says that what should happen is undefined, as it was driven by people (IBM, Oracle, etc) with axes to grind and awfully ill-conceived implementations to support). It's the same problem as the issue raised in another Microsoft Connect item where MS said (again about 5 years ago) "Another workaround, which is what the server should be doing automatically, is to push the subquery down into a subselect" before closing the item because the insance behaviour is "By Design". Rules about evaluation order which efectively render the language free of any useful semantics which, without a rule that an error occurring on an execution branch which is not logically required should not cause a failure, were added to the standard so as to permit bizarre results.

    We are all suffering from IBM's decision to exclude both Codd and all its famous computer language experts from the design of SEQUEL (whch is what became SQL), and the absolute failure of ANSI and ISO to impose any decent language design on teh resulting mess.

    +1.

    Codd was working for IBM at the time, and IBM asked him to do it. To be fair, a lot of it was not just "ill-conceived". Computers were vastly less powerful back then, and some compromises had to be made to get any style of relational db functioning. The first one or two SystemR versions were known for horrible performance at times.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (3/2/2015)


    Codd was working for IBM at the time, and IBM asked him to do it. To be fair, a lot of it was not just "ill-conceived". Computers were vastly less powerful back then, and some compromises had to be made to get any style of relational db functioning. The first one or two SystemR versions were known for horrible performance at times.

    Codd was indeed working for IBM at the time but was denied any influence on the SEQUEL project. Various IBM people have denied that, of course, but Codd made it very clear that SEQUEL was in his opinion nothing like a relational calculus, and he had practically no influence at all on it because he had blotted his copy-book by talking to customers about the relatonal model (which IBM didn't want as a rival to its non-relational DBMS).

    The compromises were needed not because the computers were not sufficiently powerful, but because the team didn't understand the basics of language design and comilation/interpretation. Allowing speculative computation to cause a failure which rolls back lots of work does NOT improve performance - in fact it is fairly unusual for speculative computation to improve performance unless both (a) it isn't allowed to cause unwanted aborts and (b) there is more parallelism available than can be exploited by non-speculative computation; clearly the team disregarded (a) and if you think (b) was satisfied you clearly have an interesting view of the amount of parallelism available on typical IBM hardware in the 1970s.

    Tom

  • TomThomson (3/2/2015)


    ScottPletcher (3/2/2015)


    Codd was working for IBM at the time, and IBM asked him to do it. To be fair, a lot of it was not just "ill-conceived". Computers were vastly less powerful back then, and some compromises had to be made to get any style of relational db functioning. The first one or two SystemR versions were known for horrible performance at times.

    Codd was indeed working for IBM at the time but was denied any influence on the SEQUEL project. Various IBM people have denied that, of course, but Codd made it very clear that SEQUEL was in his opinion nothing like a relational calculus, and he had practically no influence at all on it because he had blotted his copy-book by talking to customers about the relatonal model (which IBM didn't want as a rival to its non-relational DBMS).

    The compromises were needed not because the computers were not sufficiently powerful, but because the team didn't understand the basics of language design and comilation/interpretation. Allowing speculative computation to cause a failure which rolls back lots of work does NOT improve performance - in fact it is fairly unusual for speculative computation to improve performance unless both (a) it isn't allowed to cause unwanted aborts and (b) there is more parallelism available than can be exploited by non-speculative computation; clearly the team disregarded (a) and if you think (b) was satisfied you clearly have an interesting view of the amount of parallelism available on typical IBM hardware in the 1970s.

    Yes, Codd preferred his own language, "ALPHA"(? not sure of the name), and he was locked out of the SysR team.

    But, many of the compromises were indeed due to computer processing speed. In fact, even today, no modern relational implementation satisfies all of Codd's rules, some of them for the same reason.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Actually, the issue is absolutely valid and comes from the sequence of the query execution.

    HAVING is executed after GROUP BY and is applied to the values calculated by the GROUP BY.

    Those values have to be stored somewhere, and the engine has to allocate some memory for that.

    INSERT - SELECT query, unlike simple SELECT, has the data type of the target data field defined, so the calculations of the [diff] value happen within int data type limits.

    Pretty natural for a language with data type control. Just need to keep in mind that the INSERT is an internal part of the query and it affects the rest of it.

    _____________
    Code for TallyGenerator

  • I got bitten by the same problem a while ago, see MS Connect

    It's nothing to do with cached temporary tables, just a "feature" of the way the query optimizer works.

    The MS response was "in general letting the optimizer choose any predicate evaluation order is necessary to give us the flexibility to choose the best query plan".

    Even one that fails?

    P.S. The behaviour is still there in SQL Server 2014.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply