Insert Into - Please Please Help Me !

  • All,

    We have just migrated from SQL 2000 to SQL 2008. Table structures,indexes etc are the same as they were before. However, when we run the following query in 08 Management Studio on our new server we get a different result than running it through MMS on our old server. It appears that the insert into or the order by is just not working as it seems to insert in a differnet order each time we run the query. Sometimes, we get what we expect but also NOT what we expect.

    Please note that we are not selecting from a view so the 'order by' view bug i assume does not affect this ??

    Query :

    use userdata

    go

    drop table BB_HOLB999

    go

    create table BB_HOLB999 (

    SYST varchar (10),

    CREF decimal (10),

    BDAT decimal(7,0))

    go

    create unique clustered index BI on BB_HOLB999(CREF,syst) with ignore_dup_key

    go

    insert into BB_HOLB999

    select 'OSCAR' as [syst], CREF, bkbdat

    from BB_TABLE join refdata.dbo.TDFBK on CREF=BKCREF

    where BKCOST > 0

    and BKPAX > 0

    and BKINV1 > 0

    and BKLINK is NULL

    and bktype is NULL

    and BKBSTS = ''

    and BKLAPD is null

    and BKBDAT<dmat

    and (BKDEPT in ('20','23','16','50','10','40','52','80')

    or (BKDEPT ='30' and BKPROD in ('Qp','39','8p','8q','8r','8s','8t','8u','8v','8w','8x','1A','9d','X2','AG','AH','AF','AI','6N','6A','8N','V1','U1')))

    order by bkbdat desc

    go

    When running the query below to check a given customer we find the result can be different.(i.e not the customers last booking !!)

    select *

    from BB_HOLB999

    where CREF='8458923'

    go

    Please please help as i have spent hours googling this and trying to understand why this is not working !!!!!!!!!

    Thanks In Advance

    Lee

  • The only thing that an order by does when used on an insert...select is to set the order of the identity columns (if there is one). If you want an ordered resultset, you need to put the order by on the select statement. Without an order by on the select, SQL is free to return the rows in any order it likes, regardless of the 'order' of rows in the table.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Excuse my ignorance, could you explain to me what you mean by an order by on the select statement please ?

    Why does this work in 2000 though ??

    Thanks In Advance

    Lee

  • Instead of putting the ORDER BY on the statement that inserts into BB_HOLB999, you need to put it onto the statement that retrieves from BB_HOLB999. So in the example you gave...

    create table BB_HOLB999 (

    SYST varchar (10),

    CREF decimal (10),

    BDAT decimal(7,0))

    go

    create unique clustered index BI on BB_HOLB999(CREF,syst) with ignore_dup_key

    go

    insert into BB_HOLB999

    select 'OSCAR' as [syst], CREF, bkbdat

    from BB_TABLE join refdata.dbo.TDFBK on CREF=BKCREF

    where BKCOST > 0

    and BKPAX > 0

    and BKINV1 > 0

    and BKLINK is NULL

    and bktype is NULL

    and BKBSTS = ''

    and BKLAPD is null

    and BKBDAT<dmat

    and (BKDEPT in ('20','23','16','50','10','40','52','80')

    or (BKDEPT ='30' and BKPROD in ('Qp','39','8p','8q','8r','8s','8t','8u','8v','8w','8x','1A','9d','X2','AG','AH','AF','AI','6N','6A','8N','V1','U1')))

    go

    select *

    from BB_HOLB999

    where CREF='8458923'

    ORDER BY BDAT DESC

    go

    As for why it worked on SQL 2000, luck probably more than anything. The whole SQL architecture changed quite a lot between 2000 and 2005 and there were a lot of things that changed. The order of results has never been guaranteed without an order by. If SQL 2000 behaved in a certain way, that's by chance and not a feature of the DB engine.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • All,

    Thanks for the replies but i am still at a loss here. Let me explain further and simplify the query and exactly what i am trying to do. I know there is another way of doing this that does work but i want to understand 'WHY' this worked in 2000 and 'WHY' it does not in 2008 !

    What i want to do (in English) :

    I want to insert into the table BB_HOLB999 the latest booking date for each customer and make use of the with ignore_dup_key to ensure the insert does not fail when it finds a duplicate customer number.

    use userdata

    go

    drop table BB_HOLB999

    go

    create table BB_HOLB999 (

    SYST varchar (10),

    CREF decimal (10),

    BDAT decimal(7,0))

    go

    create unique clustered index BI on BB_HOLB999(CREF,syst) with ignore_dup_key

    go

    Does the query below NOT inssert the data in descending order of booking date ??

    e.g. for our example of customer 8458923 who has booking dates as follows :

    990401

    1050912

    930604

    i would expect the 1050912 record to be inserted first and then because of the ignore_dup_key i would expect the next 2 records to be ignored. However, i can run this multiple times and sometimes i would get 1050912 in the table BB_HOLBB999 and sometimes i will not, hence, my question, what is going on ??

    insert into BB_HOLB999

    select 'OSCAR' as [syst], CREF, bkbdat

    from BB_TABLE join refdata.dbo.TDFBK on CREF=BKCREF

    where BKCOST > 0

    and BKPAX > 0

    and BKINV1 > 0

    and BKLINK is NULL

    and bktype is NULL

    and BKBSTS = ''

    and BKLAPD is null

    and BKBDAT<dmat

    and (BKDEPT in ('20','23','16','50','10','40','52','80')

    or (BKDEPT ='30' and BKPROD in ('Qp','39','8p','8q','8r','8s','8t','8u','8v','8w','8x','1A','9d','X2','AG','AH','AF','AI','6N','6A','8N','V1','U1')))

    order by bkbdat desc

    go

    Interestingly, if i specify the insert for just customer 8458923, it is correct every time. ie change the above query to the one below (so the order by appears to work !):

    insert into BB_HOLB999

    select 'OSCAR' as [syst], CREF, bkbdat

    from BB_TABLE join refdata.dbo.TDFBK on CREF=BKCREF

    where BKCOST > 0

    and BKPAX > 0

    and BKINV1 > 0

    and BKLINK is NULL

    and bktype is NULL

    and BKBSTS = ''

    and BKLAPD is null

    and BKBDAT<dmat

    and (BKDEPT in ('20','23','16','50','10','40','52','80')

    or (BKDEPT ='30' and BKPROD in ('Qp','39','8p','8q','8r','8s','8t','8u','8v','8w','8x','1A','9d','X2','AG','AH','AF','AI','6N','6A','8N','V1','U1')))

    and bkcref=8458923

    order by bkbdat desc

    go

  • leebuckman (7/28/2010)


    All,

    Thanks for the replies but i am still at a loss here. Let me explain further and simplify the query and exactly what i am trying to do. I know there is another way of doing this that does work but i want to understand 'WHY' this worked in 2000 and 'WHY' it does not in 2008 !

    Why it worked in 2000 - The 2000 engine did honour the order by in various situations where it shouldn't have and didn't need to (including within views, subqueries and inserts). This was a bug, a shortcoming in the engine. Why it does not work in 2008 - changes to the query execution engine to allow it to work more optimally and it will not do sorts in cases where they are not necessary.

    Remember that tables by definition DO NOT have an order. They are unordered sets of rows. Hence the order that data is inserted is irrelevent. The only time the presence of an order by affects an insert is when there's an identity column in the destination table. Then the Order By sets the sequence for that.

    I want to insert into the table BB_HOLB999 the latest booking date for each customer and make use of the with ignore_dup_key to ensure the insert does not fail when it finds a duplicate customer number.

    Now that's a lazy way of doing things if I ever heard of one. Why not just insert the rows that you want? That way you're not depending on undocumented behaviour which can and does change.

    If you'd mentioned that in the first post, i could have given you an efficient, working solution then.

    Does the query below NOT inssert the data in descending order of booking date ??

    No. Order of rows in a table is meaningless, so the SQL engine is under no obligation to do the order by, or to insert the rows in any order whatsoever. Especially the case if the query runs in parallel.

    Try this, which just inserts the rows that you actually need.

    insert into BB_HOLB999

    SELECT syst, CREF, bkbdat FROM

    (select 'OSCAR' as [syst], CREF, bkbdat, ROW_NUMBER() OVER (PARTITON BY CREF ORDER BY bkbdat DESC) AS BookingOrder

    from BB_TABLE join refdata.dbo.TDFBK on CREF=BKCREF

    where BKCOST > 0

    and BKPAX > 0

    and BKINV1 > 0

    and BKLINK is NULL

    and bktype is NULL

    and BKBSTS = ''

    and BKLAPD is null

    and BKBDAT<dmat

    and (BKDEPT in ('20','23','16','50','10','40','52','80')

    or (BKDEPT ='30' and BKPROD in ('Qp','39','8p','8q','8r','8s','8t','8u','8v','8w','8x','1A','9d','X2','AG','AH','AF','AI','6N','6A','8N','V1','U1')))

    ) sub

    WHERE BookingOrder = 1

    go

    That will insert just one row per CREF, the one with the latest booking date.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Once again, thanks for your reply. The query was sent to me by another team member who was doing it this way. I personally said to group by the customer ref and insert the maximum date using the max function !

    But the question remained, why did it work before and does not now, hence my post on here !

    Thankyou for your answer, i will go back to my colleague and give them the explanantion as to why this is the case.

    Best Regards

    Lee

  • leebuckman (7/28/2010)


    But the question remained, why did it work before and does not now, hence my post on here !

    Basically your colleague was depending on undocumented and incorrect behaviour that was fixed in a later version of SQL.

    The only time you can depend on ORDER BY to order rows is when it's applied to the outer query in a SELECT statement, so not in inserts, not in views, not in subqueries.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Some references for you to help you understand the things people have said:

    The behavior of the IDENTITY function when used with SELECT INTO or INSERT .. SELECT queries that contain an ORDER BY clause (Microsoft Knowledge Base)

    Blog entries by Conor Cunningham (currently a Principal Software Architect for the SQL Server Engine at Microsoft):

    No Seatbelt - Expecting Order without ORDER BY

    Ordering guarantees in SQL Server

    TOP 100 Percent ORDER BY Considered Harmful

    Paul

  • My colleague is asking me why the below works ?

    Interestingly, if i specify the insert for just one customer ie 8458923, it is correct every time. (so the order by appears to work !):

    insert into BB_HOLB999

    select 'OSCAR' as [syst], CREF, bkbdat

    from BB_TABLE join refdata.dbo.TDFBK on CREF=BKCREF

    where BKCOST > 0

    and BKPAX > 0

    and BKINV1 > 0

    and BKLINK is NULL

    and bktype is NULL

    and BKBSTS = ''

    and BKLAPD is null

    and BKBDAT<dmat

    and (BKDEPT in ('20','23','16','50','10','40','52','80')

    or (BKDEPT ='30' and BKPROD in ('Qp','39','8p','8q','8r','8s','8t','8u','8v','8w','8x','1A','9d','X2','AG','AH','AF','AI','6N','6A','8N','V1','U1')))

    and bkcref=8458923

    order by bkbdat desc

    go

    if the customer has 3 different booking dates it will ALWAYS insert the lastest one and then ignore the next two due to the ignore_dup_key but when removing the 'and bkcref=8458923' we are back to where we were with the randomness(which has now been explained), but why does it work when specifying just one customer number ?

  • leebuckman (7/28/2010)


    My colleague is asking me why the below works ?

    Interestingly, if i specify the insert for just one customer ie 8458923, it is correct every time. (so the order by appears to work !):

    Sheer, absolute luck.

    It may be that with a single customer SQL processes the rows in just the right way to get that result. This behaviour is NOT guaranteed. If it just happens to come out that way in a specific test it just means that it's happened to come out that way in that specific test at that specific time, with that specific data set. It may change tomorrow. It may change with a different data set. It may change with the next service pack/hot fix. It is not something that you can depend on always working.

    What's likely happening is that, due to the data volume and the indexes on the source table, the execution plan that is been used just happens to have the data arranged so that you get that specific result.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply