Query cost

  • CirquedeSQLeil (1/26/2010)


    ... if one examines execution plans for these two queries and run them together, the query optimizer treats them as the same execution plan and equates both queries to the same cost.

    I agree, so it comes down to what is meant by 'cost effective'. I got the answer wrong, but I was using the above meaning of 'cost effective'.....so I feel hard done by πŸ™

  • Kelsey Thornton (1/27/2010)


    CirquedeSQLeil (1/26/2010)


    if one examines execution plans for these two queries and run them together, the query optimizer treats them as the same execution plan and equates both queries to the same cost.

    Hmmmm. Wonders if they are optimised to a "(s)lowest common denominator" πŸ™‚

    (Being a bit of a lightweight here I don't know how to examine the execution plans. Is that part of the SQL Profiler?)

    You can click on the Include Actual Execution Plan Icon in Management Studio before running your query. You will get the actual execution plan in the result.

    Clicking on the Display Estimated Execution Plan displays the Estimated Execution Plan.

    Thanks

    Saurabh Dwivedy
    ___________________________________________________________

    My Blog: http://tinyurl.com/dwivedys

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537

    Be Happy!
  • Hugo Kornelis (1/27/2010)


    If I critisize a QotD, I always try to maintain a positive tone. Especially since, after having submitted some of my own, I know how hard it is to create one, and how impossible it is to satisfy everyone.

    This question makes it very hard for me to stay positive, because it actually shows a severe lack of understanding of the subject matter by the question author. I'll just enumerate my the issues.

    1. The schema of the tables has not been supplied. In questions like this, that may be of the utmost importance. For isntance, had the question been about NOT IN versus NOT EXISTS, than the queries would only have been equivalent if student.teacher_id is not nullable. I must admit that after trying some schema variations, I have not yet found one where the query performance of these particular queries is affected by the schema, but I only tried a few ones so I can't exclude the possibility.

    2. The author obviously has not bothered to check his ideas. I did (as indicated above). And with all the schema variation I tried, the two queries were executed using the EXACT SAME execution plan. The query optimizer obviously sees that these two queries are equivalent, so they are processed the same. And as a result, there can never be any performance difference. "Both are equal" is the correct answer, as based on this evidence - but there is no guarantee whatsoever that the same holds on all versions of SQL Server, on all possible variations of hardware, and with all possible data distributions.

    3. Performance related questions are always disputable because, as indicated above, there are so many factors involved in query optimization that it is almost impossible to predict what the optimizer will do with a query. And it will not always be the same either. Even on the same system, results may change overnight for no apparent reason (happened to me yesterday in the DB I'm working on - a stored proc that suddenly took many minutes to complete).

    Other, minor issues are the unneeded brackets around [name] (name is not on the list of reserved keywords, so no delimitation required); the strangely popular but really rather odd EXISTS 1 instead of EXISTS * (EXISTS checks for rows, not values, so what you put there is immaterial - except that * is the standard thhat anyone understands immediately while EXISTS(SELECT 1 makes everybody pause to think); and the broken link in the explanation (the two links both point to the same page).

    Bottom line - the only truly correct answer is "it depends". Of the options give, "both are equal" is almost correct. The other two options are plain nonsense.

    Presence of NULL values in the columns being compared can make a difference to the execution plans if the query were comparing NOT EXISTS and NOT IN constructs.

    In case the TeacherID column contained a NULL Value - Not IN and Not Exists would have showed different execution plans (this is of course assuming that TeacherID allows for NULLS - a schema dependent condition). However since the query is checking for plain Exists versus IN, this condition is inapplicable from the start. But I still wanted to make this point.

    Saurabh Dwivedy
    ___________________________________________________________

    My Blog: http://tinyurl.com/dwivedys

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537

    Be Happy!
  • Saurabh Dwivedy (1/27/2010)


    Kelsey Thornton (1/27/2010)


    (Being a bit of a lightweight here I don't know how to examine the execution plans. Is that part of the SQL Profiler?)

    You can click on the Include Actual Execution Plan Icon in Management Studio before running your query. You will get the actual execution plan in the result.

    Clicking on the Display Estimated Execution Plan displays the Estimated Execution Plan.

    Thanks

    Thanks - I've learned my something for today - I can go back to sleep πŸ˜€

    Kelsey Thornton
    MBCS CITP

  • This appears to be based on the silly assumption that your machine costs something and your time costs nothing!!!

  • Initially I was going to answer that the Exists clause would return more cost effectively just because I know that in certain situations it does behave better. However, not knowing what the data looked like that the QOTD author used, I could only run the test using my own data. The execution plans were exactly the same on my data. I had read before that SQL 2005 usually treats the Exists and In statements the same so I thought that this might have been the correct answer. Obviously I was incorrect in that assumption.

    while I am not asking for points on this one, I would like to see QOTD authors give some more schema information on questions related to performance as these factors can definitely impact the outcome.

  • Hugo Kornelis (1/27/2010)


    the unneeded brackets around [name] (name is not on the list of reserved keywords, so no delimitation required)

    Sometimes I use those unneeded brackets too, because SSMS highlights "name" in blue as if it were a keyword πŸ™‚

  • mike.anderson 52709 (1/27/2010)


    honza.mf (1/27/2010)


    CirquedeSQLeil (1/27/2010)


    honza.mf (1/27/2010)


    I answered OK, but...

    The first variant with IN can be more effective, it allways depend on data. Compare two queries without any knowledge about structure of tables, indexes, possible data distributions...

    This is a very valid point - the answer really is an It Depends kind of answer.

    Thanks.

    I must say, I think you can infer that that teacher_id is a PK / FK, but I'm not sure that it makes any difference. Both queries boil down to an inner join between the whole of both tables.

    Since all of the data in each table is involved, and the same columns are used in both cases, could you suggest some examples in which the structure / data / indexes would make a difference to these queries?

    I'm not sure, I haven't examined execution plans on different SQL Server versions.

    Just as a suggestion: both teacher_id columns nullable and ANSI NULLS set to off. Well, little bit strange, but I met more horrible combinations.



    See, understand, learn, try, use efficient
    Β© Dr.Plch

  • honza.mf (1/27/2010)


    mike.anderson 52709 (1/27/2010)


    I must say, I think you can infer that that teacher_id is a PK / FK, but I'm not sure that it makes any difference. Both queries boil down to an inner join between the whole of both tables.

    Since all of the data in each table is involved, and the same columns are used in both cases, could you suggest some examples in which the structure / data / indexes would make a difference to these queries?

    I'm not sure, I haven't examined execution plans on different SQL Server versions.

    Just as a suggestion: both teacher_id columns nullable and ANSI NULLS set to off. Well, little bit strange, but I met more horrible combinations.

    Trying this out, I still get execution plans that are on the face of it identical (they must be different, because the result sets are different); the cost is very slightly different (3.15002 vs. 3.1484) [SQL 2005; 20,000 rows and 60,000 row; ~10% NULLs in each column].

  • mike.anderson 52709 (1/27/2010)

    Trying this out, I still get execution plans that are on the face of it identical (they must be different, because the result sets are different); the cost is very slightly different (3.15002 vs. 3.1484) [SQL 2005; 20,000 rows and 60,000 row; ~10% NULLs in each column].

    And there is the other point of view: The qeuries are not identical. Compare two queries with different results on effectivity only is not a good idea.

    Once more, my suggested conditions are strange, but not prohibited by the question.



    See, understand, learn, try, use efficient
    Β© Dr.Plch

  • honza.mf (1/27/2010)


    mike.anderson 52709 (1/27/2010)

    Trying this out, I still get execution plans that are on the face of it identical (they must be different, because the result sets are different); the cost is very slightly different (3.15002 vs. 3.1484) [SQL 2005; 20,000 rows and 60,000 row; ~10% NULLs in each column].

    And there is the other point of view: The qeuries are not identical. Compare two queries with different results on effectivity only is not a good idea.

    Once more, my suggested conditions are strange, but not prohibited by the question.

    So what you're saying is that the question doesn't prohibit a conditions in which the question itself is meaningless? πŸ˜‰

  • Well, the whole point here probably is that SQL Server nowadays has no issue transforming many query types (especially these simple two liners), which are semantically equivalent. Therefore the plans are equal. But if the processor would not transform the queries that much and just use the logical plan that was given by the user, I would say that query two performs better than query one under the following assumptions:

    1. There are lots of students with not so much teachers.

    2. Teachers would be a table scan (since there are not so many)

    3. Exists for each teacher would be an index seek (quick and not many due to the limited amount of teachers) to determine if a student exists.

    4. IN would require to scan the whole student index which is huge compared to the teachers table (imagine a school with trillions of students but only 4 teachers πŸ™‚ before correlating back to the teachers table. Again here I am neglecting the fact that SQL Server is intelligent enough not to do this and to choose a different plan.

    Best Regards,

    Chris BΓΌttner

  • I've awarded back all points and then changed the question to just have a right answer and wrong answer.

    Apologies. This one slipped by

  • I have found that both queries are sematically equivalent, and perhaps the better answer could have been C (both) (FYI, I choose B and got it right). However, the real answer would actually be, it depends. If the result set returned in the first query is large enough it is possible that it would not be as efficient as the second query.

  • Was it my browser or were we being a bit sarcastic in reference to a couple of last week's questions. Because the answers that were provded when I pulled up the question were, wrong answer, right answer, wrong answer with exactly that wording, funny enough right answer did show up as the correct answer.

Viewing 15 posts - 16 through 30 (of 83 total)

You must be logged in to reply to this topic. Login to reply