Select Where_Min()

  • Ramesh (1/16/2009)


    The only obvious reason I can think of is that the optimizer has to do a "Scalar Operation" for RANK() and ROW_NUMBER(), whereas it does an "Aggregate Operation" for the MIN() function....

    Agreed - the windowed aggregate functions end up forcing a lot of rewinds, which I don't think are necessary in the RANK() and ROW_NUMBER() versions.

    And - they are quite a bit faster for some reason....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
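
    For readers following along, here is a minimal sketch of the two patterns the thread is comparing: joining back to a grouped MIN() versus filtering on ROW_NUMBER() over a partition. It is written against the dbo.JBMTest test table that Jeff posts later in the thread (earliest SomeDate per SomeInt) and is an illustration only, not anyone's actual test code.

    --===== Pattern 1: join back to a grouped MIN()
    SELECT t.*
      FROM dbo.JBMTest t
      JOIN (SELECT SomeInt, MinDate = MIN(SomeDate)
              FROM dbo.JBMTest
             GROUP BY SomeInt) m
        ON m.SomeInt = t.SomeInt
       AND m.MinDate = t.SomeDate

    --===== Pattern 2: number the rows in each group and keep only the first one
    SELECT *
      FROM (SELECT *,
                   RowInGroup = ROW_NUMBER() OVER (PARTITION BY SomeInt ORDER BY SomeDate)
              FROM dbo.JBMTest) x
     WHERE RowInGroup = 1

    Note that the two are not identical when there are ties: the MIN() join returns every row that matches the group minimum, while ROW_NUMBER() keeps exactly one row per group (RANK() would keep the ties).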

  • Jeff Moden (1/16/2009)


    RBarryYoung (1/16/2009)


    It's weird actually. There's no obvious reason why RANK() and ROW_NUMBER() should be faster than MIN(), but they clearly are when you look at the differences in the execution plans.

    Heh... You, of all people, should know that the execution plan is a troubleshooting tool and can have nothing to do with reality insofar as which code will be more performant, especially if you're looking at that damnable "% of Batch" indicator. Unless you actually test it and measure, you cannot base decisions on which code is the most performant just by looking at the execution plan.

    I did test it Jeff. 🙂

    And the execution plans are a reliable indication of what SQL Server is doing with a query as long as you remember what its limitations are: cursors, while loops, recursion and UDFs. Those things it does not do well/right. But for normal old queries it is extremely reliable at telling you what the plan is (estimated row counts and costs are another story), which is what I based my statement on. You can see that it is doing additional lookups for the MIN().

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks RBarry, Matt, Ramesh and Jeff. This is good stuff, and exactly what I needed.

    Chad

  • RBarryYoung (1/16/2009)


    I did test it Jeff. 🙂

    And the execution plans are a reliable indication of what SQL Server is doing with a query as long as you remember what its limitations are: cursors, while loops, recursion and UDFs. Those things it does not do well/right. But for normal old queries it is extremely reliable at telling you what the plan is (estimated row counts and costs are another story), which is what I based my statement on. You can see that it is doing additional lookups for the MIN().

    Whew! I feel much better now. I didn't see a test run and thought "dang it Barry, ya just know better." Glad I was wrong. 🙂

    You are correct that the things you listed have a whacko effect on % of Batch. But there are other things, and I'm pretty sure that neither you nor I know them all. Until I'm guaranteed that I actually know them all, I just can't bring myself to trust the % of Batch numbers. For example...

    Test data...

    --===== Create and populate a 1,000,000 row test table.
         -- Column "RowNum" has a range of 1 to 1,000,000 unique numbers
         -- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
         -- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings
         -- Column "SomeMoney" has a range of 0.0000 to 99.9999 non-unique numbers
         -- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
         -- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'
         -- for all rows.
         -- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
         -- Jeff Moden
     SELECT TOP 1000000
            RowNum       = IDENTITY(INT,1,1),
            SomeInt      = ABS(CHECKSUM(NEWID()))%50000+1,
            SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
                         + CHAR(ABS(CHECKSUM(NEWID()))%26+65),
            SomeCSV      = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
            SomeMoney    = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
            SomeDate     = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
            SomeHex12    = RIGHT(NEWID(),12)
       INTO dbo.JBMTest
       FROM Master.dbo.SysColumns t1
      CROSS JOIN Master.dbo.SysColumns t2

    --===== A table is not properly formed unless a Primary Key has been assigned
      ALTER TABLE dbo.JBMTest
        ADD PRIMARY KEY CLUSTERED (RowNum)

    Test... run the estimated and actual execution plans of the code below. Then look at the Messages tab for a big surprise. What else can make the execution plan wrong, and do you know them all? If you do, you should write an article about it. Not being a smart a$$ either. It would make a great article.

    DECLARE @StartTime DATETIME

    SET @StartTime = GETDATE()
    SELECT TOP 10 *
      FROM dbo.JBMTest
     ORDER BY SomeLetters2
    PRINT DATEDIFF(ms,@StartTime,GETDATE())

    SET @StartTime = GETDATE()
    SET ROWCOUNT 10
    SELECT *
      FROM dbo.JBMTest
     ORDER BY SomeLetters2
    SET ROWCOUNT 0
    PRINT DATEDIFF(ms,@StartTime,GETDATE())

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Chad Crawford (1/16/2009)


    Thanks RBarry, Matt, Ramesh and Jeff. This is good stuff, and exactly what I needed.

    Chad

    Thanks for the feedback, Chad.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

    The Execution Plans, as reported, contain a lot of stuff, and we are really talking about two different things in them. You are talking about the cost estimates and row counts, which at their best are only as good as the statistics and the fixed costing factors (which are variable in reality). Given that the statistics and the costing factors have a lot of holes, it is no wonder that the costing estimates in Execution Plans are so wonky.

    On the other hand, I am talking about the plan itself: i.e., what SQL actually plans to do (or already did). Nested Loops here, table scan there, etc. For queries alone (that is, no procedural code), without recursion, this is almost always correct. And if it is not, it is extremely difficult to figure out what it really is doing.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

    Correct. But even the estimated execution plan is "wonky" in that respect. I don't have the code example to prove it, but I've seen where a Loop join gets changed to a Merge join. Ya just can't trust the execution plan. Good place to start, but don't put your faith in it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/17/2009)


    Correct. But even the estimated execution plan is "wonky" in that respect. I don't have the code example to prove it, but I've seen where a Loop join gets changed to a Merge join. Ya just can't trust the execution plan. Good place to start, but don't put your faith in it.

    Well, I prefer to use Actual Execution Plans which are not wonky in that respect. I would certainly like to see a repeatable example of this estimated to actual plan change though. I cannot say that I have ever seen an example of this that could not be attributed to statistics recalculation (which is of course not repeatable). This aspect of even estimated execution plans has been so consistent for me for so long that I am very comfortable in regarding them as reliable.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (1/17/2009)


    Jeff Moden (1/17/2009)


    Correct. But even the estimated execution plan is "wonky" in that respect. I don't have the code example to prove it, but I've seen where a Loop join gets changed to a Merge join. Ya just can't trust the execution plan. Good place to start, but don't put your faith in it.

    Well, I prefer to use Actual Execution Plans which are not wonky in that respect. I would certainly like to see a repeatable example of this estimated to actual plan change though. I cannot say that I have ever seen an example of this that could not be attributed to statistics recalculation (which is of course not repeatable). This aspect of even estimated execution plans has been so consistent for me for so long that I am very comfortable in regarding them as reliable.

    I agree... actual execution plans are better. I'll see if I can find the example I was talking about...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • RBarryYoung (1/17/2009)


    Jeff Moden (1/17/2009)


    Correct. But even the estimated execution plan is "wonky" in that respect. I don't have the code example to prove it, but I've seen where a Loop join gets changed to a Merge join. Ya just can't trust the execution plan. Good place to start, but don't put your faith in it.

    Well, I prefer to use Actual Execution Plans which are not wonky in that respect. I would certainly like to see a repeatable example of this estimated to actual plan change though. I cannot say that I have ever seen an example of this that could not be attributed to statistics recalculation (which is of course not repeatable). This aspect of even estimated execution plans has been so consistent for me for so long that I am very comfortable in regarding them as reliable.

    So there I was, hastily trying to finish this post last night while my wife was impatiently waiting for me to take her on our date. I briefly considered saying "hold on a few minutes, I have to make sure that I state this correctly or Jeff will find a hole in it and drive a million rows through it!" 🙂 Then the ol' Life Optimizer kicked in and pointed out that the cost of the internet IO saved by double-checking my post was going to be vastly exceeded by the cost of the IO in my personal life.

    Hit "Post Reply" and go.

    Of course now that I reread it, I can see that I have to clarify my earlier statement a bit.

    As I stated earlier (but forgot in this last post), as long as you stick to single statements (queries & DML), my statement is true: the plan is the same from estimated to actual. Indeed, since the execution plan is based solely on

    1) the query/DML,
    2) the metadata definitions,
    3) the database's "state" (statistics, etc.),

    then the only way that the plan should be able to change is if one or more of these things change.

    Now for single queries this should be extremely rare, but not impossible, because the statistics could change on a table in use. But that's not very repeatable, and I certainly would not cite it as a fault of execution plans.

    For procedural code this might seem just as unlikely; however, the story is really much different, because in procedural code you frequently change both the metadata (creating temp tables) and the statistics (loading tables, etc.). Here is a simple example:

    -- The estimated plan sees #temp empty; the actual plan for the GROUP BY SELECT
    -- is compiled after the INSERT has loaded it.
    CREATE TABLE #temp (foo INT)

    INSERT INTO #temp
    SELECT object_id FROM sys.columns

    SELECT foo, COUNT(*)
      FROM #temp
     GROUP BY foo

    DROP TABLE #temp

    You will note the change in the execution plan for the SELECT between the estimated plan generated before running the batch and the actual plan. The cause is obvious: the #temp table had no data in it when the estimated execution plan report was generated.

    This all gets back to T-SQL's general problem with procedural code. Sure, it can do procedural code correctly, but a lot of the support and tooling around it is spotty, and if you do it a lot (like for every row) it will be noticeably slow.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (1/18/2009)


    So there I was, hastily trying to finish this post last night while my wife was impatiently waiting for me to take her on our date. I briefly considered saying "hold on a few minutes, I have to make sure that I state this correctly or Jeff will find a hole in it and drive a million rows through it!" 🙂 Then the ol' Life Optimizer kicked in and pointed out that the cost of the internet IO saved by double-checking my post was going to be vastly exceeded by the cost of the IO in my personal life.

    Hit "Post Reply" and go.

    Wise choice. The Estimated Execution Plan for your Life Optimizer was veering dangerously close to a cigarette, a blindfold and a loud bang.

    Or maybe that's just my experience.

    ---------------------------------------------------------
    How best to post your question
    How to post performance problems
    Tally Table: What it is and how it replaces a loop

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • jcrawf02 (1/20/2009)


    Wise choice. The Estimated Execution Plan for your Life Optimizer was veering dangerously close to a cigarette, a blindfold and a loud bang.

    Or maybe that's just my experience.

    Heh, mine too. 🙂 My wife is patiently indulgent of my obsessions, but even she has limits... :unsure:

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Heh... limits are good. Neither my driving a million rows through something nor the speediest pork chop that I can muster up will ever have the impact that a miffed "war department" can render. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
