relative performance sizing/estimation

  • I'm trying to get a better understanding of relative performance expectations for different types of queries. For example, given a typical SS configuration, what type of response time would you expect for the following scenarios:

    * select 1 row of data from a single table which only has 1 row

    * make 10 select statements, 1 select statement against 10 different tables, with each of the 10 tables having only 1 row

  • You'd get a better idea if you'd actually write some tests, one theme you'll see often here is that folks are much more interested in measuring than estimating.

    But I'll guess maybe a few hundred milliseconds from a cold start for either one.

  • sqlguy-736318 (12/5/2016)


    I'm trying to get a better understanding of relative performance expectations for different types of queries. For example, given a typical SS configuration, what type of response time would you expect for the following scenarios:

    * select 1 row of data from a single table which only has 1 row

    * make 10 select statements, 1 select statement against 10 different tables, with each of the 10 tables having only 1 row

    If I'm understanding your question correctly, the answer is somewhere around: 00:00:00

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • sqlguy-736318 (12/5/2016)


    I'm trying to get a better understanding of relative performance expectations for different types of queries. For example, given a typical SS configuration, what type of response time would you expect for the following scenarios:

    * select 1 row of data from a single table which only has 1 row

    * make 10 select statements, 1 select statement against 10 different tables, with each of the 10 tables having only 1 row

    People normally ask such questions because there's a problem. What problem are you actually having?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • sqlguy-736318 (12/5/2016)


    I'm trying to get a better understanding of relative performance expectations for different types of queries. For example, given a typical SS configuration, what type of response time would you expect for the following scenarios:

    * select 1 row of data from a single table which only has 1 row

    * make 10 select statements, 1 select statement against 10 different tables, with each of the 10 tables having only 1 row

    Scenario 1 is totally trivial, and meaningless without a larger context. Assuming that scenario 2 is the "context", it too is rather meaningless, as I can't determine exactly what you mean without making a potentially unwarranted assumption. You state "make 10 select statements", and then add a comma, and then say "1 select statement against 10 different tables...", so I'm not sure if you mean repeat that last part 10 times, or that it represents the "10 select statements" by virtue of a 10-way JOIN. Given that both scenarios involve tables with exactly 1 row, neither scenario is in any way meaningful, as worrying about performance requires some kind of volume. One row is not meaningful unless you want to measure the cost of 1 row as part of the overall cost of a larger query, where the number of rows returned is considerably larger than 1. Since you made no mention of any kind of volume, your question has no useful meaning, as the cost of 1 row in any larger context is entirely dependent on that larger context, and meaningless without it.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • I'm trying to understand relative sizing for design conversation purposes. For example, in this context, would it be reasonable to state that?:

    * Selecting 1 row from 1 table with 1 row shouldn't take > 1ms

    * Selecting 1 row from 10 different tables, where each table has 1 row, shouldn't take > 10ms

  • sqlguy-736318 (12/6/2016)


    I'm trying to understand relative sizing for design conversation purposes. For example, in this context, would it be reasonable to state that?:

    * Selecting 1 row from 1 table with 1 row shouldn't take > 1ms

    * Selecting 1 row from 10 different tables, where each table has 1 row, shouldn't take > 10ms

    If this is more of an academic question about baseline performance metrics, then speak in terms of millions of rows. When dealing with single row tables, things like indexing, sargability of the query predicates, seeks vs. scans, recompilation, spooling, logical vs physical reads, etc. don't come into play.

    But the answer is no, relational databases do not scale linearly. It will be more along the lines of 1 row = 10ms, 1000 rows = 1 second, 1,000,000 rows = 1 minute, 10,000,000 rows = 1 hour, and 100,000,000 rows = 3 days (or until the spool fills tempdb and aborts). But it's not the volume of data but how it's joined, filtered, sorted, and indexed.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • sqlguy-736318 (12/6/2016)


    I'm trying to understand relative sizing for design conversation purposes. For example, in this context, would it be reasonable to state that?:

    * Selecting 1 row from 1 table with 1 row shouldn't take > 1ms

    * Selecting 1 row from 10 different tables, where each table has 1 row, shouldn't take > 10ms

    That is a rather dangerous conclusion to draw, as I suspect you would then use this to "estimate" queries solely based on the number of rows, and that isn't going to work very well when your query is more complex than a simple select of fields in a table. A design conversation needs to be about one heck of a lot more than just row counts. You have to ask questions like what kind of row size are we going to end up with and is the data going to be normalized, and how might those answers have a bearing on each other, given the specifics of the data in question. Trying too hard to generalize any form of performance is like putting the cart before the horse and then wondering why the horse wandered away while you were tending to the cart, or, like leaving the barn door wide open all night and expecting the animals you placed inside the barn to still be inside the next day. You need to understand your data and the business rules that need to be applied to it, and take all the objectives and the data design into account. It's not a simple equation.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sqlguy-736318 (12/6/2016)


    I'm trying to understand relative sizing for[font="Arial Black"] design conversation purposes[/font]. For example, in this context, would it be reasonable to state that?:

    * Selecting 1 row from 1 table with 1 row shouldn't take > 1ms

    * Selecting 1 row from 10 different tables, where each table has 1 row, shouldn't take > 10ms

    For "Design Conversation Purposes". I think a better conversation would be to say that the SLA from time of request at the GUI 'til the full return of the data is presented on the screen should never be more than 5 seconds (way too long for me but that's a normal max standard) for "singleton" lookups (the case for both your assumptions above) with the desired perception by the users being the impression that it's "nearly instantaneous", regardless of the number of rows in the table or tables. For complex reporting returns against tables with possibly millions of rows (not returning a million rows in the report of course), you need to decide on what the SLA maximum should be with the understanding that user perception is still the key. The closer to "instant" you can get, the fewer the resources on the database side you'll likely be using and the happier the users will be.

    To be sure, you should have good hardware throughout but performance truly comes from two things... great code and good database design. I'll also state that great code can usually overcome some of the hardware problems and database design problems. Conversely, you could have a room full of MPP appliances running against perfectly designed databases and still have major performance problems simply because of poor/crappy code. Also be keenly aware that ORM code frequently fits the signature of seriously crappy code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply