What's the best way to count?

  • mtillman-921105 (10/19/2010)


    Oracle's even harder to use.

    Hmm... Are you saying this as a developer or as a DBA?

  • vk-kirov (10/19/2010)


    mtillman-921105 (10/19/2010)


    Oracle's even harder to use.

    Hmm... Are you saying this as a developer or as a DBA?

    I'm saying this as a developer, but my understanding is that both are more difficult with Oracle.

    The greatest enemy of knowledge is not ignorance, it is the illusion of knowledge. - Stephen Hawking

  • Here is the execution resuls of the 1-st and 4-th query:

    --Query 1

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 76 ms.

    --Query 4

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 142 ms.

    So the Query #1 was faster.

  • Well, if I'm typing the query into a SSMS window, I'll go with number 1 and finish way before anyone else 😀

  • mtillman-921105 (10/19/2010)


    It's simple tasks like this that I think MS needs to work on. Judging by this question, most of us (yours truly included) can't even get an accurate count of rows in a table. That's just nuts! MS can't simply write us a function so we don't have to sweat the small stuff like this?

    If I saw any production code like 2, 3 or 4, the developer would get an ear full. I will stick with count(*) thank you very much.

  • Alexander Kuznetsov (10/19/2010)


    Did you verify how method 4 works under high concurrency? Can we get somewhat incorrect totals?

    Hi Alex,

    Examples of incorrect rsults for this method have already been posted in the disussion. Though "incorrect" depends on the goal here - do you need to know the number of order detail lines, or the number of rows that the table occupies at moment X. In the latter case, I'd argue that uncommitted rows HAVE to be counted.

    In the former case, the only truly reliable way is method #1 with either snapshot isolation, or a table lock.

    mtillman-921105 (10/19/2010)


    It's simple tasks like this that I think MS needs to work on. Judging by this question, most of us (yours truly included) can't even get an accurate count of rows in a table. That's just nuts! MS can't simply write us a function so we don't have to sweat the small stuff like this?

    I think I know why they can get away with it. It's because Oracle's even harder to use. :hehe:

    Note that you only run into problems under high concurrency. Oracle has less of those problems, because it uses snapshot isolation by default. You can set the same default in SQL Server (read_committed_snapshot). This does come at the expense of extra overhead, which is why this option is disabled by default in SQL Server.

    Richard Sisk (10/19/2010)


    Well, if I'm typing the query into a SSMS window, I'll go with number 1 and finish way before anyone else 😀

    Not if it's a 20TB table 😛


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • mtillman-921105 (10/19/2010)


    I think I know why they can get away with it. It's because Oracle's even harder to use. :hehe:

    Very true indeed. I did have quite few unfortunate years in my career when I had to work with Oracle databases. I remember it used to really frustrate me that the select count(*) from the_table; in Oracle takes forever longer than the similar query against similar table in SQL Server. Of course there was a decent workaround to NEVER use count(*), but opt for a much better performing count('X') instead, but still it was frustrating.

    Even though now I work with SQL Server and do undertand that count(*) is basically just as good as count(some_literal) because scan is a scan, I still remember my dark days and opt for some_literal even in T-SQL. I don't go for count('X') because it emits a very strong and offensive Oracle odor, but opt for count(1) just in case. 🙂

    Oleg

  • kevin.l.williams (10/19/2010)


    mtillman-921105 (10/19/2010)


    It's simple tasks like this that I think MS needs to work on. Judging by this question, most of us (yours truly included) can't even get an accurate count of rows in a table. That's just nuts! MS can't simply write us a function so we don't have to sweat the small stuff like this?

    If I saw any production code like 2, 3 or 4, the developer would get an ear full. I will stick with count(*) thank you very much.

    Maybe you're right for most everyday applications. I just tested SELECT COUNT(*) on a table with 5,900,000 rows and it was almost immediate. I think I'll stick with that too.

    I think that I was being too hard on MS earlier since COUNT(*) is accurate, even if it can be slow in some circumstances.

    The greatest enemy of knowledge is not ignorance, it is the illusion of knowledge. - Stephen Hawking

  • I also tried all four queries and found that query 1 (SELECT COUNT(*)) was marginally faster in the Client Statistics! Possibly its server dependent?

    q4 q3 q2 q1

    Client Execution Time14:55:2214:54:4514:54:2714:54:05

    q4 q3 q2 q1

    14:52:1314:51:5214:51:3914:51:23

    Ron K.

    "Any fool can write code that a computer can understand. Good programmers write code that humans can understand." -- Martin Fowler

  • good question. Thanks

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • I answered it correctly because I knew what the author was talking about, but for myself I will stick with COUNT(*). (It runs in under 2 seconds on our 70 GB table, so speed isn't that big of an issue.)

    Also, most code completion tools won't help you with the schema and table name when writing the alternate count queries.

  • Oleg Netchaev (10/19/2010)


    Very true indeed. I did have quite few unfortunate years in my career when I had to work with Oracle databases. I remember it used to really frustrate me that the select count(*) from the_table; in Oracle takes forever longer than the similar query against similar table in SQL Server. Of course there was a decent workaround to NEVER use count(*), but opt for a much better performing count('X') instead, but still it was frustrating.

    I don't know that that is true anymore, at least with Oracle 10g. Count(*) is still slow compared to SQL Server, but I find that Count('X') is even slower. (36 secs vs. 80 secs in my test.)

  • UMG Developer (10/19/2010)


    I don't know that that is true anymore, at least with Oracle 10g. Count(*) is still slow compared to SQL Server, but I find that Count('X') is even slower. (36 secs vs. 80 secs in my test.)

    Fortunately for me, I don't know anything about Oracle past version 8i and 9. Perhaps they figured out how to speed up their count(*) by now (it is 21st century after all). The bottom line is that the execution time of Oracle's count(*) is still inferior when compared with SQL Server 2000 or better.

    Oleg

  • The Question and the discussion is awesome.

  • mtillman-921105 (10/19/2010)


    kevin.l.williams (10/19/2010)


    If I saw any production code like 2, 3 or 4, the developer would get an ear full. I will stick with count(*) thank you very much.

    Maybe you're right for most everyday applications. I just tested SELECT COUNT(*) on a table with 5,900,000 rows and it was almost immediate. I think I'll stick with that too.

    I think that I was being too hard on MS earlier since COUNT(*) is accurate, even if it can be slow in some circumstances.

    COUNT(*) is only guaranteed accurate if your isolation level is REPEATABLE READ, SERIALIZABLE, or SNAPSHOT (or of course if you use HOLDLOCK).

    Tom

Viewing 15 posts - 31 through 45 (of 57 total)

You must be logged in to reply to this topic. Login to reply