SELECT COUNT(*) vs DMVs to get row count

  • Eirikur Eiriksson (4/11/2015)


    sql-lover (4/11/2015)


    Grant Fritchey (4/11/2015)


    As far as I know, if you have to have a guaranteed, 100% accurate, count of the rows in the table, the way to do it is COUNT. Everything I know about all the other DMVs say that they'll get you close, if not spot on, most of the time. But it's that most bit that's an issue if you have to have a perfect set of counts. The thing is, why do you need a perfect set of row counts?

    So the script will run on empty sets only. Those are the only databases that can be done on the fly.

    Belt and braces then?

    😎

    SELECT COUNT(*)

    FROM dbo.TABLE_NAME WITH (TABLOCKX);

    This will lock the table while running. The data does not change so fast enough from zero to some rows to justify that hint.

    I decided I will use count with if exists.

  • sql-lover (4/11/2015)


    Eirikur Eiriksson (4/11/2015)


    sql-lover (4/11/2015)


    Grant Fritchey (4/11/2015)


    As far as I know, if you have to have a guaranteed, 100% accurate, count of the rows in the table, the way to do it is COUNT. Everything I know about all the other DMVs say that they'll get you close, if not spot on, most of the time. But it's that most bit that's an issue if you have to have a perfect set of counts. The thing is, why do you need a perfect set of row counts?

    So the script will run on empty sets only. Those are the only databases that can be done on the fly.

    Belt and braces then?

    😎

    SELECT COUNT(*)

    FROM dbo.TABLE_NAME WITH (TABLOCKX);

    ?

    This will lock the table while running. The data does not change so fast enough from zero to some rows to justify that hint.

    That's kind of the point I'm trying to make, if you can produce an example, on a modern version of Sql Server, where these conditions apply and sys,partitions are not providing the correct count, let me know!;-)

    😎

  • Eirikur Eiriksson (4/11/2015)


    sql-lover (4/11/2015)


    Eirikur Eiriksson (4/11/2015)


    sql-lover (4/11/2015)


    Grant Fritchey (4/11/2015)


    As far as I know, if you have to have a guaranteed, 100% accurate, count of the rows in the table, the way to do it is COUNT. Everything I know about all the other DMVs say that they'll get you close, if not spot on, most of the time. But it's that most bit that's an issue if you have to have a perfect set of counts. The thing is, why do you need a perfect set of row counts?

    So the script will run on empty sets only. Those are the only databases that can be done on the fly.

    Belt and braces then?

    😎

    SELECT COUNT(*)

    FROM dbo.TABLE_NAME WITH (TABLOCKX);

    ?

    This will lock the table while running. The data does not change so fast enough from zero to some rows to justify that hint.

    That's kind of the point I'm trying to make, if you can produce an example, on a modern version of Sql Server, where these conditions apply and sys,partitions are not providing the correct count, let me know!;-)

    😎

    Sorry, not sure what are you trying to establish here.

    I am not looking for how many rows exactly I have on a table. I am looking for those tables without any rows. Based on that and the knowledge that I have of my data, I know in advance that it is way more difficult and very unlikely that data will change from zero to something; it is more likely though, that data will change from x to y. So count, on this case, should be more than enough.

    My doubt was about the accuracy of the DMV.

    A table lock will add unnecessary overhead to the script.

  • Sean Lange (4/10/2015)


    sql-lover (4/10/2015)


    Is this an accurate way to get row count?

    SELECT

    t.name table_name,

    s.name schema_name,

    p.rows AS total_rows

    FROM sys.tables t

    join sys.schemas s on (t.schema_id = s.schema_id)

    join sys.partitions p on (t.object_id = p.object_id)

    WHERE p.index_id in (0,1) AND t.name='MyTable' AND p.rows=0

    vs the old and simple, SELECT COUNT(*) ? ...

    I'm about to query hundreds of databases to get row count for a particular table. The 2nd one looks more resource intensive to me, but I want to be sure the DMV provides accurate row information.

    Thanks in advance.

    If ALL of your statistics are refreshed first this should return an accurate count...or really close in nearly every circumstance.

    No.

    The DMV row count has nothing whatsoever to do with statistics. It is not affected in any way by when (or if) stats were last updated.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • jphillips 46546 (4/10/2015)


    Agree much less costly to just do a if exists (select top 1 from tablename)

    You don't need a TOP 1 in an EXISTS subquery.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Alan.B (4/10/2015)


    Ok, quick update... I did some research on this - it appears that sys.partitions does not guarantee an accurate row count. According to Microsoft, the rows column in sys.partitions "Indicates the approximate number of rows in this partition."

    Because that piece of documentation was copied almost verbatim from sysindexes in SQL 2000, which was prone to errors and inaccuracies. It was trivial in SQL 2000 to find cases where sysindexes's row count did not match Count(*) from table, in SQL 2005+, that should not happen with sys.partitions, if it does, it is a bug (from one of the dev team)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • sql-lover (4/11/2015)


    I am not looking for how many rows exactly I have on a table. I am looking for those tables without any rows.

    IF EXISTS (SELECT 1 FROM TableName)

    PRINT 'Table has rows'

    ELSE

    PRINT 'Table has no rows'

    It won't read more than one row from any table (EXISTS exits as soon as it has a result), so that's probably your most efficient option.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • sql-lover (4/11/2015)


    I decided I will use count with if exists.

    Count has no place in an IF EXISTS check. EXISTS doesn't care about the columns, hence why SELECT 1 is often used. You don't need to count at all if all you want to do is see if there's a row.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (4/13/2015)


    Sean Lange (4/10/2015)


    sql-lover (4/10/2015)


    Is this an accurate way to get row count?

    SELECT

    t.name table_name,

    s.name schema_name,

    p.rows AS total_rows

    FROM sys.tables t

    join sys.schemas s on (t.schema_id = s.schema_id)

    join sys.partitions p on (t.object_id = p.object_id)

    WHERE p.index_id in (0,1) AND t.name='MyTable' AND p.rows=0

    vs the old and simple, SELECT COUNT(*) ? ...

    I'm about to query hundreds of databases to get row count for a particular table. The 2nd one looks more resource intensive to me, but I want to be sure the DMV provides accurate row information.

    Thanks in advance.

    If ALL of your statistics are refreshed first this should return an accurate count...or really close in nearly every circumstance.

    No.

    The DMV row count has nothing whatsoever to do with statistics. It is not affected in any way by when (or if) stats were last updated.

    Good to know and thanks for the correction.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • GilaMonster (4/13/2015)


    Alan.B (4/10/2015)


    Ok, quick update... I did some research on this - it appears that sys.partitions does not guarantee an accurate row count. According to Microsoft, the rows column in sys.partitions "Indicates the approximate number of rows in this partition."

    Because that piece of documentation was copied almost verbatim from sysindexes in SQL 2000, which was prone to errors and inaccuracies. It was trivial in SQL 2000 to find cases where sysindexes's row count did not match Count(*) from table, in SQL 2005+, that should not happen with sys.partitions, if it does, it is a bug (from one of the dev team)

    This is why I still don't trust it. I probably should get over it but it is difficult to accept that one thing that was so inaccurate has been fixed and is now actually accurate. I got burned by those counts more than once in 2000 and have never fully accepted the fix in my mind. 😉

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Not fixed. Large portions of the storage engine were rewritten between SQL 2000 and 2005.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (4/13/2015)


    Not fixed. Large portions of the storage engine were rewritten between SQL 2000 and 2005.

    I would say that rewriting the storage engine is what "fixed" the problem.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • GilaMonster (4/13/2015)


    Sean Lange (4/10/2015)


    sql-lover (4/10/2015)


    Is this an accurate way to get row count?

    SELECT

    t.name table_name,

    s.name schema_name,

    p.rows AS total_rows

    FROM sys.tables t

    join sys.schemas s on (t.schema_id = s.schema_id)

    join sys.partitions p on (t.object_id = p.object_id)

    WHERE p.index_id in (0,1) AND t.name='MyTable' AND p.rows=0

    vs the old and simple, SELECT COUNT(*) ? ...

    I'm about to query hundreds of databases to get row count for a particular table. The 2nd one looks more resource intensive to me, but I want to be sure the DMV provides accurate row information.

    Thanks in advance.

    If ALL of your statistics are refreshed first this should return an accurate count...or really close in nearly every circumstance.

    No.

    The DMV row count has nothing whatsoever to do with statistics. It is not affected in any way by when (or if) stats were last updated.

    Ok Gail, now I'm curious... what populates those DMVs and what's the frequency they are populated? What's the mechanism that keep those in sync.

    If there is an official Microsoft link that explains it, I would love to read it.

  • GilaMonster (4/13/2015)


    sql-lover (4/11/2015)


    I am not looking for how many rows exactly I have on a table. I am looking for those tables without any rows.

    IF EXISTS (SELECT 1 FROM TableName)

    PRINT 'Table has rows'

    ELSE

    PRINT 'Table has no rows'

    It won't read more than one row from any table (EXISTS exits as soon as it has a result), so that's probably your most efficient option.

    That's cool, thanks. I will update my script.

  • sql-lover (4/13/2015)


    Ok Gail, now I'm curious... what populates those DMVs and what's the frequency they are populated? What's the mechanism that keep those in sync.

    No idea. Probably same thing that keeps indexes in sync with their tables and updates the other metadata, ie the internals of the query processor and/or storage engine

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 16 through 30 (of 38 total)

You must be logged in to reply to this topic. Login to reply