Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12345»»»

What's the best way to count? Expand / Collapse
Author
Message
Posted Monday, October 18, 2010 8:23 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 6:26 AM
Points: 2,003, Visits: 1,998
Comments posted to this topic are about the item What's the best way to count?
Post #1006683
Posted Monday, October 18, 2010 9:02 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Thursday, July 10, 2014 12:49 PM
Points: 1,676, Visits: 1,754
This is a very good question, thank you Dave. The key here is the index_id: if the table has a clustered index then there will be a record in the stats view with index_id = 1. If the table is a heap then it will have index_id = 0. There is no way to have both index_id 0 and 1, so index_id < 2 or like you have it index_id = 0 or index_id = 1 should suffice. Grouping is important because it possible to have partitioned table.

Oleg
Post #1006689
Posted Monday, October 18, 2010 11:43 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Monday, July 21, 2014 3:43 AM
Points: 1,938, Visits: 1,162
Good question.keep on posting this type of questions.

Malleswarareddy
I.T.Analyst
MCITP(70-451)
Post #1006725
Posted Tuesday, October 19, 2010 12:58 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 6:53 AM
Points: 2,889, Visits: 1,779
I can't see anything that says that the figure returned will be reliable. Fast yes, blindingly so. Reliable.....not so sure.

LinkedIn Profile
Newbie on www.simple-talk.com
Post #1006739
Posted Tuesday, October 19, 2010 2:09 AM
Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Friday, May 9, 2014 12:47 AM
Points: 3,448, Visits: 4,407
which query is the fastest and most reliable for returning an accurate record count?

(One of) the fastest? Yes. The most reliable? I don't think so, because that query counts both committed and uncommitted records, while the "SELECT COUNT(*)" statement counts only committed records.

Here is an example.

1) Create a table:
CREATE TABLE dbo.QOTD_TEST (ID INT);

2) Insert several records into the table, don't commit them:
BEGIN TRANSACTION;
INSERT dbo.QOTD_TEST VALUES(1);
INSERT dbo.QOTD_TEST VALUES(2);
INSERT dbo.QOTD_TEST VALUES(3);

3) In another query window (e.g. in another connection) run queries 2-4:
SELECT CONVERT(bigint, [rows])
FROM sys.sysindexes
WHERE id = OBJECT_ID('dbo.QOTD_TEST') AND indid < 2

SELECT CAST(p.[rows] AS float)
FROM sys.tables AS tbl
INNER JOIN sys.indexes AS idx ON idx.[object_id] = tbl.[object_id] And idx.index_id < 2
INNER JOIN sys.partitions AS p ON p.[object_id] = CAST(tbl.[object_id] AS int) And p.index_id=idx.index_id
WHERE ((tbl.name = N'QOTD_TEST' AND SCHEMA_NAME(tbl.schema_id) = 'dbo'));

SELECT SUM (row_count)
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('dbo.QOTD_TEST') AND (index_id=0 or index_id=1);

All these queries return the value 3.

4) After that, run the 1st query:
SELECT COUNT(*) FROM dbo.QOTD_TEST;

If the database READ_COMMITTED_SNAPSHOT option is set to ON, this query will immediately return the value 0. If READ_COMMITTED_SNAPSHOT = OFF, this query will wait until the locks are removed.

5) In 1st query window, roll back the transaction and drop the table:
ROLLBACK TRANSACTION;
DROP TABLE dbo.QOTD_TEST;

I think that the 1st query is the most reliable, while others are the fastest.
Post #1006773
Posted Tuesday, October 19, 2010 2:13 AM
SSChasing Mays

SSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing Mays

Group: General Forum Members
Last Login: Yesterday @ 4:25 AM
Points: 611, Visits: 479
I chose #1 because all the others appeared to be referencing indexes, and there was nothing in the question that stated the table had an index.
Maybe I'm missing something here...?
Post #1006775
Posted Tuesday, October 19, 2010 2:23 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 6:40 AM
Points: 5,916, Visits: 8,166
I liked the question - allthough with the exception of option 1 on really large (or busy) tables, all options will be blindingly fast, and you won't run this type of query often enough that a few microseconds more or less really matters.

I gave the right answer, but not for the reasons listed. The question specifically asked for "fastest and most reliable", so I focused on those aspects of the queries.

Option 1 scans the complete table (or one of the indexes). Absolutely reliable, but not fast. This answer can only be correct if all others are not.

Option 3 is not reliable. The Books Online entry for sys.partitions specifically describes the "rows" column as an "approximate number of rows in this partition". (The cast to float of the result and the cast to int of object_id are not wrong, though they are of course weird).

Options 2 and 4 are both reliable. (The explanation of the question says sysindexes can be unreliable. This is not supported by the referenced sources. The sysindexes system table was unreliable in SQL 2000 and before, but this has changed in SQL Server 2005 - here is a reference. Also, Books Online does not contain the words "approximate", "inaccurate", or anything similar for these views, as it does for sys.partitions.

Both the question explanation and the description in Books Online of sys.sysindexes mention that the sys.sysindexes compatibility view will be removed in future versions of SQL Server. This is very relevant for real work - but it was not a deciding factor in this question, so this should not be used to decide between options 2 and 4.
The same page in Books Online also contains this warning: "Partitioned tables and indexes are not fully supported in this view; use the sys.indexes catalog view instead." The table used in the example code is not partitioned, but I interpreted the code as examples for a randomly chosen sample table, and the question about the method in general. Since method 2 does not work for partitioned tables, it's not reliable.

Method 4 and method 1 are both reliable. Method 1 scans the table (or an index); method 4 uses systemm views - this one is much faster. That's why I chose method 4.



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #1006778
Posted Tuesday, October 19, 2010 2:25 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 6:40 AM
Points: 5,916, Visits: 8,166
Bob Cullen-434885 (10/19/2010)
I chose #1 because all the others appeared to be referencing indexes, and there was nothing in the question that stated the table had an index.
Maybe I'm missing something here...?

If the table has a clustered index, information is stored in the system views as the index with index id 1.
If the table has no clustered index, information is stored in the system views as the index with index id 0. Not a real index, but apprently Microsoft decided to take a practical approach and not use specific different system views for heaps (tables without clustered index).



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #1006779
Posted Tuesday, October 19, 2010 2:34 AM
SSChasing Mays

SSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing Mays

Group: General Forum Members
Last Login: Yesterday @ 4:25 AM
Points: 611, Visits: 479
Well, at least I got something right - I had missed something!!
Thanks, Hugo
Post #1006786
Posted Tuesday, October 19, 2010 2:57 AM
Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Friday, May 9, 2014 12:47 AM
Points: 3,448, Visits: 4,407
Hugo Kornelis (10/19/2010)
The same page in Books Online also contains this warning: "Partitioned tables and indexes are not fully supported in this view; use the sys.indexes catalog view instead." ... Since method 2 does not work for partitioned tables, it's not reliable.

The 2nd query works just fine for partitioned tables
That BOL page states what columns are not supported in case of partitioned tables: first, root, groupid, dpages, reserved, used, rowcnt, rowmodctr, and FirstAIM. The column 'rows' is not in this list, and it's supported.
Post #1006798
« Prev Topic | Next Topic »

Add to briefcase 12345»»»

Permissions Expand / Collapse