Get record count for a specific database

  • Comments posted to this topic are about the item Get record count for a specific database

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • Hi,

Works very nicely, even when counting tables with millions of records!

Thanks!

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
http://www.sqlservercentral.com/articles/Best+Practices/61537/

Just an innocent question: in order for the query to return a correct row count, don't the statistics on the tables' indexes need to have been updated just before collecting this information?

I have seen in the past that the count doesn't quite match the actual row count. Can you explain the reason behind it?

  • Dugi,

Thanks for the compliment. Most of the credit must go to ‘meird’; I simply had an idea that could possibly improve the usability of the results and provide a more readable structure. And who wouldn’t want a grand total at the end? 🙂

    Thanks,

    James Rea

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • Cupidking,

I apologize for the delay in responding; I wanted to ensure I was thorough in my research and testing so I could give you my best answer. With that said, keep in mind that this is an opinion based on my independent testing.

I’ve tried researching the subject to see if I could find any evidence that would support or refute the requirement that index statistics be updated.

My research indicates that stats could need to be updated in SQL Server 2000 (especially prior to the release of SP 2); however, to get an incorrect reading you’d have to update a table and then query the count in ‘sysindexes’ within the SAME transaction. Microsoft resolved this in the release of SP 2 for SQL Server 2000. There was another anomaly in that Service Pack release that could also produce a miscount of rows, particular to the ‘sysindexes.rowmodctr’ column. I am including links to those two Knowledge Base articles for your reference; if you are trying to use this script on SQL Server 2000, they may help:

    KB 308822: http://support.microsoft.com/kb/308822

    KB 317847: http://support.microsoft.com/kb/317847

In either case, this doesn’t apply to my script, because the script was designed for use with SQL Server 2005, not SQL Server 2000.

With that said, I am a firm believer that nothing is perfect, and there must be a way to break the script (in this case, to make it miscount the records). I attempted a few different things, and all passed with 100% accurate results. The best method I could think of to create a miscount was to create a test table, load 1,000,000 records into it, and verify the count was accurate, all within the same transaction (the SQL 2000 problem would have resulted in a count of 0); this confirms the previous problem has been resolved. I then used a script that creates the test table, loads 1,000,000 records, deletes 300,000+ of them, and then immediately performs a record count. NOTE: All actions were performed within as few commits as possible; in particular, deleting the records and getting the record count were done in ONE committed transaction. My results came back with 100% accuracy. I am including 2/3 of the script (I left out the code for the ‘Get Record Count’ because you will obviously have it already).

From my testing and results, I can only deduce that the stats, if they are a factor at all, do not need to be updated manually. Again, this is all informal testing and I may very well be incorrect, but I have nothing that proves a need to update the stats. By all means, I don’t believe a miscount is impossible; someday I may come across a method that does produce an incorrect count, and I’ll update the script or submit a new one to address it. If you can provide any additional information, specific settings, or a series of events that can replicate this in SQL Server 2005, I’d be happy to try to reproduce it and work out why it occurs.

The following is the script I used for testing. For reference (yours, and that of anyone else who comes across this topic), I used SQL Server 2005 Developer Edition (Product Version: 9.00.3054.00, Product Level: Service Pack 2, IsClustered = 0).

--***START GET RECORD COUNT BREAKING SCRIPT***
USE [TEST_DB];
GO

--Detect if table already exists;
--if so then delete the existing table
--(note: OBJECT_ID's type code for a user table is 'U')
IF OBJECT_ID('dbo.abc_TestCount', 'U') IS NOT NULL
    DROP TABLE dbo.abc_TestCount;
GO

--Create new table (should not be
--included in an index since it is new)
CREATE TABLE abc_TestCount
(
    --Be sure to avoid UNIQUE
    --(UNIQUE would trigger indexing automatically)
    TestID INT NOT NULL,
    TestText nvarchar(25) NOT NULL,
    TestDate datetime NOT NULL
);

--Declare variables
DECLARE @tmpCount INT
DECLARE @intCount INT
DECLARE @stTest nvarchar(25)
DECLARE @stTestMod nvarchar(25)
DECLARE @dtDate datetime
DECLARE @dtDateMod datetime

--Set variables
SET @tmpCount = 0
SET @intCount = 1
SET @stTest = 'Testing # '
SET @dtDate = GETDATE()

--Loop to insert test records;
--1 million records should be sufficient
WHILE @tmpCount < 1000000
--BEGIN/END is unavoidable because of the loop, and
--running 'GetRecordCount' inside the loop over 1 million
--times can't be justified; keep measurements realistic
BEGIN
    SET @tmpCount = @tmpCount + 1
    SET @stTestMod = @stTest + CAST(@intCount as nvarchar(25))
    SET @dtDateMod = DATEADD(day, 1, @dtDate)
    SET @intCount = @intCount + 1
    INSERT INTO dbo.abc_TestCount
    VALUES (@tmpCount, @stTestMod, @dtDateMod);
END

--Immediately delete 300,000+ records before the record count;
--this should simulate an 'out-of-date' index
--if an index somehow snuck by
DELETE FROM dbo.abc_TestCount
WHERE TestID > 305675 AND TestID < 623456

--Now you would put the T-SQL to get the record count.
--Remember: don't use any additional commits/GO commands
--between this T-SQL and the Get Record Count.
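For context, the record count the script leads into would typically come from catalog metadata rather than a COUNT(*) scan. The following is only a minimal sketch of such a query on SQL Server 2005, using the sys.partitions catalog view; the choice of view is my assumption, and the article's actual script may read sysindexes instead:

```sql
-- Per-table row counts from catalog metadata (SQL Server 2005).
-- index_id 0 = heap, index_id 1 = clustered index, so summing over
-- those two gives exactly one total per table.
SELECT o.name      AS TableName,
       SUM(p.rows) AS RecordCount
FROM sys.partitions AS p
JOIN sys.objects    AS o ON p.object_id = o.object_id
WHERE p.index_id IN (0, 1)
  AND o.type = 'U'          -- user tables only
GROUP BY o.name
ORDER BY o.name;
```

Because this reads metadata, it returns in milliseconds regardless of table size, which is why it suits the miscount test above: any staleness in the metadata would show up as a wrong count.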

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

I found this undocumented statement that works really well.

sp_MSForEachTable 'sp_spaceused "?"'

This statement will return a count for each table in your database / catalog.

Make sure your query results option is set to "text" instead of "grid", especially if you have a large number of tables.

    enjoy

    John Esraelo

    Cheers,
    John Esraelo

  • John Esraelo (2/9/2008)


I found this undocumented statement that works really well.

sp_MSForEachTable 'sp_spaceused "?"'

This statement will return a count for each table in your database / catalog.

Make sure your query results option is set to "text" instead of "grid", especially if you have a large number of tables.

    enjoy

    John Esraelo

    John,

That's a great stored procedure to use for iterating through all the tables. In fact, that stored procedure was introduced back in SQL Server 6.5; it has always been undocumented.

I chose not to use this procedure because you can’t ORDER the results, SUM them, or omit system tables (e.g. sysdiagrams) without resorting to a temp table and performing a SELECT on it. Without those features it isn’t very helpful for a database containing millions of records and hundreds or thousands of tables, because the default output order isn’t logical for reporting or reading purposes.

To make this into a usable method, you would need to create a temporary table, INSERT INTO that temp table, and then perform a SELECT to pull the information you want, present it in the ORDER you find most useful, SUM the results, and omit any tables you would like.

The problem I have with that is it increases your overhead by more than 500% (based on a small database with 160 tables and 3,000,000 records); I couldn’t imagine the overhead on a medium-sized database, let alone a large one. It’s a scary thought in a world where performance is everything!

A great way to see this is to run my script and the method below against the AdventureWorks database; you may be surprised to find the execution times are very different…and that’s with only 70 tables and a little over 500,000 records! Imagine a terabyte-sized database?!

An example that gets the same information as my script, while allowing the ability to ORDER, SUM, and omit, would be:

USE DATABASE_NAME;
GO

CREATE TABLE #t_rCount (vTableName varchar(50), rcount int);

EXEC SP_MSForEachTable
    'INSERT INTO #t_rCount SELECT "?", count(*) from ?';

SELECT * FROM #t_rCount
WHERE vTableName <> '[dbo].[sysdiagrams]'
ORDER BY vTableName
COMPUTE SUM(rcount);

DROP TABLE #t_rCount;

I would discourage using this stored procedure alone if you need any ORDER or aggregation performed, or if you need to place the output in a report or a logically readable format (i.e. sorted by table name or row count). I would also highly discourage pairing it with a temp table, as the baseline performance overhead is well beyond any acceptable level.

    I hope this makes it clear as to why I had chosen not to use the ‘SP_MSForEachTable’ stored procedure.

    Thanks,

    James Rea

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

I like using sp_msforeachtable for certain quick-and-dirty work.

As you know, sp_msforeachtable is a base procedure that takes a command and runs it against each table, meaning you can run other procedures through it.

You are absolutely right; there are times you don't want to push the server into doing silly things and wasting system resources.

One use for this sp would be something along the lines of re-indexing all the tables at 2:00 am. I have tried that with a couple hundred tables, two or three indexes on each table, and an average of a few tens of thousands of records per table, and it finished in 25 seconds.

Your experience shows that you have dealt with many great databases, and I can obviously learn a few things from you.

Thanks for the reply; I look forward to learning more cool things from you on SQL Server Central.

    John Esraelo

    Cheers,
    John Esraelo
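The overnight re-indexing use John describes could be sketched roughly as follows. This is a hypothetical example, not John's actual job; DBCC DBREINDEX was the usual rebuild tool on SQL Server 2005:

```sql
-- Rebuild all indexes on every user table in the current database.
-- sp_MSforeachtable substitutes each table name for the '?' placeholder,
-- so no explicit cursor code is needed.
EXEC sp_MSforeachtable 'DBCC DBREINDEX(''?'')';
```

Scheduled as a SQL Server Agent job at 2:00 am, a one-line command like this touches every table in the database.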

  • John Esraelo (2/13/2008)


I like using sp_msforeachtable for certain quick-and-dirty work.

As you know, sp_msforeachtable is a base procedure that takes a command and runs it against each table, meaning you can run other procedures through it.

You are absolutely right; there are times you don't want to push the server into doing silly things and wasting system resources.

One use for this sp would be something along the lines of re-indexing all the tables at 2:00 am. I have tried that with a couple hundred tables, two or three indexes on each table, and an average of a few tens of thousands of records per table, and it finished in 25 seconds.

Your experience shows that you have dealt with many great databases, and I can obviously learn a few things from you.

Thanks for the reply; I look forward to learning more cool things from you on SQL Server Central.

    John Esraelo

    John,

Thanks for the compliment; I just have different experiences. I am sure I can, and will, learn as much from you.

I agree 100% that SP_MSForEachTable is a great method for small, non-complex tasks that require iterating through the database. Indexing, believe it or not, is not something I've thought of using it for...go figure, I've used the good ol' fashioned SSIS scripting trick...but I've also liked the ability to log the results (success or failure) to a flat file for later review. I'll definitely give SP_MSForEachTable a shot on one of my test systems to see the benefits.

    I look forward to many, many future discussions with you as well.

    Thanks,

    James Rea

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

Thanks, guys, for all your solutions. And thanks to the author for the article.

    😛

  • hi kahe laphare mein pada hai be

  • d----- (10/7/2008)


    hi kahe laphare mein pada hai be

English only, no other languages please!

    :w00t::w00t::w00t::w00t::w00t::w00t:

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
http://www.sqlservercentral.com/articles/Best+Practices/61537/

When I read the article, I was thinking, "why would anyone want to know the total number of records in a database?" Then I ran the code and it listed each table and its record count. I had misunderstood the point. Age does that to you. :hehe:

Anyway, thanks for the code; I will save it for later use. The next time somebody wants to know the number of records in a table, voila, I will use this.

    I just got an idea. I will build a .NET front end for this... 😎

  • Thank you for the compliment.

It absolutely is wonderful to see the row count for each table; the final (total) count is very useful in cases of replication and/or migrating databases to new servers. In some cases I've found myself replicating a database with thousands of tables in it, and reviewing the record count of each table is a very daunting task.

In my particular instance the replication was done multiple times a day; in that scenario, 90% of the time I would review only the total number of records in the entire database for both Publisher A and Subscriber A. If there was a discrepancy, I'd delve further down into the table-by-table comparison counts. Of course, this means running the script once each on the Publisher and Subscriber databases. To ensure the best possible accuracy, every week or two I'd go through the task of comparing each table (even if the final counts matched), just to be 100% sure nothing had gone awry in the databases.

I've developed a somewhat more useful script that does a direct comparison of tables between two servers, in a very similar format. It provides a table-by-table direct comparison; it's really only missing the ability to detect missing tables (where Publisher A has the table and Subscriber A does not, or vice versa). I'll be providing it to this website in the upcoming week or so; keep an eye out for it. It may prove useful in a .NET front-end application (as opposed to running the script separately on two servers and pulling the data together)...if that's the intention of the app you have in mind.
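A rough sketch of what such a side-by-side comparison could look like, assuming per-table counts from each server have already been collected into two temp tables (all names here are hypothetical, not the actual script):

```sql
-- Hypothetical: #PubCounts and #SubCounts each hold (TableName, RecordCount)
-- gathered from the Publisher and Subscriber respectively.
SELECT p.TableName,
       p.RecordCount                 AS PublisherCount,
       s.RecordCount                 AS SubscriberCount,
       p.RecordCount - s.RecordCount AS Difference
FROM #PubCounts AS p
JOIN #SubCounts AS s ON p.TableName = s.TableName
WHERE p.RecordCount <> s.RecordCount
ORDER BY p.TableName;
```

Note that the inner join illustrates the limitation mentioned above: a table present on only one server simply drops out of the result rather than being flagged.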

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • d----- (10/7/2008)


    hi kahe laphare mein pada hai be

    Please do not waste your time here.

Viewing 15 posts - 1 through 15 (of 18 total)
