Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


COUNT_BIG


COUNT_BIG

Author
Message
Oleg Netchaev
Oleg Netchaev
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1693 Visits: 1809
sknox (3/9/2010)
sjimmo (3/9/2010)
Good question - Thought that I knew the answer, and was wrong. Now, I have a question. I ran the following against an existing database to prove/disprove the answer, and found the following:

SELECT COUNT_BIG(*) FROM test_table
98481

SELECT COUNT_BIG(testfield2) FROM test_table
98481

SELECT COUNT_BIG(ALL testfield2) FROM test_table
98481

SELECT COUNT_BIG(DISTINCT testfield2) FROM test_table
28452

Based upon testing, though brief, and the answer, I would assume that the correct answer is 1, 2, and 3. The answer description states that ALL is the default, whether used or not thus selection 1 would use the default of ALL as shown in the above example, correct?


No. It appears that you have no NULL values in testfield2. Here's the breakdown:

(a) COUNT_BIG(*) is essentially a row count function -- it returns the number of rows returned, including those with NULL values in any column.
(b) COUNT_BIG(testfield2) will return the number of rows where testfield2 is NOT NULL.
(c) COUNT_BIG(ALL testfield2) is the same as the previous, as ALL is the default.
(d) COUNT_BIG(DISTINCT testfield2) will return the number of unique nonnull values of testfield2

So, (b) and (c) are always equal. If (a) is equal to (b) and (c), that means you have no NULLs in testfield2 in your result set. If (d) is equal to (b) and (c), that means you have no duplicate values in testfield2 in your result set. If they're all 4 equal, that means testfield2 is a column with all unique non-null values.


This is an excellent explanation!. The (a) holds true even if you have a table with records having null in every column (not that you ever want to have such table :-)).

Consider, for example:

declare @boloney table (column_1 int, column_2 int);

insert into @boloney
select 1, 1
union
select 2, null
union
select 3, 2
union
select 4, 2
union
select 5, 3
union
select null, null;

select
count_big(*) option_1,
count_big(column_2) option_2,
count_big(all column_2) option_3,
count_big(distinct column_2) option_4
from @boloney;
go



The result of the above will be

opt_1   opt_2   opt_3   opt_4
----- ----- ----- -----
6 4 4 3


Even though there is a row with null in every column, the count_big(*) just like count(*) will still include it because it returns the number of rows in the table regardless of any values in any columns. This is probably because it includes the volatile rowid of the heap table below and the rowid is never null.

Oleg
SQLRNNR
SQLRNNR
SSC-Insane
SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)SSC-Insane (21K reputation)

Group: General Forum Members
Points: 21113 Visits: 18259
Nice explanation and test setup Oleg.



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw

sjimmo
sjimmo
Hall of Fame
Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)

Group: General Forum Members
Points: 3354 Visits: 2846
Oleg,
Thanks for the explanatin. You're right - I did not have any nulls.

Steve Jimmo
Sr DBA
“If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan
Oleg Netchaev
Oleg Netchaev
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1693 Visits: 1809
CirquedeSQLeil (3/9/2010)
Nice explanation and test setup Oleg.

Thank you Jason.

I wish the day will come when somebody will explain to me how to properly format the results output when including it in the post. Normal copy from SSMS results pane initially looks good but then all tabs and extra spaces get replaced by a single space (default html behavior). I tried to use the [code] tag, but it still does not look right Hehe The cave man attempt to cheat by typing [pre]the results here[/pre] does not work either because the pre tag does not get parsed.

Oleg
Oleg Netchaev
Oleg Netchaev
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1693 Visits: 1809
sjimmo (3/9/2010)
Oleg,
Thanks for the explanatin. You're right - I did not have any nulls.

Looks like I am getting the credit I don't deserve w00t. The original explanation was posted by sknox, I just used it as a quoted reply and added the snippet to it.

Oleg
KevinC.
KevinC.
Right there with Babe
Right there with Babe (771 reputation)Right there with Babe (771 reputation)Right there with Babe (771 reputation)Right there with Babe (771 reputation)Right there with Babe (771 reputation)Right there with Babe (771 reputation)Right there with Babe (771 reputation)Right there with Babe (771 reputation)

Group: General Forum Members
Points: 771 Visits: 504
I, also, learned something new. Thanks.
sistemas 95572
sistemas 95572
SSC Rookie
SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)

Group: General Forum Members
Points: 29 Visits: 33
Hey pal,

what do you mean by volatile rowid from the heap?

Check a couple of posts upstairs, i pointed out the case of a full nulled row, but wasn't sure "why" count(*) included that kind of rows.

Could it be just implementation desicion?
Could the count(*) function get its return value from some sort of catalog when there is no where clause, instead of actually getting into the data?
Oleg Netchaev
Oleg Netchaev
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1693 Visits: 1809
sistemas 95572 (3/9/2010)
Hey pal,

what do you mean by volatile rowid from the heap?

Check a couple of posts upstairs, i pointed out the case of a full nulled row, but wasn't sure "why" count(*) included that kind of rows.

Could it be just implementation desicion?
Could the count(*) function get its return value from some sort of catalog when there is no where clause, instead of actually getting into the data?





What I meant is that even though you have a table with rows which have null in every column, every row in the table can still be uniquely identified by the engine, which actually stores the data on pages. If you have rows with null in every column then the odds are that this is a heap table, and from what I understand, the rowid is the only way for the engine to uniquely identify every record in such table.

I cannot speculate whether implementation of the count(*) gets, as you say, "its return value from some sort of catalog", I just suggested that it might use the rowid (those are always unique and never null) to figure the count. In other words, I suggested that yes, it is "actually getting into the data".

Sorry that I missed your earlier post, I apologize.

Oleg
sistemas 95572
sistemas 95572
SSC Rookie
SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)

Group: General Forum Members
Points: 29 Visits: 33
Thank you Oleg, I am curious about uniquely identifying rows in tables w/o PKs (sorry about the off topic).

I mean, as I understand from your explanation there is some "row id" that we have no access to, and it´s managed by the dbms.

If that is the case, let's assume a table with 2 int cols, with no primary key, and then you store 5 times the same values, ie:

create table t2 (a int, b int)
insert into t2 values (1,1)
insert into t2 values (1,1)
insert into t2 values (1,1)
insert into t2 values (1,1)

How can you actually uniquely identify one of those rows let's say if i want to delete only one of those?
I've tryed doing it with enterprise manager, but got an error such as "wrong primary key or insufficcient column information".

Sorry about the weird case, i'm a student just taking DB course :p
Oleg Netchaev
Oleg Netchaev
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1693 Visits: 1809
sistemas 95572 (3/9/2010)

Thank you Oleg, I am curious about uniquely identifying rows in tables w/o PKs (sorry about the off topic).

I mean, as I understand from your explanation there is some "row id" that we have no access to, and it´s managed by the dbms.

If that is the case, let's assume a table with 2 int cols, with no primary key, and then you store 5 times the same values, ie:

create table t2 (a int, b int)
insert into t2 values (1,1)
insert into t2 values (1,1)
insert into t2 values (1,1)
insert into t2 values (1,1)

How can you actually uniquely identify one of those rows let's say if i want to delete only one of those?
I've tryed doing it with enterprise manager, but got an error such as "wrong primary key or insufficcient column information".

Sorry about the weird case, i'm a student just taking DB course :p


Actually, you have 4 records in t2, but lets say there are 5 of them. Suppose you want to delete only one of those records while leaving other 4 intact. Since as lesser mortals, we do not have access to rowid, we cannot just issue the statement to, for example, delete THE_THIRD_ROW. On the other hand, if all we care is to delete one of the existing rows sparing other 4 then we could not care less which one we actually delete, because they all look the same to us. Therefore, we can do something like this (assuming that you have SQL Server 2005 or better):

with cte (RecordNumber, a, b) as
(
select
row_number() over (order by a) RecordNumber, *
from t2
)
delete from cte where RecordNumber = 3;



This will delete the row which we perceive as number 3, still giving us the result we need (One row deleted, other 4 are spared).

Usually, the problems like this (delete the dups) arise when there is a need to get rid of all the duplicates in the existing heap table. Lets expand the question like this: I will add 3 more records into t2, 2 records with a = 2 and 1 record with a = 3.

insert into t2 values (2,1);
insert into t2 values (2,1);
insert into t2 values (3,1);



This will yield t2 to have 5 records with a = 1, 2 records with a = 2 and 1 record with a = 3. Out of those, the last record is already clean but all records before that have dups. Suppose we want to get rid of all dups while preserving all unique rows. In other words, the end result is expected to have t2 with one record with a = 1, 1 record with a = 2, and 1 record with a = 3. The statement to do this can be like this:

with cte (PartitionedNumber, a, b) as
(
select
row_number() over (partition by a order by a) PartitionedNumber, *
from t2
)
delete from cte where PartitionedNumber > 1;



Now selecting * from t2 will return just 3 leftover unique records.

Oleg
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search