RE: COUNT_BIG – SQLServerCentral

SSCertifiable

Points: 5276

March 9, 2010 at 1:32 pm

sistemas 95572 (3/9/2010)
Thank you Oleg, I am curious about uniquely identifying rows in tables w/o PKs (sorry about the off topic).
I mean, as I understand from your explanation there is some "row id" that we have no access to, and it´s managed by the dbms.
If that is the case, let's assume a table with 2 int cols, with no primary key, and then you store 5 times the same values, ie:
create table t2 (a int, b int)
insert into t2 values (1,1)
insert into t2 values (1,1)
insert into t2 values (1,1)
insert into t2 values (1,1)
How can you actually uniquely identify one of those rows let's say if i want to delete only one of those?
I've tryed doing it with enterprise manager, but got an error such as "wrong primary key or insufficcient column information".
Sorry about the weird case, i'm a student just taking DB course :p

Actually, you have 4 records in t2, but lets say there are 5 of them. Suppose you want to delete only one of those records while leaving other 4 intact. Since as lesser mortals, we do not have access to rowid, we cannot just issue the statement to, for example, delete THE_THIRD_ROW. On the other hand, if all we care is to delete one of the existing rows sparing other 4 then we could not care less which one we actually delete, because they all look the same to us. Therefore, we can do something like this (assuming that you have SQL Server 2005 or better):

with cte (RecordNumber, a, b) as

(

select

row_number() over (order by a) RecordNumber, *

from t2

)

delete from cte where RecordNumber = 3;

This will delete the row which we perceive as number 3, still giving us the result we need (One row deleted, other 4 are spared).

Usually, the problems like this (delete the dups) arise when there is a need to get rid of all the duplicates in the existing heap table. Lets expand the question like this: I will add 3 more records into t2, 2 records with a = 2 and 1 record with a = 3.

insert into t2 values (2,1);

insert into t2 values (3,1);

This will yield t2 to have 5 records with a = 1, 2 records with a = 2 and 1 record with a = 3. Out of those, the last record is already clean but all records before that have dups. Suppose we want to get rid of all dups while preserving all unique rows. In other words, the end result is expected to have t2 with one record with a = 1, 1 record with a = 2, and 1 record with a = 3. The statement to do this can be like this:

with cte (PartitionedNumber, a, b) as

(

select

row_number() over (partition by a order by a) PartitionedNumber, *

from t2

)

delete from cte where PartitionedNumber > 1;

Now selecting * from t2 will return just 3 leftover unique records.

Oleg