Find whether a clustered index is actually unique

  • DamianC

    SSCertifiable

    Points: 7831

    Hi

     

    I have a table that has a clustered index on it

    The table does not have a primary key and the clustered index has not been defined as unique

    There are approximately 3B rows in the table

    It's been around for years and I'm wondering why neither a PK or a unique clustered index was set against it

    Is there an easy way to interogate this table (say sysindexes or ...) such that I can see whether the columns that make up the clustered index are unique? Other than something like count distinct

    I'm pretty certain they should be

     

    Thanks

     

    - Damian

  • Grant Fritchey

    SSC Guru

    Points: 396617

    Billions of rows over years and no constraint. I'd put good money down that there are non-unique values.

    Easiest way I know to validate this would be to do two queries. One, COUNT, and the other COUNT DISTINCT. Compare those two values. That won't tell you what's duplicated, but it'll let you know whether or not your suspicions are correct.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
    Theodore Roosevelt

    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • DamianC

    SSCertifiable

    Points: 7831

    Thanks for the response Grant

    Think I'll go with the COUNT and COUNT DISTINCT then.

    Just wasn't sure whether there's a quicker option (bit like doing a count using sys.table and sys.partition vs count(*))

     

    - Damian

  • Grant Fritchey

    SSC Guru

    Points: 396617

    You can get some estimates on row counts from the statistics. There's also DMVs that will show estimates on the row counts. However, nothing will show if there's duplicate data except going and checking.

    Another way to check, and I wouldn't recommend this, you could just try creating a constraint. However, with a table that big, you're likely to see some blocking & locking, so it might be a very bad choice. Just thought I'd pass it along regardless.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
    Theodore Roosevelt

    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • ScottPletcher

    SSC Guru

    Points: 98490

    --if you just want to see if a dup key exists:
    SELECT TOP (1) key_col1, key_col2, COUNT(*) - 1 AS duplicate_count
    FROM dbo.table_name
    GROUP BY key_col1, key_col2
    HAVING COUNT(*) > 1

    --if you want to see all dup keys, remove the "TOP (1)" clause

    SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: "If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them."

  • leward

    SSC Rookie

    Points: 27

    At a former employer, we had a number of very large tables that had absolutely no constraints on them, and occasionally wanted to clean up the duplicates.  There were also not really any natural keys, and those duplicate rows would often be complete duplicates, i.e., every column value was the same across multiple rows.  After several conversations about how to delete duplicated rows while leaving exactly one of them in place, we opted for creating a delete statement based on this sort of select:

    ;with cte as
    ( select FIRST_COLUMN, SECOND_COLUMN,
    row_number() over(partition by FIRST_COLUMN, SECOND_COLUMN order by FIRST_COLUMN, SECOND_COLUMN) instance
    from MyTable
    )
    select * from cte where instance > 1

    • This reply was modified 3 weeks, 1 day ago by  leward.

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply