Index creation

  • Hi all

    I'm trying to create a new unique contstrain using the designer. I'm doing it on a single column. One of the option in the Unique key /index window is "Type" with the values "unique key" or "index". Please advice what the diffrences is as i'm being dumn - can't find it in BOL

    ,l0n3i200n

  • Creating a unique constraint is different than creating a key field. It sounds like you're in the Index/Key creation screen. Basically if you select unique key you're making an alternate key for the table, enforced similarly to the PK. If you're creating a unique index, all you get is the index. The difference between the two is down to an internal operation. Constraints such as the unique key, are checked prior to inserting the data within a transaction and if duplicate values are found the insert doesn't occur. Indexes are checked after the insert is completed when the data in the index is updated, which means if the duplicate values are found, the insert has to be undone. Other than that, at the end of the day, they're doing the same thing.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Do you have any script, link to prove that? I have not been able to reproduce this behavior at all.

  • No, I can't prove it. It's information I've inherited. I thought it was from an article by Kalen Delaney, but I can't find it. Here's a reference from Kimberly Tripp: http://www.sqlskills.com/blogs/kimberly/2004/10/25/WhenDidSQLServerStopPuttingIndexesOnForeignKeyColumns.aspx

    On the other hand, there's this one from Ron Talmage:

    http://msdn2.microsoft.com/en-us/library/aa224827(sql.80).aspx

    On the third hand, in Itzik Ben Gan's book on TSQL Querying, he describes a process where by constraints are checked prior to updating of data as a method that the storage engine uses to increase speed. It makes logical sense that it would try to avoid updating the clustered index if it could since after updating the clustered index, the non-clustered indexes would all have to be updated (depending of course on the type & scope of data changes we're talking about).

    But no, I don't have a script that can prove it. It just makes sense to me logically and enough of the evidence seems to indicate that its true. Thing is, unless we're talking about very large amounts of data or a very high failure rate, I doubt the performance difference between the two is worth talking about.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I'm still not conviced. I understand the logic behind what is said that it is better to check prior to make any changes... but I would assume that MS is smart enough to figure this logic for itself and use the same process for indexes and constraints...

    This debate is still opened, for me at least :).

  • This debate is still opened, for me at least

    That makes us 2 😉


    * Noel

  • Do you have any ideas on how to test this? The only 2 tests I did were not conclusive.

  • You know, I've never tried, but I have to assume that it would involve a larger amount of data. Based on the information, I doubt you'd see a difference with only a few hundred rows or less. But that's just a guess, I just don't know.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • This is the script I used to do my first tests. I have another in progress where I have only 1 duplicate row in the 1M insert (Select ID - 999999 >>> only the last row is duplicate). There does not seem to be any difference there either, but I have not be able to finish that series of test yet.

    Maybe you guys can figure out a way to prove of disprove once and for all this myth.

    USE MASTER

    GO

    IF NOT EXISTS (SELECT * FROM Sys.SysDatabases WHERE NAME = 'SSC')

    CREATE DATABASE SSC

    GO

    USE SSC

    GO

    --CREATE TABLE and fill with sample data, test to see what is faster between Unique constraint and unique index on a failed massive insert

    IF EXISTS (SELECT * FROM Sys.SysObjects WHERE NAME = 'UCUI' AND TYPE = 'U')

    DROP TABLE dbo.UCUI

    GO

    SELECT TOP 1000000 IDENTITY(INT, 1, 1) AS ID INTO dbo.UCUI FROM MASTER.Sys.Syscolumns S1 CROSS JOIN MASTER.Sys.SysColumns C2

    GO

    ALTER TABLE dbo.UCUI

    ADD CONSTRAINT UniqueConstraint UNIQUE (ID)

    GO

    DBCC DROPCLEANBUFFERS()

    GO

    SET IDENTITY_INSERT dbo.UCUI ON

    GO

    DECLARE @i AS INT

    DECLARE @Loops AS INT

    DECLARE @sDate AS DATETIME

    SET @sDate = GETDATE()

    SET @i = 0

    SET @Loops = 10

    WHILE @i < @Loops

    BEGIN

    INSERT INTO dbo.UCUI (ID) SELECT ID FROM dbo.UCUI

    SET @i = @i + 1

    END

    PRINT 'Exec time UniqueConstraint : ' + CONVERT(VARCHAR(30), DATEDIFF(MS, @sDate, GETDATE())) + ' MS'

    --#1 Exec time UniqueConstraint : 5550 MS

    --#2 Exec time UniqueConstraint : 5733 MS

    --#3 Exec time UniqueConstraint : 7136 MS

    --#4 Exec time UniqueConstraint : 6873 MS

    --#5 Exec time UniqueConstraint : 6863 MS

    --#6 Exec time UniqueConstraint : 6863 MS

    GO

    SET IDENTITY_INSERT dbo.UCUI OFF

    GO

    ALTER TABLE dbo.UCUI

    DROP CONSTRAINT UniqueConstraint

    GO

    CREATE UNIQUE INDEX UniqueIndex ON dbo.UCUI (ID)

    GO

    DBCC DROPCLEANBUFFERS()

    GO

    SET IDENTITY_INSERT dbo.UCUI ON

    GO

    DECLARE @i AS INT

    DECLARE @Loops AS INT

    DECLARE @sDate AS DATETIME

    SET @sDate = GETDATE()

    SET @i = 0

    SET @Loops = 10

    WHILE @i < @Loops

    BEGIN

    INSERT INTO dbo.UCUI (ID) SELECT ID FROM dbo.UCUI

    SET @i = @i + 1

    END

    PRINT 'Exec time UniqueIndex : ' + CONVERT(VARCHAR(30), DATEDIFF(MS, @sDate, GETDATE())) + ' MS'

    --#1 Exec time UniqueIndex : 8353 MS

    --#2 Exec time UniqueIndex : 5746 MS

    --#3 Exec time UniqueIndex : 6303 MS

    --#4 Exec time UniqueIndex : 6300 MS

    --#5 Exec time UniqueIndex : 6373 MS

    --#6 Exec time UniqueIndex : 6456 MS

    GO

    SET IDENTITY_INSERT dbo.UCUI OFF

    GO

    --DROP INDEX dbo.UCUI.UniqueIndex

    --First batch of test is not very conclusive

    :w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t:

    :D:D:D:D:D:D:D:D:D:D:D:D:D

    This is the one. Thanks for all the memories ;).

    :D:D:D:D:D:D:D:D:D:D:D:D:D

    :w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t::w00t:

  • This was the SQL script generated by the type:

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[Customer](

    [CustomerID] [int] NOT NULL,

    [UserName] [varchar](50) NOT NULL,

    [FirstName] [varchar](50) NOT NULL,

    [Surname] [varchar](50) NOT NULL,

    [IsActive] [bit] NOT NULL CONSTRAINT [DF_ApplicationUser_IsActive] DEFAULT ((1)),

    CONSTRAINT [PK_ApplicationUser] PRIMARY KEY CLUSTERED

    (

    [CustomerID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],

    CONSTRAINT [IX_UserName] UNIQUE NONCLUSTERED

    (

    [UserName] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

  • That'll do just fine for your business requirements. We're having a philosophical discussion about a performance myth of sql server. Since I'm pretty sure it's a myth, you can go with either a constraint or an index.

    If you want to know, I personnally choose to use a constraint because it is always visible when I script the table and it makes things easier to understand (to me anyways). This is however a matter of personal choice.

  • Thanks for all the replys.

    Tend to agree, the index did not script when I selected Index

  • One key difference, ignoring the "myth" for the moment, is that the constraint can be used as part of a foriegn key if you choose while the index can't.

    Now, to the myth... I'm going to try an experiment or three myself.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (10/4/2007)


    One key difference, ignoring the "myth" for the moment, is that the constraint can be used as part of a foriegn key if you choose while the index can't.

    Now, to the myth... I'm going to try an experiment or three myself.

    Post some of you result and scripts, would be nice to see the diff.

  • OK. I take it all back. I've done some more research in preparation for the tests I was going to run. The Ninja is right. It's a myth.

    This is from the BOL (September 2007 update):

    http://msdn2.microsoft.com/en-us/library/ms175132.aspx

    Backed up by another BOL entry:

    http://technet.microsoft.com/en-us/library/ms187019.aspx

    And this discussion with Kalen Delaney outlining the key difference, only one of intent, not implementation:

    http://www.dbforums.com/archive/index.php/t-795169.html

    I promise to never perpetuate the myth again and root it out when seen. Sorry for the bad info.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply