Avoiding lookups for exsitence

  • Everyone.

    Something I have been trying lately is the reduction of Existence lookups on some of our larger tables in the domain and having the work done with a check constraint or unique index. Here is a simple example with simple users table.

    CREATE SCHEMA Org

    CREATE TABLE Org.Users(

    UserID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED NOT NULL,

    SourceSystemId INT NOT NULL,

    SourceUserId VARCHAR(50) NOT NULL,

    OrganizationIDINT NOT NULL,

    LastName VARCHAR(75) NOT NULL,

    FirstName VARCHAR(50) NOT NULL,

    EmailAddress VARCHAR(100) NULL,

    PhoneNumber VARCHAR(14) NULL)

    --here comes a large check

    ALTER TABLE Org.Users

    ADD CONSTRAINT [chk_Unique_users] UNIQUE

    (

    SourceUserId,

    OrganizationID,

    LastName,

    FirstName,

    EmailAddress,

    PhoneNumber

    )

    Now in my dac code:

    IF @@TRANCOUNT = 0

    BEGIN

    RAISERROR(50101,16,1)

    RETURN @@ERROR

    END

    BEGIN TRY

    INSERT INTO org.Users

    (

    SourceSystemId ,

    SourceUserID,

    OrganizationId ,

    LastName ,

    FirstName ,

    PhoneNumber,

    EmailAddress

    )

    VALUES

    (1 ,

    ISNULL(@strSourceUserID,''),

    @intOrganizationId ,

    @strUserLastName ,

    @strUserFirstName,

    @strUserPhoneNumber,

    @strUserEmail

    )

    SET @intUsersId = SCOPE_IDENTITY()

    END TRY

    BEGIN CATCH

    --IF a duplicate lets return back Key and carry on happily

    IF ERROR_NUMBER() in(2627,23) --dup constraint

    BEGIN

    SELECT @intUsersId = UserId

    FROM org.users WITH(NOLOCK)

    WHERE OrganizationId = @intOrganizationId AND

    LastName = @strUserLastName AND

    FirstName = @strUserFirstName AND

    EmailAddress = @strUserEmail AND

    PhoneNumber = @strUserPhoneNumber

    RETURN 0

    END

    EXEC ord.usp_HandleError

    RETURN -1

    END CATCH

    The question point is in the catch. I consume the thrown dup and get then pass the key to the client. The biggest problem I see is the large check constraint/index. But the code is failry clean (minus the hard coded error codes). It removes all logic to figure out update vs insert which removes some locking. Thoughts?

  • Well, as far as I am concerned, if that is truly your definition of uniqueness you should have that unique index on the table anyway.

  • Yes its there. I guess the question is around the handling of the exception when the constraint check fails.

  • I think the error handling is logical. I would suggest alos passing back some kind of error flag so that your UI/business layer knows that it tried to create a duplicate. I am working under the assumption that you are always passing back @intUsersId so that the UI/business layer can continue to work. The reason I suggest that is that, in the example you present, there is the possibility that 2 or more people could meet your unique constraint as currently corrected. For example Jack Corbett and Jack Corbett, Jr. each with either a null or the same phone number and a null email address. If I know they are 2 different people I would want a message saying the Jack Corbett already exists and then I could fudge the last name to include the Suffix.

  • I'm not sure you've reduced the workload. There might be less locking, but you are still doing a lookup.

    I agree that there should be a unique index if this is really unique, but I can see possible problems with the fields you've picked.

  • Sorry this was a bad example. This was some example code and would never be appropriate to identity a user.

    From the app side, I was thinking the savings would incur in there code not having any additional handling for the duplicate situation, as they would really never know. All they want is the userid created/fecthed returned back.

    Is there any best practice on how may bytes a check constraint can be before it becomes to large and affects performance. I usually just come to this conclusion by feel, and neve with a metric that says a x byte wide check can not be efficient.

  • I'm not sure about performance metrics, but there's a 900byte limit I believe. So that might cause issues.

    If you just want to return the ID, I'd wrap this into a stored proc and perform the check there. If you've indexed these fields, it should be a very quick seek to determine if the row exists.

  • Steve's right - unique constraints are implemented through indexes, so they're subject to the "no more than 16 columns, which can't add up to more than 900 characters" limit.

    If you're going over those limits - then chances are very good that there's a problem in either the process or the model being implemented.

    If you MUST go over that limit - then you'd need something under 900 characters that makes the lookup unique - the rest can then be added as included columns and the match up performed "manually". Meaning - you're off the reservation at that point into custom land.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply