DataType Performance

  • I am currently developing some new apps on top of a really old DB structure, and we have discussed redesigning this DB in the near future.

    Currently, every table in the database (regardless of how many records) uses a Decimal(9,0) data type for the primary key. This was just carried over from the old Sybase SQL Anywhere 5.5 days and its "Numeric" data type. Most of the tables in this DB could get away with just using an INT data type (or BigInt at the most); a quick catalog query at the end of this post is one way to list those columns.

    My question is this: Is there any performance difference when selecting, updating, or inserting a Decimal vs. an Int data type? I understand that CPUs are currently optimized for Ints (and soon for BigInts with 64-bit technology). Keep in mind that I have already exhausted every possible indexing scenario on these tables and our views, including index hints.

    I only ask because the queries we perform across 300,000+ records using a Decimal key are FAR slower than in another DB we have that uses VarChars for primary keys (and it has more records and more columns)!
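    For what it's worth, one way to take stock of which primary key columns are still decimal/numeric is a query against the standard INFORMATION_SCHEMA views. This is just a sketch, not tied to any particular schema, and the filter on data types may need adjusting:

    SELECT kcu.TABLE_NAME, kcu.COLUMN_NAME, c.DATA_TYPE,
           c.NUMERIC_PRECISION, c.NUMERIC_SCALE
    FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
    JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
         ON kcu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
        AND kcu.TABLE_NAME = tc.TABLE_NAME
    JOIN INFORMATION_SCHEMA.COLUMNS c
         ON c.TABLE_NAME = kcu.TABLE_NAME
        AND c.COLUMN_NAME = kcu.COLUMN_NAME
    WHERE tc.CONSTRAINT_TYPE = 'PRIMARY KEY'        -- primary key constraints only
      AND c.DATA_TYPE IN ('decimal', 'numeric')     -- candidates for conversion to INT
    ORDER BY kcu.TABLE_NAME, kcu.COLUMN_NAME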

  • Well, that depends... using an INT is the right thing to do, but beyond that:

    There are two things to consider in this respect:

    1) NUMERIC(9,0) takes up 5 bytes and INT takes up 4 bytes. Beyond the obvious storage cost (which is not really the concern), the issue is that the page size is fixed, so an index on an INT fits roughly 2015 entries per page, while an index on a NUMERIC(9,0) fits only about 1612 entries per page (assuming 8060 usable bytes per page). That means the more records you have, the more index pages are needed to store them. If your queries require traversing a lot of index pages (range scans), an INT is beneficial because fewer pages have to be traversed. (A quick size check after the showplan output below restates this arithmetic.)

    2) The second thing to consider is that if NUMERIC(9,0) is the data type and the queries specify integer literals when comparing records, there is a small hit because the optimizer has to apply a convert function to the literal to cast it to the correct data type first. The way around this used to be either to use the correct data type (INT) or to specify, say, 2.0 and 3.0 instead of 2 and 3 in the filter criteria.

    On SP3a:

    CREATE TABLE TESTCASE2 (COL1 NUMERIC(9,0) PRIMARY KEY)
    GO

    DECLARE @I INT
    SET @I = 1
    WHILE (@I <= 10000)
    BEGIN
        INSERT INTO TESTCASE2 VALUES (@I)
        SET @I = @I + 1
    END
    GO

    SET SHOWPLAN_TEXT ON
    GO

    SELECT * FROM TESTCASE2 WHERE COL1 = 2
    GO

    StmtText
    ---------------------------------------------------------------------------
    |--Clustered Index Seek(OBJECT([master].[dbo].[TESTCASE2].[PK__TESTCASE2__1E855E4E]), SEEK([TESTCASE2].[COL1]=Convert([@1])) ORDERED FORWARD)

    --this one uses the right execution plan without the convert since 2.0 is specified
    SELECT * FROM TESTCASE2 WHERE COL1 = 2.0
    GO

    StmtText
    ---------------------------------------------------------------------------
    |--Clustered Index Seek(OBJECT([master].[dbo].[TESTCASE2].[PK__TESTCASE2__1E855E4E]), SEEK([TESTCASE2].[COL1]=[@1]) ORDERED FORWARD)
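    On point 1), a quick way to see the size difference directly. This is just the arithmetic above restated; nothing schema-specific is assumed:

    SELECT DATALENGTH(CAST(123456789 AS INT))          AS int_bytes,     -- 4
           DATALENGTH(CAST(123456789 AS NUMERIC(9,0))) AS numeric_bytes  -- 5

    -- rough keys-per-page arithmetic with 8060 usable bytes per page:
    --   INT:          8060 / 4 = 2015 entries
    --   NUMERIC(9,0): 8060 / 5 = 1612 entries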

    Since you are seeing slower performance, I would take a look at the execution plans that are being generated. You can create two databases, one with the INT data type for those columns and the other one as-is, then run the same perf test against the two and see how the execution plans differ. After capturing those SQLs, run STATISTICS PROFILE, STATISTICS IO and STATISTICS TIME on them.
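    A minimal sketch of that comparison follows; TESTCASE2_INT here is a hypothetical copy of the test table above, keyed on an INT column and loaded with the same data:

    -- turn plan-only mode off so the statements actually execute
    SET SHOWPLAN_TEXT OFF
    GO
    SET STATISTICS IO ON
    SET STATISTICS TIME ON
    GO

    -- same filter against the NUMERIC(9,0) key and the hypothetical INT key
    SELECT * FROM TESTCASE2 WHERE COL1 = 2
    SELECT * FROM TESTCASE2_INT WHERE COL1 = 2
    GO

    SET STATISTICS IO OFF
    SET STATISTICS TIME OFF
    GO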

    Hth

  • I see your point. Since none of the queries performed on these tables actually use the matching format ("2.0" vs. "2"), the query plan is probably converting those literals behind the scenes in order to compare values.

    Thanks for the info and I'll try to come up with some test cases!
