Indexing on CHAR/INT columns

  • Hallo,

    I would like to use a char column for indexing some of my tables. (as clustered index or as unclustered index) 

    The question is:

    Is an INT/SMALLINT/TINYINT column better than a CHAR(4) column?

    I would garantee that the 4 characters are always occupied and unique.

    Reason: I just want to make my code more readable when writting queries e.g.

    SELECT  * FROM Auto WHERE Auto.ColorID in ( 'PINK' , '_RED' , 'BLAC', 'YELW')

    instead of

    SELECT  * FROM Auto WHERE Auto.ColorID IN (5, 20, 45, 9)

    Thank you in advance

  • Hmm... OK, the answer is a firm "it depends".

    How many rows are you indexing, how many unique values are included in your column(s)?

    The usefullness/value of any given field in an index is going to vary depending on the cardinality of the values and the number of bytes in the column indexed... In theory an integer column is smaller byte wise than a char x 4 field but the cardinality/selectivity/value of the index will vary wildly depending on the number of values contained in the data... basically the more selective your index column is, the more valuable efficient it can be.

    Joe

  • Yep... it does depend... but indexing on INT will be faster than just about anything else including SmallInt and TinyInt because of the way the operating/disk systems work... INT is a natural storage "size".

    If you want to make your code more readable and still maintain the performance, what's wrong with comments? :blink:

    SELECT * FROM Auto WHERE Auto.ColorID IN (5, 20, 45, 9) --PINK , _RED , BLAC, YELW

    Of course, you could always do it the good ol' fashioned way and still avoid the index on the char column whilst maintaining extreme readability...

    DECLARE @Pink INT

    DECLARE @red INT

    DECLARE @Black INT

    DECLARE @Yellow INT

    SET @Pink = 5

    SET @red = 20

    SET @Black = 45

    SET @Yellow = 9

    SELECT * FROM Auto WHERE Auto.ColorID IN (@Pink, @red, @Black, @Yellow)

    Or, better yet... make a table to store groups of colors (in a normalized fashion, of course)... then, you will NEVER have to change the code just to accomodate a color change...

    SELECT * FROM Auto a, ColorCombos cc WHERE a.ColorID = cc.ColorID and cc.ColorGroup = 'Premium Colors'

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There could be a cost in translating the character field to an integer somewhere between the application and hardware; likely in the operating system level. Ideally integers suffer no such penalty, although realistically it is likely to be minimal in modern systems.

  • Thank you all for your responses. Now I can make a better decision based on your feedback.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply