Measuring performance on databases/tables

  • Are there any tools for measuring the performance hit caused by having an entire database or table built with varchar(n) for every column? I have many pre-existing databases containing 30 - 50 tables each, and most of those tables were built using varchar for every column datatype. For example, I can have 12 or so tables with approximately 95 columns, all varchar(255) or varchar(50), with upwards of 500,000 rows. The development team keeps having performance issues and I want to provide statistics on poorly performing databases to see if this is part of the issue. In many of these tables there is room to use fixed-size char or int columns and keep varchar only where it actually fits the data. I know which of my databases are performing poorly and wonder what the best way is to get performance statistics on them.
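
    For reference, here is a rough sketch of how I have been listing the varchar columns in a suspect database so I can see where there is room to tighten up the datatypes (it only uses the standard INFORMATION_SCHEMA views; nothing in it is specific to my schemas):

    -- List every varchar column and its declared length (run in the suspect database)
    SELECT TABLE_SCHEMA,
           TABLE_NAME,
           COLUMN_NAME,
           CHARACTER_MAXIMUM_LENGTH
      FROM INFORMATION_SCHEMA.COLUMNS
     WHERE DATA_TYPE = 'varchar'
     ORDER BY TABLE_SCHEMA, TABLE_NAME, ORDINAL_POSITION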

    Any help or ideas would be great! Thanks

  • What you need to do is look at the long running queries in the database(s) and then analyze them for what makes them slow. I doubt it's varchar vs char. More often, it's poorly designed tables, or a bad indexing scheme, or non-existent indexes, or things like that.
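
    On SQL Server 2005, one rough starting point (just a sketch, not a full analysis) is to pull the top cached statements by elapsed time out of the plan-cache DMVs and work down the list:

    -- Top cached statements by total elapsed time (these DMV times are in microseconds)
    SELECT TOP 20
           qs.total_elapsed_time / 1000 AS total_elapsed_ms,
           qs.total_worker_time  / 1000 AS total_cpu_ms,
           qs.execution_count,
           st.text                      AS query_text
      FROM sys.dm_exec_query_stats qs
     CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
     ORDER BY qs.total_elapsed_time DESC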

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Yes... it does make a difference... about 21% without an order by and about 12% with an order by. But wait...

    Heh... on a million rows of data with good code, that's a difference of a couple hundred milliseconds and 1 second respectively. Overall, not much of a difference unless you wanna play the percent game.

    90% of your performance problems will be found in the way your code was written... nothing else matters in comparison. 9.5% of your performance problems will be because of missing or improper indexing. The remaining .5% are hardware and configuration.

    Of course, there's always test code...

    --===== Create and populate a 1,000,000 row test table containing "right sized" datatypes
         -- Column "RowNum" has a range of 1 to 1,000,000 unique numbers
         -- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
         -- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings
         -- Column "SomeMoney" has a range of 0.0000 to 99.9999 non-unique numbers
         -- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
         -- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'
         -- for all rows.
         -- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
         -- Jeff Moden
     SELECT TOP 1000000
            RowNum       = IDENTITY(INT,1,1),
            SomeInt      = ABS(CHECKSUM(NEWID()))%50000+1,
            SomeLetters2 = CAST(CHAR(ABS(CHECKSUM(NEWID()))%26+65)
                         + CHAR(ABS(CHECKSUM(NEWID()))%26+65) AS CHAR(2)),
            SomeCSV      = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
            SomeMoney    = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
            SomeDate     = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
            SomeHex12    = RIGHT(NEWID(),12)
       INTO dbo.JBMTest1
       FROM Master.dbo.SysColumns t1,
            Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

    --===== A table is not properly formed unless a Primary Key has been assigned
      ALTER TABLE dbo.JBMTest1
        ADD CONSTRAINT PK_JBMTest_RowNum PRIMARY KEY CLUSTERED (RowNum)
    ----------------------------------------------------------------------------------------------------------------------
    --===== Create a nearly identical table with all VARCHAR(255) except for the RowNum
     SELECT TOP 1000000
            RowNum       = IDENTITY(INT,1,1),
            SomeInt      = CAST(ABS(CHECKSUM(NEWID()))%50000+1 AS VARCHAR(255)),
            SomeLetters2 = CAST(CHAR(ABS(CHECKSUM(NEWID()))%26+65)
                         + CHAR(ABS(CHECKSUM(NEWID()))%26+65) AS VARCHAR(255)),
            SomeCSV      = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(255)),
            SomeMoney    = CAST(CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY) AS VARCHAR(255)),
            SomeDate     = CAST(CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME) AS VARCHAR(255)),
            SomeHex12    = CAST(RIGHT(NEWID(),12) AS VARCHAR(255))
       INTO dbo.JBMTest2
       FROM Master.dbo.SysColumns t1,
            Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

    --===== A table is not properly formed unless a Primary Key has been assigned
      ALTER TABLE dbo.JBMTest2
        ADD CONSTRAINT PK_JBMTest2_RowNum PRIMARY KEY CLUSTERED (RowNum)

    --===== Demonstrate the time difference of a simple Select into a bit bucket
    DECLARE @N VARCHAR(255)
    SET STATISTICS TIME ON
     SELECT @N = SomeLetters2 FROM dbo.JBMTest1 --ORDER BY SomeLetters2
     SELECT @N = SomeLetters2 FROM dbo.JBMTest2 --ORDER BY SomeLetters2
    SET STATISTICS TIME OFF

    --===== Compare the space used by the two tables
      EXEC sp_SpaceUsed JBMTest1
      EXEC sp_SpaceUsed JBMTest2

    --DROP TABLE dbo.JBMTest1,dbo.JBMTest2

    The big difference is the extra 10 MB of storage the VARCHAR(255) table takes compared to the right-sized table... imagine, say, 500 such tables, each with 10 MB more storage space... that's an extra 5 GB of storage and an extra 5 GB of backup space, as well as the extra time it takes to back up and, in a pinch, restore that extra 5 GB.
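
    If you want to see where that space is actually going across a whole database, a rough sketch like this one (using SQL 2005's sys.dm_db_partition_stats; pages are 8 KB each) will rank the user tables by reserved space:

    -- Reserved space and row counts per user table, biggest first (run in the target database)
    SELECT OBJECT_NAME(ps.object_id)              AS TableName,
           SUM(ps.reserved_page_count) * 8 / 1024 AS ReservedMB,
           SUM(CASE WHEN ps.index_id IN (0,1) THEN ps.row_count ELSE 0 END) AS [RowCount]
      FROM sys.dm_db_partition_stats ps
     WHERE OBJECTPROPERTY(ps.object_id,'IsUserTable') = 1
     GROUP BY ps.object_id
     ORDER BY ReservedMB DESC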

    But, they're all sissies... this is SQL Server 2005... let's make all columns VARCHAR(MAX) so we never have to worry about table design or changes in data size, eh? (Ya just gotta know I'm kidding, folks). 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • THANK YOU, JEFF! That is what I was looking for! I can now use this and do some testing with my databases. The sizing statistics just add to my existing arguments for tighter datatypes where possible; we have to have varchars, just not for every column!! The timing stats on top of that are exactly what I was looking for.

  • Lowry Kozlowski (2/13/2008)


    THANK YOU, JEFF! That is what I was looking for! I can now use this and do some testing with my databases. The sizing statistics just add to my existing arguments for tighter datatypes where possible; we have to have varchars, just not for every column!! The timing stats on top of that are exactly what I was looking for.

    Heh... thanks for the great feedback, Lowry...

    "One good test is worth a thousand expert opinions." 😀

    --Jeff Moden



  • It certainly matters... here is a dated article, but it has some good points you can reference...

    http://www.sqlservercentral.com/articles/Miscellaneous/datatype/173/
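
    As a contrived illustration of one way the wrong datatype bites (table and index names below are made up, not from the article): if what is really an integer key is stored in a varchar column, comparing it to an int forces an implicit conversion on the column side, which can turn an index seek into a scan.

    -- Hypothetical table: an integer key stored as VARCHAR(255)
    CREATE TABLE dbo.OrdersVarchar (CustomerID VARCHAR(255) NOT NULL)
    CREATE INDEX IX_OrdersVarchar_CustomerID ON dbo.OrdersVarchar (CustomerID)

    SELECT CustomerID FROM dbo.OrdersVarchar WHERE CustomerID = 12345   -- column is implicitly converted to INT -> scan
    SELECT CustomerID FROM dbo.OrdersVarchar WHERE CustomerID = '12345' -- types match -> seek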

    😎

  • Thanks Ash! All good info, and great that all of you are willing to help your fellow DBAs in need! 😀

