Measuring performance on databases/tables

  • Are there any tools for measuring the performance hit caused by having an entire database or table built with varchar(n) for every column? I have many pre-existing databases containing 30 - 50 tables each, and most of those tables were built using varchar for every column datatype. For example, I can have 12 or so tables with approximately 95 columns, all varchar(255) or varchar(50), with upwards of 500,000 rows. The development team keeps having performance issues and I want to provide statistics on poorly performing databases to see if this is part of the issue. In many of these tables there is room to use fixed-size char or int columns and keep varchar only where it actually fits the data. I know which of my databases are performing poorly and wonder what the best way is to get performance statistics on them.
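
    For reference, here is a rough sketch of how I have been listing the varchar columns in a suspect database so I can see where there is room to tighten up the datatypes (it only uses the standard INFORMATION_SCHEMA views; nothing in it is specific to my schemas):

    -- List every varchar column and its declared length (run in the suspect database)
    SELECT TABLE_SCHEMA,
           TABLE_NAME,
           COLUMN_NAME,
           CHARACTER_MAXIMUM_LENGTH
      FROM INFORMATION_SCHEMA.COLUMNS
     WHERE DATA_TYPE = 'varchar'
     ORDER BY TABLE_SCHEMA, TABLE_NAME, ORDINAL_POSITION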

    Any help or ideas would be great! Thanks

  • What you need to do is look at the long running queries in the database(s) and then analyze them for what makes them slow. I doubt it's varchar vs char. More often, it's poorly designed tables, or a bad indexing scheme, or non-existent indexes, or things like that.
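
    On SQL Server 2005, one rough starting point (just a sketch, not a full analysis) is to pull the top cached statements by elapsed time out of the plan-cache DMVs and work down the list:

    -- Top cached statements by total elapsed time (these DMV times are in microseconds)
    SELECT TOP 20
           qs.total_elapsed_time / 1000 AS total_elapsed_ms,
           qs.total_worker_time  / 1000 AS total_cpu_ms,
           qs.execution_count,
           st.text                      AS query_text
      FROM sys.dm_exec_query_stats qs
     CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
     ORDER BY qs.total_elapsed_time DESC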

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Yes... it does make a difference... about 21% without an order by and about 12% with an order by. But wait...

    Heh... on a million rows of data with good code, that's a difference of a couple hundred milliseconds and 1 second respectively. Overall, not much of a difference unless you wanna play the percent game.

    90% of your performance problems will be found in the way your code was written... nothing else matters in comparison. 9.5% of your performance problems will be because of missing or improper indexing. The remaining .5% are hardware and configuration.

    Of course, there's always test code...

    --===== Create and populate a 1,000,000 row test table containing "right sized" datatypes
         -- Column "RowNum" has a range of 1 to 1,000,000 unique numbers
         -- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
         -- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings
         -- Column "SomeMoney" has a range of 0.0000 to 99.9999 non-unique numbers
         -- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
         -- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'
         -- for all rows.
         -- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
         -- Jeff Moden
     SELECT TOP 1000000
            RowNum       = IDENTITY(INT,1,1),
            SomeInt      = ABS(CHECKSUM(NEWID()))%50000+1,
            SomeLetters2 = CAST(CHAR(ABS(CHECKSUM(NEWID()))%26+65)
                         + CHAR(ABS(CHECKSUM(NEWID()))%26+65) AS CHAR(2)),
            SomeCSV      = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
            SomeMoney    = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
            SomeDate     = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
            SomeHex12    = RIGHT(NEWID(),12)
       INTO dbo.JBMTest1
       FROM Master.dbo.SysColumns t1,
            Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

    --===== A table is not properly formed unless a Primary Key has been assigned
      ALTER TABLE dbo.JBMTest1
        ADD CONSTRAINT PK_JBMTest_RowNum PRIMARY KEY CLUSTERED (RowNum)
    ----------------------------------------------------------------------------------------------------------------------
    --===== Create a nearly identical table with all VARCHAR(255) except for the RowNum
     SELECT TOP 1000000
            RowNum       = IDENTITY(INT,1,1),
            SomeInt      = CAST(ABS(CHECKSUM(NEWID()))%50000+1 AS VARCHAR(255)),
            SomeLetters2 = CAST(CHAR(ABS(CHECKSUM(NEWID()))%26+65)
                         + CHAR(ABS(CHECKSUM(NEWID()))%26+65) AS VARCHAR(255)),
            SomeCSV      = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(255)),
            SomeMoney    = CAST(CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY) AS VARCHAR(255)),
            SomeDate     = CAST(CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME) AS VARCHAR(255)),
            SomeHex12    = CAST(RIGHT(NEWID(),12) AS VARCHAR(255))
       INTO dbo.JBMTest2
       FROM Master.dbo.SysColumns t1,
            Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

    --===== A table is not properly formed unless a Primary Key has been assigned
      ALTER TABLE dbo.JBMTest2
        ADD CONSTRAINT PK_JBMTest2_RowNum PRIMARY KEY CLUSTERED (RowNum)

    --===== Demonstrate the time difference of a simple Select into a bit bucket
    DECLARE @N VARCHAR(255)
    SET STATISTICS TIME ON
     SELECT @N = SomeLetters2 FROM dbo.JBMTest1 --ORDER BY SomeLetters2
     SELECT @N = SomeLetters2 FROM dbo.JBMTest2 --ORDER BY SomeLetters2
    SET STATISTICS TIME OFF

    --===== Compare the space used by the two tables
      EXEC sp_SpaceUsed JBMTest1
      EXEC sp_SpaceUsed JBMTest2

    --DROP TABLE dbo.JBMTest1,dbo.JBMTest2

    The big difference is the extra 10 MB of storage the VARCHAR(255) table takes compared to the right-sized table... imagine, say, 500 such tables, each with 10 MB more storage space... that's an extra 5 GB of storage and an extra 5 GB of backup space, as well as the extra time it takes to back up and, in a pinch, restore that extra 5 GB.
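
    If you want to see where that space is actually going across a whole database, a rough sketch like this one (using SQL 2005's sys.dm_db_partition_stats; pages are 8 KB each) will rank the user tables by reserved space:

    -- Reserved space and row counts per user table, biggest first (run in the target database)
    SELECT OBJECT_NAME(ps.object_id)              AS TableName,
           SUM(ps.reserved_page_count) * 8 / 1024 AS ReservedMB,
           SUM(CASE WHEN ps.index_id IN (0,1) THEN ps.row_count ELSE 0 END) AS [RowCount]
      FROM sys.dm_db_partition_stats ps
     WHERE OBJECTPROPERTY(ps.object_id,'IsUserTable') = 1
     GROUP BY ps.object_id
     ORDER BY ReservedMB DESC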

    But, they're all sissies... this is SQL Server 2005... let's make all columns VARCHAR(MAX) so we never have to worry about table design or changes in data size, eh? (Ya just gotta know I'm kidding, folks). 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • THANK YOU, JEFF! That is what I was looking for! I can now use this and do some testing with my databases. The sizing statistics just add to my existing arguments for tighter datatypes where possible; we have to have varchars, just not for every column!! The timing stats on top of that are exactly what I was looking for.

  • Lowry Kozlowski (2/13/2008)


    THANK YOU, JEFF! That is what I was looking for! I can now use this and do some testing with my databases. The sizing statistics just add to my existing arguments for tighter datatypes where possible; we have to have varchars, just not for every column!! The timing stats on top of that are exactly what I was looking for.

    Heh... thanks for the great feedback, Lowry...

    "One good test is worth a thousand expert opinions." 😀

    --Jeff Moden



  • It certainly matters... here is a dated article, but it has some good points you can reference...

    http://www.sqlservercentral.com/articles/Miscellaneous/datatype/173/
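
    As a contrived illustration of one way the wrong datatype bites (table and index names below are made up, not from the article): if what is really an integer key is stored in a varchar column, comparing it to an int forces an implicit conversion on the column side, which can turn an index seek into a scan.

    -- Hypothetical table: an integer key stored as VARCHAR(255)
    CREATE TABLE dbo.OrdersVarchar (CustomerID VARCHAR(255) NOT NULL)
    CREATE INDEX IX_OrdersVarchar_CustomerID ON dbo.OrdersVarchar (CustomerID)

    SELECT CustomerID FROM dbo.OrdersVarchar WHERE CustomerID = 12345   -- column is implicitly converted to INT -> scan
    SELECT CustomerID FROM dbo.OrdersVarchar WHERE CustomerID = '12345' -- types match -> seek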

    😎

  • Thanks Ash! All good info, and great that all of you are willing to help your fellow DBAs in need! 😀

