SQL Server Central is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
Search:  
 
 

Suggestions for Datatypes

By Amit Lohia, 2005/06/06

Total article views: 10034 | Views in the last 30 days: 29

Let’s face it – Most of the time when designing the structure of a table, the data types are not chosen properly. Mostly int is used, even if we can get away with smallint or tinyint. Char is used instead of varchar, or if you are a fan of varchar, then even for fixed length columns, the table has a varchar defined for those columns. This is even true for all other data types. At present I have limited the scope of this project to these types: BigInt, Int, SmallInt , TinyInt , Varchar , Char , DateTime and SmallDateTime

Wouldn’t it be nice if the system helped us to determine which data type is most appropriate for the column, based on the data stored? For example, if I am storing the SSN in a column with varchar(9), won’t it better if the system suggests using char(9)?

If you need to store the hiring date of an employee (assuming the company is not 100 years old), you can use smalldatetime instead of datetime. (Ok guys, I know the application will break in 2079, but do you really think the company will be using that application for next 74 years?)

Now comes the best part. Let us a write a procedure which will do the work for us, to determine the correct data type based on the data which we already have. First, let us create a table to store the information

CREATE TABLE [dbo].[SuggestionDataType](
	[ColumnName] [varchar](255) NOT NULL,
	[ObjectName] [varchar](18)  NOT NULL,
	[Suggestion] [varchar](25)  NULL,
	[Currenttype] [varchar](20) NOT NULL,
	[MaxValueOrLen] [varchar](20) NULL,
	[MinValueOrLen] [varchar](20) NULL

) ON [PRIMARY]

Let. us now create a store procedure which will be used to determine the logic and suggest a data type. You can download the procedure here, but snippets are shown below and explained in more detail. The first block that we will look at is shown below:

--Now let’s check the type of the column. We will start with 
-- ‘'BIGINT','INT','SMALLINT'

SET @SQLStr = 'SELECT ''' + @ColumnName  + ''' ColumnName  ,''' 
	+ @ObjectName  + ''' ObjectName,'

	If @Columntype IN ('bigint','int','smallint')
	BEGIN
		SET @SQLStr = @SQLStr +  '
		CASE WHEN (MAX(' + @ColumnName + ') > 2147483647 OR (MIN(' + @ColumnName + ') < -2147483648))
		THEN ''bigint''
		WHEN (MAX(' + @ColumnName + ') > 32767 OR (MIN(' + @ColumnName + ') < -32768 ))
		THEN ''int''
		WHEN (MAX(' + @ColumnName + ') > 255 OR (MIN(' + @ColumnName + ') < 0 ))
		THEN ''smallint''
		WHEN (MAX(' + @ColumnName + ') > 1)
		THEN ''tinyint''
		ELSE ''No suggestion or use bit''
		END
		As Suggestion,''' + @Columntype + ''' Currenttype,' 
		+ 'MAX(' + @ColumnName + ') MaxValueOrLen,' 
		+ 'MIN(' + @ColumnName + ') MinValueOrLen'  
		+ ' FROM ' +  @ObjectName + '(Nolock)'
	END
Let us see in the detail what the above If block is doing: It will create select query with ColumnName , ObjectName , Suggestion , Currenttype , MaxValue, MinValue. Suggestion is based on the following logic: if the max value for the column exceeds more than 2147483647 then suggest “BIGINT”. if the value is less than 2147483647 but more than 32767 suggest “Int”. So if the value is less the 32767 and more than 255 suggest “smallint”. Note: we can expand the scope to bit also.

Now let us look into the logic for Char and Varchar.

	If @Columntype IN ('Varchar','Char')
	BEGIN
	SET @SQLStr = @SQLStr +  
	'CASE WHEN (MAX(LEN(' + @ColumnName + ')) = MIN(LEN(' + 	@ColumnName + ')) ) 
	THEN ''CHAR('' + CAST(MAX(LEN(' + @ColumnName + ')) as 	VARCHAR(4)) +'')'' 
	ELSE ''VARCHAR(' + @Length	+')'' 
	END
	As Suggestion,''' + @Columntype +  '(' + @Length + ')'' Currenttype ,' 
	+ 'MAX(LEN(' + @ColumnName + ')) MaxValueOrLen,' 
	+ 'MIN(LEN(' + @ColumnName + ')) MinValueOrLen' 
	+ ' FROM ' +  @ObjectName + '(Nolock)'
	END
For the varchar and char. I am using the following simple logic: if the MIN length of data for the column is equal to the MAX length of data, then use a char otherwise use a varchar. We can modify the data to see the difference. Now let's look at the last logic part of Datetime and Smalltime
	If @Columntype IN ('datetime','smalldatetime')
	BEGIN
		SET @SQLStr = @SQLStr +  '
		CASE WHEN (MAX(' + @ColumnName + ') BETWEEN ''01/01/1900 00:00''   AND ''6/06/2079 23:59'' AND MIN(' + @ColumnName + ') BETWEEN ''01/01/1900 00:00''   AND ''6/06/2079 23:59'' )
		THEN ''smalldatetime'' 
		ELSE ''datetime'' 
		END
		As Suggestion,''' + @Columntype +  ''' Currenttype ,' 
		+ 'CAST(MAX(' + @ColumnName + ') as char(20)) MaxValueOrLen,' 
		+ 'CAST(MIN(' + @ColumnName + ') as char(20)) MinValueOrLen' 
		+ ' FROM ' +  @ObjectName + '(Nolock)'
	END	
Note: smalldatetime values with 29.998 seconds or lower are rounded down to the nearest minute; values with 29.999 seconds or higher are rounded up to the nearest minute. Now just loop through each column for the table and close the cursor The Below Line of code will insert the data in SuggestionDataType, which will be used later.
INSERT INTO SuggestionDataType (ColumnName,ObjectName,Suggestion,Currenttype,MaxValueOrLen,MinValueOrLen)
EXEC (@SQLStr)
Now you can query the table to see the suggestions. The suggestions are purely based on the data that already exists in the table. This procedure will give you a start on changing your schema and all the changes should be made only after careful consideration as to the nature of the field. We will currently see a suggestion for every column of the checked types, even if the suggestion is the same as the current type; a where cause on each EXEC/INSERT would eliminate this.

Will this save any space and by how much is the question most of us will have. It is beyond the scope of this article as it depends on many other factors like Indexes, record size, fill factor, average free space on a page, number of records and so on. I will try to write alone those lines in my next article

By Amit Lohia, 2005/06/06

Total article views: 10034 | Views in the last 30 days: 29
Your response
 
 
 
Already registered?  

Free registration required

To read the rest of this article, and access thousands of other articles, we ask you to register on the site and subscribe to our newsletters.

Register

E-mail address:
Password:
Password (confirm):

  

Subscriptions

We ask you to register on the site and subscribe to our newsletters. Subscribing to our newsletters gets you:

  • ALL of our content (thousands of articles, scripts, and forum postings)
  • A daily newsletter (example)
  • A weekly news round up (example)
  • The opportunity to ask and answer questions in our forums
  • A daily Question of the Day to test and help you increase your knowledge of SQL Server.

We ask that you give the newsletter a try for a week. Over 200,000 SQL Server Professionals a day find it entertaining and useful. If not, you are welcome to unsubscribe at anytime.

Steve Jones
Editor, SQLServerCentral.com