Creating appropriate data type from varchar(max)

  • This is something I am thinking someone may have done already. Basically we have data loader which loads data from any source but the data type in the table will always be varchar(max). if it is db to db transfer then the table will have correct data type as source. This was perhaps designed long time ago with the thought that it can import anything but then apply the data specification so any error troubleshooting may be easy. In many cases we do not have data spec and end up with varchar(max). I know this is not the best solution but this is what it is currently.

    My first thought was varchar(max) is not the best practice for small string like name or address etc. This applies to all tables in database so I simply scanned the length of all fields and look for max length. If the max length is 20 characters then I convert that data type from varchar(max) to varchar(30). Just giving that 50% extra to incorporate any changes in the future for comment type field.

    Then I thought this only applies to fields where it is varchar and the data is string instead of number, float, date etc.

    Is there a script already available which can define the data type automatically?

    As per example sql has around 30 data types so the script will check if the content is integer, if not is it float, if not is it decimal, if not is it date, if not is it numeric etc. and eventually varchar(30) – as mentioned above.

  • There are functions like ISNUMERIC() and others that can determine if a given value is of that type. However, they're very dependent on exact formats. Most data imports I've seen require the Mk I Eyeball to make quite a few of the determinations. While I know people have built custom scripts for their work environment, I'm not aware of a generic script that would cover the majority of data imports.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Yep, doing this yourself will be very complex.

    I would let SQL itself determine the type(s) in your situation.  This will involve an extra full load of the data.  Make every data type in the loaded table sql_variant rather than varchar(max).  Then, once the data is fully loaded, you can query the type(s) that SQL determined the data to be.  If, for example, SQL assigned date for every value loaded into a column, you could use date for that column.

    If you're willing to consider this approach, and you need more details on how this method might be used, please just let me know.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • I think the functions TRY_CAST, TRY_CONVERT and TRY_PARSE might be of use.

  • You can use a case statement  along with the functions Grant alludes to determine the column  datatypes. I hope you only have to do this once per column / per import. That is , sampling one row.

    ----------------------------------------------------

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply