sql_variant

  • I've just seen the schema of an application under development, which is designed to be 'flexible'. It takes data of all sorts into a single table, putting the value into a sql_variant field and other fields will describe the 'field name' and record the base type.

    I've never used sql_variant before, as it's one of those things I'd instinctively avoid. And when I saw the schema, my first reaction was that I don't want to be downstream of this application. I can't help feeling that Mr Codd wouldn't like it.

    But perhaps I'm being unfair....is it time I loosened up?

    Have any of you chaps any experience of this, and concrete reasons why it is or isn't a good idea.

    Thanks,

    David McKinney.

  • The variant doesn't scare me (much), but everything into one table? How many users are expected to access this? What does the indexing look like? Has anyone vetted the design under simulated load?

    If the the answers to the last three questsions are no or don't know, run screaming.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • The app takes as input a csv'ish file which comes from a 3rd party reporting tool. However the app is trying to allow flexibility with the actual columns in input. i.e. allow input columns to be added and be automatically incorporated.

    Basically the application then outputs more csv type files of a different format, generated from the input files. The output should also 'benefit' from the same flexibility.

    So I can see where he's coming from...

    I'm sort of thinking that this approach is ok providing this table isn't the definitive storage point for any important data -- that data could pass through this structure for the purposes of a transformation, but that any business critical data should reside elsewhere, in strongly typed fields.

    What do you think?

  • Oh no! The one table to rule them all design. If I were you, I'd start running now. This database 'design' tends to crop up again and again in tales of applications/databases from hell, and for good reason

    http://worsethanfailure.com/Articles/Tom_Kyte_on_The_Ultimate_Extensibility.aspx

    http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/

    Things to worry about:

    Performance under load.

    Easy of querying.

    Type enforcement - If the column datatype reads 'date' what prevents a value like '2007/45/86' from going in. If it's supposed to be an int, can I put 56AE3ZZZ in there?

    As a staging area, it's probably OK. Provided it remains a staging area and nothing more.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gail,

    I already had that Bad Carma article in my IE Favourites list.

    It's one of my all-time favourite articles.

  • Yep... I know this is an old post but I've just gone through an escapade with SQL_Variant with great success.

    I agree that the "one table" idea is typically a very bad one. It does make applications quite flexible because users can easily create their own attributes for an entity but, as well all know too well, there are large problems down that particular rabbit hole.

    I will say, however, that it makes a hell of a nice "V" column for an EAV style of Audit table especially since it stores some meta-data about the datatype which can be easily returned. Yes, you can certainly get the datatype by interrogating sys.columns or INFORMATION_SCHEMA.Columns but then the audit table wouldn't be recording the datatype changes for free. We also use the Audit table for certain types of reporting where the output needs to be formatted according to datatype. In the previous audit table (the one I'm replacing), they stored formatted data instead of the raw data because they didn't want to go through the hassle of looking up the data type according to the "EA" (which could change, as previously stated).

    There are some caveats with storing Blob, TimeStamp, and SQL_Variant (from another table) datatypes but the SQL_Variant datatype in an audit table was just what the doctor ordered for the types of things I need to do. It's pretty darned fast, too.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    This is interesting. I've read about but never used the variant type but I can (now that you mention it) appreciate how it could be useful in an audit table scenario.

    you can certainly get the datatype by interrogating sys.columns or INFORMATION_SCHEMA.Columns but then the audit table wouldn't be recording the datatype changes for free. We also use the Audit table for certain types of reporting where the output needs to be formatted according to datatype.

    I don't suppose you could offer up some examples on the above?


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • It's pretty simple. Most people store their "changes" in Audit tables as VARCHAR and use some other method (interrogate sys.columns or somesuch) to figure out what the datatype of the column was for "change reporting". When you use the SQL_Variant datatype instead, you can use the SQL_VARIANT_PROPERTY function to easily determine the base datatype and a couple of other things.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply