RE: Antipathy for Entity Attribute Value data models

SSC Guru

Points: 1003865

February 2, 2013 at 10:46 pm

David.Poole (2/1/2013)
One comment I didn't understand in the feedback in this thread was about SQL_Variant making rubbish of the points in my article. I know it wasn't you who said it but can you shine any light on this?

My apologies. I missed this comment/question and I don't know if it will be "light" that I can shed on the subject but I can make a comment here. 😀

In your good article, you stated the following...

If the Entity-Attribute-Value model is to be truly generic then that means that the "value" field has to be generic and untyped which usually means it has to be some form of string or variant.

... and that's probably what a lot of folks took exception to because they probably stopped reading before they got to the word "variant". I know I missed it on the very first read through. :blush:

It doesn't shred what you said about EAVs in the quote above or in other places in the article but there's a great advantage to using the SQL_Variant datatype in EAVs. You don't have to do an explicit conversion for source datatypes that don't have an implicit conversion to, say, VARCHAR(8000). You can also figure out what datatype something was stored from by using the SQL_VARIANT_PROPERTY function. Of course (and I strongly agree with you here), if you can more accurately type the value column as you suggested, that would frequently be better even if you might have to use more than one EAV. The most successful EAVs that I've used have had some form of numeric typed value column. Of course, for "element" based audit tables stored as EAVs, that option is usually out of the question because it's probably not as effective to have a table for every datatype nor performant for the audit triggers to make so many decisions. In those cases, SQL_Variant may work better for performance by the associated audit triggers. It does suck for indexing though.

So, to answer your question, "It Depends". I would't say that the use of SQL_Variant "makes rubbish" of the points in your article except at what people think of at face value. There's a lot more than just face value at stake here, though, and I think that anyone that ignores your points in the article just because they used SQL_Variant are probably overlooking some potential problems (like the indexing problem, for example).

[Edit]

I think another problem that some folks may have taken up with certain parts of the article could be based on some misunderstanding. For example, you stated in the article that....

As the ParentID attribute will be a string and the record to which is refers will be some form of integer then a lot of casting will have to take place in order to resolve the hierarchy.

I know a lot of good folks looked at that and, having worked with Adjacency Lists where both the child and parent IDs are numeric, might wonder why you even brought this up as an EAV. I have to admit, even I don't understand why you brought this up as an EAV.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)