Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

Avoiding the Deep model (Key, value) Expand / Collapse
Author
Message
Posted Wednesday, August 22, 2012 1:08 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Yesterday @ 10:44 PM
Points: 22,525, Visits: 30,290
Evil Kraig F (8/22/2012)
Lynn Pettis (8/22/2012)
I'm going to go against the flow a bit here. I think it really depends on what the application is intended to do. Where I work now I think the application I am helping to support would actually benefit from an EAV or hybrid EAV design.

Also, Microsoft themselves is using the EAV design in their Engineering Excellence Group to support compliance across product lines. Interviewed with the group 2 years ago bit didn't have the experience they needed.


I would say that a well reasoned approach to EAV for particular datasets with an understanding of the underlying complexity and complications it can cause by a seasoned professional could and has been useful. Say, once in every few hundred databases.

I would also say that anyone looking to implement it for 'ease of development' and not 'optimization of attribute access' is approaching the process from the wrong side. I don't disagree with you Lynn, just that the wrong approach to allow for optimal usage is being taken here. It completely depends on the data and intended use, not the intended development environment.


And here is where we agree.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1348671
Posted Wednesday, August 22, 2012 1:32 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Monday, April 14, 2014 10:45 AM
Points: 54, Visits: 192
The value data-type in this case will always be the same. I did not think using EAV, but several developers here thought that we should. Thank you everyone for your comments.
Post #1348685
Posted Wednesday, August 22, 2012 1:36 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 2:49 PM
Points: 12,016, Visits: 11,046
Lynn Pettis (8/22/2012)
Evil Kraig F (8/22/2012)
Lynn Pettis (8/22/2012)
I'm going to go against the flow a bit here. I think it really depends on what the application is intended to do. Where I work now I think the application I am helping to support would actually benefit from an EAV or hybrid EAV design.

Also, Microsoft themselves is using the EAV design in their Engineering Excellence Group to support compliance across product lines. Interviewed with the group 2 years ago bit didn't have the experience they needed.


I would say that a well reasoned approach to EAV for particular datasets with an understanding of the underlying complexity and complications it can cause by a seasoned professional could and has been useful. Say, once in every few hundred databases.

I would also say that anyone looking to implement it for 'ease of development' and not 'optimization of attribute access' is approaching the process from the wrong side. I don't disagree with you Lynn, just that the wrong approach to allow for optimal usage is being taken here. It completely depends on the data and intended use, not the intended development environment.


And here is where we agree.


Ditto. EAV does in fact have some value in VERY specialized situations. Kind of like cursors. They are good for a very small amount of things. From the OP statements and the fact they weren't really sure what that meant I took a pretty hard stance. I would say that unless you really know what you are getting into and can explain to a group of seasoned database folks why it is the best approach you are better walking away from EAV.


_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1348688
Posted Wednesday, August 22, 2012 2:07 PM
Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 12:17 PM
Points: 3,081, Visits: 11,231
mishka-723908 (8/22/2012)
I understand and completely agree, but are there any other options? I dont see any but just add a column in the future when necessary.


I don't see why just adding a column when necessary is a problem.

It will certainly be less effort than the numerous disadvantages of the EVA model.



Post #1348712
Posted Wednesday, August 22, 2012 2:34 PM
Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 12:17 PM
Points: 3,081, Visits: 11,231
This is one of my all time favorite threads about an EVA:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=61024

I especially like the query posted by RobWafle at 02/01/2006 12:05:48 that had over 40 left joins.

I still stand by what I posted on that thread:
"..I think that the query you posted is a perfect illustration of the biggest disadvantage of the Entity/Attribute model, that it saves a little work up front in data modeling by allowing “open ended” insertion of new attributes at the cost of having to program the true data structure into each query. Of course, there are other annoying little problems, like enforcing not null, DRI, domain integrity, default values, check constraints, creating useful indexes, transactional integrity, etc. Basically, it takes all the most useful features of a relational data model, and throws them away..."




Post #1348732
Posted Wednesday, August 22, 2012 2:52 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 2:49 PM
Points: 12,016, Visits: 11,046
Michael Valentine Jones (8/22/2012)
This is one of my all time favorite threads about an EVA:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=61024

I especially like the query posted by RobWafle at 02/01/2006 12:05:48 that had over 40 left joins.

I still stand by what I posted on that thread:
"..I think that the query you posted is a perfect illustration of the biggest disadvantage of the Entity/Attribute model, that it saves a little work up front in data modeling by allowing “open ended” insertion of new attributes at the cost of having to program the true data structure into each query. Of course, there are other annoying little problems, like enforcing not null, DRI, domain integrity, default values, check constraints, creating useful indexes, transactional integrity, etc. Basically, it takes all the most useful features of a relational data model, and throws them away..."






Yeah that big ole nasty query that has to join to the same table 40 times is EXACTLY what I have seen on a couple of EAV systems in the past. It is nothing but pain to appease somebody thinking they are clever.


_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1348749
Posted Thursday, August 23, 2012 1:37 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Wednesday, April 16, 2014 9:25 AM
Points: 7,070, Visits: 12,523
Here is another thread that discusses the topic with additional worthwhile responses to those already made on this thread:

Pros and Cons - Table with loads of uniqe columns VS Row based keys with row data type

Edit: In case you do not make it too far into the other thread, please have a look at this for a great summary of the topic Keeping It Simple > EAV Fail


__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato

Believe you can and you're halfway there. --Theodore Roosevelt

Everything Should Be Made as Simple as Possible, But Not Simpler --Albert Einstein

The significant problems we face cannot be solved at the same level of thinking we were at when we created them. --Albert Einstein

1 apple is not exactly 1/8 of 8 apples. Because there are no absolutely identical apples. --Giordy
Post #1349333
Posted Thursday, August 30, 2012 2:47 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Yesterday @ 3:34 PM
Points: 8,295, Visits: 8,748
I have to agree with most of what's been said here. As a general rule, EAV is much more costly in development time (especially for enhancements), storage, and performance than a proper relational model - there are exceptions to this general rule (for example if almost everything is extremely sparse you may get a storage saving from EAV, provided you avoid having to encode the datatype with each value) but these are very rare indeed.

Tom
Post #1352503
Posted Thursday, August 30, 2012 7:51 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 9:45 PM
Points: 36,013, Visits: 30,300
mishka-723908 (8/22/2012)
The value data-type in this case will always be the same. I did not think using EAV, but several developers here thought that we should. Thank you everyone for your comments.


To be honest, we don't actually have enough information to recommend or condemn because you haven't told us what the table is designed to hold, what the column names and purposes are, nor why you might want to eventually add a column or two. Some information in that area would certainly help us help you.

I'll join the others in saying that an EAV or NVP will most likely cause you some huge problems but, again, there's just not enough information to tell. It may very well be that an EAV or NVP is exactly what the doctor ordered (I stess again... probably not).

For example, it would be a huge benefit to you to use an NVP to store monthly data rather than storing monthly data in separate columns.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1352555
Posted Friday, August 31, 2012 7:28 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Tuesday, January 15, 2013 11:11 AM
Points: 1,945, Visits: 2,782
mishka-723908 (8/22/2012)
Yes, like key-value pair.


No. It will be 10 orders of magnitude worse than anything you would do with a proper data model. Here is a quick posting on some of the MANY problems:

I found an old "cut & paste". Someone like you posted this:

CREATE TABLE EAV -- no key declared
(key_col VARCHAR (10), -- what does null mean?
attrib_value VARCHAR (50)); -- what does null mean?

INSERT INTO EAV VALUES ('LOCATION', 'Bedroom');
INSERT INTO EAV VALUES ('LOCATION', 'Dining Room');
INSERT INTO EAV VALUES ('LOCATION', 'Bathroom');
INSERT INTO EAV VALUES ('LOCATION', 'courtyard');
INSERT INTO EAV VALUES ('EVENT', 'verbal aggression');
INSERT INTO EAV VALUES ('EVENT', 'peer');
INSERT INTO EAV VALUES ('EVENT', 'bad behavior');
INSERT INTO EAV VALUES ('EVENT', 'other');

CREATE TABLE EAV_DATA --no constraints, defaults, DRI
(id INTEGER IDENTITY (1, 1) NOT NULL, --vague names
bts_id INTEGER NULL,
key_col VARCHAR (10) NULL,
attrib_value VARCHAR (50) NULL );

INSERT INTO EAV_DATA VALUES (1, 'LOCATION', 'Bedroom');
INSERT INTO EAV_DATA VALUES (1, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (1, 'EVENT', 'bad behavior');
INSERT INTO EAV_DATA VALUES (2, 'LOCATION', 'Bedroom');
INSERT INTO EAV_DATA VALUES (2, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (2, 'EVENT', 'verbal aggression');
INSERT INTO EAV_DATA VALUES (3, 'LOCATION', 'courtyard');
INSERT INTO EAV_DATA VALUES (3, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (3, 'EVENT', 'peer');

Ideally, the result set of the query would be Location Event count (headings if possible)

Bedroom verbal aggression 1
Bedroom peer 0
Bedroom bad behavior 0
Bedroom other 2
Dining Room verbal aggression 0
Dining Room peer 0
Dining Room bad behavior 0
Dining Room other 0
Bathroom verbal aggression 0
Bathroom peer 0
Bathroom bad behavior 0
Bathroom other 0
courtyard verbal aggression 0
courtyard peer 1
courtyard bad behavior 0
courtyard other 1

Also, if possible, another query would return this result set. (I think I know how to do this one.)

Location Event count
Bedroom verbal aggression 1
Bedroom other 2
courtyard peer 1
courtyard other 1

Here is an answer From: Thomas Coleman

SELECT Locations.locationvalue, Events.eventvalue,
(SELECT COUNT(*)
FROM (SELECT LocationData.locationvalue, EventData.eventvalue
FROM (SELECT TD1.bts_id, TD1.value AS locationvalue
FROM eav_data AS TD1
WHERE TD1.key = 'location') AS LocationData
INNER JOIN
(SELECT TD2.bts_id, TD2.value AS eventvalue
FROM eav_data AS TD2
WHERE TD2.key = 'event'
) AS EventData
ON LocationData.bts_id = EventData.bts_id
) AS CollatedEventData
WHERE CollatedEventData.locationvalue = Locations.locationvalue
AND CollatedEventData.eventvalue = Events.eventvalue
FROM (SELECT T1.value AS locationvalue
FROM EAV AS T1
WHERE T1.key = 'location') AS Locations,
(SELECT T2.value AS eventvalue
FROM EAV AS T2
WHERE T2.key = 'event') AS Events
ORDER BY Locations.locationvalue, Events.eventvalue ,
SELECT Locations.locationvalue, Events.eventvalue
(SELECT COUNT(*)
FROM (SELECT LocationData.locationvalue, EventData.eventvalue
FROM (SELECT TD1.bts_id, TD1.value AS locationvalue
FROM eav_data AS TD1
WHERE TD1.key = 'location') AS LocationData
INNER JOIN
(SELECT TD2.bts_id, TD2.value AS eventvalue
FROM eav_data AS TD2
WHERE TD2.key = 'event') AS EventData
ON LocationData.bts_id = EventData.bts_id)
AS CollatedEventData
WHERE CollatedEventData.locationvalue = Locations.locationvalue
AND CollatedEventData.eventvalue = Events.eventvalue)
FROM (SELECT T1.value AS locationvalue
FROM EAV AS T1
WHERE T1.key = 'location') AS Locations,
(SELECT T2.value AS eventvalue
FROM EAV AS T2
WHERE T2.key = 'event') AS Events;

Is the same thing in a proper schema as:

SELECT L.locationvalue, E.eventvalue, COUNT(*)
FROM Locations AS L, Events AS E
WHERE L.btd_id = E.btd_id
GROUP BY L.locationvalue, E.eventvalue;

The reason that I had to use so many subqueries is that those entities are all plopped into the same table. There should be separate tables for Locations and Events.

The column names are seriously painful. Don't use reserved words like "key" and "value" for column names. It means that the developer *has* surround the column name with double quotes for everything. And they are too vague to be data element names anyway!

There is such a thing as "too" generic. There has to be some structure or everything becomes nothing more than a couple of tables called "things". The real key (no pun intended) is commonality. Is there a pattern to the data that they want to store? It may not be possible to create one structure to rule them all and in the darkness bind them.

"To be is to be something in particular; to be nothing in particular is to be nothing." --Law of Identity

All data integrity is destroyed. Any typo becomes a new attribute or entity. Entities are found missing attributes, so all the reports are wrong.

Try to write a single CHECK() constraint that works for all the attributes of those 30+ entities your users created because you were too dumb or too lazy to do your job. It can be done! You need a case expression almost 70 WHEN clauses for a simple invoice and order system when I tried it as an exercise.

Try to write a single DEFAULT clause for 30+ entities crammed into one column. Impossible!

Try to set up DRI actions among the entities. If you thought the WHEN clauses in the single CASE expression were unmaintainable, wait until you see the "TRIGGERs from Hell" -- Too bad that they might not fit into older SQL Server which had some size limits. Now maintain it.

For those who are interested, there are couple of links to articles I found on the net:

Generic Design of Web-Based Clinical Databases
http://www.jmir.org/2003/4/e27­/

The EAV/CR Model of Data Representation
http://ycmi.med.yale.edu/nadka­rni/eav_CR_contents.htm

An Introduction to Entity-Attribute-Value Design for Generic
Clinical Study Data Management Systems
http://ycmi.med.yale.edu/nadka­rni/Introduction%20to%20EAV%20­systems.htm

Data Extraction and Ad Hoc Query of an Entity— Attribute— Value Database
http://www.pubmedcentral.nih.g­ov/articlerender.fcgi?tool=pub­med&pubme...

Exploring Performance Issues for a Clinical Database Organized Using
an Entity-Attribute-Value Representation
http://www.pubmedcentral.nih.g­ov/articlerender.fcgi?tool=pub­med&pubme...

A really good horror story about this kind of disaster is at:

http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/



Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #1352801
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse