RE: my favorite normalization case study

SSCertifiable

Points: 6483

January 17, 2012 at 3:10 am

L' Eomot Inversé (1/16/2012)
patrickmcginnis59 (1/16/2012)
PaulB-TheOneAndOnly (1/16/2012)
Please do not take it the wrong way but, out of curiosity, why is the subject of this thread "my favorite normalization case study" if you never intended to normalize the data and you don't even care about the basics of data modeling?
I'm not judging, you can do it anyway you want 🙂
It wasn't actually meant to be all that detailed a thread, but I'm interested that you felt it important to mention that "you don't even care about the basics of data modeling" when I clearly posted both the normalized and denormalized views, and actually shipped the normalized view to our reporting system. I took advantage of the denormalized view to help me ship. Its not like I didn't consider the normalized view, like I said, I materialized it for reporting.
Why did you feel the need to say "you don't even care about the basics of data modeling"? I thought it was an interesting case, but now your reaction is the interesting item in this thread 😉
So tell me about why you believe I don't care or didn't care when I produced this system?
Don't get too upset Patrick.

Its ok. Paul is welcome to his opinion.

Paul is (I think - I hope he'll correct me if I'm wrong) taking things from the point of view of someone who uses an RDBMS which sorts out all those storage issues for him, whereas you are taking the point of view of someone who has to design the optimal storage and then present a relational view on top of it.
Obviously you will differ sometimes (probably almost always, given the two points of view), and this is one of those times.

What I think is useful is to consider alternatives, and I think its useful to attempt to understand why folks do consider alternatives. As I have mentioned, there seems to be an orthodoxy in the RDBMS world which prevents this, while I think its less than useful, I'm ok that it exists as I know how strongly many folks feel about it. Its worth my consideration that I may have asked for such a dismissal from Paul by even posting this anecdote on an SQL forum.

It doesn't help, of course, that none of our currently available RDBMSs can be told what sort of queries they will get with what frequency and attempt to optimise the storage mechanism to fit, but have instead a fixed set (usually two members to that set) of storage organisations within which they permit the user to attempt to optimise by choosing indices and clustering factors. It's not clear to me that the relational model prefers one of those two views over the other, but for most people it seems to be absolutely clear that the relational model implies Paul's view, not yours (needless to say, I disagree with them; that doesn't mean that I think your storage structure is relational, of course, it just means that I think it's fairly clear that it can be used to support a relational view of the data; I don't for a moment think that Microsoft's view storage view of pages, files, and filegroups is the slightest bit relational either, but it certainly can supprt something close to a relational view (maybe exactly a relational view) of the data. But I do think it's unreasonable to ask for an explanation of storage engine primitives in terms of relational primitives - that's confusing the cart and the horse.

I believe that there is a continuum of cases with varying degrees of mappability to relational solutions so to speak. While there are clearly cases where relational mappings do no good, and there are cases where they do much good, I'm of the belief that there are some grey areas. To me, I saw clear benefit in "pre-joining" elements, and it looked much like Oracle's vararrays, and what I found interesting was the benefit I got in using this approach.