SQLServerCentral is supported by Red Gate Software Ltd.

Denormalization Strategies
Posted Monday, March 15, 2010 7:35 AM
SSCrazy

The approach we are taking is to normalize as much as possible. Then, if it just doesn't work or can't be made efficient, we denormalize.
Post #882925
Posted Monday, March 15, 2010 8:00 AM
SSC Rookie

We use a denormalized data mart--I may be using the term loosely, but it's essentially "a separate place for up-to-yesterday production data" where semi-trained users can put together queries on our core data without getting into trouble with join syntax, and without us investing in a huge Crystal Reports development effort.

It makes a lot of sense here because it's not the main production transactional db: we can get away with a read-optimized box holding a few huge, wide, flat tables rather than many tables with joins.
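The pattern described above--one wide, flat table refreshed from the normalized source--might be sketched like this (all table and column names are hypothetical, not from the poster's actual system):

```sql
-- Nightly refresh of a flat reporting table in the data mart.
-- End users query rpt.CustomerOrders directly, with no joins.
TRUNCATE TABLE rpt.CustomerOrders;

INSERT INTO rpt.CustomerOrders
        (OrderID, OrderDate, CustomerName, CustomerCity,
         ProductName, Quantity, LineTotal)
SELECT  o.OrderID, o.OrderDate, c.CustomerName, c.City,
        p.ProductName, ol.Quantity, ol.Quantity * ol.UnitPrice
FROM    dbo.Orders     AS o
JOIN    dbo.Customers  AS c  ON c.CustomerID = o.CustomerID
JOIN    dbo.OrderLines AS ol ON ol.OrderID   = o.OrderID
JOIN    dbo.Products   AS p  ON p.ProductID  = ol.ProductID;
```

The joins are paid once per nightly load instead of once per ad-hoc user query.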

Disk space is cheap. Why make the same Y2K mistake and add unnecessary complexity to a data mart to save what amounts to, in our case, about $100 worth of disk space?

90% of the most commonly requested reports are no longer dumped on the desk of our IT staff--a small investment in simple query-syntax training for end users solved most of the issues.

Granted, we're small and simple, but a similar approach could be scaled.

In addition, we have some data-specific issues where this helps. For example, Customer 123 might be "Bobs Widgets" today, but later change it to "Bob & Sons Widgets". I understand temporal tables, but if you've ever seen what joining a temporal table does to an execution plan (not to mention 8 or 10 temporal tables), then that's something for which I'm more than willing to add a redundant field.
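A minimal sketch of that redundant-field approach (table and column names are hypothetical): the customer's name is copied onto the order at insert time, so historical reports show the name as it was then, with no temporal joins.

```sql
CREATE TABLE dbo.Orders
(
    OrderID      int          IDENTITY PRIMARY KEY,
    CustomerID   int          NOT NULL REFERENCES dbo.Customers (CustomerID),
    OrderDate    datetime     NOT NULL,
    CustomerName varchar(100) NOT NULL  -- redundant copy, frozen at order time
);

-- Copy the customer's current name when the order is created.
INSERT INTO dbo.Orders (CustomerID, OrderDate, CustomerName)
SELECT CustomerID, GETDATE(), CustomerName
FROM   dbo.Customers
WHERE  CustomerID = 123;
```

If Customer 123 later renames to "Bob & Sons Widgets", orders placed before the change still carry the old name.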
--Jim

Post #882954
Posted Monday, March 15, 2010 8:26 AM
Valued Member

Gilles Willard (3/15/2010)

It seems to me that what Timothy is proposing is simply adding a few FKs to his database design to cut down on the JOINs required to get at some data.

How is that denormalizing?



I had the same assumption...that is, until he mentioned that he had decreased the number of tables. It would have been nice to have a diagram showing the suggested table-design changes.
Post #882981
Posted Monday, March 15, 2010 8:27 AM


SSChampion

Jim,

Yes. Data warehouses are a totally different kettle of fish.




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #882982
Posted Monday, March 15, 2010 8:41 AM


Hall of Fame

Paul White (3/15/2010)
Jim,

Yes. Data warehouses are a totally different kettle of fish.


It's normal for denormalization to be present in a data warehouse.

(Seriously, there was no pun intended.)




Alvin Ramard
Memphis PASS Chapter

All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.
Post #882998
Posted Monday, March 15, 2010 8:59 AM


SSChampion

Alvin Ramard (3/15/2010)
(Seriously, there was no pun intended.)

Given your track record for bad puns, Alvin, I have my doubts.
Benefit of the doubt.




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #883019
Posted Monday, March 15, 2010 9:02 AM


Hall of Fame

Paul White (3/15/2010)
Alvin Ramard (3/15/2010)
(Seriously, there was no pun intended.)

Given your track record for bad puns, Alvin, I have my doubts
Benefit of the doubt.


If a pun had been intended, I would have included a smiley.

I can understand your comment. I do have a reputation for trying to add a bit of humor to many situations.




Alvin Ramard
Memphis PASS Chapter

All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.
Post #883022
Posted Monday, March 15, 2010 11:01 AM


SSC Journeyman

In spite of the criticism, it was still a simple example of minimal denormalization to achieve an end result, rather than a full-on explosion of wide rows to reduce the NF to 0.

Interestingly enough, the biggest cost in the initial query, which probably exceeded the benefits of denormalization, was using a 'function' in a WHERE predicate...

WHERE P_MS.DateReceived > GETDATE() - 365

...would have been better expressed by declaring a scalar variable first:

DECLARE @selectDate datetime = GETDATE() - 365;
...
WHERE P_MS.DateReceived > @selectDate
...


...which would allow the optimizer to use an index on DateReceived.
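Putting the pieces together, the full pattern the post describes might look like this (P_MS and DateReceived come from the quoted query; the index name and column list are hypothetical):

```sql
-- Supporting index for the date-range filter.
CREATE INDEX IX_P_MS_DateReceived ON dbo.P_MS (DateReceived);

-- Compute the boundary once, then use the variable in the predicate.
DECLARE @selectDate datetime = GETDATE() - 365;

SELECT ...
FROM   dbo.P_MS
WHERE  P_MS.DateReceived > @selectDate;
```

(Note that the `DECLARE ... =` initializer syntax requires SQL Server 2008 or later; on 2005 you would use a separate SET statement.)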


Unfortunately, denormalization for immutable datasets, as we've used it in the past, just doesn't scale, especially on large datasets...not to mention the escalating complexity (and headaches) it entails. Ironically, not even for data warehouses that go the same route (MOLAP) vs. a multi-dimensional ROLAP snowflake schema. The statistical implications are the subject of current research, though it's proving a bit of a challenge to account for all the complexities it can entail (code proliferation, data correctness, increased complexity of refactoring to accommodate changing business rules, data explosion, etc.). The data tsunami is upon us.


Dr. Les Cardwell, DCS-DSS
Enterprise Data Architect
Central Lincoln PUD
Post #883154
Posted Monday, March 15, 2010 11:28 AM
Forum Newbie

Both of these examples depict DSS-type operations. Rather than denormalize a live database, I would prefer to create a data warehouse, where denormalization is the norm. I believe you denormalize during load testing, and then only IF you have a significant performance issue. Over time, a normalized database is easier to modify than a denormalized one.

Another alternative would be to use all natural keys; that way, the UID of the parent table would be carried down to the UIDs of the children, grandchildren, etc. Of course, the big disadvantage to this approach is that if you have many generations, there would be more key columns than data columns in the lowest generation.
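A sketch of that all-natural-keys alternative (hypothetical tables): each generation's primary key includes every ancestor's key, so the lowest level carries the whole lineage without a join.

```sql
CREATE TABLE dbo.Customer
(
    CustomerCode varchar(10) NOT NULL PRIMARY KEY
);

CREATE TABLE dbo.CustomerOrder
(
    CustomerCode varchar(10) NOT NULL,
    OrderNo      int         NOT NULL,
    PRIMARY KEY (CustomerCode, OrderNo),
    FOREIGN KEY (CustomerCode) REFERENCES dbo.Customer (CustomerCode)
);

CREATE TABLE dbo.OrderLine
(
    CustomerCode varchar(10) NOT NULL,  -- grandparent key
    OrderNo      int         NOT NULL,  -- parent key
    LineNo       int         NOT NULL,
    PRIMARY KEY (CustomerCode, OrderNo, LineNo),
    FOREIGN KEY (CustomerCode, OrderNo)
        REFERENCES dbo.CustomerOrder (CustomerCode, OrderNo)
);
```

OrderLine can be filtered by customer directly, but add a few more generations and, as noted, the key columns start to outnumber the data columns.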
Post #883177
Posted Monday, March 15, 2010 11:30 AM
Valued Member

Les Cardwell (3/15/2010)

Interestingly enough, the biggest cost to the initial query, which probably exceeded benefits of denormalization, was using a 'function' in a WHERE predicate...

WHERE P_MS.DateReceived > getdate() - 365

...would have been better expressed declaring a scalar variable:

DECLARE @selectDate datetime = GETDATE() - 365;
...
WHERE P_MS.DateReceived > @selectDate
...


...which would allow the optimizer to use an index on DateReceived.



Nicely said, Grasshopper.

I like the idea of denormalization, but many people look to these types of articles to justify their nonexistent case for designing a sloppy, good-for-nothing database. They just totally ignore the last paragraph!

It usually takes a lot of time and effort to denormalize a database. But shouldn't you FIRST NORMALIZE, then DENORMALIZE only if a benefit can be measured? Right?




Post #883179