﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / Article Discussions / Article Discussions by Author / Discuss content posted by Timothy  Claason  / Denormalization Strategies / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Thu, 23 May 2013 18:30:59 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Tom.Thomson (3/22/2010)[/b][hr]I think Paul's 3 stars were over-generous.[/quote]I did waver over the two star option, but went with 'average' in the end.You are probably right, though.  Too late now! :-)</description><pubDate>Mon, 22 Mar 2010 08:04:05 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Tom.Thomson (3/22/2010)[/b][hr][quote][b]Gilles Willard (3/15/2010)[/b][hr]Maybe this is due to my lack of experience, or the fact that it's been several years since I last studied database normalization, but:It seems to me that what Timothy is proposing is simply adding a few FKs to his database design to cut down on the JOINs required to get at some data.How is that denormalizing?I went back to my database design textbook (as well as the paper I wrote then on denormalization...)and could not figure out which form (1NF, 2NF or 3NF) was being violated?Also, and please forgive my fussiness with this but I have teachers in my family going back three generations, Timothy's article, while being interesting, could have used a quick run through spell-checking. This is not just a "purity" thing: it is a little jarring to find typos or spelling mistakes in a technical article while you're concentrating on understanding the author's point, something like stumbling over a rock while on a good walk.[/quote]I agree about the rocks!Maybe your problem with the denormalisation is the way he describes it: he talks about "adding Foreign Keys" when what he actually means is "adding extra columns to support extra Foreign Key relationships" (and he has a barbarian naming convention for the columns which support foreign keys, so that they sound as if they were foreign keys, not columns).  When he adds the column called "FK_Medication" to his Pet_MedicationSeries table that introduces a new functional dependency within that table: the value of the column FK_Medication is determined (according to the schema design) completely by the value of the column FK_PetMedication; but FK_PetMedication is not a key, so the table is now not in 3NF.Of course the proposed denormalisations are probably not the right way to solve the problem described - the first issue appears to arise largely because the date is not recorded in the pet_medication table (one consequence of this is that for medication that is not part of a series there is no means of discovering the date,  so neither the query before the denormalisation nor the query after it can deliver the answer it is claimed to deliver). The solution is to put the date into the Pet_Medication table, and remove it from the Pet_MedicationSeries table.  The first query now doesn't need to use the Pet_MedicationSeries table at all (and is now able to deliver what it was claimed to deliver) so there's no need to denormalise that table to help this query, and if it performs badly a clustered index on (date, id) [actually just on (date) would work too, but why not make sure it's unique and use it as the primary key?] will probably work wonders.Denormalising to fix a consequence of a schema design that puts attributes in the wrong tables is not a sensible use of denormalisation - it's just a deeper burying of the underlying design problem.I think Paul's 3 stars were over-generous.(edited - a spelling error)[/quote]Tom, thank you.After a week and several pages of comments to this article, I was starting to despair of ever getting a clear, to-the-point answer to my question.You've now done that.You are of course correct in your assessment of how the author's new schema is not in 3NF, and also  in describing the fundamental flaw that lead to his denormalizing and how it should be resolved.</description><pubDate>Mon, 22 Mar 2010 07:10:21 GMT</pubDate><dc:creator>Gilles Willard</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]trubolotta (3/16/2010)[/b][hr]While I find the discussion of how to best use the getdate() function interesting, I also find it distressing that the pursuit of code efficiency brushes aside the more important issue of business practicality.  As I mentioned in a previous post, the sample presented to demonstrate a "need for denormalization" more aptly demonstrates the need for better design, and that includes how the application might be used in the real world.[/quote]I agree completely.</description><pubDate>Mon, 22 Mar 2010 06:59:00 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Robert Frasca (3/15/2010)[/b][hr]One of the reasons I don't like de-normalization in the OLTP database is that the business intent of the logical data model can become obfuscated by the perceived need for short-cuts in the physical data model.  There is a reason why each entity exists in the normalized data model.[/quote]If by denormalisation you mean violating EKNF (or violating 3NF) I agree with you, but if you mean violating one of the higher levels of normalisation (BCNF, 4NF, 5NF) while maintaining EKNF (or even just maintaining 3NF) I have to disagree. The higher normalisations can destroy the ability to use primary key constraints to enforce all the functional dependencies of business data, and denormalisation from them (in cases where they have resulted in something not already required by EKNF) is more often right than wrong.  Long live the principle of representability!</description><pubDate>Mon, 22 Mar 2010 06:55:01 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Gilles Willard (3/15/2010)[/b][hr]Maybe this is due to my lack of experience, or the fact that it's been several years since I last studied database normalization, but:It seems to me that what Timothy is proposing is simply adding a few FKs to his database design to cut down on the JOINs required to get at some data.How is that denormalizing?I went back to my database design textbook (as well as the paper I wrote then on denormalization...)and could not figure out which form (1NF, 2NF or 3NF) was being violated?Also, and please forgive my fussiness with this but I have teachers in my family going back three generations, Timothy's article, while being interesting, could have used a quick run through spell-checking. This is not just a "purity" thing: it is a little jarring to find typos or spelling mistakes in a technical article while you're concentrating on understanding the author's point, something like stumbling over a rock while on a good walk.[/quote]I agree about the rocks!Maybe your problem with the denormalisation is the way he describes it: he talks about "adding Foreign Keys" when what he actually means is "adding extra columns to support extra Foreign Key relationships" (and he has a barbarian naming convention for the columns which support foreign keys, so that they sound as if they were foreign keys, not columns).  When he adds the column called "FK_Medication" to his Pet_MedicationSeries table that introduces a new functional dependency within that table: the value of the column FK_Medication is determined (according to the schema design) completely by the value of the column FK_PetMedication; but FK_PetMedication is not a key, so the table is now not in 3NF.Of course the proposed denormalisations are probably not the right way to solve the problem described - the first issue appears to arise largely because the date is not recorded in the pet_medication table (one consequence of this is that for medication that is not part of a series there is no means of discovering the date,  so neither the query before the denormalisation nor the query after it can deliver the answer it is claimed to deliver). The solution is to put the date into the Pet_Medication table, and remove it from the Pet_MedicationSeries table.  The first query now doesn't need to use the Pet_MedicationSeries table at all (and is now able to deliver what it was claimed to deliver) so there's no need to denormalise that table to help this query, and if it performs badly a clustered index on (date, id) [actually just on (date) would work too, but why not make sure it's unique and use it as the primary key?] will probably work wonders.Denormalising to fix a consequence of a schema design that puts attributes in the wrong tables is not a sensible use of denormalisation - it's just a deeper burying of the underlying design problem.I think Paul's 3 stars were over-generous.(edited - a spelling error)</description><pubDate>Mon, 22 Mar 2010 06:36:52 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>In SQL 2000, it's the syscacheobjects table.  There is a sql column and a usecounts column that can give you insights.</description><pubDate>Fri, 19 Mar 2010 13:13:23 GMT</pubDate><dc:creator>timclaason</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>First: Great article, interesting replies.My question:This query does not work in MsSql 2000:[code="vb"]SELECT text, total_worker_time/execution_count AS [Avg CPU Time], execution_countFROM sys.dm_exec_query_stats AS qs CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS stORDER BY execution_count DESC[/code]Is there some kind of alternative for MsSql2000? (Sorry if this has been asked before, I have read the replies rather quickly)</description><pubDate>Fri, 19 Mar 2010 11:45:08 GMT</pubDate><dc:creator>tim-605484</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>While I find the discussion of how to best use the getdate() function interesting, I also find it distressing that the pursuit of code efficiency brushes aside the more important issue of business practicality.  As I mentioned in a previous post, the sample presented to demonstrate a "need for denormalization" more aptly demonstrates the need for better design, and that includes how the application might be used in the real world.The reason I would run such a report is to notify customers that their pets are due for a particular medication, probably a month in advance.  The way the code was written to produce that list requires it be run every day so as not to miss any pet due to date gaps.  More likely, the business will run that report once a month or once week, but not always on some developer conceived target due date.  People do things early or late and those gaps must be covered.  Hence getdate() is totally inappropriate for use in that report.I know my comment digresses from the topic, but I think it is import to use examples that illustrate very practical business considerations, especially those covering fundamentals of design that n00bs wish to learn.  Someone reading the article or comments regarding the use of getdate() is going to walk away with the wrong lesson.  They may write more efficient code as a result, but they will also write an application that puts undue burden on the business that uses it.</description><pubDate>Tue, 16 Mar 2010 09:01:55 GMT</pubDate><dc:creator>trubolotta</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Les Cardwell (3/15/2010)[/b][hr][quote]Since 'getdate()' is a non-deterministic function, like all non-deterministic functions, we've always assigned them to a scalar variable to ensure the dbms won't perform a table-scan...although admittedly, these days they seem to be more implementation dependent.[/quote]Non-determinism isn't the issue.  SQL Server can evaluate the expression once and seek on the result.Assigning to a variable might cause a poor plan since SQL Server cannot 'look inside' the variable in many cases.</description><pubDate>Mon, 15 Mar 2010 21:59:14 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>I agree more with trubolotta than the contents of the article. I do not see why there needs to be separate table for one-time meds vs. recurring meds. In fact Pet_Medication could contain a bit column to indicate it is recurring. The overhead of one bit is much less than an entire table along with the foreign keys to maintain it. Possibly another tinyint column to indicate recurring frequency (e.g. 1= daily, 2-weekly, 3=monthly...)A properly designed structure with proper indexes will perform amazingly fast, especially in this environment. Unless they are a verterinary clinic with 50 doctors seeing 30 patients each a day, six days/week and you have 20 years of medical history with each patient having an average of 2 medications per visit. Even then, you have to realize that a large percentage of patients may not even require medication. But even so, this is very doable with good response time in SQL Server.I have designed HR systems that join up to 20 tables in certain queries with sub-second response time reading the data.</description><pubDate>Mon, 15 Mar 2010 21:42:28 GMT</pubDate><dc:creator>dbishop</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>I don't want to hit this too hard but I see the example more as a case for improved design versus dernomarlization.  Just a few obervations if I may:1) The Pet_Medication table should include the date the med was administered and the dosage, whether it's a one time event or part of a series.  The Pet_MedicationSeries table can be elimniated and the Pet ID and Med ID used as a composite key.  That change would seem to me to satisfy a rather obvious business requirement.2) I would speculate this hypothetical business would want some measure of inventory and there is no better place to put that than the Medication table.  But that table should also include any requirement for readministering the medication.  That would eliminate the need to write a separate select statement for each med and allow for changes without requiring a developer to write more SQL.I know the example is hypothetical, but this cat can be skinned a much better way.  I think the changes I have suggested would eliminate any need for denormalization and provide a far more flexible application that is less DB developer dependent.</description><pubDate>Mon, 15 Mar 2010 20:57:39 GMT</pubDate><dc:creator>trubolotta</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>Sometimes when you identify a one to many relationship there is a tendency toward dogmatic decision-making.I wrote code not long ago to help users categorize records.  Instead of using a lookup table i used a varchar field and an int field with a binary hash function triggered on insert or update.  As users enter data, previous entries are distinct selected on that binary hash, in order to give a drop-down option based on what they're typing.  It seemed easier to me than creating a lookup table and writing code to insert or update to the lookup table.I would only denormalize in rare cases though.</description><pubDate>Mon, 15 Mar 2010 20:04:43 GMT</pubDate><dc:creator>ben.sapp</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>I feel your pain.  I should probably have prefaced my remarks with "when it's possible".  It's much easier to take the approach I recommended at the beginning of a project.  It's much more difficult to retrofit; however, one of the things I'm working on now is to create that ODS and gradually move, starting with the longest running, queries off of the production OLTP database and on to the ODS to reduce stress.  It'll probably take a couple of years and only then will we be able to think about tweaking the underlying OLTP physical model.  It's a cool project however with a number of interesting facets to it.</description><pubDate>Mon, 15 Mar 2010 15:20:18 GMT</pubDate><dc:creator>Robert Frasca</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Robert Frasca (3/15/2010)[/b][hr]I've actually found that it's easier and faster to develop the OLTP model and the ODS in parallel.  It also allows you to build a data source for your down-stream data warehouse that can be accessed without impacting the OLTP database.  This is, after all, why so many companies use transactional replication, i.e. in order to populate their ODS for querying/reporting purposes.[/quote]I appreciate your purist view of this, and I agree that a transfer from OLTP to a OLAP/Warehouse environment is optimal, and best-practice for reporting and other analyses, and if the resources are available, I totally agree with that notion; however, I have a number of customers and environments where operating in that environment is quite a challenge because of financial resources, and support and maintenance concerns (ie not having a DBA to support it).</description><pubDate>Mon, 15 Mar 2010 15:10:44 GMT</pubDate><dc:creator>timclaason</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>I'm not a big fan of de-normalization in OLTP databases.  If I need to run queries I'll run them against a structure that's is optimized for queries such as a dimensional model or a heavily indexed Operational Data Store.  I think an OLTP database exists for one reason, to capture (and validate) transactional data as quickly as possible.  The emphasis is as much on write performance as it is on read performance.  Anything that gets in the way of that mission such as locking associated with read-only type queries should be consciously reduced if not completely eliminated.  I've actually found that it's easier and faster to develop the OLTP model and the ODS in parallel.  It also allows you to build a data source for your down-stream data warehouse that can be accessed without impacting the OLTP database.  This is, after all, why so many companies use transactional replication, i.e. in order to populate their ODS for querying/reporting purposes.One of the reasons I don't like de-normalization in the OLTP database is that the business intent of the logical data model can become obfuscated by the perceived need for short-cuts in the physical data model.  There is a reason why each entity exists in the normalized data model.  Referential integrity demonstrates how those entities are related to each other.  When those relationships are eliminated and attributes are scattered redundantly around the model the reasoning behind those relationships is no longer clear.  A normalized data model is very easy to "read".  Once de-normalization occurs a level of unnecessary complexity is added.There was a time when the cost of physical referential integrity was quite high and locking algorithms were relatively primitive so de-normalization was a means of overcoming the shortcomings of the engine.  That time is past.</description><pubDate>Mon, 15 Mar 2010 14:57:00 GMT</pubDate><dc:creator>Robert Frasca</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>Thx Lynn... good to know. Still, alligators and all that... :unsure:</description><pubDate>Mon, 15 Mar 2010 13:24:58 GMT</pubDate><dc:creator>Les Cardwell</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Lynn Pettis (3/15/2010)[/b][hr][code="sql"]USE [SandBox]GO/****** Object:  Table [dbo].[[b]JBMTest[/b]]    Script Date: 03/15/2010 12:49:16 ******/[/quote]Looks like a familiar setup ;-)</description><pubDate>Mon, 15 Mar 2010 13:10:24 GMT</pubDate><dc:creator>SQLRNNR</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Les Cardwell (3/15/2010)[/b][hr][quote]Also, this:[code="sql"]DECLARE @selectDate = getdate()-365[/code]won't work.  In SQL Server 2008 it needs to be like this:[code="sql"]DECLARE @selectDate datetime = getdate()-365[/code][/quote][quote]For what it's worth, it doesn't work in 2005 either.[code="sql"]Cannot assign a default value to a local variable.[/code][/quote][quote]Nope, it doesn't.  Being able to assign a value to a variable when it is declared is new to SQL Server 2008.  Guess what, we upgraded our PeopleSoft systems to SQL Server 2008 EE.  Now, we just need to start upgrading our other systems.[/quote]Good catch on the 'type' :)Actually, in 2005 it needs to be...DECLARE @selectDate DATETIMESET @selectDate = getdate() - 365;I'm jumping around between SQL2000, SQL2005, SQL2008, Oracle10g, and DB2... nutz.[/quote]Pretty sure.Table/Index defs[code="sql"]USE [SandBox]GO/****** Object:  Table [dbo].[JBMTest]    Script Date: 03/15/2010 12:49:16 ******/SET ANSI_NULLS ONGOSET QUOTED_IDENTIFIER ONGOCREATE TABLE [dbo].[JBMTest](	[RowNum] [int] IDENTITY(1,1) NOT NULL,	[AccountID] [int] NOT NULL,	[Amount] [money] NOT NULL,	[Date] [datetime] NOT NULL) ON [PRIMARY]GO/****** Object:  Index [IX_JBMTest]    Script Date: 03/15/2010 12:49:16 ******/CREATE CLUSTERED INDEX [IX_JBMTest] ON [dbo].[JBMTest] (	[Date] ASC)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]GO/****** Object:  Index [IX_JBMTest_AccountID_Date]    Script Date: 03/15/2010 12:49:16 ******/CREATE NONCLUSTERED INDEX [IX_JBMTest_AccountID_Date] ON [dbo].[JBMTest] (	[AccountID] ASC,	[Date] ASC)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY][/code]Simple query:[code="sql"]select * from dbo.JBMTest where Date &amp;gt; getdate() - 365[/code]Actual execution plan attached.There are 1,000,000 records in the test table.</description><pubDate>Mon, 15 Mar 2010 12:52:55 GMT</pubDate><dc:creator>Lynn Pettis</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>Good points made.  I have never found using getdate() inside a SQL query to be problematic in my execution plans.  However, if it's "best practice" to not do it, then I'll probably stop.  I had never thought about it, before now.</description><pubDate>Mon, 15 Mar 2010 12:39:04 GMT</pubDate><dc:creator>timclaason</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote]Actually, this:[code="sql"]  WHERE P_MS.DateReceived &amp;gt; getdate() - 365[/code]can use an index on DateReceived.  The function call is on the right of the conditional and will only be calculated once.[/quote]Hmmm... positive? Since 'getdate()' is a non-deterministic function, like all non-deterministic functions, we've always assigned them to a scalar variable to ensure the dbms won't perform a table-scan...although admittedly, these days they seem to be more implementation dependent.From SQL Server Help...[quote]For example, the function GETDATE() is nondeterministic. SQL Server puts restrictions on various classes of nondeterminism. Therefore, nondeterministic functions should be used carefully. The lack of strict determinism of a function can block valuable performance optimizations. Certain plan reordering steps are skipped to conservatively preserve correctness. Additionally, the number, order, and timing of calls to user-defined functions is implementation-dependent. Do not rely on these invocation semantics.[/quote]JFWIW...</description><pubDate>Mon, 15 Mar 2010 12:17:07 GMT</pubDate><dc:creator>Les Cardwell</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote]Also, this:[code="sql"]DECLARE @selectDate = getdate()-365[/code]won't work.  In SQL Server 2008 it needs to be like this:[code="sql"]DECLARE @selectDate datetime = getdate()-365[/code][/quote][quote]For what it's worth, it doesn't work in 2005 either.[code="sql"]Cannot assign a default value to a local variable.[/code][/quote][quote]Nope, it doesn't.  Being able to assign a value to a variable when it is declared is new to SQL Server 2008.  Guess what, we upgraded our PeopleSoft systems to SQL Server 2008 EE.  Now, we just need to start upgrading our other systems.[/quote]Good catch on the 'type' :)Actually, in 2005 it needs to be...DECLARE @selectDate DATETIMESET @selectDate = getdate() - 365;I'm jumping around between SQL2000, SQL2005, SQL2008, Oracle10g, and DB2... nutz.</description><pubDate>Mon, 15 Mar 2010 12:10:02 GMT</pubDate><dc:creator>Les Cardwell</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]CirquedeSQLeil (3/15/2010)[/b][hr][quote][b]Lynn Pettis (3/15/2010)[/b][hr][quote][b]Les Cardwell (3/15/2010)[/b][hr]In spite of the criticism, it was still a simple example of minimal denormalization to achieve an end result rather than a full-on explosion of wide rows to reduce the NF to 0 :pinch:Interestingly enough, the biggest cost to the initial query, which probably exceeded benefits of denormalization, was using a 'function' in a WHERE predicate...[code="other"]WHERE P_MS.DateReceived &amp;gt; getdate() - 365[/code]...would have been better expressed declaring a scalar variable:DECLARE @selectDate = getdate()-365...WHERE P_MS.DateReceived &amp;gt; @selectDate......which would allow the optimizer to use an index on DateReceived.Unfortunately, denormalization for immutable datasets as we've used it in the past just doesn't scale, especially on large datasets...not to mention the escalating complexity (and headaches) it entails. Ironically, not even for data-warehouses that go the same route (MOLAP) vs. a Multi-dimensional ROLAP Snowflake Schema. The stastical implications are the subject of current research, though it's proving a bit of a challenge to account for all the complexities it can entail (code proliferation, data-correctness, increased complexity of refactoring to accomodate changing business rules, data-explosion, etc.). The data-tsunami is upon us :w00t:[/quote]Also, this:[code="sql"]DECLARE @selectDate = getdate()-365[/code]won't work.  In SQL Server 2008 it needs to be like this:[code="sql"]DECLARE @selectDate datetime = getdate()-365[/code][/quote]For what it's worth, it doesn't work in 2005 either.[code="sql"]Cannot assign a default value to a local variable.[/code][/quote]Nope, it doesn't.  Being able to assign a value to a variable when it is declared is new to SQL Server 2008.  Guess what, we upgraded our PeopleSoft systems to SQL Server 2008 EE.  Now, we just need to start upgrading our other systems.</description><pubDate>Mon, 15 Mar 2010 11:58:16 GMT</pubDate><dc:creator>Lynn Pettis</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Lynn Pettis (3/15/2010)[/b][hr][quote][b]Les Cardwell (3/15/2010)[/b][hr]In spite of the criticism, it was still a simple example of minimal denormalization to achieve an end result rather than a full-on explosion of wide rows to reduce the NF to 0 :pinch:Interestingly enough, the biggest cost to the initial query, which probably exceeded benefits of denormalization, was using a 'function' in a WHERE predicate...[code="other"]WHERE P_MS.DateReceived &amp;gt; getdate() - 365[/code]...would have been better expressed declaring a scalar variable:DECLARE @selectDate = getdate()-365...WHERE P_MS.DateReceived &amp;gt; @selectDate......which would allow the optimizer to use an index on DateReceived.Unfortunately, denormalization for immutable datasets as we've used it in the past just doesn't scale, especially on large datasets...not to mention the escalating complexity (and headaches) it entails. Ironically, not even for data-warehouses that go the same route (MOLAP) vs. a Multi-dimensional ROLAP Snowflake Schema. The stastical implications are the subject of current research, though it's proving a bit of a challenge to account for all the complexities it can entail (code proliferation, data-correctness, increased complexity of refactoring to accomodate changing business rules, data-explosion, etc.). The data-tsunami is upon us :w00t:[/quote]Also, this:[code="sql"]DECLARE @selectDate = getdate()-365[/code]won't work.  In SQL Server 2008 it needs to be like this:[code="sql"]DECLARE @selectDate datetime = getdate()-365[/code][/quote]For what it's worth, it doesn't work in 2005 either.[code="sql"]Cannot assign a default value to a local variable.[/code]</description><pubDate>Mon, 15 Mar 2010 11:51:46 GMT</pubDate><dc:creator>SQLRNNR</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Alvin Ramard (3/15/2010)[/b][hr][quote][b]Paul White (3/15/2010)[/b][hr]Jim,Yes.  Data warehouses are a totally different kettle.[/quote]It's normal for denormalization to be present in a data warehouse. [i](Seriously, there was no pun intended.)[/i][/quote]Absolutely.  There should not be a lot of transactions occurring there and flatter structures can be much more beneficial.</description><pubDate>Mon, 15 Mar 2010 11:47:27 GMT</pubDate><dc:creator>SQLRNNR</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Paul White (3/15/2010)[/b][hr]Normalize 'til it hurts...de-normalize* 'til it works![/quote]Agreed.</description><pubDate>Mon, 15 Mar 2010 11:45:44 GMT</pubDate><dc:creator>SQLRNNR</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Les Cardwell (3/15/2010)[/b][hr]In spite of the criticism, it was still a simple example of minimal denormalization to achieve an end result rather than a full-on explosion of wide rows to reduce the NF to 0 :pinch:Interestingly enough, the biggest cost to the initial query, which probably exceeded benefits of denormalization, was using a 'function' in a WHERE predicate...[code="other"]WHERE P_MS.DateReceived &amp;gt; getdate() - 365[/code]...would have been better expressed declaring a scalar variable:DECLARE @selectDate = getdate()-365...WHERE P_MS.DateReceived &amp;gt; @selectDate......which would allow the optimizer to use an index on DateReceived.Unfortunately, denormalization for immutable datasets as we've used it in the past just doesn't scale, especially on large datasets...not to mention the escalating complexity (and headaches) it entails. Ironically, not even for data-warehouses that go the same route (MOLAP) vs. a Multi-dimensional ROLAP Snowflake Schema. The stastical implications are the subject of current research, though it's proving a bit of a challenge to account for all the complexities it can entail (code proliferation, data-correctness, increased complexity of refactoring to accomodate changing business rules, data-explosion, etc.). The data-tsunami is upon us :w00t:[/quote]Also, this:[code="sql"]DECLARE @selectDate = getdate()-365[/code]won't work.  In SQL Server 2008 it needs to be like this:[code="sql"]DECLARE @selectDate datetime = getdate()-365[/code]</description><pubDate>Mon, 15 Mar 2010 11:44:02 GMT</pubDate><dc:creator>Lynn Pettis</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Les Cardwell (3/15/2010)[/b][hr]In spite of the criticism, it was still a simple example of minimal denormalization to achieve an end result rather than a full-on explosion of wide rows to reduce the NF to 0 :pinch:Interestingly enough, the biggest cost to the initial query, which probably exceeded benefits of denormalization, was using a 'function' in a WHERE predicate...[code="other"]WHERE P_MS.DateReceived &amp;gt; getdate() - 365[/code]...would have been better expressed declaring a scalar variable:DECLARE @selectDate = getdate()-365...WHERE P_MS.DateReceived &amp;gt; @selectDate......which would allow the optimizer to use an index on DateReceived.Unfortunately, denormalization for immutable datasets as we've used it in the past just doesn't scale, especially on large datasets...not to mention the escalating complexity (and headaches) it entails. Ironically, not even for data-warehouses that go the same route (MOLAP) vs. a Multi-dimensional ROLAP Snowflake Schema. The stastical implications are the subject of current research, though it's proving a bit of a challenge to account for all the complexities it can entail (code proliferation, data-correctness, increased complexity of refactoring to accomodate changing business rules, data-explosion, etc.). The data-tsunami is upon us :w00t:[/quote]Actually, this:[code="sql"]WHERE P_MS.DateReceived &amp;gt; getdate() - 365[/code]can use an index on DateReceived.  The function call is on the right of the conditional and will only be calculated once.</description><pubDate>Mon, 15 Mar 2010 11:42:08 GMT</pubDate><dc:creator>Lynn Pettis</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Les Cardwell (3/15/2010)[/b][hr]Interestingly enough, the biggest cost to the initial query, which probably exceeded benefits of denormalization, was using a 'function' in a WHERE predicate...[code="other"]WHERE P_MS.DateReceived &amp;gt; getdate() - 365[/code]...would have been better expressed declaring a scalar variable:DECLARE @selectDate = getdate()-365...WHERE P_MS.DateReceived &amp;gt; @selectDate......which would allow the optimizer to use an index on DateReceived.[/quote]Nicely said Grasshopper :-)I like the idea of denormalization, but many people look for these types of articles to re-establish their non-existing point of designing a sloppy, good for nothing database.  They just totally ignore the last paragraph! :rolleyes:It usually takes alot of time and effort to denormalize a database.  But shouldn't you FIRST NORMALIZE then DENORMALIZE if benefit can be measured???? Right???  :-D</description><pubDate>Mon, 15 Mar 2010 11:30:54 GMT</pubDate><dc:creator>Jaji03</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>Both of these examples depict DSS type operations.  Rather then denormalize a live database I would prefer to created a data warehouse where denormalization is the norm.  I believe you denormalize during load testing and then only IF you have a significant performance issue.  Over time, a normalized database is easier to modify than a denormalized database.Another alternative would be to use all natural keys and that way the UID of the parent table would be carried down to the UID of the children, grandchildren etc.  Of course the big disadvantage to this approach is if you have many generations there would be more key columns than data columns in the lowest generation.</description><pubDate>Mon, 15 Mar 2010 11:28:11 GMT</pubDate><dc:creator>baconm-1145631</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>In spite of the criticism, it was still a simple example of minimal denormalization to achieve an end result rather than a full-on explosion of wide rows to reduce the NF to 0 :pinch:Interestingly enough, the biggest cost to the initial query, which probably exceeded benefits of denormalization, was using a 'function' in a WHERE predicate...[code="other"]WHERE P_MS.DateReceived &amp;gt; getdate() - 365[/code]...would have been better expressed declaring a scalar variable:DECLARE @selectDate = getdate()-365...WHERE P_MS.DateReceived &amp;gt; @selectDate......which would allow the optimizer to use an index on DateReceived.Unfortunately, denormalization for immutable datasets as we've used it in the past just doesn't scale, especially on large datasets...not to mention the escalating complexity (and headaches) it entails. Ironically, not even for data-warehouses that go the same route (MOLAP) vs. a Multi-dimensional ROLAP Snowflake Schema. The stastical implications are the subject of current research, though it's proving a bit of a challenge to account for all the complexities it can entail (code proliferation, data-correctness, increased complexity of refactoring to accomodate changing business rules, data-explosion, etc.). The data-tsunami is upon us :w00t:</description><pubDate>Mon, 15 Mar 2010 11:01:37 GMT</pubDate><dc:creator>Les Cardwell</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Paul White (3/15/2010)[/b][hr][quote][b]Alvin Ramard (3/15/2010)[/b][hr][i](Seriously, there was no pun intended.)[/i][/quote]Given your track record for bad puns, Alvin, I have my doubts :laugh:Benefit of the doubt. :-P[/quote]If a pun had been intended, I would have included a smiley.I can understand you comment.  I do have a reputation for trying to add a bit of humor to many situations.  :-)</description><pubDate>Mon, 15 Mar 2010 09:02:51 GMT</pubDate><dc:creator>Alvin Ramard</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Alvin Ramard (3/15/2010)[/b][hr][i](Seriously, there was no pun intended.)[/i][/quote]Given your track record for bad puns, Alvin, I have my doubts :laugh:Benefit of the doubt. :-P</description><pubDate>Mon, 15 Mar 2010 08:59:43 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Paul White (3/15/2010)[/b][hr]Jim,Yes.  Data warehouses are a totally different kettle.[/quote]It's normal for denormalization to be present in a data warehouse. [i](Seriously, there was no pun intended.)[/i]</description><pubDate>Mon, 15 Mar 2010 08:41:56 GMT</pubDate><dc:creator>Alvin Ramard</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>Jim,Yes.  Data warehouses are a totally different kettle.</description><pubDate>Mon, 15 Mar 2010 08:27:18 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Gilles Willard (3/15/2010)[/b][hr]It seems to me that what Timothy is proposing is simply adding a few FKs to his database design to cut down on the JOINs required to get at some data.How is that denormalizing?[/quote]I had the same assumption...that is until he mentioned the fact that he decreased the number of tables.  It would have been nice to have a diagram with the table design showing the suggested changes.</description><pubDate>Mon, 15 Mar 2010 08:26:20 GMT</pubDate><dc:creator>Jaji03</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>We use a de-normalized data-mart--I may be using the term loosely but it's esssentially "a separate place for up-to-yesterday production data" where semi-trained users can put together queries on our core data without getting into trouble with join syntax and without us investing in a huge Crystal Reports development effort.It makes a lot of sense for this because it's not the main production transactional db, we can get away with a read-optimized box with a few huge wide-flat tables rather than many tables with joins.Disk space is cheap.  Why make the same Y2k mistake and add unnecessary complexity to a data mart to save what amounts to in our case, about $100 worth of disk space.90% of the most commonly asked reports are no longer dumped on the desk of our IT staff--a small investment in simple query syntax training for end users solved most of the issues.Granted, we're small and simple but a similar approach this could be scaled.In addition, we have some data-specific issue where this helps. For example, Customer 123 might be "Bobs Widgets" today, but later he changes it to "Bob &amp; Sons Widgets".   I understand about temporal tables but if you've ever seen what joining a temporal table does to an execution plan (not to mention 8 or 10 temporal tables) then that's something for which I'm more than willing to add a redundant field.--Jim</description><pubDate>Mon, 15 Mar 2010 08:00:31 GMT</pubDate><dc:creator>James Stephens</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>The approach we are taking is to normalize as much as possible. Then if it just doesn't work or can be made efficient, we denormalize.</description><pubDate>Mon, 15 Mar 2010 07:35:22 GMT</pubDate><dc:creator>crussell-931424</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Gilles Willard (3/15/2010)[/b][hr]It seems to me that what Timothy is proposing is simply adding a few FKs to his database design to cut down on the JOINs required to get at some data.  How is that denormalizing?[/quote]This is why I think code examples would have helped.  My assumption is that he is de-normalizing the entire lower structure - putting the result of the JOINs into the parent table.  That is how I read it anyway.  I agree about the rocks by the way.</description><pubDate>Mon, 15 Mar 2010 07:18:28 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>Maybe this is due to my lack of experience, or the fact that it's been several years since I last studied database normalization, but:It seems to me that what Timothy is proposing is simply adding a few FKs to his database design to cut down on the JOINs required to get at some data.How is that denormalizing?I went back to my database design textbook (as well as the paper I wrote then on denormalization...)and could not figure out which form (1NF, 2NF or 3NF) was being violated?Also, and please forgive my fussiness with this but I have teachers in my family going back three generations, Timothy's article, while being interesting, could have used a quick run through spell-checking. This is not just a "purity" thing: it is a little jarring to find typos or spelling mistakes in a technical article while you're concentrating on understanding the author's point, something like stumbling over a rock while on a good walk.</description><pubDate>Mon, 15 Mar 2010 06:59:00 GMT</pubDate><dc:creator>Gilles Willard</dc:creator></item><item><title>RE: Denormalization Strategies</title><link>http://www.sqlservercentral.com/Forums/Topic882621-2631-1.aspx</link><description>[quote][b]Paul White (3/15/2010)[/b][hr]Normalize 'til it hurts...de-normalize* 'til it works![size="-1"][i]* appropriately[/i][/size][/quote]good point! :)</description><pubDate>Mon, 15 Mar 2010 06:28:38 GMT</pubDate><dc:creator>hakan.winther</dc:creator></item></channel></rss>