

Denormalization Strategies


Lynn Pettis
Les Cardwell (3/15/2010)
In spite of the criticism, it was still a simple example of minimal denormalization to achieve an end result, rather than a full-on explosion of wide rows to reduce the NF to 0.

Interestingly enough, the biggest cost in the initial query, which probably exceeded the benefits of denormalization, was using a 'function' in a WHERE predicate...

WHERE P_MS.DateReceived > getdate() - 365



...would have been better expressed declaring a scalar variable:

DECLARE @selectDate = getdate()-365
...
WHERE P_MS.DateReceived > @selectDate
...


...which would allow the optimizer to use an index on DateReceived.


Unfortunately, denormalization for immutable datasets as we've used it in the past just doesn't scale, especially on large datasets... not to mention the escalating complexity (and headaches) it entails. Ironically, it doesn't scale even for data warehouses that go the same route (MOLAP) vs. a multi-dimensional ROLAP snowflake schema. The statistical implications are the subject of current research, though it's proving a bit of a challenge to account for all the complexities it can entail (code proliferation, data correctness, increased complexity of refactoring to accommodate changing business rules, data explosion, etc.). The data tsunami is upon us.



Actually, this:

WHERE P_MS.DateReceived > getdate() - 365



can use an index on DateReceived. The function call is on the right of the conditional and will only be calculated once.
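A minimal sketch of the distinction being drawn here, assuming a hypothetical temp table (#Receipts and its index name are made up for illustration; only the column name DateReceived comes from the thread). The first query leaves the column bare on the left of the comparison, so the optimizer can typically seek on the index. The second wraps the column in a function, which typically forces a scan. The two filters are only roughly equivalent, since DATEDIFF counts day boundaries.

```sql
-- Hypothetical demo table, not the table from the article.
CREATE TABLE #Receipts
(
    ReceiptID    int IDENTITY(1,1) NOT NULL,
    DateReceived datetime NOT NULL
);
CREATE CLUSTERED INDEX IX_Receipts_DateReceived ON #Receipts (DateReceived);

-- Column stands alone; GETDATE() - 365 is computed on the right-hand side
-- and can serve as the boundary of an index seek.
SELECT ReceiptID, DateReceived
FROM #Receipts
WHERE DateReceived > GETDATE() - 365;

-- Function is applied to the column itself, so every row's DateReceived
-- has to be evaluated; an index seek on DateReceived is not available.
SELECT ReceiptID, DateReceived
FROM #Receipts
WHERE DATEDIFF(day, DateReceived, GETDATE()) < 365;

DROP TABLE #Receipts;
```

Comparing the actual execution plans of the two SELECT statements is the simplest way to confirm the behavior on a given version.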

Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Lynn Pettis

Also, this:

DECLARE @selectDate = getdate()-365


won't work. In SQL Server 2008 it needs to be like this:

DECLARE @selectDate datetime = getdate()-365



SQLRNNR
Paul White (3/15/2010)
Normalize 'til it hurts...de-normalize* 'til it works!



Agreed.



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw

SQLRNNR
Alvin Ramard (3/15/2010)
Paul White (3/15/2010)
Jim,

Yes. Data warehouses are a totally different kettle.


It's normal for denormalization to be present in a data warehouse.

(Seriously, there was no pun intended.)



Absolutely. There should not be a lot of transactions occurring there and flatter structures can be much more beneficial.




SQLRNNR
Lynn Pettis (3/15/2010)


Also, this:

DECLARE @selectDate = getdate()-365


won't work. In SQL Server 2008 it needs to be like this:

DECLARE @selectDate datetime = getdate()-365



For what it's worth, it doesn't work in 2005 either.

Cannot assign a default value to a local variable.






Lynn Pettis
CirquedeSQLeil (3/15/2010)
Lynn Pettis (3/15/2010)


Also, this:

DECLARE @selectDate = getdate()-365


won't work. In SQL Server 2008 it needs to be like this:

DECLARE @selectDate datetime = getdate()-365



For what it's worth, it doesn't work in 2005 either.

Cannot assign a default value to a local variable.




Nope, it doesn't. Being able to assign a value to a variable when it is declared is new to SQL Server 2008. Guess what, we upgraded our PeopleSoft systems to SQL Server 2008 EE. Now, we just need to start upgrading our other systems.

Les Cardwell

Also, this:

DECLARE @selectDate = getdate()-365


won't work. In SQL Server 2008 it needs to be like this:

DECLARE @selectDate datetime = getdate()-365




For what it's worth, it doesn't work in 2005 either.

Cannot assign a default value to a local variable.





Nope, it doesn't. Being able to assign a value to a variable when it is declared is new to SQL Server 2008. Guess what, we upgraded our PeopleSoft systems to SQL Server 2008 EE. Now, we just need to start upgrading our other systems.


Good catch on the 'type'.

Actually, in 2005 it needs to be...

DECLARE @selectDate DATETIME;
SET @selectDate = getdate() - 365;

I'm jumping around between SQL2000, SQL2005, SQL2008, Oracle10g, and DB2... nutz.

Dr. Les Cardwell, DCS-DSS
Enterprise Data Architect
Central Lincoln PUD
Les Cardwell

Actually, this:

WHERE P_MS.DateReceived > getdate() - 365



can use an index on DateReceived. The function call is on the right of the conditional and will only be calculated once.


Hmmm... positive? Since 'getdate()' is a non-deterministic function, we've always assigned it (like all non-deterministic functions) to a scalar variable to ensure the DBMS won't perform a table scan... although admittedly, these days that behavior seems to be more implementation-dependent.

From SQL Server Help...

For example, the function GETDATE() is nondeterministic. SQL Server puts restrictions on various classes of nondeterminism. Therefore, nondeterministic functions should be used carefully. The lack of strict determinism of a function can block valuable performance optimizations. Certain plan reordering steps are skipped to conservatively preserve correctness. Additionally, the number, order, and timing of calls to user-defined functions is implementation-dependent. Do not rely on these invocation semantics.


JFWIW...
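One quick way to probe the concern above is to check whether GETDATE() is re-evaluated per row. The sketch below assumes nothing beyond the sys.objects catalog view (SQL Server 2005 and later). GETDATE() is generally treated as a runtime constant, evaluated once per statement, while NEWID() is evaluated for each row, so the two columns should behave differently.

```sql
-- QueryTime should show the same value on every row of the result;
-- PerRowValue should differ from row to row.
SELECT TOP (5)
       GETDATE() AS QueryTime,
       NEWID()   AS PerRowValue
FROM sys.objects;
```

If QueryTime is identical across the rows, the non-determinism of GETDATE() is between executions, not between rows of a single query, which is why the optimizer can still use its value as a seek boundary.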

timclaason
Good points made. I have never found using getdate() inside a SQL query to be problematic in my execution plans. However, if it's "best practice" not to do it, then I'll probably stop. I had never thought about it before now.
Lynn Pettis



Pretty sure.

Table/Index defs


USE [SandBox]
GO
/****** Object: Table [dbo].[JBMTest] Script Date: 03/15/2010 12:49:16 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[JBMTest](
[RowNum] [int] IDENTITY(1,1) NOT NULL,
[AccountID] [int] NOT NULL,
[Amount] [money] NOT NULL,
[Date] [datetime] NOT NULL
) ON [PRIMARY]

GO

/****** Object: Index [IX_JBMTest] Script Date: 03/15/2010 12:49:16 ******/
CREATE CLUSTERED INDEX [IX_JBMTest] ON [dbo].[JBMTest]
(
[Date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

/****** Object: Index [IX_JBMTest_AccountID_Date] Script Date: 03/15/2010 12:49:16 ******/
CREATE NONCLUSTERED INDEX [IX_JBMTest_AccountID_Date] ON [dbo].[JBMTest]
(
[AccountID] ASC,
[Date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]



Simple query:


select * from dbo.JBMTest where Date > getdate() - 365



Actual execution plan attached.

There are 1,000,000 records in the test table.
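For anyone who wants to repeat the comparison, here is a sketch that runs both forms discussed in this thread against the test table above and reports logical reads for each. It assumes the SandBox database and dbo.JBMTest exist as scripted and are already populated; the population script is not shown in this post. The variable is declared and assigned in two steps so the batch also runs on SQL Server 2005.

```sql
USE [SandBox];
GO
SET STATISTICS IO ON;

-- Form 1: expression written inline in the predicate.
SELECT * FROM dbo.JBMTest WHERE [Date] > GETDATE() - 365;

-- Form 2: value assigned to a scalar variable first.
DECLARE @selectDate datetime;
SET @selectDate = GETDATE() - 365;
SELECT * FROM dbo.JBMTest WHERE [Date] > @selectDate;

SET STATISTICS IO OFF;
GO
```

Run it with the actual execution plan enabled and compare the operators on the clustered index. One thing to keep in mind: the value of a local variable is not known when the plan is compiled, so the estimated row counts for the second query may differ from the first even when both plans seek.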

Attachments
IndexTest.sqlplan (28 views, 4.00 KB)