
Denormalization Strategies
Posted Monday, March 15, 2010 11:44 AM


Les Cardwell (3/15/2010)
In spite of the criticism, it was still a simple example of minimal denormalization to achieve an end result rather than a full-on explosion of wide rows to reduce the NF to 0

Interestingly enough, the biggest cost to the initial query, which probably exceeded benefits of denormalization, was using a 'function' in a WHERE predicate...

WHERE P_MS.DateReceived > getdate() - 365

...would have been better expressed declaring a scalar variable:

DECLARE @selectDate = getdate()-365
...
WHERE P_MS.DateReceived > @selectDate
...


...which would allow the optimizer to use an index on DateReceived.


Unfortunately, denormalization for immutable datasets as we've used it in the past just doesn't scale, especially on large datasets...not to mention the escalating complexity (and headaches) it entails. Ironically, not even for data-warehouses that go the same route (MOLAP) vs. a Multi-dimensional ROLAP Snowflake Schema. The statistical implications are the subject of current research, though it's proving a bit of a challenge to account for all the complexities it can entail (code proliferation, data-correctness, increased complexity of refactoring to accommodate changing business rules, data-explosion, etc.). The data-tsunami is upon us.


Also, this:
DECLARE @selectDate = getdate()-365

won't work. In SQL Server 2008 it needs to be like this:
DECLARE @selectDate datetime = getdate()-365




Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #883195
Posted Monday, March 15, 2010 11:45 AM


Paul White (3/15/2010)
Normalize 'til it hurts...de-normalize* 'til it works!



Agreed.




Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw
Post #883198
Posted Monday, March 15, 2010 11:47 AM


Alvin Ramard (3/15/2010)
Paul White (3/15/2010)
Jim,

Yes. Data warehouses are a totally different kettle.


It's normal for denormalization to be present in a data warehouse.

(Seriously, there was no pun intended.)



Absolutely. There should not be a lot of transactions occurring there and flatter structures can be much more beneficial.




Jason AKA CirqueDeSQLeil
Post #883201
Posted Monday, March 15, 2010 11:51 AM


Lynn Pettis (3/15/2010)
Also, this:
DECLARE @selectDate = getdate()-365

won't work. In SQL Server 2008 it needs to be like this:
DECLARE @selectDate datetime = getdate()-365



For what it's worth, it doesn't work in 2005 either.
Cannot assign a default value to a local variable.





Jason AKA CirqueDeSQLeil
Post #883206
Posted Monday, March 15, 2010 11:58 AM


CirquedeSQLeil (3/15/2010)
Lynn Pettis (3/15/2010)
Also, this:
DECLARE @selectDate = getdate()-365

won't work. In SQL Server 2008 it needs to be like this:
DECLARE @selectDate datetime = getdate()-365



For what it's worth, it doesn't work in 2005 either.
Cannot assign a default value to a local variable.



Nope, it doesn't. Being able to assign a value to a variable when it is declared is new to SQL Server 2008. Guess what, we upgraded our PeopleSoft systems to SQL Server 2008 EE. Now, we just need to start upgrading our other systems.



Lynn Pettis

Post #883211
Posted Monday, March 15, 2010 12:10 PM



Also, this:
DECLARE @selectDate = getdate()-365

won't work. In SQL Server 2008 it needs to be like this:
DECLARE @selectDate datetime = getdate()-365




For what it's worth, it doesn't work in 2005 either.
Cannot assign a default value to a local variable.




Nope, it doesn't. Being able to assign a value to a variable when it is declared is new to SQL Server 2008. Guess what, we upgraded our PeopleSoft systems to SQL Server 2008 EE. Now, we just need to start upgrading our other systems.


Good catch on the 'type' :)

Actually, in 2005 it needs to be...

DECLARE @selectDate DATETIME
SET @selectDate = getdate() - 365
;

I'm jumping around between SQL2000, SQL2005, SQL2008, Oracle10g, and DB2... nutz.


Dr. Les Cardwell, DCS-DSS
Enterprise Data Architect
Central Lincoln PUD
Post #883231
Posted Monday, March 15, 2010 12:17 PM



Actually, this:
  WHERE P_MS.DateReceived > getdate() - 365

can use an index on DateReceived. The function call is on the right of the conditional and will only be calculated once.


Hmmm... positive? Since 'getdate()' is a non-deterministic function, we've always assigned it (like all non-deterministic functions) to a scalar variable to ensure the DBMS won't perform a table scan...although admittedly, these days that behavior seems to be more implementation-dependent.

From SQL Server Help...

For example, the function GETDATE() is nondeterministic. SQL Server puts restrictions on various classes of nondeterminism. Therefore, nondeterministic functions should be used carefully. The lack of strict determinism of a function can block valuable performance optimizations. Certain plan reordering steps are skipped to conservatively preserve correctness. Additionally, the number, order, and timing of calls to user-defined functions is implementation-dependent. Do not rely on these invocation semantics.


JFWIW...


Dr. Les Cardwell, DCS-DSS
Enterprise Data Architect
Central Lincoln PUD
Post #883236
Posted Monday, March 15, 2010 12:39 PM
Good points made. I have never found using getdate() inside a SQL query to be problematic in my execution plans. However, if it's "best practice" not to do it, then I'll probably stop. I had never thought about it before now.
Post #883257
Posted Monday, March 15, 2010 12:52 PM


Les Cardwell (3/15/2010)

Also, this:
DECLARE @selectDate = getdate()-365

won't work. In SQL Server 2008 it needs to be like this:
DECLARE @selectDate datetime = getdate()-365




For what it's worth, it doesn't work in 2005 either.
Cannot assign a default value to a local variable.




Nope, it doesn't. Being able to assign a value to a variable when it is declared is new to SQL Server 2008. Guess what, we upgraded our PeopleSoft systems to SQL Server 2008 EE. Now, we just need to start upgrading our other systems.


Good catch on the 'type' :)

Actually, in 2005 it needs to be...

DECLARE @selectDate DATETIME
SET @selectDate = getdate() - 365
;

I'm jumping around between SQL2000, SQL2005, SQL2008, Oracle10g, and DB2... nutz.



Pretty sure.

Table/Index defs

USE [SandBox]
GO
/****** Object: Table [dbo].[JBMTest] Script Date: 03/15/2010 12:49:16 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[JBMTest](
[RowNum] [int] IDENTITY(1,1) NOT NULL,
[AccountID] [int] NOT NULL,
[Amount] [money] NOT NULL,
[Date] [datetime] NOT NULL
) ON [PRIMARY]

GO

/****** Object: Index [IX_JBMTest] Script Date: 03/15/2010 12:49:16 ******/
CREATE CLUSTERED INDEX [IX_JBMTest] ON [dbo].[JBMTest]
(
[Date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

/****** Object: Index [IX_JBMTest_AccountID_Date] Script Date: 03/15/2010 12:49:16 ******/
CREATE NONCLUSTERED INDEX [IX_JBMTest_AccountID_Date] ON [dbo].[JBMTest]
(
[AccountID] ASC,
[Date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

Simple query:

select * from dbo.JBMTest where Date > getdate() - 365

Actual execution plan attached.

There are 1,000,000 records in the test table.
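
For anyone wanting to repeat the comparison, here is a minimal sketch that runs the forms discussed in this thread side by side. It assumes the dbo.JBMTest table and indexes defined above already exist and are populated. The third query, which wraps the column in DATEADD, does not appear in the thread; it is added here only as a contrast case, since a function applied to the column itself (rather than to the right-hand side) is the pattern that usually prevents an index seek. No results are shown because they depend on the data and the server version.

```sql
-- Sketch only: assumes dbo.JBMTest and its indexes exist as defined above.
SET STATISTICS IO ON;

-- 1. Expression on the right-hand side only; the [Date] column is left bare.
SELECT * FROM dbo.JBMTest WHERE [Date] > GETDATE() - 365;

-- 2. Same cutoff held in a variable (syntax that also works on SQL Server 2005).
DECLARE @selectDate datetime;
SET @selectDate = GETDATE() - 365;
SELECT * FROM dbo.JBMTest WHERE [Date] > @selectDate;

-- 3. Contrast case (not from the thread): the function wraps the column,
--    so the predicate generally cannot be matched to an index on [Date].
SELECT * FROM dbo.JBMTest WHERE DATEADD(day, 365, [Date]) > GETDATE();

SET STATISTICS IO OFF;
```

Comparing the actual execution plans and the logical reads reported for each statement should show whether a given form seeks or scans on your own instance.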



Lynn Pettis



  Post Attachments 
IndexTest.sqlplan (19 views, 4.81 KB)
Post #883274
Posted Monday, March 15, 2010 1:10 PM


Lynn Pettis (3/15/2010)

USE [SandBox]
GO
/****** Object: Table [dbo].[JBMTest] Script Date: 03/15/2010 12:49:16 ******/



Looks like a familiar setup




Jason AKA CirqueDeSQLeil
Post #883289