Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««7891011

The Numbers Table Expand / Collapse
Author
Message
Posted Friday, November 28, 2008 8:53 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:56 PM
Points: 36,772, Visits: 31,227
Wow... now that's interesting. We need to find out what the differences are...

Let's start out with one of my previous requests:

1. What is the CPU, Ram, Harddrive, etc, for your box.
2. What version, sp, and cummulative update for SQL Server are you using?

As a side bar, the one thing that this proves, so far, is that the previous claims of a lot of rows in the RefDate table wasn't the problem and neither was the set based solution. I'd say, we've shifted gears... on every system I've ever seen, the RBAR solution looses very badly. Yet, on your system, it's as good as and, apparently, sometimes better than the set based solution. WHAT is the difference that makes this possible?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #610424
Posted Friday, November 28, 2008 8:57 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, July 25, 2014 3:18 AM
Points: 323, Visits: 2,200
Lynn Pettis (11/28/2008)
Time to compare functions.

In the code below I create a 1,000,000 row Tally table starting at 1. The table has one column, n. This is also clustered primary key.

I then create two functions, the first is the RBAR TVF, the second is a NoRBAR TVF using the Tally table.

I then use these functions to load a Table variable with dates.


Can you please look at the test code I posted earlyer and add RBAR and non-RBAR functions based on the tally FUNCTION I provided? I do not intend to cloud the issue, but a lot of the time a tally function is better then a tally table.

And most non-inline functions need an indexed table result to perform optimal, your test code had none atm. In my tests that alone halved the time of the queries, be it an RBAR or a non-RBAR function being used.
Post #610430
Posted Friday, November 28, 2008 9:01 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:56 PM
Points: 36,772, Visits: 31,227
Jeff Moden (11/28/2008)
Wow... now that's interesting. We need to find out what the differences are...

Let's start out with one of my previous requests:

1. What is the CPU, Ram, Harddrive, etc, for your box.
2. What version, sp, and cummulative update for SQL Server are you using?

As a side bar, the one thing that this proves, so far, is that the previous claims of a lot of rows in the RefDate table wasn't the problem and neither was the set based solution. I'd say, we've shifted gears... on every system I've ever seen, the RBAR solution looses very badly. Yet, on your system, it's as good as and, apparently, sometimes better than the set based solution. WHAT is the difference that makes this possible?


Oh, yeah... I almost forgot. It's silly, but we need to know for sure... please also verify that the row counts and the data being returned for both are identical. Thanks.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #610432
Posted Friday, November 28, 2008 9:03 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:56 PM
Points: 36,772, Visits: 31,227
Hey folks... I've got outside "obligations"... I probably won't be able to return to this for a while. Keep focused... we need to find out why the system jacRoberts is running on handles the RBAR solution so very well. Heh... no feather fluffing, name calling, or stating the "obvious", eh? Stick to the subject... it's a big sandbox with room enough for us all. :P

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #610437
Posted Friday, November 28, 2008 9:15 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Wednesday, July 23, 2014 3:19 AM
Points: 338, Visits: 1,421
Jeff Moden (11/28/2008)
jacRoberts,

I have SET NOCOUNT ON in the test code so I can't do the verification. Pardon me for asking stupid questions, but I have to check... Have you verified that both result sets are returning both identical rows and an identical number of rows?

There's something else going on here and the devil may be in the data...

Using the methods found in the link in my signature, would you post the table creation statements for the Event and RefDate tables that YOU have along with 10 rows for each. Be sure to include all indexes and keys on each, please. This is very interesting and I'm trying to get to the bottom of it because it could be a "fix" for other RBAR problems.

Also, it would be good to know more about the machine and the exact version of SQL Server that you're using to do this. Thanks.


The machine is
Windows 2003 SP1
3584 MB
3.60 GHz Intel Xeon (Hyper-threaded)

Microsoft SQL Server 2005 - 9.00.3042.00 (Intel X86) Feb 9 2007 22:47:07 Copyright (c) 1988-2005 Microsoft Corporation Standard Edition on Windows NT 5.2 (Build 3790: Service Pack 1)

The table is:
/****** Object:  Table [dbo].[Event]    Script Date: 11/28/2008 16:08:09 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Event](
[EventID] [int] IDENTITY(1,1) NOT NULL,
[SomeInt] [int] NULL,
[SomeLetters2] [char](2) NULL,
[SomeCSV] [varchar](80) NULL,
[SomeMoney] [money] NULL,
[SomeDate] [datetime] NULL,
[SomeHex12] [varchar](12) NULL,
PRIMARY KEY CLUSTERED
(
[EventID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

G

O


Event Rows

1 34148 AL Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10 59.51 2002-09-07 00:00:00.000 C2E0CE2DB52A
2 48444 AH Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10 43.92 2009-12-19 00:00:00.000 43C6CBC359A5
3 43988 FG Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10 38.34 2001-02-24 00:00:00.000 20ED33B58595
4 7548 IG Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10 46.25 2004-01-02 00:00:00.000 E4AA25010C59
5 40033 UE Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10 89.17 2005-07-31 00:00:00.000 1A6C3EDF8349
6 24973 XL Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10 2.86 2008-04-26 00:00:00.000 67611BAE6164
7 43800 HX Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10 41.12 2005-01-02 00:00:00.000 AFA9625FE532
8 10951 FF Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10 50.07 2005-06-07 00:00:00.000 13847D7D89A7
9 22175 TK Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10 45.05 2003-08-04 00:00:00.000 83B9227B2A84
10 47765 GZ Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10 8.17 2000-10-26 00:00:00.000 819F6E858E39

RefDate:

CREATE TABLE [dbo].[RefDate](
[Date] [datetime] NOT NULL,
CONSTRAINT [PK_Date_Date] PRIMARY KEY CLUSTERED
(
[Date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 100) ON [PRIMARY]
) ON [PRIMARY]


2000-01-02 00:00:00.000
2000-01-03 00:00:00.000
2000-01-04 00:00:00.000
2000-01-05 00:00:00.000
2000-01-06 00:00:00.000
2000-01-07 00:00:00.000
2000-01-08 00:00:00.000
2000-01-09 00:00:00.000
2000-01-10 00:00:00.000
2000-01-11 00:00:00.000

Both queries return the same rows.
Post #610442
Posted Friday, November 28, 2008 9:22 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, July 25, 2014 3:18 AM
Points: 323, Visits: 2,200
We might also need to make sure all tests are performed on a database with the same recovery modal (tempdb uses 'simple') and a comparable load if we are going into tiny differences.
Post #610448
Posted Friday, November 28, 2008 10:47 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Wednesday, July 23, 2014 3:19 AM
Points: 338, Visits: 1,421
My original actual query was a lot different and maybe I've over simplified it for want of a simple example. I used to have two tables, RefDates which was just a table of dates and RefTimePeriods which had a row for each 15 minute period in a day (96 rows in total). I then had a view which was a cross product of the RefDates and RefTimePeriods tables.


    CREATE TABLE [dbo].[RefDates]
(
[Date] datetime NOT NULL,
CONSTRAINT [PK_RefDates] PRIMARY KEY CLUSTERED
(
[Date] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]


CREATE TABLE [dbo].[RefTimePeriods]
(
[StartTime] datetime NOT NULL,
[EndTime] datetime NOT NULL,
[Hour] tinyint NOT NULL,
[Quarter] tinyint NOT NULL,
CONSTRAINT [PK_RefTimePeriods] PRIMARY KEY CLUSTERED
(
[Hour] ASC,
[Quarter] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]


CREATE VIEW [dbo].[RefDateTimePeriods] AS
(
SELECT RD.Date Date,
RT.Hour,
RT.Quarter
FROM RefDates RD,
RefTimePeriods RT
)

I changed the RefDateTimePeriods view to be a TVF that used the GenRefDates TVF.

CREATE FUNCTION dbo.RefDateTimePeriods
(
@StartDateTime smalldatetime,
@EndDateTime smalldatetime
)
RETURNS @Table TABLE
(
Date datetime,
Hour tinyint,
Quarter tinyint
)
AS
BEGIN
INSERT INTO @Table(Date, Hour, Quarter)
SELECT RD.Date Date,
RT.Hour,
RT.Quarter
FROM dbo.GenRefDates(@StartDateTime, @EndDateTime) RD
CROSS JOIN dbo.RefTimePeriods RT
RETURN
END
GO

The RefDateTimePeriods TVF was then used in another view

CREATE VIEW Downtime AS 
(
SELECT *
FROM RefDateTimePeriods R
LEFT JOIN myTable1 S
ON S.Date = R.Date
AND S.Hour = R.Hour
AND S.Quarter = R.Quarter
LEFT JOIN myTable2 A
ON A.Date = S.Date
AND A.Hour = S.Hour
AND A.Quarter = S.Quarter
AND A.myColumn1 IN (2, 3, 4, 5, 6, 7, 8, 9, 10)
)

This Downtime view is used in reports with outer joins and these ran slowly but sped up when a RBAR TVF was added that only returned the required rows. The difference in the run time of the two methods was orders of magnitude. If I had used a set based query in the TVF instead of an RBAR then I would potentially save a couple of milliseconds but this would be totally insignificant to the total run time of the report.
Post #610490
Posted Friday, November 28, 2008 4:15 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:56 PM
Points: 36,772, Visits: 31,227
jacroberts (11/28/2008)
My original actual query was a lot different and maybe I've over simplified it for want of a simple example.



I agree and the change you made seems to have also changed the basic logic of what they had done. It's entirely possible that they had formed an inadvertant triangular or cross join that produced millions of "internal" rows and the change you made, somehow resolved that.

Going back to the code and data you posted in response to my questions... it's obvious to me that you used the same code I posted and that's what I wanted to know so we can compare apples to apples. I actually expected the RBAR and the set based code to work within milliseconds of each other because it was my understanding that the TVF would calculate just a single time because it's in the FROM clause as a derived table. That, notwithstanding, let's review what we know...

1. We just proved that the performance problem had nothing to do with the size of the date table they were using.
2. We just proved that a set based solution was never the problem and caused no performance problem. Improperly written code that looked set based may have been the problem.
3. We are (or, at least I am) speculating that your change to the RBAR function was more a symptomatic fix of their problem because you may have resolved an accidental non-linear join and that a similar fix using proper set based methods probably would have also fixed the problem.
4. I have to go back through Peter's tests and see what's up with those, but I suspect we'll find what has happened on all other machines... either the RBAR performs horribly slower, or it approximately ties within some miniscule fraction of a second owing only to the extremely low row count of 31. The "tie" is more what I would have expected because I was under the impression that TVF's used as derived tables would only be resolved once.
5. We still don't know why the "tie" occurs on jacRoberts machine, but doesn't on some other machines. This is particularly confusing for me because jacRoberts is using the "Standard Edition" and I'm using the supposedly more robust "Developer's Edition" which is actually the "Enterprise Edition" with nothing more different than "licensing".

While it would still be interesting why jacRobert's machine seems to have correctly resolved the TVF as a single result set and why mine did not (remember... IDENTICAL CODE was used on both), it's actually become further proof that RBAR should never be used. On some machines, it'll work just fine for this very limited requirement of only 31 rows... on others, the RBAR code takes a relatively huge amount more. Further, if some poor developer, caught in the squeeze of producing something on time, finds the RBAR code and uses it in an inappropriate manner rather than for generating only 31 rows on a very repeatable basis like a split for a large batch file or multiple long winded splits on parameters from a GUI, (s)he'll be in deep Kimchie when it comes to performance and scalability.

So, the bottom line is, jacRoberts was right about his fix... despite it being RBAR, it worked for him just fine because it correctly resolved only 1 time on his machine and for only 31 days. We proved in a small million row example that proper set based code has the same effect on those small number of rows. I'd use the method that worked with some good performance on both machines... that means I'd recommend not using the RBAR solution... ever.

jacRoberts and Peter... thanks for hanging in there on this very long thread.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #610596
Posted Monday, December 1, 2008 12:17 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Friday, December 6, 2013 8:32 AM
Points: 39, Visits: 298
Reading through this thread after the long w/e, I think people might be taking certain absolute conclusions about the way SS05 operates, specifically regards the optimizer (or lack thereof in some cases).

Some of you might be interested to revisit this old threadlet, discussing TVFs and RBAR, starting with a post I made: http://www.sqlservercentral.com/Forums/FindPost456536.aspx



The End.
Post #611585
« Prev Topic | Next Topic »

Add to briefcase «««7891011

Permissions Expand / Collapse