Function works when passing in parameters, does not work with CROSS APPLY

Question

Post reply

Function works when passing in parameters, does not work with CROSS APPLY

kyagi.jo

SSCrazy

Points: 2559
More actions
March 1, 2017 at 2:54 am

#319156

Hi,
I have a function that looks to work fine when I pass in parameters, but when I pass in a single rowed table with those same parameters, it does not complete. Can anyone advise why this might be the case? Looking into it, it is something to do with how I pass in my begin date, as the function works fine if I pass in all parameters, other than hardcoding this single parameter.
I can also get it to work by rearranging the joins slightly, but I'm trying to find an efficient execution plan for this, and the one with my rearranged joins is grabbing a lot of rows initially, which I am trying to avoid through the rewrite.
This is the function:

CREATE FUNCTION [dbo].[itvf_GetMissingReads_nulls] (@BeginDate DATETIME, @EndDate DATETIME, @meter_point NVARCHAR(50) )
RETURNS TABLE
AS
RETURN
/*
DECLARE @begindate datetime
DECLARE @enddate datetime
DECLARE @meter_point nvarchar(50)
SET @begindate = '20151125'
SET @enddate = '20151127'
SET @meter_point = 'xyz';*/
WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows
E(N) AS(SELECT ROW_NUMBER() OVER (ORDER BY n) FROM e4)
SELECT DATEADD(DAY, N-1, @BeginDate) [DateTime], tmp.settlement_key, dt.date_key, @meter_point AS meter_point--, fc.read_value,
FROM E
-- matrix of dates
CROSS JOIN ( SELECT ds.settlement_key, ds.settlement_type_key, ds.settlement_period, ds.settlement_name , dm.meter_point
FROM dimmeter dm
INNER JOIN dbo.DimSettlement ds ON ds.settlement_type_key = dm.settlement_type_key
WHERE meter_point = @meter_point
AND dm.end_date IS NULL AND dm.fact_loading_enabled = 1 ) AS tmp
INNER JOIN dimtime dt ON dt.date_value = DATEADD(DAY, N-1, @BeginDate)
-- I changed the join order by placing join here
INNER JOIN dimmeter dm ON tmp.meter_point = dm.meter_point
LEFT outer JOIN dbo.FactConsumption fc
-- works with this line instead
--INNER JOIN dimmeter dm ON dm.meter_key = fc.meter_key AND dm.meter_point = @meter_point
ON fc.meter_key = dm.meter_key
AND fc.date_key = dt.date_key
AND fc.settlement_key = tmp.settlement_key
AND fc.read_type_key = 1
WHERE /*DATEADD(DAY, N-1, @BeginDate) >= @begindate
AND DATEADD(DAY, N-1, @BeginDate) <= @enddate */
DATEDIFF(DAY, @BeginDate, @EndDate) >= N-1
GROUP BY E.N, tmp.settlement_key, dt.date_key
HAVING SUM(ISNULL(fc.date_key,0))=0
;
GO
This is how I am running the function, and the results I am seeing:
--works
SELECT * FROM [dbo].[itvf_GetMissingReads_nulls]('20151125','20151127', '00000000000035050475')
ORDER BY 1,2
--works
DECLARE @meter_table TABLE (
meter_point nvarchar(50)
, min_date DATETIME
, max_date DATETIME
) ;
INSERT INTO @meter_table
( meter_point, min_date, max_date )
VALUES ( N'00000000000035050475', -- meter_point - nvarchar(50)
'20151125', -- min_date - datetime
'20151127' -- max_date - datetime
)
SELECT mt.min_date, mt.max_date, tbl.*
FROM @meter_table mt
CROSS APPLY [dbo].[itvf_GetMissingReads_nulls]('20151125', mt.max_date, mt.meter_point) tbl
-- doesn't work!
DECLARE @meter_table TABLE (
meter_point nvarchar(50)
, min_date DATETIME
, max_date DATETIME
) ;
INSERT INTO @meter_table
( meter_point, min_date, max_date )
VALUES ( N'00000000000035050475', -- meter_point - nvarchar(50)
'20151125', -- min_date - datetime
'20151127' -- max_date - datetime
)
SELECT mt.min_date, mt.max_date, tbl.*
FROM @meter_table mt
CROSS APPLY [dbo].[itvf_GetMissingReads_nulls](mt.min_date, mt.max_date, mt.meter_point) tbl
Any assistance would be appreciated - I'm a bit stumped at this point, although I will do some more reading up on how CROSS APPLY works.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

When you say 'doesn't work', what do you mean?

No results?
Wrong results?
Error message?
Something else

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

kyagi.jo SSCrazy Points: 2559 More actions · Answer 2

It just keeps running forever, whereas it normally would finish in less than a second. Expected results for me would be returning a dataset of maybe around 300 rows, which denote periods where I am missing data.

kyagi.jo SSCrazy Points: 2559 More actions · Answer 3

Hmm I just realised even though it never completes, I can still look at the estimated plan. It definitely looks significantly different when I pass in the value in the table, rather than hardcode it - it is basically returning every row in my fact table multiple times, as the estimate row count is something like 5 billion!

I'll look into whether I can add some query hints to guide it better, I see that in one query plan it is a hash match inner join, in the other, it is a nested loop inner join.

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 4

Possibly a parameter sniffing problem. Why are you staging the parameter values in @meter_table instead of passing them directly to the function?

John

kyagi.jo SSCrazy Points: 2559 More actions · Answer 5

Ok, I fixed the issue by joining onto my fact table specifically using a loop join, rather than the generic join. It now runs like greased lightning

I wanted to run the function for a table. I was making a one row table as the simple case (it wasn't even working for this, so it definitely wasn't working for the dataset that I wanted to run this function through)

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 6

kyagi.jo - Wednesday, March 1, 2017 3:44 AM
It just keeps running forever, whereas it normally would finish in less than a second. Expected results for me would be returning a dataset of maybe around 300 rows, which denote periods where I am missing data.

Of course it does. Even with a range of three days inclusive, your code generates the whole set of 10,000 dates:

DECLARE @begindate datetime

DECLARE @enddate datetime

SET @begindate = '20151125'

SET @enddate = '20151127'

WITH

E1(N) AS (

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

), -- 1*10^1 or 10 rows

E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows

E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows

E(N) AS(SELECT ROW_NUMBER() OVER (ORDER BY n) FROM e4)

SELECT

DATEADD(DAY, N-1, @BeginDate) [DateTime] --, tmp.settlement_key, dt.date_key, @meter_point AS meter_point--, fc.read_value,

FROM E -- matrix of dates

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

kyagi.jo SSCrazy Points: 2559 More actions · Answer 7

I think that's how tally tables work though? I haven't been able to track any slowness down to the use of the tally table, at least as far as I can see in my query plan. If anything, using the tally tables has sped up the query from its previous incarnation.

Jeff Moden SSC Guru Points: 1004591 More actions · Answer 8

ChrisM@Work - Wednesday, March 1, 2017 5:31 AM
kyagi.jo - Wednesday, March 1, 2017 3:44 AM
It just keeps running forever, whereas it normally would finish in less than a second. Expected results for me would be returning a dataset of maybe around 300 rows, which denote periods where I am missing data.
Of course it does. Even with a range of three days inclusive, your code generates the whole set of 10,000 dates:
DECLARE @begindate datetime
DECLARE @enddate datetime
SET @begindate = '20151125'
SET @enddate = '20151127'
WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows
E(N) AS(SELECT ROW_NUMBER() OVER (ORDER BY n) FROM e4)
SELECT
DATEADD(DAY, N-1, @BeginDate) [DateTime] --, tmp.settlement_key, dt.date_key, @meter_point AS meter_point--, fc.read_value,
FROM E -- matrix of dates

I was also going to point that out. There's a huge accidental cross join formed within the function. For a separate run, it may look fine but people tend to forget that iTVFs don't actually run separately and should never be treated as if they were a separate, materialized result set. The code of an iTVF is incorporated into the execution plan of the calling code just like it would for a view.

What needs to be done inside the function is that the value of e.N needs to be limited very early and in such a fashion that the optimizer doesn't bypass the early limit. That usually means that a TOP (someformula) should be included to limit the value of e.N.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

kyagi.jo SSCrazy Points: 2559 More actions · Answer 9

This was the intention of the line:
DATEDIFF(DAY, @BeginDate, @EndDate) >= N-1

Was this not done soon enough, as it is in the WHERE clause?

Luis Cazares SSC Guru Points: 183706 More actions · Answer 10

kyagi.jo - Wednesday, March 1, 2017 7:23 AM
I think that's how tally tables work though? I haven't been able to track any slowness down to the use of the tally table, at least as far as I can see in my query plan. If anything, using the tally tables has sped up the query from its previous incarnation.

No, you should limit the rows to avoid extra work. Here's an example on how to do it.
CREATE FUNCTION [dbo].[itvf_GetMissingReads_nulls] (@BeginDate DATETIME, @EndDate DATETIME, @meter_point NVARCHAR(50) ) RETURNS TABLE AS RETURN /* DECLARE @begindate datetime DECLARE @enddate datetime DECLARE @meter_point nvarchar(50) SET @begindate = '20151125' SET @enddate = '20151127' SET @meter_point = 'xyz';*/ WITH E1(N) AS ( SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 ), -- 1*10^1 or 10 rows E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows E (N) AS (SELECT 0 UNION ALL SELECT TOP(DATEDIFF(DAY, @BeginDate, @EndDate)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM e4) SELECT DATEADD(DAY, N, @BeginDate) [DateTime], tmp.settlement_key, dt.date_key, @meter_point AS meter_point --, fc.read_value, FROM E -- matrix of dates CROSS JOIN ( SELECT ds.settlement_key, ds.settlement_type_key, ds.settlement_period, ds.settlement_name , dm.meter_point FROM dimmeter dm INNER JOIN dbo.DimSettlement ds ON ds.settlement_type_key = dm.settlement_type_key WHERE meter_point = @meter_point AND dm.end_date IS NULL AND dm.fact_loading_enabled = 1 ) AS tmp INNER JOIN dimtime dt ON dt.date_value = DATEADD(DAY, N, @BeginDate) -- I changed the join order by placing join here INNER JOIN dimmeter dm ON tmp.meter_point = dm.meter_point LEFT outer JOIN dbo.FactConsumption fc -- works with this line instead --INNER JOIN dimmeter dm ON dm.meter_key = fc.meter_key AND dm.meter_point = @meter_point ON fc.meter_key = dm.meter_key AND fc.date_key = dt.date_key AND fc.settlement_key = tmp.settlement_key AND fc.read_type_key = 1 GROUP BY E.N, tmp.settlement_key, dt.date_key HAVING SUM(ISNULL(fc.date_key,0))=0;

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 11

Luis Cazares - Wednesday, March 1, 2017 7:36 AM
kyagi.jo - Wednesday, March 1, 2017 7:23 AM
I think that's how tally tables work though? I haven't been able to track any slowness down to the use of the tally table, at least as far as I can see in my query plan. If anything, using the tally tables has sped up the query from its previous incarnation.
No, you should limit the rows to avoid extra work. Here's an example on how to do it.
CREATE FUNCTION [dbo].[itvf_GetMissingReads_nulls] (@BeginDate DATETIME, @EndDate DATETIME, @meter_point NVARCHAR(50) ) RETURNS TABLE AS RETURN /* DECLARE @begindate datetime DECLARE @enddate datetime DECLARE @meter_point nvarchar(50) SET @begindate = '20151125' SET @enddate = '20151127' SET @meter_point = 'xyz';*/ WITH E1(N) AS ( SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 ), -- 1*10^1 or 10 rows E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows E (N) AS (SELECT 0 UNION ALL SELECT TOP(DATEDIFF(DAY, @BeginDate, @EndDate)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM e4) SELECT DATEADD(DAY, N, @BeginDate) [DateTime], tmp.settlement_key, dt.date_key, @meter_point AS meter_point --, fc.read_value, FROM E -- matrix of dates CROSS JOIN ( SELECT ds.settlement_key, ds.settlement_type_key, ds.settlement_period, ds.settlement_name , dm.meter_point FROM dimmeter dm INNER JOIN dbo.DimSettlement ds ON ds.settlement_type_key = dm.settlement_type_key WHERE meter_point = @meter_point AND dm.end_date IS NULL AND dm.fact_loading_enabled = 1 ) AS tmp INNER JOIN dimtime dt ON dt.date_value = DATEADD(DAY, N, @BeginDate) -- I changed the join order by placing join here INNER JOIN dimmeter dm ON tmp.meter_point = dm.meter_point LEFT outer JOIN dbo.FactConsumption fc -- works with this line instead --INNER JOIN dimmeter dm ON dm.meter_key = fc.meter_key AND dm.meter_point = @meter_point ON fc.meter_key = dm.meter_key AND fc.date_key = dt.date_key AND fc.settlement_key = tmp.settlement_key AND fc.read_type_key = 1 GROUP BY E.N, tmp.settlement_key, dt.date_key HAVING SUM(ISNULL(fc.date_key,0))=0;

I think you need to add 1 to the difference in days between startdate and enddate:

DECLARE @begindate datetime

DECLARE @enddate datetime

DECLARE @meter_point nvarchar(50)

SET @begindate = '20151125'

SET @enddate = '20151127'

SET @meter_point = 'xyz';

WITH

E1 (N) AS (

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

), -- 1*10^1 or 10 rows

E2 AS (SELECT n = 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows

E4 AS (SELECT n = 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows

_Tally AS (

SELECT TOP(DATEDIFF(DAY,@begindate,@enddate) + 1)

n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1

FROM e4

)

SELECT GeneratedDate = DATEADD(DAY,n, @BeginDate)

FROM _Tally E

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 12

Luis, how are you getting your code to format nicely? Until a couple of days ago, I was copying from SSMS to a text file then into here - but that doesn't appear to work now.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 13

Combine the tally table with the dimtime table to clean up the main body of the query

DROP TABLE #dimtime -- this is just sample data to mimic the actual dimtime table

SELECT * INTO #dimtime FROM (VALUES

(1,CAST('20151124' AS DATETIME)),

(2,CAST('20151125' AS DATETIME)),

(3,CAST('20151126' AS DATETIME)),

(4,CAST('20151127' AS DATETIME)),

(5,CAST('20151128' AS DATETIME))) d (date_key, date_value)

DECLARE @begindate datetime

DECLARE @enddate datetime

DECLARE @meter_point nvarchar(50)

SET @begindate = '20151125'

SET @enddate = '20151127'

SET @meter_point = 'xyz';

WITH

E1(N) AS (

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

), -- 1*10^1 or 10 rows

E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows

E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows

_T (N) AS (SELECT 0 UNION ALL

SELECT TOP(DATEDIFF(DAY, @BeginDate, @EndDate))

ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

FROM e4),

DateRange AS (

SELECT

dt.date_key,

ActualDate = DATEADD(DAY, _T.N, @BeginDate) -- just for show - not required by query

FROM _T

INNER JOIN #dimtime dt

ON dt.date_value = DATEADD(DAY, _T.N, @BeginDate)

)

SELECT *

FROM DateRange

-- wtf? The datatypes differ in the UNION but it still works. Sweeeet, Luis!

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Luis Cazares SSC Guru Points: 183706 More actions · Answer 14

ChrisM@Work - Wednesday, March 1, 2017 7:48 AM
Luis, how are you getting your code to format nicely? Until a couple of days ago, I was copying from SSMS to a text file then into here - but that doesn't appear to work now.

It's not even formatted the same way I have it in SSMS. Apparently some spaces are removed. I just copy and paste directly (when I'm in a good mood, I'll format in here using the preview).
I'm using Firefox 45.6.0

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Function works when passing in parameters, does not work with CROSS APPLY

Cookies on SQLServerCentral