March 1, 2017 at 2:54 am
Hi,
I have a function that looks to work fine when I pass in parameters, but when I pass in a single rowed table with those same parameters, it does not complete. Can anyone advise why this might be the case? Looking into it, it is something to do with how I pass in my begin date, as the function works fine if I pass in all parameters, other than hardcoding this single parameter.
I can also get it to work by rearranging the joins slightly, but I'm trying to find an efficient execution plan for this, and the one with my rearranged joins is grabbing a lot of rows initially, which I am trying to avoid through the rewrite.
This is the function:
CREATE FUNCTION [dbo].[itvf_GetMissingReads_nulls] (@BeginDate DATETIME, @EndDate DATETIME, @meter_point NVARCHAR(50) )
RETURNS TABLE
AS
RETURN
/*
DECLARE @begindate datetime
DECLARE @enddate datetime
DECLARE @meter_point nvarchar(50)
SET @begindate = '20151125'
SET @enddate = '20151127'
SET @meter_point = 'xyz';*/
WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows
E(N) AS(SELECT ROW_NUMBER() OVER (ORDER BY n) FROM e4)
SELECT DATEADD(DAY, N-1, @BeginDate) [DateTime], tmp.settlement_key, dt.date_key, @meter_point AS meter_point--, fc.read_value,
FROM E
-- matrix of dates
CROSS JOIN ( SELECT ds.settlement_key, ds.settlement_type_key, ds.settlement_period, ds.settlement_name , dm.meter_point
FROM dimmeter dm
INNER JOIN dbo.DimSettlement ds ON ds.settlement_type_key = dm.settlement_type_key
WHERE meter_point = @meter_point
AND dm.end_date IS NULL AND dm.fact_loading_enabled = 1 ) AS tmp
INNER JOIN dimtime dt ON dt.date_value = DATEADD(DAY, N-1, @BeginDate)
-- I changed the join order by placing join here
INNER JOIN dimmeter dm ON tmp.meter_point = dm.meter_point
LEFT outer JOIN dbo.FactConsumption fc
-- works with this line instead
--INNER JOIN dimmeter dm ON dm.meter_key = fc.meter_key AND dm.meter_point = @meter_point
ON fc.meter_key = dm.meter_key
AND fc.date_key = dt.date_key
AND fc.settlement_key = tmp.settlement_key
AND fc.read_type_key = 1
WHERE /*DATEADD(DAY, N-1, @BeginDate) >= @begindate
AND DATEADD(DAY, N-1, @BeginDate) <= @enddate */
DATEDIFF(DAY, @BeginDate, @EndDate) >= N-1
GROUP BY E.N, tmp.settlement_key, dt.date_key
HAVING SUM(ISNULL(fc.date_key,0))=0
;
GO
This is how I am running the function, and the results I am seeing:
--works
SELECT * FROM [dbo].[itvf_GetMissingReads_nulls]('20151125','20151127', '00000000000035050475')
ORDER BY 1,2
--works
DECLARE @meter_table TABLE (
meter_point nvarchar(50)
, min_date DATETIME
, max_date DATETIME
) ;
INSERT INTO @meter_table
( meter_point, min_date, max_date )
VALUES ( N'00000000000035050475', -- meter_point - nvarchar(50)
'20151125', -- min_date - datetime
'20151127' -- max_date - datetime
)
SELECT mt.min_date, mt.max_date, tbl.*
FROM @meter_table mt
CROSS APPLY [dbo].[itvf_GetMissingReads_nulls]('20151125', mt.max_date, mt.meter_point) tbl
-- doesn't work!
DECLARE @meter_table TABLE (
meter_point nvarchar(50)
, min_date DATETIME
, max_date DATETIME
) ;
INSERT INTO @meter_table
( meter_point, min_date, max_date )
VALUES ( N'00000000000035050475', -- meter_point - nvarchar(50)
'20151125', -- min_date - datetime
'20151127' -- max_date - datetime
)
SELECT mt.min_date, mt.max_date, tbl.*
FROM @meter_table mt
CROSS APPLY [dbo].[itvf_GetMissingReads_nulls](mt.min_date, mt.max_date, mt.meter_point) tbl
Any assistance would be appreciated - I'm a bit stumped at this point, although I will do some more reading up on how CROSS APPLY works.
March 1, 2017 at 3:38 am
When you say 'doesn't work', what do you mean?
No results?
Wrong results?
Error message?
Something else
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 1, 2017 at 3:44 am
It just keeps running forever, whereas it normally would finish in less than a second. Expected results for me would be returning a dataset of maybe around 300 rows, which denote periods where I am missing data.
March 1, 2017 at 3:59 am
Hmm I just realised even though it never completes, I can still look at the estimated plan. It definitely looks significantly different when I pass in the value in the table, rather than hardcode it - it is basically returning every row in my fact table multiple times, as the estimate row count is something like 5 billion!
I'll look into whether I can add some query hints to guide it better, I see that in one query plan it is a hash match inner join, in the other, it is a nested loop inner join.
March 1, 2017 at 4:04 am
Possibly a parameter sniffing problem. Why are you staging the parameter values in @meter_table instead of passing them directly to the function?
John
March 1, 2017 at 4:08 am
Ok, I fixed the issue by joining onto my fact table specifically using a loop join, rather than the generic join. It now runs like greased lightning
I wanted to run the function for a table. I was making a one row table as the simple case (it wasn't even working for this, so it definitely wasn't working for the dataset that I wanted to run this function through)
March 1, 2017 at 5:31 am
kyagi.jo - Wednesday, March 1, 2017 3:44 AMIt just keeps running forever, whereas it normally would finish in less than a second. Expected results for me would be returning a dataset of maybe around 300 rows, which denote periods where I am missing data.
Of course it does. Even with a range of three days inclusive, your code generates the whole set of 10,000 dates:
DECLARE @begindate datetime
DECLARE @enddate datetime
SET @begindate = '20151125'
SET @enddate = '20151127'
WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows
E(N) AS(SELECT ROW_NUMBER() OVER (ORDER BY n) FROM e4)
SELECT
DATEADD(DAY, N-1, @BeginDate) [DateTime] --, tmp.settlement_key, dt.date_key, @meter_point AS meter_point--, fc.read_value,
FROM E -- matrix of dates
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 1, 2017 at 7:23 am
I think that's how tally tables work though? I haven't been able to track any slowness down to the use of the tally table, at least as far as I can see in my query plan. If anything, using the tally tables has sped up the query from its previous incarnation.
March 1, 2017 at 7:29 am
ChrisM@Work - Wednesday, March 1, 2017 5:31 AMkyagi.jo - Wednesday, March 1, 2017 3:44 AMIt just keeps running forever, whereas it normally would finish in less than a second. Expected results for me would be returning a dataset of maybe around 300 rows, which denote periods where I am missing data.Of course it does. Even with a range of three days inclusive, your code generates the whole set of 10,000 dates:
DECLARE @begindate datetime
DECLARE @enddate datetime
SET @begindate = '20151125'
SET @enddate = '20151127'
WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows
E(N) AS(SELECT ROW_NUMBER() OVER (ORDER BY n) FROM e4)
SELECT
DATEADD(DAY, N-1, @BeginDate) [DateTime] --, tmp.settlement_key, dt.date_key, @meter_point AS meter_point--, fc.read_value,
FROM E -- matrix of dates
I was also going to point that out. There's a huge accidental cross join formed within the function. For a separate run, it may look fine but people tend to forget that iTVFs don't actually run separately and should never be treated as if they were a separate, materialized result set. The code of an iTVF is incorporated into the execution plan of the calling code just like it would for a view.
What needs to be done inside the function is that the value of e.N needs to be limited very early and in such a fashion that the optimizer doesn't bypass the early limit. That usually means that a TOP (someformula) should be included to limit the value of e.N.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 1, 2017 at 7:32 am
This was the intention of the line:DATEDIFF(DAY, @BeginDate, @EndDate) >= N-1
Was this not done soon enough, as it is in the WHERE clause?
March 1, 2017 at 7:36 am
kyagi.jo - Wednesday, March 1, 2017 7:23 AMI think that's how tally tables work though? I haven't been able to track any slowness down to the use of the tally table, at least as far as I can see in my query plan. If anything, using the tally tables has sped up the query from its previous incarnation.
No, you should limit the rows to avoid extra work. Here's an example on how to do it.
CREATE FUNCTION [dbo].[itvf_GetMissingReads_nulls] (@BeginDate DATETIME, @EndDate DATETIME, @meter_point NVARCHAR(50) )
RETURNS TABLE
AS
RETURN
/*
DECLARE @begindate datetime
DECLARE @enddate datetime
DECLARE @meter_point nvarchar(50)
SET @begindate = '20151125'
SET @enddate = '20151127'
SET @meter_point = 'xyz';*/
WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows
E (N) AS (SELECT 0 UNION ALL
SELECT TOP(DATEDIFF(DAY, @BeginDate, @EndDate))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM e4)
SELECT DATEADD(DAY, N, @BeginDate) [DateTime],
tmp.settlement_key,
dt.date_key,
@meter_point AS meter_point
--, fc.read_value,
FROM E
-- matrix of dates
CROSS JOIN ( SELECT ds.settlement_key,
ds.settlement_type_key,
ds.settlement_period,
ds.settlement_name ,
dm.meter_point
FROM dimmeter dm
INNER JOIN dbo.DimSettlement ds ON ds.settlement_type_key = dm.settlement_type_key
WHERE meter_point = @meter_point
AND dm.end_date IS NULL
AND dm.fact_loading_enabled = 1 ) AS tmp
INNER JOIN dimtime dt ON dt.date_value = DATEADD(DAY, N, @BeginDate)
-- I changed the join order by placing join here
INNER JOIN dimmeter dm ON tmp.meter_point = dm.meter_point
LEFT outer JOIN dbo.FactConsumption fc
-- works with this line instead
--INNER JOIN dimmeter dm ON dm.meter_key = fc.meter_key AND dm.meter_point = @meter_point
ON fc.meter_key = dm.meter_key
AND fc.date_key = dt.date_key
AND fc.settlement_key = tmp.settlement_key
AND fc.read_type_key = 1
GROUP BY E.N, tmp.settlement_key, dt.date_key
HAVING SUM(ISNULL(fc.date_key,0))=0;
March 1, 2017 at 7:47 am
Luis Cazares - Wednesday, March 1, 2017 7:36 AMkyagi.jo - Wednesday, March 1, 2017 7:23 AMI think that's how tally tables work though? I haven't been able to track any slowness down to the use of the tally table, at least as far as I can see in my query plan. If anything, using the tally tables has sped up the query from its previous incarnation.No, you should limit the rows to avoid extra work. Here's an example on how to do it.
CREATE FUNCTION [dbo].[itvf_GetMissingReads_nulls] (@BeginDate DATETIME, @EndDate DATETIME, @meter_point NVARCHAR(50) )
RETURNS TABLE
AS
RETURN
/*
DECLARE @begindate datetime
DECLARE @enddate datetime
DECLARE @meter_point nvarchar(50)
SET @begindate = '20151125'
SET @enddate = '20151127'
SET @meter_point = 'xyz';*/
WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows
E (N) AS (SELECT 0 UNION ALL
SELECT TOP(DATEDIFF(DAY, @BeginDate, @EndDate))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM e4)
SELECT DATEADD(DAY, N, @BeginDate) [DateTime],
tmp.settlement_key,
dt.date_key,
@meter_point AS meter_point
--, fc.read_value,
FROM E
-- matrix of dates
CROSS JOIN ( SELECT ds.settlement_key,
ds.settlement_type_key,
ds.settlement_period,
ds.settlement_name ,
dm.meter_point
FROM dimmeter dm
INNER JOIN dbo.DimSettlement ds ON ds.settlement_type_key = dm.settlement_type_key
WHERE meter_point = @meter_point
AND dm.end_date IS NULL
AND dm.fact_loading_enabled = 1 ) AS tmp
INNER JOIN dimtime dt ON dt.date_value = DATEADD(DAY, N, @BeginDate)
-- I changed the join order by placing join here
INNER JOIN dimmeter dm ON tmp.meter_point = dm.meter_point
LEFT outer JOIN dbo.FactConsumption fc
-- works with this line instead
--INNER JOIN dimmeter dm ON dm.meter_key = fc.meter_key AND dm.meter_point = @meter_point
ON fc.meter_key = dm.meter_key
AND fc.date_key = dt.date_key
AND fc.settlement_key = tmp.settlement_key
AND fc.read_type_key = 1
GROUP BY E.N, tmp.settlement_key, dt.date_key
HAVING SUM(ISNULL(fc.date_key,0))=0;
I think you need to add 1 to the difference in days between startdate and enddate:
DECLARE @begindate datetime
DECLARE @enddate datetime
DECLARE @meter_point nvarchar(50)
SET @begindate = '20151125'
SET @enddate = '20151127'
SET @meter_point = 'xyz';
WITH
E1 (N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2 AS (SELECT n = 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4 AS (SELECT n = 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows
_Tally AS (
SELECT TOP(DATEDIFF(DAY,@begindate,@enddate) + 1)
n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1
FROM e4
)
SELECT GeneratedDate = DATEADD(DAY,n, @BeginDate)
FROM _Tally E
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 1, 2017 at 7:48 am
Luis, how are you getting your code to format nicely? Until a couple of days ago, I was copying from SSMS to a text file then into here - but that doesn't appear to work now.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 1, 2017 at 8:10 am
Combine the tally table with the dimtime table to clean up the main body of the query
DROP TABLE #dimtime -- this is just sample data to mimic the actual dimtime table
SELECT * INTO #dimtime FROM (VALUES
(1,CAST('20151124' AS DATETIME)),
(2,CAST('20151125' AS DATETIME)),
(3,CAST('20151126' AS DATETIME)),
(4,CAST('20151127' AS DATETIME)),
(5,CAST('20151128' AS DATETIME))) d (date_key, date_value)
DECLARE @begindate datetime
DECLARE @enddate datetime
DECLARE @meter_point nvarchar(50)
SET @begindate = '20151125'
SET @enddate = '20151127'
SET @meter_point = 'xyz';
WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^2 or 10000 rows
_T (N) AS (SELECT 0 UNION ALL
SELECT TOP(DATEDIFF(DAY, @BeginDate, @EndDate))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM e4),
DateRange AS (
SELECT
dt.date_key,
ActualDate = DATEADD(DAY, _T.N, @BeginDate) -- just for show - not required by query
FROM _T
INNER JOIN #dimtime dt
ON dt.date_value = DATEADD(DAY, _T.N, @BeginDate)
)
SELECT *
FROM DateRange
-- wtf? The datatypes differ in the UNION but it still works. Sweeeet, Luis!
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 1, 2017 at 8:12 am
ChrisM@Work - Wednesday, March 1, 2017 7:48 AMLuis, how are you getting your code to format nicely? Until a couple of days ago, I was copying from SSMS to a text file then into here - but that doesn't appear to work now.
It's not even formatted the same way I have it in SSMS. Apparently some spaces are removed. I just copy and paste directly (when I'm in a good mood, I'll format in here using the preview).
I'm using Firefox 45.6.0
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy