June 9, 2016 at 3:45 pm
Hi all,
Need some expert help!. Have a pivot that I use to transpose rows to columns (dynamic pivot). Works fine except performance with large datasets sucks.
The table (tbl_forms_data below) has over 1.3M rows that I need to pivot on which is so slow it's timing out all the time. I thought using row_number would only fetch a limited dataset but it doesn't make any difference to the performance.
I think the inner select/pivot is trawling through the entire 1M rows and hence the poor performance. The query is below (obviously trimmed for clarity).
thanks
-----------------
select * from(select *
, ROW_NUMBER() OVER (ORDER BY formdate desc) rowid
from
(select
tf.formid,
formdate,
answer,
'field-' + cast(tff.filtertype as varchar(10)) + '-' + cast(tff.id as varchar(10)) as columnname
from tbl_forms tf
inner join tbl_forms_data tfd on tf.formid = tfd.formid
inner join tbl_forms_fields tff on ( tff.id = tfd.qid and tff.templateid = tf.templateid )
where tf.templateid = 1234
) as x
pivot ( max(answer) for columnname in ([field-0-1],[APPTIMI-0-2],
[field-0-3],[field-0-4],[field-0-5],[field-0-6],
[field-0-18])) p
) as forms
WHERE rowid BETWEEN 1 and 50
June 10, 2016 at 1:54 am
Yes, you will be seeing that as the WHERE clause relates to the output of the whole PIVOT, rather than the data which is being pivoted. Put that WHERE clause into the source data for the PIVOT instead.
Thomas Rushton
blog: https://thelonedba.wordpress.com
June 10, 2016 at 2:14 am
As Thomas said. Something like this:
select *
from (
SELECT
tf.formid,
??.formdate,
??.answer,
'field-' + cast(tff.filtertype as varchar(10)) + '-' + cast(tff.id as varchar(10)) as columnname
FROM (
SELECT *,
rowid = ROW_NUMBER() OVER (ORDER BY formdate desc)
FROM tbl_forms
) tf
inner join tbl_forms_data tfd on tf.formid = tfd.formid
inner join tbl_forms_fields tff on ( tff.id = tfd.qid and tff.templateid = tf.templateid )
where tf.templateid = 1234
AND tf.rowid BETWEEN 1 and 50
) as x
pivot (
max(answer) for columnname in ([field-0-1],[APPTIMI-0-2],[field-0-3],[field-0-4],[field-0-5],[field-0-6],[field-0-18])
) p
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 10, 2016 at 2:28 am
Chris/Thomas
Thanks for the replies.
Chris - Understand your solution however the row id in it comes up in the 30K range (even if I select from 0 to 50. It's obviously taking the row_number from the source rows before the pivot and I can be pivoting 3 columns or 30 into each row.
results from the query:
Row Id Formid formdate etc....
32433 4544 xxx
32478 4665 xxx
33104 5210 xxx
Thanks
June 10, 2016 at 2:40 am
kevin 26531 (6/10/2016)
Chris/ThomasThanks for the replies.
Chris - Understand your solution however the row id in it comes up in the 30K range (even if I select from 0 to 50. It's obviously taking the row_number from the source rows before the pivot and I can be pivoting 3 columns or 30 into each row.
results from the query:
Row Id Formid formdate etc....
32433 4544 xxx
32478 4665 xxx
33104 5210 xxx
Thanks
Could be the placement of the filter. Using TOP() for convenience:
FROM (
SELECT TOP(50) *
FROM tbl_forms
WHERE templateid = 1234
ORDER BY formdate desc
) tf
If it isn't clear at this point, then I'd recommend posting up some readily-consumable sample data.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 10, 2016 at 5:28 am
Ok been playing with this a lot this morning and I think I've fixed the issue with the row_number. Added a where templateid = xxx for the tbl_forms. Now at least it's producing the correct row number.
Performance is way way better than before in that's not trying to pull in the complete dataset. Only issue I have now that I can no longer order by one of the dynamic fields as the ROW_NUMBER() OVER (ORDER BY [field-0-1] desc) spits out an error with Field-0-1 as an invalid column. I'm guessing it's because the pivot is not done at that stage and therefore the pivot is not available at this point.
PS. Pivots really do hurt my brain.
select *
from (
SELECT
tf.formid,
tf.formdate,
tfd.answer,
'field-' + cast(tff.filtertype as varchar(10)) + '-' + cast(tff.id as varchar(10)) as columnname
FROM (
SELECT *,
rowid = ROW_NUMBER() OVER (ORDER BY formdate desc)
FROM tbl_forms
-----> where templateid = 1234 <------- This fixes the row number issue
) tf
inner join tbl_forms_data tfd on tf.formid = tfd.formid
inner join tbl_forms_fields tff on ( tff.id = tfd.qid and tff.templateid = tf.templateid )
where tf.templateid = 1234
AND tf.rowid BETWEEN 1 and 50
) as x
pivot (
max(answer) for columnname in ([field-0-1],[field-0-2],[field-0-3],[field-0-4],[field-0-5],[field-0-6],[field-0-18])
) p
June 10, 2016 at 5:58 am
If you don't like the PIVOT operator, you could use a cross tab query instead:
SELECT TOP(50)
tf.formid,
tf.formdate,
x.*
FROM tbl_forms tf
CROSS APPLY (
SELECT
[field-0-1] = MAX(CASE WHEN tff.filtertype = 0 AND tff.id = 1 THEN tfd.answer ELSE 0 END),
[field-0-2] = MAX(CASE WHEN tff.filtertype = 0 AND tff.id = 2 THEN tfd.answer ELSE 0 END),
[field-0-3] = MAX(CASE WHEN tff.filtertype = 0 AND tff.id = 3 THEN tfd.answer ELSE 0 END),
[field-0-4] = MAX(CASE WHEN tff.filtertype = 0 AND tff.id = 4 THEN tfd.answer ELSE 0 END),
[field-0-5] = MAX(CASE WHEN tff.filtertype = 0 AND tff.id = 5 THEN tfd.answer ELSE 0 END),
[field-0-6] = MAX(CASE WHEN tff.filtertype = 0 AND tff.id = 6 THEN tfd.answer ELSE 0 END),
[field-0-18] = MAX(CASE WHEN tff.filtertype = 0 AND tff.id = 18 THEN tfd.answer ELSE 0 END)
FROM tbl_forms_data tfd
INNER JOIN tbl_forms_fields tff
ON tff.id = tfd.qid
AND tff.templateid = tf.templateid -- outer reference
WHERE tfd.formid = tf.formid -- outer reference
) x
WHERE tf.templateid = 1234
ORDER BY tf.formdate DESC
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 10, 2016 at 6:18 am
Thanks Chris but it's a dynamic query. There could be 10 columns (after the pivot) or 100 so I can't hardcode fields into the script.
Had thought about just creating a table with the columns, tell the customer the max is 100 fields/columns and if they only use 10 then waste the other 90.
thanks
June 10, 2016 at 8:43 am
kevin 26531 (6/10/2016)
Thanks Chris but it's a dynamic query. There could be 10 columns (after the pivot) or 100 so I can't hardcode fields into the script.Had thought about just creating a table with the columns, tell the customer the max is 100 fields/columns and if they only use 10 then waste the other 90.
thanks
You can easily convert Chris' query into dynamic code.
DECLARE @SQL nvarchar(MAX)
SELECT @SQL =
N'SELECT TOP(50)
tf.formid,
tf.formdate,
x.*
FROM tbl_forms tf
CROSS APPLY (
SELECT' + NCHAR(10) +
STUFF(( SELECT CHAR(9) + ',[field-' + CAST( filtertype AS varchar(10))+ '-' + CAST( id AS varchar(10))
+ '] = MAX(CASE WHEN tff.filtertype = ' + CAST( filtertype AS varchar(10)) + ' AND tff.id = ' + CAST( id AS varchar(10))
+ ' THEN tfd.answer ELSE 0 END) ' + CHAR(10)
FROM tbl_forms_fields
FOR XML PATH(''), TYPE).value('./text()[1]', 'nvarchar(max)'), 2, 1, ' ') +
N'FROM tbl_forms_data tfd
INNER JOIN tbl_forms_fields tff
ON tff.id = tfd.qid
AND tff.templateid = tf.templateid -- outer reference
WHERE tfd.formid = tf.formid -- outer reference
) x
WHERE tf.templateid = 1234
ORDER BY tf.formdate DESC; '
June 10, 2016 at 9:28 am
Nice one, Luis
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 11, 2016 at 8:39 am
For some reason I can't get this to pivot the results - It runs but just lists the data in a column. I'll play with it and see where it takes me.
Thanks everyone for the help. Learnt a lot.
Viewing 11 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy