SQL Subquery Help
caseyo (Forum Newbie)
Posted Wednesday, July 14, 2010 3:48 PM
I'm new to SQL Server 2008. I got thrust into this because someone left, so I don't know a whole lot; please bear with me. I didn't build any of this; I'm just trying to figure out some workarounds for issues we are experiencing.

I have a table, tblResults, that contains several hundred thousand rows of results. It has about 40 columns, each holding a numeric value, usually 1 to 5. The columns are named after the labels our users need to see; an example is Recommend.

What I'm trying to do is precalculate some averages, probably running once per night, so the data comes back faster on demand.

My overall averages table has a system id, another id, and a label (the same as the data column name in tblResults). What I'm stuck on is selecting the column's values from tblResults so I can do some math on them to get the averages. See below.


Select OA.system_id,
       OA.id,
       OA.label,
       (select Convert(Decimal(9,4), Sum(OA.label) / COUNT(*))
        from tblResults
        where date_range between '2010-01-01' and '2011-01-01') as ytd_avg
from overall_avgs OA

I get this error: "Operand data type varchar is invalid for sum operator." I know it's trying to sum the string 'Recommend' instead of the values in that column. I'm trying to do it this way to save some time, since it has to go through this process quite a bit.
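I think this one-liner shows the problem; SUM gets the text of the label, not the column it names:

-- this reproduces the same error: SUM is applied to the varchar literal,
-- not to the tblResults column called Recommend
SELECT SUM('Recommend');
-- Msg: Operand data type varchar is invalid for sum operator.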

Any help would be appreciated. I'm not sure of the terminology enough to know exactly what this is called.

Thank You
Post #952746
LutzM (SSCertifiable)
Posted Wednesday, July 14, 2010 4:09 PM
Hi caseyo,

please provide the table definitions (CREATE TABLE ...) for the tables in question (overall_avgs and tblResults). One way to do it is by right-clicking the table in Management Studio -> Script Table as -> CREATE To -> New Query Editor Window.
It would also help if you could provide some sample data so we can see what the actual data look like. Based on that sample data we'd need to see your expected result so we can test our solutions.
Please note that I'm offering an "I got thrust in here because someone left" discount.
Usually we'd expect ready-to-use sample data, as described in the first link in my signature. But I think this situation calls for an exception....




Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #952758
caseyo (Forum Newbie)
Posted Thursday, July 15, 2010 8:20 AM
Thanks for the advice. Here are the CREATE statements. I took columns out of tblResults, but it's the same idea for 30 or more data elements.

CREATE TABLE [dbo].[overall_avgs](
[system_id] [tinyint] NULL,
[id] [tinyint] NULL,
[label] [varchar](50) NULL,
[ytd_avg] [decimal](4, 2) NULL,
[qtd_avg] [decimal](4, 2) NULL,
[mtd_avg] [decimal](4, 2) NULL,
[all_avg] [decimal](4, 2) NULL
) ON [PRIMARY]

CREATE TABLE [dbo].[tblResults](
[id] [int] NULL,
[Visit Date] [datetime] NULL,
[Parking] [int] NULL,
[Invoices] [int] NULL,
[Recommend] [int] NULL
) ON [PRIMARY]

tblResults data would look like this. The int values are on a 1-to-5 scale and can sometimes be NULL.
10,'2010-01-15 12:30', 3,1,5
20,'2010-07-15 12:35', 2,4,4
30,'2010-06-15 12:40', 1,2,3


The desired overall_avgs data would look like this. I just made up the numbers below. We are going to be collecting more calculations than this, but this should give you the idea.
1,1,'Parking',4.52,4.64,3.50,3.78
1,2,'Invoices',4.75,3.64,2.50,4.78
1,3,'Recommend',4.52,4.64,3.50,3.78


Thanks so much for your help. If that wasn't what you needed, please let me know.

Caseyo
Post #953164
LutzM (SSCertifiable)
Posted Thursday, July 15, 2010 1:42 PM
Ok, here's what I came up with:
DECLARE @overall_avgs TABLE
(
[system_id] [TINYINT] NULL,
[id] [TINYINT] NULL,
[label] [VARCHAR](50) NULL,
[ytd_avg] [DECIMAL](4, 2) NULL,
[qtd_avg] [DECIMAL](4, 2) NULL,
[mtd_avg] [DECIMAL](4, 2) NULL,
[all_avg] [DECIMAL](4, 2) NULL
)

DECLARE @tblResults TABLE
(
[id] [INT] NULL,
[Visit Date] [DATETIME] NULL,
[Parking] [INT] NULL,
[Invoices] [INT] NULL,
[Recommend] [INT] NULL
)

INSERT INTO @tblResults
SELECT 10,'2010-01-15 12:30', 3,1,5 UNION ALL
SELECT 20,'2010-07-15 12:35', 2,4,4 UNION ALL
SELECT 30,'2010-06-15 12:40', 1,2,3


;
-- step 1: unpivot the table to get more normalized data
WITH cte AS
(
    SELECT [id], [Visit Date], Amount, Item
    FROM (SELECT * FROM @tblResults) p
    UNPIVOT (Amount FOR Item IN ([Parking], [Invoices], [Recommend])) AS unpvt
)
SELECT
1 AS system_id,
-- assign hard coded id values to each column since there is no other business logic provided so far
CASE Item
WHEN 'Parking' THEN 1
WHEN 'Invoices' THEN 2
WHEN 'Recommend' THEN 3
ELSE 4 END AS [id],
item,
/* Basic concept:
Dividend: if the row has a visit date within the period to be covered,
use the value of each column unpivoted previously (e.g. [Parking]); otherwise assign zero. Sum those values.
Divisor: the COUNT(*) is replaced by another conditional sum: if the row matches the condition, assign 1,
otherwise assign zero, and sum those values. It might happen that there are no rows to sum, which would lead to a division by zero.
To avoid that, the NULLIF function is used to get a division by NULL, which will return NULL.
To display zero instead of NULL, the ISNULL function is used.
In order to get the result of the division in decimal format, I simply added 0.00 inside the SUM() function.
This forces an implicit conversion and is a lot easier to write than another CAST() inside the SUM() function.
It's considered to be faster, too.
Finally, cast the result to DECIMAL(4,2) as required.
Apply this concept to each relevant period: ytd, qtd, mtd.
*/
CAST(
ISNULL(
    SUM(CASE WHEN [Visit Date] >= DATEADD(YEAR, DATEDIFF(YEAR, 0, GETDATE()), 0) THEN Amount + 0.00 ELSE 0.00 END)
    /
    NULLIF(SUM(CASE WHEN [Visit Date] >= DATEADD(YEAR, DATEDIFF(YEAR, 0, GETDATE()), 0) THEN 1 ELSE 0 END), 0)
    , 0) AS DECIMAL(4,2)) AS [ytd_avg],
CAST(
ISNULL(
    SUM(CASE WHEN [Visit Date] >= DATEADD(QUARTER, DATEDIFF(QUARTER, 0, GETDATE()), 0) THEN Amount + 0.00 ELSE 0.00 END)
    /
    NULLIF(SUM(CASE WHEN [Visit Date] >= DATEADD(QUARTER, DATEDIFF(QUARTER, 0, GETDATE()), 0) THEN 1 ELSE 0 END), 0)
    , 0) AS DECIMAL(4,2)) AS [qtd_avg],
CAST(
ISNULL(
    SUM(CASE WHEN [Visit Date] >= DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0) THEN Amount + 0.00 ELSE 0.00 END)
    /
    NULLIF(SUM(CASE WHEN [Visit Date] >= DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0) THEN 1 ELSE 0 END), 0)
    , 0) AS DECIMAL(4,2)) AS [mtd_avg],
CAST( AVG(Amount+0.00 ) AS DECIMAL(4,2)) AS [all_avg]
FROM cte
GROUP BY item
ORDER BY system_id,id
/* result set
system_id id item ytd_avg qtd_avg mtd_avg all_avg
1 1 Parking 2.00 2.00 2.00 2.00
1 2 Invoices 2.33 4.00 4.00 2.33
1 3 Recommend 4.00 4.00 4.00 4.00
*/

It would also be possible to change this statement into dynamic SQL to make it a lot shorter (especially when we talk about the 40 columns of the real tblResults table).
But I think this code is already beyond the basic SQL 101 stuff...
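Just to sketch the idea of the dynamic version (untested, and excluding only [id] and [Visit Date] from the unpivot is an assumption based on your DDL):

-- rough sketch: build the UNPIVOT column list from the table metadata
-- instead of hard-coding 40 column names
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(c.name)
    FROM sys.columns c
    WHERE c.object_id = OBJECT_ID('dbo.tblResults')
      AND c.name NOT IN ('id', 'Visit Date') -- keep only the rating columns
    ORDER BY c.column_id
    FOR XML PATH('')), 1, 1, '');

SET @sql = N'SELECT [id], [Visit Date], Amount, Item
FROM dbo.tblResults
UNPIVOT (Amount FOR Item IN (' + @cols + ')) AS unpvt;';

EXEC sp_executesql @sql; -- wrap the aggregation logic around this the same way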
There's one issue I worry about:
Due to the required result for [all_avg], this query will end up in a table scan or clustered index scan. It would be interesting to know how many rows from previous years are stored compared to the current year. It might be faster to exclude the [all_avg] calculation from this query, use a WHERE clause to limit the rows, and add the [all_avg] values using a separate query, as sketched below. But that depends on the data distribution and should be tested for performance.
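Something like this is what I have in mind (a sketch only, using just the three sample columns; it needs testing against your real data):

-- query 1: ytd only (qtd/mtd work the same way), restricted to the current
-- year so an index on [Visit Date] can limit the rows instead of scanning all
WITH cte AS
(
    SELECT [Visit Date], Amount, Item
    FROM dbo.tblResults
    UNPIVOT (Amount FOR Item IN ([Parking], [Invoices], [Recommend])) AS unpvt
)
SELECT Item, CAST(AVG(Amount + 0.00) AS DECIMAL(4,2)) AS [ytd_avg]
FROM cte
WHERE [Visit Date] >= DATEADD(YEAR, DATEDIFF(YEAR, 0, GETDATE()), 0)
GROUP BY Item;

-- query 2: [all_avg] over the full history, run as a separate query
WITH cte AS
(
    SELECT Amount, Item
    FROM dbo.tblResults
    UNPIVOT (Amount FOR Item IN ([Parking], [Invoices], [Recommend])) AS unpvt
)
SELECT Item, CAST(AVG(Amount + 0.00) AS DECIMAL(4,2)) AS [all_avg]
FROM cte
GROUP BY Item;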




Post #953452
Joe Celko (SSCommitted)
Posted Thursday, July 15, 2010 2:17 PM
From your narrative, this is not a big table. Here is a guess at the DDL. The design I think you have suffers from attribute splitting: if those 40 things are really attributes of one entity, then they go in the row that models that entity and do not get split all over the table with a metadata column (label) to identify them.

CREATE TABLE SomethingResults
(some_key CHAR(10) NOT NULL,
test_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (some_key, test_date),
r01 SMALLINT NOT NULL
CHECK (r01 BETWEEN 1 AND 5),
r02 SMALLINT NOT NULL
CHECK (r02 BETWEEN 1 AND 5),
..
r40 SMALLINT NOT NULL
CHECK (r40 BETWEEN 1 AND 5));

Add appropriate NULL-ability, defaults and checks to each column. Wide tables like this are common with research projects that do lots of tests. I saw one with 130+ columns for an allergy study.

The query can be straightforward. Why are you formatting the data in the back end? All those CAST() calls will prevent any index usage. Since you want year-to-date data, why not put it in a VIEW that is always current?

CREATE VIEW YTD_Averages
AS
SELECT some_key, CURRENT_TIMESTAMP AS ytd_avg_date,
AVG(r01) AS r01_avg, AVG(r02) AS r02_avg,
.., AVG(r40) AS r40_avg
FROM SomethingResults
WHERE test_date BETWEEN CAST ('2010-01-01' AS DATE)
AND CURRENT_TIMESTAMP
GROUP BY some_key;

I would also work on the data element names. There is no such thing as a magical universal "id" -- it has to identify something in particular.



Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #953477
LutzM (SSCertifiable)
Posted Thursday, July 15, 2010 2:46 PM
@ Joe Celko:
You've demonstrated how to design the view for YTD_Averages.
What is your recommendation for getting MTD, QTD and all_avg?
As far as I can see, one option would be to UNION (ALL) four separate queries using different WHERE clauses (resulting in four separate scan/seek operations), or to use the approach I posted, which uses one table scan and several aggregated CASE expressions.
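For illustration, the UNION version I'm thinking of would look roughly like this (a sketch only, using your SomethingResults table and just r01):

-- one SELECT per period, each with its own WHERE clause
SELECT 'ytd' AS period, some_key, AVG(r01) AS r01_avg
FROM SomethingResults
WHERE test_date >= DATEADD(YEAR, DATEDIFF(YEAR, 0, GETDATE()), 0)
GROUP BY some_key
UNION ALL
SELECT 'qtd', some_key, AVG(r01)
FROM SomethingResults
WHERE test_date >= DATEADD(QUARTER, DATEDIFF(QUARTER, 0, GETDATE()), 0)
GROUP BY some_key
UNION ALL
SELECT 'mtd', some_key, AVG(r01)
FROM SomethingResults
WHERE test_date >= DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)
GROUP BY some_key
UNION ALL
SELECT 'all', some_key, AVG(r01)
FROM SomethingResults
GROUP BY some_key;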

It's nothing I would start arguing about, but something to learn from.




Post #953491
caseyo (Forum Newbie)
Posted Thursday, July 15, 2010 3:18 PM
@LutzM Thank you so much. I think this will get me where I need to go.

This was exactly what I was needing. I modified it to fit the full table and put some more validation on it (not your fault; I forgot to include it). The query ran in 7 secs on 30,000 rows worth of data. I did take out the year portion. Your advice makes sense; we weren't sold on needing that anyway, and we can always add it in later.

This is amazing. Exactly what I was trying to do but unsure how to get there. I've got a lot more to do, but I know this will help greatly. Thanks again.





Post #953507
caseyo (Forum Newbie)
Posted Thursday, July 15, 2010 3:24 PM
@ Joe Celko:

Thank you for your post as well. It's not a huge table, but it can take a while to process, especially when calls from the web are involved. I'm more of a .NET programmer and got forced into SQL duty. We are actually going to balance showing real-time data vs. showing precompiled data. I'll definitely use something very similar to what you suggest for our real-time data.

Thanks for your reply as well. This community is top notch.
Post #953510
LutzM (SSCertifiable)
Posted Thursday, July 15, 2010 3:40 PM
caseyo (7/15/2010)
@LutzM Thank you so much. I think this will get me where I need to go.

This was exactly what I was needing. I modified it to fit the full table and put some more validation on it (not your fault; I forgot to include it). The query ran in 7 secs on 30,000 rows worth of data. I did take out the year portion. Your advice makes sense; we weren't sold on needing that anyway, and we can always add it in later.

This is amazing. Exactly what I was trying to do but unsure how to get there. I've got a lot more to do, but I know this will help greatly. Thanks again.


7 secs for 30K rows doesn't sound like acceptable performance. Would you mind sharing your current query and the related execution plan (actual, not estimated)? It seems like there is some more room for improvement.

Did you consider normalizing and properly indexing your tblResults table? This would definitely help to improve performance.
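To give you an idea of what I mean by normalizing (a sketch only; the table, column and index names are just suggestions):

-- one row per (result, item) instead of 40 rating columns
CREATE TABLE dbo.tblResultsNormalized
(
    result_id  INT         NOT NULL, -- the id from tblResults
    visit_date DATETIME    NOT NULL,
    item       VARCHAR(50) NOT NULL, -- e.g. 'Parking', 'Invoices', 'Recommend'
    amount     TINYINT     NULL      -- the 1-to-5 rating, NULL if not answered
);

-- supports the date range filters used by the ytd/qtd/mtd calculations
CREATE NONCLUSTERED INDEX IX_tblResultsNormalized_visit_date
    ON dbo.tblResultsNormalized (visit_date)
    INCLUDE (item, amount);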

As a side note: you shouldn't be overly amazed that you got a query you could use almost instantly. You did the major part by providing the table structure, sample data, and expected results. (Hint: have a look at how I posted the sample data, so others might benefit from it and maybe come up with a better solution. [hides away shuddering, knowing Joe Celko might be around])
For someone not frequently using that forum: Great job!!
For someone who's kinda new to SQL Server: OUTSTANDING JOB!!!!! VERY WELL DONE!!!

If I've been able to show you that you're not alone, and you left your office with fewer headaches than expected, I'm excited! Because that's what it's all about: volunteering on a great forum like this one and helping folks like you!




Post #953517
caseyo (Forum Newbie)
Posted Thursday, July 15, 2010 4:06 PM
@LutzM

I hadn't considered normalizing and properly indexing the tblResults table. I'll have to look up how to do that tomorrow. I'll also submit the full query tomorrow.

Thanks so much for your help. Until recently, my experience with SQL Server has been writing simple insert or update procedures to work with forms from .NET, nothing of this magnitude.

I'm definitely excited. While 7 seconds sounds like a long time for this, we were looking at hours for this stuff to run before.



Post #953526