Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12

trying to add third grouping set to return average aggregate Expand / Collapse
Author
Message
Posted Tuesday, February 18, 2014 9:58 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 12:08 AM
Points: 35,366, Visits: 31,901
It's not really a stumper the way I did it. According to the groupings, the correct answer displays. The problem is that's not the answer that you're expecting.

Shifting gears a bit here, I'm thinking that this whole thing is just not quite right. We're preaggregating data for an Excel Pivot? Why not just let Excel do the aggregations during the Pivot?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1542855
Posted Tuesday, February 18, 2014 10:34 PM
Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Yesterday @ 9:00 PM
Points: 680, Visits: 1,598
the reason for preaggregating is because there's too much data to fit into excel and growing. But, in preaggregating the column names need to remain intact, with no intersection of time dimension with fact columns. No intersection means only grouping, so that there are fewer rows to import. It would work, if only I didn't need to divide sum(lineItems) by DISTINCT invoices for AVG metric...

I really wish I could just import all the data and let Excel do the pivoting for me.
I shared Excel spreadsheet.


  Post Attachments 
ForJM.xlsx (2 views, 35.91 KB)
Post #1542862
Posted Sunday, February 23, 2014 1:07 PM
Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Yesterday @ 9:00 PM
Points: 680, Visits: 1,598
Jeff, the query you gave me gives incorrect average, anyway. So the following query shows how I get just the correct Average#ofLineItemsPerDistinctInvoice for any given month. It's what I need in the final query as well.
WITH
cte AS
(
SELECT Invoice
,D_Type
,Status
,Yr = YEAR(RemediationDate)
,Mo = LEFT(DATENAME(mm,RemediationDate),3)
,RemediationDate
FROM dbo.Remediation
)
SELECT
Yr,
Mo,
--D_Type,
--Status,
--Invoice,
Distinct#Invoice = count(distinct Invoice),
#LineItems = COUNT(*),
#Avg#LineItemsPerInvoice = COUNT(*)/count(distinct Invoice)
FROM cte
group by
Mo,
Yr
--D_Type,
--Status,
--Invoice



which gives following correct averages
select 'Yr','Mo','Distinct#Invoice','#LineItems','#AvgLineItemsPerDistinctInvoice' union all
select '2014','Feb','2','6','3' union all
select '2014','Jan','2','12','6'


When I uncomment D_Type, Status, Invoice from above query I get the following result set:

select 'Mo', 'Yr', 'D_Type', 'Status', 'Invoice', '#LineItems', 'Distinct#Invoice' UNION ALL
select 'Feb', '2014', 'Recycle', 'Pass', '503', '1', '1' UNION ALL
select 'Feb', '2014', 'Recycle', 'Pass', '503', '3', '1' UNION ALL
select 'Feb', '2014', 'Remarket', 'Fail', '503', '1', '1' UNION ALL
select 'Feb', '2014', 'Reuse', 'Fail', '503', '1', '1' UNION ALL
select 'Jan', '2014', 'Recycle', 'Fail', '501', '1', '1' UNION ALL
select 'Jan', '2014', 'Recycle', 'Pass', '501', '1', '1' UNION ALL
select 'Jan', '2014', 'Recycle', 'Pass', '502', '1', '1' UNION ALL
select 'Jan', '2014', 'Remarket', 'Fail', '501', '1', '1' UNION ALL
select 'Jan', '2014', 'Remarket', 'Pass', '502', '3', '1' UNION ALL
select 'Jan', '2014', 'Reuse', 'Fail', '502', '3', '1' UNION ALL
select 'Jan', '2014', 'Reuse', 'Pass', '501', '1', '1' UNION ALL
select 'Jan', '2014', 'Reuse', 'Pass', '502', '1', '1'


Problem with it is that it now gives the count of distinct invoices per D_Type, Status, Invoice which will always be one. I need count distinct for each Yr,Mo grouping only, for eg. In January there were 6 line items and two distinct invoices, so average#ofLineItems is 6/2 = 3
The distinct is definitely the challenge here. Is there any solution in SQL, (maybe a ROLLUP by count of Distinct Invoices per month?) to build this query so I can do the aggregations for D_Type, Status, and Invoice, and also show Avg#ofLineItems/CountofDistinctInvoices per Month, in Excel?
Post #1544306
Posted Sunday, February 23, 2014 4:48 PM
Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Yesterday @ 9:00 PM
Points: 680, Visits: 1,598
Problem is solved by using Dense_Rank function and the blog that helped me solve problem of not being able to use COUNT DISTINCT in the OVER clause with PARTITION BY clause is http://beyondrelational.com/modules/2/blogs/51/posts/15524/distinct-counts-using-aggregate-functions-with-over-clause.aspx.


SELECT  Yr     
,Mo
, Invoice
,D_Type
,Status
,MAX(RankByInvoice) over (partition by Yr, Mo) as NumDistinctInvoices
,RemediationDate
FROM
(
SELECT
Invoice,
RemediationDate,
D_Type, Status,
YEAR(RemediationDate) as Yr,
LEFT(DATENAME(mm,RemediationDate),3) as Mo,
RankByInvoice = DENSE_RANK() OVER (PARTITION BY YEAR(RemediationDate), LEFT(DATENAME(mm,RemediationDate),3) ORDER BY Invoice)
FROM dbo.Remediation
)p




Another link for solving COUNT DISTINCT with OVER clause is here
http://stackoverflow.com/questions/13480880/window-functions-to-count-distinct-records.

This was a tricky problem for me to solve and I didn't give it a good title or explain the end use of my query which had become a Grouping Set problem after Partition by wasn't working. It would have helped, I think, if I'd explained background and also destination (Excel) or result set!

I do not need to use GROUPING SETS and can go back to using Windowing functions for getting Count Distinct. Thanks to Jeff and Matak for feedback.
Post #1544323
« Prev Topic | Next Topic »

Add to briefcase ««12

Permissions Expand / Collapse