Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Group By Help Expand / Collapse
Author
Message
Posted Monday, July 30, 2012 4:52 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, January 23, 2014 4:55 PM
Points: 37, Visits: 144
I have a script that uses the GROUP BY clause and returns the SUM much greater than expected. Here is the script with GROUP BY followed by the script without GROUP BY. The value returned by SUM(RETAIL_SALES.[Sales Units LW]) as [Sales Units LW] is 18426. The correct value should be 6142. I appreciate your help.

STYLE COLOR SEASON YR MO WK Sales Units LW
HK87202 FWHG 2012 2012 7 4 18426.00

STYLE COLOR SEASON YR MO WK Sales Units LW
HK87202 FWHG 2012 2012 7 4 366
HK87202 FWHG 2012 2012 7 4 796
HK87202 FWHG 2012 2012 7 4 1189
HK87202 FWHG 2012 2012 7 4 1814
HK87202 FWHG 2012 2012 7 4 1977
TOTAL 6142

SELECT DISTINCT
ITEMMAST.STYLE as STYLE
,ITEMMAST.COLOR as COLOR
,Max(ITEMMAST.SEASON) as SEASON
,RETAIL_SALES.YR
,RETAIL_SALES.MO
,RETAIL_SALES.WK
,SUM(RETAIL_SALES.[Sales Units LW]) as [Sales Units LW]

FROM Evy_RH_Objects.dbo.RETAIL_SALES RETAIL_SALES
LEFT OUTER JOIN RH2007_EvyLive.dbo.ITEMMAST ITEMMAST on ITEMMAST.CUSTNO='WALM01' and (ITEMMAST.SKU=RETAIL_SALES.SKU or ITEMMAST.ITEMUPC=RETAIL_SALES.SKU)

WHERE
RETAIL_SALES.CUST_NO='WALM01'
and RETAIL_SALES.WK=4
and ITEMMAST.STYLE='HK87202'

GROUP BY ITEMMAST.STYLE, ITEMMAST.COLOR, RETAIL_SALES.YR, RETAIL_SALES.MO, RETAIL_SALES.WK

=================================================================================
SELECT DISTINCT
ITEMMAST.STYLE as STYLE
,ITEMMAST.COLOR as COLOR
,ITEMMAST.SEASON as SEASON
,RETAIL_SALES.YR
,RETAIL_SALES.MO
,RETAIL_SALES.WK
,RETAIL_SALES.[Sales Units LW] as [Sales Units LW]

FROM Evy_RH_Objects.dbo.RETAIL_SALES RETAIL_SALES
LEFT OUTER JOIN RH2007_EvyLive.dbo.ITEMMAST ITEMMAST on ITEMMAST.CUSTNO='WALM01' and (ITEMMAST.SKU=RETAIL_SALES.SKU or ITEMMAST.ITEMUPC=RETAIL_SALES.SKU)

WHERE
RETAIL_SALES.CUST_NO='WALM01'
and RETAIL_SALES.WK=4
and ITEMMAST.STYLE='HK87202'
Post #1337610
Posted Monday, July 30, 2012 10:53 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 8:06 PM
Points: 36,786, Visits: 31,243
You may have a many-to-many join going on. You should probably also have things like "ITEMMAST.CUSTNO='WALM01'" in a WHERE clause instead of an ON especially when outer joins are involved.

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1337666
Posted Monday, July 30, 2012 11:28 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 7:55 AM
Points: 1,945, Visits: 2,863
Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. If you know how, follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect.

This is minimal polite behavior on SQL forums.

You put spaces in data element names for display (1950's COBOL), “Master” in table names (magnetic tape file terms!), leading commas (punch card programming), why do you alias things to the original names? COBOL had no IN() predicate, so you did not use it. In COBOL, date is made of separate fields; in SQL it is is a single column, so you split up the sales(?) date. Do you really sell things not in inventory, like the OUTER JOINs say? Why do you have both a SKU and a UPC? Do you know that a DISTINCT and GROUP By can screw a result?

My guess, made without any DDL is that it might look more like this if done in SQL and not COBOL.

SELECT I.item_style, I.item_color, S.sale_date,
MAX(I.season_name),
SUM(S.sales_units) AS sales_units_tot
FROM Retail_Sales AS S,
Inventory AS I
WHERE I.cust_nbr = 'walm01'
AND S.cust_nbr = 'walm01'
AND S.sku IN (I.sku, I.item_upc)
AND I.item_style = 'hk87202'
GROUP BY I.item_style, I.item_color, S.sale_date;

Want to try again with more information?


Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #1337673
Posted Tuesday, July 31, 2012 1:29 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 9:30 AM
Points: 4,985, Visits: 11,679
Following on from Jeff's comment - if you remove the GROUP BY, you should be able to check whether more rows are being returned than you expect/want.


Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

When you ask a question (and please do ask a question: "My T-SQL does not work" just doesn't cut it), please provide enough information for us to understand its context.
Post #1337709
Posted Tuesday, July 31, 2012 10:17 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, January 23, 2014 4:55 PM
Points: 37, Visits: 144
I did remove the GROUP BY and and achieved the correct result. My original post shows a 2nd script without GROUP BY.
Post #1338029
Posted Tuesday, July 31, 2012 12:58 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Tuesday, September 3, 2013 8:24 AM
Points: 1,240, Visits: 5,421
JayWinter (7/31/2012)
I did remove the GROUP BY and and achieved the correct result. My original post shows a 2nd script without GROUP BY.


Your DISTINCT clause is hiding the problem. DISTINCT is processed after the GROUP BY, so any duplicates will be included in your totals for the GROUP BY, but will be excluded in your QA query.

DISTINCT is also superfluous in conjunction with a GROUP BY anyhow. The results of a simple GROUP BY statement are necessarily distinct. (That may not be the case if you have multiple grouping sets.)

Drew


J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Post #1338110
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse