Log in  ::  Register  ::  Not logged in

 Recent PostsRecent Posts Popular TopicsPopular Topics
 Home Search Members Calendar Who's On

 Select Top x% From Group Based on Total Rate Topic Display Mode Topic Options
Author
 Message
 Posted Friday, March 21, 2014 3:44 PM
 Valued Member Group: General Forum Members Last Login: Thursday, May 21, 2015 3:10 PM Points: 55, Visits: 341
 Lynn Pettis (3/21/2014)Added two more orders for the same tire and Dealer 3 shows up but only reports the tire as 66% of the total when in actuality it is 100%.Hmm...Interesting. I think you stumbled on a test case that we did not think of. Namely when all the orders are the same tire size. It is not likely going to ever happen but it is good to cover all bases. The query is supposed to go through Table 1 grouping by dealerID and counting up all a dealers orders. This number will be the denominator in the calculation. It then looks at table 2 grouping by dealerID and by tire and dividing the count by the total orders in table 1 to get the % of the total orders. Once I have a % calculated I then need to only select the tires that make up 80% of the dealers total count, everything else is left out.
Post #1553689
 Posted Friday, March 21, 2014 3:51 PM
 SSC-Insane Group: General Forum Members Last Login: Today @ 10:07 AM Points: 21,518, Visits: 34,067
 Added another dealer with 10 orders, 9 for one timre and 1 for a different tire. The new dealer shows up, but again the count of the Repairs for the tire shows as 8 and 80% when actually it is 9 and 90%.`CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);INSERT INTO dbo.Table1 VALUES (1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7), (1,8),(1,9),(1,10),(2,11),(2,12),(2,13), (3,14),(3,15),(3,16), (4,17),(4,18),(4,19),(4,20),(4,21),(4,22),(4,23),(4,24),(4,25),(4,26);CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));INSERT INTO dbo.Table2 VALUES (1,'225/65R17 102T'), (2,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'), (5,'235/60R18 102V'),(6,'235/60R18 102V'),(7,'235/60R18 102V'), (8,'205/55R16 89H'),(9,'205/70R15 89H'),(13,'225/65R17 102T'), (17,'235/60R18 102V'),(18,'235/60R18 102V'),(19,'235/60R18 102V'), (20,'235/60R18 102V'),(21,'235/60R18 102V'),(22,'235/60R18 102V'), (23,'235/60R18 102V'),(24,'235/60R18 102V'),(25,'235/60R18 102V'), (26,'225/65R17 102T');select * from dbo.Table1;select * from dbo.Table2;select DealerId, tire, count(*) as RepairsOfThisTire, max(cnt) as TotalRepairs, count(*) * 1.0 / max(cnt) as PercentOfTotalfrom (select t1.*, t2.tire, row_number() over (partition by t1.DealerId order by (case when t2.tire is null then 1 else 0 end), t1.RepairId ) as seqnum, count(*) over (partition by t1.DealerId) as cnt from dbo.Table1 t1 left join dbo.Table2 t2 on t1.RepairId = t2.RepairId ) twhere seqnum <= 0.8*cnt and tire is not nullgroup by DealerId, tireorder by DealerId, tire, RepairsOfThisTire desc;goDROP TABLE dbo.Table1;DROP TABLE dbo.Table2;`
Post #1553690
 Posted Friday, March 21, 2014 4:03 PM
 SSC-Insane Group: General Forum Members Last Login: Today @ 10:07 AM Points: 21,518, Visits: 34,067
 Modified the order by so that it listed output in descending order by RepairsOfThisTire by DealerId, which what you original indicated what you also wanted.`CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);INSERT INTO dbo.Table1 VALUES (1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7), (1,8),(1,9),(1,10),(2,11),(2,12),(2,13), (3,14),(3,15),(3,16), (4,17),(4,18),(4,19),(4,20),(4,21),(4,22),(4,23),(4,24),(4,25),(4,26);CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));INSERT INTO dbo.Table2 VALUES (1,'225/65R17 102T'), (2,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'), (5,'235/60R18 102V'),(6,'235/60R18 102V'),(7,'235/60R18 102V'), (8,'205/55R16 89H'),(9,'205/70R15 89H'),(13,'225/65R17 102T'), (14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V'), (17,'235/60R18 102V'),(18,'235/60R18 102V'),(19,'235/60R18 102V'), (20,'235/60R18 102V'),(21,'235/60R18 102V'),(22,'235/60R18 102V'), (23,'235/60R18 102V'),(24,'235/60R18 102V'),(25,'235/60R18 102V'), (26,'225/65R17 102T');select * from dbo.Table1;select * from dbo.Table2;select DealerId, tire, count(*) as RepairsOfThisTire, max(cnt) as TotalRepairs, count(*) * 1.0 / max(cnt) as PercentOfTotalfrom (select t1.*, t2.tire, row_number() over (partition by t1.DealerId order by (case when t2.tire is null then 1 else 0 end), t1.RepairId ) as seqnum, count(*) over (partition by t1.DealerId) as cnt from dbo.Table1 t1 left join dbo.Table2 t2 on t1.RepairId = t2.RepairId ) twhere seqnum <= 0.8*cnt and tire is not nullgroup by DealerId, tireorder by DealerId, RepairsOfThisTire desc, tire;goDROP TABLE dbo.Table1;DROP TABLE dbo.Table2;`
Post #1553694
 Posted Saturday, March 22, 2014 3:33 AM
 Ten Centuries Group: General Forum Members Last Login: Today @ 8:22 AM Points: 1,073, Visits: 7,760
 tfendt (3/21/2014)The Preagg query is just counting the total number or records per dealer in table 2....It also counts the total number of repairs per dealer.`;WITH Preagg AS ( SELECT DealerID, Tire, RepairsOfThisTyre, RepairsForThisDealer = SUM(RepairsOfThisTyre) OVER(PARTITION BY DealerID) FROM ( SELECT DealerID, Tire, RepairsOfThisTyre = COUNT(*) FROM Table1 d INNER JOIN Table2 r ON r.RepairID = d.RepairID GROUP BY DealerID, Tire ) d) SELECT * FROM Preagg` Low-hanging fruit picker and defender of the moggies For better assistance in answering your questions, please read this. Understanding and using APPLY, (I) and (II) Paul White Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Post #1553744
 Posted Sunday, March 23, 2014 6:50 AM This worked for the OP
 Ten Centuries Group: General Forum Members Last Login: Today @ 8:22 AM Points: 1,073, Visits: 7,760
 tfendt (3/21/2014)I also posted this question in StackOverflow. Here is the solution:`select DealerId, Tire, count(*) as RepairsOfThisTire, max(cnt) as TotalRepairs, count(*) * 1.0 / max(cnt) as PercentOfTotalfrom (select t1.*, t2.Tire, row_number() over (partition by t1.DealerId order by (case when t2.tire is null then 1 else 0 end), t1.RepairId ) as seqnum, count(*) over (partition by t1.DealerId) as cnt from table1 t1 left join table2 t2 on t1.RepairId = t2.RepairId ) twhere seqnum <= 0.8*cnt and tire is not nullgroup by DealerId, Tireorder by 1, 2;`Thank you for you time.This query works only because of the sequence of RepairId in table2, which already orders most-used tyres ahead of least-used tyres, probably by accident. If table2 really is artificially ordered in this way, you're ok, otherwise you will get incorrect results. Here's a copy of Lyn's dataset slightly adjusted to show this:`DROP TABLE dbo.Table1CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);INSERT INTO dbo.Table1 VALUES (1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7), (1,8),(1,9),(1,10),(2,11),(2,12),(2,13), (3,14),(3,15),(3,16);DROP TABLE dbo.Table2CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));INSERT INTO dbo.Table2 VALUES (9,'225/65R17 102T'), (7,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'), (5,'235/60R18 102V'),(6,'235/60R18 102V'),(8,'235/60R18 102V'), (2,'205/55R16 89H'),(1,'205/70R15 89H'),(13,'225/65R17 102T'), (14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V');`and a query which works:`SELECT DealerID, tire, RepairsOfThisTire, TotalRepairs, PercentOfTotal = (RepairsOfThisTire*100.00)/TotalRepairsFROM ( SELECT DealerID, TotalRepairs, tire, RepairsOfThisTire, Sector = NTILE(5) OVER(PARTITION BY DealerID ORDER BY RepairsOfThisTire DESC, tire DESC) FROM ( SELECT t1.DealerID, t2.tire, TotalRepairs = COUNT(*) OVER (PARTITION BY t1.DealerID), RepairsOfThisTire = COUNT(*) OVER(PARTITION BY t1.DealerID, t2.tire) FROM dbo.Table1 t1 LEFT JOIN dbo.Table2 t2 ON t2.RepairId = t1.RepairId ) d WHERE tire IS NOT NULL) eWHERE Sector < 5 GROUP BY DealerID, TotalRepairs, tire, RepairsOfThisTire--, PercentOfTotalORDER BY DealerID, RepairsOfThisTire DESC, tire;` Low-hanging fruit picker and defender of the moggies For better assistance in answering your questions, please read this. Understanding and using APPLY, (I) and (II) Paul White Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Post #1553811
 Posted Monday, March 24, 2014 12:50 AM
 SSC Veteran Group: General Forum Members Last Login: Tuesday, March 31, 2015 3:07 AM Points: 256, Visits: 549
 might be this will be handy and more readable`with cte as ( SELECT DISTINCT t1.DealerID, t2.tire, TotalRepairs = SUM(1) OVER (PARTITION BY t1.DealerID), RepairsOfThisTire = SUM(1) OVER(PARTITION BY t1.DealerID, t2.tire), PercentOfTotalRepairOrders = ROUND((SUM(1.0) OVER(PARTITION BY t1.DealerID, t2.tire)/SUM(1.0) OVER (PARTITION BY t1.DealerID)) *100,0)FROM dbo.Table1 t1 LEFT JOIN dbo.Table2 t2 ON t2.RepairId = t1.RepairId)SELECT * FROM cteWHERE tire IS NOT NULLORDER BY PercentOfTotalRepairOrders DESC `since you are using Sql 2012 you can use Lead or Lag function to calculate Total percent by Dealer Group and you can get the desire result Every rule in a world of bits and bytes, can be bend or eventually be brokenMyBlog About Common dialog controlA Visualizer for viewing SqlCommand object script
Post #1553876
 Posted Monday, March 24, 2014 8:09 AM
 Valued Member Group: General Forum Members Last Login: Thursday, May 21, 2015 3:10 PM Points: 55, Visits: 341
 ChrisM@home (3/23/2014)This query works only because of the sequence of RepairId in table2, which already orders most-used tyres ahead of least-used tyres, probably by accident. If table2 really is artificially ordered in this way, you're ok, otherwise you will get incorrect results. Here's a copy of Lyn's dataset slightly adjusted to show this:`DROP TABLE dbo.Table1CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);INSERT INTO dbo.Table1 VALUES (1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7), (1,8),(1,9),(1,10),(2,11),(2,12),(2,13), (3,14),(3,15),(3,16);DROP TABLE dbo.Table2CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));INSERT INTO dbo.Table2 VALUES (9,'225/65R17 102T'), (7,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'), (5,'235/60R18 102V'),(6,'235/60R18 102V'),(8,'235/60R18 102V'), (2,'205/55R16 89H'),(1,'205/70R15 89H'),(13,'225/65R17 102T'), (14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V');`and a query which works:`SELECT DealerID, tire, RepairsOfThisTire, TotalRepairs, PercentOfTotal = (RepairsOfThisTire*100.00)/TotalRepairsFROM ( SELECT DealerID, TotalRepairs, tire, RepairsOfThisTire, Sector = NTILE(5) OVER(PARTITION BY DealerID ORDER BY RepairsOfThisTire DESC, tire DESC) FROM ( SELECT t1.DealerID, t2.tire, TotalRepairs = COUNT(*) OVER (PARTITION BY t1.DealerID), RepairsOfThisTire = COUNT(*) OVER(PARTITION BY t1.DealerID, t2.tire) FROM dbo.Table1 t1 LEFT JOIN dbo.Table2 t2 ON t2.RepairId = t1.RepairId ) d WHERE tire IS NOT NULL) eWHERE Sector < 5 GROUP BY DealerID, TotalRepairs, tire, RepairsOfThisTire--, PercentOfTotalORDER BY DealerID, RepairsOfThisTire DESC, tire;`The problem with this query is it returns data for dealer id 3. The business rule I was given was that they only want to see the top 80%. In the case of dealer id 3 the query is showing the top 100%. In this example the top 80% cannot be calculated since all the orders are the same tire. The main issue I see and need to get clarification from the business on is what happens if the top tire is > 80%?
Post #1554038
 Posted Monday, March 24, 2014 8:17 AM
 SSCertifiable Group: General Forum Members Last Login: Today @ 9:57 AM Points: 7,162, Visits: 15,020
 tfendt (3/24/2014)...The problem with this query is it returns data for dealer id 3. The business rule I was given was that they only want to see the top 80%. In the case of dealer id 3 the query is showing the top 100%. In this example the top 80% cannot be calculated since all the orders are the same tire. The main issue I see and need to get clarification from the business on is what happens if the top tire is > 80%?Exactly. What do they want if the number of rows is not divisible by 5. Modification of the query to accommodate the decision should be straightforward. “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail ShawFor fast, accurate and documented assistance in answering your questions, please read this article.Understanding and using APPLY, (I) and (II) Paul White Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff ModenExploring Recursive CTEs by Example Dwain Camps
Post #1554047
 Posted Monday, March 24, 2014 8:18 AM
 Valued Member Group: General Forum Members Last Login: Thursday, May 21, 2015 3:10 PM Points: 55, Visits: 341
 tfendt (3/24/2014)ChrisM@home (3/23/2014)This query works only because of the sequence of RepairId in table2, which already orders most-used tyres ahead of least-used tyres, probably by accident. If table2 really is artificially ordered in this way, you're ok, otherwise you will get incorrect results. Here's a copy of Lyn's dataset slightly adjusted to show this:`DROP TABLE dbo.Table1CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);INSERT INTO dbo.Table1 VALUES (1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7), (1,8),(1,9),(1,10),(2,11),(2,12),(2,13), (3,14),(3,15),(3,16);DROP TABLE dbo.Table2CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));INSERT INTO dbo.Table2 VALUES (9,'225/65R17 102T'), (7,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'), (5,'235/60R18 102V'),(6,'235/60R18 102V'),(8,'235/60R18 102V'), (2,'205/55R16 89H'),(1,'205/70R15 89H'),(13,'225/65R17 102T'), (14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V');`and a query which works:`SELECT DealerID, tire, RepairsOfThisTire, TotalRepairs, PercentOfTotal = (RepairsOfThisTire*100.00)/TotalRepairsFROM ( SELECT DealerID, TotalRepairs, tire, RepairsOfThisTire, Sector = NTILE(5) OVER(PARTITION BY DealerID ORDER BY RepairsOfThisTire DESC, tire DESC) FROM ( SELECT t1.DealerID, t2.tire, TotalRepairs = COUNT(*) OVER (PARTITION BY t1.DealerID), RepairsOfThisTire = COUNT(*) OVER(PARTITION BY t1.DealerID, t2.tire) FROM dbo.Table1 t1 LEFT JOIN dbo.Table2 t2 ON t2.RepairId = t1.RepairId ) d WHERE tire IS NOT NULL) eWHERE Sector < 5 GROUP BY DealerID, TotalRepairs, tire, RepairsOfThisTire--, PercentOfTotalORDER BY DealerID, RepairsOfThisTire DESC, tire;`The problem with this query is it returns data for dealer id 3. The business rule I was given was that they only want to see the top 80%. In the case of dealer id 3 the query is showing the top 100%. In this example the top 80% cannot be calculated since all the orders are the same tire. The main issue I see and need to get clarification from the business on is what happens if the top tire is > 80%?I do like this query however. I am not 100% sure on how it works though. How does it know to select the top 80%? Why NTILE(5)?
Post #1554048
 Posted Monday, March 24, 2014 8:23 AM
 SSCertifiable Group: General Forum Members Last Login: Today @ 9:57 AM Points: 7,162, Visits: 15,020
 Have a look at the NTILE function in Books Online (SQL Server help). NTILE(n) splits the data into n equal-sized partitions numbered 1 to n in a user-defined order and with defined rules for when the data set is not divisible by n. “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail ShawFor fast, accurate and documented assistance in answering your questions, please read this article.Understanding and using APPLY, (I) and (II) Paul White Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff ModenExploring Recursive CTEs by Example Dwain Camps
Post #1554052

 Permissions