Select Top x% From Group Based on Total

 Author Message tfendt Mr or Mrs. 500 Group: General Forum Members Points: 591 Visits: 342 I tried searching for this but I must be using the wrong search terms. What I need to do it select the top 80 percent of records per group based on the group total. To be clear I am not trying to just grab the top x percent of rows. My example below might be better than me trying to explain Table 1:`DealerID RepairID1 11 21 31 41 51 61 71 81 91 102 112 122 13`Table 2:`RepairID Tire1 225/65R17 102T2 225/65R17 102T3 225/65R17 102T4 235/60R18 102V5 235/60R18 102V6 235/60R18 102V7 235/60R18 102V8 205/55R16 89H9 205/70R15 89H13 225/65R17 102T`Table 1 has the total number of repair orders per dealer. This can be obtained by simply grouping on DealerID and counting the number of RepairIDs.Table 2 has information on some of the repair orders and it is needed to select the top 80% of tire sizes. This table will be joined to Table 1 and grouped by DealerID and Tire.Desired Output:`DealerID Tire RepairsOfThisTire RepairCount PercentOfTotalRepairOrders1 235/60R18 102V 4 10 401 225/65R17 102T 3 10 301 205/55R16 89H 1 10 102 225/65R17 102T 1 3 33`The equation I am given to calculate the top tires per dealer is as follows: Total # of tires for the size / Total # of repair orders per dealer.Here is what I have so far though I think I might have to rewrite all of it.`SELECT DealerID , COUNT(RepairID)INTO #TotalDealerRepairOrdersFROM dbo.Table1GROUP BY DealerIDSELECT DealerID, Tire, COUNT(RepairID) AS 'Tire Count'INTO #tempCountsFROM dbo.Table1 INNER JOIN dbo.Table2 ON Table1.RepairID = Table2.RepairID GROUP BY Table1.DealerID, TireSELECT tc.DealerId , tc.Tire, CONVERT(DECIMAL(18,4), tc.[Tire Count]) / tro.NumberROs * 100 AS 'Percent'FROM #tempCounts tc INNER JOIN #TotalDealerRepairOrders tro ON tro.DealerID = tc.DealerIdGROUP BY tc.DealerID, tc.Tire, tc.[Tire Count], tro.NumberROsHAVING SUM(tc.[Tire Count]) <= .8 * tro.NumberROsORDER BY tc.[Tire Count] DESCDROP TABLE #TotalDealerRepairOrdersDROP TABLE #tempCounts`I feel like I should be using the Partition function. Thanks for your help. ChrisM@Work SSC Guru Group: General Forum Members Points: 96281 Visits: 20662 What do you mean by "Total # of tires for the size"? Here's your data set up in a consumable format:`DROP TABLE #Table1CREATE TABLE #Table1 (DealerID INT, RepairID INT)INSERT INTO #Table1 VALUES (1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7), (1,8),(1,9),(1,10),(2,11),(2,12),(2,13)DROP TABLE #Table2CREATE TABLE #Table2 (RepairID INT, Tire VARCHAR(15))INSERT INTO #Table2 VALUES (1,'225/65R17 102T'), (2,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'), (5,'235/60R18 102V'),(6,'235/60R18 102V'),(7,'235/60R18 102V'), (8,'205/55R16 89H'),(9,'205/70R15 89H'),(13,'225/65R17 102T')`Here's a query built upon your data which may go some way towards what you're trying to do:`SELECT DealerID, Tire, RepairsOfThisTyre, RepairCount = SUM(RepairsOfThisTyre) OVER(PARTITION BY DealerID)FROM ( SELECT DealerID, Tire, RepairsOfThisTyre = COUNT(*) FROM #Table1 d INNER JOIN #Table2 r ON r.RepairID = d.RepairID GROUP BY DealerID, Tire) dORDER BY DealerID, RepairsOfThisTyre DESC`How close is this to your requested result? “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail ShawFor fast, accurate and documented assistance in answering your questions, please read this article.Understanding and using APPLY, (I) and (II) Paul White Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff ModenExploring Recursive CTEs by Example Dwain Camps tfendt Mr or Mrs. 500 Group: General Forum Members Points: 591 Visits: 342 It is close to what I wrote in my example.Your query returns this:`DealerID Tire RepairsOfThisTyre RepairCount1 235/60R18 102V 4 91 225/65R17 102T 3 91 205/55R16 89H 1 91 205/70R15 89H 1 92 225/65R17 102T 1 1`I need it to return this:`DealerID Tire RepairsOfThisTyre RepairCount PercentOfTotalRepairOrders1 235/60R18 102V 4 10 401 225/65R17 102T 3 10 301 205/55R16 89H 1 10 102 225/65R17 102T 1 3 33`The RepairCount is the total number of repair orders in Table 1. The record with DealerID 1 and Tire 205/70R15 89H is not included since the 3 records above it make up the top 80%. ChrisM@Work SSC Guru Group: General Forum Members Points: 96281 Visits: 20662 Gotcha - think you're looking for this:`;WITH Preagg AS ( SELECT DealerID, Tire, RepairsOfThisTyre, RepairsForThisDealer = SUM(RepairsOfThisTyre) OVER(PARTITION BY DealerID) FROM ( SELECT DealerID, Tire, RepairsOfThisTyre = COUNT(*) FROM #Table1 d INNER JOIN #Table2 r ON r.RepairID = d.RepairID GROUP BY DealerID, Tire ) d), FinalAgg AS ( SELECT DealerID, Tire, RepairsOfThisTyre, RepairsForThisDealer, x.PCT, q = SUM(x.PCT) OVER(PARTITION BY DealerID ORDER BY x.PCT DESC, Tire), rn = ROW_NUMBER() OVER(PARTITION BY DealerID ORDER BY RepairsOfThisTyre DESC) FROM Preagg CROSS APPLY (SELECT PCT = (RepairsOfThisTyre*100.00)/RepairsForThisDealer) x)SELECT f.* FROM FinalAgg fCROSS APPLY ( SELECT TOP 1 * FROM FinalAgg fi WHERE fi.DealerID = f.DealerID AND q >= 80 ORDER BY rn) xWHERE f.rn <= x.rnORDER BY f.DealerID, f.RepairsOfThisTyre DESC`Give it a shot. It's highly likely that it can be improved. “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail ShawFor fast, accurate and documented assistance in answering your questions, please read this article.Understanding and using APPLY, (I) and (II) Paul White Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff ModenExploring Recursive CTEs by Example Dwain Camps tfendt Mr or Mrs. 500 Group: General Forum Members Points: 591 Visits: 342 The Preagg query is just counting the total number or records per dealer in table 2.I switched the inner join to a left outer join and got some crazy results. The only thing I can think of to correct it is to do this:`SELECT DealerId, COUNT(RepairID) AS 'RepairsForThisDealer'INTO #tempFROM #Table1GROUP BY DealerId;WITH Preagg AS ( SELECT d.DealerID, Tire, COUNT(r.RepairID) AS RepairsOfThisTyre, RepairsForThisDealerFROM #Table1 d INNER JOIN #Table2 r ON r.RepairID = d.RepairID INNER JOIN #temp t ON t.DealerId = d.DealerIdGROUP BY d.DealerID, Tire, RepairsForThisDealer), FinalAgg AS ( SELECT DealerID, Tire, RepairsOfThisTyre, RepairsForThisDealer, x.PCT, q = SUM(x.PCT) OVER(PARTITION BY DealerID ORDER BY x.PCT DESC, Tire), rn = ROW_NUMBER() OVER(PARTITION BY DealerID ORDER BY RepairsOfThisTyre DESC) FROM Preagg CROSS APPLY (SELECT PCT = (RepairsOfThisTyre*100.00)/RepairsForThisDealer) x)SELECT f.* FROM FinalAgg fCROSS APPLY ( SELECT TOP 1 * FROM FinalAgg fi WHERE fi.DealerID = f.DealerID AND q >= 80 ORDER BY rn) xWHERE f.rn <= x.rnORDER BY f.DealerID, f.RepairsOfThisTyre DESC`I get the correct percents, but now it doesn't show dealer 2. tfendt Mr or Mrs. 500 Group: General Forum Members Points: 591 Visits: 342 I also posted this question in StackOverflow. Here is the solution they came up with:`select DealerId, Tire, count(*) as RepairsOfThisTire, max(cnt) as TotalRepairs, count(*) * 1.0 / max(cnt) as PercentOfTotalfrom (select t1.*, t2.Tire, row_number() over (partition by t1.DealerId order by (case when t2.tire is null then 1 else 0 end), t1.RepairId ) as seqnum, count(*) over (partition by t1.DealerId) as cnt from table1 t1 left join table2 t2 on t1.RepairId = t2.RepairId ) twhere seqnum <= 0.8*cnt and tire is not nullgroup by DealerId, Tireorder by 1, 2;` Lynn Pettis SSC Guru Group: General Forum Members Points: 222767 Visits: 40382 Just an FYI, you shouldn't order by ordinal values. You really should specify the column names you are ordering by. If I remember correctly ordering by ordinal values is being deprecated. Lynn PettisFor better assistance in answering your questions, click hereFor tips to get better help with Performance Problems, click hereFor Running Totals and its variations, click here or when working with partitioned tablesFor more about Tally Tables, click hereFor more about Cross Tabs and Pivots, click here and hereManaging Transaction LogsSQL Musings from the Desert Fountain Valley SQL (My Mirror Blog) tfendt Mr or Mrs. 500 Group: General Forum Members Points: 591 Visits: 342 Thanks for the tip. I will make sure to change it in my query. Lynn Pettis SSC Guru Group: General Forum Members Points: 222767 Visits: 40382 Curious, is this what you actually want? I added a third dealer with only one order for one tire. This dealer does not show up in the results set.`CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);INSERT INTO dbo.Table1 VALUES (1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7), (1,8),(1,9),(1,10),(2,11),(2,12),(2,13), (3,14);CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));INSERT INTO dbo.Table2 VALUES (1,'225/65R17 102T'), (2,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'), (5,'235/60R18 102V'),(6,'235/60R18 102V'),(7,'235/60R18 102V'), (8,'205/55R16 89H'),(9,'205/70R15 89H'),(13,'225/65R17 102T'), (14,'235/60R18 102V');select * from dbo.Table1;select * from dbo.Table2;select DealerId, tire, count(*) as RepairsOfThisTire, max(cnt) as TotalRepairs, count(*) * 1.0 / max(cnt) as PercentOfTotalfrom (select t1.*, t2.tire, row_number() over (partition by t1.DealerId order by (case when t2.tire is null then 1 else 0 end), t1.RepairId ) as seqnum, count(*) over (partition by t1.DealerId) as cnt from dbo.Table1 t1 left join dbo.Table2 t2 on t1.RepairId = t2.RepairId ) twhere seqnum <= 0.8*cnt and tire is not nullgroup by DealerId, tireorder by DealerId, tire, RepairsOfThisTire desc;goDROP TABLE dbo.Table1;DROP TABLE dbo.Table2;` Lynn PettisFor better assistance in answering your questions, click hereFor tips to get better help with Performance Problems, click hereFor Running Totals and its variations, click here or when working with partitioned tablesFor more about Tally Tables, click hereFor more about Cross Tabs and Pivots, click here and hereManaging Transaction LogsSQL Musings from the Desert Fountain Valley SQL (My Mirror Blog) Lynn Pettis SSC Guru Group: General Forum Members Points: 222767 Visits: 40382 Added two more orders for the same tire and Dealer 3 shows up but only reports the tire as 66% of the total when in actuality it is 100%.`CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);INSERT INTO dbo.Table1 VALUES (1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7), (1,8),(1,9),(1,10),(2,11),(2,12),(2,13), (3,14),(3,15),(3,16);CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));INSERT INTO dbo.Table2 VALUES (1,'225/65R17 102T'), (2,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'), (5,'235/60R18 102V'),(6,'235/60R18 102V'),(7,'235/60R18 102V'), (8,'205/55R16 89H'),(9,'205/70R15 89H'),(13,'225/65R17 102T'), (14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V');select * from dbo.Table1;select * from dbo.Table2;select DealerId, tire, count(*) as RepairsOfThisTire, max(cnt) as TotalRepairs, count(*) * 1.0 / max(cnt) as PercentOfTotalfrom (select t1.*, t2.tire, row_number() over (partition by t1.DealerId order by (case when t2.tire is null then 1 else 0 end), t1.RepairId ) as seqnum, count(*) over (partition by t1.DealerId) as cnt from dbo.Table1 t1 left join dbo.Table2 t2 on t1.RepairId = t2.RepairId ) twhere seqnum <= 0.8*cnt and tire is not nullgroup by DealerId, tireorder by DealerId, tire, RepairsOfThisTire desc;goDROP TABLE dbo.Table1;DROP TABLE dbo.Table2;` Lynn PettisFor better assistance in answering your questions, click hereFor tips to get better help with Performance Problems, click hereFor Running Totals and its variations, click here or when working with partitioned tablesFor more about Tally Tables, click hereFor more about Cross Tabs and Pivots, click here and hereManaging Transaction LogsSQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)