Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

Select Top x% From Group Based on Total Expand / Collapse
Author
Message
Posted Friday, March 21, 2014 3:44 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: 2 days ago @ 3:26 PM
Points: 41, Visits: 266
Lynn Pettis (3/21/2014)
Added two more orders for the same tire and Dealer 3 shows up but only reports the tire as 66% of the total when in actuality it is 100%.


Hmm...Interesting. I think you stumbled on a test case that we did not think of. Namely when all the orders are the same tire size. It is not likely going to ever happen but it is good to cover all bases.

The query is supposed to go through Table 1 grouping by dealerID and counting up all a dealers orders. This number will be the denominator in the calculation. It then looks at table 2 grouping by dealerID and by tire and dividing the count by the total orders in table 1 to get the % of the total orders. Once I have a % calculated I then need to only select the tires that make up 80% of the dealers total count, everything else is left out.
Post #1553689
Posted Friday, March 21, 2014 3:51 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Yesterday @ 9:36 PM
Points: 20,738, Visits: 32,522
Added another dealer with 10 orders, 9 for one timre and 1 for a different tire. The new dealer shows up, but again the count of the Repairs for the tire shows as 8 and 80% when actually it is 9 and 90%.


CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);

INSERT INTO dbo.Table1 VALUES
(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),
(1,8),(1,9),(1,10),(2,11),(2,12),(2,13),
(3,14),(3,15),(3,16),
(4,17),(4,18),(4,19),(4,20),(4,21),(4,22),(4,23),(4,24),(4,25),(4,26);

CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));

INSERT INTO dbo.Table2 VALUES (1,'225/65R17 102T'),
(2,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'),
(5,'235/60R18 102V'),(6,'235/60R18 102V'),(7,'235/60R18 102V'),
(8,'205/55R16 89H'),(9,'205/70R15 89H'),(13,'225/65R17 102T'),
(17,'235/60R18 102V'),(18,'235/60R18 102V'),(19,'235/60R18 102V'),
(20,'235/60R18 102V'),(21,'235/60R18 102V'),(22,'235/60R18 102V'),
(23,'235/60R18 102V'),(24,'235/60R18 102V'),(25,'235/60R18 102V'),
(26,'225/65R17 102T');


select * from dbo.Table1;
select * from dbo.Table2;

select DealerId, tire, count(*) as RepairsOfThisTire,
max(cnt) as TotalRepairs,
count(*) * 1.0 / max(cnt) as PercentOfTotal
from (select t1.*, t2.tire,
row_number() over (partition by t1.DealerId
order by (case when t2.tire is null then 1 else 0 end),
t1.RepairId
) as seqnum,
count(*) over (partition by t1.DealerId) as cnt
from dbo.Table1 t1 left join
dbo.Table2 t2
on t1.RepairId = t2.RepairId
) t
where seqnum <= 0.8*cnt and tire is not null
group by DealerId, tire
order by DealerId, tire, RepairsOfThisTire desc;

go
DROP TABLE dbo.Table1;
DROP TABLE dbo.Table2;





Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1553690
Posted Friday, March 21, 2014 4:03 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Yesterday @ 9:36 PM
Points: 20,738, Visits: 32,522
Modified the order by so that it listed output in descending order by RepairsOfThisTire by DealerId, which what you original indicated what you also wanted.


CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);

INSERT INTO dbo.Table1 VALUES
(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),
(1,8),(1,9),(1,10),(2,11),(2,12),(2,13),
(3,14),(3,15),(3,16),
(4,17),(4,18),(4,19),(4,20),(4,21),(4,22),(4,23),(4,24),(4,25),(4,26);

CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));

INSERT INTO dbo.Table2 VALUES (1,'225/65R17 102T'),
(2,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'),
(5,'235/60R18 102V'),(6,'235/60R18 102V'),(7,'235/60R18 102V'),
(8,'205/55R16 89H'),(9,'205/70R15 89H'),(13,'225/65R17 102T'),
(14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V'),
(17,'235/60R18 102V'),(18,'235/60R18 102V'),(19,'235/60R18 102V'),
(20,'235/60R18 102V'),(21,'235/60R18 102V'),(22,'235/60R18 102V'),
(23,'235/60R18 102V'),(24,'235/60R18 102V'),(25,'235/60R18 102V'),
(26,'225/65R17 102T');


select * from dbo.Table1;
select * from dbo.Table2;

select DealerId, tire, count(*) as RepairsOfThisTire,
max(cnt) as TotalRepairs,
count(*) * 1.0 / max(cnt) as PercentOfTotal
from (select t1.*, t2.tire,
row_number() over (partition by t1.DealerId
order by (case when t2.tire is null then 1 else 0 end),
t1.RepairId
) as seqnum,
count(*) over (partition by t1.DealerId) as cnt
from dbo.Table1 t1 left join
dbo.Table2 t2
on t1.RepairId = t2.RepairId
) t
where seqnum <= 0.8*cnt and tire is not null
group by DealerId, tire
order by DealerId, RepairsOfThisTire desc, tire;

go
DROP TABLE dbo.Table1;
DROP TABLE dbo.Table2;





Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1553694
Posted Saturday, March 22, 2014 3:33 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 2:19 AM
Points: 1,034, Visits: 6,811
tfendt (3/21/2014)
The Preagg query is just counting the total number or records per dealer in table 2.

...


It also counts the total number of repairs per dealer.

;WITH 
Preagg AS (
SELECT DealerID, Tire, RepairsOfThisTyre,
RepairsForThisDealer = SUM(RepairsOfThisTyre) OVER(PARTITION BY DealerID)
FROM (
SELECT DealerID, Tire,
RepairsOfThisTyre = COUNT(*)
FROM Table1 d
INNER JOIN Table2 r ON r.RepairID = d.RepairID
GROUP BY DealerID, Tire
) d
) SELECT * FROM Preagg




Low-hanging fruit picker and defender of the moggies





For better assistance in answering your questions, please read this.




Understanding and using APPLY, (I) and (II) Paul White

Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Post #1553744
Posted Sunday, March 23, 2014 6:50 AM This worked for the OP Answer marked as solution
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 2:19 AM
Points: 1,034, Visits: 6,811
tfendt (3/21/2014)
I also posted this question in StackOverflow. Here is the solution:
select DealerId, Tire, count(*) as RepairsOfThisTire,
max(cnt) as TotalRepairs,
count(*) * 1.0 / max(cnt) as PercentOfTotal
from (select t1.*, t2.Tire,
row_number() over (partition by t1.DealerId
order by (case when t2.tire is null then 1 else 0 end),
t1.RepairId
) as seqnum,
count(*) over (partition by t1.DealerId) as cnt
from table1 t1 left join
table2 t2
on t1.RepairId = t2.RepairId
) t
where seqnum <= 0.8*cnt and tire is not null
group by DealerId, Tire
order by 1, 2;

Thank you for you time.


This query works only because of the sequence of RepairId in table2, which already orders most-used tyres ahead of least-used tyres, probably by accident. If table2 really is artificially ordered in this way, you're ok, otherwise you will get incorrect results. Here's a copy of Lyn's dataset slightly adjusted to show this:
DROP TABLE dbo.Table1
CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);

INSERT INTO dbo.Table1 VALUES
(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),
(1,8),(1,9),(1,10),(2,11),(2,12),(2,13),
(3,14),(3,15),(3,16);

DROP TABLE dbo.Table2
CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));

INSERT INTO dbo.Table2 VALUES (9,'225/65R17 102T'),
(7,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'),
(5,'235/60R18 102V'),(6,'235/60R18 102V'),(8,'235/60R18 102V'),
(2,'205/55R16 89H'),(1,'205/70R15 89H'),(13,'225/65R17 102T'),
(14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V');

and a query which works:
SELECT DealerID, tire, RepairsOfThisTire, TotalRepairs, 
PercentOfTotal = (RepairsOfThisTire*100.00)/TotalRepairs
FROM (
SELECT DealerID, TotalRepairs, tire, RepairsOfThisTire,
Sector = NTILE(5) OVER(PARTITION BY DealerID ORDER BY RepairsOfThisTire DESC, tire DESC)
FROM (
SELECT
t1.DealerID, t2.tire,
TotalRepairs = COUNT(*) OVER (PARTITION BY t1.DealerID),
RepairsOfThisTire = COUNT(*) OVER(PARTITION BY t1.DealerID, t2.tire)
FROM dbo.Table1 t1
LEFT JOIN dbo.Table2 t2 ON t2.RepairId = t1.RepairId
) d
WHERE tire IS NOT NULL
) e
WHERE Sector < 5
GROUP BY DealerID, TotalRepairs, tire, RepairsOfThisTire--, PercentOfTotal
ORDER BY DealerID, RepairsOfThisTire DESC, tire;




Low-hanging fruit picker and defender of the moggies





For better assistance in answering your questions, please read this.




Understanding and using APPLY, (I) and (II) Paul White

Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Post #1553811
Posted Monday, March 24, 2014 12:50 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Sunday, September 14, 2014 12:30 AM
Points: 253, Visits: 540
might be this will be handy and more readable
with cte as (    
SELECT DISTINCT t1.DealerID, t2.tire,
TotalRepairs = SUM(1) OVER (PARTITION BY t1.DealerID),
RepairsOfThisTire = SUM(1) OVER(PARTITION BY t1.DealerID, t2.tire),
PercentOfTotalRepairOrders = ROUND((SUM(1.0) OVER(PARTITION BY t1.DealerID, t2.tire)/SUM(1.0) OVER (PARTITION BY t1.DealerID)) *100,0)
FROM dbo.Table1 t1
LEFT JOIN dbo.Table2 t2
ON t2.RepairId = t1.RepairId
)
SELECT * FROM cte
WHERE tire IS NOT NULL
ORDER BY PercentOfTotalRepairOrders DESC

since you are using Sql 2012 you can use Lead or Lag function to calculate Total percent by Dealer Group and you can get the desire result




Every rule in a world of bits and bytes, can be bend or eventually be broken
MyBlog About Common dialog control
A Visualizer for viewing SqlCommand object script
Post #1553876
Posted Monday, March 24, 2014 8:09 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: 2 days ago @ 3:26 PM
Points: 41, Visits: 266
ChrisM@home (3/23/2014)

This query works only because of the sequence of RepairId in table2, which already orders most-used tyres ahead of least-used tyres, probably by accident. If table2 really is artificially ordered in this way, you're ok, otherwise you will get incorrect results. Here's a copy of Lyn's dataset slightly adjusted to show this:
DROP TABLE dbo.Table1
CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);

INSERT INTO dbo.Table1 VALUES
(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),
(1,8),(1,9),(1,10),(2,11),(2,12),(2,13),
(3,14),(3,15),(3,16);

DROP TABLE dbo.Table2
CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));

INSERT INTO dbo.Table2 VALUES (9,'225/65R17 102T'),
(7,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'),
(5,'235/60R18 102V'),(6,'235/60R18 102V'),(8,'235/60R18 102V'),
(2,'205/55R16 89H'),(1,'205/70R15 89H'),(13,'225/65R17 102T'),
(14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V');

and a query which works:
SELECT DealerID, tire, RepairsOfThisTire, TotalRepairs, 
PercentOfTotal = (RepairsOfThisTire*100.00)/TotalRepairs
FROM (
SELECT DealerID, TotalRepairs, tire, RepairsOfThisTire,
Sector = NTILE(5) OVER(PARTITION BY DealerID ORDER BY RepairsOfThisTire DESC, tire DESC)
FROM (
SELECT
t1.DealerID, t2.tire,
TotalRepairs = COUNT(*) OVER (PARTITION BY t1.DealerID),
RepairsOfThisTire = COUNT(*) OVER(PARTITION BY t1.DealerID, t2.tire)
FROM dbo.Table1 t1
LEFT JOIN dbo.Table2 t2 ON t2.RepairId = t1.RepairId
) d
WHERE tire IS NOT NULL
) e
WHERE Sector < 5
GROUP BY DealerID, TotalRepairs, tire, RepairsOfThisTire--, PercentOfTotal
ORDER BY DealerID, RepairsOfThisTire DESC, tire;



The problem with this query is it returns data for dealer id 3. The business rule I was given was that they only want to see the top 80%. In the case of dealer id 3 the query is showing the top 100%. In this example the top 80% cannot be calculated since all the orders are the same tire. The main issue I see and need to get clarification from the business on is what happens if the top tire is > 80%?
Post #1554038
Posted Monday, March 24, 2014 8:17 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 9:54 AM
Points: 6,813, Visits: 14,028
tfendt (3/24/2014)
...
The problem with this query is it returns data for dealer id 3. The business rule I was given was that they only want to see the top 80%. In the case of dealer id 3 the query is showing the top 100%. In this example the top 80% cannot be calculated since all the orders are the same tire. The main issue I see and need to get clarification from the business on is what happens if the top tire is > 80%?


Exactly. What do they want if the number of rows is not divisible by 5. Modification of the query to accommodate the decision should be straightforward.


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1554047
Posted Monday, March 24, 2014 8:18 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: 2 days ago @ 3:26 PM
Points: 41, Visits: 266
tfendt (3/24/2014)
ChrisM@home (3/23/2014)

This query works only because of the sequence of RepairId in table2, which already orders most-used tyres ahead of least-used tyres, probably by accident. If table2 really is artificially ordered in this way, you're ok, otherwise you will get incorrect results. Here's a copy of Lyn's dataset slightly adjusted to show this:
DROP TABLE dbo.Table1
CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);

INSERT INTO dbo.Table1 VALUES
(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),
(1,8),(1,9),(1,10),(2,11),(2,12),(2,13),
(3,14),(3,15),(3,16);

DROP TABLE dbo.Table2
CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));

INSERT INTO dbo.Table2 VALUES (9,'225/65R17 102T'),
(7,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'),
(5,'235/60R18 102V'),(6,'235/60R18 102V'),(8,'235/60R18 102V'),
(2,'205/55R16 89H'),(1,'205/70R15 89H'),(13,'225/65R17 102T'),
(14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V');

and a query which works:
SELECT DealerID, tire, RepairsOfThisTire, TotalRepairs, 
PercentOfTotal = (RepairsOfThisTire*100.00)/TotalRepairs
FROM (
SELECT DealerID, TotalRepairs, tire, RepairsOfThisTire,
Sector = NTILE(5) OVER(PARTITION BY DealerID ORDER BY RepairsOfThisTire DESC, tire DESC)
FROM (
SELECT
t1.DealerID, t2.tire,
TotalRepairs = COUNT(*) OVER (PARTITION BY t1.DealerID),
RepairsOfThisTire = COUNT(*) OVER(PARTITION BY t1.DealerID, t2.tire)
FROM dbo.Table1 t1
LEFT JOIN dbo.Table2 t2 ON t2.RepairId = t1.RepairId
) d
WHERE tire IS NOT NULL
) e
WHERE Sector < 5
GROUP BY DealerID, TotalRepairs, tire, RepairsOfThisTire--, PercentOfTotal
ORDER BY DealerID, RepairsOfThisTire DESC, tire;



The problem with this query is it returns data for dealer id 3. The business rule I was given was that they only want to see the top 80%. In the case of dealer id 3 the query is showing the top 100%. In this example the top 80% cannot be calculated since all the orders are the same tire. The main issue I see and need to get clarification from the business on is what happens if the top tire is > 80%?


I do like this query however. I am not 100% sure on how it works though. How does it know to select the top 80%? Why NTILE(5)?
Post #1554048
Posted Monday, March 24, 2014 8:23 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 9:54 AM
Points: 6,813, Visits: 14,028
Have a look at the NTILE function in Books Online (SQL Server help).
NTILE(n) splits the data into n equal-sized partitions numbered 1 to n in a user-defined order and with defined rules for when the data set is not divisible by n.


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1554052
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse