Problem with writing a query. Kindly guide...

Question

Problem with writing a query. Kindly guide...

T.Ashish

SSCarpal Tunnel

Points: 4248
More actions
September 26, 2013 at 7:17 am

#283773

Hi All,
I have RATE and SHIFT columns in a table as mentioned below. When I pass RATE=1 then I get SHIFT = 1,2,3. Now in second step, I need to write a query that will find all rates where atleast these three shifts compulsorily exist (e.g. RATE 3,5 as per below data).
All above work is to be done as a part of a single query and first input of RATE will be dynamic. Below mentioned data is only for example, and real table will be having huge data.
RateShift
11
1 2
1 3
25
31
32
33
84
41
42
51
52
53
54
61
72
82
Regards,
Ashish

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Sean Lange SSC Guru Points: 286573 More actions · Answer 1

You have been around here long enough to know that posting some consumable ddl and data is far more likely to get responses.

That being said why is only 3 and 5 returned? What about 1?

Seems to me like all you need is

GROUP BY Col1

HAVING COUNT(*) >= 3

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Hunchback SSCrazy Points: 2649 More actions · Answer 2

This problem is known as Relational Division[/url].

SET NOCOUNT ON;

USE tempdb;

GO

DECLARE @T TABLE (

Rate int NOT NULL,

Shift int NOT NULL,

PRIMARY KEY (Rate, Shift)

);

INSERT INTO @T (

Rate,

Shift

)

VALUES

(1, 1),

(1, 2),

(1, 3),

(2, 5),

(3, 1),

(3, 2),

(3, 3),

(8, 4),

(4, 1),

(4, 2),

(5, 1),

(5, 2),

(5, 3),

(5, 4),

(6, 1),

(7, 2),

(8, 2);

DECLARE @Rate int = 1;

WITH C1 AS (

SELECT

Shift

FROM

@T

WHERE

Rate = @Rate

)

SELECT

A.Rate

FROM

@T AS A

INNER JOIN

C1 AS B

ON A.Shift = B.Shift

GROUP BY

A.Rate

HAVING

COUNT(*) = (SELECT COUNT(*) FROM C1);

GO

/*

Rate

1

3

5

*/

Sean Lange SSC Guru Points: 286573 More actions · Answer 3

Thanks to hunchback for the ddl and data. Here is another way to do the same thing.

select Rate

from @T

cross apply (select count(*) as RowsCount from @t where rate = @Rate) r

group by Rate

having COUNT(*) >= max(r.RowsCount)

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Dwain Camps SSC Guru Points: 86908 More actions · Answer 4

Sean Lange (9/26/2013)
Thanks to hunchback for the ddl and data. Here is another way to do the same thing.
select Rate
from @T
cross apply (select count(*) as RowsCount from @t where rate = @Rate) r
group by Rate
having COUNT(*) >= max(r.RowsCount)

Sean - Should this return Rate = 9? I think your approach simply returns any Rate that has 3 or more rows.

DECLARE @T TABLE (

Rate int NOT NULL,

Shift int NOT NULL,

PRIMARY KEY (Rate, Shift)

);

INSERT INTO @T (

Rate,

Shift

)

VALUES

(1, 1),

(1, 2),

(1, 3),

(2, 5),

(3, 1),

(3, 2),

(3, 3),

(8, 4),

(4, 1),

(4, 2),

(5, 1),

(5, 2),

(5, 3),

(5, 4),

(6, 1),

(7, 2),

(8, 2),

(9, 2),

(9, 3),

(9, 4);

DECLARE @Rate int = 1;

select Rate

from @T

cross apply (select count(*) as RowsCount from @t where rate = @Rate) r

group by Rate

having COUNT(*) >= max(r.RowsCount)

I think this works as the OP requested also:

SELECT Rate

FROM @T

WHERE Shift IN (SELECT Shift FROM @T WHERE Rate = @Rate)

GROUP BY Rate

HAVING COUNT(*) = (SELECT COUNT(*) FROM @T WHERE Rate = @Rate);

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86908 More actions · Answer 5

I am a bit curious how hunchback and my solutions would scale against a divide and conquer approach.

SELECT Shift

INTO #T

FROM @T

WHERE Rate = @Rate;

DECLARE @Rows INT = @@ROWCOUNT;

SELECT Rate

FROM @T

WHERE Shift IN (SELECT Shift FROM #T)

GROUP BY Rate

HAVING COUNT(*) = @Rows;

GO

DROP TABLE #T;

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Mr. Kapsicum SSCertifiable Points: 6128 More actions · Answer 6

well, almost all the sql masters have answered the questions.

a simple group by and having count will serve the purpose.

T.Ashish SSCarpal Tunnel Points: 4248 More actions · Answer 7

Thanks hunchback, Sean Lange and dwain.c for help.

I'm going with CROSS APPLY as it has least logical reads and lesser cost then other two.

Thanks Again.

Dwain Camps SSC Guru Points: 86908 More actions · Answer 8

If I am reading this article (http://weblogs.sqlteam.com/peterl/archive/2010/07/02/Proper-Relational-Division-With-Sets.aspx) correctly, this solution should also be pretty fast:

WITH Target AS

(

SELECT Rate, Shift

FROM @T

WHERE Rate = @Rate

)

SELECT c.Rate

FROM

(

SELECT Rate, rc=COUNT(*)

FROM @T

GROUP BY Rate

) a

JOIN

(

SELECT rc=COUNT(*)

FROM Target

) b ON b.rc <= a.rc

JOIN @T AS c ON c.Rate = a.Rate

JOIN Target AS d ON d.Shift = c.Shift

GROUP BY c.Rate

HAVING COUNT(*) = MIN(b.rc);

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86908 More actions · Answer 9

T.Ashish (9/26/2013)
Thanks hunchback, Sean Lange and dwain.c for help.
I'm going with CROSS APPLY as it has least logical reads and lesser cost then other two.
Thanks Again.

I do not recommend that you judge a query's performance by its plan cost.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86908 More actions · Answer 10

Continuing curiosity regarding this problem drove me to see how the proposed solutions scale, so I built a test harness.

CREATE TABLE #T (

Rate int NOT NULL,

Shift int NOT NULL,

PRIMARY KEY (Rate, Shift)

);

INSERT INTO #T (

Rate,

Shift

)

VALUES

(-1, 1),

(-1, 2),

(-1, 3);

WITH Tally (n) AS

(

SELECT TOP 500000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

FROM sys.all_columns a CROSS JOIN sys.all_columns b

),

STally (n) AS

(

SELECT TOP 10 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

FROM sys.all_columns a

)

INSERT INTO #T

SELECT a.n, b.n

FROM Tally a

CROSS APPLY STally b;

DELETE

FROM #T

WHERE (Rate < 20000 AND Shift = 1) OR (Rate > 400000 AND Shift = 3)

OR (RATE BETWEEN 50000 AND 150000 AND Shift = 2);

DECLARE @Rate INT = -1

,@Holder INT

,@StartDT DATETIME;

SELECT @StartDT=GETDATE();

PRINT 'Solution by Hunchback';

SET STATISTICS TIME ON;

WITH C1 AS (

SELECT

Shift

FROM

#T

WHERE

Rate = @Rate

)

SELECT

@Holder=A.Rate

FROM

#T AS A

INNER JOIN

C1 AS B

ON A.Shift = B.Shift

GROUP BY

A.Rate

HAVING

COUNT(*) = (SELECT COUNT(*) FROM C1);

SET STATISTICS TIME OFF;

SELECT ElapsedMS=DATEDIFF(millisecond, @StartDT, GETDATE());

SELECT @StartDT=GETDATE();

PRINT 'Solution by Dwain.C';

SET STATISTICS TIME ON;

SELECT @Holder=Rate

FROM #T

WHERE Shift IN (SELECT Shift FROM #T WHERE Rate = @Rate)

GROUP BY Rate

HAVING COUNT(*) = (SELECT COUNT(*) FROM #T WHERE Rate = @Rate);

SET STATISTICS TIME OFF;

SELECT ElapsedMS=DATEDIFF(millisecond, @StartDT, GETDATE());

SELECT @StartDT=GETDATE();

PRINT 'Divide and Conquer Solution by Dwain.C';

SET STATISTICS TIME ON;

SELECT Shift

INTO #T1

FROM #T

WHERE Rate = @Rate;

DECLARE @Rows INT = @@ROWCOUNT;

SELECT @Holder=Rate

FROM #T

WHERE Shift IN (SELECT Shift FROM #T1)

GROUP BY Rate

HAVING COUNT(*) = @Rows;

SET STATISTICS TIME OFF;

SELECT ElapsedMS=DATEDIFF(millisecond, @StartDT, GETDATE());

SELECT @StartDT=GETDATE();

PRINT 'Solution by Peso/Adam Machanic';

SET STATISTICS TIME ON;

WITH Target AS

(

SELECT Rate, Shift

FROM #T

WHERE Rate = @Rate

)

SELECT @Holder=c.Rate

FROM

(

SELECT Rate, rc=COUNT(*)

FROM #T

GROUP BY Rate

) a

JOIN

(

SELECT rc=COUNT(*)

FROM Target

) b ON b.rc <= a.rc

JOIN #T AS c ON c.Rate = a.Rate

JOIN Target AS d ON d.Shift = c.Shift

GROUP BY c.Rate

HAVING COUNT(*) = MIN(b.rc);

SET STATISTICS TIME OFF;

SELECT ElapsedMS=DATEDIFF(millisecond, @StartDT, GETDATE());

GO

DROP TABLE #T;

DROP TABLE #T1;

From the statistics, I get these CPU/Elapsed time results:

Solution by Hunchback

SQL Server Execution Times:

CPU time = 1637 ms, elapsed time = 452 ms.

Solution by Dwain.C

SQL Server Execution Times:

CPU time = 1467 ms, elapsed time = 473 ms.

Divide and Conquer Solution by Dwain.C

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 1169 ms, elapsed time = 301 ms.

Solution by Peso/Adam Machanic

SQL Server Execution Times:

CPU time = 1996 ms, elapsed time = 2235 ms.

And these results are quite similar to the results SELECTED into the 4 results panes for elapsed MS:

460

480

306

2263

Comments:

- I eliminated Sean's solution because I believe he's solving a different problem. But of course since the OP has confirmed that he prefers Sean's solution, I have to assume that both Hunchback and I read the requirements wrong.

- The result for my last suggestion (Peso/Adam Machanic solution) was probably impacted by the <= I had to use in the first JOIN. I believe their solution was based on the no remainder problem posed by Mr. Celko.

I was surprised that Divide and Conquer came in at about 30% faster.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

T.Ashish SSCarpal Tunnel Points: 4248 More actions · Answer 11

Thanks Dwain.C

Kindly ignore my earlier post, I missed it completely. Actually, Other four solutions are giving accurate results

My results goes as below:

1.hunchback

WITH C1 AS (SELECT Shift FROM test12 WHERE Rate = 4)

SELECT A.Rate FROM test12 AS A INNER JOIN C1 AS B ON A.Shift = B.Shift

GROUP BY A.Rate HAVING COUNT(*) = (SELECT COUNT(*) FROM C1);

Table 'test12'. Scan count 2, logical reads 48

Query Cost relative to batch = 21%

2.dwain.c

SELECT Rate FROM test12

WHERE Shift IN (SELECT Shift FROM test12 WHERE Rate = 4)

GROUP BY Rate HAVING COUNT(*) = (SELECT COUNT(*) FROM test12 WHERE Rate = 4);

Table 'test12'. Scan count 2, logical reads 48

Query Cost relative to batch = 21%

3.dwain.c (divide and concur)

SELECT Shift INTO #T FROM @T WHERE Rate = @Rate;

DECLARE @Rows INT = @@ROWCOUNT;

SELECT Rate FROM @T

WHERE Shift IN (SELECT Shift FROM #T)

GROUP BY Rate HAVING COUNT(*) = @Rows;

GO

DROP TABLE #T;

Table 'test12'. Scan count 1, logical reads 2

'#T_____00000000000D'. Scan count 1, logical reads 22

Table 'test12'. Scan count 1, logical reads 2

Query Cost relative to batch = 23%

4.dwain.c

WITH Target AS

(SELECT Rate, Shift FROM test12 WHERE Rate = 4)

SELECT c.Rate

FROM

(SELECT Rate, rc=COUNT(*) FROM test12 GROUP BY Rate) a

JOIN

(SELECT rc=COUNT(*) FROM Target) b ON b.rc <= a.rc

JOIN test12 AS c ON c.Rate = a.Rate

JOIN Target AS d ON d.Shift = c.Shift

GROUP BY c.Rate

HAVING COUNT(*) = MIN(b.rc);

Table 'test12'. Scan count 8, logical reads 54,

Query Cost relative to batch = 35%

So, Cost wise, 1 and 2 are performing better then 3.

ElapsedMS is as below:

990

956

960 (Divide and Conquer)

2723

I could not get "Divide and Conquer" at first place.

Thanks again everyone for putting your efforts.

Dwain Camps SSC Guru Points: 86908 More actions · Answer 12

T.Ashish (9/27/2013)
Thanks Dwain.C
Kindly ignore my earlier post, I missed it completely. Actually, Other four solutions are giving accurate results
My results goes as below:
1.hunchback
WITH C1 AS (SELECT Shift FROM test12 WHERE Rate = 4)
SELECT A.Rate FROM test12 AS A INNER JOIN C1 AS B ON A.Shift = B.Shift
GROUP BY A.Rate HAVING COUNT(*) = (SELECT COUNT(*) FROM C1);
Table 'test12'. Scan count 2, logical reads 48
Query Cost relative to batch = 21%
2.dwain.c
SELECT Rate FROM test12
WHERE Shift IN (SELECT Shift FROM test12 WHERE Rate = 4)
GROUP BY Rate HAVING COUNT(*) = (SELECT COUNT(*) FROM test12 WHERE Rate = 4);
Table 'test12'. Scan count 2, logical reads 48
Query Cost relative to batch = 21%
3.dwain.c (divide and concur)
SELECT Shift INTO #T FROM @T WHERE Rate = @Rate;
DECLARE @Rows INT = @@ROWCOUNT;
SELECT Rate FROM @T
WHERE Shift IN (SELECT Shift FROM #T)
GROUP BY Rate HAVING COUNT(*) = @Rows;
GO
DROP TABLE #T;
Table 'test12'. Scan count 1, logical reads 2
'#T_____00000000000D'. Scan count 1, logical reads 22
Table 'test12'. Scan count 1, logical reads 2
Query Cost relative to batch = 23%
4.dwain.c
WITH Target AS
(SELECT Rate, Shift FROM test12 WHERE Rate = 4)
SELECT c.Rate
FROM
(SELECT Rate, rc=COUNT(*) FROM test12 GROUP BY Rate) a
JOIN
(SELECT rc=COUNT(*) FROM Target) b ON b.rc <= a.rc
JOIN test12 AS c ON c.Rate = a.Rate
JOIN Target AS d ON d.Shift = c.Shift
GROUP BY c.Rate
HAVING COUNT(*) = MIN(b.rc);
Table 'test12'. Scan count 8, logical reads 54,
Query Cost relative to batch = 35%
So, Cost wise, 1 and 2 are performing better then 3.
ElapsedMS is as below:
990
956
960 (Divide and Conquer)
2723
I could not get "Divide and Conquer" at first place.
Thanks again everyone for putting your efforts.

#1 and 2 have effectively identical execution plans so any timing variances you're seeing with those is probably random and multiple testing runs probably would swing one way or the other.

#3 (Divide and Conquer) has an advantage that will likely grow depending on the number of rows in your table (the more rows, the more the initial INSERT capturing the row count will help). I think anyway.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Sean Lange SSC Guru Points: 286573 More actions · Answer 13

Well I guess I didn't quite get the requirements. Chock that up to a lack of details. This thread could serve as a good example of why it is so important to provide those details. 😉

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

T.Ashish SSCarpal Tunnel Points: 4248 More actions · Answer 14

#3 (Divide and Conquer) has an advantage that will likely grow depending on the number of rows in your table (the more rows, the more the initial INSERT capturing the row count will help). I think anyway.

I will keep that in mind.