T-SQl help...

Question

Post reply

T-SQl help...

etirem

SSCommitted

Points: 1655
More actions
September 21, 2013 at 11:41 am

#277465

Hi, I have the sql sample code and the below image has the ouput needed.. thanks in advance...
I`m working with RANKING functions but need some expertise...
IF OBJECT_ID('Tempdb..#tTable') IS NOT NULL
DROP TABLE #tTable
CREATE TABLE #tTable(CID INT, CDate DATETIME, Dept VARCHAR(10))
INSERT INTO #tTable(CID, CDate, Dept)
VALUES
(111,'2012-10-05 00:00:00.000','A'),
(111,'2012-10-18 00:00:00.000','C'),
(111,'2012-11-01 00:00:00.000','B'),
(111,'2012-11-01 00:00:00.000','C'),
(111,'2012-11-20 00:00:00.000','C'),
(111,'2012-12-09 00:00:00.000','C'),
(111,'2012-12-11 00:00:00.000','A'),
(111,'2013-02-21 00:00:00.000','B'),
(111,'2013-03-22 00:00:00.000','B'),
(111, '2013-03-22 00:00:00.000','C'),
(111,'2013-04-12 00:00:00.000','C'),
(111,'2013-04-26 00:00:00.000','B'),
(111,'2013-04-26 00:00:00.000','C'),
(222,'2012-02-13 00:00:00.000','C'),
(222,'2012-03-02 00:00:00.000','B'),
(222, '2012-06-16 00:00:00.000','C'),
(222,'2012-07-12 00:00:00.000','C'),
(222,'2013-04-26 00:00:00.000','B'),
(222, '2013-05-23 00:00:00.000','C'),
(222,'2013-07-11 00:00:00.000','C'),
(222,'2013-09-19 00:00:00.000','C'),
(222,'2013-09-20 00:00:00.000','A')
SELECT * FROM #tTable
DROP TABLE #tTable
The Groups are created based on CDate and Dept.
For the CID, the CDate is in ASC order and if Dept=C and the next row is Dept=C...and next row is also Dept=C is a Group. This means if Dept=C value comes one after the other it is a Group.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

kapil_kk SSC-Insane Points: 21316 More actions · Answer 1

the information you have provided is not complete..

can you please tell us on what basis you are creating the groups?

_______________________________________________________________
To get quick answer follow this link:
http://www.sqlservercentral.com/articles/Best+Practices/61537/

etirem SSCommitted Points: 1655 More actions · Answer 2

The Groups are created based on CDate and Dept.

For the CID, the CDate is in ASC order and if Dept=C and the next row is Dept=C...and next row is also Dept=C is a Group. This means if Dept=C value comes one after the other it is a Group.

etirem SSCommitted Points: 1655 More actions · Answer 3

etirem (9/21/2013)
The Groups are created based on CDate and Dept.
For the CID, the CDate is in ASC order and if Dept=C and the next row is Dept=C...and next row is also Dept=C is a Group. This means if Dept=C value comes one after the other it is a Group.

kapil_kk SSC-Insane Points: 21316 More actions · Answer 4

etirem (9/21/2013)
The Groups are created based on CDate and Dept.
For the CID, the CDate is in ASC order and if Dept=C and the next row is Dept=C...and next row is also Dept=C is a Group. This means if Dept=C value comes one after the other it is a Group.

still not clear.....

can you please provide some more detailed information.

_______________________________________________________________
To get quick answer follow this link:
http://www.sqlservercentral.com/articles/Best+Practices/61537/

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 5

maybe something along these lines.....???

WITH cte AS

(

SELECT CID,

CDate,

Dept,

row_number() OVER (PARTITION BY cid,dept ORDER BY cid,cdate,dept) rn

FROM #tTable

)

SELECT CID,Cdate,Dept

FROM cte

WHERE rn = 3 AND dept = 'C'

ORDER by cid,cdate

edit...simplified

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

etirem SSCommitted Points: 1655 More actions · Answer 6

Sorry! May be I`m not explaining it properly.

One more CID=333 is added below SQL.

IF OBJECT_ID('Tempdb..#tTable') IS NOT NULL

DROP TABLE #tTable

CREATE TABLE #tTable(CID INT, CDate DATETIME, Dept VARCHAR(10))

INSERT INTO #tTable(CID, CDate, Dept)

VALUES

(333, '2012-01-14 00:00:00.000','A'),

(333,'2012-02-14 00:00:00.000','C'),

(333,'2012-03-14 00:00:00.000','B'),

(333,'2012-04-14 00:00:00.000','C'),

(333,'2012-08-15 00:00:00.000','C'),

(333,'2012-09-20 00:00:00.000','C'),

(333,'2012-10-26 00:00:00.000','B'),

(333, '2013-01-14 00:00:00.000','A'),

(333,'2013-02-14 00:00:00.000','C'),

(333,'2013-03-14 00:00:00.000','B'),

(333,'2013-04-14 00:00:00.000','C'),

(333,'2013-05-14 00:00:00.000','C')

SELECT * FROM #tTable

DROP TABLE #tTable

Note that the ORDER of the CDate is in ASC order and it should not be changed for that CID.

GROUP(s) is ONLY for Dept=C:

If Dept=C is followed by another Dept=C or more Dept=C values it is considered as GROUP 1 for that CID...

AND for GROUP 2 same logic..if Dept=C is followed by another Dept=C or more Dept=C values it is considered as GROUP 2 for that CID...

Output needed is the 2nd row of the GROUP 1.

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 7

etirem (9/22/2013)
Sorry! May be I`m not explaining it properly.
One more CID=333 is added below SQL.
IF OBJECT_ID('Tempdb..#tTable') IS NOT NULL
DROP TABLE #tTable
CREATE TABLE #tTable(CID INT, CDate DATETIME, Dept VARCHAR(10))
INSERT INTO #tTable(CID, CDate, Dept)
VALUES
(333, '2012-01-14 00:00:00.000','A'),
(333,'2012-02-14 00:00:00.000','C'),
(333,'2012-03-14 00:00:00.000','B'),
(333,'2012-04-14 00:00:00.000','C'),
(333,'2012-08-15 00:00:00.000','C'),
(333,'2012-09-20 00:00:00.000','C'),
(333,'2012-10-26 00:00:00.000','B'),
(333, '2013-01-14 00:00:00.000','A'),
(333,'2013-02-14 00:00:00.000','C'),
(333,'2013-03-14 00:00:00.000','B'),
(333,'2013-04-14 00:00:00.000','C'),
(333,'2013-05-14 00:00:00.000','C')
SELECT * FROM #tTable
DROP TABLE #tTable
Note that the ORDER of the CDate is in ASC order and it should not be changed for that CID.
GROUP(s) is ONLY for Dept=C:
If Dept=C is followed by another Dept=C or more Dept=C values it is considered as GROUP 1 for that CID...
AND for GROUP 2 same logic..if Dept=C is followed by another Dept=C or more Dept=C values it is considered as GROUP 2 for that CID...
Output needed is the 2nd row of the GROUP 1.

ok...please post the results you get when running my code....

thanks

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

etirem SSCommitted Points: 1655 More actions · Answer 8

J Livingston SQL (9/22/2013)
etirem (9/22/2013)
Sorry! May be I`m not explaining it properly.
One more CID=333 is added below SQL.
IF OBJECT_ID('Tempdb..#tTable') IS NOT NULL
DROP TABLE #tTable
CREATE TABLE #tTable(CID INT, CDate DATETIME, Dept VARCHAR(10))
INSERT INTO #tTable(CID, CDate, Dept)
VALUES
(333, '2012-01-14 00:00:00.000','A'),
(333,'2012-02-14 00:00:00.000','C'),
(333,'2012-03-14 00:00:00.000','B'),
(333,'2012-04-14 00:00:00.000','C'),
(333,'2012-08-15 00:00:00.000','C'),
(333,'2012-09-20 00:00:00.000','C'),
(333,'2012-10-26 00:00:00.000','B'),
(333, '2013-01-14 00:00:00.000','A'),
(333,'2013-02-14 00:00:00.000','C'),
(333,'2013-03-14 00:00:00.000','B'),
(333,'2013-04-14 00:00:00.000','C'),
(333,'2013-05-14 00:00:00.000','C')
SELECT * FROM #tTable
DROP TABLE #tTable
Note that the ORDER of the CDate is in ASC order and it should not be changed for that CID.
GROUP(s) is ONLY for Dept=C:
If Dept=C is followed by another Dept=C or more Dept=C values it is considered as GROUP 1 for that CID...
AND for GROUP 2 same logic..if Dept=C is followed by another Dept=C or more Dept=C values it is considered as GROUP 2 for that CID...
Output needed is the 2nd row of the GROUP 1.
ok...please post the results you get when running my code....
thanks

Thank you for your reply but if I have another CID=444 your code doesn`t work...

IF OBJECT_ID('Tempdb..#tTable') IS NOT NULL

DROP TABLE #tTable

CREATE TABLE #tTable(CID INT, CDate DATETIME, Dept VARCHAR(10))

INSERT INTO #tTable(CID, CDate, Dept)

VALUES

(444, '2013-01-14 00:00:00.000','C'),

(444,'2013-02-14 00:00:00.000','C'),

(444,'2013-03-14 00:00:00.000','B'),

(444,'2013-04-14 00:00:00.000','C'),

(444,'2013-05-14 00:00:00.000','C')

SELECT * from #tTable

DROP TABLE #tTable

mickyT SSChampion Points: 10360 More actions · Answer 9

Hi

I think this covers what you want to do. I suspect the performance will be poor.

WITH groupit AS (

-- Group up the Depts on data

SELECT CID, CDate, Dept,

ROW_NUMBER() OVER (PARTITION BY CID ORDER BY CDate ASC) - ROW_NUMBER() OVER (PARTITION BY CID, Dept ORDER BY CDate ASC) GroupNum

FROM #tTable t

),

getDeptGC AS (

-- Do counts in the groups for dept C

SELECT CID, CDate, Dept, GroupNum,

COUNT(*) OVER (PARTITION BY CID, Dept, GroupNum) C

FROM groupit a

WHERE Dept = 'C'

),

getDeptR AS (

-- Number the rows in groups of more than 1

SELECT CID, CDate, Dept, GroupNum,

ROW_NUMBER() OVER (PARTITION BY CID, Dept ORDER BY CDate ASC) N

FROM getDeptGC

WHERE C > 1

)

SELECT CID, CDate, Dept FROM getDeptR WHERE N = 2

ORDER BY CID;

Hunchback SSCrazy Points: 2649 More actions · Answer 10

This problem is known as "Finding Islands", but this case is a special one because there is no other column that we could use to break ties based on (CID, cDate) other than column [Dept].

I added one more column, an identity one, to break this tie, and here is the code.

SET NOCOUNT ON;

USE tempdb;

GO

CREATE TABLE #tTable(

sk int NOT NULL IDENTITY UNIQUE CLUSTERED,

CID INT,

CDate DATETIME,

Dept VARCHAR(10)

);

INSERT INTO #tTable(CID, CDate, Dept)

VALUES

(111,'2012-10-05 00:00:00.000','A'),

(111,'2012-10-18 00:00:00.000','C'),

(111,'2012-11-01 00:00:00.000','B'),

(111,'2012-11-01 00:00:00.000','C'),

(111,'2012-11-20 00:00:00.000','C'),

(111,'2012-12-09 00:00:00.000','C'),

(111,'2012-12-11 00:00:00.000','A'),

(111,'2013-02-21 00:00:00.000','B'),

(111,'2013-03-22 00:00:00.000','B'),

(111, '2013-03-22 00:00:00.000','C'),

(111,'2013-04-12 00:00:00.000','C'),

(111,'2013-04-26 00:00:00.000','B'),

(111,'2013-04-26 00:00:00.000','C'),

(222,'2012-02-13 00:00:00.000','C'),

(222,'2012-03-02 00:00:00.000','B'),

(222, '2012-06-16 00:00:00.000','C'),

(222,'2012-07-12 00:00:00.000','C'),

(222,'2013-04-26 00:00:00.000','B'),

(222, '2013-05-23 00:00:00.000','C'),

(222,'2013-07-11 00:00:00.000','C'),

(222,'2013-09-19 00:00:00.000','C'),

(222,'2013-09-20 00:00:00.000','A'),

(444, '2013-01-14 00:00:00.000','C'),

(444,'2013-02-14 00:00:00.000','C'),

(444,'2013-03-14 00:00:00.000','B'),

(444,'2013-04-14 00:00:00.000','C'),

(444,'2013-05-14 00:00:00.000','C');

WITH C1 AS (

SELECT

CID,

CDate,

Dept,

DENSE_RANK() OVER(PARTITION BY CID ORDER BY CDate, sk) -

DENSE_RANK() OVER(PARTITION BY CID ORDER BY Dept, CDate, sk) AS grp

FROM

#tTable

)

, C2 AS (

SELECT

CID,

grp,

MIN(cDate) dt

FROM

C1

GROUP BY

CID, grp

HAVING

SUM(CASE WHEN Dept <> 'C' THEN 1 ELSE 0 END) = 0

AND COUNT(*) > 1

)

SELECT

A.CID,

A.cDate,

A.Dept,

DENSE_RANK() OVER(PARTITION BY B.CID ORDER BY B.dt) AS grpnum

FROM

C1 AS A

INNER JOIN

C2 AS B

ON B.CID = A.CID

AND B.grp = A.grp

ORDER BY

A.CID, A.CDate;

GO

DROP TABLE #tTable

GO

/*

Result

CIDcDateDeptgrpnum

1112012-11-01 00:00:00.000C1

1112012-11-20 00:00:00.000C1

1112012-12-09 00:00:00.000C1

1112013-03-22 00:00:00.000C2

1112013-04-12 00:00:00.000C2

2222012-06-16 00:00:00.000C1

2222012-07-12 00:00:00.000C1

2222013-05-23 00:00:00.000C2

2222013-07-11 00:00:00.000C2

2222013-09-19 00:00:00.000C2

4442013-01-14 00:00:00.000C1

4442013-02-14 00:00:00.000C1

4442013-04-14 00:00:00.000C2

4442013-05-14 00:00:00.000C2

*/

The idea is to enumerate rows based on two criteria that together will yield a group number. Then you need to solve a second problem known as Relational Division[/url].

You can learn more about solving this problem, in the last book from Itzik Ben-Gan about T-SQL Querying.

Inside Microsoft® SQL Server® 2008: T-SQL Querying

Dwain Camps SSC Guru Points: 86908 More actions · Answer 11

Not sure but perhaps this is a simpler approach?

CREATE TABLE #tTable(

sk int NOT NULL IDENTITY UNIQUE CLUSTERED,

CID INT,

CDate DATETIME,

Dept VARCHAR(10)

);

INSERT INTO #tTable(CID, CDate, Dept)

VALUES

(111,'2012-10-05 00:00:00.000','A'),

(111,'2012-10-18 00:00:00.000','C'),

(111,'2012-11-01 00:00:00.000','B'),

(111,'2012-11-01 00:00:00.000','C'),

(111,'2012-11-20 00:00:00.000','C'),

(111,'2012-12-09 00:00:00.000','C'),

(111,'2012-12-11 00:00:00.000','A'),

(111,'2013-02-21 00:00:00.000','B'),

(111,'2013-03-22 00:00:00.000','B'),

(111, '2013-03-22 00:00:00.000','C'),

(111,'2013-04-12 00:00:00.000','C'),

(111,'2013-04-26 00:00:00.000','B'),

(111,'2013-04-26 00:00:00.000','C'),

(222,'2012-02-13 00:00:00.000','C'),

(222,'2012-03-02 00:00:00.000','B'),

(222, '2012-06-16 00:00:00.000','C'),

(222,'2012-07-12 00:00:00.000','C'),

(222,'2013-04-26 00:00:00.000','B'),

(222, '2013-05-23 00:00:00.000','C'),

(222,'2013-07-11 00:00:00.000','C'),

(222,'2013-09-19 00:00:00.000','C'),

(222,'2013-09-20 00:00:00.000','A'),

(444, '2013-01-14 00:00:00.000','C'),

(444,'2013-02-14 00:00:00.000','C'),

(444,'2013-03-14 00:00:00.000','B'),

(444,'2013-04-14 00:00:00.000','C'),

(444,'2013-05-14 00:00:00.000','C');

WITH GroupedRows AS (

SELECT CID, Dept, sk=1+MIN(sk)

FROM (

SELECT sk, CID, CDate, Dept

,rn=sk-ROW_NUMBER() OVER (PARTITION BY CID, Dept ORDER BY cDATE)

FROM #tTable

WHERE Dept = 'C'

) a

GROUP BY CID, Dept, rn

HAVING MIN(sk) <> MAX(sk))

SELECT a.sk, a.CID, a.cDate, a.Dept

FROM #tTable a

JOIN GroupedRows b ON a.sk=b.sk;

GO

DROP TABLE #tTable;

Edit: Adding my results:

sk CID cDate Dept

5 111 2012-11-20 00:00:00.000 C

11 111 2013-04-12 00:00:00.000 C

17 222 2012-07-12 00:00:00.000 C

20 222 2013-07-11 00:00:00.000 C

24 444 2013-02-14 00:00:00.000 C

27 444 2013-05-14 00:00:00.000 C

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

etirem SSCommitted Points: 1655 More actions · Answer 12

mickyT (9/22/2013)
Hi
I think this covers what you want to do. I suspect the performance will be poor.
WITH groupit AS (
-- Group up the Depts on data
SELECT CID, CDate, Dept,
ROW_NUMBER() OVER (PARTITION BY CID ORDER BY CDate ASC) - ROW_NUMBER() OVER (PARTITION BY CID, Dept ORDER BY CDate ASC) GroupNum
FROM #tTable t
),
getDeptGC AS (
-- Do counts in the groups for dept C
SELECT CID, CDate, Dept, GroupNum,
COUNT(*) OVER (PARTITION BY CID, Dept, GroupNum) C
FROM groupit a
WHERE Dept = 'C'
),
getDeptR AS (
-- Number the rows in groups of more than 1
SELECT CID, CDate, Dept, GroupNum,
ROW_NUMBER() OVER (PARTITION BY CID, Dept ORDER BY CDate ASC) N
FROM getDeptGC
WHERE C > 1
)
SELECT CID, CDate, Dept FROM getDeptR WHERE N = 2
ORDER BY CID;

Thank You!!