SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Grouping Related Rows in Same Table


Grouping Related Rows in Same Table

Author
Message
tssopa
tssopa
Grasshopper
Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)

Group: General Forum Members
Points: 14 Visits: 71
I am hoping that someone on here can help me out with my problem. I have a single table full of shapes that may or may not intersect each other. My goal is to identify the shapes that intersect and group them together by giving them a suedo "cluster id". Here is a simplified version of the data that I am working with.
IF EXISTS (
SELECT *
FROM sys.tables
WHERE NAME = 'OverlappingShapes'
AND Schema_Name(schema_id) = 'dbo'
AND [type] = 'U'
)
DROP TABLE dbo.OverlappingShapes
GO

CREATE TABLE dbo.OverlappingShapes (ShapeID CHAR(5), Shape GEOMETRY)
GO

INSERT INTO dbo.OverlappingShapes (ShapeID, Shape)
VALUES ('A1B20', geometry::STGeomFromText('POLYGON((01 15, 02 13, 03 11, 05 13, 04 17, 02 17, 01 15))', 0))
, ('D1B19', geometry::STGeomFromText('POLYGON((03 14, 06 11, 06 16, 03 14))', 0))
, ('C5B23', geometry::STGeomFromText('POLYGON((05 04, 09 04, 09 08, 05 08, 05 04))', 0))
, ('A1B11', geometry::STGeomFromText('POLYGON((06 02, 10 06, 06 09, 06 02))', 0))
, ('D1B25', geometry::STGeomFromText('POLYGON((01 04, 02 03, 02 02, 03 02, 04 03, 03 04, 04 06, 02 07, 01 06, 02 05, 01 04))', 0))
, ('E3A22', geometry::STGeomFromText('LINESTRING(00 12, 07 12)', 0))
GO



If you run the following SQL text you should get the results listed below.
SELECT A.ShapeID AS ShapeID_A, B.ShapeID AS ShapeID_B
FROM dbo.OverlappingShapes AS A
INNER JOIN dbo.OverlappingShapes AS B
ON A.Shape.STIntersects(B.Shape) = 1
GO


ShapeID_A ShapeID_B
--------- ---------
A1B20 A1B20
D1B19 A1B20
E3A22 A1B20
A1B20 D1B19
D1B19 D1B19
E3A22 D1B19
C5B23 C5B23
A1B11 C5B23
C5B23 A1B11
A1B11 A1B11
D1B25 D1B25
A1B20 E3A22
D1B19 E3A22
E3A22 E3A22

(14 row(s) affected)



What I am trying to accomplish is something like this:
ShapeID_A ShapeID_B ClusterID
--------- --------- ---------
A1B20 A1B20 1
D1B19 A1B20 1
E3A22 A1B20 1
A1B20 D1B19 1
D1B19 D1B19 1
E3A22 D1B19 1
A1B20 E3A22 1
D1B19 E3A22 1
E3A22 E3A22 1
C5B23 C5B23 2
A1B11 C5B23 2
C5B23 A1B11 2
A1B11 A1B11 2
D1B25 D1B25 3



Or to go one step further, something like this:
ShapeID ClusterID
------- ---------
A1B20 1
D1B19 1
E3A22 1
C5B23 2
A1B11 2
D1B25 3



Any help would be greatly appreciated.

Thanks,

Mike
TheSQLGuru
TheSQLGuru
SSChampion
SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)

Group: General Forum Members
Points: 12337 Visits: 8546
This solution was pretty quick to slap together and seems to do the trick, although I must admit to feeling slightly dirty posting it! Hehe

SET NOCOUNT ON
CREATE TABLE dbo.#out (ShapeID char(5) NOT NULL, GroupID tinyint NOT NULL) --size group as needed

DECLARE @GroupID tinyint = 1, @ShapeID char(5), @Shape geometry

DECLARE Csr cursor FAST_FORWARD for
SELECT ShapeID, Shape
FROM dbo.OverlappingShapes

OPEN Csr
FETCH NEXT FROM Csr INTO @ShapeID, @Shape
WHILE (@@fetch_status = 0)
BEGIN
INSERT dbo.#out
SELECT ShapeID, @GroupID
FROM dbo.OverlappingShapes os
WHERE @Shape.STIntersects(os.Shape) = 1
AND NOT EXISTS (SELECT * FROM dbo.#out t WHERE t.ShapeID = os.ShapeID)

IF @@ROWCOUNT <> 0 --hit a new group, so increment
BEGIN
SET @GroupID += 1
END

FETCH NEXT FROM Csr INTO @ShapeID, @Shape
END
CLOSE Csr
DEALLOCATE Csr

SELECT * FROM dbo.#out



Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Mark Cowne
Mark Cowne
Hall of Fame
Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)

Group: General Forum Members
Points: 3097 Visits: 24076
Here's another way using recursive CTEs



WITH Base AS (
SELECT A.ShapeID AS ShapeID_A, B.ShapeID AS ShapeID_B
FROM dbo.OverlappingShapes AS A
INNER JOIN dbo.OverlappingShapes AS B
ON A.Shape.STIntersects(B.Shape) = 1
AND A.ShapeID < B.ShapeID),

Recur AS (
SELECT b.ShapeID_A,
b.ShapeID_B
FROM Base b
WHERE NOT EXISTS(SELECT * FROM Base b2 WHERE b2.ShapeID_B=b.ShapeID_A)

UNION ALL

SELECT r.ShapeID_A,
b.ShapeID_B
FROM Base b
INNER JOIN Recur r ON r.ShapeID_B = b.ShapeID_A),

Results AS (
SELECT ShapeID_A,
ShapeID_B
FROM Recur
UNION
SELECT a.ShapeID,
a.ShapeID
FROM dbo.OverlappingShapes a
WHERE NOT EXISTS (SELECT * FROM Recur r WHERE r.ShapeID_B=a.ShapeID))

SELECT ShapeID_B AS ShapeID,
DENSE_RANK() OVER(ORDER BY ShapeID_A) AS ClusterID
FROM Results;



____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537




tssopa
tssopa
Grasshopper
Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)

Group: General Forum Members
Points: 14 Visits: 71
TheSQLGuru (6/18/2013)
This solution was pretty quick to slap together and seems to do the trick, although I must admit to feeling slightly dirty posting it! Hehe


Thanks Kevin. I had something similar to this but just not as elegant. I really appreciate the quick response.

Mike
tssopa
tssopa
Grasshopper
Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)

Group: General Forum Members
Points: 14 Visits: 71
Mark-101232 (6/19/2013)

Here's another way using recursive CTEs


Now this is what I was hoping for. Thanks Mark! I am still trying to wrap my head around recursive CTE's and I knew there was a solution but I just couldn't see it.

Thank you very much and I appreciate the quick response.

Mike
tssopa
tssopa
Grasshopper
Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)Grasshopper (14 reputation)

Group: General Forum Members
Points: 14 Visits: 71
Okay, I am a little dissapointed. I had read how efficient and fast recursive CTE's are compared to CURSOR's, however when I run the CTE against my data set of 1000+ records it takes far too long. When I run the CURSOR it only takes 4 seconds. The shape table has the ShapeID as the primary key and the shape field does have a spatial index. Any thoughts on why the recursive CTE is taking so long?
Mark Cowne
Mark Cowne
Hall of Fame
Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)

Group: General Forum Members
Points: 3097 Visits: 24076
tssopa (6/19/2013)
... I had read how efficient and fast recursive CTE's are compared to CURSOR's...


Typically they're not. You could try splitting out a temporary table from the CTE.



IF OBJECT_ID('tempdb..#Base') IS NOT NULL DROP TABLE #Base;

SELECT A.ShapeID AS ShapeID_A, B.ShapeID AS ShapeID_B
INTO #Base
FROM dbo.OverlappingShapes AS A
INNER JOIN dbo.OverlappingShapes AS B
ON A.Shape.STIntersects(B.Shape) = 1
AND A.ShapeID < B.ShapeID;

-- May want to add indexes to #Base here .. CREATE INDEX IX ON #Base(ShapeID_A);

WITH
Recur AS (
SELECT b.ShapeID_A,
b.ShapeID_B
FROM #Base b
WHERE NOT EXISTS(SELECT * FROM #Base b2 WHERE b2.ShapeID_B=b.ShapeID_A)

UNION ALL

SELECT r.ShapeID_A,
b.ShapeID_B
FROM #Base b
INNER JOIN Recur r ON r.ShapeID_B = b.ShapeID_A),

Results AS (
SELECT ShapeID_A,
ShapeID_B
FROM Recur
UNION
SELECT a.ShapeID,
a.ShapeID
FROM dbo.OverlappingShapes a
WHERE NOT EXISTS (SELECT * FROM Recur r WHERE r.ShapeID_B=a.ShapeID))

SELECT ShapeID_B AS ShapeID,
DENSE_RANK() OVER(ORDER BY ShapeID_A) AS ClusterID
FROM Results;



____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537




TheSQLGuru
TheSQLGuru
SSChampion
SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)

Group: General Forum Members
Points: 12337 Visits: 8546
tssopa (6/19/2013)
Okay, I am a little dissapointed. I had read how efficient and fast recursive CTE's are compared to CURSOR's, however when I run the CTE against my data set of 1000+ records it takes far too long. When I run the CURSOR it only takes 4 seconds. The shape table has the ShapeID as the primary key and the shape field does have a spatial index. Any thoughts on why the recursive CTE is taking so long?


Recursive CTEs SUCK @SS and should be avoided at almost all costs, IMNSHO. I only use them when there is NO other recourse or testing shows they are optimal for a specific data processing need.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
TheSQLGuru
TheSQLGuru
SSChampion
SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)

Group: General Forum Members
Points: 12337 Visits: 8546
tssopa (6/19/2013)
TheSQLGuru (6/18/2013)
This solution was pretty quick to slap together and seems to do the trick, although I must admit to feeling slightly dirty posting it! Hehe


Thanks Kevin. I had something similar to this but just not as elegant. I really appreciate the quick response.

Mike


You are welcome! It was a fun little problem to tackle. Something in the back of my brain is telling me there is a set-based efficient mechanism to do this, and I (quickly) tried a number of approaches but was unsuccessful. Sometimes a cursor really is the best tool for the job (despite how horribly inefficient they are in SQL Server), and I am ALL about using the best tool for the job!! :-)

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Mark Cowne
Mark Cowne
Hall of Fame
Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)

Group: General Forum Members
Points: 3097 Visits: 24076
If you change the data, replacing

, ('E3A22', geometry::STGeomFromText('LINESTRING(00 12, 07 12)', 0))

with

, ('E3A22', geometry::STGeomFromText('LINESTRING(06 12, 07 12)', 0))

so that A1B20 and D1B19 intersect,
D1B19 and E3A22 intersect
but A1B20 and E3A22 now do not intersect, the cursor and CTE solutions give different results.

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537




Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search