Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Grouping Related Rows in Same Table Expand / Collapse
Author
Message
Posted Tuesday, June 18, 2013 3:17 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, January 15, 2014 4:45 PM
Points: 14, Visits: 71
I am hoping that someone on here can help me out with my problem. I have a single table full of shapes that may or may not intersect each other. My goal is to identify the shapes that intersect and group them together by giving them a suedo "cluster id". Here is a simplified version of the data that I am working with.
IF EXISTS (
SELECT *
FROM sys.tables
WHERE NAME = 'OverlappingShapes'
AND Schema_Name(schema_id) = 'dbo'
AND [type] = 'U'
)
DROP TABLE dbo.OverlappingShapes
GO

CREATE TABLE dbo.OverlappingShapes (ShapeID CHAR(5), Shape GEOMETRY)
GO

INSERT INTO dbo.OverlappingShapes (ShapeID, Shape)
VALUES ('A1B20', geometry::STGeomFromText('POLYGON((01 15, 02 13, 03 11, 05 13, 04 17, 02 17, 01 15))', 0))
, ('D1B19', geometry::STGeomFromText('POLYGON((03 14, 06 11, 06 16, 03 14))', 0))
, ('C5B23', geometry::STGeomFromText('POLYGON((05 04, 09 04, 09 08, 05 08, 05 04))', 0))
, ('A1B11', geometry::STGeomFromText('POLYGON((06 02, 10 06, 06 09, 06 02))', 0))
, ('D1B25', geometry::STGeomFromText('POLYGON((01 04, 02 03, 02 02, 03 02, 04 03, 03 04, 04 06, 02 07, 01 06, 02 05, 01 04))', 0))
, ('E3A22', geometry::STGeomFromText('LINESTRING(00 12, 07 12)', 0))
GO

If you run the following SQL text you should get the results listed below.
SELECT A.ShapeID AS ShapeID_A, B.ShapeID AS ShapeID_B
FROM dbo.OverlappingShapes AS A
INNER JOIN dbo.OverlappingShapes AS B
ON A.Shape.STIntersects(B.Shape) = 1
GO

ShapeID_A ShapeID_B
--------- ---------
A1B20 A1B20
D1B19 A1B20
E3A22 A1B20
A1B20 D1B19
D1B19 D1B19
E3A22 D1B19
C5B23 C5B23
A1B11 C5B23
C5B23 A1B11
A1B11 A1B11
D1B25 D1B25
A1B20 E3A22
D1B19 E3A22
E3A22 E3A22

(14 row(s) affected)

What I am trying to accomplish is something like this:
ShapeID_A ShapeID_B ClusterID
--------- --------- ---------
A1B20 A1B20 1
D1B19 A1B20 1
E3A22 A1B20 1
A1B20 D1B19 1
D1B19 D1B19 1
E3A22 D1B19 1
A1B20 E3A22 1
D1B19 E3A22 1
E3A22 E3A22 1
C5B23 C5B23 2
A1B11 C5B23 2
C5B23 A1B11 2
A1B11 A1B11 2
D1B25 D1B25 3

Or to go one step further, something like this:
ShapeID ClusterID
------- ---------
A1B20 1
D1B19 1
E3A22 1
C5B23 2
A1B11 2
D1B25 3

Any help would be greatly appreciated.

Thanks,

Mike
Post #1464887
Posted Tuesday, June 18, 2013 9:16 PM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 2:18 PM
Points: 4,406, Visits: 6,268
This solution was pretty quick to slap together and seems to do the trick, although I must admit to feeling slightly dirty posting it!

SET NOCOUNT ON
CREATE TABLE dbo.#out (ShapeID char(5) NOT NULL, GroupID tinyint NOT NULL) --size group as needed

DECLARE @GroupID tinyint = 1, @ShapeID char(5), @Shape geometry

DECLARE Csr cursor FAST_FORWARD for
SELECT ShapeID, Shape
FROM dbo.OverlappingShapes

OPEN Csr
FETCH NEXT FROM Csr INTO @ShapeID, @Shape
WHILE (@@fetch_status = 0)
BEGIN
INSERT dbo.#out
SELECT ShapeID, @GroupID
FROM dbo.OverlappingShapes os
WHERE @Shape.STIntersects(os.Shape) = 1
AND NOT EXISTS (SELECT * FROM dbo.#out t WHERE t.ShapeID = os.ShapeID)

IF @@ROWCOUNT <> 0 --hit a new group, so increment
BEGIN
SET @GroupID += 1
END

FETCH NEXT FROM Csr INTO @ShapeID, @Shape
END
CLOSE Csr
DEALLOCATE Csr

SELECT * FROM dbo.#out



Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1464941
Posted Wednesday, June 19, 2013 2:54 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: 2 days ago @ 3:29 AM
Points: 1,678, Visits: 19,554

Here's another way using recursive CTEs


WITH Base AS (
SELECT A.ShapeID AS ShapeID_A, B.ShapeID AS ShapeID_B
FROM dbo.OverlappingShapes AS A
INNER JOIN dbo.OverlappingShapes AS B
ON A.Shape.STIntersects(B.Shape) = 1
AND A.ShapeID < B.ShapeID),

Recur AS (
SELECT b.ShapeID_A,
b.ShapeID_B
FROM Base b
WHERE NOT EXISTS(SELECT * FROM Base b2 WHERE b2.ShapeID_B=b.ShapeID_A)

UNION ALL

SELECT r.ShapeID_A,
b.ShapeID_B
FROM Base b
INNER JOIN Recur r ON r.ShapeID_B = b.ShapeID_A),

Results AS (
SELECT ShapeID_A,
ShapeID_B
FROM Recur
UNION
SELECT a.ShapeID,
a.ShapeID
FROM dbo.OverlappingShapes a
WHERE NOT EXISTS (SELECT * FROM Recur r WHERE r.ShapeID_B=a.ShapeID))

SELECT ShapeID_B AS ShapeID,
DENSE_RANK() OVER(ORDER BY ShapeID_A) AS ClusterID
FROM Results;



____________________________________________________

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Never approach a goat from the front, a horse from the rear, or a fool from any direction.
Post #1465001
Posted Wednesday, June 19, 2013 4:43 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, January 15, 2014 4:45 PM
Points: 14, Visits: 71
TheSQLGuru (6/18/2013)
This solution was pretty quick to slap together and seems to do the trick, although I must admit to feeling slightly dirty posting it!


Thanks Kevin. I had something similar to this but just not as elegant. I really appreciate the quick response.

Mike
Post #1465055
Posted Wednesday, June 19, 2013 4:53 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, January 15, 2014 4:45 PM
Points: 14, Visits: 71
Mark-101232 (6/19/2013)

Here's another way using recursive CTEs


Now this is what I was hoping for. Thanks Mark! I am still trying to wrap my head around recursive CTE's and I knew there was a solution but I just couldn't see it.

Thank you very much and I appreciate the quick response.

Mike
Post #1465063
Posted Wednesday, June 19, 2013 6:09 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, January 15, 2014 4:45 PM
Points: 14, Visits: 71
Okay, I am a little dissapointed. I had read how efficient and fast recursive CTE's are compared to CURSOR's, however when I run the CTE against my data set of 1000+ records it takes far too long. When I run the CURSOR it only takes 4 seconds. The shape table has the ShapeID as the primary key and the shape field does have a spatial index. Any thoughts on why the recursive CTE is taking so long?
Post #1465116
Posted Wednesday, June 19, 2013 6:38 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: 2 days ago @ 3:29 AM
Points: 1,678, Visits: 19,554
tssopa (6/19/2013)
... I had read how efficient and fast recursive CTE's are compared to CURSOR's...


Typically they're not. You could try splitting out a temporary table from the CTE.


IF OBJECT_ID('tempdb..#Base') IS NOT NULL DROP TABLE #Base;

SELECT A.ShapeID AS ShapeID_A, B.ShapeID AS ShapeID_B
INTO #Base
FROM dbo.OverlappingShapes AS A
INNER JOIN dbo.OverlappingShapes AS B
ON A.Shape.STIntersects(B.Shape) = 1
AND A.ShapeID < B.ShapeID;

-- May want to add indexes to #Base here .. CREATE INDEX IX ON #Base(ShapeID_A);

WITH
Recur AS (
SELECT b.ShapeID_A,
b.ShapeID_B
FROM #Base b
WHERE NOT EXISTS(SELECT * FROM #Base b2 WHERE b2.ShapeID_B=b.ShapeID_A)

UNION ALL

SELECT r.ShapeID_A,
b.ShapeID_B
FROM #Base b
INNER JOIN Recur r ON r.ShapeID_B = b.ShapeID_A),

Results AS (
SELECT ShapeID_A,
ShapeID_B
FROM Recur
UNION
SELECT a.ShapeID,
a.ShapeID
FROM dbo.OverlappingShapes a
WHERE NOT EXISTS (SELECT * FROM Recur r WHERE r.ShapeID_B=a.ShapeID))

SELECT ShapeID_B AS ShapeID,
DENSE_RANK() OVER(ORDER BY ShapeID_A) AS ClusterID
FROM Results;




____________________________________________________

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Never approach a goat from the front, a horse from the rear, or a fool from any direction.
Post #1465135
Posted Wednesday, June 19, 2013 7:40 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 2:18 PM
Points: 4,406, Visits: 6,268
tssopa (6/19/2013)
Okay, I am a little dissapointed. I had read how efficient and fast recursive CTE's are compared to CURSOR's, however when I run the CTE against my data set of 1000+ records it takes far too long. When I run the CURSOR it only takes 4 seconds. The shape table has the ShapeID as the primary key and the shape field does have a spatial index. Any thoughts on why the recursive CTE is taking so long?


Recursive CTEs SUCK @SS and should be avoided at almost all costs, IMNSHO. I only use them when there is NO other recourse or testing shows they are optimal for a specific data processing need.



Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1465170
Posted Wednesday, June 19, 2013 7:43 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 2:18 PM
Points: 4,406, Visits: 6,268
tssopa (6/19/2013)
TheSQLGuru (6/18/2013)
This solution was pretty quick to slap together and seems to do the trick, although I must admit to feeling slightly dirty posting it!


Thanks Kevin. I had something similar to this but just not as elegant. I really appreciate the quick response.

Mike


You are welcome! It was a fun little problem to tackle. Something in the back of my brain is telling me there is a set-based efficient mechanism to do this, and I (quickly) tried a number of approaches but was unsuccessful. Sometimes a cursor really is the best tool for the job (despite how horribly inefficient they are in SQL Server), and I am ALL about using the best tool for the job!!


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1465171
Posted Wednesday, June 19, 2013 8:06 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: 2 days ago @ 3:29 AM
Points: 1,678, Visits: 19,554

If you change the data, replacing

, ('E3A22', geometry::STGeomFromText('LINESTRING(00 12, 07 12)', 0))

with

, ('E3A22', geometry::STGeomFromText('LINESTRING(06 12, 07 12)', 0))

so that A1B20 and D1B19 intersect,
D1B19 and E3A22 intersect
but A1B20 and E3A22 now do not intersect, the cursor and CTE solutions give different results.


____________________________________________________

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Never approach a goat from the front, a horse from the rear, or a fool from any direction.
Post #1465187
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse