Combining many crossing intervals

,

Problem

Combining overlapping or continuous intervals could be a discouraging task even for advanced programmers. This article presents a new solution based on usage of the geometry data type and an unknown function (if not, maybe less used function) specifically designed for geometry objects.

Computing the overall period when every user (of some website) was connected (logged in) is just one of those examples where this SQL problem must have a performant solution. This problem is also recognized under the following names: Gaps and Islands or Packing Date and Time Intervals (source 1, source 2, source 3, source 4).

While this solution is using INT columns, we could adapt simple to DATE, DATETIME, TIME intervals by converting these values into INT(BIGINT) thus: {ts '2020-01-30 14:30:00'} becomes 20200130143000

Given the following intervals (1, 5), (4, 30), (67, 80), (63, 70), the question is how could we combine them easily into nonoverlapping intervals, thus: (1,30), (63,80)?

Proposed Solution

In order to combine these intervals, first, we are going to transform every interval (Start, End) into a polygon object thus:

POLYGON((Start 0, End 0, End 1, Start 1, Start 0)):

(Start 1)-(End 1)

|                           |

(Start 0)-(End 0)

Thus, first interval (1, 5) become POLYGON((1 0,5 0,5 1,1 1,1 0)):

1_Range_1_5

Note 1:

The general syntax for a simple polygon is

POLYGON((Point1X Point1Y,Point2X Point2Y,Point3X Point13Y,Point4X Point4Y,Point1X Point1Y)):
(Point4X Point4Y)-(Point3X Point3Y)
|                                 |
(Point1X Point1Y)-(Point2X Point2Y)

Note 2: Polygons are closed objects, this means that first point and last point are the same.

Then, we are going to use the UnionAggregate() spatial function to combine these polygons (intervals). Next polygons

POLYGON((1 0,5 0,5 1,1 1,1 0)) and POLYGON((4 0,30 0,30 1,4 1,4 0))
(interval (1, 5) and (4, 30))
2_Range_1_5_and_4_30

are going to be combined into a single polygon

POLYGON ((1 0, 4 0, 5 0, 30 0, 30 1, 5 1, 4 1, 1 1, 1 0))

SELECT GEOMETRY::UnionAggregate(s.Pol).ToString()
FROM (
SELECT Pol = GEOMETRY::STPolyFromText('POLYGON((1 0,5 0,5 1,1 1,1 0))', 0)
UNION ALL
SELECT Pol = GEOMETRY::STPolyFromText('POLYGON((4 0,30 0,30 1,4 1,4 0))', 0)
) s

Then, all those 4 polygons are going to be combined into non-overlapping polygons:

MULTIPOLYGON (
((1 0, 4 0, 5 0, 30 0, 30 1, 5 1, 4 1, 1 1, 1 0)),
((63 0, 67 0, 70 0, 80 0, 80 1, 70 1, 67 1, 63 1, 63 0))
)

The problem is that the resulting polygon (the result of calling the UnionAggregate() function) contains all points and not only the extremities points. Next step of this process is to get the bounding box of this polygon using the STEnvelope() spatial geometry method, thus POLYGON(1 0, 4 0, 5 0, 30 0, 30 1, 5 1, 4 1, 1 1, 1 0) becomes POLYGON((1 0,30 0,30 1,1 1,1 0)):

Final step of this process suppose to decompress these polygons back into ranges using the STPointN(index) property used to return a single point from polygon and also using two properties specific to every point STX and STY. These properties are used to get the first point (Start) and the second point of every interval polygon (End) thus:

POLYGON((Point1X, Point1Y), (Point2X, Point2Y), ..
POLYGON((Start 0, End 0, ..
Start = 1
End = 30

Source Code

The final source code is:

SELECT *
INTO Tab
FROM (
SELECT POL = GEOMETRY::STPolyFromText('POLYGON((1 0,5 0,5 1,1 1,1 0))', 0)
UNION ALL
SELECT POL = GEOMETRY::STPolyFromText('POLYGON((4 0,30 0,30 1,4 1,4 0))', 0)
UNION ALL
SELECT POL = GEOMETRY::STPolyFromText('POLYGON((67 0,80 0,80 1,67 1,67 0))', 0)
UNION ALL
SELECT POL = GEOMETRY::STPolyFromText('POLYGON((63 0,70 0,70 1,63 1,63 0))', 0)
) s
DECLARE @pol GEOMETRY;
SELECT @pol = GEOMETRY::UnionAggregate(t.POL)
FROM Tab t;
SELECT
PK = ptl.number,
[Start] = @pol.STGeometryN(ptl.number).STEnvelope().STPointN(1).STX,
[End] = @pol.STGeometryN(ptl.number).STEnvelope().STPointN(2).STX,
Polygon = @pol.STGeometryN(ptl.number).ToString()
FROM master.dbo.spt_values ptl
WHERE ptl.type = N'P'
AND ptl.number >= 1
AND ptl.number <= @pol.STNumGeometries()

Note:

  • spt_values.type = ‘P’ returns a list with sequential increasing numbers starting with 1.
  • STNumGeometries() returns the number of polygons and STGeometryN(index) returns a single polygon object from a collection (@pol) of polygons.

Conclusion

This article describes a new solution based on the usage of polygon objects and UnionAggregate() function used to combine those objects that have common zones.

 

Rate

5 (1)

Share

Share

Rate

5 (1)