SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Finding Gaps in a Sequential Number Sequence


Finding Gaps in a Sequential Number Sequence

Author
Message
Tao Klerks
Tao Klerks
SSCommitted
SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)

Group: General Forum Members
Points: 1543 Visits: 1249

Hi,

I'm a latecomer to the party and didn't realize there had already been healthy debate, but here's what I came up with.

Looking through the previous submissions above, it's the same concept as Brendan's, but extended to support the full functionality of the original post.

For the original example (working on an unoptimized temp table) the time taken seems to be approx 0.25 secs/1000 records, so with 10000 records I'm getting something like 3 seconds run time. With the code below there seems to be no increase in time taken at all (around 0.25 secs for anything from 0 to 10000 records), but this is probably because for small datasets the main delay below is not due to the work itself, but rather the creation of the temp table.

I'd be interested to see if anyone can make something faster! (ignoring the fact that I could probably have made it a table variable...)

Thanks,

Tao

CREATE TABLE #StartsAndEnds (EventID Int, EventType NVarChar(20), IdentityTracker INT IDENTITY (1,1))
INSERT INTO #StartsAndEnds (EventID, EventType)
SELECT SeqNumber, Type
FROM (
SELECT Seq1.SeqNumber, 'Start' AS Type
FROM #SequenceTable Seq1
LEFT JOIN #SequenceTable Seq2 ON Seq2.SeqNumber = Seq1.SeqNumber + 1
WHERE Seq2.SeqNumber Is Null

UNION ALL

SELECT Seq1.SeqNumber, 'End' AS Type
FROM #SequenceTable Seq1
LEFT JOIN #SequenceTable Seq2 ON Seq2.SeqNumber = Seq1.SeqNumber - 1
WHERE Seq2.SeqNumber Is Null
) AS PingPong
ORDER BY SeqNumber

SELECT IsNull(FL1.EventID, 0) AS StartGap,
FL2.EventID As EndGap,
IsNull(FL1.EventID, 0) + 1 AS FirstAvailable,
FL2.EventID - 1 AS LastAvailable,
FL2.EventID - IsNull(FL1.EventID, 0) - 1 As AvailableCount
FROM #StartsAndEnds FL2
LEFT JOIN #StartsAndEnds FL1 ON FL2.IdentityTracker = FL1.IdentityTracker + 1
WHERE FL2.IdentityTracker % 2 = 1

DROP TABLE #StartsAndEnds



http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.
Lashams
Lashams
SSC-Enthusiastic
SSC-Enthusiastic (118 reputation)SSC-Enthusiastic (118 reputation)SSC-Enthusiastic (118 reputation)SSC-Enthusiastic (118 reputation)SSC-Enthusiastic (118 reputation)SSC-Enthusiastic (118 reputation)SSC-Enthusiastic (118 reputation)SSC-Enthusiastic (118 reputation)

Group: General Forum Members
Points: 118 Visits: 48

Hi Tao, did you post the correct version of your code?

I copied your code above and did a find/replace on #SequenceTable and SeqNumber swapping these for my table name and sequence field respectively, then ran it. It was certainly quick, very impressive, 16 seconds to produce 142874 identified breaks in the sequence from a table containing 3,827,625 rows.

First few rows of output are as below

StartGapEndGapFirstAvailableLastAvailableAvailableCount
0NULL1NULLNULL
0NULL1NULLNULL
0NULL1NULLNULL
0NULL1NULLNULL
0NULL1NULLNULL
0NULL1NULLNULL
582246210758225621063882
780258008678026800852060

My original query over the same table takes 44 seconds and produces the following first few rows.

StartGapEndGapFirstAvailableLastAvailableAvailableCount
02111
206242062620625206251
520915209352092520921
555185552055519555191
582225822458223582231
6210778025621087802415917
800868008880087800871
801608016280161801611

I've checked the results and my query is showing the correct gaps. Your result line showing 58224 to 62107 represents my end gap result line 5 above to my start gap result line 6 above.

Love to see your revision as certainly looks like it will be very fast.

Stephen





Little Mouse
Little Mouse
Grasshopper
Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)Grasshopper (23 reputation)

Group: General Forum Members
Points: 23 Visits: 33
Thanks for your all excellent solutions. Much appreciated!
Martin Grape
Martin Grape
Forum Newbie
Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)

Group: General Forum Members
Points: 9 Visits: 23
Mike Arney (4/3/2006)


If you just need to identify IF a list of numbers is in sequence, but not where the gaps are, I believe you can do it in a single pass of the table. (Warning -- to understand this algorithm, some high school math is required!)<FONT color=#008000 size=2>



Or some grade school one:

SELECT CASE WHEN COUNT(Id) = MAX(Id) - MIN(Id) + 1 THEN 'In sequence' ELSE 'Not sequence' END FROM yourtable

:-D
Martin Grape
Martin Grape
Forum Newbie
Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)

Group: General Forum Members
Points: 9 Visits: 23
Here is another thing i found and modified a little.
Takes under a second for 1 millon rows and returns gaps size sorted ;-)

Change tables and fields surrounded by []
----------------------------------------

SELECT
LastSysId + 1 AS GapFrom,
NextSysId - 1 AS GapTo,
NextSysId - (LastSysId + 1) AS GapSize
FROM
(
SELECT
(
SELECT TOP 1
[SysId]
FROM
[yourtable]
WHERE
[SysId] < a.[SysId]
ORDER BY
[SysId] DESC
) AS LastSysId,
a.[SysId] AS NextSysId
FROM
[yourtable] AS a
LEFT JOIN
[yourtable] AS b ON a.[SysId] = b.[SysId] + 1
WHERE
b.[SysId] IS NULL
) AS a
WHERE
LastSysId IS NOT NULL
ORDER BY
3 DESC
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)

Group: General Forum Members
Points: 88414 Visits: 41128
Martin Grape (7/2/2011)
Here is another thing i found and modified a little.
Takes under a second for 1 millon rows and returns gaps size sorted ;-)

Change tables and fields surrounded by []
----------------------------------------

SELECT
LastSysId + 1 AS GapFrom,
NextSysId - 1 AS GapTo,
NextSysId - (LastSysId + 1) AS GapSize
FROM
(
SELECT
(
SELECT TOP 1
[SysId]
FROM
[yourtable]
WHERE
[SysId] < a.[SysId]
ORDER BY
[SysId] DESC
) AS LastSysId,
a.[SysId] AS NextSysId
FROM
[yourtable] AS a
LEFT JOIN
[kicks_medlem_20110630] AS b ON a.[SysId] = b.[SysId] + 1
WHERE
b.[SysId] IS NULL
) AS a
WHERE
LastSysId IS NOT NULL
ORDER BY
3 DESC



Nicely done and I can verify the speed. The only problem is that the code doesn't recognize missing "SysID" of 1. In fact, it doesn't recognize any gap from 1 to x-1 if all the rows between 1 to x-1 are missing. That could be OK for some folks. For me, it's usually not. It's neither wrong nor right. "It Depends" on what it's being used for.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Martin Grape
Martin Grape
Forum Newbie
Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)Forum Newbie (9 reputation)

Group: General Forum Members
Points: 9 Visits: 23
@Jeff

Here is a little stored procedure i made out of this that will return a list of gaps including missing 1-x (or 0-x or whatever-x) or
return the first available number i a sequence. Njoy Hehe



-- Drop procedure if it exists
IF EXISTS (SELECT * FROM [sys].[objects] WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[FindGap]') AND [type] IN (N'P', N'PC')) DROP PROCEDURE [dbo].[FindGap]
GO

-- Create procedure
CREATE PROCEDURE [dbo].[FindGap]
@strFindGapTable VARCHAR(254),
@strFindGapField VARCHAR(254),
@binFindGapLower BIGINT = 1,
@intFindGapMode INT = 0
AS
-- +---------------------------------------------------------------------------+
-- | |
-- | Type : SQL Procedure |
-- | Name : FindGap |
-- | Version : 1.1 |
-- | Creator : Martin Grape, ActionBase, +46 73 391 88 89 |
-- | Info : Returns first available numer in a field of sequence or |
-- | : displays a list of available gaps and number of gaps. |
-- | Input : @strFindGapTable -> Name of table to run on |
-- | : @strFindGapField -> Name of field |
-- | : @strFindGapLower -> Lower value to find gap from |
-- | : @strFindGapMode -> 0=All gaps, 1=Next available number |
-- | Example : EXEC FindGap 'yourtable', 'sysid', 1, 1 |
-- | History : 2011-07-03 Created v1.0 |
-- | : 2011-07-03 Added check for all numbers in sequence v1.1 |
-- | : |
-- | |
-- +---------------------------------------------------------------------------+

-- Declare variables
DECLARE @strFindGapSQL AS NVARCHAR(2000);
DECLARE @strFindGapPar AS NVARCHAR(2000);
DECLARE @binFindGapMin AS BIGINT;
DECLARE @binFindGapMax AS BIGINT;
DECLARE @binFindGapCount AS BIGINT;
DECLARE @binFindGapFirst AS BIGINT;

-- Make SQL to find min and max values
SET @strFindGapSQL = N'SELECT @binMinVal = MIN(' + @strFindGapField + '), @binMaxVal = MAX(' + @strFindGapField + '), @binCount = COUNT(*) FROM ' + @strFindGapTable;
SET @strFindGapPar = N'@binMinVal BIGINT OUTPUT, @binMaxVal BIGINT OUTPUT, @binCount BIGINT OUTPUT';

-- Get values into variables
EXEC SP_EXECUTESQL @strFindGapSQL, @strFindGapPar, @binFindGapMin OUTPUT, @binFindGapMax OUTPUT, @binFindGapCount OUTPUT;

-- Check mode
IF @intFindGapMode = 1
-- Next available sequence number
BEGIN
-- Check if min is bigger than supplied lower, then take that value
IF @binFindGapMin > @binFindGapLower
BEGIN
RETURN @binFindGapLower;
END
-- Check if max is smaller than supplied lower, then take that value
ELSE IF @binFindGapMax < @binFindGapLower
BEGIN
RETURN @binFindGapLower;
END
-- Check if max is the same as supplied lower, then return + 1
ELSE IF @binFindGapMax = @binFindGapLower
BEGIN
RETURN @binFindGapLower + 1;
END
-- Check if no existing gaps, return higest + 1
ELSE IF @binFindGapCount = @binFindGapMax - @binFindGapMin + 1
BEGIN
RETURN @binFindGapMax + 1
END
-- Get first available number
ELSE
BEGIN
-- Create SQL to get next value
SET @strFindGapSQL = N'SELECT TOP 1 @binMinVal = LastSysId + 1 FROM (SELECT (SELECT TOP 1 ' + @strFindGapField + ' FROM ' + @strFindGapTable + ' WHERE ' + @strFindGapField + ' < a.' + @strFindGapField + ' ORDER BY ' + @strFindGapField + ' DESC) AS LastSysId, a.' + @strFindGapField + ' AS NextSysId FROM ' + @strFindGapTable + ' AS a LEFT JOIN ' + @strFindGapTable + ' AS b ON a.' + @strFindGapField + ' = b.' + @strFindGapField + ' + 1 WHERE b.' + @strFindGapField + ' IS NULL) AS a WHERE LastSysId IS NOT NULL ORDER BY 1';
SET @strFindGapPar = N'@binMinVal BIGINT OUTPUT';
-- Get value into variable
EXEC SP_EXECUTESQL @strFindGapSQL, @strFindGapPar, @binFindGapFirst OUTPUT;
-- Return value
RETURN @binFindGapFirst;
END
END
ELSE
-- All available sequence numbers as a list
BEGIN
-- Check if first value is bigger than supplied lower, then add a row
IF @binFindGapLower < @binFindGapMin
BEGIN
-- Add a union from supplied lower to the first actual value
SET @strFindGapSQL = N'SELECT ' + CAST(@binFindGapLower AS NVARCHAR) + ' AS GapFrom, ' + CAST(@binFindGapMin - 1 AS NVARCHAR) + ' AS GapTo, ' + CAST(@binFindGapMin - @binFindGapLower AS NVARCHAR) + ' AS GapSize UNION ';
END
ELSE
BEGIN
-- Reset variable
SET @strFindGapSQL = '';
END
-- Check if to add
SET @strFindGapSQL = @strFindGapSQL + 'SELECT LastSysId + 1 AS GapFrom, NextSysId - 1 AS GapTo, NextSysId - (LastSysId + 1) AS GapSize FROM (SELECT (SELECT TOP 1 ' + @strFindGapField + ' FROM ' + @strFindGapTable + ' WHERE ' + @strFindGapField + ' < a.' + @strFindGapField + ' ORDER BY ' + @strFindGapField + ' DESC) AS LastSysId, a.' + @strFindGapField + ' AS NextSysId FROM ' + @strFindGapTable + ' AS a LEFT JOIN ' + @strFindGapTable + ' AS b ON a.' + @strFindGapField + ' = b.SysId + 1 WHERE b.' + @strFindGapField + ' IS NULL) AS a WHERE LastSysId IS NOT NULL ORDER BY 1';
-- Run the select
EXEC (@strFindGapSQL);
-- Return also the rowcount
RETURN @@ROWCOUNT;
END
-- +------------------+
-- | END OF PROCEDURE |
-- +------------------+
GO

-- +---------+
-- | Testing |
-- +---------+

-- Drop test table if it existst
IF EXISTS (SELECT * FROM [sys].[objects] WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[yourtable]') AND [type] IN (N'U')) DROP TABLE [dbo].[yourtable]
GO


-- Add dummy data
SELECT 2 AS SysId INTO yourtable UNION
SELECT 6 AS SysId UNION
SELECT 8 AS SysId UNION
SELECT 9 AS SysId UNION
SELECT 11 AS SysId;


-- Example 1: Adds a row with next available number starting on 0 , displays the result table and selects next available number
DECLARE @binFindGapReturn BIGINT;
EXEC @binFindGapReturn = FindGap 'yourtable', 'sysid', 0, 1;
INSERT INTO yourtable (SysId) SELECT @binFindGapReturn
SELECT * FROM yourtable ORDER BY 1
EXEC @binFindGapReturn = FindGap 'yourtable', 'sysid', 0, 1;
SELECT @binFindGapReturn AS NextValue;
GO


-- Example 2: Returns all the gaps starting on 1 as a resultset and returns number of gaps
DECLARE @binFindGapReturn BIGINT;
EXEC @binFindGapReturn = FindGap 'yourtable', 'sysid', 1, 2;
SELECT @binFindGapReturn AS NoOfGaps;
GO

-- Example 3: Fill missing sequences to get max + 1
INSERT INTO yourtable SELECT 1
INSERT INTO yourtable SELECT 3
INSERT INTO yourtable SELECT 4
INSERT INTO yourtable SELECT 5
INSERT INTO yourtable SELECT 7
INSERT INTO yourtable SELECT 10
DECLARE @binFindGapReturn BIGINT;
EXEC @binFindGapReturn = FindGap 'yourtable', 'sysid', 0, 1;
SELECT @binFindGapReturn AS NextValue;
GO


-- Drop test table if it existst
IF EXISTS (SELECT * FROM [sys].[objects] WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[yourtable]') AND [type] IN (N'U')) DROP TABLE [dbo].[yourtable]
GO



Attachments
FindGap.sql.txt (30 views, 7.00 KB)
9701
9701
Forum Newbie
Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)

Group: General Forum Members
Points: 1 Visits: 2
Dear Sir,
The query what u have posted helped me exactly the way i want the result. But i am completeley unaware of how it get executed. Kindly help.
stephenqlam
stephenqlam
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 77
n
chinkit
chinkit
Forum Newbie
Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)

Group: General Forum Members
Points: 8 Visits: 34
Hey!Mike Arney this is a good try. I have tried various algorithms in my experience and have never come across anything as efficient as this. However, the Math behind this was not adding up for me.

Therefore i went to http://mathforum.org/dr.math/ & Thanks To Dr Peterson who confirmed that this formula is invalid.

Try to use your algorithm for any of the below sequence.

1, 14

Or, if you prefer,

1, 3, 5, 6

Cheers!
Chinkit
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search