September 8, 2011 at 10:12 am
wiltc-836148 (9/8/2011)
Jeff Moden (9/8/2011)
It seems to me, that the only thing I could take out is the lineGapEnd = hi.MyID - 1
which would have little effect on performance...
Am I missing something?
Thanks again!
Charles
I meant "much quicker than the Tally Table join. The only other thing to do other than removing the code you pointed out would be to add TOP 1 to the SELECT to return just the first occurance of a GAP START.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 8, 2011 at 10:26 am
Jeff Moden (9/8/2011)
I meant "much quicker than the Tally Table join. The only other thing to do other than removing the code you pointed out would be to add TOP 1 to the SELECT to return just the first occurance of a GAP START.
Ok, TOP 1 I understood 🙂
Actually in my testing the tally table has been faster...
But
1) I'm using DB2 for i
2) My number range is only 1-99999..
Thanks for your help!
Charles
May 24, 2026 at 3:14 pm
hi hope this helps
drop table if exists #abc
create table #abc ( col1 varchar(100) )
insert into #abc values
(' 1,2,4,5,7,8,11,12,13,15,17,19,20')
select * from #abc
t-sql solution
;WITH x AS
(
SELECT v = CAST(value AS INT),
nxt = LEAD(CAST(value AS INT))
OVER (ORDER BY CAST(value AS INT))
FROM STRING_SPLIT('1,2,4,5,7,8,11,12,13,15,17,19,20', ',')
)
SELECT v + 1 AS MissingStart
FROM x
WHERE nxt - v > 1;

May 24, 2026 at 7:17 pm
hi hope this helps
Fwiw, this is the 2K8 forum. LEAD and STRING_SPLIT were not available. Also, where is 10? It's skipped in the list of missing. The column labeled 'MissingStart' doesn't really answer the question imo
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
May 26, 2026 at 2:24 am
The original post: "I have the data like 1,2,4,5,7,8,11,12,13,15,17,19,20. How to find the missing numbers in the series?"
If the data is like, or similar, to the series provided as an example it doesn't seem unlikely to actually include a number over 100 imo. With the following series is the latest query still the appropriate answer?
DECLARE @s varchar(max)='1,2,4,5,7,8,11,12,13,15,17,19,20,102'
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
May 26, 2026 at 11:34 am

May 26, 2026 at 1:13 pm
DECLARE @MissingNumbers TABLE (N INT primary key not null);
INSERT @MissingNumbers(N)
VALUES (1),(2),(4),(5),(7),(8),(11),(12),(13),(15),(17),(19),(20),(102);
--Steve's query (based on Paul White's query, which is based on Jeff Moden's query)
SELECT GapStart = Lo.LoN,
GapEnd = Hi.N - 1,
Fn.N,
(Lo.LoN+Fn.N-1) missing_N
FROM @MissingNumbers Hi
CROSS APPLY (SELECT LoN = ISNULL(MAX(Lo.N), 0) + 1
FROM @MissingNumbers Lo
WHERE Lo.N < Hi.N) Lo
cross apply dbo.fnTally(Lo.LoN, (Hi.N-1)-Lo.LoN+1) Fn
WHERE NOT EXISTS (SELECT 1
FROM @MissingNumbers NotIn
WHERE NotIn.N + 1 = Hi.N)
AND (Hi.N-1)>0
order by (Lo.LoN+Fn.N-1);
--Paul White's query
SELECT GapStart = Lo.MyID + 1,
GapEnd = Hi.MyID - 1
FROM dbo.MyTest Hi
CROSS APPLY (SELECT MyID = ISNULL(MAX(Lo.MyID), 0) + 1
FROM dbo.MyTest Lo
WHERE Lo.MyID < Hi.MyID) Lo
WHERE NOT EXISTS (SELECT NotIn.MyID + 1
FROM dbo.MyTest NotIn
WHERE NotIn.MyID + 1 = Hi.MyID);
--Jeff's query
SELECT GapStart = (SELECT ISNULL(MAX(lo.MyID),0)+1
FROM #MyTest lo
WHERE lo.MyID < hi.MyID),
GapEnd = hi.MyID - 1
FROM #MyTest hi
WHERE hi.MyID NOT IN (SELECT MyID + 1
FROM #MyTest);
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
May 26, 2026 at 1:23 pm
hi
my query = worst
Paul White / Steve Query = Best

May 26, 2026 at 2:59 pm

May 26, 2026 at 4:29 pm
Ha, ok that's a nice inversion of the problem. Nicely done. Posting code in a forum is an invitation for nitpicking of course. One assumption baked into the query is the max(N) in MissingNumbers is less than then the max(N) in dbo.Numbers. Typically a safe assumption but it's worth mentioning because if it's not true the query silently (worst way) fails, meaning it returns an incorrect answer without any error. Also, the inequality doesn't need the equals because max(N) from MissingNumbers is always within the non-missing set. Also, LEFT JOIN is typically used if you need to access columns from the righthand side of the join. In this case that's not necessary so imo it's better specified as an EXIST'ance test with a correlated subquery
SELECT n.N
FROM dbo.Numbers n
WHERE n.N < (SELECT MAX(N)
FROM MissingNumbers)
AND NOT EXISTS (SELECT 1
FROM MissingNumbers m
WHERE m.N = n.N);
To nitpick my own code I don't think it handles 0-N gaps well. I added the 2nd condition to the WHERE clause: "AND (Hi.N-1)>0" which is a bit of a kludge. I'd have to think about it more...
Regarding GENERATE_SERIES being in SQL Server 2022. That's true. If we were using 2022 to answer the OP's request I would use LEAD and GENERATE_SERIES and the query would be much simpler to look at.
It's nice to see old topics being revived. There's a treasure trove of great questions and there's no reason imo they have to be pinned to obsolete versions of SQL Server
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
May 26, 2026 at 7:01 pm
not exists "preferred" "better in all sorts " over left join

Viewing 12 posts - 61 through 72 (of 72 total)
You must be logged in to reply to this topic. Login to reply