|
|
|
SSCommitted
      
Group: General Forum Members
Last Login: Tuesday, January 15, 2013 11:11 AM
Points: 1,945,
Visits: 2,782
|
|
Comments posted to this topic are about the item Celko's Summer SQL Stumpers: Prime numbers
Books in Celko Series for Morgan-Kaufmann Publishing Analytics and OLAP in SQL Data and Databases: Concepts in Practice Data, Measurements and Standards in SQL SQL for Smarties SQL Programming Style SQL Puzzles and Answers Thinking in Sets Trees and Hierarchies in SQL
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Monday, November 09, 2009 4:06 PM
Points: 18,
Visits: 50
|
|
Here's how I do it. On my server it takes 24 seconds to generate the first 1,229 primes. But at this point add an index to Num, and it then takes me 59 seconds to do up to the 100,000 number range (9,595 primes).
drop table p
go CREATE TABLE [dbo].[P] ( [Num] [int] NULL ) ON [PRIMARY] go set nocount on
declare @Number as int, @Factor as int
set @Number = 1 While @Number < 10000 Begin Declare Factors cursor for select num from p where num <= @number/cast(num as float) and num > 1 order by num asc open Factors fetch Factors into @Factor while @@fetch_Status = 0 begin if (@Number % @Factor) = 0 begin break end fetch Factors into @Factor end if (@Number % @Factor) <> 0 or @Factor is null insert into p values (@Number) close Factors deallocate Factors set @Number = @Number + 2 end go select * from p
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Monday, November 09, 2009 4:06 PM
Points: 18,
Visits: 50
|
|
Actually.... just put a seconds more thought into it and came up with this... it is faster... I don't know why I went down the cursor path...
CREATE TABLE [dbo].[P] ( [Num] [int] NULL ) ON [PRIMARY] CREATE CLUSTERED INDEX [MainIdex] ON [dbo].[P]([Num]) ON [PRIMARY] go set nocount on
declare @Number as int, @Factor as int select @Number = isnull(max(num),1) from p While @Number < 100000 Begin if not exists ( select 'x' from p where num > 1 and (@Number % num) = 0 ) insert into p values (@Number) set @Number = @Number + 2 end go select * from p
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Monday, November 09, 2009 4:06 PM
Points: 18,
Visits: 50
|
|
| When I say faster - it runs the full 100,000 range (9592 primes) in 30 seconds.
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Monday, November 09, 2009 4:06 PM
Points: 18,
Visits: 50
|
|
Oh - I just re-read and it said to explain.... and I realised I left "2" out of the prime list. I'm also considering 1 a prime because it is divisible by 1 and itself... you can change the value in the original insert to 3 if you don't want to consider 1 to be prime.
Well - I'm jumping by increments of 2 because that removes all even numbers, which by definition are divisible by 2. I then do an "exists" check, because if at least one occurance is found the exist stops - that is, SQL Server doesn't have to go through all factors in the table, BUT because the clustered index is ascending it will start with the most likely hits (3, 5, etc).
Of all nubmers, 1/2 (50%) are divisible by 2, 1/3 (33.33%) are divisible by 3, 1/5 (20%) are divisible by 5 and so on. So if I had a random number that is non-even, there is a 2/3 probability it is divisible by 3 (the fact I removed all even nubmers in the first step doubles the later probabilities). So 66% of numbers are eliminated after hitting the very first check. 2/5 (40%) of nubmers are eliminated after hitting the next row and so on (actually, it's more like 13.3% of all numbers checked are removed by being devisible by 5, because 2/3 of the numbers are eliminated by being divisible by 3, meaining that we have just eliminated 79% of numbers in the first two rows of our table). So this itteration is efficient in that most non-primes will be eliminated very quickly.
Also - I'm only storing primes because to the best of my understaning there is no such thing as a non-prime that is only divisible by other non-primes - at least one of the factors must be prime because the prime must be a factor of the smaller non-prime and hense factor of every number which has the non-prime as a factor.
So that saves a lot on storage, because we only need to store and check against primes. Finally, I am only using mod because it will quickly show up if the factor is divisble (a mod of zero indicates a non-prime). I tried with and without other limiting factors (i.e. to make sure that the factor value wasn't greater than half the Prime cantidate number) but I couldn't get a speed improvement, and therefore left it as is. This is probably because almost all non-primes are eliminated within the first couple of rows (most numbers are divisible by the first few primes).
So here is the completed Entry:
CREATE TABLE [dbo].[P] ( [Num] [int] NULL ) ON [PRIMARY] CREATE CLUSTERED INDEX [MainIdex] ON [dbo].[P]([Num]) ON [PRIMARY] go set nocount on
declare @Number as int, @Factor as int select @Number = isnull(max(num),1) from p While @Number < 10 Begin if not exists ( select 'x' from p where num > 1 and (@Number % num) = 0 ) insert into p values (@Number) set @Number = @Number + 2 end go insert into p values (2) go select * from p
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: 2 days ago @ 2:54 AM
Points: 119,
Visits: 214
|
|
I guess that I avoided part of the problem, because when I did this same problem a little while ago I wrote it in JavaScript, not SQL. For those who are interested, it is here: http://www.calcResult.com/maths/Sequences/prime-number-sequence.html Probably the most important point is that since it was running 'open ended' the traditional sieve methods weren't very useful, so I wrote things so that two lists are kept: a list of non-prime numbers, and a paired list of primes. The latter contains every prime number found so far, paired with a multiple of that prime.
This is not very efficient for small numbers, but works well once the list gets big, as everything is fairly linear; no multiplications and only one division (from what I recall) just a lot of additions and comparisons. To summarise: 1. Step through each integer, one by one 2. In the paired list, add the first number onto the second (if necessary) so that each paired number is greater than the integer from step 1 3. Check if the Integer in step 1 exists in the list of non-prime numbers. Add to either the prime or non-prime list as appropriate.
Throw away your pocket calculators; visit www.calcResult.com
|
|
|
|
|
Valued Member
      
Group: General Forum Members
Last Login: Sunday, June 19, 2011 5:02 AM
Points: 63,
Visits: 120
|
|
Working only with odd numbers, generate a table of not-primes via simple, fast multiplication. Delete the not-primes from the original odd-numbers table and voilà, we have our table of prime numbers (except for the special case of "2" which is an odd duck amongst the primes anyway).
Note that this code was tested on SQL Server 2000 and should be fairly portable:
DECLARE @UpperLimit integer, @SQR_ULimit integer
SET @UpperLimit = 1000 SET @SQR_ULimit = CEILING (SQRT (@UpperLimit))
CREATE TABLE Digits (K INTEGER NOT NULL PRIMARY KEY CHECK (K >= 0)); CREATE TABLE OddNumbers (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 2)); CREATE TABLE Primes (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 1)); CREATE TABLE NotPrimes (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 1));
INSERT INTO Digits (K) SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9;
INSERT INTO OddNumbers (K) SELECT --1000*thousands.K + 100*hundreds.K + 10*tens.K + units.K + 3 AS K FROM --Digits thousands, Digits hundreds, Digits tens, Digits units WHERE (units.K + 3) % 2 = 1 AND (--1000*thousands.K + 100*hundreds.K + 10*tens.K + units.K + 3) <= @UpperLimit;
INSERT INTO NotPrimes (K) SELECT DISTINCT A.K * B.K AS K FROM OddNumbers A , OddNumbers B WHERE A.K <= @SQR_ULimit -- We can only limit one half of the equation using this method. AND (A.K * B.K) <= @UpperLimit;
INSERT INTO Primes (K) VALUES (2); -- Special case. INSERT INTO Primes SELECT K FROM OddNumbers;
DELETE FROM Primes WHERE K IN (SELECT K FROM NotPrimes);
SELECT K FROM Primes; -- ORDER BY K not really needed?
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Friday, April 26, 2013 3:18 AM
Points: 276,
Visits: 524
|
|
The "Sieve of Atkins" and "Wheel Sieve" entries in Wikipedia look too complicated for me, to implement in SQL while I'm supposed to be working, but I wrote a quick script that seems to me the "obvious" way, and it's quicker than any of the methods in the original article, and quicker than the previous post, so here is my entry (SQL2000).
Basically, generate a list (@seq) of all the numbers from 1 to 100,000 (excluding even numbers and numbers ending in 5 - except for 2 and 5 itself). Then, generate a list (@primes) of all the numbers in that list that don't appear in a list of non-primes.
Oddly, although this runs in 11 secs on my hardware, the subquery that generates the list of non-primes takes well over a minute to run on its own. Yet more evidence that the SQL optimizer is smarter than me.
declare @primes table (p int)
declare @allDigits table (i int) insert into @allDigits (i) ( select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9 union all select 0 )
declare @digits1379 table (i int) insert into @digits1379 (i) ( select (1-1) union all select (3-1) union all select (7-1) union all select (9-1) )
declare @seq table (s int, primary key (s)) insert into @seq (s) select 2 union all select 5 union all select a.i from ( select 1 + d0.i + d1.i * 10 + d2.i * 100 + d3.i * 1000 + d4.i * 10000 as i from @digits1379 d0 cross join @allDigits d1 cross join @allDigits d2 cross join @allDigits d3 cross join @allDigits d4 ) a
insert into @primes (p) select s from @seq where s not in ( select p1.s from @seq p1 cross join @seq p2 where (p1.s % p2.s = 0) -- check for divisibility and (p2.s > 1) -- no point in checking for divisibility by 1 and (p2.s <= p1.s / 2) -- only need to check for divisibility up to half the number )
select * from @primes order by p
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Today @ 3:57 AM
Points: 288,
Visits: 1,903
|
|
Wasn't the there a mathematical equation discovered several years ago that could say with certainty that a given number is a prime or not? It was constructed around a fixed number (around 12 or so iirc) of terms calculated seperatly and combined they give the answer yes or no prime.
I do not have a match background and I don't have a link at hand, just read it in a science magazine back then. Certainly doing some math per number is the quickest way, you can exclude even numbers and one from the list, and the obvious multiples of 3. it depends on the efficiency of the final math that determines if quick exclusion tricks are needed.
If I am right older sieves would be obsolete unless the match is to slow. At least its perfect to infinity :)
|
|
|
|
|
Valued Member
      
Group: General Forum Members
Last Login: Sunday, June 19, 2011 5:02 AM
Points: 63,
Visits: 120
|
|
Actually, the only reason the previous code was "faster" was because it used table variables -- which also makes it much less portable for us multi-platform types.
Also note that 1 is not a prime number, by definition.
Refactoring my previous code to use table variables, makes it the fastest by a factor of 4:
DECLARE @UpperLimit integer, @SQR_ULimit integer
SET @UpperLimit = 100000 SET @SQR_ULimit = CEILING (SQRT (@UpperLimit))
DECLARE @Digits TABLE (K INTEGER NOT NULL PRIMARY KEY CHECK (K >= 0)); DECLARE @OddNumbers TABLE (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 2)); DECLARE @Primes TABLE (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 1)); DECLARE @NotPrimes TABLE (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 1));
INSERT INTO @Digits (K) SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9;
INSERT INTO @OddNumbers (K) SELECT 10000*TenThous.K + 1000*thousands.K + 100*hundreds.K + 10*tens.K + units.K + 3 AS K FROM @Digits TenThous, @Digits thousands, @Digits hundreds, @Digits tens, @Digits units WHERE (units.K + 3) % 2 = 1 AND (10000*TenThous.K + 1000*thousands.K + 100*hundreds.K + 10*tens.K + units.K + 3) <= @UpperLimit;
INSERT INTO @NotPrimes (K) SELECT DISTINCT A.K * B.K AS K FROM @OddNumbers A , @OddNumbers B WHERE A.K <= @SQR_ULimit -- We can only limit one half of the equation using this method. AND (A.K * B.K) <= @UpperLimit;
INSERT INTO @Primes (K) VALUES (2); -- Special case. INSERT INTO @Primes SELECT K FROM @OddNumbers;
DELETE FROM @Primes WHERE K IN (SELECT K FROM @NotPrimes);
SELECT K FROM @Primes ORDER BY K;
|
|
|
|