Log in  ::  Register  ::  Not logged in

 Recent PostsRecent Posts Popular TopicsPopular Topics
 Home Search Members Calendar Who's On

 Celko's Summer SQL Stumpers: Prime numbers Rate Topic Display Mode Topic Options
Author
 Message
 Posted Saturday, July 25, 2009 5:09 PM
 SSCommitted Group: General Forum Members Last Login: 2 days ago @ 7:41 PM Points: 1,945, Visits: 3,470
 Comments posted to this topic are about the item Celko's Summer SQL Stumpers: Prime numbers Books in Celko Series for Morgan-Kaufmann PublishingAnalytics and OLAP in SQL Data and Databases: Concepts in Practice Data, Measurements and Standards in SQLSQL for SmartiesSQL Programming Style SQL Puzzles and Answers Thinking in SetsTrees and Hierarchies in SQL
Post #759697
 Posted Monday, July 27, 2009 1:02 AM
 Grasshopper Group: General Forum Members Last Login: Friday, June 13, 2014 8:09 PM Points: 18, Visits: 51
 Here's how I do it. On my server it takes 24 seconds to generate the first 1,229 primes. But at this point add an index to Num, and it then takes me 59 seconds to do up to the 100,000 number range (9,595 primes).`drop table pgoCREATE TABLE [dbo].[P] ( [Num] [int] NULL ) ON [PRIMARY]goset nocount ondeclare @Number as int, @Factor as intset @Number = 1While @Number < 10000 Begin Declare Factors cursor for select num from p where num <= @number/cast(num as float) and num > 1 order by num asc open Factors fetch Factors into @Factor while @@fetch_Status = 0 begin if (@Number % @Factor) = 0 begin break end fetch Factors into @Factor end if (@Number % @Factor) <> 0 or @Factor is null insert into p values (@Number) close Factors deallocate Factors set @Number = @Number + 2 endgoselect * from p`
Post #759900
 Posted Monday, July 27, 2009 1:12 AM
 Grasshopper Group: General Forum Members Last Login: Friday, June 13, 2014 8:09 PM Points: 18, Visits: 51
 Actually.... just put a seconds more thought into it and came up with this... it is faster... I don't know why I went down the cursor path...`CREATE TABLE [dbo].[P] ( [Num] [int] NULL ) ON [PRIMARY] CREATE CLUSTERED INDEX [MainIdex] ON [dbo].[P]([Num]) ON [PRIMARY]goset nocount ondeclare @Number as int, @Factor as intselect @Number = isnull(max(num),1) from pWhile @Number < 100000 Begin if not exists ( select 'x' from p where num > 1 and (@Number % num) = 0 ) insert into p values (@Number) set @Number = @Number + 2 endgoselect * from p`
Post #759904
 Posted Monday, July 27, 2009 1:13 AM
 Grasshopper Group: General Forum Members Last Login: Friday, June 13, 2014 8:09 PM Points: 18, Visits: 51
 When I say faster - it runs the full 100,000 range (9592 primes) in 30 seconds.
Post #759905
 Posted Monday, July 27, 2009 1:33 AM
 Grasshopper Group: General Forum Members Last Login: Friday, June 13, 2014 8:09 PM Points: 18, Visits: 51
 Oh - I just re-read and it said to explain.... and I realised I left "2" out of the prime list. I'm also considering 1 a prime because it is divisible by 1 and itself... you can change the value in the original insert to 3 if you don't want to consider 1 to be prime.Well - I'm jumping by increments of 2 because that removes all even numbers, which by definition are divisible by 2. I then do an "exists" check, because if at least one occurance is found the exist stops - that is, SQL Server doesn't have to go through all factors in the table, BUT because the clustered index is ascending it will start with the most likely hits (3, 5, etc). Of all nubmers, 1/2 (50%) are divisible by 2, 1/3 (33.33%) are divisible by 3, 1/5 (20%) are divisible by 5 and so on. So if I had a random number that is non-even, there is a 2/3 probability it is divisible by 3 (the fact I removed all even nubmers in the first step doubles the later probabilities). So 66% of numbers are eliminated after hitting the very first check. 2/5 (40%) of nubmers are eliminated after hitting the next row and so on (actually, it's more like 13.3% of all numbers checked are removed by being devisible by 5, because 2/3 of the numbers are eliminated by being divisible by 3, meaining that we have just eliminated 79% of numbers in the first two rows of our table). So this itteration is efficient in that most non-primes will be eliminated very quickly.Also - I'm only storing primes because to the best of my understaning there is no such thing as a non-prime that is only divisible by other non-primes - at least one of the factors must be prime because the prime must be a factor of the smaller non-prime and hense factor of every number which has the non-prime as a factor. So that saves a lot on storage, because we only need to store and check against primes. Finally, I am only using mod because it will quickly show up if the factor is divisble (a mod of zero indicates a non-prime). I tried with and without other limiting factors (i.e. to make sure that the factor value wasn't greater than half the Prime cantidate number) but I couldn't get a speed improvement, and therefore left it as is. This is probably because almost all non-primes are eliminated within the first couple of rows (most numbers are divisible by the first few primes).So here is the completed Entry:`CREATE TABLE [dbo].[P] ( [Num] [int] NULL ) ON [PRIMARY] CREATE CLUSTERED INDEX [MainIdex] ON [dbo].[P]([Num]) ON [PRIMARY]goset nocount ondeclare @Number as int, @Factor as intselect @Number = isnull(max(num),1) from pWhile @Number < 10 Begin if not exists ( select 'x' from p where num > 1 and (@Number % num) = 0 ) insert into p values (@Number) set @Number = @Number + 2 endgoinsert into p values (2)goselect * from p`
Post #759910
 Posted Monday, July 27, 2009 6:35 AM
 SSC-Enthusiastic Group: General Forum Members Last Login: Monday, November 11, 2013 2:42 AM Points: 150, Visits: 245
 I guess that I avoided part of the problem, because when I did this same problem a little while ago I wrote it in JavaScript, not SQL. For those who are interested, it is here: http://www.calcResult.com/maths/Sequences/prime-number-sequence.htmlProbably the most important point is that since it was running 'open ended' the traditional sieve methods weren't very useful, so I wrote things so that two lists are kept: a list of non-prime numbers, and a paired list of primes. The latter contains every prime number found so far, paired with a multiple of that prime. This is not very efficient for small numbers, but works well once the list gets big, as everything is fairly linear; no multiplications and only one division (from what I recall) just a lot of additions and comparisons.To summarise:1. Step through each integer, one by one2. In the paired list, add the first number onto the second (if necessary) so that each paired number is greater than the integer from step 13. Check if the Integer in step 1 exists in the list of non-prime numbers. Add to either the prime or non-prime list as appropriate. Throw away your pocket calculators; visit www.calcResult.com
Post #759999
 Posted Monday, July 27, 2009 7:10 AM
 Valued Member Group: General Forum Members Last Login: Sunday, June 19, 2011 5:02 AM Points: 63, Visits: 120
 Working only with odd numbers, generate a table of not-primes via simple, fast multiplication.Delete the not-primes from the original odd-numbers table and voilà, we have our table of prime numbers (except for the special case of "2" which is an odd duck amongst the primes anyway).Note that this code was tested on SQL Server 2000 and should be fairly portable:`DECLARE @UpperLimit integer, @SQR_ULimit integerSET @UpperLimit = 1000SET @SQR_ULimit = CEILING (SQRT (@UpperLimit))CREATE TABLE Digits (K INTEGER NOT NULL PRIMARY KEY CHECK (K >= 0));CREATE TABLE OddNumbers (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 2));CREATE TABLE Primes (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 1));CREATE TABLE NotPrimes (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 1));INSERT INTO Digits (K)SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9;INSERT INTO OddNumbers (K)SELECT --1000*thousands.K + 100*hundreds.K + 10*tens.K + units.K + 3 AS KFROM --Digits thousands, Digits hundreds, Digits tens, Digits unitsWHERE (units.K + 3) % 2 = 1AND (--1000*thousands.K + 100*hundreds.K + 10*tens.K + units.K + 3) <= @UpperLimit;INSERT INTO NotPrimes (K)SELECT DISTINCT A.K * B.K AS KFROM OddNumbers A , OddNumbers BWHERE A.K <= @SQR_ULimit -- We can only limit one half of the equation using this method.AND (A.K * B.K) <= @UpperLimit;INSERT INTO Primes (K) VALUES (2); -- Special case.INSERT INTO Primes SELECT K FROM OddNumbers;DELETE FROM Primes WHERE K IN (SELECT K FROM NotPrimes);SELECT K FROM Primes; -- ORDER BY K not really needed?`
Post #760025
 Posted Monday, July 27, 2009 7:12 AM
 SSC Veteran Group: General Forum Members Last Login: Friday, January 17, 2014 6:38 AM Points: 278, Visits: 534
 The "Sieve of Atkins" and "Wheel Sieve" entries in Wikipedia look too complicated for me, to implement in SQL while I'm supposed to be working, but I wrote a quick script that seems to me the "obvious" way, and it's quicker than any of the methods in the original article, and quicker than the previous post, so here is my entry (SQL2000).Basically, generate a list (@seq) of all the numbers from 1 to 100,000 (excluding even numbers and numbers ending in 5 - except for 2 and 5 itself). Then, generate a list (@primes) of all the numbers in that list that don't appear in a list of non-primes.Oddly, although this runs in 11 secs on my hardware, the subquery that generates the list of non-primes takes well over a minute to run on its own. Yet more evidence that the SQL optimizer is smarter than me.`declare @primes table (p int)declare @allDigits table (i int)insert into @allDigits (i)(select 1union allselect 2union allselect 3union allselect 4union allselect 5union allselect 6union allselect 7union allselect 8union allselect 9union allselect 0)declare @digits1379 table (i int)insert into @digits1379 (i)(select (1-1)union allselect (3-1)union allselect (7-1)union allselect (9-1))declare @seq table (s int, primary key (s))insert into @seq (s)select 2union allselect 5union allselect a.ifrom ( select 1 + d0.i + d1.i * 10 + d2.i * 100 + d3.i * 1000 + d4.i * 10000 as i from @digits1379 d0 cross join @allDigits d1 cross join @allDigits d2 cross join @allDigits d3 cross join @allDigits d4 ) ainsert into @primes (p)select sfrom @seqwhere s not in ( select p1.s from @seq p1 cross join @seq p2 where (p1.s % p2.s = 0) -- check for divisibility and (p2.s > 1) -- no point in checking for divisibility by 1 and (p2.s <= p1.s / 2) -- only need to check for divisibility up to half the number )select * from @primesorder by p`
Post #760026
 Posted Monday, July 27, 2009 7:53 AM
 Old Hand Group: General Forum Members Last Login: 2 days ago @ 6:43 AM Points: 344, Visits: 2,339
 Wasn't the there a mathematical equation discovered several years ago that could say with certainty that a given number is a prime or not? It was constructed around a fixed number (around 12 or so iirc) of terms calculated seperatly and combined they give the answer yes or no prime.I do not have a match background and I don't have a link at hand, just read it in a science magazine back then. Certainly doing some math per number is the quickest way, you can exclude even numbers and one from the list, and the obvious multiples of 3. it depends on the efficiency of the final math that determines if quick exclusion tricks are needed.If I am right older sieves would be obsolete unless the match is to slow. At least its perfect to infinity :)
Post #760064
 Posted Monday, July 27, 2009 7:56 AM
 Valued Member Group: General Forum Members Last Login: Sunday, June 19, 2011 5:02 AM Points: 63, Visits: 120
 Actually, the only reason the previous code was "faster" was because it used table variables -- which also makes it much less portable for us multi-platform types.Also note that 1 is not a prime number, by definition.Refactoring my previous code to use table variables, makes it the fastest by a factor of 4:`DECLARE @UpperLimit integer, @SQR_ULimit integerSET @UpperLimit = 100000SET @SQR_ULimit = CEILING (SQRT (@UpperLimit))DECLARE @Digits TABLE (K INTEGER NOT NULL PRIMARY KEY CHECK (K >= 0));DECLARE @OddNumbers TABLE (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 2));DECLARE @Primes TABLE (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 1));DECLARE @NotPrimes TABLE (K INTEGER NOT NULL PRIMARY KEY CHECK (K > 1));INSERT INTO @Digits (K)SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9;INSERT INTO @OddNumbers (K)SELECT 10000*TenThous.K + 1000*thousands.K + 100*hundreds.K + 10*tens.K + units.K + 3 AS KFROM @Digits TenThous, @Digits thousands, @Digits hundreds, @Digits tens, @Digits unitsWHERE (units.K + 3) % 2 = 1AND (10000*TenThous.K + 1000*thousands.K + 100*hundreds.K + 10*tens.K + units.K + 3) <= @UpperLimit;INSERT INTO @NotPrimes (K)SELECT DISTINCT A.K * B.K AS KFROM @OddNumbers A , @OddNumbers BWHERE A.K <= @SQR_ULimit -- We can only limit one half of the equation using this method.AND (A.K * B.K) <= @UpperLimit;INSERT INTO @Primes (K) VALUES (2); -- Special case.INSERT INTO @Primes SELECT K FROM @OddNumbers;DELETE FROM @Primes WHERE K IN (SELECT K FROM @NotPrimes);SELECT K FROM @Primes ORDER BY K; `
Post #760072

 Permissions

 Copyright © 2002-2015 Simple Talk Publishing. All Rights Reserved. Privacy Policy. Terms of Use. Report Abuse.