# Finding Primes

Aunt Kathi Data Platform MVP
Author of Expert T-SQL Window Functions
Simple-Talk Editor

• Finding primes is something I worked on about 6 months ago (for fun, of course ). Here's a link to the work...

It uses a sieve of Atkin approach, and in my tests back then, the function could find all primes below 100,000 in 0.5 seconds.

Ryan Randall

Solutions are easy. Understanding the problem, now, that's the hard part.

• Cool.  Looks like there are a lot of ways to solve this puzzle.

Aunt Kathi Data Platform MVP
Author of Expert T-SQL Window Functions
Simple-Talk Editor

• By checking the square root outside of the select statement, you will save the processor a number of extra calculations.  I modified it to be this:

SELECT @NextIntSqrt = SQRT(@NextInt)

IF NOT EXISTS (SELECT Prime FROM Primes WHERE @NextIntSqrt >= Prime AND @NextInt % Prime = 0)

Cuts down the calculation time significantly.  I also adjusted the order of the filters in the WHERE clause.

• Good idea.

Aunt Kathi Data Platform MVP
Author of Expert T-SQL Window Functions
Simple-Talk Editor

• Fun stuff.

I tried to do something similar in VB several years ago...the trick was that I was trying to solve for primes bigger than any of the data types. I ran out of time before I solved it...but it sure was a fun exercise!

• Kathi

In the book you mentioned, Chris has Aspbergers Syndrome (rather than being autistic). There are many similarities, but people with Aspbergers for the most part can lead relatively 'normal' lives. There are varying degrees of both of course, and it's sometimes hard to make a clear distinction where one ends and the other begins.

Regards,

MItch

• Hi Ryan,

What did you come up with for the largest prime number found and how many prime numbers?   I wanna make sure I'm doing it right...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

• My suggestion, on our SQL Server 2000 installation it finds all primes less than 5000000 in 54 seconds. But of course it depends on the hardware...

set nocount on

go

select top 5000000 identity(int, 1, 1) as Number into #Numbers

go

alter table #Numbers add constraint PK_Numbers primary key clustered (Number)

go

create table #Primes(prime int primary key)

go

declare @time datetime

select @time = getdate()

declare @i int

set @i = 1

while @i*@i < 5000000

begin

insert into #Primes

select n.Number

from #Numbers n

where

n.Number < (@i+1)*(@i+1) and n.Number > @i*@i

and not exists

(

select * from #Primes p

where p.prime < @i + 1 and n.Number % p.prime = 0

&nbsp

set @i = @i + 1

end

--select * from #Primes

select datediff(ms, @time, getdate())

drop table #Primes

go

drop table #Numbers

go

• Hi Jeff,

Here's some code so you can run it yourself. I'll add some run results in the next post.

--Inputs

DECLARE @MaxNumber INT

SET @MaxNumber = 5000000

--Preparation

SET NOCOUNT ON

DECLARE @Time DATETIME

SELECT @Time = GETDATE()

--Preparation - Numbers table

DECLARE @SqrtMaxNumber INT

SET @SqrtMaxNumber = SQRT(@MaxNumber)

SET ROWCOUNT @SqrtMaxNumber

CREATE TABLE #Numbers (i BIGINT IDENTITY(1, 1) PRIMARY KEY CLUSTERED, j BIGINT, x bit)

SET ROWCOUNT 0

UPDATE #Numbers SET j = i*i

PRINT 'Checkpoint A: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

--Preparation - Put candidate primes into a Primes table

-- (integers which have an odd number of representations by certain quadratic forms)

CREATE TABLE #Primes (i BIGINT PRIMARY KEY CLUSTERED)

INSERT #Primes

SELECT 2 UNION ALL SELECT 3 UNION ALL

SELECT k FROM (

SELECT k FROM (SELECT 4 * a.j + b.j AS k FROM #Numbers a, #Numbers b) c WHERE k <= @MaxNumber AND k % 12 IN (1, 5)

UNION ALL

SELECT k from (select 3 * a.j + b.j AS k FROM #Numbers a, #Numbers b) c WHERE k <= @MaxNumber AND k % 12 = 7

UNION ALL

SELECT k from (select 3 * a.j - b.j AS k FROM #Numbers a INNER JOIN #Numbers b ON a.i > b.i) c WHERE k <= @MaxNumber AND k % 12 = 11

) d GROUP BY k HAVING COUNT(*) IN (1, 3)

PRINT 'Checkpoint B: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

--Calculation - Eliminate composites by sieving

DECLARE @i BIGINT

SET @i = 5

WHILE @i * @i < @MaxNumber

BEGIN

DELETE #Primes WHERE i > @i and i % @i = 0

SELECT @i = min(i) FROM #Primes WHERE i > @i

PRINT @i

END

--Show results

PRINT 'Finish: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

SELECT COUNT(*) AS 'Number of primes', MAX(i) AS 'Max prime', AVG(i) AS 'Average of primes' FROM #Primes

SELECT * FROM #Primes ORDER BY i

--Tidy up

DROP TABLE #Primes

DROP TABLE #Numbers

Ryan Randall

Solutions are easy. Understanding the problem, now, that's the hard part.

• Here's a selection of results when I run the above code on a server here.

39 seconds to find all primes below 5 million, and 18 minutes to find all primes below 100 million.

DECLARE @Results TABLE (MaxNumber INT, CheckpointA INT, CheckpointB INT, Finish INT, NumberOfPrimes INT,

MaxPrime INT, AverageOfPrimes INT,

Seconds AS CAST(Finish AS DECIMAL) / 1000, Minutes AS CAST(Finish AS DECIMAL) / 60000,

Time AS CAST(Finish / 60000 AS VARCHAR(5)) + ':' + RIGHT('0' + CAST(Finish / 1000 % 60 AS VARCHAR(2)), 2))

INSERT @Results

SELECT 1000, 60, 93, 106, 168, 997, 453

UNION ALL SELECT 5000, 76, 123, 140, 669, 4999, 2314

UNION ALL SELECT 10000, 76, 156, 186, 1229, 9973, 4667

UNION ALL SELECT 50000, 76, 280, 360, 5133, 49999, 23575

UNION ALL SELECT 100000, 76, 466, 656, 9592, 99991, 47372

UNION ALL SELECT 500000, 76, 1250, 2576, 41538, 499979, 238678

UNION ALL SELECT 1000000, 93, 2203, 5516, 78498, 999983, 478361

UNION ALL SELECT 5000000, 110, 10156, 39280, 348513, 4999999, 2406213

UNION ALL SELECT 10000000, 93, 20030, 93403, 664579, 9999991, 4820081

UNION ALL SELECT 50000000, 123, 208930, 843080, 3001134, 49999991, 24197369

UNION ALL SELECT 100000000, 140, 466970, 1089933, 5761455, 99999989, 48461680

SELECT * FROM @Results

/* Results

MaxNumber CheckpointA CheckpointB Finish  NumberOfPrimes MaxPrime AverageOfPrimes Seconds     Minutes    Time

--------- ----------- ----------- ------- -------------- -------- --------------- ----------- ---------- -----

1000      60          93          106     168            997      453             0.106000    0.001766   0:00

5000      76          123         140     669            4999     2314            0.140000    0.002333   0:00

10000     76          156         186     1229           9973     4667            0.186000    0.003100   0:00

50000     76          280         360     5133           49999    23575           0.360000    0.006000   0:00

100000    76          466         656     9592           99991    47372           0.656000    0.010933   0:00

500000    76          1250        2576    41538          499979   238678          2.576000    0.042933   0:02

1000000   93          2203        5516    78498          999983   478361          5.516000    0.091933   0:05

5000000   110         10156       39280   348513         4999999  2406213         39.280000   0.654666   0:39

10000000  93          20030       93403   664579         9999991  4820081         93.403000   1.556716   1:33

50000000  123         208930      843080  3001134        49999991 24197369        843.080000  14.051333  14:03

100000000 140         466970      1089933 5761455        99999989 48461680        1089.933000 18.165550  18:09

*/

Ryan Randall

Solutions are easy. Understanding the problem, now, that's the hard part.

• Good stuff Ryan!

Did you compare the runtime against any of the other code posted on this thread?

• I have tested my code against Ryan's code on SQL Server 2000 and it seems to be equally fast (with the same output).

If the Numbers table is preconstructed, I guess my code is faster (because I need a larger Numbers table). On the other hand, if I am forced to use bigints (as Ryan does), my code will

be a bit slower.

I have used the following in my test:

/*************** MY CODE ******************/

set nocount on

go

declare @time datetime

select @time = getdate()

DECLARE @MaxNumber INT

SET @MaxNumber = 5000000

SET ROWCOUNT @MaxNumber

select identity(int, 1, 1) as Number into #Numbers

SET ROWCOUNT 0

alter table #Numbers add constraint PK_Numbers primary key clustered (Number)

create table #Primes(prime int primary key)

declare @i int, @limit int

set @i = 1

while @i*@i < @MaxNumber

begin

set @limit = (@i+1)*(@i+1)

if (@i+1)*(@i+1) > @MaxNumber

set @limit = @MaxNumber + 1

insert into #Primes

select n.Number

from #Numbers n

where

n.Number < @limit and n.Number > @i*@i

and not exists

(select * from #Primes p where p.prime < @i + 1 and n.Number % p.prime = 0)

set @i = @i + 1

end

--select * from #Primes

PRINT 'Finish: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

SELECT COUNT(*) AS 'Number of primes', MAX(prime) AS 'Max prime' FROM #Primes

drop table #Primes

go

drop table #Numbers

go

/******** RYAN'S CODE************************/

--Inputs

DECLARE @MaxNumber INT

SET @MaxNumber = 5000000

--Preparation

SET NOCOUNT ON

DECLARE @Time DATETIME

SELECT @Time = GETDATE()

--Preparation - Numbers table

DECLARE @SqrtMaxNumber INT

SET @SqrtMaxNumber = SQRT(@MaxNumber)

SET ROWCOUNT @SqrtMaxNumber

CREATE TABLE #Numbers (i BIGINT IDENTITY(1, 1) PRIMARY KEY CLUSTERED, j BIGINT, x bit)

INSERT INTO #Numbers (x) SELECT NULL FROM syscomments c1 CROSS JOIN syscomments c2

SET ROWCOUNT 0

UPDATE #Numbers SET j = i*i

PRINT 'Checkpoint A: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

--Preparation - Put candidate primes into a Primes table

-- (integers which have an odd number of representations by certain quadratic forms)

CREATE TABLE #Primes (i BIGINT PRIMARY KEY CLUSTERED)

INSERT #Primes

SELECT 2 UNION ALL SELECT 3 UNION ALL

SELECT k FROM (

SELECT k FROM (SELECT 4 * a.j + b.j AS k FROM #Numbers a, #Numbers b) c WHERE k <= @MaxNumber AND k % 12 IN (1, 5)

UNION ALL

SELECT k from (select 3 * a.j + b.j AS k FROM #Numbers a, #Numbers b) c WHERE k <= @MaxNumber AND k % 12 = 7

UNION ALL

SELECT k from (select 3 * a.j - b.j AS k FROM #Numbers a INNER JOIN #Numbers b ON a.i > b.i) c WHERE k <= @MaxNumber AND k % 12 = 11

) d GROUP BY k HAVING COUNT(*) IN (1, 3)

PRINT 'Checkpoint B: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

--Calculation - Eliminate composites by sieving

DECLARE @i BIGINT

SET @i = 5

WHILE @i * @i < @MaxNumber

BEGIN

DELETE #Primes WHERE i > @i and i % @i = 0

SELECT @i = min(i) FROM #Primes WHERE i > @i

--    PRINT @i

END

--Show results

PRINT 'Finish: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

SELECT COUNT(*) AS 'Number of primes', MAX(i) AS 'Max prime' FROM #Primes

--SELECT * FROM #Primes ORDER BY i

--Tidy up

DROP TABLE #Primes

DROP TABLE #Numbers

• Ryan, I modified your script to use INT temp tables, instead of BIGINT, and got almost an 18% reduction in runtime.  The following is the result of finding all primes below 100 million.

Using BIGINT for the data types doesn't seem to be needed unless you are going over 2 billion in your prime number search.

```Test with INT temp tables
Checkpoint A: 250 ms
Checkpoint B: 230573 ms
Finish: 652583 ms
Number of primes Max prime   Average of primes
---------------- ----------- --------------------
5761455          99999989    48461680```
```Test with BIGINT temp tables
Checkpoint A: 250 ms
Checkpoint B: 280590 ms
Finish: 791850 ms
Number of primes Max prime            Average of primes
---------------- -------------------- --------------------
5761455          99999989             48461680```
` `
` `

Viewing 15 posts - 1 through 15 (of 15 total)