Finding Primes

  • Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/kKellenberger/2782.asp

    Aunt Kathi Data Platform MVP
    Author of Expert T-SQL Window Functions
    Simple-Talk Editor

  • Finding primes is something I worked on about 6 months ago (for fun, of course ). Here's a link to the work...

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=69646

    It uses a sieve of Atkin approach, and in my tests back then, the function could find all primes below 100,000 in 0.5 seconds.

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Cool.  Looks like there are a lot of ways to solve this puzzle. 

    Aunt Kathi Data Platform MVP
    Author of Expert T-SQL Window Functions
    Simple-Talk Editor

  • By checking the square root outside of the select statement, you will save the processor a number of extra calculations.  I modified it to be this:

     SELECT @NextIntSqrt = SQRT(@NextInt)

     IF NOT EXISTS (SELECT Prime FROM Primes WHERE @NextIntSqrt >= Prime AND @NextInt % Prime = 0)

    Cuts down the calculation time significantly.  I also adjusted the order of the filters in the WHERE clause.

  • Good idea.

    Aunt Kathi Data Platform MVP
    Author of Expert T-SQL Window Functions
    Simple-Talk Editor

  • Fun stuff.

    I tried to do something similar in VB several years ago...the trick was that I was trying to solve for primes bigger than any of the data types. I ran out of time before I solved it...but it sure was a fun exercise!

  • Kathi

    In the book you mentioned, Chris has Aspbergers Syndrome (rather than being autistic). There are many similarities, but people with Aspbergers for the most part can lead relatively 'normal' lives. There are varying degrees of both of course, and it's sometimes hard to make a clear distinction where one ends and the other begins.

    Regards,

    MItch

  • Hi Ryan,

    What did you come up with for the largest prime number found and how many prime numbers?   I wanna make sure I'm doing it right...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • My suggestion, on our SQL Server 2000 installation it finds all primes less than 5000000 in 54 seconds. But of course it depends on the hardware...

     

     

    set nocount on

    go

    select top 5000000 identity(int, 1, 1) as Number into #Numbers

    from syscomments c1 cross join syscomments c2

    go

    alter table #Numbers add constraint PK_Numbers primary key clustered (Number)

    go

    create table #Primes(prime int primary key)

    go

    declare @time datetime

    select @time = getdate()

    declare @i int

    set @i = 1

    while @i*@i < 5000000

    begin

     insert into #Primes

     select n.Number

     from #Numbers n

     where

     n.Number < (@i+1)*(@i+1) and n.Number > @i*@i

     and not exists

     (

      select * from #Primes p

      where p.prime < @i + 1 and n.Number % p.prime = 0

    &nbsp

     set @i = @i + 1

    end

    --select * from #Primes

    select datediff(ms, @time, getdate())

    drop table #Primes

    go

    drop table #Numbers

    go

     

  • Hi Jeff,

    Here's some code so you can run it yourself. I'll add some run results in the next post.

    --Inputs

    DECLARE @MaxNumber INT

    SET @MaxNumber = 5000000

    --Preparation

    SET NOCOUNT ON

    DECLARE @Time DATETIME

    SELECT @Time = GETDATE()

    --Preparation - Numbers table

    DECLARE @SqrtMaxNumber INT

    SET @SqrtMaxNumber = SQRT(@MaxNumber)

    SET ROWCOUNT @SqrtMaxNumber

    CREATE TABLE #Numbers (i BIGINT IDENTITY(1, 1) PRIMARY KEY CLUSTERED, j BIGINT, x bit)

    INSERT INTO #Numbers (x) SELECT NULL FROM syscomments c1, syscomments c2, syscomments c3, syscomments c4

    SET ROWCOUNT 0

    UPDATE #Numbers SET j = i*i

    PRINT 'Checkpoint A: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

    --Preparation - Put candidate primes into a Primes table

    -- (integers which have an odd number of representations by certain quadratic forms)

    CREATE TABLE #Primes (i BIGINT PRIMARY KEY CLUSTERED)

    INSERT #Primes

    SELECT 2 UNION ALL SELECT 3 UNION ALL

    SELECT k FROM (

        SELECT k FROM (SELECT 4 * a.j + b.j AS k FROM #Numbers a, #Numbers b) c WHERE k <= @MaxNumber AND k % 12 IN (1, 5)

        UNION ALL

        SELECT k from (select 3 * a.j + b.j AS k FROM #Numbers a, #Numbers b) c WHERE k <= @MaxNumber AND k % 12 = 7

        UNION ALL

        SELECT k from (select 3 * a.j - b.j AS k FROM #Numbers a INNER JOIN #Numbers b ON a.i > b.i) c WHERE k <= @MaxNumber AND k % 12 = 11

    ) d GROUP BY k HAVING COUNT(*) IN (1, 3)

    PRINT 'Checkpoint B: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

    --Calculation - Eliminate composites by sieving

    DECLARE @i BIGINT

    SET @i = 5

    WHILE @i * @i < @MaxNumber

    BEGIN

        DELETE #Primes WHERE i > @i and i % @i = 0

        SELECT @i = min(i) FROM #Primes WHERE i > @i

        PRINT @i

    END

    --Show results

    PRINT 'Finish: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

    SELECT COUNT(*) AS 'Number of primes', MAX(i) AS 'Max prime', AVG(i) AS 'Average of primes' FROM #Primes

    SELECT * FROM #Primes ORDER BY i

    --Tidy up

    DROP TABLE #Primes

    DROP TABLE #Numbers

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Here's a selection of results when I run the above code on a server here.

     

    39 seconds to find all primes below 5 million, and 18 minutes to find all primes below 100 million.

     

     

    DECLARE @Results TABLE (MaxNumber INT, CheckpointA INT, CheckpointB INT, Finish INT, NumberOfPrimes INT,

        MaxPrime INT, AverageOfPrimes INT,

        Seconds AS CAST(Finish AS DECIMAL) / 1000, Minutes AS CAST(Finish AS DECIMAL) / 60000,

        Time AS CAST(Finish / 60000 AS VARCHAR(5)) + ':' + RIGHT('0' + CAST(Finish / 1000 % 60 AS VARCHAR(2)), 2))

     

    INSERT @Results

              SELECT 1000, 60, 93, 106, 168, 997, 453

    UNION ALL SELECT 5000, 76, 123, 140, 669, 4999, 2314

    UNION ALL SELECT 10000, 76, 156, 186, 1229, 9973, 4667

    UNION ALL SELECT 50000, 76, 280, 360, 5133, 49999, 23575

    UNION ALL SELECT 100000, 76, 466, 656, 9592, 99991, 47372

    UNION ALL SELECT 500000, 76, 1250, 2576, 41538, 499979, 238678

    UNION ALL SELECT 1000000, 93, 2203, 5516, 78498, 999983, 478361

    UNION ALL SELECT 5000000, 110, 10156, 39280, 348513, 4999999, 2406213

    UNION ALL SELECT 10000000, 93, 20030, 93403, 664579, 9999991, 4820081

    UNION ALL SELECT 50000000, 123, 208930, 843080, 3001134, 49999991, 24197369

    UNION ALL SELECT 100000000, 140, 466970, 1089933, 5761455, 99999989, 48461680

     

    SELECT * FROM @Results

     

    /* Results

    MaxNumber CheckpointA CheckpointB Finish  NumberOfPrimes MaxPrime AverageOfPrimes Seconds     Minutes    Time

    --------- ----------- ----------- ------- -------------- -------- --------------- ----------- ---------- -----

    1000      60          93          106     168            997      453             0.106000    0.001766   0:00

    5000      76          123         140     669            4999     2314            0.140000    0.002333   0:00

    10000     76          156         186     1229           9973     4667            0.186000    0.003100   0:00

    50000     76          280         360     5133           49999    23575           0.360000    0.006000   0:00

    100000    76          466         656     9592           99991    47372           0.656000    0.010933   0:00

    500000    76          1250        2576    41538          499979   238678          2.576000    0.042933   0:02

    1000000   93          2203        5516    78498          999983   478361          5.516000    0.091933   0:05

    5000000   110         10156       39280   348513         4999999  2406213         39.280000   0.654666   0:39

    10000000  93          20030       93403   664579         9999991  4820081         93.403000   1.556716   1:33

    50000000  123         208930      843080  3001134        49999991 24197369        843.080000  14.051333  14:03

    100000000 140         466970      1089933 5761455        99999989 48461680        1089.933000 18.165550  18:09

    */

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Good stuff Ryan!

    Did you compare the runtime against any of the other code posted on this thread?

     

  • I have tested my code against Ryan's code on SQL Server 2000 and it seems to be equally fast (with the same output).

    If the Numbers table is preconstructed, I guess my code is faster (because I need a larger Numbers table). On the other hand, if I am forced to use bigints (as Ryan does), my code will

    be a bit slower.

    I have used the following in my test:

     

    /*************** MY CODE ******************/

    set nocount on

    go

    declare @time datetime

    select @time = getdate()

    DECLARE @MaxNumber INT

    SET @MaxNumber = 5000000

    SET ROWCOUNT @MaxNumber

    select identity(int, 1, 1) as Number into #Numbers

    from syscomments c1 cross join syscomments c2 cross join syscomments c3

    SET ROWCOUNT 0

    alter table #Numbers add constraint PK_Numbers primary key clustered (Number)

    create table #Primes(prime int primary key)

    declare @i int, @limit int

    set @i = 1

    while @i*@i < @MaxNumber

    begin

      set @limit = (@i+1)*(@i+1)

      if (@i+1)*(@i+1) > @MaxNumber

        set @limit = @MaxNumber + 1

      insert into #Primes

      select n.Number

      from #Numbers n

      where

      n.Number < @limit and n.Number > @i*@i

      and not exists

      (select * from #Primes p where p.prime < @i + 1 and n.Number % p.prime = 0)

      set @i = @i + 1

    end

    --select * from #Primes

    PRINT 'Finish: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

    SELECT COUNT(*) AS 'Number of primes', MAX(prime) AS 'Max prime' FROM #Primes

    drop table #Primes

    go

    drop table #Numbers

    go

     

     

    /******** RYAN'S CODE************************/

     

    --Inputs

    DECLARE @MaxNumber INT

    SET @MaxNumber = 5000000

    --Preparation

    SET NOCOUNT ON

    DECLARE @Time DATETIME

    SELECT @Time = GETDATE()

    --Preparation - Numbers table

    DECLARE @SqrtMaxNumber INT

    SET @SqrtMaxNumber = SQRT(@MaxNumber)

    SET ROWCOUNT @SqrtMaxNumber

    CREATE TABLE #Numbers (i BIGINT IDENTITY(1, 1) PRIMARY KEY CLUSTERED, j BIGINT, x bit)

    INSERT INTO #Numbers (x) SELECT NULL FROM syscomments c1 CROSS JOIN syscomments c2

    SET ROWCOUNT 0

    UPDATE #Numbers SET j = i*i

    PRINT 'Checkpoint A: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

    --Preparation - Put candidate primes into a Primes table

    -- (integers which have an odd number of representations by certain quadratic forms)

    CREATE TABLE #Primes (i BIGINT PRIMARY KEY CLUSTERED)

    INSERT #Primes

    SELECT 2 UNION ALL SELECT 3 UNION ALL

    SELECT k FROM (

        SELECT k FROM (SELECT 4 * a.j + b.j AS k FROM #Numbers a, #Numbers b) c WHERE k <= @MaxNumber AND k % 12 IN (1, 5)

        UNION ALL

        SELECT k from (select 3 * a.j + b.j AS k FROM #Numbers a, #Numbers b) c WHERE k <= @MaxNumber AND k % 12 = 7

        UNION ALL

        SELECT k from (select 3 * a.j - b.j AS k FROM #Numbers a INNER JOIN #Numbers b ON a.i > b.i) c WHERE k <= @MaxNumber AND k % 12 = 11

    ) d GROUP BY k HAVING COUNT(*) IN (1, 3)

    PRINT 'Checkpoint B: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

    --Calculation - Eliminate composites by sieving

    DECLARE @i BIGINT

    SET @i = 5

    WHILE @i * @i < @MaxNumber

    BEGIN

        DELETE #Primes WHERE i > @i and i % @i = 0

        SELECT @i = min(i) FROM #Primes WHERE i > @i

    --    PRINT @i

    END

    --Show results

    PRINT 'Finish: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VARCHAR(10)) + ' ms'

    SELECT COUNT(*) AS 'Number of primes', MAX(i) AS 'Max prime' FROM #Primes

    --SELECT * FROM #Primes ORDER BY i

    --Tidy up

    DROP TABLE #Primes

    DROP TABLE #Numbers

     

  • Ryan, I modified your script to use INT temp tables, instead of BIGINT, and got almost an 18% reduction in runtime.  The following is the result of finding all primes below 100 million.

    Using BIGINT for the data types doesn't seem to be needed unless you are going over 2 billion in your prime number search.

    Test with INT temp tables
    Checkpoint A: 250 ms
    Checkpoint B: 230573 ms
    Finish: 652583 ms
    Number of primes Max prime   Average of primes    
    ---------------- ----------- -------------------- 
    5761455          99999989    48461680
    Test with BIGINT temp tables
    Checkpoint A: 250 ms
    Checkpoint B: 280590 ms
    Finish: 791850 ms
    Number of primes Max prime            Average of primes    
    ---------------- -------------------- -------------------- 
    5761455          99999989             48461680
     
     

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply