Calculating Prime Numbers With One Query

  • Comments posted to this topic are about the item Calculating Prime Numbers With One Query

  • My question would be... why do you use a resource intensive, slow as a While loop recursive CTE as your sequence generator?  There are many better methods.  See the following article.

    https://www.sqlservercentral.com/articles/hidden-rbar-counting-with-recursive-ctes

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I wrote this SQL Server table valued function a few years ago. It returns all the prime numbers less than 1 million in about 1 second.

    Fast Table Valued Function to return all prime numbers within a range

     

  • Calculate these once and store them in a table - look them up when needed :)_

  • For exhaustive methods it's quicker to only look at odd numbers.  There was a Youtube video a little while ago where they compared all different programming languages to determine which is the "fastest" at generating primes.  SQL was pretty much in last place

    https://www.youtube.com/watch?v=pSvSXBorw4A&t=281s

    The overall leaders (java, c++, rust, etc) are like 5k primes/sec.  I came up with this code to mess around and find out.  Iirc it was around 300/sec versus the contest entry (which someone did in SQLlite) was around 30/sec.  10x better but still nearly last.  Interpreted languages are inherently slower.  The generated counts are off by 1 (vs known counts) because the query count omits 2, the only even prime

    /*
    insert into known_prime_counts values
    (10,4),
    (100,25),
    (1000,168),
    (10000,1229),
    (100000,9592),
    (1000000,78498),
    (10000000,664579),
    (100000000,5761455)
    ;
    */

    declare @n int=10000;

    select count(*)
    --select fn.N*2+1 /*use this to return the prime numbers except for 2*/
    from dbo.fnTally(1, @n/2-1) fn
    where not exists(select 1
    from dbo.fnTally(1, fn.N/2) ffn
    where (fn.N*2+1)%(ffn.N*2+1)=0
    and (fn.N*2+1)<>(ffn.N*2+1));

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Steve Collins wrote:

    For exhaustive methods it's quicker to only look at odd numbers.  There was a Youtube video a little while ago where they compared all different programming languages to determine which is the "fastest" at generating primes.  SQL was pretty much in last place

    https://www.youtube.com/watch?v=pSvSXBorw4A&t=281s

    The overall leaders (java, c++, rust, etc) are like 5k primes/sec.  I came up with this code to mess around and find out.  Iirc it was around 300/sec versus the contest entry (which someone did in SQLlite) was around 30/sec.  10x better but still nearly last.  Interpreted languages are inherently slower.  The generated counts are off by 1 (vs known counts) because the query count omits 2, the only even prime

    /*
    insert into known_prime_counts values
    (10,4),
    (100,25),
    (1000,168),
    (10000,1229),
    (100000,9592),
    (1000000,78498),
    (10000000,664579),
    (100000000,5761455)
    ;
    */

    declare @n int=10000;

    select count(*)
    --select fn.N*2+1 /*use this to return the prime numbers except for 2*/
    from dbo.fnTally(1, @n/2-1) fn
    where not exists(select 1
    from dbo.fnTally(1, fn.N/2) ffn
    where (fn.N*2+1)%(ffn.N*2+1)=0
    and (fn.N*2+1)<>(ffn.N*2+1));

    You only need to test up to the square root of a number to test for primality, which is quite a saving:

    declare @n      int=100000;

    select fn.N*2+1 /*use this to return the prime numbers except for 2*/
    from dbo.fnTally(1, @n/2-1) fn
    where not exists(select 1
    from dbo.fnTally(1, sqrt(fn.N/2)) ffn
    where (fn.N*2+1)%(ffn.N*2+1)=0
    and (fn.N*2+1)<>(ffn.N*2+1));
  • Oh you're right.  This is not the final version but I'm not finding my notes on this.   Iirc if you follow the links it leads to a SQLlite script with a WHILE loop

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Jonathan AC Roberts wrote:

    You only need to test up to the square root of a number to test for primality, which is quite a saving:

    declare @n      int=100000;

    select fn.N*2+1 /*use this to return the prime numbers except for 2*/
    from dbo.fnTally(1, @n/2-1) fn
    where not exists(select 1
    from dbo.fnTally(1, sqrt(fn.N/2)) ffn
    where (fn.N*2+1)%(ffn.N*2+1)=0
    and (fn.N*2+1)<>(ffn.N*2+1));

    I did some (limited) testing using the script in the article and working up to the square root was marginally slower than working up to half the value - even though it cuts out a large proportion of the divisors. Maybe the SQRT function is slower than a simple division.

  • Chris Wooding wrote:

    Jonathan AC Roberts wrote:

    You only need to test up to the square root of a number to test for primality, which is quite a saving:

    declare @n      int=100000;

    select fn.N*2+1 /*use this to return the prime numbers except for 2*/
    from dbo.fnTally(1, @n/2-1) fn
    where not exists(select 1
    from dbo.fnTally(1, sqrt(fn.N/2)) ffn
    where (fn.N*2+1)%(ffn.N*2+1)=0
    and (fn.N*2+1)<>(ffn.N*2+1));

    I did some (limited) testing using the script in the article and working up to the square root was marginally slower than working up to half the value - even though it cuts out a large proportion of the divisors. Maybe the SQRT function is slower than a simple division.

    Really?! Can you provide your test script?

  • Jonathan AC Roberts wrote:

    Chris Wooding wrote:

    Jonathan AC Roberts wrote:

    You only need to test up to the square root of a number to test for primality, which is quite a saving:

    declare @n      int=100000;

    select fn.N*2+1 /*use this to return the prime numbers except for 2*/
    from dbo.fnTally(1, @n/2-1) fn
    where not exists(select 1
    from dbo.fnTally(1, sqrt(fn.N/2)) ffn
    where (fn.N*2+1)%(ffn.N*2+1)=0
    and (fn.N*2+1)<>(ffn.N*2+1));

    I did some (limited) testing using the script in the article and working up to the square root was marginally slower than working up to half the value - even though it cuts out a large proportion of the divisors. Maybe the SQRT function is slower than a simple division.

    Really?! Can you provide your test script?

    It's the one from the article with SQRT(T1.C0) replacing T1.C0 / 2.

    WITH NUMBER_SEQUENCE(C0) AS
    (
    SELECT 2 AS C0
    UNION ALL
    SELECT C0 + 1 AS C0 FROM NUMBER_SEQUENCE WHERE NUMBER_SEQUENCE.C0 < 100000
    )
    SELECT C0
    FROM (
    SELECT C0, CASE WHEN (REST <> 0 OR C0 = 2 OR C0 = 3) THEN 0 ELSE 1 END AS REST_CONDITION
    FROM (
    SELECT T.C0 AS C0, T.C1, T.C0 % T.C1 AS REST
    FROM (
    SELECT T1.C0 AS C0, T0.C0 AS C1
    FROM NUMBER_SEQUENCE AS T0, NUMBER_SEQUENCE AS T1
    WHERE SQRT(T1.[C0]) >= T0.C0 OR T1.C0 = 2 OR T1.C0 = 3
    ) AS T
    ) AS T
    ) AS T
    GROUP BY C0
    HAVING SUM(REST_CONDITION) = 0
    OPTION (MAXRECURSION 0);
  • Jonathan AC Roberts wrote:

    You only need to test up to the square root of a number to test for primality, which is quite a saving:

    declare @n      int=100000;

    select fn.N*2+1 /*use this to return the prime numbers except for 2*/
    from dbo.fnTally(1, @n/2-1) fn
    where not exists(select 1
    from dbo.fnTally(1, sqrt(fn.N/2)) ffn
    where (fn.N*2+1)%(ffn.N*2+1)=0
    and (fn.N*2+1)<>(ffn.N*2+1));

    That's a nice improvement.  It still seems tho when it scales to bigger search counts it slows down disproportionately

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • I think the script in the article is a bit of a non-starter.

    It is calculating the square root multiple times for each row.

  • Steve Collins wrote:

    Jonathan AC Roberts wrote:

    You only need to test up to the square root of a number to test for primality, which is quite a saving:

    declare @n      int=100000;

    select fn.N*2+1 /*use this to return the prime numbers except for 2*/
    from dbo.fnTally(1, @n/2-1) fn
    where not exists(select 1
    from dbo.fnTally(1, sqrt(fn.N/2)) ffn
    where (fn.N*2+1)%(ffn.N*2+1)=0
    and (fn.N*2+1)<>(ffn.N*2+1));

    That's a nice improvement.  It still seems tho when it scales to bigger search counts it slows down disproportionately

    Each number you test for primality needs O(sqrt(n)*n) calculations, so it will slow down for larger numbers. Also, if you increase @n to 10 times its previous value each run you would expect the elapsed time to go up by a factor of over 30.

  • Peter "Peso" Larsson wrote a nasty fast Prime Numbers function years ago and I tested it to confirm it.  I'll see if I can find it in my archives this week if someone else doesn't find it first.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    Peter "Peso" Larsson wrote a nasty fast Prime Numbers function years ago and I tested it to confirm it.  I'll see if I can find it in my archives this week if someone else doesn't find it first.

    I think this must be it, it is in the comments on this page https://www.red-gate.com/simple-talk/databases/sql-server/t-sql-programming-sql-server/celkos-summer-sql-stumpers-prime-numbers/

    /*
    Sieve Of Atkin implementation
    Peter Larsson
    Ryan Randall
    */

    -- Initialize user supplied parameter
    DECLARE @MaxPrime INT

    SET @MaxPrime = 1000000

    -- Prepare helper variables
    DECLARE @MaxNumber INT

    SET @MaxNumber = SQRT(@MaxPrime)

    -- Create staging tables
    CREATE TABLE #Primes
    (
    Prime INT NOT NULL,
    Prime2 BIGINT NOT NULL
    )

    CREATE TABLE #Sequence
    (
    Seq INT NOT NULL,
    Seq2 BIGINT NOT NULL
    )

    CREATE UNIQUE CLUSTERED INDEX IX_Sequence ON #Sequence (Seq, Seq2)

    INSERT #Sequence
    (
    Seq,
    Seq2
    )
    SELECT n,
    CAST(n AS BIGINT) * CAST(n AS BIGINT)
    FROM (
    SELECT 1
    + 1000 * COALESCE(d.Digit, 0)
    + 100 * COALESCE(c.Digit, 0)
    + 10 * COALESCE(b.Digit, 0)
    + COALESCE(a.Digit, 0) AS n
    FROM (
    SELECT 0 AS Digit UNION ALL
    SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
    SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
    SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ) AS a
    LEFT JOIN (
    SELECT 0 AS Digit UNION ALL
    SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
    SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
    SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ) AS b ON @MaxNumber > 10
    AND 10 * b.Digit < @MaxNumber
    LEFT JOIN (
    SELECT 0 AS Digit UNION ALL
    SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
    SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
    SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ) AS c ON @MaxNumber > 100
    AND 100 * c.Digit < @MaxNumber
    LEFT JOIN (
    SELECT 0 AS Digit UNION ALL
    SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
    SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
    SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ) AS d ON @MaxNumber > 1000
    AND 1000 * d.Digit < @MaxNumber
    WHERE a.Digit < @MaxNumber
    ) AS q
    WHERE n <= @MaxNumber
    ORDER BY n

    INSERT #Primes
    (
    Prime,
    Prime2
    )
    SELECT 2 AS k, 4 UNION ALL
    SELECT 3, 9 UNION ALL
    SELECT 5, 25 UNION ALL
    SELECT k, CAST(k AS BIGINT) * CAST(k AS BIGINT)
    FROM (
    SELECT k
    FROM (
    SELECT 4 * a.Seq2 + b.Seq2 AS k
    FROM #Sequence AS a
    CROSS JOIN #Sequence AS b
    ) AS c
    WHERE k <= @MaxPrime
    AND k % 12 IN (1, 5)

    UNION ALL

    SELECT k
    FROM (
    SELECT 3 * a.Seq2 + b.Seq2 AS k
    FROM #Sequence AS a
    CROSS JOIN #Sequence AS b
    ) AS c
    WHERE k <= @MaxPrime
    AND k % 12 = 7

    UNION ALL

    SELECT k
    FROM (
    SELECT 3 * a.Seq2 - b.Seq2 AS k
    FROM #Sequence AS a
    INNER JOIN #Sequence AS b ON b.Seq < a.Seq
    ) AS c
    WHERE k <= @MaxPrime
    AND k % 12 = 11
    ) AS d
    WHERE k % 10 IN (1, 3, 7, 9)
    GROUP BY k
    HAVING COUNT(k) IN (1, 3)
    --ORDER BY k

    CREATE UNIQUE CLUSTERED INDEX IX_Primes ON #Primes (Prime2, Prime)

    SELECT p.Prime
    FROM #Primes AS p
    WHERE NOT EXISTS (
    SELECT 1
    FROM #Primes AS x
    WHERE x.Prime2 <= p.Prime
    AND p.Prime % x.Prime = 0
    )
    AND p.Prime <= @MaxPrime

    DROP TABLE #Primes,
    #Sequence

Viewing 15 posts - 1 through 15 (of 77 total)

You must be logged in to reply to this topic. Login to reply