Calculating Prime Numbers With One Query

Question

Calculating Prime Numbers With One Query

wfw311

SSC Veteran

Points: 232
More actions
June 14, 2023 at 12:00 am

#4196886

Comments posted to this topic are about the item Calculating Prime Numbers With One Query

Viewing 15 posts - 1 through 15 (of 77 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1003921 More actions · Answer 1

My question would be... why do you use a resource intensive, slow as a While loop recursive CTE as your sequence generator? There are many better methods. See the following article.

https://www.sqlservercentral.com/articles/hidden-rbar-counting-with-recursive-ctes

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jonathan AC Roberts SSCoach Points: 18272 More actions · Answer 2

I wrote this SQL Server table valued function a few years ago. It returns all the prime numbers less than 1 million in about 1 second.

Fast Table Valued Function to return all prime numbers within a range

Martin Catherall Right there with Babe Points: 746 More actions · Answer 3

Calculate these once and store them in a table - look them up when needed :)_

Steve Collins SSCrazy Points: 2082 More actions · Answer 4

For exhaustive methods it's quicker to only look at odd numbers. There was a Youtube video a little while ago where they compared all different programming languages to determine which is the "fastest" at generating primes. SQL was pretty much in last place

https://www.youtube.com/watch?v=pSvSXBorw4A&t=281s

The overall leaders (java, c++, rust, etc) are like 5k primes/sec. I came up with this code to mess around and find out. Iirc it was around 300/sec versus the contest entry (which someone did in SQLlite) was around 30/sec. 10x better but still nearly last. Interpreted languages are inherently slower. The generated counts are off by 1 (vs known counts) because the query count omits 2, the only even prime

/*
 insert into known_prime_counts values 
    (10,4),
    (100,25),
    (1000,168),
    (10000,1229),
    (100000,9592),
    (1000000,78498),
    (10000000,664579),
    (100000000,5761455)
 ;
*/

declare @n      int=10000;

select count(*)
--select fn.N*2+1 /*use this to return the prime numbers except for 2*/
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
                 from dbo.fnTally(1, fn.N/2) ffn
                 where (fn.N*2+1)%(ffn.N*2+1)=0
                        and (fn.N*2+1)<>(ffn.N*2+1));

Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

Jonathan AC Roberts SSCoach Points: 18272 More actions · Answer 5

Steve Collins wrote:

For exhaustive methods it's quicker to only look at odd numbers. There was a Youtube video a little while ago where they compared all different programming languages to determine which is the "fastest" at generating primes. SQL was pretty much in last place
https://www.youtube.com/watch?v=pSvSXBorw4A&t=281s
The overall leaders (java, c++, rust, etc) are like 5k primes/sec. I came up with this code to mess around and find out. Iirc it was around 300/sec versus the contest entry (which someone did in SQLlite) was around 30/sec. 10x better but still nearly last. Interpreted languages are inherently slower. The generated counts are off by 1 (vs known counts) because the query count omits 2, the only even prime
/*
 insert into known_prime_counts values 
    (10,4),
    (100,25),
    (1000,168),
    (10000,1229),
    (100000,9592),
    (1000000,78498),
    (10000000,664579),
    (100000000,5761455)
 ;
*/

declare @n      int=10000;

select count(*)
--select fn.N*2+1 /*use this to return the prime numbers except for 2*/
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
                 from dbo.fnTally(1, fn.N/2) ffn
                 where (fn.N*2+1)%(ffn.N*2+1)=0
                        and (fn.N*2+1)<>(ffn.N*2+1));

You only need to test up to the square root of a number to test for primality, which is quite a saving:

declare @n      int=100000;

select fn.N*2+1 /*use this to return the prime numbers except for 2*/
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
                 from dbo.fnTally(1, sqrt(fn.N/2)) ffn
                 where (fn.N*2+1)%(ffn.N*2+1)=0
                        and (fn.N*2+1)<>(ffn.N*2+1));

Steve Collins SSCrazy Points: 2082 More actions · Answer 6

Oh you're right. This is not the final version but I'm not finding my notes on this. Iirc if you follow the links it leads to a SQLlite script with a WHILE loop

Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

Chris Wooding SSCarpal Tunnel Points: 4530 More actions · Answer 7

Jonathan AC Roberts wrote:

You only need to test up to the square root of a number to test for primality, which is quite a saving:

declare @n      int=100000;

select fn.N*2+1 /*use this to return the prime numbers except for 2*/
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
                 from dbo.fnTally(1, sqrt(fn.N/2)) ffn
                 where (fn.N*2+1)%(ffn.N*2+1)=0
                        and (fn.N*2+1)<>(ffn.N*2+1));

I did some (limited) testing using the script in the article and working up to the square root was marginally slower than working up to half the value - even though it cuts out a large proportion of the divisors. Maybe the SQRT function is slower than a simple division.

Jonathan AC Roberts SSCoach Points: 18272 More actions · Answer 8

Chris Wooding wrote:

Jonathan AC Roberts wrote:
You only need to test up to the square root of a number to test for primality, which is quite a saving:
declare @n      int=100000;

select fn.N*2+1 /*use this to return the prime numbers except for 2*/
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
                 from dbo.fnTally(1, sqrt(fn.N/2)) ffn
                 where (fn.N*2+1)%(ffn.N*2+1)=0
                        and (fn.N*2+1)<>(ffn.N*2+1));
I did some (limited) testing using the script in the article and working up to the square root was marginally slower than working up to half the value - even though it cuts out a large proportion of the divisors. Maybe the SQRT function is slower than a simple division.

Really?! Can you provide your test script?

Chris Wooding SSCarpal Tunnel Points: 4530 More actions · Answer 9

Jonathan AC Roberts wrote:

Chris Wooding wrote:
Jonathan AC Roberts wrote:
You only need to test up to the square root of a number to test for primality, which is quite a saving:
declare @n      int=100000;

select fn.N*2+1 /*use this to return the prime numbers except for 2*/
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
                 from dbo.fnTally(1, sqrt(fn.N/2)) ffn
                 where (fn.N*2+1)%(ffn.N*2+1)=0
                        and (fn.N*2+1)<>(ffn.N*2+1));
I did some (limited) testing using the script in the article and working up to the square root was marginally slower than working up to half the value - even though it cuts out a large proportion of the divisors. Maybe the SQRT function is slower than a simple division.
Really?! Can you provide your test script?

It's the one from the article with SQRT(T1.C0) replacing T1.C0 / 2.

WITH NUMBER_SEQUENCE(C0) AS
(
SELECT 2 AS C0
UNION ALL
SELECT C0 + 1 AS C0 FROM NUMBER_SEQUENCE WHERE NUMBER_SEQUENCE.C0 < 100000
)
SELECT C0
FROM (
SELECT C0, CASE WHEN (REST <> 0 OR C0 = 2 OR C0 = 3) THEN 0 ELSE 1 END AS REST_CONDITION
FROM (
SELECT T.C0 AS C0, T.C1, T.C0 % T.C1 AS REST
FROM (
SELECT T1.C0 AS C0, T0.C0 AS C1
FROM NUMBER_SEQUENCE AS T0, NUMBER_SEQUENCE AS T1
WHERE SQRT(T1.[C0]) >= T0.C0 OR T1.C0 = 2 OR T1.C0 = 3
) AS T
) AS T
) AS T
GROUP BY C0
HAVING SUM(REST_CONDITION) = 0
OPTION (MAXRECURSION 0);

Steve Collins SSCrazy Points: 2082 More actions · Answer 10

Jonathan AC Roberts wrote:

You only need to test up to the square root of a number to test for primality, which is quite a saving:

declare @n      int=100000;

select fn.N*2+1 /*use this to return the prime numbers except for 2*/
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
                 from dbo.fnTally(1, sqrt(fn.N/2)) ffn
                 where (fn.N*2+1)%(ffn.N*2+1)=0
                        and (fn.N*2+1)<>(ffn.N*2+1));

That's a nice improvement. It still seems tho when it scales to bigger search counts it slows down disproportionately

Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

Jonathan AC Roberts SSCoach Points: 18272 More actions · Answer 11

I think the script in the article is a bit of a non-starter.

It is calculating the square root multiple times for each row.

Jonathan AC Roberts SSCoach Points: 18272 More actions · Answer 12

Steve Collins wrote:

Jonathan AC Roberts wrote:
You only need to test up to the square root of a number to test for primality, which is quite a saving:
declare @n      int=100000;

select fn.N*2+1 /*use this to return the prime numbers except for 2*/
from dbo.fnTally(1, @n/2-1) fn
where not exists(select 1
                 from dbo.fnTally(1, sqrt(fn.N/2)) ffn
                 where (fn.N*2+1)%(ffn.N*2+1)=0
                        and (fn.N*2+1)<>(ffn.N*2+1));
That's a nice improvement. It still seems tho when it scales to bigger search counts it slows down disproportionately

Each number you test for primality needs O(sqrt(n)*n) calculations, so it will slow down for larger numbers. Also, if you increase @n to 10 times its previous value each run you would expect the elapsed time to go up by a factor of over 30.

Jeff Moden SSC Guru Points: 1003921 More actions · Answer 13

Peter "Peso" Larsson wrote a nasty fast Prime Numbers function years ago and I tested it to confirm it. I'll see if I can find it in my archives this week if someone else doesn't find it first.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jonathan AC Roberts SSCoach Points: 18272 More actions · Answer 14

Jeff Moden wrote:

Peter "Peso" Larsson wrote a nasty fast Prime Numbers function years ago and I tested it to confirm it. I'll see if I can find it in my archives this week if someone else doesn't find it first.

I think this must be it, it is in the comments on this page https://www.red-gate.com/simple-talk/databases/sql-server/t-sql-programming-sql-server/celkos-summer-sql-stumpers-prime-numbers/

/*
Sieve Of Atkin implementation
Peter Larsson
Ryan Randall
*/

-- Initialize user supplied parameter
DECLARE @MaxPrime INT

SET @MaxPrime = 1000000

-- Prepare helper variables
DECLARE @MaxNumber INT

SET @MaxNumber = SQRT(@MaxPrime)

-- Create staging tables
CREATE TABLE #Primes
(
Prime INT NOT NULL,
Prime2 BIGINT NOT NULL
)

CREATE TABLE #Sequence
(
Seq INT NOT NULL,
Seq2 BIGINT NOT NULL
)

CREATE UNIQUE CLUSTERED INDEX IX_Sequence ON #Sequence (Seq, Seq2)

INSERT #Sequence
(
Seq,
Seq2
)
SELECT n,
CAST(n AS BIGINT) * CAST(n AS BIGINT)
FROM (
SELECT 1
+ 1000 * COALESCE(d.Digit, 0)
+ 100 * COALESCE(c.Digit, 0)
+ 10 * COALESCE(b.Digit, 0)
+ COALESCE(a.Digit, 0) AS n
FROM (
SELECT 0 AS Digit UNION ALL
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) AS a
LEFT JOIN (
SELECT 0 AS Digit UNION ALL
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) AS b ON @MaxNumber > 10
AND 10 * b.Digit < @MaxNumber
LEFT JOIN (
SELECT 0 AS Digit UNION ALL
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) AS c ON @MaxNumber > 100
AND 100 * c.Digit < @MaxNumber
LEFT JOIN (
SELECT 0 AS Digit UNION ALL
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) AS d ON @MaxNumber > 1000
AND 1000 * d.Digit < @MaxNumber
WHERE a.Digit < @MaxNumber
) AS q
WHERE n <= @MaxNumber
ORDER BY n

INSERT #Primes
(
Prime,
Prime2
)
SELECT 2 AS k, 4 UNION ALL
SELECT 3, 9 UNION ALL
SELECT 5, 25 UNION ALL
SELECT k, CAST(k AS BIGINT) * CAST(k AS BIGINT)
FROM (
SELECT k
FROM (
SELECT 4 * a.Seq2 + b.Seq2 AS k
FROM #Sequence AS a
CROSS JOIN #Sequence AS b
) AS c
WHERE k <= @MaxPrime
AND k % 12 IN (1, 5)

UNION ALL

SELECT k
FROM (
SELECT 3 * a.Seq2 + b.Seq2 AS k
FROM #Sequence AS a
CROSS JOIN #Sequence AS b
) AS c
WHERE k <= @MaxPrime
AND k % 12 = 7

UNION ALL

SELECT k
FROM (
SELECT 3 * a.Seq2 - b.Seq2 AS k
FROM #Sequence AS a
INNER JOIN #Sequence AS b ON b.Seq < a.Seq
) AS c
WHERE k <= @MaxPrime
AND k % 12 = 11
) AS d
WHERE k % 10 IN (1, 3, 7, 9)
GROUP BY k
HAVING COUNT(k) IN (1, 3)
--ORDER BY k

CREATE UNIQUE CLUSTERED INDEX IX_Primes ON #Primes (Prime2, Prime)

SELECT p.Prime
FROM #Primes AS p
WHERE NOT EXISTS (
SELECT 1
FROM #Primes AS x
WHERE x.Prime2 <= p.Prime
AND p.Prime % x.Prime = 0
)
AND p.Prime <= @MaxPrime

DROP TABLE #Primes, 
#Sequence