Performance challenge

  • hreisterp (12/28/2010)


    The performance challenge is not to insert the data but to select the prime number. 🙂

    Heh... I know... but if it takes you 6 seconds to create only 60,000 numbers just to setup the Prime number challenge, there's definitely something wrong with your machine... or the code. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • hreisterp (12/28/2010)


    Thanks for your post. But that was not the challenge. Actually loading all prime number from a pre-computed list is MUCH faster (if it is not huge number you need you could simply define the list as code).

    Ummm... ok. If you already know that, then why did you ask the question on how to build it and why the spec on doing it in less than a second?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • hreisterp (12/28/2010)


    OK. Another try... 😉

    Used then outstanding LINQPad [/url]tool to do the same.

    int[] data=data=Enumerable.Range(2,65536).ToArray(); // learned this today 🙂

    var q = from s in data where !(from o in data where (o < s) && (s % o == 0) select 0).Any() select s;

    q.Dump(); // that is a LINQPad extension...

    3 seconds!

    That is almost twice as fast as running it from withing VCS 2010 Express. Hmmmhhhh ... that is a big surprise.

    And it outperformns SQL server by a factor of two... This needs further investigation.

    Heh... 3 seconds. Would you post the code that you're using for SQL Server to generate that "array" of numbers from 2 to 65536 so I can show you what you may be doing incorrectly?

    And considering the direction this thread has taken, I have to ask, what's you're angle? Is this actually SPAM for the LinqPad tool you seem so proud of? 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here's a very quick way to insert 60,000 sequential numbers into a table:

    CREATE FUNCTION [dbo].[Numbers]

    (

    @MaxNumber int

    )

    RETURNS TABLE

    AS

    RETURN

    (

    WITH

    L0 AS (SELECT 1 AS c UNION ALL SELECT 1),

    L1 AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),

    L2 AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),

    L3 AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),

    L4 AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),

    L5 AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),

    Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS n FROM L5)

    SELECT * FROM Nums WHERE n <= @MaxNumber)

    SELECT

    n

    INTO #Numbers

    FROM Numbers(60000)

    The code above will only run on SQL Server 2005 or later.

    The Numbers table-value function is copied from Chapter 6 of "Inside Microsoft SQL Server 2008: T-SQL Querying" by Itzik Ben-Gan et. al. (http://www.amazon.com/Inside-Microsoft-SQL-Server-2008/dp/0735626030)

    The insert took just milliseconds on my SQL Server VM Workstation.

  • Jeff Moden (12/28/2010)


    hreisterp (12/28/2010)


    OK. Another try... 😉

    Used then outstanding LINQPad [/url]tool to do the same.

    int[] data=data=Enumerable.Range(2,65536).ToArray(); // learned this today 🙂

    var q = from s in data where !(from o in data where (o < s) && (s % o == 0) select 0).Any() select s;

    q.Dump(); // that is a LINQPad extension...

    3 seconds!

    That is almost twice as fast as running it from withing VCS 2010 Express. Hmmmhhhh ... that is a big surprise.

    And it outperformns SQL server by a factor of two... This needs further investigation.

    Heh... 3 seconds. Would you post the code that you're using for SQL Server to generate that "array" of numbers from 2 to 65536 so I can show you what you may be doing incorrectly?

    And considering the direction this thread has taken, I have to ask, what's you're angle? Is this actually SPAM for the LinqPad tool you seem so proud of? 😉

    Hello,

    once again. The challenge is not to insert the data. The challenge is to select the prime numbers.

    And no - i'm not involved with LINQPad. Still it is an amazing tool.

  • hreisterp (12/28/2010)


    Jeff Moden (12/28/2010)


    hreisterp (12/28/2010)


    OK. Another try... 😉

    Used then outstanding LINQPad [/url]tool to do the same.

    int[] data=data=Enumerable.Range(2,65536).ToArray(); // learned this today 🙂

    var q = from s in data where !(from o in data where (o < s) && (s % o == 0) select 0).Any() select s;

    q.Dump(); // that is a LINQPad extension...

    3 seconds!

    That is almost twice as fast as running it from withing VCS 2010 Express. Hmmmhhhh ... that is a big surprise.

    And it outperformns SQL server by a factor of two... This needs further investigation.

    Heh... 3 seconds. Would you post the code that you're using for SQL Server to generate that "array" of numbers from 2 to 65536 so I can show you what you may be doing incorrectly?

    And considering the direction this thread has taken, I have to ask, what's you're angle? Is this actually SPAM for the LinqPad tool you seem so proud of? 😉

    Hello,

    once again. The challenge is not to insert the data. The challenge is to select the prime numbers.

    And no - i'm not involved with LINQPad. Still it is an amazing tool.

    You seem to have completely ignored the solution I posted before.

    Prime Number Table Function

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=69646

    I just ran the following code to test the final version of the F_TABLE_PRIME function using the "Sieve of Aitken" from that thread to get all prime number below 60,000, and it completed in 0.217 seconds.

    declare @st datetime

    declare @t table (number int not null primary key clustered)

    -- Get start time

    select @st = getdate()

    insert into @t

    select * from dbo.F_TABLE_PRIME(60000)

    -- Find elapsed time

    select getdate()-@st

    If this is really a "challenge" and not spam, why are you not looking at that solution and other solutions posed on that thread?

  • Seems the OP is focused for some reason on developing a faster primes finder that is NOT one of the known "... Sieves" methods. OP, may I ask WHY it has to be NEW code?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • hreisterp (12/28/2010)


    Jeff Moden (12/28/2010)


    hreisterp (12/28/2010)


    OK. Another try... 😉

    Used then outstanding LINQPad [/url]tool to do the same.

    int[] data=data=Enumerable.Range(2,65536).ToArray(); // learned this today 🙂

    var q = from s in data where !(from o in data where (o < s) && (s % o == 0) select 0).Any() select s;

    q.Dump(); // that is a LINQPad extension...

    3 seconds!

    That is almost twice as fast as running it from withing VCS 2010 Express. Hmmmhhhh ... that is a big surprise.

    And it outperformns SQL server by a factor of two... This needs further investigation.

    Heh... 3 seconds. Would you post the code that you're using for SQL Server to generate that "array" of numbers from 2 to 65536 so I can show you what you may be doing incorrectly?

    And considering the direction this thread has taken, I have to ask, what's you're angle? Is this actually SPAM for the LinqPad tool you seem so proud of? 😉

    Hello,

    once again. The challenge is not to insert the data. The challenge is to select the prime numbers.

    And no - i'm not involved with LINQPad. Still it is an amazing tool.

    Then why did you spend so much time on the performance of "insert the data"? 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hello,

    sorry. I missed the link. Now i read it. Quite interesting what you guys did there. But this all ends up in implementing a welln known algorithm in T-SQL imperative language.

    Actually if using an imperative language i'd be using C# or C++ and simply load the output into a table. Or if you want implement a SP or function using CLR in SQL Server.

    But your T-SQL implementation is also very beautiful. But i doubt that it can compete with a C# program due the interpreted nature of the language.

    But what I wanted to test is: how can we do it in plain SQL. The reason: this allows to compare DB's directly regarding the smartness of the engines.

    And that's what I achieved. And so far SQL Server was the best DB with LINQ being somewhat faster (but not massively).

    The real challenge is: why does SQL, which is actually (just like LINQ) an algorithm creator not automatically find an algorithm that can get as least close to the performance (O(...)) of a human made algorithm.

    If you take a little deeper look into the example you can find the reason: the SQL execution engines do NOT really understand the overall objective of the query. This is an indicator for two different kind of issues:

    1.) SQL does not contain enough language constructs to allow a developer to optimally express the objective of a query

    2.) SQL does have enough language constructs but the underlying engines are yet not smart enough to really understand the queries.

    Both mean: there is room for improvment! Improvement -> Challenge -> Opportunity -> Solution -> Success -> MAKE money 🙂

    Take care

    HJ

    Michael Valentine Jones (12/28/2010)


    hreisterp (12/28/2010)


    Jeff Moden (12/28/2010)


    hreisterp (12/28/2010)


    OK. Another try... 😉

    Used then outstanding LINQPad [/url]tool to do the same.

    int[] data=data=Enumerable.Range(2,65536).ToArray(); // learned this today 🙂

    var q = from s in data where !(from o in data where (o < s) && (s % o == 0) select 0).Any() select s;

    q.Dump(); // that is a LINQPad extension...

    3 seconds!

    That is almost twice as fast as running it from withing VCS 2010 Express. Hmmmhhhh ... that is a big surprise.

    And it outperformns SQL server by a factor of two... This needs further investigation.

    Heh... 3 seconds. Would you post the code that you're using for SQL Server to generate that "array" of numbers from 2 to 65536 so I can show you what you may be doing incorrectly?

    And considering the direction this thread has taken, I have to ask, what's you're angle? Is this actually SPAM for the LinqPad tool you seem so proud of? 😉

    Hello,

    once again. The challenge is not to insert the data. The challenge is to select the prime numbers.

    And no - i'm not involved with LINQPad. Still it is an amazing tool.

    You seem to have completely ignored the solution I posted before.

    Prime Number Table Function

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=69646

    I just ran the following code to test the final version of the F_TABLE_PRIME function using the "Sieve of Aitken" from that thread to get all prime number below 60,000, and it completed in 0.217 seconds.

    declare @st datetime

    declare @t table (number int not null primary key clustered)

    -- Get start time

    select @st = getdate()

    insert into @t

    select * from dbo.F_TABLE_PRIME(60000)

    -- Find elapsed time

    select getdate()-@st

    If this is really a "challenge" and not spam, why are you not looking at that solution and other solutions posed on that thread?

  • This has been discussed before. Read it here http://weblogs.sqlteam.com/davidm/archive/2003/10/30/412.aspx

  • hreisterp (12/29/2010)


    Hello,

    sorry. I missed the link. Now i read it. Quite interesting what you guys did there. But this all ends up in implementing a welln known algorithm in T-SQL imperative language.

    Actually if using an imperative language i'd be using C# or C++ and simply load the output into a table. Or if you want implement a SP or function using CLR in SQL Server.

    But your T-SQL implementation is also very beautiful. But i doubt that it can compete with a C# program due the interpreted nature of the language.

    But what I wanted to test is: how can we do it in plain SQL. The reason: this allows to compare DB's directly regarding the smartness of the engines.

    And that's what I achieved. And so far SQL Server was the best DB with LINQ being somewhat faster (but not massively).

    The real challenge is: why does SQL, which is actually (just like LINQ) an algorithm creator not automatically find an algorithm that can get as least close to the performance (O(...)) of a human made algorithm.

    If you take a little deeper look into the example you can find the reason: the SQL execution engines do NOT really understand the overall objective of the query. This is an indicator for two different kind of issues:

    1.) SQL does not contain enough language constructs to allow a developer to optimally express the objective of a query

    2.) SQL does have enough language constructs but the underlying engines are yet not smart enough to really understand the queries.

    Both mean: there is room for improvment! Improvement -> Challenge -> Opportunity -> Solution -> Success -> MAKE money 🙂

    Take care

    HJ

    That is so far from what you said the objective was in your first post, that I have to conclude that what you are really doing is some strange form of spam.

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply