Set based iteration

  • OK, I'm trying and failing (to an extent) to get my head around set-based iteration.

    As a sometime dev who is used to loops (go VBScript go!) and more 'traditional' programming, I decided this was a topic that I really needed to learn something about.

    So I'm wondering if anyone can help with this example.

    Here's me creating a table and populating it with data using a loop-based method:

    CREATE TABLE fooTable (data VARCHAR(8000) )

    DECLARE @loopCounter INT

    SET @loopCounter = 10000

    WHILE @loopCounter > 0

    BEGIN

    INSERT INTO fooTable

    SELECT name from sys.databases

    SET @loopCounter = @loopCounter - 1

    END

    So far, so good. I should have the name of every database on my server, duplicated 10000 times. 5 databases, 50000 rows. This query took about 10 seconds to execute (I didn't time it accurately).

    Now I'm struggling to render this as a set-based approach.

    From what I understand, I should be using a function based approach to populate the data rather than a looping one. I.e. if I was using an INT value or similar I could probably specify a range or other mathematical function - I chose a VARCHAR to make it a bit difficult.

    Would be grateful for any help in understanding this topic. Thanks.

    ---

    Note to developers:
    CAST(SUBSTRING(CAST(FLOOR(NULLIF(ISNULL(COALESCE(1,NULL),NULL),NULL)) AS CHAR(1)),1,1) AS INT) == 1
    So why complicate your code AND MAKE MY JOB HARDER??!:crazy:

    Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
    My blog: http://uksqldba.blogspot.com
    Visit http://www.DerekColley.co.uk to find out more about me.

  • I could write pages of explanation here, but I'd really recommend reading Jeff's article on the Tally/Numbers table [/url], which more eloquently explains it than I ever could.

    With your particular example, running 10000 seperate inserts is the same as CROSS JOINING that table to another table with 10000 rows in it. E.g.:

    To give a very simple example, using the persisted tally table in Jeff's example:

    INSERT INTO fooTable

    SELECT name from sys.databases

    CROSS JOIN Tally

    where N<=10000

  • derek.colley (5/23/2012)


    OK, I'm trying and failing (to an extent) to get my head around set-based iteration.

    As a sometime dev who is used to loops (go VBScript go!) and more 'traditional' programming, I decided this was a topic that I really needed to learn something about.

    So I'm wondering if anyone can help with this example.

    Here's me creating a table and populating it with data using a loop-based method:

    CREATE TABLE fooTable (data VARCHAR(8000) )

    DECLARE @loopCounter INT

    SET @loopCounter = 10000

    WHILE @loopCounter > 0

    BEGIN

    INSERT INTO fooTable

    SELECT name from sys.databases

    SET @loopCounter = @loopCounter - 1

    END

    So far, so good. I should have the name of every database on my server, duplicated 10000 times. 5 databases, 50000 rows. This query took about 10 seconds to execute (I didn't time it accurately).

    Now I'm struggling to render this as a set-based approach.

    From what I understand, I should be using a function based approach to populate the data rather than a looping one. I.e. if I was using an INT value or similar I could probably specify a range or other mathematical function - I chose a VARCHAR to make it a bit difficult.

    Would be grateful for any help in understanding this topic. Thanks.

    try this

    with x (id) as

    (select 1 as id

    union all

    select ID+1 from X as id

    where id<10000

    )

    insert into fooTable

    select d.name from sysdatabases d cross join x option (maxrecursion 10000);

    MVDBA

  • +1 howard beat me to it, but slightly different solution

    MVDBA

  • Thanks both, I'll read the tally tables article and give both solutions a try.

    ---

    Note to developers:
    CAST(SUBSTRING(CAST(FLOOR(NULLIF(ISNULL(COALESCE(1,NULL),NULL),NULL)) AS CHAR(1)),1,1) AS INT) == 1
    So why complicate your code AND MAKE MY JOB HARDER??!:crazy:

    Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
    My blog: http://uksqldba.blogspot.com
    Visit http://www.DerekColley.co.uk to find out more about me.

  • Howard, your approach is good but populates the table with only the first database name returned by SELECT name FROM sys.databases. It produces all the rows, but they are all identical - the remainder are discarded - rather than inserting in complete 'sets'.

    Mike - Got Msg 319 ('Incorrect syntax near the keyword 'with'...') I checked my code with what you put, matches fine. Typo maybe?

    Thanks

    ---

    Note to developers:
    CAST(SUBSTRING(CAST(FLOOR(NULLIF(ISNULL(COALESCE(1,NULL),NULL),NULL)) AS CHAR(1)),1,1) AS INT) == 1
    So why complicate your code AND MAKE MY JOB HARDER??!:crazy:

    Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
    My blog: http://uksqldba.blogspot.com
    Visit http://www.DerekColley.co.uk to find out more about me.

  • any code before the "WITH x as" must be terminated with a ;

    MVDBA

  • see my example

    create table footable (name varchar(1000))

    go

    select 'aaaaa';

    with x (id) as

    (select 1 as id

    union all

    select ID+1 from X as id

    where id<10000

    )

    insert into fooTable

    select d.name from sysdatabases d cross join x option (maxrecursion 10000);

    drop table footable

    MVDBA

  • Hi Derek, the statement immediately previous to the WITH keyword must terminated by a semicolon.

  • derek.colley (5/23/2012)


    Howard, your approach is good but populates the table with only the first database name returned by SELECT name FROM sys.databases. It produces all the rows, but they are all identical - the remainder are discarded - rather than inserting in complete 'sets'.

    Hmm, that shouldn't be the case. You'll have to provide a complete example of what you're running from start to finish to work out what's wrong.

    Are you sure one database isn't just at the top - e.g. the arbitrary order is different when you select back the records?

    E.g.:

    select data, count(*) from fooTable

    group by data

    Returns only one database name??

  • The following ran about 1 second on my computer, ymmv.

    CREATE TABLE dbo.fooTable (data VARCHAR(8000) );

    GO

    WITH

    e1(n) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), -- 10 rows

    e2(n) AS (SELECT a.n FROM e1 a CROSS JOIN e1 b), -- 100 rows

    e4(n) AS (SELECT a.n FROM e2 a CROSS JOIN e2 b), -- 10,000 rows

    cteNumbers(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM e4)

    INSERT INTO dbo.fooTable (data)

    SELECT NAME FROM sys.databases CROSS JOIN cteNumbers;

    SELECT * FROM dbo.fooTable;

    DROP TABLE dbo.fooTable;

    GO

  • Howard

    Here's the example code I ran:

    CREATE TABLE numbers ( N INT)

    DECLARE @loopCounter INT -- yes, I know :-)

    SET @loopCounter = 1

    WHILE @loopCounter < 10001

    BEGIN

    INSERT INTO numbers VALUES ( @loopCounter )

    SET @loopCounter = @loopCounter + 1

    END

    RAISERROR('Numbers table created.',0,0) WITH NOWAIT

    CREATE TABLE fooTable ( data VARCHAR(8000) )

    INSERT INTO fooTable

    SELECT name FROM sys.databases

    CROSS JOIN numbers

    WHERE N < 10000

    RAISERROR('fooTable populated.',0,0) WITH NOWAIT

    SELECT TOP 50 * FROM fooTable

    Here's the result set:

    Numbers table created.

    fooTable populated.

    master

    master

    master

    ... -- goes on for 46 rows

    master

    ---

    Note to developers:
    CAST(SUBSTRING(CAST(FLOOR(NULLIF(ISNULL(COALESCE(1,NULL),NULL),NULL)) AS CHAR(1)),1,1) AS INT) == 1
    So why complicate your code AND MAKE MY JOB HARDER??!:crazy:

    Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
    My blog: http://uksqldba.blogspot.com
    Visit http://www.DerekColley.co.uk to find out more about me.

  • Mike - I corrected the WITH to ;WITH and it ran beautifully - 28 seconds. Many thanks.

    Lynn - trying yours now 🙂 The more practice I get with this kind of thing, the better I'll understand it.

    ---

    Note to developers:
    CAST(SUBSTRING(CAST(FLOOR(NULLIF(ISNULL(COALESCE(1,NULL),NULL),NULL)) AS CHAR(1)),1,1) AS INT) == 1
    So why complicate your code AND MAKE MY JOB HARDER??!:crazy:

    Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
    My blog: http://uksqldba.blogspot.com
    Visit http://www.DerekColley.co.uk to find out more about me.

  • SELECT TOP 50 * FROM fooTable

    This is the problem. You're populating your table with 50,000 rows of data and then trying to determine whether it's done it correctly by only looking at the top 50 rows. There are 10000 rows that should have that database name, so it's no proof at all that the process hasn't worked just because you've looked at 50 and they're all the same!

    The key thing to remember about SQL is that inserted order is irrelevant. If you want a particular order, you must have a column that defines it and include it in the order by clause.

  • Lynn - 4 seconds on my ageing kit - thanks very much!

    Howard - was not knocking your effort (would be a bit rich considering my approach wasn't set based!) but I was after an ordered insert in the absence of a PK. The difference between the loop-based approach and your set based one was that the loop-based one inserted as follows:

    1 a

    2 b

    3 c

    4 d

    5 e

    6 a

    7 b

    ...

    10000 e

    At the moment your code does:

    1 a

    2 a

    3 a

    ...

    ? b

    ? b

    ? b

    ...

    ? e

    I agree that inserted order is irrelevant, your code achieved exactly what was wanted - but imagine I e.g. wanted to add a PK at a later date i.e.

    SELECT IDENTITY(INT, 1, 1) AS id INTO fooTable --or similar...

    The order would then become relevant.

    I think I'll have to go and read up properly on this since it's clear there's a few ways of achieving set based iteration.

    Thanks to everyone for your help and examples.

    ---

    Note to developers:
    CAST(SUBSTRING(CAST(FLOOR(NULLIF(ISNULL(COALESCE(1,NULL),NULL),NULL)) AS CHAR(1)),1,1) AS INT) == 1
    So why complicate your code AND MAKE MY JOB HARDER??!:crazy:

    Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
    My blog: http://uksqldba.blogspot.com
    Visit http://www.DerekColley.co.uk to find out more about me.

Viewing 15 posts - 1 through 15 (of 46 total)

You must be logged in to reply to this topic. Login to reply