Cartesian product to create table takes FOREVER... any way to speed it up?

  • I think the problem is that I'm trying to create a seriously huge table, and it takes forever...

    CREATE TABLE ipaddress(

    octet1 tinyint,

    octet2 tinyint,

    octet3 tinyint,

    octet4 tinyint,

    primary key(octet1, octet2, octet3, octet4)

    );

    to populate it, I created a single column table

    CREATE TABLE #Temp(

    num tinyint PRIMARY KEY);

    and then populated the table with all possible values (0-255). That part is fine. When I try to create the insane cartesian product, it takes forever...

    Is there any other option than dropping the primary key? It should take a little while, since it's about 16 million records... but how long is reasonable?

    It's okay if I limit the tables to (0-15 or 0-31), but if I allow the full range, it just cranks away...

    Is this a case of "oh, just go smoke a cigarette and come back." or is this just plain crazy?

    thanks,

    pieter

  • I guess you would receive more input if you give information about your hardware, SQL Server version and the time you mention as "forever".

    A CROSS JOIN can be expensive if it produces a large result set, but I am not sure that removing the primary key (read the clustered index it brings) will help you here.

    brgds

    Philipp Post

  • One trick that might work, not sure since you havent' provided much info, is to use the "Go trick".

    Have smaller cross joins, or variable ones, each one that handles a smaller range, and then use a GO 12.

    So maybe each time this executes it cross joins [0..2][0..31], the next time doing [3..4][0..31] using info from the table. Then use a GO 15 to make this build all the rows from 0..30

  • Steve Jones - SSC Editor (11/15/2010)


    One trick that might work, not sure since you havent' provided much info, is to use the "Go trick".

    Have smaller cross joins, or variable ones, each one that handles a smaller range, and then use a GO 12.

    So maybe each time this executes it cross joins [0..2][0..31], the next time doing [3..4][0..31] using info from the table. Then use a GO 15 to make this build all the rows from 0..30

    Steve,

    thanks for validating my suspicions. I tried the join with much smaller datasets and it was very fast. Will try creating a loop to build the table in chunks. Sorry about the lack of info about the server environment... it's on a virtual machine at school... so as far as i know, it could be a piece of junk, and even then I don't know how many VMs are running.

  • The problem is "populated the table with all possible values (0-255)", results in a table with over 4 billion rows.

    select cast(256 as bigint) * 256 * 256 * 256 = 4,294,967,296.

    Assuming 4 bytes for the data, plus index, plus storage management, let's round to 16 bytes per row, then you are attempting to build a 64Gb table:

    select ( (cast(256 as bigint) * 256 * 256 * 256 ) * 16 ) / (1024 * 1024 * 1024 ) as SpaceGb

    If you have not pre-allocated both the database file and the transaction log file to be 64 Gb, then all of the file growths will occur at run time, slowing the process.

    SQL = Scarcely Qualifies as a Language

  • oh, ouch. I figured it would bring the database to its knees... should have done the math FIRST!!

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply