Cartesian product of 2 tables + update a column

  • I want to create a stored procedure or function that would create a temp table from CROSSJOIN or INNER JOIN and a WHERE of two tables. Then, on that resultset I'd like to fill an empty column with sequential numbers and update a few other columns.

    I know how to iterate records using CURSOR although I have not found any sample that implements FOR UPDATE OF column as in T-SQL Extended Syntax:

    DECLARE cursor_name CURSOR [ LOCAL | GLOBAL ]

    [ FORWARD_ONLY | SCROLL ]

    [ STATIC | KEYSET | DYNAMIC | FAST_FORWARD ]

    [ READ_ONLY | SCROLL_LOCKS | OPTIMISTIC ]

    [ TYPE_WARNING ]

    FOR select_statement

    [ FOR UPDATE [ OF column_name [ ,...n ] ] ]

    [;]

    Can someone help providing a simple example? And, is a CURSOR my only choice for this task?

    Thanks

  • Select your data into a temp table that has an Identity column, and you're done.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • To Elaborate...

    I want to create a stored procedure or function that would create a temp table from CROSSJOIN or INNER JOIN and a WHERE of two tables. Then, on that resultset I'd like to fill an empty column with sequential numbers and update a few other columns.

    ...

    I know how to iterate records using CURSOR although I have not found any sample that implements FOR UPDATE OF column as in T-SQL Extended Syntax:

    Can someone help providing a simple example? And, is a CURSOR my only choice for this task?

    First, let's clarify what the FOR UPDATE syntax means;

    FOR UPDATE [OF column_name [,...n]]

    Defines updatable columns within the cursor. If OF column_name [,...n] is supplied, only the columns listed allow modifications. If UPDATE is specified without a column list, all columns can be updated, unless the READ_ONLY concurrency option was specified.

    Next, never use cursors if you can avoid it. Generally a cursor is used to iterate thru a set of data so that one or more row elements can be used in related updates. The problem with cursors is each fetch makes a trip to the server.

    Another approach is to use a local table variable (SQl 2k+) to store the same set of data with the addition of a PK to manage the iteration.

    For example, the following CURSOR;

    DECLARE @ProductName nvarchar(40)

    DECLARE product_cursor CURSOR FOR

    SELECT p.ProductName

    FROM Northwind..Products p

    OPEN product_cursor

    FETCH NEXT FROM product_cursor INTO @ProductName

    WHILE @@FETCH_STATUS = 0

    BEGIN

    --DO SOMETHING

    PRINT @ProductName

    FETCH NEXT FROM product_cursor INTO @ProductName

    END

    CLOSE product_cursor

    DEALLOCATE product_cursor

    Can be replaced with;

    DECLARE @ProductName nvarchar(40)

    DECLARE @Rows int,@Ctr int

    DECLARE @PRODUCTS TABLE(

    TID int identity(1,1),--NATIVE PK

    ProductName nvarchar(40) NOT NULL)

    INSERT INTO @PRODUCTS(ProductName)

    SELECT p.ProductName

    FROM Northwind..Products p

    SELECT @Rows=@@ROWCOUNT

    SET @Ctr=1

    WHILE @Ctr<=@Rows

    BEGIN

    SELECT @ProductName=ProductName

    FROM @PRODUCTS

    WHERE TID=@Ctr

    PRINT @ProductName

    SET @Ctr=@Ctr+1

    END

    --No need to drop or deallocate table, table variables drop with the scope[/font]

  • devteamlead...I think I know what you mean about avoiding cursors. In fact, I found out in BOL and also in experience. After I posted I managed to create a procedure only to see my server resources being consumed 🙂

    but, that's how one learn

    BTW thanks for your tip, the second one, I'll go with that approach.

  • Jeff Moden (10/5/2007)


    Select your data into a temp table that has an Identity column, and you're done.

    I forgot to mention, empty column will be filled sequentially after user submits a starting number.

    The task at hand is I have to filter 100,000 records down to perhaps 1000 and on that subset tag records sequentially with alphanumerics.

    Thanks to devteamlead I am almost there...this looks almost like what I want.

    Suppose I want to tag records where product name begins with 'C' and starting number is 25

    DECLARE @ProductName nvarchar(40)

    DECLARE @ProductID int, @NewPK int, @LastNumber int

    DECLARE @Rows int, @Ctr int

    SET @LastNumber = 25

    DECLARE @PRODUCTS

    TABLE(

    TID int identity(1,1),--NATIVE PK

    ProductID int NOT NULL,

    ProductName nvarchar(40) NOT NULL,

    NewPK int

    )

    INSERT INTO @PRODUCTS(ProductID, ProductName)

    SELECT p.ProductID, p.ProductName

    FROM Northwind..Products p

    WHERE LEFT(p.ProductName, 1)='C'

    ORDER BY p.ProductID

    SELECT @Rows=@@ROWCOUNT

    SET @Ctr=1

    WHILE @Ctr<=@Rows

    BEGIN

    SELECT @ProductID = ProductID FROM @PRODUCTS WHERE TID=@Ctr

    UPDATE PRODUCTS Set NewPK = @LastNumber + @Ctr WHERE ProductID = @ProductID

    SET @Ctr=@Ctr+1

    END

    --Let's see data now

    SELECT ProductID, ProductName, NewPK FROM PRODUCTS WHERE LEFT(ProductName, 1)='C' ORDER BY ProductID

    ***

    Surprisingly, against my target data, tagging took 9 seconds with this pure sql statements approach whereas old three tier system of a ASP page, ADO, a COM+ dll takes about 10 minutes.

    So, the only concern I have now is with this portion

    WHILE @Ctr<=@Rows

    BEGIN

    SELECT @ProductName=ProductName, @ProductID = ProductID FROM @PRODUCTS WHERE TID=@Ctr

    UPDATE PRODUCTS Set NewPK = @LastNumber + @Ctr WHERE ProductID = @ProductID

    SET @Ctr=@Ctr+1

    END

    It will execute UPDATE PRODUCTS once for every matching record. Is there room for improvement?

  • I guess, this can surely will improve the performance as it uses the a single update to accomplish the task....

    DECLARE @LastNumber INT

    SET @LastNumber = 25

    IF ( OBJECT_ID( 'tempdb..#tmpProducts' ) IS NOT NULL )

    DROP TABLE #tmpProducts

    CREATE TABLE #tmpProducts

    (

    ProductID INT NOT NULL PRIMARY KEY CLUSTERED,

    ProductName VARCHAR(100) NOT NULL,

    NewProductID INT NULL

    )

    INSERT #tmpProducts( ProductID, ProductName )

    SELECT ProductID, ProductName

    FROM Northwind..Products

    WHERE ProductName LIKE 'C%'

    ;WITH ProductCTE

    AS

    (

    SELECT ROW_NUMBER() OVER( ORDER BY ProductID ) AS RowNumber, ProductID, NewProductID

    FROM #tmpProducts

    )

    UPDATE ProductCTE

    SET NewProductID = @LastNumber + RowNumber

    SELECT * FROM #tmpProducts

    --Ramesh

    --Ramesh


  • I forgot to mention, empty column will be filled sequentially after user submits a starting number.

    The task at hand is I have to filter 100,000 records down to perhaps 1000 and on that subset tag records sequentially with alphanumerics.

    Thanks to devteamlead I am almost there...this looks almost like what I want.

    Suppose I want to tag records where product name begins with 'C' and starting number is 25

    Surprisingly, against my target data, tagging took 9 seconds with this pure sql statements approach whereas old three tier system of a ASP page, ADO, a COM+ dll takes about 10 minutes.

    So, the only concern I have now is with this portion

    WHILE @Ctr<=@Rows

    BEGIN

    SELECT @ProductName=ProductName, @ProductID = ProductID FROM @PRODUCTS WHERE TID=@Ctr

    UPDATE PRODUCTS Set NewPK = @LastNumber + @Ctr WHERE ProductID = @ProductID

    SET @Ctr=@Ctr+1

    END

    9 seconds, eh? Heh... if run peformance is important to you, check this out...

    Here's what I said before you threw the "Starting Number" hook at us...

    Select your data into a temp table that has an Identity column, and you're done.

    We're still going to do just that with an added hook for the starting number. BUT first, since performance seems to be important, we need some data to demonstrate the method I'm going to use. So, let's build 100,000 rows of test date similar to your Product table... NOTE THAT THIS IS NOT PART OF THE SOLUTION! THIS JUST MAKES TEST DATA!

    --===== Create and populate a 100,000 row test table.

    -- Column ProductID has a range of 1 to 1,000,000 unique numbers

    -- Column ProductName has a range of "AA" to "ZZ" non-unique 2 character strings

    -- Column VendorID has a range of 1 to 50,000 non-unique numbers

    -- Column Cost has a range of 0.0000 to 99.9999 non-unique numbers

    -- Column SomeDate has a range of >=01/01/2000 and <01/01/2010 non-unique date/times

    -- Takes about 5 seconds to execute.

    SELECT TOP 100000

    ProductID = IDENTITY(INT,1,1),

    ProductName = CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))

    + CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))

    + ' Type Product Name',

    Cost = CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS MONEY),

    VendorID = CAST(RAND(CAST(NEWID() AS VARBINARY))*50000+1 AS INT),

    SomeDate = CAST(RAND(CAST(NEWID() AS VARBINARY))*3653.0+36524.0 AS DATETIME)

    INTO dbo.jbmProductTest

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2 --Lack of join criteria makes this a CROSS-JOIN

    --===== A table is not properly formed unless a Primary Key has been assigned

    ALTER TABLE dbo.jbmProductTest

    ADD PRIMARY KEY CLUSTERED (ProductID)

    --===== Count the number of product names that begin with "C"

    SELECT COUNT(*) FROM dbo.jbmProductTest

    WHERE ProductName LIKE 'C%'

    Notice the number that is printed at the end of the run... that's the number of products that begin with "C" in the test data... about 4 times more than what you said you had.

    Now, for the solution. It's as simple as doing an INSERT/SELECT into the temp table with just a little preparation beforehand... notice the DBCC CHECKIDENT takes care of the starting number... don't blink 😛

    --===== Local variables & presets

    DECLARE @StartTime DATETIME -- To measure run duration

    SET @StartTime = GETDATE()

    DECLARE @NewStartID INT -- Starting value for new ProductID's

    SET @NewStartID = 25 --<<< Change this number to the proper number

    -- Can be a parameter in a stored proc

    --===== Create the temp table

    IF OBJECT_ID('TempDB..#Products','U') IS NOT NULL

    DROP TABLE #Products

    CREATE TABLE #Products

    (

    NewProductID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

    OldProductID INT,

    ProductName VARCHAR(100)

    )

    --===== Change the SEED of the identity column to the desired number

    DBCC CHECKIDENT ('#Products',RESEED,@NewStartID)

    --===== Populate the table, and you're done

    INSERT INTO #Products

    (OldProductID,ProductName)

    SELECT ProductID,ProductName

    FROM dbo.jbmProductTest

    WHERE ProductName LIKE 'C%'

    --===== Display the run duration

    PRINT STR(DATEDIFF(ms,@StartTime,GETDATE())) + ' Milliseconds duration'

    --===== Show the first 100 rows just for grins

    SELECT TOP 100 *

    FROM #Products

    ORDER BY NewProductID

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Man, I do wish they'd fix the "code" windows so they don't triple space... all the pretty formatting I did got slammed to the left here...

    Hey Steve! When are you guys gonna fix stuff like this???

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    I've been bleating about this in a big way to Steve and Tony too. What is the point of having a forum in which one cannot paste TSQL code. At the moment, not even the Prettifier works properly. To be fair, the red-gate team are working on it.

    What I want to see is a SQL tag like the IFCodes on the left which you paste your SQL into and which formats it up in the right colours.

    Best wishes,
    Phil Factor

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply