how to replace this cursor with set based solution?

  • I've got a cursor script that I want to replace with a set based solution because the cursor script takes a long time to run. All I've got is the cursor script, for which I've reverse engineered the DDL to solve this problem (without success). The cursor script is the actual script that is running in a production environment, and the DDL is something I've created (for myself) to work out a set based solution. I've posted it all below and I would so much appreciate if you took a look.

    The cursor creates an array variable to hold a list of names (LabC1, LabC2, etc) and then fetches the name one by one and compares its value to a value found in the 'name' column of the Machines table.

    I hope a copy of the script and DDL will make this explanation clearer.

    Here's the cursor script:

    DECLARE @benches TABLE (BenchName varchar(5))

    INSERT INTO @benches

    VALUES ('LabC1'),

    ('LabC2'),

    ('LabC3'),

    ('LabC4'),

    ('LabD1'),

    ('LabD2'),

    ('LabD3'),

    ('LabD4'),

    ('LabE1'),

    ('LabE2'),

    ('LabE3'),

    ('LabE4'),

    ('LabE5'),

    ('LabE6'),

    ('LabE7'),

    ('LabE8')

    DECLARE @runsByBench TABLE (BenchName varchar(5), NumberOfRunsOnBench int, LastRun datetime)

    -- for each value in the @benches array return the count of runs and the last createdate associated with that run.

    DECLARE toQuery CURSOR LOCAL FAST_FORWARD FOR

    SELECT b.BenchName

    FROM @benches b

    OPEN toQuery

    -- loop through all of the scripts

    DECLARE @bench varchar(5)

    FETCH NEXT FROM toQuery INTO @bench

    WHILE @@FETCH_STATUS = 0

    BEGIN

    INSERT INTO @runsByBench (BenchName , NumberOfRunsOnBench , LastRun )

    SELECT @bench as BenchName, COUNT(*) as NumberOfRunsOnBench, MAX(CreateDate) as LastRun

    FROM

    (select r.runid, r.creatorid, r.createdate, r.name, r.enddate, count(*) as MachinesInRun

    from Runs r

    join runs_machines rm on rm.runid=r.runid

    join Machines m on m.machineid=rm.machineid

    -----HERE'S WHERE IT COMPARES THE VALUE M.NAME TO THE VALUE FETCHED BY THE CURSOR--------------

    where m.name like (@bench +'[0-9][0-9][0-9]')

    ----------------------------------------------------------------------------------------------------------

    and r.createdate > dateadd(M,-3,getdate())

    group by r.runid, r.creatorid, r.createdate, r.name, r.enddate

    ) as foo

    FETCH NEXT FROM toQuery INTO @bench

    END

    CLOSE toQuery

    DEALLOCATE toQuery

    SELECT * FROM @runsByBench

    and here's the DDL I created for the tables used by the SQL script to work on the problem:

    create table Runs

    (RunID int,

    CreatorID int,

    CreateDate datetime,

    EndDate datetime,

    Name varchar(20),

    MachineID int);

    insert into Runs

    values

    (1, 201, DATEADD(mi,-3,getdate()-1), DATEADD(mi,+60,getdate()-1), 'LABC1521', 521),

    (2, 202, DATEADD(mi,-3,getdate()), DATEADD(mi,+75,getdate()), 'LABC2681', 681),

    (3, 203, DATEADD(mi,-3,getdate()), DATEADD(mi,+63,getdate()), 'LABC3123', 123),

    (4, 204, DATEADD(mi,-3,getdate()), DATEADD(mi,+65,getdate()), 'LABC4765', 765),

    (5, 205, DATEADD(mi,-1,getdate()), DATEADD(mi,+78,getdate()), 'LABC1521', 521);

    create table Machines

    (MachineID int,

    Name varchar(20));

    insert into Machines

    values

    (521, 'LABC1521'),

    (681, 'LABC2681'),

    (123, 'LABC3123'),

    (765, 'LABC4765');

    create table Runs_Machines

    (Runid int,

    MachineID varchar(20));

    insert into Runs_Machines

    values

    (1, 521),

    (2, 681),

    (3, 123),

    (4, 765),

    (5, 521);

    create table RunsByBench

    (Benchname varchar(20),

    NumberOfRunsOnBench int,

    LastRun datetime);

    the query inside the script will work against the upper DDL as long as the you hardcode the m.name value:

    select r.runid, r.creatorid, r.createdate, r.name, r.enddate

    from Runs r

    join runs_machines rm on rm.runid=r.runid

    join Machines m on m.machineid=rm.machineid

    where m.name like 'LabC1%'

    I would like to put the array into a #temp table and compare the results of the query to it. But I have failed. I tried a correlated subquery,

    but without a unique key, I discovered it isn't the solution.

    Can someone help me find the T-SQL equivalent for the above cursor script?

    Thanks!

    --Quote me

  • An OUTER APPLY might do the job. There might be another option, but this looks like it should work fine.

    INSERT INTO RunsByBench

    SELECT BenchName AS BenchName,

    COUNT(CreateDate) AS NumberOfRunsOnBench,

    MAX(CreateDate) AS LastRun

    FROM @benches b

    OUTER APPLY (

    SELECT r.createdate,

    count(*) AS MachinesInRun

    FROM Runs r

    INNER JOIN runs_machines rm ON rm.runid = r.runid

    INNER JOIN Machines m ON m.machineid = rm.machineid

    -----HERE'S WHERE IT COMPARES THE VALUE M.NAME TO THE VALUE FETCHED BY THE CURSOR--------------

    WHERE m.NAME LIKE (b.BenchName + '[0-9][0-9][0-9]')

    ----------------------------------------------------------------------------------------------------------

    AND r.createdate > dateadd(M, - 3, getdate())

    GROUP BY r.runid,

    r.creatorid,

    r.createdate,

    r.NAME,

    r.enddate

    ) AS foo

    GROUP BY BenchName

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • This worked, Thank you Luis.

    --Quote me

  • It was easy to find the solution with the DDL and sample data you posted (I had to run the original query to get the expected results).

    The important thing is that you understand what's happening so you can mantain it and replicate it in other cases. If you need help with APPLY, take a look at these articles, search more information on the web or ask any questions here. 😉

    Understanding and Using APPLY (Part 1)[/url]

    Understanding and Using APPLY (Part 2)[/url]

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Yes, I jumped to thank you but I am examining it. I agree, I must understand so I can reuse in other situations.

    I may have some more questions about this before i'm through. For example, did you use a temp table for Benches?

    eg.

    create table benches (Bench varchar(10));

    insert into benches

    values

    ('LabC1'),

    ('LabC2'),

    ('LabC3'),

    ('LabC4');

    ????? what did you do to handle array? For me to run your query I create a basic table called Benches, but in reality no such table exists (and won't exist). What assumption did you make to handle for the array @benches? A script populating a temp table?

    --Quote me

  • for some reason the first time I used #temp table with query it didn't work. Forget the error.

    Now either #temp or table variable works.

    So, I'll post the entire script/solution below:

    DECLARE @benches TABLE (Bench varchar(10));

    INSERT @benches

    values

    ('LabC1'),

    ('LabC2'),

    ('LabC3'),

    ('LabC4');

    INSERT INTO RunsByBench

    SELECT Bench AS BenchName,

    COUNT(CreateDate) AS NumberOfRunsOnBench,

    MAX(CreateDate) AS LastRun

    FROM @benches b

    OUTER APPLY (

    SELECT r.createdate,

    count(*) AS MachinesInRun

    FROM Runs r

    INNER JOIN runs_machines rm ON rm.runid = r.runid

    INNER JOIN Machines m ON m.machineid = rm.machineid

    WHERE m.NAME LIKE (b.Bench + '[0-9][0-9][0-9]')

    AND r.createdate > dateadd(M, - 3, getdate())

    GROUP BY r.runid,

    r.creatorid,

    r.createdate,

    r.NAME,

    r.enddate

    ) AS foo

    GROUP BY Bench

    --Quote me

  • Be careful on using table variables, they might reduce performance on your query depending on the number of rows. You should test what's better for your performance (temp table, table variable, other) and consider pros and cons.

    Here's an article on the subject: http://www.sqlservercentral.com/articles/Temporary+Tables/66720/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Excellent, thanks again.

    The whole point is to improve performance so I really appreciate added notes about table variables versus temp etc. APPLY articles were great.

    --Quote me

  • I'm trying to apply what I've learned in the APPLY links you left me, Luis, to my solution. Thus, I want a TVF for the right half of the query. By right half, I mean the query to the right of the OUTER APPLY which gets the createdate for each bench in the input field.

    I'm getting error when I try to create the function:

    CREATE FUNCTION GetBench (@bench varchar(20))

    RETURNS TABLE

    WITH SCHEMABINDING

    AS RETURN

    SELECT r.createdate

    FROM Runs r

    INNER JOIN runs_machines rm ON rm.runid = r.runid

    INNER JOIN Machines m ON m.machineid = rm.machineid

    WHERE m.NAME LIKE (@bench + '[0-9][0-9][0-9]')

    AND r.createdate > dateadd(M, - 3, getdate())

    error:

    Msg 4512, Level 16, State 3, Procedure GetBench, Line 5

    Cannot schema bind table valued function 'GetBench' because name 'Runs' is invalid for schema binding. Names must be in two-part format and an object cannot reference itself.

    Can you point out what's wrong in the TVF? Thanks.

    --Quote me

  • polkadot (9/26/2013)


    I'm trying to apply what I've learned in the APPLY links you left me, Luis, to my solution. Thus, I want a TVF for the right half of the query. By right half, I mean the query to the right of the OUTER APPLY which gets the createdate for each bench in the input field.

    I'm getting error when I try to create the function:

    CREATE FUNCTION GetBench (@bench varchar(20))

    RETURNS TABLE

    WITH SCHEMABINDING

    AS RETURN

    SELECT r.createdate

    FROM Runs r

    INNER JOIN runs_machines rm ON rm.runid = r.runid

    INNER JOIN Machines m ON m.machineid = rm.machineid

    WHERE m.NAME LIKE (@bench + '[0-9][0-9][0-9]')

    AND r.createdate > dateadd(M, - 3, getdate())

    error:

    Msg 4512, Level 16, State 3, Procedure GetBench, Line 5

    Cannot schema bind table valued function 'GetBench' because name 'Runs' is invalid for schema binding. Names must be in two-part format and an object cannot reference itself.

    Can you point out what's wrong in the TVF? Thanks.

    The error message is telling you exactly what the problem is, You MUST use 2 part naming when using schemabinding.

    From BOL:

    SCHEMABINDING

    Binds the view to the schema of the underlying table or tables. When SCHEMABINDING is specified, the base table or tables cannot be modified in a way that would affect the view definition. The view definition itself must first be modified or dropped to remove dependencies on the table that is to be modified. When you use SCHEMABINDING, the select_statement must include the two-part names (schema.object) of tables, views, or user-defined functions that are referenced. All referenced objects must be in the same database.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • I vaguely recall running into that error when creating a view. In my case adding dbo. to the table name fixed it. So in your case dbo.Runs. Hope this works for you.

  • well, then I had to comment out WITH SCHEMABINDING, because try as I might, to do what the message wanted, it was still same error.

    If you can show me how to use proper naming convention, as in point out where exactly problem is, I would appreciate.

    See, I have added the schema dbo...

    CREATE FUNCTION GetBench (@bench varchar(20))

    RETURNS TABLE

    --WITH SCHEMABINDING

    AS RETURN

    SELECT r.createdate

    FROM Sandbox.dbo.Runs r

    INNER JOIN Sandbox.dbo.runs_machines rm ON rm.runid = r.runid

    INNER JOIN Sandbox.dbo.Machines m ON m.machineid = rm.machineid

    WHERE m.NAME LIKE (@bench + '[0-9][0-9][0-9]')

    AND r.createdate > dateadd(M, - 3, getdate());

    ps. I recall now that schemabinding ensures that you can't modify underlyling tables without re-creating the dependent functions and views.

    --Quote me

  • polkadot (9/27/2013)


    well, then I had to comment out WITH SCHEMABINDING, because try as I might, to do what the message wanted, it was still same error.

    If you can show me how to use proper naming convention, as in point out where exactly problem is, I would appreciate.

    See, I have added the schema dbo...

    CREATE FUNCTION GetBench (@bench varchar(20))

    RETURNS TABLE

    --WITH SCHEMABINDING

    AS RETURN

    SELECT r.createdate

    FROM Sandbox.dbo.Runs r

    INNER JOIN Sandbox.dbo.runs_machines rm ON rm.runid = r.runid

    INNER JOIN Sandbox.dbo.Machines m ON m.machineid = rm.machineid

    WHERE m.NAME LIKE (@bench + '[0-9][0-9][0-9]')

    AND r.createdate > dateadd(M, - 3, getdate());

    ps. I recall now that schemabinding ensures that you can't modify underlyling tables without re-creating the dependent functions and views.

    It requires two part notation. So one or three won't work. It should just be dbo.Runs, I believe. I'd have to dig through some emails to find where I had the same problem and exactly what fixed it. But give that a shot.

  • to be honest, I did, but I'm not going to battle that anymore. As long as commenting out the With Schemabinding allows me to continue with the APPLY 'lesson' I'm fine with that.

    --Quote me

  • Luis Cazares (9/26/2013)


    Be careful on using table variables, they might reduce performance on your query depending on the number of rows. You should test what's better for your performance (temp table, table variable, other) and consider pros and cons.

    Here's an article on the subject: http://www.sqlservercentral.com/articles/Temporary+Tables/66720/

    I agree with this. When developing a solution and the choice of table variables vs temp tables comes into question then it is best to know how much data you are dealing with to choose the right path.

    Just all part of the development process.

    Kurt

    Kurt W. Zimmerman
    SR DBA
    Lefrak Organization
    New York, NY

    http://www.linkedin.com/in/kurtwzimmerman

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply