Eliminating Cursors

  • I was able to use the proc. This is GREAT. The only thing I had to change was to use COALESCE rather than ISNULL. Below is the syntax:

    COALESCE((SELECT COUNT(*) FROM @item_table), 0)

  • Bobbie Shrivastav (6/13/2008)


    I was able to use the proc. This is GREAT. The only thing I had to change was to use COALESCE rather than ISNULL. Below is the syntax:

    COALESCE((SELECT COUNT(*) FROM @item_table), 0)

    When have you EVER known COUNT(*) to return a NULL??? The COALESCE/ISNULL is a total waste of clock cycles for this... Here's the proof...

    DECLARE @Item_Table TABLE (RowNum INT, SomethingElse VARCHAR(20))

    SELECT COUNT(*) FROM @Item_Table

    -----------
    0

    (1 row(s) affected)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
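
  • A side note on the COUNT(*) point above: the COALESCE habit does pay off with other aggregates. Unlike COUNT(*), SUM over an empty table (or over nothing but NULLs) returns NULL, so that is where the wrapper belongs. A minimal sketch reusing the same table variable:

    DECLARE @Item_Table TABLE (RowNum INT, SomethingElse VARCHAR(20))

    -- COUNT(*) over an empty table returns 0, never NULL
    SELECT COUNT(*) FROM @Item_Table                    -- 0

    -- SUM over an empty table returns NULL, so here COALESCE is not wasted
    SELECT SUM(RowNum) FROM @Item_Table                 -- NULL
    SELECT COALESCE(SUM(RowNum), 0) FROM @Item_Table    -- 0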

  • Good point on the SQL 2000 vs SQL 2005 difference. My first attempt was against 2005. I tried again against a SQL 2000 database and got very similar results: the cursor was much faster than the while loop.

    Just to give you an idea of the times (on SQL 2000 Enterprise):

    100 records: Cursor 16 ms, While 16 ms
    500 records: Cursor 60 ms, While 123 ms
    1000 records: Cursor 93 ms, While 390 ms
    5000 records: Cursor 486 ms, While 9123 ms
    10000 records: Cursor 986 ms, While 39046 ms
    65526 records (entire dataset): Cursor 9406 ms, While 1,200,000 ms and still counting

    If you look at the 10-fold increases of 100->1000->10000, the cursor's processing time scales fairly linearly with the amount of data (6x for 100->1000 and 11x for 1000->10000). Even 100 to 65526 is a 655x increase in records against only about a 590x increase in processing time, so it still scales roughly linearly.

    The while loop, on the other hand, shows more of a geometric pattern, where processing time grows substantially faster than the amount of data processed (24x for 100->1000, 100x for 1000->10000, and 30x+ for 10000->65526).

    My guess is that this increase is due to the lack of an index on the primary key field of the in-memory table.

    EDIT:

    Just finished the while loop over the entire recordset: 1,494,640 ms, which is a 38x increase in processing for 6.5 times the data.

  • Chris.Strolia-Davis (6/13/2008)


    while (select count(*) from @table) > 0
    begin
        select top 1 'do something', @id = id from @table
        delete from @table where id = @id
    end

    Never use count(*) for this purpose; every time it is run, it scans the entire table! Use the much, much more efficient EXISTS instead:

    while EXISTS (select * from @table)
    begin
        select top 1 'do something', @id = id from @table
        delete from @table where id = @id
    end

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
    Persisting SQL Server Index-Usage Statistics with MERGE
    Turbocharge Your Database Maintenance With Service Broker: Part 2
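
  • One caveat on the TOP 1 loop shown above: without an ORDER BY, SQL Server makes no guarantee about which row TOP 1 returns. If processing order matters, pin it down; a minimal sketch against the same hypothetical @table:

    WHILE EXISTS (SELECT * FROM @table)
    BEGIN
        -- ORDER BY makes the "next" row deterministic (lowest id first)
        SELECT TOP 1 @id = id FROM @table ORDER BY id
        -- ... do something with @id ...
        DELETE FROM @table WHERE id = @id
    END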

  • Ok, so for the fun of it I tried adding a clustered primary key to my in-memory table. The difference is substantial.

    New times for the while loop:

    100 records: Cursor 16 ms, While 16 ms, While with Index 16 ms
    500 records: Cursor 60 ms, While 123 ms, While with Index 46 ms
    1000 records: Cursor 93 ms, While 390 ms, While with Index 110 ms
    5000 records: Cursor 486 ms, While 9123 ms, While with Index 533 ms
    10000 records: Cursor 986 ms, While 39046 ms, While with Index 1080 ms
    65526 records (entire dataset): Cursor 9406 ms, While 1,494,640 ms, While with Index 7106 ms

    With an index on my in-memory table, I now see a substantial improvement in the while loop. It now edges out the cursor on the entire dataset, and on the other groupings the two were much closer; it is really at a point where there is no meaningful difference between one and the other.

  • I was getting worried when I got through 5 or 6 pages of this thread and hadn't seen Jeff Moden jumping in to point out that different types of row-by-row processing are still row-by-row. I knew if I hung in there, he would say what I wanted to say better than I would have. And he did!

    I gained an understanding of set-based processing working in an array-oriented language (APL), and am amazed that we programmers in this day and age still put up with scalar one-at-a-time processing. It is sad that when set-based solutions are available, folks still revert to old-fashioned loops!

    (No disrespect for Kamran, who deserves credit for launching this great exchange.)

  • Did anyone run that test code with cursor settings other than the defaults (which are not optimal)? Can someone try LOCAL STATIC FORWARD_ONLY READ_ONLY and see how the performance compares?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
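
  • For anyone wanting to try those settings, the declaration in the earlier cursor test would change to something like the sketch below (column list taken from Kevin's script). LOCAL scopes the cursor to the batch, STATIC materializes the result set once in tempdb, and FORWARD_ONLY/READ_ONLY drop the bookkeeping needed for scrollable, updatable cursors:

    DECLARE curSite CURSOR LOCAL STATIC FORWARD_ONLY READ_ONLY FOR
    SELECT Site_id, Site_addr_st_num, Site_addr_st_pre_dir, Site_addr_st_name,
           Site_addr_st_post_dir, Site_addr_prov_code, site_frontage
    FROM SITE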

  • Jim Russell (6/13/2008)


    I was getting worried when I got through 5 or 6 pages of this thread and hadn't seen Jeff Moden jumping in to point out that different types of row-by-row processing are still row-by-row. I knew if I hung in there, he would say what I wanted to say better than I would have. And he did!

    Heh... glad I didn't let you down, Jim! And, thanks for the compliment! 🙂

    --Jeff Moden



  • Kevin Rathgeber (6/13/2008)


    Good point on the SQL 2000 vs SQL 2005 difference. My first attempt was against 2005. I tried again against a SQL 2000 database and got very similar results: the cursor was much faster than the while loop.

    Just to give you an idea of the times (on SQL 2000 Enterprise):

    100 records: Cursor 16 ms, While 16 ms
    500 records: Cursor 60 ms, While 123 ms
    1000 records: Cursor 93 ms, While 390 ms
    5000 records: Cursor 486 ms, While 9123 ms
    10000 records: Cursor 986 ms, While 39046 ms
    65526 records (entire dataset): Cursor 9406 ms, While 1,200,000 ms and still counting

    If you look at the 10-fold increases of 100->1000->10000, the cursor's processing time scales fairly linearly with the amount of data (6x for 100->1000 and 11x for 1000->10000). Even 100 to 65526 is a 655x increase in records against only about a 590x increase in processing time, so it still scales roughly linearly.

    The while loop, on the other hand, shows more of a geometric pattern, where processing time grows substantially faster than the amount of data processed (24x for 100->1000, 100x for 1000->10000, and 30x+ for 10000->65526).

    My guess is that this increase is due to the lack of an index on the primary key field of the in-memory table.

    EDIT:

    Just finished the while loop over the entire recordset: 1,494,640 ms, which is a 38x increase in processing for 6.5 times the data.

    Kevin,

    I lost the bubble a bit here... what is the code you're running and what is the test data? I'd like to do a comparison of my own and want to use apples vs apples, if you know what I mean...

    Thanks...

    --Jeff Moden



  • Kevin Rathgeber (6/13/2008)


    Ok, so for the fun of it I tried adding a clustered primary key to my in-memory table. The difference is substantial.

    New times for the while loop:

    100 records: Cursor 16 ms, While 16 ms, While with Index 16 ms
    500 records: Cursor 60 ms, While 123 ms, While with Index 46 ms
    1000 records: Cursor 93 ms, While 390 ms, While with Index 110 ms
    5000 records: Cursor 486 ms, While 9123 ms, While with Index 533 ms
    10000 records: Cursor 986 ms, While 39046 ms, While with Index 1080 ms
    65526 records (entire dataset): Cursor 9406 ms, While 1,494,640 ms, While with Index 7106 ms

    With an index on my in-memory table, I now see a substantial improvement in the while loop. It now edges out the cursor on the entire dataset, and on the other groupings the two were much closer; it is really at a point where there is no meaningful difference between one and the other.

    Careful - when you start posting results like this, you might get asked for some specifics on what you were doing, just so we can compare them to set-based results. 7 seconds on 65K rows is pretty good by human standards, but seems supremely "beatable" with a good set-based option.

    Not that I'm actually asking, but you might, which then leads to a test-off, etc.....:w00t:

    Edit: I see Jeff has already thrown the gauntlet:)

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (6/13/2008)


    Edit: I see Jeff has already thrown the gauntlet:)

    Yep... getting ready to launch some pork chops! 😉

    --Jeff Moden



  • **ducking** I had chicken for lunch....


  • Is a 'test-off' like a 'dance-off'? should I go get some cardboard?

    ---------------------------------------------------------
    How best to post your question
    How to post performance problems
    Tally Table: What it is and how it replaces a loop

    "stewsterl 80804 (10/16/2009): I guess when you stop and try to understand the solution provided, you not only learn but save yourself some headaches when you need to make any slight changes."

  • jcrawf02 (6/13/2008)


    Is a 'test-off' like a 'dance-off'? should I go get some cardboard?

    Only if you want to save the big pieces... 😀

    --Jeff Moden



  • Matt, I totally understand what you are saying. Actually, what I am going to do next is average several runs against the data to get a better idea.

    Jeff, here is what I am running for both (please excuse the variable names, etc. ... I am just testing and not wanting to make up good proper names 🙂).

    A couple of notes:

    For my tests of 100, 1000, etc., I was selecting TOP x. This probably isn't optimal, so I may look at selecting ranges of data through a WHERE clause instead. I will also remove the timers I had in there and use the client statistics.

    And as I said before, I know there are better ways of summing; I just chose this as a test scenario because the overhead of the computation is minimal (and I use it as a check to ensure that my cursor and while versions come back with the same data).

    This is also being run on SQL Server 2000 Enterprise.

    Cursor Code

    DECLARE @FrontageSum float
    DECLARE @sid INT, @SASN varchar(10),
            @SASPreD varchar(2), @SASName varchar(30),
            @SASPD varchar(2), @SAPC varchar(2),
            @SF float
    DECLARE @starttime datetime, @endtime datetime

    SET @starttime = getdate()

    DECLARE curSite CURSOR FAST_FORWARD FOR
    SELECT Site_id, Site_addr_st_num, Site_addr_st_pre_dir, Site_addr_st_name,
           Site_addr_st_post_dir, Site_addr_prov_code, site_frontage
    FROM SITE

    OPEN curSite

    FETCH NEXT FROM curSite INTO
        @sid, @SASN, @SASPreD, @SASName, @SASPD, @SAPC, @SF

    SET @FrontageSum = 0

    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @SF IS NOT NULL
            SET @FrontageSum = @FrontageSum + @SF

        FETCH NEXT FROM curSite INTO
            @sid, @SASN, @SASPreD, @SASName, @SASPD, @SAPC, @SF
    END

    CLOSE curSite
    DEALLOCATE curSite

    PRINT @FrontageSum
    SET @endtime = getdate()
    PRINT datediff(ms, @starttime, @endtime)

    While Loop

    DECLARE @FrontageSum float
    DECLARE @sid INT, @SASN varchar(10),
            @SASPreD varchar(2), @SASName varchar(30),
            @SASPD varchar(2), @SAPC varchar(2),
            @SF float
    DECLARE @itemcount int
    DECLARE @counter int
    DECLARE @starttime datetime, @endtime datetime

    SET @starttime = getdate()

    DECLARE @item_table TABLE
        (primary_key INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
         SiteID int,
         SASN varchar(10),
         SASPreD varchar(2),
         SASName varchar(30),
         SASPD varchar(2),
         SAPC varchar(2),
         SF float)

    INSERT INTO @item_table (SiteID, SASN, SASPreD, SASName, SASPD, SAPC, SF)
    SELECT Site_id, Site_addr_st_num, Site_addr_st_pre_dir, Site_addr_st_name,
           Site_addr_st_post_dir, Site_addr_prov_code, site_frontage
    FROM SITE

    SET @itemcount = @@rowcount
    SET @counter = 1
    SET @FrontageSum = 0

    IF @itemcount > 0
    BEGIN
        WHILE @counter <= @itemcount
        BEGIN
            SELECT @sid = SiteID, @SASN = SASN, @SASPreD = SASPreD, @SASName = SASName,
                   @SASPD = SASPD, @SAPC = SAPC, @SF = SF
            FROM @item_table
            WHERE primary_key = @counter

            IF @SF IS NOT NULL
                SET @FrontageSum = @FrontageSum + @SF

            SET @counter = @counter + 1
        END
    END

    PRINT @FrontageSum
    SET @endtime = getdate()
    PRINT datediff(ms, @starttime, @endtime)

    I will send you the table layout (hopefully that is enough to get you going)

    CREATE TABLE [dbo].[SITE] (
        [Timestamp] [binary](8) NOT NULL,
        [Site_Id] [int] NOT NULL,
        [Active_Status] [varchar](8) NOT NULL,
        [Site_Addr_St_Num] [varchar](10) NULL,
        [Site_Addr_St_Pre_Dir] [varchar](2) NULL,
        [Site_Addr_St_Name] [varchar](30) NULL,
        [Site_Addr_St_Suff] [varchar](6) NULL,
        [Site_Addr_St_Post_Dir] [varchar](2) NULL,
        [Site_Addr_City] [varchar](30) NULL,
        [Site_Addr_Prov_Code] [varchar](2) NULL,
        [Site_Addr_Cntry_Code] [varchar](8) NULL,
        [Site_Addr_Postal_Code] [varchar](11) NULL,
        [Site_Addr_Subdiv] [varchar](20) NULL,
        [Site_Addr_Qtr_Sec] [varchar](2) NULL,
        [Site_Addr_Section] [varchar](2) NULL,
        [Site_Addr_Township] [varchar](3) NULL,
        [Site_Addr_Range] [varchar](2) NULL,
        [Site_Addr_Meridian] [varchar](10) NULL,
        [Site_Addr_Portion] [varchar](20) NULL,
        [Site_Status] [varchar](8) NULL,
        [Meter_Route_Id] [int] NULL,
        [Site_Area] [float] NOT NULL CONSTRAINT [df_SITE_Site_Area] DEFAULT (0),
        [Hospital_Ward_Id] [int] NULL,
        [Municipal_Ward_Id] [int] NULL,
        [Nghd_Id] [char](3) NULL,
        [Nghd_Study_Dsg] [char](4) NULL,
        [Dev_Plan_Dsg] [varchar](4) NULL,
        [Zoning_Dsg] [varchar](8) NULL,
        [Proposed_Dev_Plan_Dsg] [varchar](4) NULL,
        [Proposed_Zoning_Dsg] [varchar](9) NULL,
        [Plan_Number] [varchar](67) NULL,
        [Block_Number] [varchar](19) NULL,
        [Lot_Number] [varchar](67) NULL,
        [Site_Frontage] [float] NULL,
        [Electric_Franchise_Id] [int] NOT NULL,
        [Water_Franchise_Id] [int] NOT NULL,
        [Electric_Work_Area_Id] [int] NULL,
        [Water_Work_Area_Id] [int] NULL,
        [Site_Group_Type] [varchar](8) NULL,
        [Site_Width_Setback] [float] NULL,
        [CSMA_LineWork] [char](1) NULL,
        [CSMA_DateTime] [datetime] NULL,
        [COA_Reference] [varchar](50) NULL,
        [COA_Reference_Number] [int] NULL,
        [Municipality_Name] [varchar](30) NULL,
        [Site_Addr_Online_Addr] [varchar](255) NULL,
        [Site_Addr_Rgstr_Desc] [varchar](255) NULL,
        [Create_User_Id] [varchar](255) NOT NULL,
        [Create_User_Domain_Name] [varchar](255) NULL,
        [Create_Datetime] [datetime] NOT NULL,
        [Update_User_Id] [varchar](255) NULL,
        [Update_User_Domain_Name] [varchar](255) NULL,
        [Update_Datetime] [datetime] NULL
    )
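
  • For comparison with both scripts above: since the test ultimately just totals site_frontage, and SUM already ignores NULLs (matching the IF @SF IS NOT NULL check in both loops), the set-based equivalent collapses to a single statement:

    -- set-based version of the frontage total; SUM skips NULL site_frontage values
    SELECT SUM(Site_Frontage) AS FrontageSum
    FROM SITE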
