The ARRAY In SQL Server 2000

  • Hear, hear! I second the thought!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I've been using this technique to allow multiple values to be passed to an SP for use within an IN search condition, and it works really nicely.

    However, I'm a bit curious about the performance. I've been tuning a few queries and viewing the trace in Query Analyzer, and the step needed to get the values into a temp table seems to take up quite a high proportion of the query cost (nearly 20% of the whole procedure). Considering that this only involves doing a bit of string manipulation and putting up to 10 values into a temporary table, that seems a little high.

    This can be recreated using the code below. Is this cost just part of the process of creating the temporary table?

    I'm mainly surprised by the number of reads shown in the trace window. Are these disk reads or cache reads?

    1. Create the UDF used to convert a delimited string to a table

    CREATE FUNCTION dbo.SplitInt
    (
     @List varchar(8000),
     @SplitOn nvarchar(5)
    )
    RETURNS @RtnValue table
    (
     Id int identity(1,1),
     Value int
    )
    AS
    -- Summary: Returns a table with an "Id" ordinal and a "Value" column, containing the integer values from a delimited string
    BEGIN
     WHILE (Charindex(@SplitOn,@List) > 0)
     BEGIN
      -- Grab everything up to the next delimiter
      Insert Into @RtnValue (Value)
      Select Value = ltrim(rtrim(Substring(@List,1,Charindex(@SplitOn,@List)-1)))

      -- Remove the consumed element (and its delimiter) from the head of the list
      Set @List = Substring(@List,Charindex(@SplitOn,@List)+len(@SplitOn),len(@List))
     END

     -- Whatever remains after the last delimiter is the final value
     Insert Into @RtnValue (Value)
     Select Value = ltrim(rtrim(@List))

     Return
    END

    2. Run the following script:

    -- Adding delimited values to another temporary table with a primary key
    DECLARE @tblValues TABLE
    (
     Value int NOT NULL PRIMARY KEY CLUSTERED
    )

    INSERT INTO @tblValues (Value)
    SELECT DISTINCT Value FROM dbo.SplitInt('1,2,3,4,5,6,7', ',')

    -- Selecting delimited values directly from the function
    SELECT Value FROM dbo.SplitInt('1,2,3,4,5,6,7', ',')
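
    As to whether those are disk or cache reads: the Reads column in a Profiler trace counts logical reads (pages touched in the buffer cache), not physical disk reads, which are tracked separately. A quick way to see the breakdown per statement is SET STATISTICS IO, as in this minimal sketch:

    -- Show logical vs. physical reads for each statement in the Messages pane
    SET STATISTICS IO ON

    SELECT Value FROM dbo.SplitInt('1,2,3,4,5,6,7', ',')

    SET STATISTICS IO OFF
    -- Output per table touched: scan count, logical reads, physical reads, read-ahead reads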

     

  • Repost:

    Set-based is faster than procedural:

    IF OBJECT_ID('dbo.fnSplit_Set') IS NOT NULL
     DROP FUNCTION dbo.fnSplit_Set
    GO

    IF OBJECT_ID('dbo.Numbers') IS NOT NULL
     DROP TABLE dbo.Numbers
    GO

    CREATE TABLE dbo.Numbers (PkNumber int identity(1,1), dude bit null, CONSTRAINT Pk_Number PRIMARY KEY CLUSTERED (PkNumber))
    GO

    -- Populate 8,000 rows (100 x 80) via a cross join, then drop the dummy column
    INSERT INTO dbo.Numbers (dude)
    SELECT NULL FROM (SELECT TOP 100 NULL AS A FROM master.dbo.spt_values) dt100 CROSS JOIN (SELECT TOP 80 NULL AS A FROM master.dbo.spt_values) dt80
    GO

    ALTER TABLE dbo.Numbers
    DROP COLUMN dude
    GO

    --Select min(PkNumber) as MinA, max(PkNumber) as MaxA from dbo.Numbers
    --1, 8000
    GO

    CREATE FUNCTION [dbo].[fnSplit_Set] (@IDS as varchar(8000), @vcDelimiter varchar(3))
    RETURNS TABLE
    WITH SCHEMABINDING
    AS
    Return
    Select dtSplitted.EachID, dtSplitted.Rank from
    (
     -- Each qualifying PkNumber marks a delimiter position; the element is the text between it and the next delimiter
     SELECT SUBSTRING(@vcDelimiter + @IDs + @vcDelimiter, N.PkNumber + len(@vcDelimiter),
      CHARINDEX(@vcDelimiter, @vcDelimiter + @IDs + @vcDelimiter, N.PkNumber + len(@vcDelimiter)) - N.PkNumber - len(@vcDelimiter)) as EachID,
      -- Count the delimiters before this position to get the element's ordinal
      (LEN(SUBSTRING(@IDs, 1, N.PkNumber)) - LEN(REPLACE(SUBSTRING(@IDs, 1, N.PkNumber), @vcDelimiter, ''))) / LEN(@vcDelimiter) + 1 AS Rank
     FROM dbo.Numbers N
     WHERE SUBSTRING(@vcDelimiter + @IDs + @vcDelimiter, N.PkNumber, len(@vcDelimiter)) = @vcDelimiter
      AND N.PkNumber < LEN(@vcDelimiter + @IDs + @vcDelimiter)
    ) dtSplitted where len(dtSplitted.EachID) > 0
    GO

    DECLARE @Ids as varchar(8000)
    SET @IDS = ''

    -- Build a comma-delimited list of 10 random object ids
    Select @IDS = @IDS + CAST(id as varchar(10)) + ',' from (Select TOP 10 id from dbo.SysObjects order by NEWID()) dtIds
    SET @IDS = left(@IDS, ABS(len(@IDS) - 1)) -- strip the trailing comma

    PRINT @IDS

     

    This is what your queries will look like using the set split function:

    Select O.id, dtIDS.EachId, O.name, dtIDS.Rank
    from dbo.SysObjects O
    inner join (Select CAST(EachID as INT) AS EachID, Rank from dbo.fnSplit_Set(@IDS, ',')) dtIDS on O.id = dtIDS.EachID
    order by O.Name

    Select O.id, dtIDS.EachId, O.name, dtIDS.Rank
    from dbo.SysObjects O
    inner join (Select CAST(EachID as INT) AS EachID, Rank from dbo.fnSplit_Set(@IDS, ',')) dtIDS on O.id = dtIDS.EachID
    order by dtIDS.Rank

    -- Don't ever forget to cast the EachID column to the same datatype as in the join, to avoid table scans.
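
    For clarity, a quick sanity check of what the function returns (the output below is what the code above should produce for a simple list):

    SELECT EachID, Rank FROM dbo.fnSplit_Set('10,20,30', ',')
    -- EachID  Rank
    -- 10      1
    -- 20      2
    -- 30      3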

  • Hi

    I would like to know how I can read a record from a table with 12 columns (one for each month of the year) and put the value for each month into another table using this table variable.

    With arrays I would:

    select
     arr_value[1] = vt.jan_value,
     arr_value[2] = vt.feb_value,
     arr_value[3] = vt.mar_value, ...
     arr_value[12] = vt.dec_value
    from value_table vt

    And then I would use a WHILE loop to put each array position into a record of that other table.

    I can't see a flexible solution with table variables like I would have with arrays.

    Can anyone help?

    Thanks,

    Bonk

  • I would like to know how I can read a record from a table with 12 columns (one for each month of the year) and put the value for each month into another table using this table variable.

    This won't be quite the answer you expected, but typically, you WOULDN'T/SHOULDN'T make a table that has values for each month in different columns... kinda defies the whole idea of a relational database.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If it helps you at all, bonk - a table IS a persistent two-dimensional array. One that happens to have the ability to perform parallel operations on its members at the rate of, oh, several million at a time. One you don't have to keep populating and tearing down, etc....

    Using a loop on a SQL table is like buying a Mercedes for its ashtray: an awful waste of resources, and it isn't good for your health....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Hi Jeff!

    I agree with you! But there's nothing to do in this case. Let me explain: the table with one column for each month already exists and I can't modify its structure.

    I'm working on a BI (Business Intelligence) project and I have to import data from different databases into my star-schema modelling database (better for BI projects), and in this kind of project you can find all sorts of stuff. 🙁

    I'm a system victim... 🙂

    See you,

    Bonk

  • bonk (1/29/2008)


    Hi Jeff!

    I agree with you! But there's nothing to do in this case. Let me explain: the table with one column for each month already exists and I can't modify its structure.

    I'm working on a BI (Business Intelligence) project and I have to import data from different databases into my star-schema modelling database (better for BI projects), and in this kind of project you can find all sorts of stuff. 🙁

    I'm a system victim... 🙂

    See you,

    Bonk

    In that case, we'd need a wee bit more information... you said you wanted to read a 12-month record from one table and put it in another... you'd need to post the CREATE statement for each table, and possibly some data that we could play with to make sure we get it right. Take a look at the following URL before you start, to get the best results...

    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
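
    For what it's worth, a set-based rewrite of what bonk describes could look like the sketch below. All table and column names other than value_table and the jan/feb/mar/dec_value columns from his pseudo-code are hypothetical; SQL Server 2000 has no UNPIVOT operator, so a cross join against a 12-row derived table turns the columns into rows in a single pass:

    -- Unpivot 12 month columns into rows: one set-based statement, no loop
    INSERT INTO dbo.monthly_values (month_no, month_value) -- hypothetical target table
    SELECT m.month_no,
           CASE m.month_no
            WHEN 1 THEN vt.jan_value
            WHEN 2 THEN vt.feb_value
            WHEN 3 THEN vt.mar_value
            WHEN 4 THEN vt.apr_value
            WHEN 5 THEN vt.may_value
            WHEN 6 THEN vt.jun_value
            WHEN 7 THEN vt.jul_value
            WHEN 8 THEN vt.aug_value
            WHEN 9 THEN vt.sep_value
            WHEN 10 THEN vt.oct_value
            WHEN 11 THEN vt.nov_value
            WHEN 12 THEN vt.dec_value
           END
    FROM value_table vt
    CROSS JOIN (SELECT 1 AS month_no UNION ALL SELECT 2 UNION ALL SELECT 3
                UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
                UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
                UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12) m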

  • Regarding the overall article's topic:

    We DO NOT NEED ARRAYS!

    To use XML and Table Variables to try to "emulate" arrays is like grabbing your neighbor's Ferrari to try and emulate a Toyota. Very WRONG thinking!

    (Note: I use XML or delimited lists myself in similar manners: viz. to WORK AROUND SQL Server's pathetic lack of support for true table variables.)

    Please go back and read *anything* about the Relational Model and why it came about. Relations (tables) are vastly superior to arrays and are meant to be a *logical* structure. (That is, independent of how they might be implemented under the hood for performance reasons.) The relational model was meant to give the minimum complexity necessary while providing all the necessary expressive power. ARRAYs do not add any power, only complexity.

    The only possible reason to want an ARRAY type is because of assumptions of how the data will be *physically* organized. But we should have simply TABLES (including in-memory tables, table variables, etc.) and have some control over physical-level implementation concerns like whether it's -- under the hood -- sorted in a particular way. Those physical details do not impact the logical ways we interact with the table--which is in fact one of the points!

    What WE DO NEED is better support for actual relation values, and relation variables. (That is "table values" and "table variables.") We should be able to define arbitrarily complex tables, pass them between client and server, state literal table values (as whole things) inline in code, etc. etc. etc. Just like we can with an INT or a FLOAT. Then we have more POWER than arrays can give, and we have a consistent model across the board instead of introducing yet another type with its own structures--that is more LIMITED.

    Please explain this to any of the ignorant who keep clamoring for array types!

    (And please push MS and others to implement better table-variable support. Even SQL 2008's new table variable type is woefully short of the goal.)
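
    For readers who haven't seen it, the SQL 2008 feature alluded to above is the user-defined table type, which lets a table-valued parameter be passed into a procedure. A minimal sketch (type, procedure, and table names here are hypothetical):

    -- Declare a reusable table type, then accept it as a (READONLY) parameter
    CREATE TYPE dbo.IntList AS TABLE (Value int NOT NULL PRIMARY KEY)
    GO
    CREATE PROCEDURE dbo.GetThings @Ids dbo.IntList READONLY
    AS
    SELECT t.* FROM dbo.Things t INNER JOIN @Ids i ON t.id = i.Value
    GO

    The READONLY requirement is one of the limitations the post complains about: the procedure cannot modify the passed-in table.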

  • I've taken the two procedure examples you supplied, and the stored procedure with the cursor runs faster than your example using the table variable. I had to change the tables to HumanResources.Employee in AdventureWorks. To process just over 250 rows, it takes twice as long (406 ms) using the table variable method.

    In fact, if you increase the number of rows to 10,000, it takes 15 seconds using the table variable method as opposed to 2 seconds using the cursor. That's a no-brainer to me: stick with the cursor (in this particular case, anyway).

    I'd carefully think twice about using table variables in this manner.

  • Very nice article

    Now I know how to solve my problem: "SELECT * FROM table WHERE column IN @var". I would have tested a lot of things without finding the solution 😀

    thanks for the idea
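
    To make the idea concrete, the "IN @var" wish translates into an IN (or join) against a split function, e.g. using the dbo.SplitInt function posted earlier in this thread (dbo.MyTable and column_id are hypothetical names):

    DECLARE @var varchar(8000)
    SET @var = '1,2,3'

    SELECT t.*
    FROM dbo.MyTable t
    WHERE t.column_id IN (SELECT Value FROM dbo.SplitInt(@var, ','))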

  • The example code in the SQL cursor SP is incorrect, as it displays the last record twice. This is really bad programming: the fetch status should be checked immediately after the fetch in order to exit the loop:

    declare @OK bit

    -- MyCursor, @CompName and @ContName are assumed to be declared earlier
    open MyCursor
    set @OK = 1

    while @OK = 1
    BEGIN
     FETCH MyCursor
     INTO @CompName, @ContName

     -- Check the fetch status immediately, before using the row values
     if @@fetch_status <> 0
      set @OK = 0
     else
     begin
      /* perform operations with single row */
      PRINT 'T_SQL cursor row | ' + @ContName + ' | ' + @CompName
     end
    END

    CLOSE MyCursor
    DEALLOCATE MyCursor
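
    For reference, the more conventional fix for the same double-processing bug is the priming-fetch pattern, which avoids the flag variable entirely (a sketch using the same cursor and variables as above):

    OPEN MyCursor
    FETCH NEXT FROM MyCursor INTO @CompName, @ContName

    WHILE @@FETCH_STATUS = 0
    BEGIN
     PRINT 'T_SQL cursor row | ' + @ContName + ' | ' + @CompName
     FETCH NEXT FROM MyCursor INTO @CompName, @ContName
    END

    CLOSE MyCursor
    DEALLOCATE MyCursor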

  • This article is over 2 1/2 years old, and it generated some controversy the first time around. So why is it a "featured article" in today's newsletter?

    Steve, do you guys need some writers?

    Signature is NULL

  • I tried the same comparisons with a real-world example, a data table with over 200,000 rows. In the first iteration the cursor method took 15 seconds, while I killed the 'array' method after 15 minutes, when only a small fraction of the rows had been processed. The next time around I added a PRIMARY KEY constraint to the identity field, and this did the trick: the time for the array method dropped to 8 seconds. Kind of a key omission, I'd say...
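
    In declaration form, the fix described above is simply (column names here are hypothetical):

    -- Without the PRIMARY KEY, every probe into the table variable is a scan
    DECLARE @Array TABLE
    (
     Idx int IDENTITY(1,1) PRIMARY KEY CLUSTERED,
     Value varchar(100)
    )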

  • By the way, regarding an earlier comment that you cannot Index table variables...

    1) True, technically, and another example of the failure to separate logical and physical issues. (Tables should behave as Tables, regardless...)

    2) False, in practice. You can't "create index" on one. But you *can* create additional UNIQUE constraints on the table variable at declaration time. These are implemented as indexes, and queries (in my tests) will actually use them just like on a real table.

    And any UNIQUE constraint that *also* includes the PK is guaranteed to be unique. Just put the "thing you want to index" first in the list, so it's more selective. (The extra PK part ends up being wasteful, but oh well.)

    I was able to drop a significantly complex scenario from about 10-seconds to about 3-seconds by placing two additional UQ constraints on the table-variable.
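
    A sketch of that trick (column names are hypothetical):

    DECLARE @t TABLE
    (
     Id int IDENTITY(1,1) PRIMARY KEY CLUSTERED,
     CustomerId int NOT NULL,
     OrderDate datetime NOT NULL,
     UNIQUE (CustomerId, Id), -- acts as a nonclustered index on CustomerId
     UNIQUE (OrderDate, Id)   -- and one on OrderDate; the Id tail guarantees uniqueness
    )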
