The ARRAY In SQL Server 2000

  • This is the second article in a row that pushes the use of @tables.  There is little benefit, if any, to using these instead of #tables.  In my experience @tables are a lot harder to work with and don't work well with joins.  In the system I work with, I often need to modify existing code.  About half of the time I am up against @tables, the other half #tables.  After struggling a few times, my first step after I open the SP is now to change all the @tables to #tables.  Just my opinion.

    If you want to jump on board with temp tables and leave table variables out, the syntax is similar: just replace "Declare @myTable table(....)" with "Create table #myTable(....)".
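
    For instance, a side-by-side sketch (the column list here is made up purely for illustration):

    -- table variable
    DECLARE @myTable table (CustomerID int, CustomerName varchar(50))

    -- the equivalent local temporary table
    CREATE TABLE #myTable (CustomerID int, CustomerName varchar(50))

    -- both are used the same way afterwards
    INSERT INTO #myTable (CustomerID, CustomerName) VALUES (1, 'Acme')
    SELECT CustomerID, CustomerName FROM #myTable

    DROP TABLE #myTable  -- temp tables are best dropped explicitly; @myTable simply goes out of scope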

  • TVs are not 100% memory objects...

    http://support.microsoft.com/?kbid=305977

    Q4: Are table variables memory-only structures that are assured better performance as compared to temporary or permanent tables, because they are maintained in a database that resides on the physical disk?

    A4: A table variable is not a memory-only structure. Because a table variable might hold more data than can fit in memory, it has to have a place on disk to store data. Table variables are created in the tempdb database similar to temporary tables. If memory is available, both table variables and temporary tables are created and processed while in memory (data cache).
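
    A quick way to see this for yourself in SQL Server 2000 (run as a single batch; the generated object name will differ on your server):

    -- declare a table variable and look at what just appeared in tempdb
    DECLARE @tv table (i int)

    SELECT TOP 1 name, crdate
    FROM tempdb..sysobjects
    WHERE xtype = 'U'
    ORDER BY crdate DESC
    -- the newest user object in tempdb should be the system-generated name backing @tv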

     

    Terry

  • One more advantage of TVs over temp tables is the optimizer's ability to use the TV when determining an execution plan, where it often cannot do so with a temp table.

    I'm sure we have all had code with a temp table where displaying the execution plan returns an error because the temp table does not exist yet. Replacing the temp tables with TVs (where possible) solves the issue and greatly improves performance of the query.
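
    A rough sketch of the difference Terry describes (object names are made up; SHOWPLAN only compiles the batch, it does not run it):

    SET SHOWPLAN_TEXT ON
    GO
    -- temp table version: the SELECT may fail to compile with "Invalid object name '#results'"
    -- because the CREATE TABLE is never actually executed under SHOWPLAN
    CREATE TABLE #results (id int)
    SELECT id FROM #results
    GO
    -- table variable version: the plan is produced without anything being created
    DECLARE @results table (id int)
    SELECT id FROM @results
    GO
    SET SHOWPLAN_TEXT OFF
    GO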

    Terry

  • Pretty well-written article, but I'm confused. I thought "Arrays" were called "Tables" in SQL Server.

    Why do people keep saying SQL server doesn't support arrays? Maybe it's correct to say SQL Server doesn't support multi-dimensional arrays...

    It seems like it's usually in reference to passing multiple values to a stored procedure, or something along those lines. OK, that's sort of fair if you're thinking of input parameters. However, if you think about it differently, there is another way of "passing" data into a stored procedure, and that's by storing it in a table. That's SQL programming right there.

    And what do cursors have to do with arrays, besides the fact that arrays are usually "looped" through? In SQL, you can access all or some members of the array without looping! It's called set-based processing. Tables don't necessarily live in memory, and that does make them different from arrays. But I might add that TVs and TTs don't necessarily live in memory either.

    So maybe it's a paradigm problem.

    Signature is NULL

  • The author of the article wrote: "but using temporary table you are increasing disk activity since TV (Table Variable) is created in memory as opposed to Temp tables (created in TempDb)."

    Now we know how rumors and myths get started.  I'd like to suggest that you do just a little more research...  Please refer to the Microsoft web site which states the following...

    Q4: Are table variables memory-only structures that are assured better performance as compared to temporary or permanent tables, because they are maintained in a database that resides on the physical disk?

    A4: A table variable is not a memory-only structure. Because a table variable might hold more data than can fit in memory, it has to have a place on disk to store data. Table variables are created in the tempdb database similar to temporary tables. If memory is available, both table variables and temporary tables are created and processed while in memory (data cache).

    The website is:

    http://support.microsoft.com/default.aspx?scid=kb;en-us;305977&Product=sql2k

    On top of that, if the table variables are large, the following comes into play (from the same website)...

    Q3: What are some of the drawbacks of table variables?

    A3: These are some of the drawbacks as compared to temporary tables:

    Non-clustered indexes cannot be created on table variables, other than the system indexes that are created for a PRIMARY or UNIQUE constraint. That can influence the query performance when compared to a temporary table with non-clustered indexes.
    Table variables do not maintain statistics like temporary tables can. Statistics cannot be created on table variables through automatic creation or by using the CREATE STATISTICS statement. Therefore, for complex queries on large tables, the lack of statistics may deter the optimizer to determine the best plan for a query, thus affecting the performance of that query.

    And, yes, I agree that cursors should be replaced but NOT that way.   Show me the resource differences between a "firehose" cursor and the table variable/While loop method... I think you'll be surprised how little the difference is...
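
    To make that comparison concrete, here is a minimal sketch of the two row-by-row approaches being weighed against each other (the per-row work is just a placeholder):

    -- "firehose" style cursor
    DECLARE @name sysname
    DECLARE c CURSOR LOCAL FAST_FORWARD FOR
        SELECT name FROM dbo.SysObjects
    OPEN c
    FETCH NEXT FROM c INTO @name
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- per-row work goes here
        FETCH NEXT FROM c INTO @name
    END
    CLOSE c
    DEALLOCATE c

    -- table variable + WHILE loop doing exactly the same row-by-row work
    DECLARE @work table (RowId int identity(1,1) primary key, name sysname)
    DECLARE @row int, @rows int

    INSERT INTO @work (name) SELECT name FROM dbo.SysObjects
    SELECT @row = 1, @rows = @@ROWCOUNT

    WHILE @row <= @rows
    BEGIN
        SELECT @name = name FROM @work WHERE RowId = @row
        -- per-row work goes here
        SET @row = @row + 1
    END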

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Note that decisions to use cursors, temp tables and table variables should not only be based on the duration required to run a single process using them. The real cost is in scalability, not performance. Of course the real problem is using row-by-row processing instead of a set-based solution.

  • Using XML to pass in a 'list' of multiple values to use as a search argument is a very bad decision resource-wise. There are set-based solutions that can split a comma-separated list (with no procedural string parsing) much more effectively, so that you do not need to a) pass way too large strings (XML is BLOATED) and, more importantly, b) use system resources to parse XML.

  • I would be interested to know the set-based solutions without string parsing.

    Thanks

  • --create all objects
    IF EXISTS (SELECT * FROM dbo.SysObjects WHERE name = 'Numbers' AND XType = 'U' AND user_name(uid) = 'dbo')
        DROP TABLE Numbers
    GO

    CREATE TABLE dbo.Numbers (PkNumber int identity(1,1) primary key clustered, dude bit null)
    GO

    --100 x 80 = 8,000 rows, enough to split a varchar(8000) list
    INSERT INTO dbo.Numbers (dude)
    SELECT NULL
    FROM (SELECT TOP 100 NULL AS A FROM master.dbo.spt_values) dt100
    CROSS JOIN (SELECT TOP 80 NULL AS A FROM master.dbo.spt_values) dt80
    GO

    ALTER TABLE dbo.Numbers
        DROP COLUMN dude
    GO

    CREATE FUNCTION [dbo].[fnSplit_Set] (@IDS as varchar(8000), @vcDelimiter varchar(3))
    RETURNS TABLE
    WITH SCHEMABINDING
    AS
    RETURN
        SELECT dtSplitted.EachID, dtSplitted.Rank
        FROM (
            SELECT SUBSTRING(@vcDelimiter + @IDs + @vcDelimiter, N.PkNumber + LEN(@vcDelimiter),
                   CHARINDEX(@vcDelimiter, @vcDelimiter + @IDs + @vcDelimiter, N.PkNumber + LEN(@vcDelimiter)) - N.PkNumber - LEN(@vcDelimiter)) AS EachID
                 , (LEN(SUBSTRING(@IDs, 1, N.PkNumber)) - LEN(REPLACE(SUBSTRING(@IDs, 1, N.PkNumber), ',', ''))) + 1 AS Rank
            FROM dbo.Numbers N
            --every row of Numbers that lands on a delimiter marks the start of one element;
            --the trailing delimiter is excluded so CHARINDEX always finds a closing delimiter
            WHERE SUBSTRING(@vcDelimiter + @IDs + @vcDelimiter, N.PkNumber, LEN(@vcDelimiter)) = @vcDelimiter
              AND PkNumber < LEN(@vcDelimiter + @IDs + @vcDelimiter)
        ) dtSplitted
        WHERE LEN(dtSplitted.EachID) > 0
    GO
    --end creation part

    --start testing
    --generate a list of ids
    DECLARE @Ids as varchar(8000)
    SET @IDS = ''

    SELECT @IDS = @IDS + CAST(id AS varchar(10)) + ',' FROM (SELECT TOP 10 id FROM dbo.SysObjects ORDER BY NEWID()) dtIds
    SET @IDS = LEFT(@IDS, ABS(LEN(@IDS) - 1))   --drop the trailing comma
    PRINT @IDS

    --use in a query
    SELECT O.id, dtIDS.EachId, O.name, dtIDS.Rank
    FROM dbo.SysObjects O
    INNER JOIN (SELECT CAST(EachID AS int) AS EachID, Rank FROM dbo.fnSplit_Set(@IDS, ',')) dtIDS
        ON O.id = dtIDS.EachID
    ORDER BY O.Name

    SELECT O.id, dtIDS.EachId, O.name, dtIDS.Rank
    FROM dbo.SysObjects O
    INNER JOIN (SELECT CAST(EachID AS int) AS EachID, Rank FROM dbo.fnSplit_Set(@IDS, ',')) dtIDS
        ON O.id = dtIDS.EachID
    ORDER BY dtIDS.Rank

    --Don't ever forget to cast the EachID column to the same datatype as in the join to avoid table scans.

  • Each of the methods mentioned in the article has its place in a DBA's tool box. I have used the XML approach many times with good performance but it is important to test each method to see what works best in your own environment.

    An important point to note when using sp_xml_preparedocument is that it supposedly uses 1/8th the total memory on the server. I would not use it in a sproc that is going to be called very frequently.

    From BOL:

    sp_xml_preparedocument returns a handle that can be used to access the newly created internal representation of the XML document. This handle is valid for the duration of the connection to Microsoft® SQL Server™ 2000, until the connection is reset, or until the handle is invalidated by executing sp_xml_removedocument.

    Note A parsed document is stored in the internal cache of SQL Server 2000. The MSXML parser uses one-eighth the total memory available for SQL Server. To avoid running out of memory, run sp_xml_removedocument to free up the memory.
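
    In practice that means pairing every prepare with a remove and releasing the handle as soon as the shredding is done; a minimal sketch (element and column names are made up):

    DECLARE @hDoc int
    DECLARE @xml varchar(8000)

    SET @xml = '<ids><id value="1"/><id value="2"/><id value="3"/></ids>'

    EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml

    SELECT value
    FROM OPENXML(@hDoc, '/ids/id', 1)
         WITH (value int)

    EXEC sp_xml_removedocument @hDoc   -- free the cached document immediately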

  • So is that saying that if I have 80MB of SQL Server memory allocated then it will use 10MB of RAM and if I have 800MB of SQL Server memory allocated then it will use 100MB of RAM?  Seems a bit off to me!?

    Perhaps it can use UP TO 1/8th of the allocated RAM? But I may be wrong.

  • I was wondering whether this is actually the solution to a problem I have, or whether it would be too inefficient.  If anyone has any clever suggestions, then please let me know.

    I have an application which allows the user to enter changes to a list of address records, and when they click "OK" the changes are submitted to the database by calling a stored procedure once for every row which has been modified.  The problem is that there is a description field which is unique (for the selected company), and if the user changes the description of one address, then reuses the old description on another, sometimes there is a violation of the unique constraint, depending on which order the rows happen to be updated in.  I cannot delete and recreate the address records, because there are lots of other records which reference them.

    One solution (I think) would be to try and send details of all the records changed to a stored procedure in one of these arrays, so that this can be parsed into a temporary table of some kind, and then the update can be made from this table.  This would have the advantage of being able to make all of the updates in a single network trip, but it'd be a bit of a palaver getting the changes into the XML in the first place, and if this method is as bad as everybody says...

    The only other way I can think of would be to use a temporary table, and have one stored procedure which would send the changed records to the temporary table, and another which would fire the update from this temporary table.  This doesn't seem ideal either.

    Does anyone have any bright ideas?

    Rachel.

  • I personally hate XML because it's too bloated, so I'd go with the temp table approach. I don't think you have any other options in this case. Anyway, this wouldn't hit the server a lot since this task wouldn't be run 1000s of times/hour.

  • Rachel,

    Yeah, a staging table of some sort is ideal for this.

    I'm assuming you want all changes in the same transaction, so that if one change fails due to constraints then all changes fail. Staging the data before updating allows you to validate it, and gracefully exit the update if you see that constraints are going to be violated.

    You certainly don't need a SQL array; that's what the staging table is for, and you validate the data with SQL statements. If you wanted to use an array, you'd do your validation on the client side (which may actually be even better than using SQL), and any client language will certainly support arrays.
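
    As a minimal sketch of that approach (table and column names are invented to match Rachel's description), the key point is that a single set-based UPDATE has its unique constraint checked once, after all rows have changed, so swapping two descriptions no longer fails part way through:

    -- staging table holding one row per modified address
    CREATE TABLE #AddressChanges (AddressID int PRIMARY KEY, Description varchar(100))

    -- ...client inserts the changed rows into #AddressChanges, then calls the update...

    BEGIN TRAN

    -- validate: the submitted set itself must not contain duplicate descriptions
    IF EXISTS (SELECT Description FROM #AddressChanges GROUP BY Description HAVING COUNT(*) > 1)
    BEGIN
        ROLLBACK TRAN
        RAISERROR ('Duplicate descriptions in the submitted changes', 16, 1)
        RETURN
    END

    -- one statement updates every staged row; the constraint is evaluated after the statement
    UPDATE A
    SET    A.Description = C.Description
    FROM   dbo.Address A
           INNER JOIN #AddressChanges C ON C.AddressID = A.AddressID

    COMMIT TRAN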

    Signature is NULL

  • Good point to recall. Always validate client side first. Then revalidate on the server to be 100% sure. But I'd still go with a single transaction update using a temp table. This is actually a classic case of a good use for temp tables.

