Temp tables vs "permanent" temp tables vs user-defined table types

  • Hi,

    We have an SP that receives an XML parameter with loads of data (sometimes over 1,000 records).

    The XML data is used in the SP, and in other SPs called from the "main" SP, and is joined with other tables.

    What's the best way to store the XML?

    1. A #temp table, knowing it will always be created and dropped, not taking advantage of statistics (it can have a PK and indexes to improve the joins);

    2. A "permanent" temp table, that is, a table created like all the others but whose data is inserted and deleted just for processing purposes, with a SessionId in the key. This can take advantage of statistics, and the table isn't created and dropped every time the SP is called;

    3. A user-defined table type to be used as a parameter (it can have a PK to improve the joins).

    If the software were written in C#, which can use user-defined table types, I'd probably go with option 3 (a sketch of it follows below), but since it's VB6, which doesn't support them, the data has to be passed as one big XML chunk.
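    For illustration, a minimal sketch of option 3, assuming a hypothetical order-line shape for the data (all names here are invented):

        -- Hypothetical user-defined table type with a PK to support the joins (option 3)
        CREATE TYPE dbo.OrderLineType AS TABLE
        (
            LineId    int NOT NULL PRIMARY KEY,
            ArticleId int NOT NULL,
            Qty       int NOT NULL
        );
        GO
        -- An SP receives it as a READONLY table-valued parameter
        CREATE PROCEDURE dbo.AddOrder @Lines dbo.OrderLineType READONLY
        AS
        SELECT l.LineId, l.ArticleId, l.Qty
        FROM   @Lines AS l;
        GO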

    Thanks,

    Pedro



    If you need to work better, try working less...

  • What's the best way to store the XML?

    An XML data type

    http://msdn.microsoft.com/en-us/library/ms187339%28v=sql.120%29.aspx



    For better, quicker answers on T-SQL questions, read Jeff Moden's suggestions.

    "Million-to-one chances crop up nine times out of ten." ― Terry Pratchett, Mort

  • Dennis Post (9/4/2013)


    What's the best way to store the XML?

    An XML data type

    http://msdn.microsoft.com/en-us/library/ms187339%28v=sql.120%29.aspx

    The XML parameter used is already an XML data type.

    To use the XML with other tables, its data has to be converted to a table, either with the nodes() method or with the sp_xml_preparedocument SP. Since it's used more than once, it has to be stored in a table to be faster.

    It's the type of table to store it in that I'm asking about...
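    For reference, a minimal sketch of that shredding step, assuming a hypothetical <Order><Line/> shape (all names invented):

        -- Shred the XML parameter into a #temp table once, then reuse it in later joins
        DECLARE @Data xml = N'<Order><Line id="1" article="42" qty="3" /></Order>';

        CREATE TABLE #Lines
        (
            LineId    int NOT NULL PRIMARY KEY,
            ArticleId int NOT NULL,
            Qty       int NOT NULL
        );

        INSERT INTO #Lines (LineId, ArticleId, Qty)
        SELECT x.n.value('@id',      'int'),
               x.n.value('@article', 'int'),
               x.n.value('@qty',     'int')
        FROM   @Data.nodes('/Order/Line') AS x(n);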



    If you need to work better, try working less...

  • #temp tables have statistics. It's table variables that do not have statistics.

    Here's a question. Will more than one user be running this query? If so, using a permanent table will require you to also have a mechanism to separate out each person's data so that you're not stepping on each other.

    In general, when doing this kind of work, assuming the secondary processing needs statistics (meaning, you filter on the data after loading it) then I would use temporary tables. If you don't need statistics (no filtering of ANY kind including JOIN operations), then I would use table variables. But then, if you don't need to do secondary processing, I'd just use XQUERY to access the XML directly.
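    To illustrate that last point, a direct XQuery read with no staging table at all, assuming the same hypothetical @Data shape as above:

        -- Query the XML parameter directly; fine when the result is only needed once
        SELECT x.n.value('@article', 'int') AS ArticleId,
               x.n.value('@qty',     'int') AS Qty
        FROM   @Data.nodes('/Order/Line') AS x(n);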

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (9/4/2013)


    Here's a question. Will more than one user be running this query? If so, using a permanent table will require you to also have a mechanism to separate out each person's data so that you're not stepping on each other.

    That's why I mentioned I'd need to add a SessionId column, to filter the rows for each user of the SP. The SP can be running 10 or more times concurrently and is called over 400 times a day (it's the SP for adding orders and recalculating stock).
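    For context, a minimal sketch of what that permanent staging table might look like (all names hypothetical):

        -- Permanent staging table; SessionId keeps concurrent callers apart (option 2)
        CREATE TABLE dbo.OrderLineStaging
        (
            SessionId uniqueidentifier NOT NULL,
            LineId    int              NOT NULL,
            ArticleId int              NOT NULL,
            Qty       int              NOT NULL,
            CONSTRAINT PK_OrderLineStaging PRIMARY KEY (SessionId, LineId)
        );

        -- ...and each call has to clean up its own rows when it's done
        DELETE FROM dbo.OrderLineStaging WHERE SessionId = @SessionId;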

    Grant Fritchey (9/4/2013)


    In general, when doing this kind of work, assuming the secondary processing needs statistics (meaning, you filter on the data after loading it) then I would use temporary tables. If you don't need statistics (no filtering of ANY kind including JOIN operations), then I would use table variables. But then, if you don't need to do secondary processing, I'd just use XQUERY to access the XML directly.

    The data is used more than once in the "main" SP, and in the other SPs as well; that's why we're using temp tables. Isn't it faster to store the data in a table rather than using .nodes() over and over, or sp_xml_preparedocument? In the past I ran a test comparing nodes() with sp_xml_preparedocument: there's no big difference in performance (time), but if the XML uses attributes and there are more than 50 or so of them, nodes() is a lot slower.

    Also, the data is filtered and used to update data in other tables.
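    For comparison, the sp_xml_preparedocument route that test refers to looks roughly like this (a sketch, reusing the invented @Data shape from above):

        -- sp_xml_preparedocument parses the XML once into an internal representation
        DECLARE @hDoc int;
        EXEC sp_xml_preparedocument @hDoc OUTPUT, @Data;

        SELECT ArticleId, Qty
        FROM   OPENXML(@hDoc, '/Order/Line', 1)  -- 1 = attribute-centric mapping
               WITH (ArticleId int '@article',
                     Qty       int '@qty');

        -- Always release the handle, or the parsed document stays in memory
        EXEC sp_xml_removedocument @hDoc;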

    Thanks,

    Pedro



    If you need to work better, try working less...

  • Loading & querying XML is expensive, especially with regards to memory. But, if you're already loading it, then it's not that much more expensive to use it twice. However, if you are processing through filtering, you may simply be better off going to a temporary table. Your overhead is focused into tempdb rather than into a table that you'll have to index carefully to avoid blocking issues. You'll still have statistics available for the filtering. You don't have to write extra code to clean up the data when you're done with it (and again, looking at blocking & resource issues around a single, permanent table).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (9/4/2013)


    Loading & querying XML is expensive, especially with regards to memory. But, if you're already loading it, then it's not that much more expensive to use it twice. However, if you are processing through filtering, you may simply be better off going to a temporary table. Your overhead is focused into tempdb rather than into a table that you'll have to index carefully to avoid blocking issues. You'll still have statistics available for the filtering. You don't have to write extra code to clean up the data when you're done with it (and again, looking at blocking & resource issues around a single, permanent table).

    Thanks Grant,

    The filtering we do happens when joining with other tables, for example the articles table, to get some extra information about each article.

    Since there can be over 1,000 rows, I think the statistics and indexes can be useful.

    Thanks,

    Pedro



    If you need to work better, try working less...

    Just one more thing, probably a "stupid" question, but here goes anyway 😉

    Is creating a temp table with SELECT ... INTO #temp FROM ..., followed by CREATE CLUSTERED INDEX ... ON #temp (id) and CREATE INDEX ... ON #temp (...), slower than CREATE TABLE #temp (...) followed by INSERT INTO #temp SELECT ... FROM ...?

    And are the statistics on both methods the same?

    Thanks,

    Pedro



    If you need to work better, try working less...

  • PiMané (9/4/2013)


    Just one more thing, probably a "stupid" question, but here goes anyway 😉

    Is creating a temp table with SELECT ... INTO #temp FROM ..., followed by CREATE CLUSTERED INDEX ... ON #temp (id) and CREATE INDEX ... ON #temp (...), slower than CREATE TABLE #temp (...) followed by INSERT INTO #temp SELECT ... FROM ...?

    And are the statistics on both methods the same?

    Thanks,

    Pedro

    Two general options:

    1) The table and indexes are created, then the data is loaded

    2) The data goes in, then the indexes go on

    The second choice is likely to have better statistics. Likely, not definitely (as in 100%). Creating an index results in a full scan for the statistics, whereas when adding data to existing indexes you're subject to the auto-update process and sampled updates by default.
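    In code, the two patterns look roughly like this (the source table and index names are illustrative):

        -- Pattern 1: create the table and indexes first, then load the data
        CREATE TABLE #Lines1 (LineId int NOT NULL, ArticleId int NOT NULL);
        CREATE CLUSTERED INDEX cx_Lines1 ON #Lines1 (LineId);
        INSERT INTO #Lines1 (LineId, ArticleId)
        SELECT LineId, ArticleId FROM dbo.SourceTable;
        -- statistics now depend on auto-update, which samples by default

        -- Pattern 2: load the data first, then build the indexes
        SELECT LineId, ArticleId
        INTO   #Lines2
        FROM   dbo.SourceTable;
        CREATE CLUSTERED INDEX cx_Lines2 ON #Lines2 (LineId);
        -- the index build scans every row, so its statistics are full-scan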

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks for the enlightenment 🙂

    Pedro



    If you need to work better, try working less...

  • PiMané (9/4/2013)


    What's the best way to store the XML?

    The best way, IMHO, is to shred it on receipt and store it as properly normalized data. XML is formatted data, and one of the worst data sins there is, is to store formatted data.
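    As a sketch of what "shred on receipt" might look like, assuming a hypothetical normalized dbo.OrderLines table and the same invented XML shape as earlier in the thread:

        -- Shred the incoming XML straight into a normalized table; the XML itself is never stored
        INSERT INTO dbo.OrderLines (OrderId, LineId, ArticleId, Qty)
        SELECT @OrderId,
               x.n.value('@id',      'int'),
               x.n.value('@article', 'int'),
               x.n.value('@qty',     'int')
        FROM   @Data.nodes('/Order/Line') AS x(n);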

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

    XML with indexes is very slow and takes lots of memory.

    I've run some tests comparing XML without indexes, XML with indexes, and data stored "vertically" (a design where data is stored in a database so the user can define whatever they want), and the XML was always very, very slow no matter what.

    The best way to use XML, in this case a very large XML document, is to store its data in a temp table.
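    For reference, "XML with indexes" here presumably means a primary XML index on the column; a sketch on a hypothetical dbo.Orders table:

        -- The base table needs a clustered primary key before a primary XML index can be created
        CREATE PRIMARY XML INDEX PXML_Orders_Data
        ON dbo.Orders (OrderData);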



    If you need to work better, try working less...

  • Jeff Moden (9/19/2013)


    PiMané (9/4/2013)


    What's the best way to store the XML?

    The best way, IMHO, is to shred it on receipt and store it as properly normalized data. XML is formatted data, and one of the worst data sins there is, is to store formatted data.

    The problem is that we have clients whose receipts have over 1,000 lines, so sending the data to the SP as proper data would mean calling an SP 1,000 times, or using a user-defined table type (which only exists in SQL 2008 and later, and VB6 doesn't support it... this is our big problem).

    That's why we decided to use XML to send the data to the SP and then create the temp table with its data, to be used throughout the whole process (we are code-reviewing the process and so far have found over 50 SPs, functions, and triggers that are called along the way...).

    Pedro



    If you need to work better, try working less...

  • PiMané (9/19/2013)


    Jeff Moden (9/19/2013)


    PiMané (9/4/2013)


    What's the best way to store the XML?

    The best way, IMHO, is to shred it on receipt and store it as properly normalized data. XML is formatted data, and one of the worst data sins there is, is to store formatted data.

    The problem is that we have clients whose receipts have over 1,000 lines, so sending the data to the SP as proper data would mean calling an SP 1,000 times, or using a user-defined table type (which only exists in SQL 2008 and later, and VB6 doesn't support it... this is our big problem).

    That's why we decided to use XML to send the data to the SP and then create the temp table with its data, to be used throughout the whole process (we are code-reviewing the process and so far have found over 50 SPs, functions, and triggers that are called along the way...).

    Pedro

    Heh... you asked what the best way to store XML is. I suggested shredding it and storing it in normalized tables. Are you storing the XML in a table or not?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
