﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / Article Discussions / Article Discussions by Author / Discuss content posted by Sunny  / Gain Space Using XML data type / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Wed, 19 Jun 2013 18:27:19 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Just be aware that XML indexes are huge so if you want to get any performance out of querying the data, you're actually going to end up using a lot more space!!</description><pubDate>Thu, 13 Aug 2009 10:45:07 GMT</pubDate><dc:creator>Robert Davis</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Yes, this turned out to be another great discussion - as so often happens on SSC, an article prompts conversation which is just as interesting as the article itself, in many ways.  Cheers everyone, whether you actively participated or 'watched from the sidelines'!Paul</description><pubDate>Mon, 10 Aug 2009 04:37:39 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>[quote][b]Sunny (8/10/2009)[/b][hr]Thanks a lot again Paul for the detailed information. Your every post gets me fresh and new useful tips.:-)[/quote]...and for those of us watching from the sidelines, it's been great too!Thanks for all the effort, guys!</description><pubDate>Mon, 10 Aug 2009 00:32:19 GMT</pubDate><dc:creator>David McKinney</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Thanks a lot again Paul for the detailed information. Your every post gets me fresh and new useful tips.:-)</description><pubDate>Mon, 10 Aug 2009 00:28:13 GMT</pubDate><dc:creator>Sunny-138471</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Hey Satish,Many thanks for the comprehensive reply, and for the extra details.  I should make it clear that I am not at all anti-XML; I use it frequently wherever appropriate.  The point of the two scripts I posted was to challenge some of the assertions made in the article, not to try to claim that a non-XML solution is always to be preferred.  There are very few absolutes in the SQL Server world. :-)  That said, I do want to respond to some of the new points raised:Whenever data is inserted out of sequence, page splits may occur and affected indexes will fragment.  This obviously applies to clustered and non-clustered indexes alike.  We should remember that this is only a performance issue if large range scans are performed regularly - logical fragmentation has no effect on singleton selects.If page splitting and fragmentation do cause measurable performance problems, there are many ways to mitigate the issue.  The affected indexes can be created with a fillfactor which retains just enough free space (at the leaf level, and optionally at the intermediate levels using PAD_INDEX) to ensure that pages do not split before the next index reorganization or rebuild.  In Enterprise Edition, online index operations and the ability to rebuild a single partition (offline) make this less problematic on systems which must be continuously available.As far as the space implications of extra non-clustered indexes are concerned, I would ask you to take a look at the second script I posted, which uses a denormalized structure.  This achieves the best space savings of all, and is the fastest at returning results - significantly improving on the previous non-XML implementation.We should also remember that the XML data may require several XML indexes (including a primary index which will be a complete copy of the XML data).  A final point on XML is that once the data length exceeds 8000 bytes, the data is physically stored as a LOB - exactly as for text and image columns (as well as the MAX types) - with all the potential performance issues that implies.Whether the extra processing on INSERT is worthwhile depends on your requirements.  If you are processing millions of new rows each day, it might be an issue.  Often it does not really mater too much.  If the application is sending large blocks of XML, it might be worth breaking it up using the nodes method and storing it as denormalized values simply to avoid the LOB issues mentioned.  Equally, it may make sense to store the XML as-is.  A lot depends on the queries which will access the data - both for retrieval by the application, and for any reporting you may have.  Retrieving data as XML is fairly trivial since we can use FOR XML to efficiently construct XML from the denormalized data in a simple SELECT.So, there are always reasons for and against.  I would probably be tempted to stay with XML storage in your situation - it seems a good fit, so long as ad-hoc data access can be made to work efficiently.  The key factor for me is that the data is not large enough to warrant heavy optimization on the server, and the application clearly prefers to work with XML.My main point is that it is normally possible to produce a non-XML solution which uses less space and performs faster, but that is not often the overriding concern.  Thanks again for an interesting discussion!Paul</description><pubDate>Sat, 08 Aug 2009 08:51:22 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Hi PaulI executed the script and it does prove your point. The script is indeed an intelligent and efficient option. It has made me rethink over our XML implementation which has been in place here for quite some time now. Besides its given me some very important tips which I could really make use of.The XML design however still stands out here for some situational reasons mentioned below:1. Index: Thats the basic difference in the scripts. As mentioned in the article, this design is for an OLTP table wherein data is continuously inserted and the index has to be already in place. Secondly the data is not necessarily in order of columns mentioned in clustered index, not even in order of date. It is uploaded as and when different locations send in their data to the storage center. That would lead to fragmentation and ultimately slow performance. Hence we had to take a decision of dropping the clustered index and creating a non clustered index. That was even before moving to XML. That way space requirement was lesser for XML. Again we could never have Hour before Location in the order of the index for the same reason.2. When a particular location sends in its data, the application preferably sends in the complete data set to the database rather than sending in one datapoint at a time. And what better format to pass data than XML itself. In conventional format, the SQL procedure would have to parse that XML and then perform the insert while in the XML implementation, we just have to insert the XML as-it-is into the table.3. One very important point again only relevant to the situation is the way the data is retrieved back by the application. As I had mentioned, most of the times application requires a complete dataset and not individual datapoints. In such cases the database returns the dataset as an XML as-it-is instead of tabular resultset. The application has better tools of mapping the XML to a graph or any other display. The only reason why the dataset is not stored in binary format is for those rare cases wherein SQL itself has to dig for an individual datapoint and that is the scenario used in your script.I would request you to check for a data insert scenario using clustered index and compare the results. Again please do see the results when the table sends the data in an XML format directly to the application. I tried the following queries:Data entry scenario (on top of 3 years data):[code="sql"]DBCC FREESYSTEMCACHE (N'SQL Plans');GODBCC DROPCLEANBUFFERS;GO/* XML */DECLARE @Data xmlSELECT @Data = '&lt;RT&gt;&lt;RN H="0" V="9.28196157" /&gt;&lt;RN H="1" V="2.88468614" /&gt;&lt;RN H="2" V="5.42318281" /&gt;&lt;RN H="3" V="6.85346648" /&gt;&lt;RN H="4" V="1.56170590" /&gt;&lt;RN H="5" V="2.01391990" /&gt;&lt;RN H="6" V="1.64564738" /&gt;&lt;RN H="7" V="7.37822517" /&gt;&lt;RN H="8" V="4.13525886" /&gt;&lt;RN H="9" V="5.49004453" /&gt;&lt;RN H="10" V="4.15415243" /&gt;&lt;RN H="11" V="1.56415943" /&gt;&lt;RN H="12" V="8.98584731" /&gt;&lt;RN H="13" V="0.22257066" /&gt;&lt;RN H="14" V="7.47536317" /&gt;&lt;RN H="15" V="9.32589388" /&gt;&lt;RN H="16" V="2.11402752" /&gt;&lt;RN H="17" V="8.72059968" /&gt;&lt;RN H="18" V="9.01413286" /&gt;&lt;RN H="19" V="4.20653117" /&gt;&lt;RN H="20" V="8.11476306" /&gt;&lt;RN H="21" V="1.62897999" /&gt;&lt;RN H="22" V="7.17795456" /&gt;&lt;RN H="23" V="1.93667052" /&gt;&lt;/RT&gt;'INSERT Rainfall_XML(LocID,Date,Rainfall) SELECT 1,'20060125',@DataGODBCC DROPCLEANBUFFERS;GO/* SQL */DECLARE @Data xmlSELECT @Data = '&lt;RT&gt;&lt;RN H="0" V="9.28196157" /&gt;&lt;RN H="1" V="2.88468614" /&gt;&lt;RN H="2" V="5.42318281" /&gt;&lt;RN H="3" V="6.85346648" /&gt;&lt;RN H="4" V="1.56170590" /&gt;&lt;RN H="5" V="2.01391990" /&gt;&lt;RN H="6" V="1.64564738" /&gt;&lt;RN H="7" V="7.37822517" /&gt;&lt;RN H="8" V="4.13525886" /&gt;&lt;RN H="9" V="5.49004453" /&gt;&lt;RN H="10" V="4.15415243" /&gt;&lt;RN H="11" V="1.56415943" /&gt;&lt;RN H="12" V="8.98584731" /&gt;&lt;RN H="13" V="0.22257066" /&gt;&lt;RN H="14" V="7.47536317" /&gt;&lt;RN H="15" V="9.32589388" /&gt;&lt;RN H="16" V="2.11402752" /&gt;&lt;RN H="17" V="8.72059968" /&gt;&lt;RN H="18" V="9.01413286" /&gt;&lt;RN H="19" V="4.20653117" /&gt;&lt;RN H="20" V="8.11476306" /&gt;&lt;RN H="21" V="1.62897999" /&gt;&lt;RN H="22" V="7.17795456" /&gt;&lt;RN H="23" V="1.93667052" /&gt;&lt;/RT&gt;'INSERT Rainfall_Con(LocID,Date,[Hour],Rainfall) SELECT 1,'20060125',T.Item.value('./@H','tinyint') AS Hour,T.Item.value('./@V','numeric(10,8)') AS Rainfall FROM @Data.nodes('/RT/RN') AS T(Item)GO[/code]Data retrieval in XML format:[code="sql"]IF OBJECT_ID(N'tempdb..#BitBucket', N'U') IS NOT NULL DROP TABLE #BitBucket;GO---- This table is used to capture the results from the test-- SELECT statements.  This avoids any delays in sending the-- rows to the client from skewing the results--CREATE TABLE #BitBucket    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Hour]        TINYINT NOT NULL,    [Rainfall]    DECIMAL(10,8) NOT NULL,    );GO---- TEST START--RAISERROR('Starting test', 0, 1) WITH NOWAIT;---- Remove all ad-hoc SQL query plans from cache--DBCC FREESYSTEMCACHE (N'SQL Plans');GO---- Start with a cold data cache-- Each implementation is run 10 times-- One time with a cold cache, then nine times with a warm cache-- This seems more representative to me, feel free to change it--DBCC DROPCLEANBUFFERS;GO/* SQL */INSERT  #BitBucket WITH (TABLOCKX)SELECT  TP.[LocID],        TP.[Date],        TP.[Hour],        TP.[Rainfall]FROM    dbo.Rainfall_Con TP WITH (READUNCOMMITTED) WHERE TP.LocID IN (1,4,6,8) AND TP.Date = '20080106'GO 10---- For fairness, recreate the dump table for the second run-- TRUNCATE TABLE might give the second run an unfair advantage-- since mixed and uniform extent allocations would not need-- to be re-done--DROP TABLE #BitBucket;GOCREATE TABLE #BitBucket    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Rainfall]    XML NOT NULL,    );GO---- Second run - XML--DBCC DROPCLEANBUFFERS;GO/* XML */INSERT  #BitBucket WITH (TABLOCKX)SELECT TP.[LocID],        TP.[Date],        TP.Rainfall FROM Rainfall_XML TP WITH (READUNCOMMITTED) WHERE TP.LocID IN (1,4,6,8) AND TP.Date = '20080106'-- (4 row(s) affected)GO 10[/code]I do agree that this design cannot be generalized at all and would apply to only a limited number of scenarios. Your feedback would be sincerely appreciated.ThanksSatish More</description><pubDate>Sat, 08 Aug 2009 07:39:45 GMT</pubDate><dc:creator>Sunny-138471</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Excellent explanation, Paul. Perhaps we should have you writing XML articles ;-)</description><pubDate>Thu, 06 Aug 2009 19:42:46 GMT</pubDate><dc:creator>Steve Jones - SSC Editor</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Very interesting indeed!  Thanks for the great write up, Paul.  That certainly puts a different slant on "the rumor".</description><pubDate>Thu, 06 Aug 2009 19:08:43 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>[quote][b]Jeff Moden (8/6/2009)[/b][hr]Ah... got it.  My only confusion now would be the same as on the MSDN thread because the note in BOL simply states "The MSXML parser uses one-eighth the total memory available for SQL Server."  It doesn't say "up to"...[/quote]Yes the wording could be better.  I refreshed my memory on the detail last night (one of Ken Henderson's excellent books) so I just want to add:SQL Server uses MSXML and the DOM to process documents you load via sp_xml_prepare&amp;#100;ocument. It restricts the virtual memory MSXML can use for DOM processing to [i]one-eighth of the physical memory on the machine or 500MB, whichever is less[/i].It's unlikely that MSXML would be able to get 500MB of virtual memory since SQL Server reserves most of the user mode address space for use by the buffer cache.  MSXML allocates memory via the multi-page allocator (the so-called MemToLeave region) which has a default maximum size of 256MB.  Since that region is used for all sorts of stuff (by SQL Server as well) it's likely to be much less than that.MSXML is limited to 500MB of virtual memory [i]regardless of the amount of memory on the machine[/i] because SQL Server calls the GlobalMemoryStatus Win32 API function to determine the amount of available physical memory. On machines with more than 4GB of physical memory, GlobalMemoryStatus can return incorrect information, so Windows returns a -1 to indicate an overflow. The Win32 API function GlobalMemoryStatusEx exists to address this shortcoming, but SQLXML does not call it. :laugh:One last thing that I had forgotten - MSXML had an ordinal-only DLL entryn point added specifically for SQL Server to allow it to set the maximum virtual memory to use.  Interesting stuff!Paul</description><pubDate>Thu, 06 Aug 2009 16:01:27 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>[quote][b]Paul White (8/5/2009)[/b][hr][quote][b]Jeff Moden (8/5/2009)[/b][hr]As a side bar... There is a rumor (and I STRESS the word RUMOR!) I heard and I can't confirm it or deny it.  The rumor says that when you use any of the XML extensions in SQL Server, that it automatically allocates at least 1/8th of the available memory to the task.  Does anyone know of a reference that can confirm or deny that rumor?[/quote]I think this is the thing that causes the confusion: [u][url=http://msdn.microsoft.com/en-us/library/ms187367.aspx]sp_xml_preparedocument[/url][/u]So that's just saying that when you use the XML parser to open a document for use with OPENXML, [i]up to[/i] one-eighth of SQL Server memory is [i]potentially [/i]available to Msxmlsql.dll - [i]should it need it[/i].This is the first MSFT answer I found with Google to back that up: [u][url=http://social.msdn.microsoft.com/forums/en-US/sqlxml/thread/51b46347-adee-4ac2-9b73-d4cd31aa2793/]MSDN Forums Thread[/url][/u].My impression (though the method has so far escaped deprecation) is that XML DML methods/XQuery/XPath/FOR XML and the gang are to be preferred to using sp_xml_preparedocument and OPENXML wherever possible.  There are many difficulties associated with OPENXML so personally I avoid it.To be clear, all that applies specifically to prepare/remove document and OPENXML.  XML variables, columns and so on allocate memory in the usual fashion.  That applies to the XML portion of the query optimizer too.  One interesting thing about xml variables (like all the MAX datatypes when used as variables) is that they require tempdb workspace.Many of the XML features use streaming interfaces to minimize memory consumption - bulk loading XML is a good example of this.  See [u][url=http://msdn.microsoft.com/en-us/library/ms345117(SQL.90).aspx#sql2k5xml_topic3]XML Support in Microsoft SQL Server 2005[/url][/u]Paul[/quote]Ah... got it.  My only confusion now would be the same as on the MSDN thread because the note in BOL simply states "The MSXML parser uses one-eighth the total memory available for SQL Server."  It doesn't say "up to"...</description><pubDate>Thu, 06 Aug 2009 12:15:12 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Don't forget to include the sp_xml_removedocument or you will have huge problems with your SQL Server instance causing the eventual reboot of the database server.</description><pubDate>Thu, 06 Aug 2009 08:17:10 GMT</pubDate><dc:creator>stevstr</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>[quote][b]David McKinney (8/6/2009)[/b][hr]....it's ok, now....appears to be javascript related![/quote]Welcome to the users-frustrated-with-the-site-code-windows club! :-)</description><pubDate>Thu, 06 Aug 2009 04:37:00 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>....it's ok, now....appears to be javascript related!</description><pubDate>Thu, 06 Aug 2009 04:11:10 GMT</pubDate><dc:creator>David McKinney</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Paul,When I copy / paste your script into SSMS or a text editor, it all appears on one line.Any ideas what am I doing wrong?</description><pubDate>Thu, 06 Aug 2009 04:09:13 GMT</pubDate><dc:creator>David McKinney</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Here's an interesting variation:This script is the same as the last, except it also includes a test run with a denormalized version.Sample results included in the script comments as before.(The script also includes commented-out sections to try using vardecimal 'compression' too)Paul[code="sql"]-- *****************************************************************************************-- WARNING: This script contains commands to clear the server-wide data and procedure caches-- Do not run on anything other than a TEST instance!---- Please ensure Actual Execution Plan is OFF for representative results-- Total test time is 30s to 1 minute (including test data creation), depending on hardware-- *****************************************************************************************---- Create a test database--USE master;GORAISERROR('Creating test database...', 0, 1) WITH NOWAIT;CREATE DATABASE [E059791C-4F24-40CF-A54A-B21C34F6610D];GO---- Switch to it--USE [E059791C-4F24-40CF-A54A-B21C34F6610D];GO---- Enable vardecimal format----EXECUTE sp_db_vardecimal_storage_format --            @dbname = [E059791C-4F24-40CF-A54A-B21C34F6610D],--            @vardecimal_storage_format = 'ON';GO---- Drop the test tables if they exist--IF OBJECT_ID(N'dbo.Rainfall_Con', N'U') IS NOT NULL DROP TABLE dbo.Rainfall_ConIF OBJECT_ID(N'dbo.Rainfall_XML', N'U') IS NOT NULL DROP TABLE Rainfall_XML;IF OBJECT_ID(N'dbo.Rainfall_DN', N'U') IS NOT NULL DROP TABLE dbo.Rainfall_Con_DN;IF OBJECT_ID(N'tempdb..#BitBucket', N'U') IS NOT NULL DROP TABLE #BitBucket;GO---- Create the conventional table--CREATE TABLE dbo.Rainfall_Con    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Hour]        TINYINT NOT NULL,    [Rainfall]    DECIMAL(10,8) NOT NULL,    );GO---- Populate the conventional table with one year's data for 100 locations-- This should complete in less than 30 seconds--DECLARE    @Rows BIGINT;SET        @Rows = 100 * 365 * 24; -- 24 hours' data for 365 days for 100 locationsRAISERROR('Creating test data...', 0, 1) WITH NOWAIT;;WITH   Numbers (N)AS      (        -- Dynamically create rows numbered 0 to 875999        SELECT  TOP (@Rows)                ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1        FROM    master.sys.all_columns C1,                 master.sys.all_columns C2,                master.sys.all_columns C3        )INSERT  dbo.Rainfall_Con WITH (TABLOCKX)        (        [LocID],         [Date],         [Hour],         [Rainfall]        )SELECT  LocID = (N / 365 / 24) + 1,        [Date] = DATEADD(DAY, (N / 24) % 366 , '2008-01-01 00:00:00.000'),        [Hour] = N % 24,        [Rainfall] = RAND(CHECKSUM(NEWID())) * 10 -- A new random value for each rowFROM    Numbers;GO---- Now create the primary key-- Doing this *after* inserting the data results in a compact and contiguous index-- If the index were to exist during the insert, index pages may split resulting in -- a lower data density, and significant fragmentation--RAISERROR('Creating index...', 0, 1) WITH NOWAIT;ALTER TABLE dbo.Rainfall_Con     ADD CONSTRAINT PK_Rainfall_Con         PRIMARY KEY CLUSTERED([Date], [Hour], [LocID])         WITH (FILLFACTOR = 100);GO ---- Create the XML table--CREATE TABLE dbo.Rainfall_XML    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Rainfall]    XML NOT NULL,    );GO---- Populate the XML table from the conventional table--RAISERROR('Creating XML data...', 0, 1) WITH NOWAIT;INSERT  Rainfall_XML WITH (TABLOCKX)SELECT  [LocID],        [Date],        (            -- The hourly data as XML            SELECT    H = [Hour],                    V = [Rainfall]            FROM    dbo.Rainfall_Con A WITH (READUNCOMMITTED)            WHERE    A.[LocID] = B.[LocID]             AND        A.[Date] = B.[Date]            FOR XML RAW('RN'), ROOT('RT')        )FROM    Rainfall_Con B WITH (READUNCOMMITTED)GROUP   BY         [LocID],        [Date];GO---- Add the primary key now, for the same reasons as before--RAISERROR('Creating index...', 0, 1) WITH NOWAIT;ALTER TABLE dbo.Rainfall_XML     ADD CONSTRAINT PK_Rainfall_XML         PRIMARY KEY CLUSTERED([Date], [LocID])         WITH (FILLFACTOR = 100);GO ---- Create the denormalized table--CREATE TABLE dbo.Rainfall_Con_DN    (    [LocID]         INTEGER NOT NULL,    [Date]          DATETIME NOT NULL,    [0]             DECIMAL(10,8) NOT NULL,    [1]             DECIMAL(10,8) NOT NULL,    [2]             DECIMAL(10,8) NOT NULL,    [3]             DECIMAL(10,8) NOT NULL,    [4]             DECIMAL(10,8) NOT NULL,    [5]             DECIMAL(10,8) NOT NULL,    [6]             DECIMAL(10,8) NOT NULL,    [7]             DECIMAL(10,8) NOT NULL,    [8]             DECIMAL(10,8) NOT NULL,    [9]             DECIMAL(10,8) NOT NULL,    [10]            DECIMAL(10,8) NOT NULL,    [11]            DECIMAL(10,8) NOT NULL,    [12]            DECIMAL(10,8) NOT NULL,    [13]            DECIMAL(10,8) NOT NULL,    [14]            DECIMAL(10,8) NOT NULL,    [15]            DECIMAL(10,8) NOT NULL,    [16]            DECIMAL(10,8) NOT NULL,    [17]            DECIMAL(10,8) NOT NULL,    [18]            DECIMAL(10,8) NOT NULL,    [19]            DECIMAL(10,8) NOT NULL,    [20]            DECIMAL(10,8) NOT NULL,    [21]            DECIMAL(10,8) NOT NULL,    [22]            DECIMAL(10,8) NOT NULL,    [23]            DECIMAL(10,8) NOT NULL,    );GO--EXECUTE sp_tableoption --            @TableNamePattern = [dbo.Rainfall_Con_DN],--            @OptionName = 'vardecimal storage format' ,--            @OptionValue = 'ON';GO---- Populate the denormalized table from the conventional table--RAISERROR('Creating denormalized data...', 0, 1) WITH NOWAIT;INSERT  dbo.Rainfall_Con_DN WITH (TABLOCKX)        (        [LocID],        [Date],        [0],[1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],        [13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23]        )SELECT  P.[LocID],        P.[Date],        P.[0],P.[1],P.[2],P.[3],P.[4],P.[5],P.[6],        P.[7],P.[8],P.[9],P.[10],P.[11],P.[12],        P.[13],P.[14],P.[15],P.[16],P.[17],P.[18],        P.[19],P.[20],P.[21],P.[22],P.[23]FROM    (        SELECT  DISTINCT                [LocID],                [Date]        FROM    dbo.Rainfall_Con WITH (READUNCOMMITTED)        ) BCROSSAPPLY   (        -- The hourly data        SELECT  H = [Hour],                V = [Rainfall]        FROM    dbo.Rainfall_Con A WITH (READUNCOMMITTED)        WHERE   A.[LocID] = B.[LocID]         AND     A.[Date] = B.[Date]        ) APIVOT   (        MAX(V)         FOR H        IN  (            [0],[1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],            [13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23]            )        ) PGO---- Add the primary key now, for the same reasons as before--RAISERROR('Creating index...', 0, 1) WITH NOWAIT;ALTER TABLE dbo.Rainfall_Con_DN    ADD CONSTRAINT PK_Rainfall_Con_DN         PRIMARY KEY CLUSTERED([Date], [LocID])         WITH (FILLFACTOR = 100);GO ---- Show the space used by each table---- Results on my system:---- Conventional : 876,000 rows, 27016KB reserved, 26856KB data,  96KB index, 64KB unused-- XML          :  36,500 rows, 36504KB reserved, 36504KB data, 112KB index, 64KB unused-- Denormalized :  36,500 rows,  8968KB reserved,  8856KB data,  40KB index, 72KB unused-- Denormalized :  36,500 rows,  7240KB reserved,  7128KB data,  40KB index, 72KB unused (with vardecimal storage)---- ### XML representation is 35% *larger* ###--EXECUTE sp_spaceused Rainfall_Con;EXECUTE sp_spaceused Rainfall_XML;EXECUTE sp_spaceused Rainfall_Con_DN;GO---- This table is used to capture the results from the test-- SELECT statements.  This avoids any delays in sending the-- rows to the client from skewing the results--CREATE TABLE #BitBucket    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Hour]        TINYINT NOT NULL,    [Rainfall]    DECIMAL(10,8) NOT NULL,    );GO---- TEST START--RAISERROR('Starting test', 0, 1) WITH NOWAIT;---- Remove all ad-hoc SQL query plans from cache--DBCC FREESYSTEMCACHE (N'SQL Plans');GO---- Start with a cold data cache-- Each implementation is run 10 times-- One time with a cold cache, then nine times with a warm cache-- This seems more representative to me, feel free to change it--DBCC DROPCLEANBUFFERS;GO/* SQL */INSERT  #BitBucket WITH (TABLOCKX)SELECT  TP.[LocID],        TP.[Date],        TP.[Hour],        TP.[Rainfall]FROM    dbo.Rainfall_Con TP WITH (READUNCOMMITTED)WHERE   TP.[Date] = '20080106' AND     TP.[Hour] = 15ORDER   BY         LocID ASC;GO 10---- For fairness, recreate the dump table for the second run-- TRUNCATE TABLE might give the second run an unfair advantage-- since mixed and uniform extent allocations would not need-- to be re-done--DROP TABLE #BitBucket;GOCREATE TABLE #BitBucket    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Hour]        TINYINT NOT NULL,    [Rainfall]    DECIMAL(10,8) NOT NULL,    );GO---- Second run - XML--DBCC DROPCLEANBUFFERS;GO/* XML */INSERT  #BitBucket WITH (TABLOCKX)SELECT  TP.LocID,        TP.[Date],        [Hour] = Rainfall.value('(/RT/RN[@H=15]/@H)[1]','tinyint'),        Rainfall_Con = Rainfall.value('(/RT/RN[@H=15]/@V)[1]','numeric(10,8)')FROM    dbo.Rainfall_XML TP  WITH (READUNCOMMITTED)WHERE   TP.[Date] = '20080106'ORDER   BY         LocID;GO 10---- Recreate the dump table for the third run--DROP TABLE #BitBucket;GOCREATE TABLE #BitBucket    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Hour]        TINYINT NOT NULL,    [Rainfall]    DECIMAL(10,8) NOT NULL,    );GO---- Third run - denormalized--DBCC DROPCLEANBUFFERS;GO/* DN */INSERT  #BitBucket WITH (TABLOCKX)SELECT  TP.LocID,        TP.[Date],        [Hour] = CONVERT(TINYINT, 15),        Rainfall = [15]FROM    dbo.Rainfall_Con_DN TP  WITH (READUNCOMMITTED)WHERE   TP.[Date] = '20080106'ORDER   BY         LocID;GO 10---- Show the total execution resource usage for all ten runs--SELECT  query = LEFT(QT.[text], 9),        execution_plan = QP.query_plan,        run_count = QS.execution_count,        total_cpu_time_μs = QS.total_worker_time,        total_logical_reads = QS.total_logical_reads,        total_elapsed_time_μs = QS.total_elapsed_time,        avg_cpu_time_μs = QS.total_worker_time / QS.execution_count,        avg_logical_reads = QS.total_logical_reads / QS.execution_count,        avg_elapsed_time_μs = QS.total_elapsed_time / QS.execution_countFROM    sys.dm_exec_query_stats QSCROSSAPPLY   sys.dm_exec_sql_text (QS.[sql_handle]) QTCROSSAPPLY   sys.dm_exec_query_plan (QS.[plan_handle]) QPWHERE   QT.[text] LIKE '%Rainfall%'AND     QT.[text] NOT LIKE '%dm_exec_query_stats%'ORDER   BY        QS.last_execution_time ASC;GO---- Sample results:---- SQL Averages: CPU   390μs; Elapsed time:   4785μs; Logical reads: 113--  DN Averages: CPU   488μs; Elapsed time:   1660μs; Logical reads: 117 (without vardecimal)-- XML Averages: CPU 76171μs; Elapsed time:  79492μs; Logical reads: 126---- Tidy up--GOIF OBJECT_ID(N'dbo.Rainfall_Con', N'U') IS NOT NULL DROP TABLE dbo.Rainfall_ConIF OBJECT_ID(N'dbo.Rainfall_XML', N'U') IS NOT NULL DROP TABLE Rainfall_XML;IF OBJECT_ID(N'dbo.Rainfall_DN', N'U') IS NOT NULL DROP TABLE dbo.Rainfall_Con_DN;IF OBJECT_ID(N'tempdb..#BitBucket', N'U') IS NOT NULL DROP TABLE #BitBucket;GOUSE master;GODROP DATABASE [E059791C-4F24-40CF-A54A-B21C34F6610D];GORAISERROR('Test run complete', 0, 1) WITH NOWAIT;---- END SCRIPT--[/code]</description><pubDate>Thu, 06 Aug 2009 03:49:27 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>[quote][b]Jeff Moden (8/5/2009)[/b][hr]As a side bar... There is a rumor (and I STRESS the word RUMOR!) I heard and I can't confirm it or deny it.  The rumor says that when you use any of the XML extensions in SQL Server, that it automatically allocates at least 1/8th of the available memory to the task.  Does anyone know of a reference that can confirm or deny that rumor?[/quote]I think this is the thing that causes the confusion: [u][url=http://msdn.microsoft.com/en-us/library/ms187367.aspx]sp_xml_preparedocument[/url][/u]So that's just saying that when you use the XML parser to open a document for use with OPENXML, [i]up to[/i] one-eighth of SQL Server memory is [i]potentially [/i]available to Msxmlsql.dll - [i]should it need it[/i].This is the first MSFT answer I found with Google to back that up: [u][url=http://social.msdn.microsoft.com/forums/en-US/sqlxml/thread/51b46347-adee-4ac2-9b73-d4cd31aa2793/]MSDN Forums Thread[/url][/u].My impression (though the method has so far escaped deprecation) is that XML DML methods/XQuery/XPath/FOR XML and the gang are to be preferred to using sp_xml_preparedocument and OPENXML wherever possible.  There are many difficulties associated with OPENXML so personally I avoid it.To be clear, all that applies specifically to prepare/remove document and OPENXML.  XML variables, columns and so on allocate memory in the usual fashion.  That applies to the XML portion of the query optimizer too.  One interesting thing about xml variables (like all the MAX datatypes when used as variables) is that they require tempdb workspace.Many of the XML features use streaming interfaces to minimize memory consumption - bulk loading XML is a good example of this.  See [u][url=http://msdn.microsoft.com/en-us/library/ms345117(SQL.90).aspx#sql2k5xml_topic3]XML Support in Microsoft SQL Server 2005[/url][/u]Paul</description><pubDate>Wed, 05 Aug 2009 21:27:51 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>As a side bar... There is a rumor (and I STRESS the word RUMOR!) I heard and I can't confirm it or deny it.  The rumor says that when you use any of the XML extensions in SQL Server, that it automatically allocates at least 1/8th of the available memory to the task.  Does anyone know of a reference that can confirm or deny that rumor?</description><pubDate>Wed, 05 Aug 2009 19:53:01 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>So don't get me wrong:  XML support is a useful addition to the SQL Server toolkit - even in OLTP applications.  As usual it's a case of using the right tool for the job.  My main objection to this article is the claim that the equivalent XML structure is either smaller or faster than the 'traditional' equivalent.  Unless we deliberately do something dumb, the traditional approach will smaller, and faster -I can't even think of an 'edge' case where this isn't true (provided dumbness is avoided).  XML has definite advantages, including for semi-structured data and communicating with external components like web services...it's just not a panacea, especially when it comes to storage efficiency or outright speed.The article does include some great examples of query syntax with XML though!Paul</description><pubDate>Wed, 05 Aug 2009 16:14:26 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Why bother with the database at all then? Just store the XML directly on the disk? :-DSeriously though, in today's day and age physical space is not usually an issue.And, if you are going to be querying the data often, you will get better performance from a relational model.The example with Candidates and Tests hit close to home.We have a similar scenario and the original decision (made years ago) was to store the data as XML, for some of the reasons mentioned in the article.But -- guess what? -- querying the data became slower and slower...Eventually we had to write an APP that breaks the XML back into relational tables. :w00t:</description><pubDate>Wed, 05 Aug 2009 07:41:09 GMT</pubDate><dc:creator>Goldie Lesser</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>The application of XML in an OLTP system is not very scalable.  The goal of any OLTP system is to increate the TPS.  XML is not built for speed.  Also consider that most OLTP sytems are highly normalized.  Good luck joining from XML columns out to other tables and retaining your performance.  I see XML fields more applicable in static data scenarios.  Also don't forget that SQL Server provides primary and secondary XML index functionality.  This can be used to increase query performance over your XML, but great care is needed in designing your XML schema.Steve</description><pubDate>Wed, 05 Aug 2009 05:37:18 GMT</pubDate><dc:creator>stevstr</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Thanks to all for the comments and suggestions. Really appreciate them.Paul - Thanks for the script. I shall go through it during the weekend and let you know.</description><pubDate>Wed, 05 Aug 2009 04:52:06 GMT</pubDate><dc:creator>Sunny-138471</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>[quote][b]Paul White (8/4/2009)[/b][hr]@Jeff:  Thanks so much - have you noticed how many of my scripts have at least a passing resemblance to some you have posted in the past?  Not entirely co-incidental ;-)[/quote]Heh... oboy have I noticed.  Thank you for the high compliment.  The embedded comments are the berries, too!  If someone can't figure out what you've done by looking at your code, it's because they simply can't read.  Nicely done.</description><pubDate>Tue, 04 Aug 2009 20:11:05 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>[quote][b]Steve Jones - Editor (8/4/2009)[/b][hr]Didn't test it, and perhaps there are none. In any case, I think it's been a nice debate/discussion of the technique.[/quote]Hey Steve!CTRL+C, CTRL+V, F5 :laugh:(The test results from a run on my machine are also embedded in the comments).Seriously though, XML is cool an' all - and I have been known to dabble some with XQuery, XPath and the rest, but it is easy to get carried away with it and apply XML to everything.@G[sup]2[/sup]: Good point.  Sparse columns have some advantages in 2K8 that make them well worth a look too.  The column_set column that gives you formatted XML based on just the columns with values in is very cool.@Jeff:  Thanks so much - have you noticed how many of my scripts have at least a passing resemblance to some you have posted in the past?  Not entirely co-incidental ;-)</description><pubDate>Tue, 04 Aug 2009 19:04:15 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Didn't test it, and perhaps there are none. In any case, I think it's been a nice debate/discussion of the technique.</description><pubDate>Tue, 04 Aug 2009 16:54:23 GMT</pubDate><dc:creator>Steve Jones - SSC Editor</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>[quote][b]Steve Jones - Editor (8/4/2009)[/b][hr]It's an interesting look from a developer point of view. Compared to the power and speed of the relational engine, I think space savings are a minimal thing to worry about. [/quote]Space savings are certainly important to me especially when it comes to backups and restores but according to Paul's script... what space savings?</description><pubDate>Tue, 04 Aug 2009 15:54:31 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>[quote][b]Paul White (8/4/2009)[/b][hr]This script is closely based on that in the article.I would strongly encourage the author of this article to try it out :-)Nice idea for an article, but size and speed are not XML strengths! ;-)Paul[/quote]MythBuster extraordinaire... that script goes into my "Pork Chop Hall of Fame".  Nice job, Paul.</description><pubDate>Tue, 04 Aug 2009 15:51:48 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>I think the merit in this article lies more in the techniques that are covered, rather than in the premise that it ultimately failed to prove.The article does show a couple of good xquery examples in context - how to select a value from a node and incorporate it in a resultset, and how to modify the value of a node.SELECT Rainfall.value('(/RT/RN[@H=15]/@H)[1]','tinyint') AS Hour,Rainfall.value('(/RT/RN[@H=15]/@V)[1]','numeric(10,8)') AS RainfallUPDATE TP SET Rainfall.modify('replace value of (/RT/RN[@H="20"]/@V)[1] with "8.12345678" ')All in all, not a bad article, well written and clear explanations, even if in some respects fundamentally flawed!</description><pubDate>Tue, 04 Aug 2009 08:37:29 GMT</pubDate><dc:creator>David McKinney</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>It's an interesting look from a developer point of view. Compared to the power and speed of the relational engine, I think space savings are a minimal thing to worry about. Disk prices fall at a rate far beyond that of the improvements in CPU and memory when you compare price/performance.I agree with many of the comments, that a well designed relational system is probably better, but this is an interesting solution and idea. If disk space matters a lot, perhaps it's worth considering, but I would recommend that you investigate some of the other suggestions first.</description><pubDate>Tue, 04 Aug 2009 08:24:55 GMT</pubDate><dc:creator>Steve Jones - SSC Editor</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Another thing to point out about XML's use of space is that, barring compression, the data is being stored as characters wrapped in tags, making storage of integer, numeric, or floating point data LESS efficient.[code]Datatype   SQL Length             XML                              XML Length---------   -----------          ----                             ------------TinyInt      1 Byte             &lt;V&gt;255&lt;/V&gt;                         10SmallInt     2 Bytes             &lt;V&gt;32767&lt;/V&gt;                      12Int          4 Bytes             &lt;V&gt;2147483647&lt;/V&gt;                 17FLOAT        8 Bytes             &lt;V&gt;-2.23.40E-308&lt;/V&gt;              20[/code]In the author's example, he chose numeric(10,8) to store the rainfall amount, which would require only 9 bytes for SQL to store, but 18 for XML to store (&lt;v&gt;90.12345678&lt;/v&gt;).    The author also chose to impose an [hour] column on the conventional table, which was unnecessary, because it could have been included as part of the [date] column.    Finally, he could have chosen to have a separate column for each hour in his schema for the conventional table, rather than making it one row per hour.   (That wouldn't have been my first choice, because I feel it lacks flexibility in manipulating the data.   But it is essentially the approach that was taken with the table that used XML.)</description><pubDate>Tue, 04 Aug 2009 07:54:41 GMT</pubDate><dc:creator>The Dixie Flatline</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Where I've found the XML data type useful has been in situations where you can't control schema changes to a relation (table) and want to record all data from it even if columns are added, modified or removed.Logging triggers, or archiving processes in these cases can take advantage of "select * from table for xml" and insert into an XML column in a log/archive/audit table.For semi-structured data that has variable schema elements, XML is pretty ideal.  Beyond that, relational data structures are better.</description><pubDate>Tue, 04 Aug 2009 06:53:43 GMT</pubDate><dc:creator>GSquared</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>If you make your primary key in both tables CLUSTERED (which would be a reasonable choice in this case given that some of the queries are ORDERing/GROUPing by locid), the storage requirements are somewhat different... XML now uses more storage.[code="sql"]name           rows    reserved    data     index_size	unusedRainfall_XML   36600  36744 KB   36600 KB   112 KB	32 KBname	         rows     reserved   data     index_size	unusedRainfall_Con   878400  27080 KB   26928 KB   96 KB	56 KB[/code]</description><pubDate>Tue, 04 Aug 2009 06:45:17 GMT</pubDate><dc:creator>Ian Scarlett</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Nice article, I learned something about XML in SQL Server today.As Paul pointed out, it's impossible that XML storage needs less space than conventional storage.With appropriate indexing, there's no way that XML works faster than a relational table.I suggest that you add some more comments to your article to avoid misleading somebody on this topic.Nice idea, anyway.Gianluca</description><pubDate>Tue, 04 Aug 2009 05:53:47 GMT</pubDate><dc:creator>spaghettidba</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>This script is closely based on that in the article.I would strongly encourage the author of this article to try it out :-)Nice idea for an article, but size and speed are not XML strengths! ;-)Paul[code="sql"]-- *****************************************************************************************-- WARNING: This script contains commands to clear the server-wide data and procedure caches-- Do not run on anything other than a TEST instance!---- Please ensure Actual Execution Plan is OFF for representative results-- Total test time is 10 to 60 seconds (including test data creation), depending on hardware-- *****************************************************************************************USE tempdb;GO---- Drop the test tables if they exist--IF OBJECT_ID(N'dbo.Rainfall_Con', N'U') IS NOT NULL DROP TABLE dbo.Rainfall_ConIF OBJECT_ID(N'dbo.Rainfall_XML', N'U') IS NOT NULL DROP TABLE Rainfall_XML;IF OBJECT_ID(N'tempdb..#BitBucket', N'U') IS NOT NULL DROP TABLE #BitBucket;GO---- Create the conventional table--CREATE TABLE dbo.Rainfall_Con    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Hour]        TINYINT NOT NULL,    [Rainfall]    DECIMAL(10,8) NOT NULL,    );GO---- Populate the conventional table with one year's data for 100 locations-- This should complete in less than 30 seconds--DECLARE    @Rows BIGINT;SET        @Rows = 100 * 365 * 24; -- 24 hours' data for 365 days for 100 locationsRAISERROR('Creating test data...', 0, 1) WITH NOWAIT;;WITH   Numbers (N)AS      (        -- Dynamically create rows numbered 0 to 875999        SELECT  TOP (@Rows)                ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1        FROM    master.sys.all_columns C1,                 master.sys.all_columns C2,                master.sys.all_columns C3        )INSERT  dbo.Rainfall_Con WITH (TABLOCKX)        (        [LocID],         [Date],         [Hour],         [Rainfall]        )SELECT  LocID = (N / 365 / 24) + 1,        [Date] = DATEADD(DAY, (N / 24) % 366 , '2008-01-01 00:00:00.000'),        [Hour] = N % 24,        [Rainfall] = RAND(CHECKSUM(NEWID())) * 10 -- A new random value for each rowFROM    Numbers;GO---- Now create the primary key-- Doing this *after* inserting the data results in a compact and contiguous index-- If the index were to exist during the insert, index pages may split resulting in -- a lower data density, and significant fragmentation--RAISERROR('Creating index...', 0, 1) WITH NOWAIT;ALTER TABLE dbo.Rainfall_Con     ADD CONSTRAINT PK_Rainfall_Con         PRIMARY KEY CLUSTERED([Date], [Hour], [LocID])         WITH (FILLFACTOR = 100);GO ---- Create the XML table--CREATE TABLE dbo.Rainfall_XML    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Rainfall]    XML NOT NULL,    );GO---- Populate the XML table from the conventional table--RAISERROR('Creating XML data...', 0, 1) WITH NOWAIT;INSERT  Rainfall_XML WITH (TABLOCKX)SELECT  [LocID],        [Date],        (            -- The hourly data as XML            SELECT    H = [Hour],                    V = [Rainfall]            FROM    dbo.Rainfall_Con A WITH (READUNCOMMITTED)            WHERE    A.[LocID] = B.[LocID]             AND        A.[Date] = B.[Date]            FOR XML RAW('RN'), ROOT('RT')        )FROM    Rainfall_Con B WITH (READUNCOMMITTED)GROUP   BY         [LocID],        [Date];GO---- Add the primary key now, for the same reasons as before--RAISERROR('Creating index...', 0, 1) WITH NOWAIT;ALTER TABLE dbo.Rainfall_XML     ADD CONSTRAINT PK_Rainfall_XML         PRIMARY KEY CLUSTERED([Date], [LocID])         WITH (FILLFACTOR = 100);GO ---- Show the space used by each table---- Results on my system:---- Conventional : 876,000 rows, 27016KB reserved, 26856KB data, 96KB index, 64KB unused-- XML          :  36,500 rows, 36504KB reserved, 36504KB data, 112KB index, 64KB unused---- ### XML representation is 35% *larger* ###--EXECUTE sp_spaceused Rainfall_Con;EXECUTE sp_spaceused Rainfall_XML;GO---- This table is used to capture the results from the test-- SELECT statements.  This avoids any delays in sending the-- rows to the client from skewing the results--CREATE TABLE #BitBucket    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Hour]        TINYINT NOT NULL,    [Rainfall]    DECIMAL(10,8) NOT NULL,    );GO---- TEST START--RAISERROR('Starting test', 0, 1) WITH NOWAIT;---- Remove all ad-hoc SQL query plans from cache--DBCC FREESYSTEMCACHE (N'SQL Plans');GO---- Start with a cold data cache-- Each implementation is run 10 times-- One time with a cold cache, then nine times with a warm cache-- This seems more representative to me, feel free to change it--DBCC DROPCLEANBUFFERS;GO/* SQL */INSERT  #BitBucket WITH (TABLOCKX)SELECT  TP.[LocID],        TP.[Date],        TP.[Hour],        TP.[Rainfall]FROM    dbo.Rainfall_Con TP WITH (READUNCOMMITTED)WHERE   TP.[Date] = '20080106' AND     TP.[Hour] = 15ORDER   BY         LocID ASC;GO 10---- For fairness, recreate the dump table for the second run-- TRUNCATE TABLE might give the second run an unfair advantage-- since mixed and uniform extent allocations would not need-- to be re-done--DROP TABLE #BitBucket;GOCREATE TABLE #BitBucket    (    [LocID]       INTEGER NOT NULL,    [Date]        DATETIME NOT NULL,    [Hour]        TINYINT NOT NULL,    [Rainfall]    DECIMAL(10,8) NOT NULL,    );GO---- Second run - XML--DBCC DROPCLEANBUFFERS;GO/* XML */INSERT  #BitBucket WITH (TABLOCKX)SELECT  TP.LocID,        TP.[Date],        [Hour] = Rainfall.value('(/RT/RN[@H=15]/@H)[1]','tinyint'),        Rainfall_Con = Rainfall.value('(/RT/RN[@H=15]/@V)[1]','numeric(10,8)')FROM    dbo.Rainfall_XML TP  WITH (READUNCOMMITTED)WHERE   TP.[Date] = '20080106'ORDER   BY         LocID;GO 10---- Show the total execution resource usage for all ten runs--SELECT  query = LEFT(QT.[text], 9),        execution_plan = QP.query_plan,        run_count = QS.execution_count,        total_cpu_time_μs = QS.total_worker_time,        total_logical_reads = QS.total_logical_reads,        total_elapsed_time_μs = QS.total_elapsed_time,        avg_cpu_time_μs = QS.total_worker_time / QS.execution_count,        avg_logical_reads = QS.total_logical_reads / QS.execution_count,        avg_elapsed_time_μs = QS.total_elapsed_time / QS.execution_countFROM    sys.dm_exec_query_stats QSCROSSAPPLY   sys.dm_exec_sql_text (QS.[sql_handle]) QTCROSSAPPLY   sys.dm_exec_query_plan (QS.[plan_handle]) QPWHERE   QT.[text] LIKE '%Rainfall%'AND     QT.[text] NOT LIKE '%dm_exec_query_stats%'ORDER   BY        QS.last_execution_time ASC;GO---- Sample results:---- SQL Averages: CPU   683μs; Elapsed time:   683μs; Logical reads: 114-- XML Averages: CPU 80078μs; Elapsed time: 83691μs; Logical reads: 126---- Tidy up--GOIF OBJECT_ID(N'dbo.Rainfall_Con', N'U') IS NOT NULL DROP TABLE dbo.Rainfall_ConIF OBJECT_ID(N'dbo.Rainfall_XML', N'U') IS NOT NULL DROP TABLE Rainfall_XML;IF OBJECT_ID(N'tempdb..#BitBucket', N'U') IS NOT NULL DROP TABLE #BitBucket;GORAISERROR('Test run complete', 0, 1) WITH NOWAIT;---- END SCRIPT--[/code]</description><pubDate>Tue, 04 Aug 2009 04:53:57 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>A very interesting article indeed!However, I think that you shouldn't compare query performance without any index in the conventional table.If you add an index (code below) and rerun your query, your query performance will be about 1000 times better. (Yet, the space requirement will also increase remarkably.)Rgds,Pexi[code]CREATE NONCLUSTERED INDEX [IX_Rainfall_Con_Date_Hour] ON [Rainfall_Con] (	[Date] ASC,	[Hour] ASC,	[LocID] ASC,	[Rainfall] ASC)WITH (	PAD_INDEX  = OFF, 	STATISTICS_NORECOMPUTE  = OFF, 	SORT_IN_TEMPDB = OFF, 	IGNORE_DUP_KEY = OFF, 	DROP_EXISTING = OFF, 	ONLINE = OFF, 	ALLOW_ROW_LOCKS  = ON, 	ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY][/code]</description><pubDate>Tue, 04 Aug 2009 04:21:13 GMT</pubDate><dc:creator>Pekka Heimonen</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>The space savings are impressive and it does lend itself to a number of real-world situations but you also need to highlight the shortcomings of the XML data type as shown in SQL books onlinehttp://msdn.microsoft.com/en-us/library/ms189887.aspx Limitations of the xml Data Type Note the following general limitations that apply to the xml data type:The stored representation of xml data type instances cannot exceed 2 GB.It cannot be used as a subtype of a sql_variant instance.It does not support casting or converting to either text or ntext. Use varchar(max) or nvarchar(max) instead.It cannot be compared or sorted. This means an xml data type cannot be used in a GROUP BY statement.It cannot be used as a parameter to any scalar, built-in functions other than ISNULL, COALESCE, and DATALENGTH.It cannot be used as a key column in an index. However, it can be included as data in a clustered index or explicitly added to a nonclustered index by using the INCLUDE keyword when the nonclustered index is created</description><pubDate>Tue, 04 Aug 2009 02:53:43 GMT</pubDate><dc:creator>gkrs</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Hi,You can reduce the space similarly without XML. You can use the table instead of the XML column.[code="sql"]CREATE TABLE Rainfall_LocDate (LocDateId int IDENTITY(1, 1) ,LocID int ,Date datetime ,CONSTRAINT PK_Rainfall_LocDate PRIMARY KEY NONCLUSTERED(LocID,Date) )CREATE TABLE Rainfall_Data (LocDateId int ,Hour tinyint  ,Rainfall numeric(10,8) ,CONSTRAINT PK_Rainfall_Data PRIMARY KEY NONCLUSTERED(LocDId,Hour))[/code]Regards,Pavel Krticka</description><pubDate>Tue, 04 Aug 2009 01:53:16 GMT</pubDate><dc:creator>hek</dc:creator></item><item><title>RE: Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Hi,I notice you're using very short tag names in your xml presumably to reduce space used.  If you use a typed xml column i.e. associate it with an xsd, the length of the tags shouldn't matter, and you may get other associated benefits.  When I've some time I'll do a comparison.Regards,David McKinney.</description><pubDate>Tue, 04 Aug 2009 01:28:09 GMT</pubDate><dc:creator>David McKinney</dc:creator></item><item><title>Gain Space Using XML data type</title><link>http://www.sqlservercentral.com/Forums/Topic757986-640-1.aspx</link><description>Comments posted to this topic are about the item [B]&lt;A HREF="/articles/xml+data+type/67388/"&gt;Gain Space Using XML data type&lt;/A&gt;[/B]</description><pubDate>Thu, 23 Jul 2009 00:14:47 GMT</pubDate><dc:creator>Sunny-138471</dc:creator></item></channel></rss>