Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

XML is 17GB on disk... 4MB when cast as varchar(max) Expand / Collapse
Author
Message
Posted Wednesday, April 30, 2014 10:53 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 12:28 PM
Points: 17, Visits: 198
Hi folks,

I have a table that always contains only 1 row which has a column defined as XML data type. The size on disk for this table with only 1 row is 17GB. When I cast it to a varchar(max) the size is only 4MB.

It is my understanding that the size of an XML column is limited to 2GB. Is SQL Server doing something behind the scenes (like keeping its own internal index for traversing the nodes in the XML document)?

I am running the following query to check space used:
http://www.databasejournal.com/features/mssql/article.php/3733996/Find-space-Usage-by-Table--Schema-in-SQL-Server-2005-and-2008.htm

Here are the results:

schemaname tablename row_count reserved_KB data_KB index_size_KB unused_KB
BadSchema BadTable 1 17015688 17015296 32 360


I ran the following command:

SELECT CAST(BadColumn AS VARCHAR(MAX))
INTO BadSchema.BadTable_VarcharMax
FROM BadSchema.BadTable;

And then I checked the space again. Here are the results:

schemaname tablename row_count reserved_KB data_KB index_size_KB unused_KB
BadSchema BadTable 1 17015688 17015296 32 360
BadSchema BadTable_VarcharMax 1 4720 4688 8 24

Can anyone explain this?

Thanks,
SQL Dude
Post #1566507
Posted Wednesday, April 30, 2014 11:17 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 5:28 AM
Points: 12,910, Visits: 32,013
my first guess: the table is a heap, and does not have a clustered index.

in the case of a heap, if you update data, or delete and reinsert individual rows (without getting a table lock), a heap never frees up the space previously occupied by the pages of data, unless you got a table lock on the table for the delete/update operation.

my suggestion:
create a clone of the table featuring a clustered index (ie BadTable_tmp)
insert the data into that table,
rename the old table to something else
and rename the clone table to the original name
i'll bet you consistently see the data is much smaller.


Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #1566517
Posted Wednesday, April 30, 2014 11:30 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 1:48 PM
Points: 1,714, Visits: 4,528
To further on Lowell's point, what is the output of "select datalength(xml_column) from bad_table"? Should be about 70% of the nvarchar conversion normally.
Post #1566521
Posted Wednesday, April 30, 2014 11:36 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 12:28 PM
Points: 17, Visits: 198
Lowell (4/30/2014)
my first guess: the table is a heap, and does not have a clustered index.

in the case of a heap, if you update data, or delete and reinsert individual rows (without getting a table lock), a heap never frees up the space previously occupied by the pages of data, unless you got a table lock on the table for the delete/update operation.

my suggestion:
create a clone of the table featuring a clustered index (ie BadTable_tmp)
insert the data into that table,
rename the old table to something else
and rename the clone table to the original name
i'll bet you consistently see the data is much smaller.


Thank you for the reply. The table is not a heap, there is an identity column with a clustered index primary key.
Post #1566524
Posted Wednesday, April 30, 2014 11:39 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 12:28 PM
Points: 17, Visits: 198
Eirikur Eiriksson (4/30/2014)
To further on Lowell's point, what is the output of "select datalength(xml_column) from bad_table"? Should be about 70% of the nvarchar conversion normally.


Thanks for your reply.

The value returned for select datalength(BadColumn) from BadSchema.BadTable is 3345897 (so like... 3.19 MB?) Now I'm really confused.
Post #1566525
Posted Wednesday, April 30, 2014 11:43 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 1:48 PM
Points: 1,714, Visits: 4,528
SQLdude412 (4/30/2014)
Eirikur Eiriksson (4/30/2014)
To further on Lowell's point, what is the output of "select datalength(xml_column) from bad_table"? Should be about 70% of the nvarchar conversion normally.


Thanks for your reply.

The value returned for select datalength(BadColumn) from BadSchema.BadTable is 3345897 (so like... 3.19 MB?) Now I'm really confused.


Just as I thought and what Lowell said earlier, bloated heap! Follow his instructions.
Post #1566528
Posted Wednesday, April 30, 2014 12:11 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 12:28 PM
Points: 17, Visits: 198
Eirikur Eiriksson (4/30/2014)
SQLdude412 (4/30/2014)
Eirikur Eiriksson (4/30/2014)
To further on Lowell's point, what is the output of "select datalength(xml_column) from bad_table"? Should be about 70% of the nvarchar conversion normally.


Thanks for your reply.

The value returned for select datalength(BadColumn) from BadSchema.BadTable is 3345897 (so like... 3.19 MB?) Now I'm really confused.


Just as I thought and what Lowell said earlier, bloated heap! Follow his instructions.


Thanks again for your reply. As I said before, the table is most definitely not a heap. It has the following structure:

CREATE TABLE BadSchema.BadTable (
primaryKeyID uniqueidentifier not null default newsequentialid(), -- I was wrong before I thought this was an identity column
alternateKeyID int not null,
BadColumn XML null,
datevalue1 datetimeoffset not null,
datevalue2 datetimeoffset not null,
CONSTRAINT PK_BadTable PRIMARY KEY CLUSTERED (primaryKeyID),
CONSTRAINT AK1_BadTable UNIQUE NONCLUSTERED (alternateKeyID)
);

Is there anything causing SQL Server to treat the table like a heap even though it has a clustered index? Is is possible this is a bug?

I appreciate your help so far and hope you can help me further.
Post #1566538
Posted Wednesday, April 30, 2014 12:13 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 12:28 PM
Points: 17, Visits: 198
More evidence that it is not a heap

SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'BadSchema.BadTable');

object_id name index_id type type_desc is_unique data_space_id ignore_dup_key is_primary_key is_unique_constraint fill_factor is_padded is_disabled is_hypothetical allow_row_locks allow_page_locks has_filter filter_definition
981578535 PK_BadTable 1 1 CLUSTERED 1 1 0 1 0 0 0 0 0 1 1 0 NULL
981578535 AK1_BadTable 2 2 NONCLUSTERED 1 1 0 1 0 0 0 0 0 1 1 0 NULL
Post #1566539
Posted Wednesday, April 30, 2014 12:22 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 1:48 PM
Points: 1,714, Visits: 4,528
He he, somehow my phone didn't show the earlier reply, kind of awkward

My first approach would be creating a table with a different name but same structure, copy the data and then rename (sp_rename) to switch the two.

The cause could be an update related problem but must say, I haven't seen this kind of bloating on an xml column before.
Post #1566544
Posted Wednesday, April 30, 2014 12:22 PM This worked for the OP Answer marked as solution
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 12:28 PM
Points: 17, Visits: 198
Just a heads up, I followed your advice despite my assertion that the table is not a heap, and the size is now 3 MB.

I am continuing to investigate this since I am 100% certain this table is not a heap.

Thanks again for all your help. I'll continue to update this thread with any findings.
Post #1566545
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse