April 30, 2014 at 10:53 am
Hi folks,
I have a table that always contains only one row, with a column defined as the XML data type. The size on disk for this table with only one row is 17 GB. When I cast the column to varchar(max), the size is only 4 MB.
It is my understanding that the size of an XML column is limited to 2 GB. Is SQL Server doing something behind the scenes (like keeping its own internal index for traversing the nodes in the XML document)?
I am running the following query to check space used:
Here are the results:
schemaname  tablename  row_count  reserved_KB  data_KB   index_size_KB  unused_KB
BadSchema   BadTable   1          17015688     17015296  32             360
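The exact query used isn't shown in the post; a query along these lines (a sketch against sys.dm_db_partition_stats) produces output in that shape:

```sql
-- Sketch of a space-used query yielding the columns above; the OP's
-- actual query is not shown, so this is an assumed equivalent.
SELECT
    s.name AS schemaname,
    t.name AS tablename,
    SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count,
    SUM(ps.reserved_page_count) * 8 AS reserved_KB,
    SUM(CASE WHEN ps.index_id IN (0, 1)
             THEN ps.in_row_data_page_count + ps.lob_used_page_count
                  + ps.row_overflow_used_page_count
             ELSE 0 END) * 8 AS data_KB,
    (SUM(ps.used_page_count)
       - SUM(CASE WHEN ps.index_id IN (0, 1)
                  THEN ps.in_row_data_page_count + ps.lob_used_page_count
                       + ps.row_overflow_used_page_count
                  ELSE 0 END)) * 8 AS index_size_KB,
    (SUM(ps.reserved_page_count) - SUM(ps.used_page_count)) * 8 AS unused_KB
FROM sys.dm_db_partition_stats AS ps
JOIN sys.tables  AS t ON t.object_id = ps.object_id
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
GROUP BY s.name, t.name;
```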
I ran the following command:
SELECT CAST(BadColumn AS VARCHAR(MAX))
INTO BadSchema.BadTable_VarcharMax
FROM BadSchema.BadTable;
And then I checked the space again. Here are the results:
schemaname  tablename            row_count  reserved_KB  data_KB   index_size_KB  unused_KB
BadSchema   BadTable             1          17015688     17015296  32             360
BadSchema   BadTable_VarcharMax  1          4720         4688      8              24
Can anyone explain this?
Thanks,
SQL Dude
April 30, 2014 at 11:17 am
my first guess: the table is a heap and does not have a clustered index.
with a heap, if you update data, or delete and reinsert individual rows without taking a table lock, the heap never frees up the space previously occupied by its data pages; the space only comes back if the delete/update operation takes a table lock.
my suggestion:
create a clone of the table featuring a clustered index (e.g. BadTable_tmp)
insert the data into that table,
rename the old table to something else
and rename the clone table to the original name
i'll bet you consistently see the data is much smaller.
Lowell
April 30, 2014 at 11:30 am
To further on Lowell's point, what is the output of "select datalength(xml_column) from bad_table"? Should be about 70% of the nvarchar conversion normally.
😎
April 30, 2014 at 11:37 am
Lowell (4/30/2014)
my first guess: the table is a heap, and does not have a clustered index. in the case of a heap, if you update data, or delete and reinsert individual rows (without getting a table lock), a heap never frees up the space previously occupied by the pages of data, unless you got a table lock on the table for the delete/update operation.
my suggestion:
create a clone of the table featuring a clustered index (ie BadTable_tmp)
insert the data into that table,
rename the old table to something else
and rename the clone table to the original name
i'll bet you consistently see the data is much smaller.
Thank you for the reply. The table is not a heap; there is an identity column with a clustered primary key.
April 30, 2014 at 11:39 am
Eirikur Eiriksson (4/30/2014)
To further on Lowell's point, what is the output of "select datalength(xml_column) from bad_table"? Should be about 70% of the nvarchar conversion normally.😎
Thanks for your reply.
The value returned by select datalength(BadColumn) from BadSchema.BadTable is 3345897 (so about 3.19 MB?). Now I'm really confused.
April 30, 2014 at 11:43 am
SQLdude412 (4/30/2014)
Eirikur Eiriksson (4/30/2014)
To further on Lowell's point, what is the output of "select datalength(xml_column) from bad_table"? Should be about 70% of the nvarchar conversion normally.😎
Thanks for your reply.
The value returned for select datalength(BadColumn) from BadSchema.BadTable is 3345897 (so like... 3.19 MB?) Now I'm really confused.
Just as I thought and what Lowell said earlier, bloated heap! Follow his instructions.
😎
April 30, 2014 at 12:11 pm
Eirikur Eiriksson (4/30/2014)
SQLdude412 (4/30/2014)
Eirikur Eiriksson (4/30/2014)
To further on Lowell's point, what is the output of "select datalength(xml_column) from bad_table"? Should be about 70% of the nvarchar conversion normally.😎
Thanks for your reply.
The value returned for select datalength(BadColumn) from BadSchema.BadTable is 3345897 (so like... 3.19 MB?) Now I'm really confused.
Just as I thought and what Lowell said earlier, bloated heap! Follow his instructions.
😎
Thanks again for your reply. As I said before, the table is most definitely not a heap. It has the following structure:
CREATE TABLE BadSchema.BadTable (
primaryKeyID uniqueidentifier not null default newsequentialid(), -- I was wrong before; I thought this was an identity column
alternateKeyID int not null,
BadColumn XML null,
datevalue1 datetimeoffset not null,
datevalue2 datetimeoffset not null,
CONSTRAINT PK_BadTable PRIMARY KEY CLUSTERED (primaryKeyID),
CONSTRAINT AK1_BadTable UNIQUE NONCLUSTERED (alternateKeyID)
);
Is there anything that would cause SQL Server to treat the table like a heap even though it has a clustered index? Is it possible this is a bug?
I appreciate your help so far and hope you can help me further.
April 30, 2014 at 12:13 pm
More evidence that it is not a heap
SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'BadSchema.BadTable');
name                  PK_BadTable  AK1_BadTable
object_id             981578535    981578535
index_id              1            2
type / type_desc      1 CLUSTERED  2 NONCLUSTERED
is_unique             1            1
data_space_id         1            1
ignore_dup_key        0            0
is_primary_key        1            0
is_unique_constraint  0            1
fill_factor           0            0
is_padded             0            0
is_disabled           0            0
is_hypothetical       0            0
allow_row_locks       1            1
allow_page_locks      1            1
has_filter            0            0
filter_definition     NULL         NULL
April 30, 2014 at 12:22 pm
He he, somehow my phone didn't show the earlier reply; kind of awkward :hehe:
My first approach would be to create a table with a different name but the same structure, copy the data over, and then rename (sp_rename) to switch the two.
The cause could be an update-related problem, but I must say I haven't seen this kind of bloating on an xml column before.
😎
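Concretely, the clone-and-swap that Lowell and Eirikur describe could look something like this (a sketch reusing the DDL posted above; the _tmp/_old names are illustrative):

```sql
-- Sketch: rebuild the table under a new name, copy the data,
-- then swap names with sp_rename.
CREATE TABLE BadSchema.BadTable_tmp (
    primaryKeyID   uniqueidentifier NOT NULL DEFAULT NEWSEQUENTIALID(),
    alternateKeyID int              NOT NULL,
    BadColumn      XML              NULL,
    datevalue1     datetimeoffset   NOT NULL,
    datevalue2     datetimeoffset   NOT NULL,
    CONSTRAINT PK_BadTable_tmp  PRIMARY KEY CLUSTERED (primaryKeyID),
    CONSTRAINT AK1_BadTable_tmp UNIQUE NONCLUSTERED (alternateKeyID)
);

INSERT INTO BadSchema.BadTable_tmp
        (primaryKeyID, alternateKeyID, BadColumn, datevalue1, datevalue2)
SELECT   primaryKeyID, alternateKeyID, BadColumn, datevalue1, datevalue2
FROM BadSchema.BadTable;

EXEC sp_rename 'BadSchema.BadTable',     'BadTable_old';
EXEC sp_rename 'BadSchema.BadTable_tmp', 'BadTable';
```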
April 30, 2014 at 12:22 pm
Just a heads-up: I followed your advice, despite my assertion that the table is not a heap, and the size is now 3 MB.
I am continuing to investigate, since I am 100% certain this table is not a heap.
Thanks again for all your help. I'll continue to update this thread with any findings.
April 30, 2014 at 12:23 pm
What happens if you rebuild the clustered index?
April 30, 2014 at 12:38 pm
arnipetursson (4/30/2014)
What happens if you rebuild the clustered index?
Good question! I already switched the table out in production but I should be able to restore a backup and try it out. I will let you know!
April 30, 2014 at 12:55 pm
Quick update.
After switching out the table for a newly created one (double-checked the clustered index), the initial size was 3MB but it has been growing since then and is now around 50MB.
Here are some facts that may help narrow this down:
* We are running SQL Server 2008 SP3 on Windows Server 2008 R2
* We are running with READ_COMMITTED_SNAPSHOT turned ON
* The data in this table is updated by the following statement:
BEGIN TRANSACTION;
    DELETE FROM BadSchema.BadTable WHERE alternateKeyID = @Foo;
    INSERT INTO BadSchema.BadTable VALUES (DEFAULT, @Foo, @foo2, @foo3, @foo4);
COMMIT TRANSACTION;
April 30, 2014 at 12:58 pm
arnipetursson (4/30/2014)
What happens if you rebuild the clustered index?
ALTER INDEX PK_BadTable ON BadSchema.BadTable REBUILD;
The command completed successfully.
The size of the data is unchanged. (Currently 66MB)
April 30, 2014 at 1:55 pm
Update: I can reproduce this with READ_COMMITTED_SNAPSHOT turned ON, but not with it OFF.
The results are inconsistent; sometimes it goes up and never goes down... other times it seems to clean up OK. I will keep this thread updated as I gather more information.
EDIT: Never mind; I believe what I was seeing on my test system was a "lazy" ghost cleanup. It eventually did shrink back down. See the follow-up post for more info.
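For anyone hitting the same symptom: the ghost cleanup can be observed with something like this (a sketch; with READ_COMMITTED_SNAPSHOT on, deleted rows can linger as version ghosts until the version store no longer needs them):

```sql
-- Sketch: inspect ghost records on BadTable's indexes. Nonzero
-- version_ghost_record_count indicates deleted rows retained for
-- row versioning (RCSI) that haven't been cleaned up yet.
SELECT index_id,
       page_count,
       ghost_record_count,
       version_ghost_record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(),
         OBJECT_ID(N'BadSchema.BadTable'),
         NULL, NULL, 'DETAILED');
```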
Viewing 15 posts - 1 through 15 (of 16 total)