September 30, 2018 at 10:11 pm
Comments posted to this topic are about the item Compress and Decompress in SQL Server 2016
October 1, 2018 at 3:11 am
October 1, 2018 at 4:52 am
BrainDonor - Monday, October 1, 2018 3:11 AM
Nice article but testing of this function is a must. When looking into these commands a few months ago, although the general trend was a reduction in data length I did have some data that actually increased in size when fed through COMPRESS().
I guess if you compress an already compressed value it will make it bigger.
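A quick way to check that guess (a minimal sketch; exact byte counts will vary, but the second pass should always come out larger than the first):
-- COMPRESS() applies GZIP, so running it over already-compressed bytes only adds
-- another GZIP wrapper around data that has no remaining redundancy.
DECLARE @Original VARCHAR(MAX) = REPLICATE(CONVERT(VARCHAR(MAX), 'some repetitive text '), 500);
SELECT DATALENGTH(@Original)                     AS OriginalBytes,   -- 10500
       DATALENGTH(COMPRESS(@Original))           AS CompressedOnce,  -- much smaller
       DATALENGTH(COMPRESS(COMPRESS(@Original))) AS CompressedTwice; -- larger than CompressedOnce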
October 1, 2018 at 5:00 am
Jonathan AC Roberts - Monday, October 1, 2018 4:52 AM
I guess if you compress an already compressed value it will make it bigger.
That is one way but this was just imported text - Compress and Decompress Basic Examples.
October 1, 2018 at 6:07 am
Interesting article.
As a side note, if I remember correctly, MSSQL will 'ignore' out-of-page data (LOB_DATA) when it comes to PAGE compression; I guess COMPRESS() and DECOMPRESS() are the 'workarounds' for this? I do hope they'll add compression to those sooner than later though.
Anyway, as such I think it would still be interesting to see a figure 3-3 that shows the effect of table-compression on the test-data.
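On the workaround idea, here is a minimal sketch of what that could look like (the table and column names are made up for illustration): the text is stored already COMPRESS()'d, and DECOMPRESS() plus a CAST back to the original type is used on the way out.
CREATE TABLE dbo.Documents
(
    [ID]       INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
    [BodyGzip] VARBINARY(MAX) NOT NULL -- COMPRESS()'d text instead of a plain VARCHAR(MAX)
);
INSERT INTO dbo.Documents ([BodyGzip])
VALUES (COMPRESS('...a large block of text that would otherwise live off-row...'));
-- DECOMPRESS() returns VARBINARY(MAX), so cast back to the original type.
SELECT [ID],
       CAST(DECOMPRESS([BodyGzip]) AS VARCHAR(MAX)) AS [Body]
FROM dbo.Documents;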
October 1, 2018 at 6:59 am
Quite interesting.
Like deroby, I'd like to see a comparison between COMPRESS() and compression at the table level.
October 1, 2018 at 8:42 am
deroby - Monday, October 1, 2018 6:07 AM
As a side note, if I remember correctly, MSSQL will 'ignore' out-of-page data (LOB_DATA) when it comes to PAGE compression; I guess COMPRESS() and DECOMPRESS() are the 'workarounds' for this? I do hope they'll add compression to those sooner than later though.
Anyway, as such I think it would still be interesting to see a figure 3-3 that shows the effect of table-compression on the test-data.
My testing shows that neither ROW nor PAGE compression does much of anything with this data, whether the value is stored off-row or in-row.
You can try it for yourself using the queries below. In the second set, even though the value is only 6000 bytes, and hence in-row, PAGE compression does nothing really. But truncate the table, change the replicate amount to 4000, and then PAGE compression works wonders (that variant is sketched just after the script).
USE [tempdb];
-- DROP TABLE dbo.CompressionTest;
CREATE TABLE dbo.CompressionTest
(
[ID] INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
[SomeValue] VARCHAR(MAX) NOT NULL
);
INSERT INTO dbo.CompressionTest ([SomeValue])
SELECT REPLICATE(CONVERT(VARCHAR(MAX), 'a'), 9000) -- off-row value (over 8,000 bytes, so pushed to LOB_DATA pages)
FROM master.sys.all_columns;
ALTER TABLE dbo.CompressionTest REBUILD WITH (FILLFACTOR = 100, DATA_COMPRESSION = NONE);
EXEC sys.sp_spaceused N'dbo.CompressionTest';
/*
name rows reserved data index_size unused
dbo.CompressionTest 10539 97376 KB 97024 KB 72 KB 280 KB
*/
ALTER TABLE dbo.CompressionTest REBUILD WITH (DATA_COMPRESSION = ROW);
EXEC sys.sp_spaceused N'dbo.CompressionTest';
/*
name rows reserved data index_size unused
dbo.CompressionTest 10539 97120 KB 96960 KB 72 KB 88 KB
*/
ALTER TABLE dbo.CompressionTest REBUILD WITH (DATA_COMPRESSION = PAGE);
EXEC tempdb.sys.sp_spaceused N'dbo.CompressionTest';
/*
name rows reserved data index_size unused
dbo.CompressionTest 10539 97120 KB 96960 KB 72 KB 88 KB
*/
-------------------------
TRUNCATE TABLE dbo.CompressionTest;
INSERT INTO dbo.CompressionTest ([SomeValue])
SELECT REPLICATE(CONVERT(VARCHAR(MAX), 'a'), 6000) -- in-row value (only one 6000-byte row fits on an 8 KB page)
FROM master.sys.all_columns;
ALTER TABLE dbo.CompressionTest REBUILD WITH (FILLFACTOR = 100, DATA_COMPRESSION = NONE);
EXEC sys.sp_spaceused N'dbo.CompressionTest';
/*
name rows reserved data index_size unused
dbo.CompressionTest 10539 85128 KB 84328 KB 232 KB 568 KB
*/
ALTER TABLE dbo.CompressionTest REBUILD WITH (DATA_COMPRESSION = ROW);
EXEC sys.sp_spaceused N'dbo.CompressionTest';
/*
name rows reserved data index_size unused
dbo.CompressionTest 10539 85128 KB 84328 KB 232 KB 568 KB
*/
ALTER TABLE dbo.CompressionTest REBUILD WITH (DATA_COMPRESSION = PAGE);
EXEC sys.sp_spaceused N'dbo.CompressionTest';
/*
name rows reserved data index_size unused
dbo.CompressionTest 10539 85128 KB 84328 KB 232 KB 568 KB
*/
SELECT *
FROM sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID(N'dbo.CompressionTest'), 1, 1, 'DETAILED');
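For completeness, here is a sketch of the 4000-byte variant described above (no measured output included; run it and compare the two sp_spaceused results):
TRUNCATE TABLE dbo.CompressionTest;
INSERT INTO dbo.CompressionTest ([SomeValue])
SELECT REPLICATE(CONVERT(VARCHAR(MAX), 'a'), 4000) -- two of these rows now fit on an 8 KB page
FROM master.sys.all_columns;
ALTER TABLE dbo.CompressionTest REBUILD WITH (FILLFACTOR = 100, DATA_COMPRESSION = NONE);
EXEC sys.sp_spaceused N'dbo.CompressionTest';
ALTER TABLE dbo.CompressionTest REBUILD WITH (DATA_COMPRESSION = PAGE);
EXEC sys.sp_spaceused N'dbo.CompressionTest'; -- should be noticeably smaller this time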
Converting the table to a Clustered Columnstore Index sometimes helps with large types, even off-row values.
So yes, I think COMPRESS / DECOMPRESS is definitely an alternative for this scenario (that or Clustered Columnstore, depending on the scenario).
On a related note: I brought up this very point (these two work-arounds) in my post about the new UTF-8 Collations in SQL Server 2019:
Native UTF-8 Support in SQL Server 2019: Savior or False Prophet?
Take care, Solomon...
SQL# — https://SQLsharp.com/ ( SQLCLR library of over 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
October 1, 2018 at 8:51 am
BrainDonor - Monday, October 1, 2018 5:00 AM
Jonathan AC Roberts - Monday, October 1, 2018 4:52 AM
I guess if you compress an already compressed value it will make it bigger.
That is one way but this was just imported text - Compress and Decompress Basic Examples.
There is a certain minimum size for the resulting "compressed" value (I guess due to the format's overhead), so input values under a certain size (depending on the compression algorithm, I believe) will actually come out larger after compression.
Also, compression relies upon finding and reducing repeated patterns. If the input value has no discernible patterns (such as a value that has already been compressed), it won't shrink, plus you still get that overhead.
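As a rough illustration (a sketch only; exact byte counts depend on the input and the GZIP implementation, but the direction should hold):
-- A tiny value grows because of the fixed GZIP header/trailer overhead,
-- while a long, repetitive value shrinks dramatically.
DECLARE @Tiny VARCHAR(100) = 'abc';
DECLARE @Big  VARCHAR(MAX) = REPLICATE(CONVERT(VARCHAR(MAX), 'abc'), 2000);
SELECT DATALENGTH(@Tiny)           AS TinyOriginal,   -- 3 bytes
       DATALENGTH(COMPRESS(@Tiny)) AS TinyCompressed, -- more than 3 bytes
       DATALENGTH(@Big)            AS BigOriginal,    -- 6000 bytes
       DATALENGTH(COMPRESS(@Big))  AS BigCompressed;  -- far fewer than 6000 bytes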
Take care, Solomon...
SQL# — https://SQLsharp.com/ ( SQLCLR library of over 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
October 1, 2018 at 11:29 am
COMPRESS definitely works a lot better for particular data. As an example, one of my tables (700 million rows, 900+ GB with standard format and page compression) goes down to 400 GB when I compress its main blob, which holds a mix of text data and PDFs/images. It doesn't go lower because the PDF data is already in a zip format.
This is using GzipStream, which includes a header on each value; if it were DeflateStream it would not include this header.
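If you want to confirm the GZIP wrapper for yourself, a quick sanity check (assuming the documented GZIP format) is that COMPRESS() output starts with the GZIP signature bytes:
SELECT SUBSTRING(COMPRESS('test'), 1, 2) AS MagicBytes; -- expected: 0x1F8B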
October 1, 2018 at 2:29 pm
@Solomon Rutzky:
Thanks for working this out; my tests had shown similar results. I simply wondered what it would look like in these particular tests, as it "fits" the subject.
(and I was slightly hoping someone would say that version xyz now includes LOB_DATA when applying PAGE compression =)
Kind regards,
Roby