DB compression testing in SQL 2014

  • I need to test DB compression where the SQL DB on SQL2008R2 is taking around 2TB and most of the data is XML files. So I want to test the 2TB compression in SQL 2014 version and see how much it compresses. Can anyone provide me details on how to test the DB compression.

  • SQL Server 2008 supports row and page compression. I havn't heard anything about improvements to the compression algorithm in 2014.

    You can estimate the compression savings on a table using the following T-SQL command.

    sp_estimate_data_compression_savings

    [ @schema_name = ] 'schema_name'

    , [ @object_name = ] 'object_name'

    , [@index_id = ] index_id

    , [@partition_number = ] partition_number

    , [@data_compression = ] 'data_compression'

    [;]

    If you have two copies of the table, one uncompressed and the other compressed, then you can use the following command to return space allocation for each.

    sp_spaceused [[ @objname = ] 'objname' ]

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Neither page nor row compression will compress LOB columns (like XML), so it's not as easy as just ALTER INDEX (and the sp_estimate_data_compression_savings is only going to estimate savings from non-LOB columns)

    In SQL 2016 you've got two functions (ref: https://blogs.msdn.microsoft.com/sqlserverstorageengine/2015/12/08/built-in-functions-for-compressiondecompression-in-sql-server-2016/) to compress or decompress LOBs, in 2008 R2, either compress in the application or use CLR procedures/functions to implement your own compression algorithm.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (11/3/2016)


    Neither page nor row compression will compress LOB columns (like XML), so it's not as easy as just ALTER INDEX (and the sp_estimate_data_compression_savings is only going to estimate savings from non-LOB columns)

    In SQL 2016 you've got two functions (ref: https://blogs.msdn.microsoft.com/sqlserverstorageengine/2015/12/08/built-in-functions-for-compressiondecompression-in-sql-server-2016/) to compress or decompress LOBs, in 2008 R2, either compress in the application or use CLR procedures/functions to implement your own compression algorithm.

    Thanks, COMPRESS and DECOMPRESS functions; that's something new I learned today.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Do you have any solution here since the DB has only 1 Data file and is around 2Tb. There is one table of largest size occupying 70% of the data file with more space occupying by XML files.

  • How are you containing the "XML files"; as XML datatype, varchar(max), filestream, etc. ?

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • as XML datatype

  • Pages in SQL Server tables are basically 8 KB. Data pages can be IN-ROW or ROW-OVERFLOW (contained off-row in space reserved for LOB), and Page or Row Compression works on IN-ROW data only.

    However, even for data types like VARCHAR(max) or XML, if the value is less than 8,000 bytes it may still be contained IN-ROW, so smaller XML documents may be compressed IN-ROW while larger XML documents are contained ROW-OVERFLOW and uncompressed.

    Run the stored procedure I referenced earlier against your original table and see what it estimates.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • GilaMonster (11/3/2016)


    Neither page nor row compression will compress LOB columns (like XML), so it's not as easy as just ALTER INDEX (and the sp_estimate_data_compression_savings is only going to estimate savings from non-LOB columns)

    In SQL 2016 you've got two functions (ref: https://blogs.msdn.microsoft.com/sqlserverstorageengine/2015/12/08/built-in-functions-for-compressiondecompression-in-sql-server-2016/) to compress or decompress LOBs, in 2008 R2, either compress in the application or use CLR procedures/functions to implement your own compression algorithm.

    Unfortunately the argument data type xml is invalid for argument 1 of Compress function.

    😎

  • psred (11/3/2016)


    I need to test DB compression where the SQL DB on SQL2008R2 is taking around 2TB and most of the data is XML files. So I want to test the 2TB compression in SQL 2014 version and see how much it compresses. Can anyone provide me details on how to test the DB compression.

    Quick question, is this archival data or do you need to interrogate it regularly/frequently?

    😎

    If it is archival then you can either use a CLR implementation of gzip or other compression algorithms (recommend SQL#) or on 2016 then use the COMPRESS function and store the value as (N)BINARY(MAX), otherwise I recommend storing it as XML as the overhead of compression/decompression is quite high.

  • Eric I have ran the query you have given and dont see any change before and after compression results.

    Eirikur it is not the archival data.

  • ...

  • Sorry removed it since posted twice.

  • Eirikur Eiriksson (11/3/2016)


    GilaMonster (11/3/2016)


    Neither page nor row compression will compress LOB columns (like XML), so it's not as easy as just ALTER INDEX (and the sp_estimate_data_compression_savings is only going to estimate savings from non-LOB columns)

    In SQL 2016 you've got two functions (ref: https://blogs.msdn.microsoft.com/sqlserverstorageengine/2015/12/08/built-in-functions-for-compressiondecompression-in-sql-server-2016/) to compress or decompress LOBs, in 2008 R2, either compress in the application or use CLR procedures/functions to implement your own compression algorithm.

    Unfortunately the argument data type xml is invalid for argument 1 of Compress function.

    So CAST(... AS NVARCHAR(MAX)) first. The column data type would have to be changed to VARBINARY(MAX) because compressed data is binary, and the returned value from DECOMPRESS is also VARBINARY(MAX), so there has to be casting done anyway, it's not as if it's a transparent solution (and it's not usable for the OP because he's on SQL 2008 R2)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (11/4/2016)


    Eirikur Eiriksson (11/3/2016)


    GilaMonster (11/3/2016)


    Neither page nor row compression will compress LOB columns (like XML), so it's not as easy as just ALTER INDEX (and the sp_estimate_data_compression_savings is only going to estimate savings from non-LOB columns)

    In SQL 2016 you've got two functions (ref: https://blogs.msdn.microsoft.com/sqlserverstorageengine/2015/12/08/built-in-functions-for-compressiondecompression-in-sql-server-2016/) to compress or decompress LOBs, in 2008 R2, either compress in the application or use CLR procedures/functions to implement your own compression algorithm.

    Unfortunately the argument data type xml is invalid for argument 1 of Compress function.

    So CAST(... AS NVARCHAR(MAX)) first. The column data type would have to be changed to VARBINARY(MAX) because compressed data is binary, and the returned value from DECOMPRESS is also VARBINARY(MAX), so there has to be casting done anyway, it's not as if it's a transparent solution (and it's not usable for the OP because he's on SQL 2008 R2)

    Although it roughly halves the size of an XML column, this method is not too practical if the data needs to be interrogated/updated, works fine for archiving.

    😎

    For SQL Server versions which don't have COMPRESS/DECOMPRESS then there is always an option to use a CLR routine, the SQL# one is quite good.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply