A script to answer: Should I use SPARSE?

  • Comments posted to this topic are about the item A script to answer: Should I use SPARSE?

  • SET NOCOUNT ON
    GO

    -- Drop the holding table from any previous run.
    IF OBJECT_ID('tempdb..#tblHold') IS NOT NULL
        DROP TABLE #tblHold
    GO

    -- Total row count for every user table.
    DECLARE @tblA TABLE (RowId INT IDENTITY(1,1), SchemaName VARCHAR(65), TableName VARCHAR(65), TotalRows BIGINT)

    INSERT INTO @tblA
    SELECT S.[name] AS SchemaName,
           ST.[name] AS TableName,
           SUM(P.[rows]) AS TotalRows
    FROM [sys].[tables] AS ST WITH (NOLOCK)
    INNER JOIN [sys].[schemas] AS S WITH (NOLOCK)
        ON ST.[schema_id] = S.[schema_id]
    INNER JOIN [sys].[partitions] AS P WITH (NOLOCK)
        ON P.[object_id] = ST.[object_id]
    WHERE P.[index_id] IN (0,1)
    GROUP BY S.[name],
             ST.[name]

    -- Build one COUNT statement per nullable column.
    DECLARE @tblB TABLE
        (RowId INT IDENTITY(1,1), SchemaName VARCHAR(65), TableName VARCHAR(65), ColumnName VARCHAR(65),
         ExecSQL VARCHAR(2000), NullRows BIGINT)

    INSERT INTO @tblB
    SELECT S.[name] AS SchemaName,
           ST.[name] AS TableName,
           SC.[name] AS ColumnName,
           'SELECT COUNT(1) FROM ['+DB_NAME()+'].['+S.[name]+'].['+ST.[name]+'] WITH (NOLOCK) WHERE ['+SC.[name]+'] IS NULL' AS ExecSQL,
           0 AS NullRows
    FROM [sys].[tables] AS ST WITH (NOLOCK)
    INNER JOIN [sys].[schemas] AS S WITH (NOLOCK)
        ON ST.[schema_id] = S.[schema_id]
    INNER JOIN [sys].[columns] AS SC WITH (NOLOCK)
        ON ST.[object_id] = SC.[object_id]
    WHERE 1=1
        AND SC.[is_nullable] = 1
    ORDER BY S.[name],
             ST.[name],
             SC.[column_id]

    -- Run each COUNT and record how many rows are NULL.
    DECLARE @MinId INT
    DECLARE @MaxId INT
    DECLARE @ExecSQL VARCHAR(2000)
    DECLARE @tblE TABLE (MyRows BIGINT)

    SELECT @MinId = MIN(RowId), @MaxId = MAX(RowId) FROM @tblB

    WHILE (@MinId <= @MaxId)
    BEGIN
        DELETE @tblE
        SELECT @ExecSQL = ExecSQL FROM @tblB WHERE RowId = @MinId
        INSERT INTO @tblE EXEC(@ExecSQL)
        UPDATE @tblB
        SET NullRows = (SELECT MyRows FROM @tblE)
        WHERE RowId = @MinId
        SELECT @MinId = @MinId + 1
    END

    -- Persist the percentage of NULLs per column for review.
    SELECT B.RowId,
           A.SchemaName,
           A.TableName,
           B.ColumnName,
           B.NullRows,
           A.TotalRows,
           CAST(CAST(CAST(NullRows AS VARCHAR(10))+'.00' AS DEC(15,2))/CAST(CAST(TotalRows AS VARCHAR(10))+'.00' AS DEC(15,2))*100 AS DEC(5,2)) AS PercentageNull
    INTO #tblHold
    FROM @tblA AS A
    INNER JOIN @tblB AS B
        ON A.SchemaName = B.SchemaName
        AND A.TableName = B.TableName
    WHERE NullRows > 0
        AND NullRows <> TotalRows
    GO

    SELECT *
    FROM #tblHold
    ORDER BY PercentageNull DESC
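
    Once the script flags a column as mostly (but not entirely) NULL, marking it SPARSE is a one-statement change. A minimal sketch; the table [dbo].[Orders] and column [Notes] here are hypothetical stand-ins:

    -- Mark a mostly-NULL column as SPARSE (hypothetical names).
    ALTER TABLE [dbo].[Orders]
        ALTER COLUMN [Notes] VARCHAR(500) SPARSE NULL

    -- Reverting is the same statement without the SPARSE keyword:
    -- ALTER TABLE [dbo].[Orders] ALTER COLUMN [Notes] VARCHAR(500) NULL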

  • Any ideas how this might work in an OLAP environment? At least in my world, that's where this situation is most likely to be encountered.

  • I don't have any performance benchmarks yet; I'm likely months away from getting this into a deployment.

    But since MSFT claims there is no overhead for NULL values and only a slight overhead for non-NULL values, I would expect a net improvement, given the reduced storage per row for most rows.
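
    In the meantime, a rough before-and-after space check can stand in for a benchmark; a minimal sketch, where the table [dbo].[Orders] and column [Notes] are hypothetical:

    EXEC sp_spaceused N'dbo.Orders'  -- baseline (hypothetical table)

    ALTER TABLE [dbo].[Orders]
        ALTER COLUMN [Notes] VARCHAR(500) SPARSE NULL

    ALTER INDEX ALL ON [dbo].[Orders] REBUILD  -- rewrite rows so the change takes effect on disk

    EXEC sp_spaceused N'dbo.Orders'  -- compare reserved/data sizes against the baseline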

  • NB: One side-effect of using SPARSE columns is their incompatibility with the data compression features.

    We had a case where new developers added columns with the SPARSE attribute but did not realize this limitation until we hit disk space issues. I suggest you make this decision early, at the data-modelling stage of project design. A quick detection query is sketched below.

    T
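
    A quick way to catch this before enabling compression (or before adding SPARSE to an already-compressed table) is to list every SPARSE column in the database; a sketch using sys.columns:

    -- List all SPARSE columns in user tables of the current database.
    SELECT OBJECT_SCHEMA_NAME(c.[object_id]) AS SchemaName,
           OBJECT_NAME(c.[object_id]) AS TableName,
           c.[name] AS ColumnName
    FROM sys.columns AS c
    WHERE c.is_sparse = 1
        AND OBJECTPROPERTY(c.[object_id], 'IsUserTable') = 1
    ORDER BY SchemaName, TableName, ColumnName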

  • Thanks for pointing that out, TerryCatt. Yes, you can use one or the other (SPARSE or compression), but not both.

    Also, for SQL 2014, data compression is not supported for In-Memory tables, and SPARSE columns are not compatible with clustered columnstore indexes.

    Also, please note that for SPARSE columns the space savings aren't realized until the clustered index is rebuilt.
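
    On that last point, the rebuild itself is one statement; a sketch, where [dbo].[Orders] and its clustered index [PK_Orders] are hypothetical names:

    -- Rebuild the clustered index so existing rows are rewritten in the sparse row format.
    ALTER INDEX [PK_Orders] ON [dbo].[Orders] REBUILD

    -- Or rebuild every index on the table:
    -- ALTER INDEX ALL ON [dbo].[Orders] REBUILD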

  • Didn't know about SPARSE. What a cool idea. I'll look into it.
