sparse columns and column sets

  • Hi All,

    I'm investigating using a wide table and sparse columns as an alternative to an EAV design. One of the touted features of using sparse columns is the ability to define a "column set", which basically aggregates all the non-null sparse columns into a single XML blob.

    My main question is, what purpose does this "column set" blob serve other than as a quick reference to the queryer what's in there? It would seem to extract the data, you'd still have (want?) to reference the underlying columns themselves. Or is the intent that the column set column be referenced via XML functions for all read/write queries and that the underlying columns should never have to be dealt with?

    If you're not sure what I'm referring to consider the following:

    set nocount on

    if object_id('tempdb.dbo.#SparseTest') is not null drop table #SparseTest

    create table #SparseTest

    (

    RID int identity(1,1) primary key clustered,

    Col1 varchar(50) sparse,

    Col2 varchar(50) sparse,

    Col3 varchar(50) sparse,

    Col4 varchar(50) sparse,

    Col5 varchar(50) sparse,

    Col6 varchar(50) sparse,

    ColSet xml column_set for all_sparse_columns

    )

    insert into #SparseTest (Col1, Col2, Col3, Col4, Col5, Col6)

    values

    ('a', null, null, null, 'z', null),

    (null, 'hello', null, null, 'x', null),

    ('Q', null, null, null, 'z', 'y'),

    ('a', null, 'B', null, null, null),

    ('r', null, null, 'tacos', null, null)

    select *

    from #SparseTest

    Executive Junior Cowboy Developer, Esq.[/url]

  • Hi,

    I hope these references help:

    http://msdn.microsoft.com/en-us/library/cc280604(v=sql.100).aspx

    http://msdn.microsoft.com/en-us/library/cc280521(v=sql.100).aspx

    In a nutshell, rather than allocating storage space for the sparse columns as you would regular columns which would be wasteful. All of the sparse column data is stored in an XML column. The first reference is handy because it indicates what proportion of a column is to be NULLable before using the SPARSE attribute is of benefit for the given data type.

    You can use the sparse column names just as you would a regular column name and SQL Server does the extraction of the data from the XML, you do not need to query the XML using XQuery.

    I hope this answers, or a least helps to answer, your query.

    John

  • Further on John Corkett's post, the [COLUMN SET] bridges the gap between a normal table and an EAV type table. Standard set based approach can be used on the database end while an application code can handle it as an EAV.

    Slightly off the topic, two of the biggest drawbacks of an EAV are weak data typing and high complexity when implementing referential constraints, which are straight forward in a sparse table. For those reasons alone, my choice is to use sparse columns rather than an EAV type table.

    It has some limitations though, i.e. the 2Gb BLOB limit for all data in the sparse columns, so if the data in each row will potentially be larger than 2Gb, it should not be used.

    😎

    An example of an EAV like query and constraints based on the previously posted DDL/Data

    USE tempdb;

    GO

    SET NOCOUNT ON

    IF OBJECT_ID('dbo.FKTABLE') IS NOT NULL

    BEGIN

    ALTER TABLE dbo.SparseTest DROP CONSTRAINT FK_FKTABLE_FK_ID;

    DROP TABLE dbo.FKTABLE;

    END

    CREATE TABLE dbo.FKTABLE

    (

    FK_ID INT NOT NULL PRIMARY KEY CLUSTERED

    ,FK_VAL VARCHAR(10) NOT NULL

    );

    INSERT INTO dbo.FKTABLE ( FK_ID , FK_VAL )

    VALUES (1,'ABC'),(2,'DEF'),(3,'GHI');

    if object_id('dbo.SparseTest') is not null drop table dbo.SparseTest;

    create table dbo.SparseTest

    (

    RID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

    CKEY INT NOT NULL DEFAULT(-1),

    Col1 VARCHAR(50) SPARSE,

    Col2 VARCHAR(50) SPARSE,

    Col3 VARCHAR(50) SPARSE,

    Col4 VARCHAR(50) SPARSE,

    Col5 VARCHAR(50) SPARSE,

    Col6 VARCHAR(50) SPARSE,

    Col7 INT SPARSE CONSTRAINT FK_FKTABLE_FK_ID FOREIGN KEY REFERENCES dbo.FKTABLE(FK_ID),

    CCode CHAR(3) SPARSE CONSTRAINT CHKCNST_DBO_SPARSETEST_CCODE_THREE_CHAR_ONLY CHECK (CCode LIKE '[A-z][A-z][A-z]'),

    CDATE DATE SPARSE CONSTRAINT CHKCNST_DBO_SPARSETEST_CDATE_AFTER_2013_12_31 CHECK (CDATE > '2013-12-31'),

    ColSet XML COLUMN_SET FOR ALL_SPARSE_COLUMNS

    );

    insert into dbo.SparseTest (CKEY,Col1, Col2, Col3, Col4, Col5, Col6,Col7,CCode,CDATE)

    values

    ( 1,'a' , null , null, null , 'z', null ,NULL ,NULL ,'2014-05-10'),

    ( 1,null, 'hello', null, null , 'x', null ,1 ,'USD' ,NULL),

    ( 2,'Q' , null , null, null , 'z', 'y' ,NULL ,'abc' ,'2014-11-12'),

    ( 3,'a' , null , 'B' , null , null, null ,2 ,'GBP' ,NULL),

    ( 3,'r' , null , null, 'tacos', null, null ,3 ,NULL ,'2015-01-10')

    -- Query like an EAV

    SELECT

    ST.RID

    ,ST.CKEY

    ,SPCOL.DATA.value('local-name(.)','VARCHAR(50)') AS SCOL_NAME

    ,SPCOL.DATA.value('.[1]','VARCHAR(50)') AS SCOL_VALUE

    FROM dbo.SparseTest ST

    OUTER APPLY ST.ColSet.nodes('*') AS SPCOL(DATA);

    Results (EAV type query)

    RID CKEY SCOL_NAME SCOL_VALUE

    ----- ------ ----------- ------------

    1 1 Col1 a

    1 1 Col5 z

    1 1 CDATE 2014-05-10

    2 1 Col2 hello

    2 1 Col5 x

    2 1 Col7 1

    2 1 CCode USD

    3 2 Col1 Q

    3 2 Col5 z

    3 2 Col6 y

    3 2 CCode abc

    3 2 CDATE 2014-11-12

    4 3 Col1 a

    4 3 Col3 B

    4 3 Col7 2

    4 3 CCode GBP

    5 3 Col1 r

    5 3 Col4 tacos

    5 3 Col7 3

    5 3 CDATE 2015-01-10

    Constraints examples

    -- FOREIGN KEY VIOLATION

    BEGIN TRY

    INSERT INTO dbo.SparseTest (CKEY,Col1, Col7)

    VALUES (3,'a',4);

    END TRY

    BEGIN CATCH

    SELECT ERROR_NUMBER() AS E_NO, ERROR_MESSAGE() AS E_MSG;

    END CATCH

    -- INVALID CCode

    BEGIN TRY

    INSERT INTO dbo.SparseTest (CKEY, Col1, CCode)

    VALUES (3, 'a' , 'C1P');

    END TRY

    BEGIN CATCH

    SELECT ERROR_NUMBER() AS E_NO, ERROR_MESSAGE() AS E_MSG;

    END CATCH

    -- INVALID CCATE

    BEGIN TRY

    INSERT INTO dbo.SparseTest (CKEY, Col1, CDATE)

    VALUES (3, 'a' , '2013-12-31');

    END TRY

    BEGIN CATCH

    SELECT ERROR_NUMBER() AS E_NO, ERROR_MESSAGE() AS E_MSG;

    END CATCH

    Results

    E_NO E_MSG

    ----------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    547 The INSERT statement conflicted with the FOREIGN KEY constraint "FK_FKTABLE_FK_ID". The conflict occurred in database "tempdb", table "dbo.FKTABLE", column 'FK_ID'.

    E_NO E_MSG

    ----------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    547 The INSERT statement conflicted with the CHECK constraint "CHKCNST_DBO_SPARSETEST_CCODE_THREE_CHAR_ONLY". The conflict occurred in database "tempdb", table "dbo.SparseTest", column 'CCode'.

    E_NO E_MSG

    ----------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    547 The INSERT statement conflicted with the CHECK constraint "CHKCNST_DBO_SPARSETEST_CDATE_AFTER_2013_12_31". The conflict occurred in database "tempdb", table "dbo.SparseTest", column 'CDATE'.

  • Thank you both for the replies and examples. These have been quite helpful.

    Executive Junior Cowboy Developer, Esq.[/url]

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply