NULL Storage

  • I'm trying to figure out how NULLs are actually stored in SQL Server. Non-variable types are pretty straightforward, they use the same amount of space as when a non-null is stored. Variable types are more complicated.

    For a type like varchar SQL Server stores 2bytes + actual length according to documentation, but it isn't clear on if those 2 bytes are stored if the value is NULL. I did some simple testing with all NULLs in a column compared to a table where that column doesn't exist and the size values for 100,000 rows are the same when examined with sp_spaceused.

    That should be the end of my questions, but when I read about SPARSE columns it suggests that if 60% of the rows on a varchar column are NULL you should use SPARSE.

    https://docs.microsoft.com/en-us/sql/relational-databases/tables/use-sparse-columns?view=sql-server-2017

    But if no space is used for NULL varchar entries without SPARSE, why (other than trying to make a wide table) would I use SPARSE? and 60%? When making a wide table suggestions are that your SPARSE column should be 90% NULL.

    Is there a threshold for the column where SQL Server starts storing those 2 bytes for a row even when the value is NULL? Is that threshold around 60%? I tried adding one row with a value in the otherwise NULL column, but that only increased the size of the table by the amount of space for that one row. So the threshold would have to be greater than 1.

    EDIT: After Scott's reply below I'm realizing I should show my work, so I've added the following

    Here's the table creation

    CREATE TABLE Spc.WithoutNull(id int NOT NULL IDENTITY(1,1))

    CREATE TABLE Spc.WithNull(id int NOT NULL IDENTITY(1,1),
    NoValue varchar(100) NULL)

    Added 100,000 rows

    INSERT INTO Spc.WithoutNull DEFAULT VALUES
    GO 100000
    INSERT INTO Spc.WithNull DEFAULT VALUES
    GO 100000

    Checked the space with

    sp_spaceused 'Spc.WithoutNull'
    GO
    sp_spaceused 'Spc.WithNull'
    GO

    With results of

    name        rows               reserved  data     index_size       unused
    WithoutNull100000 1416 KB 1352 KB 8 KB 56 KB
    WithNull100000 1416 KB 1352 KB 8 KB 56 KB

    • This topic was modified 4 years, 12 months ago by  Sioban Krzywicki. Reason: Added samples

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • The 2 bytes for varchar len are always present, even if the column is NULL.

    NULL itself is set as one bit per column, i.e., a "NULL bitmap".  Every column will be assigned a null bit, even if that column itself is currently set to "NOT NULL".

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Thanks Scott, but "present" how? They don't show up in storage.

    And if NULL is set as 1 bit per column, then I should be showing the size of the table with the NULL column as 100,000/8 bytes larger just for that bit. But that's not what shows up.

    Does sp_spaceused not show accurate table size?

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Interesting.  It seems as if there is no variable data at all on a row, the entire "variable data" section of the row header is left out.  But when the row does contain any var len data, then those overhead bytes come back in, a minimum of 4 bytes.  I swear that's not what a lot of the docs say (or said), at least, but I checked in on SQL 2008 and it's the same even on that version (I don't have a SQL 2005 version to check it on).

    DROP TABLE WithoutNull;
    DROP TABLE WithNull;
    GO
    CREATE TABLE WithoutNull(id int NOT NULL IDENTITY(1,1) PRIMARY KEY, col2 bit NULL)
    CREATE TABLE WithNull(id int NOT NULL IDENTITY(1,1) PRIMARY KEY, col2 bit NULL, NoValue varchar(100) NULL)
    GO
    ;WITH
    cteTally10 AS (
    SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number)
    ),
    cteTally100 AS (
    SELECT 0 AS number
    FROM cteTally10 c1
    CROSS JOIN cteTally10 c2
    ),
    cteTally100K AS (
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS number
    FROM cteTally100 c1
    CROSS JOIN cteTally100 c2
    CROSS JOIN cteTally10 c3
    )
    INSERT INTO WithoutNull ( col2 )
    SELECT 0
    FROM cteTally100K;
    GO
    ;WITH
    cteTally10 AS (
    SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number)
    ),
    cteTally100 AS (
    SELECT 0 AS number
    FROM cteTally10 c1
    CROSS JOIN cteTally10 c2
    ),
    cteTally100K AS (
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS number
    FROM cteTally100 c1
    CROSS JOIN cteTally100 c2
    CROSS JOIN cteTally10 c3
    )
    INSERT INTO WithNull ( col2 )
    SELECT 0
    FROM cteTally100K;
    INSERT INTO WithNull ( col2, NoValue )
    SELECT 0, REPLICATE('A', 40)
    GO
    SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('WithoutNull'),1,NULL,'DETAILED')
    SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('WithNull'),1,NULL,'DETAILED')

    .

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • The NULLs for EACH ROW are actually stored in the Row Header of EACH ROW and, as of 2012, is always stored for every column even if the column is constrained as NOT NULL.  See the following for one example article on the subject.  There are many more you can search for.

    https://sqlity.net/en/2201/row-header/

    Also and contrary to popular belief, the total space for all rows is 8096, not 8060 (DBCC PAGE can be used to actually determine that) but 8096 includes row headers.  Not sure if 8096 includes the slot array since that's a level of detail I've not had to work on yet.  BUT, SQL Server does use the number 8060 a lot for determining if something will go "out of row" or not.

     

    It's been so long since I analyzed the code in sp_spaceused that I can't help answer your question on that.  I will state that each row header is actually included as a part of the data "data row" for page density (average_percent_of_page_fullness) and is part of the "page count" that sp_spaceused measures.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • That could be a bit misleading as 8060 is the maximum data size for one record. Paul Randal answers this in the comments on one of his posts:

    Inside the Storage Engine: Anatomy of a page

    The 8060 bytes is the maximum size of one record, not the amount of data space on the page – 8096 bytes.

    For a maximum-sized record of 8060 bytes, add two bytes for the slot array entry, 10 bytes for a possible heap forwarded-record backpointer, 14-bytes for a possible versioning tag, and that’s 26 bytes used. The other 10 bytes are for possible future use.

    If there are more than one record on the page, all 8096 bytes of data space can be used.

    Paul White has a nice breakdown here:

    Slot Array and Total Page Size

    If you try to create a table such as:

    CREATE TABLE dbo.SizeTest
    (
    ColA CHAR(8000),
    ColB CHAR(54)
    )

    You get that nice error message of:

    Creating or altering table 'SizeTest' failed because the minimum row size would be 8061, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.

     

    Sue

     

  • I'm not sure how any of that was misleading.  I did state that the 8096 included the row header and did state that SQL Server uses the number 8060 for determining if something will go out of row or not.  The kicker, of course, is that you do have to know that no fixed length column can go out of row and the total of all fixed length columns and the pointers (for out of row columns)/etc for variable width columns cannot exceed 8060.  That's actually the reason why there IS actually a limit on the number of MAX datatype columns you can have even if they'll all MAX datatypes (which are all variable width).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • What's the size of a pointer to a variable width column?

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply