Design vs. CONSTRAINT

  • ca.zerbe-heinrich - Wednesday, January 17, 2018 12:27 PM

    Hi all!

    Using this opportunity to wish you a happy new year!

    Now, my question:
    We are developing a C# application, using an MS SQL database for the objects and data (Entity Framework for object mapping, as far as I know).

    Now I am faced with a database migration which redefines one column (the RAW column, from NULL to NOT NULL) in the biggest table in the database (my assumption: that table holds about 80% to 85% of the used data).

    Because the database itself (recovery model FULL) grows by roughly the size of that table (GUID and RAW columns) during the migration, I told the developers that we cannot "sell" this to any customer, and that we should use a CHECK CONSTRAINT (checking for NOT NULL or length > 0) instead.

    The quick response was:
    We do not care, we develop. It's your problem.
    AND, FOR PERFORMANCE REASONS, the database management system will perform better when the table definition for the RAW column is NOT NULL, instead of the added CHECK CONSTRAINT (checking the data for being not null).

    Can you verify this statement?

    Hope you have a quick answer for me!

    Kind regards,
    C.-A.
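
    For concreteness, the two alternatives under discussion look roughly like this in T-SQL (a sketch only; dbo.BigTable and its column names stand in for the real schema):

    -- Alternative 1 (development's choice): redefine the column itself.
    -- Requires that no NULL values remain in the column beforehand.
    ALTER TABLE dbo.BigTable
        ALTER COLUMN RAW VARBINARY(MAX) NOT NULL;

    -- Alternative 2 (the suggestion): keep the column nullable and
    -- enforce the rule with a named CHECK constraint instead.
    ALTER TABLE dbo.BigTable
        ADD CONSTRAINT CK_BigTable_RAW_NotEmpty
        CHECK (RAW IS NOT NULL AND DATALENGTH(RAW) > 0);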

  • Sue_H - Wednesday, January 17, 2018 1:24 PM

    The argument itself doesn't make a lot of sense to me personally. How does using the NOT NULL property vs. using a constraint lead to size issues that have some impact on customers - especially when, either way, some value is going to be entered? And likely the same value whether with a constraint or NOT NULL?
    I'd be more concerned about using Entity Framework. 

    Sue

  • ca.zerbe-heinrich - Wednesday, January 17, 2018 1:37 PM

    Hi Sue!

    Sorry if I did not make myself clear enough:
    - we have one table (structure: 2 columns: GUID (PK) and RAW (VARBINARY, NULLABLE)),
    - it's filled up with data (around 80% of the whole database),
    - we migrate the database (and during those steps, development also changed the structure of the RAW column to NOT NULL),
    - as far as I can imagine, in the background the database management system (MS SQL) creates a copy of the RAW column which is not NULLable, copies the values over, and then drops the old column and renames the new one (at least that's how I would do it).
    In this case, the migration of the database leads to a "doubled" size of the database (even if only temporarily, because you can shrink it afterwards), and it takes some time.
    My suggestion was not to change the structure, but to add a CHECK CONSTRAINT (checking for not being NULL or empty).
    The response from development was:
    - it's implemented like this, and we do not care about space,
    - the database management system performs better if the structure is changed, instead of keeping the old structure with the added CONSTRAINT.

    My question is whether anyone can verify this statement from development regarding performance: I do not know whether the DBMS really performs slower with the constraint, or whether the query optimizer cares more about the "structure" than about "constraints", and I could not find anything about it on the web.

    C.-A.
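
    One subtlety with the suggested constraint (a sketch with placeholder names): a CHECK predicate that evaluates to UNKNOWN is accepted, and DATALENGTH(RAW) > 0 is UNKNOWN for NULL, so the NULL test must be spelled out explicitly. Adding it WITH CHECK validates existing rows and marks the constraint as trusted:

    -- CHECK constraints accept rows for which the predicate is UNKNOWN,
    -- so CHECK (DATALENGTH(RAW) > 0) alone would still allow NULLs.
    ALTER TABLE dbo.BigTable WITH CHECK
        ADD CONSTRAINT CK_BigTable_RAW_NotNull
        CHECK (RAW IS NOT NULL AND DATALENGTH(RAW) > 0);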

  • Sue_H

    Thanks - I was clear on things the first time. I've done updates from NULL to NOT NULL before, and have never done it anything like the way you describe. I'm curious as to what the problems are when updating the NULL values and then doing ALTER TABLE ... ALTER COLUMN (sketched below). Could you elaborate from a technical perspective? You mentioned you imagine how it's done (the database doesn't really work that way in the background), but do you actually know how it's being done in this case?

    Sue
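
    The update-then-alter approach mentioned above would look something like this (a sketch; the table name and the one-byte placeholder value 0x00 are illustrative choices):

    -- Step 1: replace NULLs with a non-NULL placeholder value
    -- (batched in practice, to keep the transaction log manageable).
    UPDATE dbo.BigTable
        SET RAW = 0x00
        WHERE RAW IS NULL;

    -- Step 2: once no NULLs remain, tighten the column definition.
    ALTER TABLE dbo.BigTable
        ALTER COLUMN RAW VARBINARY(MAX) NOT NULL;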

  • I think this is easy to test for yourself. On a dev instance, create a very small DB, and in it a test table that meets your criteria. Fill the table such that it consumes most of the DB space. Then alter the column to NOT NULL. If things work as you fear, this will fail due to inadequate space. But as long as you don't already have NULLs, I think it will succeed; changing from nullable to non-nullable will not grow the table.
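
    A minimal version of that test might look like this (a sketch; the database name, table name, and data sizes are made up):

    -- Create a small test database and a table matching the scenario.
    CREATE DATABASE NullTest;
    GO
    USE NullTest;
    GO
    CREATE TABLE dbo.TestTable (
        Id  UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
        RAW VARBINARY(MAX) NULL
    );

    -- Fill it with enough non-NULL data to dominate the database.
    INSERT INTO dbo.TestTable (Id, RAW)
    SELECT NEWID(),
           CAST(REPLICATE(CAST('x' AS VARCHAR(MAX)), 100000) AS VARBINARY(MAX))
    FROM sys.all_objects;  -- just a convenient row source

    -- Measure, alter, measure again.
    EXEC sp_spaceused N'dbo.TestTable';
    ALTER TABLE dbo.TestTable ALTER COLUMN RAW VARBINARY(MAX) NOT NULL;
    EXEC sp_spaceused N'dbo.TestTable';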

    Also easy enough to test the perf claim. Use the same table, create a loop that does many updates/inserts/deletes, whatever you need, and time it both ways. I suspect NOT NULL performs marginally better than a check, but they're probably so close that it will be hard to discern a difference. I could be wrong.
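
    The timing loop could look along these lines (again a sketch; the iteration count and inserted value are arbitrary). Run it once with the column declared NOT NULL and once with the nullable column plus CHECK constraint, and compare:

    DECLARE @i INT = 0,
            @start DATETIME2 = SYSUTCDATETIME();
    WHILE @i < 100000
    BEGIN
        -- Each insert has to pass whichever nullability check is in place.
        INSERT INTO dbo.TestTable (Id, RAW) VALUES (NEWID(), 0x00);
        SET @i += 1;
    END;
    SELECT DATEDIFF(MILLISECOND, @start, SYSUTCDATETIME()) AS elapsed_ms;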

    There is an old discussion of this on the internet titled "SQL SERVER 2008 . Not Null vs Check Constraint" that offers opinions of smart folks, but nothing conclusive.

  • Well, your constraint is slightly different from just NOT NULL. Depending on the size of that varbinary column, the DATALENGTH check could be quite a lot of work compared with a plain NOT NULL test.

  • I would be inclined to agree with the developers that using a NOT NULL constraint on the column's type may deliver (but there's no guarantee that it will deliver) better performance than a check constraint. With the column declared NOT NULL, the query optimiser knows that it can't be null; with a check constraint, it doesn't know anything other than that there's a check constraint. Having the optimiser know the extra information may result in it delivering a better execution plan for some queries - it certainly won't cause it to deliver a worse plan - but the amount of improvement can't be known except by suck-it-and-see, and it may be zero, because perhaps none of your queries enable the optimiser to take advantage of the extra information.
    But as a general principle, if a column is known not to be nullable, it should be declared as not nullable, because doing that may be useful. A check constraint is not the right way to express this.

    Changing the definition of a column to NOT NULL will not cause any size increase if there are currently no NULL values in the column. If there are NULL values, the change would require all the NULL values to be changed to the default value if there is a declared default for the column, and the change will fail if there are some NULLs and no default is specified (or the default is NULL). If some rows have that column changed from NULL to a non-null default, that may cause a size change.
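
    One way to see whether the optimiser exploits the extra information is to compare plans for a query it could prove returns nothing (a sketch with placeholder names; the outcome can depend on the SQL Server version and on whether the constraint is trusted):

    -- With RAW declared NOT NULL, the optimiser can turn this into a
    -- constant scan (zero rows) without touching the table; with only a
    -- CHECK constraint, the plan may or may not be simplified the same way.
    SET SHOWPLAN_TEXT ON;
    GO
    SELECT Id FROM dbo.BigTable WHERE RAW IS NULL;
    GO
    SET SHOWPLAN_TEXT OFF;
    GO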

    Tom

  • Thanks, folks, for all your help!
    I had my meeting about this topic on Friday; the "problem" was discussed, and because of my tests the database migration is postponed for now.

    To tell my story with the problems I had, I can share some more information:
    - database table with two columns (GUID, VARBINARY(MAX)),
    - development wrote the database migration steps,
    - one step checked for entries with NULL or DATALENGTH = 0 and deleted them, if there were any,
    - the next step redefined the column (from NULL to NOT NULL), which (under recovery model FULL) grew the database by about the size of exactly that table, and the transaction log also grew by this amount,
    - I had to do a manual shrink afterwards, and on a database where this table is about 800 GB or more, this led to an expected "migration" of many hours,
    - so we skipped this for the moment and left the existing CONSTRAINT (checking for NULL or DATALENGTH = 0) in place; the steps are sketched below.
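
    In T-SQL terms, the migration steps looked roughly like this (a sketch; the real table and database names differ):

    -- Step 1: remove rows that would violate the new rule.
    DELETE FROM dbo.BigTable
    WHERE RAW IS NULL OR DATALENGTH(RAW) = 0;

    -- Step 2: redefine the column. Under the FULL recovery model this
    -- rewrote the column data, growing both the data file and the
    -- transaction log by roughly the size of the table.
    ALTER TABLE dbo.BigTable
        ALTER COLUMN RAW VARBINARY(MAX) NOT NULL;

    -- Step 3: reclaim the space afterwards (done manually).
    DBCC SHRINKDATABASE (MyDatabase);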

    Thanks all for sharing your knowledge/experience and hints.

    C.-A.

  • Nice.  I was surprised by your finding, just ran a small test and confirmed the storage consumption.  Did not expect this.  Now I plan to run some more tests and try to narrow down when this occurs (is it all the time?).  Glad you helped expose my false assumption.
