I recently was exploring some options to redesign tables that had many columns; most of the columns were NULL for most rows. The issue is the table growth size is reducing scalability and performance.
Some of the options I considered were: Named attribute value pairs, Serialization, XML, and Vertical Fragmentation. All of these options had pros and cons, but the impact on the application layer is an important consideration.
Then I ran across the SPARSE option, first available in SQL 2008. In summary: Sparse columns are ordinary columns that have an optimized storage for null values. Sparse columns reduce the space requirements for null values at the cost of more overhead to retrieve nonnull values. This is a great option to improve scalability without changing the physical data model.
The usage is easy, either create the column or alter like this example:
ProductionSpecification varchar(20) SPARSE NULL,
The script is a stored procedure that takes a schema name and table name and it will report back on each of the columns, how many rows are null, the percentage of rows that are null, the percentage of null threshold for the data type of the column and if sparse is recommended based on the net space saving of 40%.
Also read about the restrictions regarding memory, replication and other fun facts here:
As always, your mileage may vary. Test, Test, Test.