• Eric M Russell (11/13/2012)


    Practically all of the really large databases I've worked with in the past could have benefitted from better normalization and data type usage. For the most part, I think that poor data modeling is the primary problem. Many of the data modeling decisions that developers are making when designing data warehouses actually result in worse (not better) performance.

    For example, I've seen 'Person' tables that contain the full address and multiple phone numbers. Do your research before deciding to denormalize a table for performance reasons.

    I've seen tables containing various integer columns where the datatypes are all an 8 byte BigInt. For example: Sex BigInt, MaritalStaus BigInt, etc. The guy who did this explained the reasoning as follows: "because SQL Server is running on a 64bit operating system, it's more efficient to use 64bit integers". It was a specious claim that couldn't be proven, and even if it were marginally true, the data pages from this table were still comsuming more I/O and memory.

    Also, another big one is date/time values contained in VarChar columns, which not only consumes more resources, but it's problematic in terms of performance and data quality as well.

    Noticed same in may environment, even the data type of transactional database column was set to Numeric(11,0) or +! When inquired the answer was, we thought of that our bigint or likewise data types may cannot serve the future load of 1-2 TB of data! Which in-fact not possible but the reason exists :crazy: