How to handle a fact table with over 250 columns

  • Hello SSC,

    I have a data warehouse question that I would like some advice with...

    Like most companies we get our data from flat files. These flat files are processed and then dumped into SQL staging tables. Then, an SSIS process transforms the data and pushes it out into our production DW environment. The staging tables consist of about 250 columns. We take about 50 of those columns into our production environment into a fact table. Our VP now wants all of the columns  from the staging table pushed into our production data warehouse. This is most likely going to slow everything down and cause bottlenecks in processing.

    What are some ways to approach this? Can a fact table with over 250 columns still perform well?

    I was considering column store indexes, or perhaps splitting the fact table into 2 or 3, but the downstream affect could be costly. We have a lot of SSIS packages and processes that point to that particular fact table. I am not sure what the correct approach to this would be. I would welcome any advice or suggestions.

    As always, thank you very much in advance SSC. I hope you are all well and staying safe!

    Thanks,

    Dave

    Everyone has a plan until they get punched in the mouth. --Mike Tyson

  • This was removed by the editor as SPAM

  • 250 seems like a lot of facts! Are you happy that you've got the data modelling correct?

    If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
    See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

  • 250 is a lot for a fact table. I had a dimension table that had 573 columns. The problems encountered were during updates due to page splits. So much of a problem that the table had to be split vertically into 2 separate tables then joined by the primary key. There is another fact table that has 151 columns this has no problems.

    The important thing to check when you have a lot of columns is the row width, i.e. the number of bytes in a row. For a fact table most of the columns should be integer foreign keys and things like datetime and measurement. You should not have large text columns. So 250 columns may well be ok.

  • Phil Parkin wrote:

    250 seems like a lot of facts! Are you happy that you've got the data modelling correct?

    Not sure I understand what you are saying Phil.

    Everyone has a plan until they get punched in the mouth. --Mike Tyson

  • Jonathan AC Roberts wrote:

    250 is a lot for a fact table. I had a dimension table that had 573 columns. The problems encountered were during updates due to page splits. So much of a problem that the table had to be split vertically into 2 separate tables then joined by the primary key. There is another fact table that has 151 columns this has no problems.

    The important thing to check when you have a lot of columns is the row width, i.e. the number of bytes in a row. For a fact table most of the columns should be integer foreign keys and things like datetime and measurement. You should not have large text columns. So 250 columns may well be ok.

    Thanks, this is helpful. How about partitioning the table by year, or whatever timestamp works?

    Everyone has a plan until they get punched in the mouth. --Mike Tyson

  • Lord Slaagh wrote:

    Phil Parkin wrote:

    250 seems like a lot of facts! Are you happy that you've got the data modelling correct?

    Not sure I understand what you are saying Phil.

    I'm used to seeing fairly wide dimension tables, but fact tables are usually just a bunch of FKs along with a few numbers. I was wondering if there was a possibility that you have a situation where you have got facts which should be dims.

    If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
    See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

  • Phil Parkin wrote:

    Lord Slaagh wrote:

    Phil Parkin wrote:

    250 seems like a lot of facts! Are you happy that you've got the data modelling correct?

    Not sure I understand what you are saying Phil.

    I'm used to seeing fairly wide dimension tables, but fact tables are usually just a bunch of FKs along with a few numbers. I was wondering if there was a possibility that you have a situation where you have got facts which should be dims.

    guess it all depends on what's in play - on my shop one of the actuarial cubes has 25 dimensions (all very short - 3 to 10 columns max ) and 1 fact table with 350+ measures (these are already aggregated values) plus a few more calculated values within the cube itself

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply