Data Warehouse question
Posted Friday, May 7, 2010 10:36 AM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, December 6, 2012 2:28 PM
Points: 185, Visits: 1,542

This is more of a 'what is a general best practice' question than a specific code or syntax question.
Recently the data users here have started to define names for specific combinations of data, or want data broken out by sub-parts of a field. I am considering different ways of making sure the various report writers and ad-hoc users arrive at the same datasets when querying for these.

The simplest example is: We have a source sales table with around 7 million rows. There is a new need to report against how the sales order was given to the company, that being electronic submission, paper, or scanned. The way to determine this is from a single character position in the order#. There are a dozen values, which break down to the three categories.

I am considering these options to make sure all users of the data get these categories correct.

A user function, which takes in the order ID and returns the category. The function really is just a CASE statement, so it is not doing any other selects or queries.

A calculated field in the table which again, uses a case statement to set the category.

A new field in the table, which is populated as the records are added.

In this case, which option makes the most sense in your opinion or experience? There could be perhaps a dozen other groupings or flags that would work like this.
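As a rough illustration of the options above, the sketch below keeps the category rule in a single CASE expression and uses it both for a query-time (calculated) column and for a stored column filled in at load time. Python's sqlite3 module is used only as a runnable stand-in for SQL Server, so the syntax will differ somewhat in T-SQL. The character position (3), the code letters, and the names sales, order_no, and submission_category are all hypothetical, since the thread does not give them.

```python
import sqlite3

# Hypothetical: the thread does not say which character position or which
# code values apply, so position 3 and these letters are placeholders.
CODE_POSITION = 3  # 1-based position inside the order number

# One shared definition of the rule, reused by every option below.
CASE_EXPR = f"""
CASE
    WHEN substr(order_no, {CODE_POSITION}, 1) IN ('E', 'X', 'W', 'B') THEN 'Electronic'
    WHEN substr(order_no, {CODE_POSITION}, 1) IN ('P', 'M', 'F', 'H') THEN 'Paper'
    WHEN substr(order_no, {CODE_POSITION}, 1) IN ('S', 'I', 'T', 'C') THEN 'Scanned'
    ELSE 'Unknown'
END
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_no TEXT PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("AAE001", 10.0), ("AAP002", 20.0), ("AAS003", 30.0), ("AAZ004", 40.0)],
)

# "Calculated field" option: derive the category every time it is queried.
conn.execute(
    f"CREATE VIEW sales_computed AS "
    f"SELECT order_no, amount, {CASE_EXPR} AS submission_category FROM sales"
)

# "New field" option: store the category once, as a load step would.
conn.execute("ALTER TABLE sales ADD COLUMN submission_category TEXT")
conn.execute(f"UPDATE sales SET submission_category = {CASE_EXPR}")

computed = conn.execute(
    "SELECT order_no, submission_category FROM sales_computed ORDER BY order_no"
).fetchall()
stored = conn.execute(
    "SELECT order_no, submission_category FROM sales ORDER BY order_no"
).fetchall()
```

Both approaches return the same categories here. The practical difference is when the work happens: at every query for the calculated version, or once per load for the stored version.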

The second situation is a grouping which will require self-joins or lookup queries to arrive at the correct flag. For these I could either create a function or add fields to the table with the categories. In this type of situation, there could be twenty or more flags.

My inclination in all cases is either to add fields to the table, which are set during the data loads, or to create a new table with the key from the sales data and fields for the flags being used. However, I do understand there are differences in how one sets up a data warehouse vs. a transactional system, and when browsing this forum I have often found amazing ideas that seem slightly counter-intuitive to me. So I think it is worth asking others for their thoughts; perhaps there is another option I have not thought of that is considerably better.
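For the second situation, a minimal sketch of the separate flag table idea might look like the following. It again uses Python's sqlite3 module as a stand-in for SQL Server. The flag itself (is_repeat_order, set when the same customer has an earlier order) and the names sales_flags, customer_id, and order_date are invented for illustration; the thread does not describe the real flags.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (
    order_no    TEXT PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    order_date  TEXT NOT NULL
);
INSERT INTO sales VALUES
    ('A001', 1, '2010-01-05'),
    ('A002', 2, '2010-01-06'),
    ('A003', 1, '2010-02-10');

-- Hypothetical flag table keyed by the sales key, one column per flag.
CREATE TABLE sales_flags (
    order_no        TEXT PRIMARY KEY REFERENCES sales(order_no),
    is_repeat_order INTEGER NOT NULL
);
""")

# Load step: the lookup against the sales table itself runs once here,
# so report queries only need a simple key join to read the flag.
conn.execute("""
INSERT INTO sales_flags (order_no, is_repeat_order)
SELECT s.order_no,
       EXISTS (SELECT 1
               FROM sales AS earlier
               WHERE earlier.customer_id = s.customer_id
                 AND earlier.order_date < s.order_date)
FROM sales AS s
""")

flags = dict(conn.execute("SELECT order_no, is_repeat_order FROM sales_flags"))
```

Computing the flag during the load keeps the expensive lookup out of every report query, at the cost of having to re-run the step if the rule changes.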

Any thoughts? I am sure the answer is 'it depends', so what sorts of things should be considered in deciding?


Thanks!
Post #918153
Posted Friday, May 7, 2010 11:30 AM


Hall of Fame

Group: General Forum Members
Last Login: Tuesday, January 28, 2014 8:15 AM
Points: 3,068, Visits: 4,639
David Lester (5/7/2010)
There are a dozen values, which break down to the three categories.


Using the above statement as reference, I would agree that adding a new column to the target FACT table is the right thing to do. After the initial population of the column, the ETL process should take care of setting the right value during each load.

A nice touch would be to create a DIM table describing the possible values of the new column.
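A small sketch of what that could look like, using Python's sqlite3 module as a stand-in for SQL Server. The table names fact_sales and dim_order_source, the codes, and the descriptions are assumptions made up for the example, not anything given in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical dimension: one row per code, with its category and meaning.
CREATE TABLE dim_order_source (
    source_code TEXT PRIMARY KEY,
    category    TEXT NOT NULL,
    description TEXT NOT NULL
);
INSERT INTO dim_order_source VALUES
    ('E', 'Electronic', 'Submitted through web portal'),
    ('X', 'Electronic', 'Submitted through EDI'),
    ('P', 'Paper',      'Mailed paper form'),
    ('S', 'Scanned',    'Scanned paper form');

-- Fact table carrying the coded column that the ETL load fills in.
CREATE TABLE fact_sales (
    order_no    TEXT PRIMARY KEY,
    amount      REAL NOT NULL,
    source_code TEXT NOT NULL REFERENCES dim_order_source(source_code)
);
INSERT INTO fact_sales VALUES
    ('AAE001', 10.0, 'E'),
    ('AAX002',  5.0, 'X'),
    ('AAP003', 20.0, 'P'),
    ('AAS004', 30.0, 'S');
""")

# Report writers join to the dimension instead of re-deriving the category.
totals = conn.execute("""
SELECT d.category, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_order_source AS d ON d.source_code = f.source_code
GROUP BY d.category
ORDER BY d.category
""").fetchall()
```

With this layout, adding or re-categorizing a code is a change to one dimension row rather than to every report.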


_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
Post #918215
Posted Friday, May 7, 2010 12:01 PM
Thanks Paul,
So, regarding the dimension table being a nice touch: are you thinking it is nice for being able to look up the possible values and add new ones sometime in the distant future, or is it a performance-related idea?
Post #918248
Posted Friday, May 7, 2010 2:09 PM


David Lester (5/7/2010)
... are you thinking nice for being able to lookup the possible values, and adding new ones sometime in the distant future, or is it a performance related idea?


More as a place to store the meaning and description of each value. In general, I like to have a dimension describing each coded column that sits in a fact table.


_____________________________________
Pablo (Paul) Berzukov

Post #918345
Posted Monday, January 10, 2011 2:55 AM
Forum Newbie

Group: General Forum Members
Last Login: Monday, January 10, 2011 3:01 AM
Points: 1, Visits: 3
Creating a new table with the key from the sales data and fields for the flags is a better option than adding fields to the existing table.



Post #1045184
Posted Thursday, December 8, 2011 2:51 PM


cochran1010 (1/10/2011)
Creating a new table with the key from the sales data and fields for the flags is a better option than adding fields to the existing table.


Do you realize we are talking dimensional modeling here?


_____________________________________
Pablo (Paul) Berzukov

Post #1218975
Posted Friday, December 9, 2011 7:40 AM
Thanks Paul, and yes I do.
Thankfully, after years of attempts, IT has ceased to be a wall. While they are still not ready to listen to me, they did hire a consultant. We are in the process of redesigning, using these very things.
(It is mildly amusing that every suggestion the consultant is giving them independently matches what I have been asking for these last years.)
Post #1219414
Posted Saturday, December 10, 2011 3:58 PM


David Lester (12/9/2011)
Thanks Paul, and yes I do.
Thankfully, after years of attempts, IT has ceased to be a wall. While they are still not ready to listen to me, they did hire a consultant. We are in the process of redesigning, using these very things.
(It is mildly amusing that every suggestion the consultant is giving them independently matches what I have been asking for these last years.)


I know what you mean, David. Sometimes organizations lack the ability to listen to their internal resources and waste sh*&^tloads of money on consultants who, in the best-case scenario, come up with the same (brilliant) ideas internal resources have been trying to convey for years with no luck.


_____________________________________
Pablo (Paul) Berzukov

Post #1219863
Posted Thursday, May 30, 2013 2:54 AM
Forum Newbie

Group: General Forum Members
Last Login: Thursday, May 30, 2013 2:47 AM
Points: 1, Visits: 1
Thanks for the ideas above.
Post #1458091