SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Data Warehouse question


Data Warehouse question

Author
Message
David Lester
David Lester
Right there with Babe
Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)

Group: General Forum Members
Points: 727 Visits: 1542
This is more of a 'what is a general best practice' question than a specific code or syntax question.
Recently the data users here have started to define names of specific combinations of data, or want data broken out by sub-parts of a field. I am considering different ways of making sure the various report writers and ad-hoc users are arriving at the same datasets when quering for these.

The simplest example is: We have a source sales table with around 7 million rows. There is a new need to report against how the sales order was given to the company, that being electronic submission, paper, or scanned. The way to determine this is from a single character position in the order#. There are a dozen values, which break down to the three categories.

I am considering these options to make sure all users of the data get these categories correct.

A user function, which takes in the order ID, and returns the category. The function really is just a case statement, so not doing anyother selects or queries.

A calculated field in the table which again, uses a case statement to set the category.

A new field in the table, which is populated as the records are added.

In this which makes the most sense in your opinion/experience to do? There is a possibility of perhaps a dozen other groupings or flags which would work like this.

The second situation is a grouping which will require self-joins, or lookup queries to arive at the correct flag. In these I could either create a function, or add fields to the table with the categories. In this situation type, there could be twenty or more flags.

My inclination is to in all cases add either fields to the table, which are marked during the data loads, or to create a new table with the key from the sales data, and fields for the flags being used. However, I do understand there are differences in how one sets up a data warehouse vs a transactional system, and I have found many times when browsing through this forum amazing ideas which seem slightly counter-intuitive to me. So, I think it is worth asking others their thoughts on this, perhaps there is another option I have not thought of which is considerably better.

Any thoughts? I am sure 'It depends', what sorts of things should be considered in such a depends case?


Thanks!
PaulB-TheOneAndOnly
PaulB-TheOneAndOnly
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14073 Visits: 4639
David Lester (5/7/2010)
There are a dozen values, which break down to the three categories.


Using the above statement as reference I would agree that adding a new column to the target FACT table is the right thing to do. After initial population of the column, ETL process should take care of setting the right value during load.

A nice touch will be to create a DIM table describing the possible values of such new column.

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
David Lester
David Lester
Right there with Babe
Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)

Group: General Forum Members
Points: 727 Visits: 1542
Thanks Paul,
So with the dimension table being nice thought, are you thinking nice for being able to lookup the possible values, and adding new ones sometime in the distant future, or is it a performance related idea?
PaulB-TheOneAndOnly
PaulB-TheOneAndOnly
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14073 Visits: 4639
David Lester (5/7/2010)
... are you thinking nice for being able to lookup the possible values, and adding new ones sometime in the distant future, or is it a performance related idea?


More like as a place to store the meaning and description of each value - in general I like to have a dimension describing each coded column sitting in a factual table.

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
cochran1010
cochran1010
Forum Newbie
Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)

Group: General Forum Members
Points: 7 Visits: 3
Creating a new table with the key from the sales data, and fields for the flags is the best option rather than addition.

ebay shipping
PaulB-TheOneAndOnly
PaulB-TheOneAndOnly
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14073 Visits: 4639
cochran1010 (1/10/2011)
Creating a new table with the key from the sales data, and fields for the flags is the best option rather than addition.


Do you realize we are talking dimensional modeling here? :-)

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
David Lester
David Lester
Right there with Babe
Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)Right there with Babe (727 reputation)

Group: General Forum Members
Points: 727 Visits: 1542
Thanks Paul, and yes I do.
Thankfully, after years of attempts, IT has ceased to be a wall. While they are still not ready to listen to me, they did hire a consultant. We are in process of redesigns, using these very things.
(It is mildly amusing that every suggestion the consultant is giving them independently matches what I have been asking for these last years.)
PaulB-TheOneAndOnly
PaulB-TheOneAndOnly
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14073 Visits: 4639
David Lester (12/9/2011)
Thanks Paul, and yes I do.
Thankfully, after years of attempts, IT has ceased to be a wall. While they are still not ready to listen to me, they did hire a consultant. We are in process of redesigns, using these very things.
(It is mildly amusing that every suggestion the consultant is giving them independently matches what I have been asking for these last years.)


I know what you mean David. Sometimes organizations lack the ability to listen to the internal resources and waste sh*&^tloads of money on consultants that in the best case scenario come up with the same (brilliant) ideas internal resouces have been trying to convey for years with no luck.

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
alexanderzarita
alexanderzarita
Grasshopper
Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)

Group: General Forum Members
Points: 13 Visits: 1
Thanks for ideas above.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search