Data Warehouse question
David Lester
Posted Friday, May 07, 2010 10:36 AM
This is more of a 'what is a general best practice' question than a specific code or syntax question.
Recently the data users here have started to define names for specific combinations of data, or want data broken out by sub-parts of a field. I am considering different ways of making sure the various report writers and ad-hoc users arrive at the same datasets when querying for these.
The simplest example is: We have a source sales table with around 7 million rows. There is a new need to report against how the sales order was given to the company, that being electronic submission, paper, or scanned. The way to determine this is from a single character position in the order#. There are a dozen values, which break down to the three categories.
I am considering these options to make sure all users of the data get these categories correct.
A user function, which takes in the order ID and returns the category. The function really is just a CASE statement, so it is not doing any other selects or queries.
A calculated field in the table, which again uses a CASE statement to set the category.
A new field in the table, which is populated as the records are added.
In this case, which option makes the most sense in your opinion or experience? There could be perhaps a dozen other groupings or flags that would work like this.
The second situation is a grouping which will require self-joins or lookup queries to arrive at the correct flag. For these I could either create a function or add fields to the table with the categories. In this type of situation, there could be twenty or more flags.
My inclination in all cases is either to add fields to the table, which are marked during the data loads, or to create a new table with the key from the sales data and fields for the flags being used. However, I do understand there are differences in how one sets up a data warehouse vs a transactional system, and many times when browsing through this forum I have found amazing ideas which seem slightly counter-intuitive to me. So I think it is worth asking others for their thoughts on this; perhaps there is another option I have not thought of which is considerably better.
Any thoughts? I am sure the answer is 'It depends', so what sorts of things should be considered in such a case?
Thanks!
Post #918153
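As a rough illustration of the first option in the question above (one function that wraps the CASE logic so every query gets the same category), here is a minimal sketch. It uses Python's built-in sqlite3 module rather than SQL Server, and the character position, the code letters, and the names `order_category`, `sales`, `order_no` and `amount` are all hypothetical, since the thread does not give them.

```python
import sqlite3

# Assumption: the submission code is the 3rd character of the order number.
# The thread only says "a single character position", so this is invented.
CODE_POSITION = 2  # zero-based index

# Assumption: illustrative codes only. The thread mentions about a dozen
# values that collapse into three categories but does not list them.
CATEGORY_BY_CODE = {
    "E": "Electronic", "X": "Electronic",
    "P": "Paper", "M": "Paper",
    "S": "Scanned",
}


def order_category(order_no):
    """Return the submission category for an order number, or 'Unknown'."""
    if order_no is None or len(order_no) <= CODE_POSITION:
        return "Unknown"
    return CATEGORY_BY_CODE.get(order_no[CODE_POSITION], "Unknown")


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL queries can call it by name.
conn.create_function("order_category", 1, order_category)

conn.execute("CREATE TABLE sales (order_no TEXT PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("10E001", 50.0), ("10P002", 20.0), ("10S003", 5.0), ("10X004", 25.0)],
)

# Every report that calls the function gets the same grouping.
rows = conn.execute(
    "SELECT order_category(order_no) AS category, SUM(amount) "
    "FROM sales GROUP BY category ORDER BY category"
).fetchall()
print(rows)
```

The trade-off in this sketch is that the function runs for every row on every query, which is one of the things to weigh against storing the category once at load time.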
PaulB-TheOneAndOnly
Posted Friday, May 07, 2010 11:30 AM
David Lester (5/7/2010)
There are a dozen values, which break down to the three categories.
Using the above statement as reference, I would agree that adding a new column to the target FACT table is the right thing to do. After the initial population of the column, the ETL process should take care of setting the right value during each load.
A nice touch would be to create a DIM table describing the possible values of the new column.
_____________________________________
Pablo (Paul) Berzukov
Author of
Understanding Database Administration
available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
Post #918215
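The suggestion above (a coded column on the fact table set by the ETL process, plus a dimension table describing each code) could look roughly like the following. This is only a sketch using Python's built-in sqlite3 module in place of SQL Server; the table names, column names, code letters, descriptions and the `load_sales` helper are all assumptions made for illustration.

```python
import sqlite3

CODE_POSITION = 2  # assumption: the code is the 3rd character of the order number

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE dim_submission_type (
    submission_code TEXT PRIMARY KEY,
    category        TEXT NOT NULL,
    description     TEXT NOT NULL
);
CREATE TABLE fact_sales (
    order_no        TEXT PRIMARY KEY,
    amount          REAL NOT NULL,
    submission_code TEXT NOT NULL
        REFERENCES dim_submission_type (submission_code)
);
""")

# Hypothetical codes: the dimension is the one place where the meaning of
# each code and its category is recorded.
conn.executemany(
    "INSERT INTO dim_submission_type VALUES (?, ?, ?)",
    [
        ("E", "Electronic", "Submitted through the web portal"),
        ("X", "Electronic", "Submitted by data interchange"),
        ("P", "Paper", "Mailed paper form"),
        ("S", "Scanned", "Scanned image of a paper form"),
    ],
)


def load_sales(connection, source_rows):
    """ETL step: derive the code once, while the fact rows are loaded."""
    connection.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(order_no, amount, order_no[CODE_POSITION])
         for order_no, amount in source_rows],
    )


load_sales(conn, [("10E001", 50.0), ("10X004", 25.0),
                  ("10P002", 20.0), ("10S003", 5.0)])

# Report writers join to the dimension instead of re-deriving the category.
totals = conn.execute("""
    SELECT d.category, COUNT(*), SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_submission_type AS d ON d.submission_code = f.submission_code
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
print(totals)
```

In this sketch a new code only needs a new row in the dimension table; the reporting queries stay unchanged.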
David Lester
Posted Friday, May 07, 2010 12:01 PM
Thanks Paul,
So with the dimension table being a nice touch, are you thinking it is nice for being able to look up the possible values and add new ones sometime in the distant future, or is it a performance-related idea?
Post #918248
PaulB-TheOneAndOnly
Posted Friday, May 07, 2010 2:09 PM
David Lester (5/7/2010)
... are you thinking nice for being able to lookup the possible values, and adding new ones sometime in the distant future, or is it a performance related idea?
More as a place to store the meaning and description of each value. In general, I like to have a dimension describing each coded column sitting in a fact table.
Post #918345
cochran1010
Posted Monday, January 10, 2011 2:55 AM
Creating a new table with the key from the sales data, and fields for the flags is the best option rather than addition.
Post #1045184
PaulB-TheOneAndOnly
Posted Thursday, December 08, 2011 2:51 PM
cochran1010 (1/10/2011)
Creating a new table with the key from the sales data, and fields for the flags is the best option rather than addition.
Do you realize we are talking dimensional modeling here?
Post #1218975
David Lester
Posted Friday, December 09, 2011 7:40 AM
Thanks Paul, and yes I do.
Thankfully, after years of attempts, IT has ceased to be a wall. While they are still not ready to listen to me, they did hire a consultant. We are in the process of a redesign, using these very things.
(It is mildly amusing that every suggestion the consultant is giving them independently matches what I have been asking for these last years.)
Post #1219414
PaulB-TheOneAndOnly
Posted Saturday, December 10, 2011 3:58 PM
David Lester (12/9/2011)
Thanks Paul, and yes I do.
Thankfully, after years of attempts, IT has ceased to be a wall. While they are still not ready to listen to me, they did hire a consultant. We are in the process of a redesign, using these very things.
(It is mildly amusing that every suggestion the consultant is giving them independently matches what I have been asking for these last years.)
I know what you mean, David. Sometimes organizations lack the ability to listen to their internal resources and waste sh*&^tloads of money on consultants who, in the best-case scenario, come up with the same (brilliant) ideas internal resources have been trying to convey for years with no luck.
Post #1219863
Copyright © 2002-2013 Simple Talk Publishing. All Rights Reserved.