SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Geography Dimension


Geography Dimension

Author
Message
Dave Klug
Dave Klug
Ten Centuries
Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)

Group: General Forum Members
Points: 1391 Visits: 1174
I am currently working on a warehouse project that includes a dimension for geography. It includes Zip Code, City and State currently. Latitude and Longitude are available as well in the feed that populates the dimension. My question is should the lat/long be placed on the dimension or fact? I am leaning towards the dimension but am getting pushback as others are saying it is too narrow for the dimension (will cause too many rows on the dimension) and it should be placed on the fact.

Anyone else have this debate before?
stevefromOZ
stevefromOZ
Hall of Fame
Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)

Group: Moderators
Points: 3440 Visits: 3757
Seriously guessing this depends on what your fact is actually recording..... But assuming sales (when in reality, it's just as likely mining equipment moving round an opencut mine), then the count/volume of the Long/Lat wouldn't be any higher being in the dim than in the fact. Personaly, I'd go with the dim. To me, it's like saying we have 1MM clients, so thats a lot of phone numbers, so put them in the fact. They still belong in the dim :-)

Steve.
Dave Klug
Dave Klug
Ten Centuries
Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)

Group: General Forum Members
Points: 1391 Visits: 1174
That is exactly my thought and funny you should bring it up. We are tracking phone calls made by customers to a business and where the call was made from.

The consulting company we are working with wants to put the phone number and lat/long in the fact but my argument is it belongs in the dimension as it's not a measure. Should the number be called more than once, storage is less on the dimension than on the fact. Chances are though , either phone or lat/long will most likely appear only once in the data but it's still not a fact and belongs in a dimension.
stevefromOZ
stevefromOZ
Hall of Fame
Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)Hall of Fame (3.4K reputation)

Group: Moderators
Points: 3440 Visits: 3757
Wondering if they also lose some reporting/analytical functionality by separating it from the rest of the Customer record? If you/they want to attribute this in *any* way then it's much better suited being in the/a dimension. Although you can definitely get exceptions, my rule of thumb on this would be that it's an attribute of the grain of the transaction - ie the phone call. I could theoretically want to build a hierarchy up over it. It's the equivalent of wanting to store the 'call reason' in the fact, which I'm guessing they're not asking for, because, it's an attribute of the call and again I may want to build a hierarchy up over that.

Steve.
WHug
WHug
SSC Journeyman
SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)

Group: General Forum Members
Points: 99 Visits: 410
One option is to include a field of spatial type Geography and insert into it a polygon of the geographic area. Review the Spatial data types in BOL for more information.



richykong
richykong
SSC-Enthusiastic
SSC-Enthusiastic (192 reputation)SSC-Enthusiastic (192 reputation)SSC-Enthusiastic (192 reputation)SSC-Enthusiastic (192 reputation)SSC-Enthusiastic (192 reputation)SSC-Enthusiastic (192 reputation)SSC-Enthusiastic (192 reputation)SSC-Enthusiastic (192 reputation)

Group: General Forum Members
Points: 192 Visits: 621
Are you going to be capturing any further location detail in the Geography Dimension like street? Is that level of detail needed? If no, then I would not capture the Lat/Lon in this table. Based on what you have right now, the granularity of the dimension is at zip/city level. Capturing the Lat/Lon would make the dimension exponentially larger since each record would essentially have a different Lat/Lon. Assuming a customer may make the call with a cell phone, a single customer could make a call from multiple Lat/Lon even if the customer is always calling from within the same area. If the Lat/Lon is important for auditing/analysis but not used for aggregate analysis, I would just store the detail in the fact since the granularity could potentially be equivalent to the fact granularity.

In situations where I have a detail where the granularity is very close to the granularity of the Fact, I usually just add it to the fact table. I believe details like these are referred to as degenerate dimensions. I would also add the customer number to fact as a degenerate dimension in your situation.

If you create a dimension that is essentially 1 to 1 with the fact granularity, there is no benefit.
tmitchelar
tmitchelar
SSC Veteran
SSC Veteran (249 reputation)SSC Veteran (249 reputation)SSC Veteran (249 reputation)SSC Veteran (249 reputation)SSC Veteran (249 reputation)SSC Veteran (249 reputation)SSC Veteran (249 reputation)SSC Veteran (249 reputation)

Group: General Forum Members
Points: 249 Visits: 508
I would choose dimension because it is a attribute and not a measure.
davoscollective
davoscollective
SSC Eights!
SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)

Group: General Forum Members
Points: 921 Visits: 1004
I would argue that the central lat/long of a particular ZIP code region is fine enough granularity. Unless you are a Telco or a Utility, mapping the exact coordinates of each customer might be overkill when you can just map by ZIP.

Since you already have ZIP code and possibly the suburb/county/council etc that names the ZIP code then a single lat and long record defining the centre of that ZIP region on the geography dimension will not create any more records than already exist in the dimension.

On the other hand, I've previously used coordinates of client offices in a 'customer' dimension, and this was useful for mapping sales on a fairly zoomed in view, the map was about 50km^2 (about 20Mi^2). I didn't have a geography dimension at all, but that design decision depends on the situation.
lshanahan
lshanahan
SSChasing Mays
SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)

Group: General Forum Members
Points: 616 Visits: 436
In my mind, this is a question of granularity. There are only 181 latitudes (90 degrees north to 90 degrees south including 0) and 360 longitudes (plus or minus 180 degrees) and they aren't going to change. Two dimension tables could be created since they are independent of one another holding only the degree value for each one which are linked to whatever fact tables require them. Then the precise latitude and longitude can be added as fields for the fact tables (which is probably the only way it could be done for decimal geolocation values). The facts can then be rolled up by the overall latitude and longitude without involving huge tables and still preserve the precise values for a given entry. Even giving the latitude/longitude tables a granularity of one minute (60 minutes per degree), would only involve a table of 21,600 records for longitude and 10,860 for latitude. Again, if the precise location is needed, the full latitude and longitude can be included in the fact tables. It is trivial for a computer to convert decimal to degree/minute/second and back if needed.

This is similar to the Kimball design tip 51 for handling time of day as a dimension in addition to a calendar dimension.

____________
Just my $0.02 from over here in the cheap seats of the peanut gallery - please adjust for inflation and/or your local currency.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search