Sales DW ... modeling territory rules

  • Maybe I've read too much of Ralph Kimball's stuff, so now I can't see the forest for the trees.

    I have something like the standard sales data warehouse structure, but with a small twist. There is a territory hierarchy (postal code --> territoryID...), but because of "redistricting" every year, some zip codes may belong to more than one territory over time but only to one territory at a time.

    For example, zip code 10021 may belong to Territory #1 in 2014 and then in 2015, it may become part of Territory #2.

    What I was trying to figure out was how to model this so that I could "overlay" different definitions of a territory over time, so that a territory definition would remain static and I could see the changes in sales over time. In other words, I could do something like "Show me the 2014 sales, but group them using the 2012 territory definitions.

    I was thinking the TerritoryYear table (or whatever you call it) would be something like

    CREATE TABLE TerritoryYear (

    TerritoryID INT,

    SalesYear INT,

    PostalCode CHAR(5),

    CONSTRAINT pkTerritoryYear PRIMARY KEY (TerritoryID, SalesYear)

    );

    How then would I join that to a SalesFact table? Or would I have a snowflake schema instead? ... something like

    SalesFact--(M,1)---Customer---(1,M)---CustomerTerritoryYear---(M,1)---Territory

    SalesYear(1,M)---CustomerTerritoryYear

    (Apologies for the lame drawing... the CustomerTerritoryYear is the joining table for CustomerID, YearID, TerritoryID.)

    How do you model something like this so that you can compare different Territory definitions over time and be able to compare apples to apples? The only way that makes sense to me is to have something like the structure I have described.

    Am I on the right track or completely clueless? Territory is kind of a slowly changing dimension (with respect to Customer), but it doesn't seem to fit the mold.

    Is the solution to this described somewhere?

    Thanks!

    Pieter

  • pietlinden (12/31/2014)


    Maybe I've read too much of Ralph Kimball's stuff, so now I can't see the forest for the trees.

    I have something like the standard sales data warehouse structure, but with a small twist. There is a territory hierarchy (postal code --> territoryID...), but because of "redistricting" every year, some zip codes may belong to more than one territory over time but only to one territory at a time.

    For example, zip code 10021 may belong to Territory #1 in 2014 and then in 2015, it may become part of Territory #2.

    What I was trying to figure out was how to model this so that I could "overlay" different definitions of a territory over time, so that a territory definition would remain static and I could see the changes in sales over time. In other words, I could do something like "Show me the 2014 sales, but group them using the 2012 territory definitions.

    I was thinking the TerritoryYear table (or whatever you call it) would be something like

    CREATE TABLE TerritoryYear (

    TerritoryID INT,

    SalesYear INT,

    PostalCode CHAR(5),

    CONSTRAINT pkTerritoryYear PRIMARY KEY (TerritoryID, SalesYear)

    );

    How then would I join that to a SalesFact table? Or would I have a snowflake schema instead? ... something like

    SalesFact--(M,1)---Customer---(1,M)---CustomerTerritoryYear---(M,1)---Territory

    SalesYear(1,M)---CustomerTerritoryYear

    (Apologies for the lame drawing... the CustomerTerritoryYear is the joining table for CustomerID, YearID, TerritoryID.)

    How do you model something like this so that you can compare different Territory definitions over time and be able to compare apples to apples? The only way that makes sense to me is to have something like the structure I have described.

    Am I on the right track or completely clueless? Territory is kind of a slowly changing dimension (with respect to Customer), but it doesn't seem to fit the mold.

    Is the solution to this described somewhere?

    Thanks!

    Pieter

    Just off the top of my head...

    If you have the zip code in the sales fact table, then you join your territory year table to the sales fact table on zip code, then join your territory dimension table on the territory year id. Add the year of interest to the where clause for the territory year table, and you should be in pretty good shape.

    so...

    select sf.dollars, td.territory

    from salesfact sf

    inner join territoryYear ty on sf.postalCode = ty.postalCode

    inner join territoryDimension td on ty.territoryID = td.territoryID

    WHERE ty.year = @year

    Gives you dollars and territory for whatever year you're looking at. Expand as desired.

  • That's what I suspected. Then I can just filter the Territory dimension with the appropriate year and be off to the races.

    Thanks!

  • That is an interesting problem. How do you currently handle your Territories? Do you have separate Territory dimension or is it part of the Customer dimension?

  • the only way I could think of doing it was to have a "translation" table.

    Something like (CustomerID, Year, AssignedTerritoryID).

    That's the only way I could figure out to be able to compare territory performance over time and freeze the definition of a territory. (Can't change everything over time and expect to get anything that makes any sense.)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply