Hello SSC,
I am not sure if I posted this in the right spot, but I have a data warehouse question...
I have a star schema with a "junk" dimension. I have never encountered junk dimensions before and from what I understand, they are simply dimensions that have no correlation to the fact table, but they are used for reporting. I have been tasked to come up with an alternative to this "junk dimension."
My first instinct is to change this to a "factless fact table" and only have the surrogate keys of the values and fetch the data for reporting joining to the dimension tables. My second instinct would be to just add it to the fact table, but that could widen the table. I am considering going with my first instinct, but if SSC has any guidance on this, it would be greatly appreciated!
Thanks,
Dave
The are no problems, only solutions. --John Lennon
March 10, 2025 at 10:32 pm
It would really depend on the specific use-case. Can you provide some examples of the contents of your junk dimension?
March 10, 2025 at 11:19 pm
Full disclosure, I am going on a face to face interview and this is a question that was asked in the phone screen. I would say that a factless fact table with just surogate keys back to the dimension tables would work, but I don't think that is the answer they are looking for.
I am not sure of the data, so I am trying to use logic here, but it's challenging without context.
The are no problems, only solutions. --John Lennon
A factless fact table would probably not be a feasible solution, because the purpose of the junk dimension is to take degenerate dimension values with low cardinality and place it in a dimension.
I guess you could have a junk dimension with key-value pairs that replace what would have been different individual dimensions, but adding another fact table wouldn't necessarily be better than just adding the degenerate values into an existing fact.
To your point, I guess you can also split the fact table and end up with a factless fact...but it would be very dependent on the situation.
I find myself in situations periodically where I have a dimension and fact table at the same grain, for example an insurance claim dimension where all the descriptive attributes are stored, and a similar fact table with all of the measures. Not necessarily great from a cardinality perspective, but I guess you can call that a junk dimension if we're going to get nitpicky about it 🙂
March 12, 2025 at 6:43 pm
Are you sure they said "junk" and not "jump"? There is good use for tables that only contain IDs from the Fact Tables and the Dimension tables. In relational databases, some refer to the as "bridge" or "bridging" tables but a common name for them is "Jump" tables.
For example... a single customer may have 5 addresses. It would not be uncommon for someone to have no rows containing ANY information about addresses in the customer table itself. Instead, the relationship would appear as 5 rows in a table named something like Customer_Address and would contain the same ID of the customer in 5 rows and each row would have the unique ID for one of the 5 addresses of the customer.
The reason they're sometimes referred to as "Jump" tables is because they help you "jump" to other tables without the data being listed in either of the two "outer" tables. Same goes for the name of "bridge" table... helps you form a "bridge" from one table to a related table.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 12, 2025 at 6:53 pm
It's "junk dimension" 🙂
Bridge/Helper tables are also commonly used in dimensional modeling to address M:M relationships, but not the same as a "junk dimension".
March 12, 2025 at 8:31 pm
Never mind about me thinking they were talking about "Jump" tables. I'm not a DW kind of person.
I asked Perplexity.AI the following...
What is a "junk" dimension in a data warehouse? What and how is it used?
It provided a rather interesting summary answer. In fact, I've never heard of a "junk" dimension and it's explanation of what, how to use, and how to create them (100,000 ft level... needs extra study for details) is down-right fascinating. It's a really good summary and the tool also provides links for where it gets the information.
As a bit of a sidebar, I'm VERY wary of AI in general but it makes a really, really good first search for articles to read as well as some pretty nicely worded sentences. The early days had me laughing with how stupid it could actually be.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 12, 2025 at 8:36 pm
Right on 🙂
I don't particularly like the term "junk dimension", but that's how Kimball coined it back in the day. Here's the official reference(s) for anyone stumbling across this thread:
https://www.kimballgroup.com/2009/06/design-tip-113-creating-using-and-maintaining-junk-dimensions/
March 13, 2025 at 7:46 pm
Thanks for the links, Martin.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply