March 5, 2025 at 7:27 pm
Hi all, I work with an unusual warehouse and I was wondering if anyone else has this issue. We are looking to document our source-to-target mappings, but with our setup it seems like it's going to be a manual process.
We don't have a separate ETL tool that loads the data. Our warehouse deletes all the data in the fact tables nightly and then reloads it; the only tables that get updated in place are the dimensions.
To start, a fact table is partially loaded from a view, and in that view the actual source-to-target mapping is in the join. For example:
SELECT CareerKey
FROM SourceTableCareer AS stc
INNER JOIN DimCareer AS car
    ON stc.Career = car.SourceKey   -- <-- this join is the mapping we need.
Once the data is loaded into a staging table from the view, we then have stored procedures that set more keys and measures for the fact. Some fact tables have up to 50 stored procedures used to load the data. Then, when staging is complete, the data is moved into the final table. So a tool would have to parse through the initial view and then all of the stored procedures to get the mapping.
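Roughly, the shape of one load looks like this (I've made the object names up for the example, but the pattern is the same):

TRUNCATE TABLE dbo.StageFactCareer;

INSERT INTO dbo.StageFactCareer (CareerKey)
SELECT CareerKey
FROM dbo.vw_FactCareerBase;                  -- the view holding the join shown above

EXEC dbo.usp_FactCareer_SetKeys_01;          -- then many procs update the staging table in turn
EXEC dbo.usp_FactCareer_SetMeasures_02;

TRUNCATE TABLE dbo.FactCareer;               -- finally staging is moved into the real fact table
INSERT INTO dbo.FactCareer (CareerKey)
SELECT CareerKey
FROM dbo.StageFactCareer;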
Anyway, we're looking for some guidance on how to document this without having to do it all manually. We tried working with an external vendor, and their product was unable to do what we need. I also tried working through the lineage code developed by Tomaz Kastrum in 2022, but that was no help to me.
FYI: I've been asked several times why we do it this way, and all I can say is that I wasn't the one who built it.
Any ideas would be very helpful.
Thanks
March 6, 2025 at 9:15 am
Totally understand why you don't want to do this manually! But …
I know of nothing in the market that can do it, though I haven't looked for a few years. I will be interested to see whether any others post something of substance.
Once (or if) it does get done, keeping it up to date with changes is an additional challenge.
March 6, 2025 at 1:58 pm
I'm not aware of anything that could do this.
You might try, strictly experimentally, feeding the code into an AI engine and seeing if you can build a prompt that will give you the answer. I'm not excited about this, but I think it'd be worth a try.
Otherwise, you could probably build some PowerShell (or other shell scripts) to do the work and look for the patterns that you know but that a tool isn't going to figure out easily. I think this is your best bet.
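The raw text to scan is easy enough to pull out of the catalog views. Something along these lines would hand every view and procedure definition to whatever script does the pattern matching (it won't see anything hidden in dynamic SQL):

SELECT
    OBJECT_SCHEMA_NAME(m.object_id) AS schema_name,
    OBJECT_NAME(m.object_id)        AS object_name,
    o.type_desc,
    m.definition                    -- full module text to scan for join predicates
FROM sys.sql_modules AS m
INNER JOIN sys.objects AS o
    ON o.object_id = m.object_id
WHERE o.type IN ('P', 'V')          -- P = stored procedure, V = view
ORDER BY schema_name, object_name;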
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 7, 2025 at 7:34 pm
AI is an interesting idea.
I might low-tech this and add some sort of overarching proc that is called (or a first step in a job) that documents the flows and high-level lineage tracking. It's not going to be perfect, and keeping it up to date is work, but if you're capturing code in a VCS, changes to code in a PR should include a "doc" update to this proc. That helps me when I'm debugging: I'm in the code anyway, and having a list of what's called in what order helps.
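Just to make the idea concrete, a bare-bones version of that proc might look something like this (all of the names here are made up):

-- A single queryable record of what feeds each fact table and in what order.
CREATE TABLE dbo.LoadLineageDoc
(
    FactTable  sysname       NOT NULL,
    StepOrder  int           NOT NULL,
    StepObject sysname       NOT NULL,   -- view or proc that runs at this step
    StepNote   nvarchar(400) NULL        -- e.g. which join supplies which key
);
GO

CREATE OR ALTER PROCEDURE dbo.usp_DocumentFactCareerLoad
AS
BEGIN
    SET NOCOUNT ON;

    DELETE dbo.LoadLineageDoc WHERE FactTable = N'FactCareer';

    INSERT dbo.LoadLineageDoc (FactTable, StepOrder, StepObject, StepNote)
    VALUES
        (N'FactCareer', 1, N'vw_FactCareerBase',            N'stc.Career joins DimCareer.SourceKey to get CareerKey'),
        (N'FactCareer', 2, N'usp_FactCareer_SetKeys_01',     N'sets remaining dimension keys in staging'),
        (N'FactCareer', 3, N'usp_FactCareer_SetMeasures_02', N'sets measures in staging'),
        (N'FactCareer', 4, N'dbo.FactCareer',                N'staging copied into final table');
END;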