I have one cluster hosting multiple GP databases and a second cluster for a data warehouse that I am experimenting with (a personal project).
I have scripts that pull data from all of the GP databases to load into the DW's tables (Customers, Reps, Hub, ...).
An example of my branch script:
select interid as BranchID,
cmpnynam as BranchDesc,
address1 as BranchAddressLine1,
address2 as BranchAddressLine2,
address3 as BranchAddressLine3,
zipcode as PostalCode
where interid in ('comp1', 'comp2', 'comp4', 'comp5')
What would be the best way to use these scripts to pull the data into my testDW without ending up with duplicate rows?
I was thinking of using a staging DB on the GP cluster and then building an import data package that runs nightly. The issue I ran into is this: how do I avoid loading the same data twice?
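
To make the staging idea concrete, here is a rough T-SQL sketch of the nightly load I have in mind. It assumes BranchID uniquely identifies a branch and that the staging table is emptied and refilled by the branch script before each run. The names stg.Branch and dbo.DimBranch are placeholders I made up for illustration, not existing objects.

```sql
-- Hypothetical objects: stg.Branch (staging, reloaded nightly from the
-- branch script) and dbo.DimBranch (the DW table). Both names are assumptions.
MERGE dbo.DimBranch AS tgt
USING stg.Branch AS src
    ON tgt.BranchID = src.BranchID          -- match on the business key
WHEN MATCHED THEN
    -- Branch already exists in the DW: refresh its attributes in place
    UPDATE SET
        tgt.BranchDesc         = src.BranchDesc,
        tgt.BranchAddressLine1 = src.BranchAddressLine1,
        tgt.BranchAddressLine2 = src.BranchAddressLine2,
        tgt.BranchAddressLine3 = src.BranchAddressLine3,
        tgt.PostalCode         = src.PostalCode
WHEN NOT MATCHED BY TARGET THEN
    -- Branch not in the DW yet: insert it once
    INSERT (BranchID, BranchDesc, BranchAddressLine1,
            BranchAddressLine2, BranchAddressLine3, PostalCode)
    VALUES (src.BranchID, src.BranchDesc, src.BranchAddressLine1,
            src.BranchAddressLine2, src.BranchAddressLine3, src.PostalCode);
```

Because rows are matched on BranchID, rerunning the load should update existing branches instead of inserting them again. A unique constraint on dbo.DimBranch.BranchID would presumably be a sensible safety net on top of this.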