First off let me preface this with I'm not a DBA, I'm decent with databases and I sling some code on occasion.
Second, I'll say I'm just not sure where to start and what I'm looking for are some keywords and direction.
I have about 6 databases
The Dupes pretty much contain everything that's on the Mains (seems inefficient but it's what I have to deal with). Now 6 databases with all this data obviously isn't enough so we like to then ship the same data off to a data warehouse which is currently done with csv files.
There are quite a few tables but only 10 views which are "important" enough to be shipped off to the DW.
Sometimes the Dupes don't always match up to the Mains
What we need to do is check each of the views in the Dupes and find the Dupe that has the closest match, rowcount wise, to the Mains
Then we extract the 10 views (About 2 million records all together) out of that Dupe and send it off to a central location that the DW picks up.
We have a process that does this now. If I run through the process manually, by querying the views and dumping the data to a text file then it takes maybe 45 minutes. The process that is currently in place was built but a really good DBA and it takes about 4 to 4 and a half hours. Lots and Lots of error handling and it's executed mostly using scheduled batch files that contain BCP.exe executions. I don't know if that's best practices but it makes me puke in my mouth a little.
My thought on it is to write a service in C# (or whatever) that kicks off 6 threads and each thread connects to a database to do rowcounts on the views. Based on that it picks the best HDS. Selects the views and dumps them to text files. The multi threading is so all the counts happen at the same time instead of linear.
I'm also flirting with the idea of having it all accessed via a webservice. So at midnight the service does the counts and picks the Dupe. Around 2am the DW hits the webservice and the webservice has, for example, Dupe2 as the best choice so it queries that one for the actual records. They make a webservice request for each view they want.
Now my questions.
1. Am I thinking to much like a programmer? Should this be done using existing tools in SQL Server?
2. Is the webservice a good idea?
3. What's the best practice here for getting data out?
Thanks for the help!