First of all, you say that the partitioning was done so that you could achieve parallelism in your ETL loads to the staging table. You also state that the partitioning was done across 18 drives. Here's a major catch with that... unless you can guarantee that your 18 drives are actually 18 different physical drives instead of just 18 logical drives pointing to the same 1 or 2 or even 4 (for example) physical drives, your attempts at parallelism might actually be slowing the loads down because of read head movement on the physical drive. True parallel loading can only happen if each partition is guaranteed to live on its own physical drive or its own set of physical drives. If the drives are merely logical drives, then you're likely still only doing a "time-share load" in a serial fashion. You'll have the same problem no matter which RDBMS you use.
And, surprise, surprise, depending on how the logical drives are carved out, this can also affect SSDs, although to a lesser extent because there is no physical R/W head movement. There's still a possibility that you might end up time-sharing a pipe, which is a serial rather than parallel operation.
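As a first sanity check from the SQL Server side, you can at least see which logical paths your files and filegroups map to. This is only a sketch with a placeholder database name, and it only tells you the logical volumes; you'd still need your SAN or VM admins to confirm which physical spindles back them.

```sql
USE YourStagingDB;  -- placeholder database name

-- List each data file, the path it actually lives on, and its filegroup.
-- If many partitions' files share the same drive letter/mount point,
-- "parallel" loads are probably time-sharing the same physical media.
SELECT df.name       AS logical_file,
       df.physical_name,
       fg.name       AS filegroup_name
FROM sys.database_files AS df
LEFT JOIN sys.filegroups AS fg
       ON df.data_space_id = fg.data_space_id
ORDER BY df.physical_name;
```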
You mention that the staging table can have 200-600 million rows in it. When does it have that many rows in it? What is the life cycle of a row in the staging table? For example, is the staging table empty when you first start to load either the 18 files or the 4 files that you may receive? Since it is a staging table, where and how does the data go from there? Or, are you saying that the staging table isn't actually a "staging" table and that it's really the final target table that you're loading into? If it's the final target table, is the load an "UPSERT" (update matching rows, insert new rows) or is it an "always insert" methodology?
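Just so we're talking about the same thing when I say "UPSERT", here's the classic two-statement pattern in T-SQL. All table and column names here are hypothetical, purely to illustrate the distinction between "update matching, insert new" and an "always insert" load:

```sql
-- Hypothetical objects: dbo.FinalTarget, dbo.Staging, BusinessKey, etc.
BEGIN TRANSACTION;

-- Update rows that already exist in the target...
UPDATE tgt
   SET tgt.Amount     = src.Amount,
       tgt.ModifiedOn = src.ModifiedOn
  FROM dbo.FinalTarget AS tgt
  JOIN dbo.Staging     AS src
    ON src.BusinessKey = tgt.BusinessKey;

-- ...then insert the rows that don't exist yet.
INSERT INTO dbo.FinalTarget (BusinessKey, Amount, ModifiedOn)
SELECT src.BusinessKey, src.Amount, src.ModifiedOn
  FROM dbo.Staging AS src
 WHERE NOT EXISTS (SELECT 1
                     FROM dbo.FinalTarget AS tgt
                    WHERE tgt.BusinessKey = src.BusinessKey);

COMMIT;
```

An "always insert" load is just the INSERT half. Which one you're doing matters a lot for logging, indexing, and partitioning strategy.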
You also haven't stated how many rows and GB each of the 18 files or each of the 4 files you load contains. And what are you using to load the files into SQL Server? BCP? BULK INSERT? Something else, and if so, what is it? What is the format of the files you load? True CSV with text qualifiers? Excel-like CSV where text qualifiers only appear on a "cell" that happens to carry a delimiter? Tab-separated? Fixed-field? (gasp) XML/JSON? (double gasp) EDI?
Heh... and you mention that you've done POP testing? On what? Just SQL Server? Have you/they done the same testing on Oracle to see if there's actually been an improvement or are folks just speculating that there will be based on some anecdotal evidence?
Also, do you good folks understand and use the concept of actual "Minimal Logging", and have you verified that your loads are actually using it? That, all by itself, can make the loads more than twice as fast. Too many indexes for that? Have you read the following article? The article is a bit dated but everything in it is still applicable.
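For reference, a minimally logged load into a heap staging table looks something like the sketch below. It assumes the database is in the SIMPLE or BULK_LOGGED recovery model and that the target is a heap (or otherwise meets the minimal-logging prerequisites); the path and table name are placeholders:

```sql
-- Placeholder table and file path. TABLOCK is one of the prerequisites
-- for minimal logging when bulk loading into a heap.
BULK INSERT dbo.Staging
FROM 'E:\Loads\file01.csv'
WITH (
      TABLOCK,
      FIELDTERMINATOR = ',',
      ROWTERMINATOR   = '\n'
     );
```

If you skip TABLOCK, or the table has nonclustered indexes that disqualify it, the load is fully logged and you pay for every row in the transaction log.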
As for the Paul White article... absolutely brilliant work on his part, as usual. With nothing aimed at Paul or his good work there, do you have even one index on your partitioned table where the leading key is sorted in descending order? Is it likely that you ever would? I agree that it IS broken code on Microsoft's part and that they did a pretty miserable job of patching the error instead of fixing it, especially since the patch requires the use of THAT particular trace flag. But I also agree with Paul that a descending leading key in an index on a partitioned table is an extremely rare thing and, unless such a thing is critical to the success of your project, I wouldn't give it another thought except to ask: did you test Oracle for the same problem?
Oracle's in-table, metadata-only, row-level archival method sounds really cool except (IMHO, because I don't know the details of Oracle's table structure) that it solves a problem that a trivial amount of code would solve, and it still leaves the data in the table. It really hasn't solved anything if the data is still in place. I don't know... maybe Oracle's engine is smart enough not to load such rows into memory when adjacent data is read. But it's still on disk and the OS still has to read past it. If not on physical disk, then it still takes up space on expensive SSDs, and I'd think that it would still go through the throes of backups and restores if the archiving is reversible.
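To show what I mean by a "trivial amount of code", here's one way to get the same row-level-archival behavior in SQL Server with a soft-archive flag and a filtered index. All object names are hypothetical:

```sql
-- Hypothetical table dbo.BigTable. Add a metadata-only archive flag...
ALTER TABLE dbo.BigTable
  ADD IsArchived bit NOT NULL
      CONSTRAINT DF_BigTable_IsArchived DEFAULT (0);

-- ...and a filtered index so "active row" queries never even see the
-- archived rows in the index.
CREATE NONCLUSTERED INDEX IX_BigTable_Active
    ON dbo.BigTable (BusinessKey)
       INCLUDE (Amount, ModifiedOn)
 WHERE IsArchived = 0;

-- "Archiving" a row is now a one-column, fully reversible update.
UPDATE dbo.BigTable
   SET IsArchived = 1
 WHERE BusinessKey = 12345;  -- example key
```

Same caveat applies here as to Oracle's feature, of course: the rows are still in the table, still on disk, and still in your backups.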
As for the table holding 135 columns, I can certainly understand that for a staging table. I can't understand that for permanent data. Assuming that Oracle isn't a column-centric database (I'm pretty sure that it is not) and even if there is no possibility of further normalizing the data in the table, even Oracle would benefit from some thoughtful VERTICAL partitioning through the use of "sister" tables where the high usage columns are kept in a nice narrow table and the infrequently used data is in other tables. A nice updateable view with an Instead Of trigger in SQL Server or a Before trigger in Oracle on a "ghost" table would certainly make life easy and fast. As for indexing every column, of course you wouldn't. You would only index for the lookups you have and possibly not all of them. As you point out, that may be one area where partition elimination may come in handy but only if the code is written in such a fashion as to take advantage of that. And guess what? Not all of your queries are going to be able to take advantage of partition elimination. It may even be that none of the code will.
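Here's a bare-bones sketch of that vertical-partitioning idea in T-SQL. Everything here is a made-up example (table, column, view, and trigger names) just to show the shape of the "sister table plus updatable view" approach:

```sql
-- Narrow, high-usage "hot" table...
CREATE TABLE dbo.Account_Hot
(
    AccountID int   NOT NULL PRIMARY KEY,
    Balance   money NOT NULL,
    Status    char(1) NOT NULL
);

-- ...and a wide "cold" sister table for the rarely touched columns.
CREATE TABLE dbo.Account_Cold
(
    AccountID int NOT NULL PRIMARY KEY
        REFERENCES dbo.Account_Hot (AccountID),
    Notes     nvarchar(4000) NULL,
    LegacyRef varchar(50)    NULL
);
GO
-- The view reassembles the original 1:1 row shape for the application.
CREATE VIEW dbo.Account
AS
SELECT h.AccountID, h.Balance, h.Status, c.Notes, c.LegacyRef
  FROM dbo.Account_Hot  AS h
  JOIN dbo.Account_Cold AS c
    ON c.AccountID = h.AccountID;
GO
-- INSTEAD OF trigger routes inserts against the view to both tables.
CREATE TRIGGER dbo.Account_InsteadOfInsert
ON dbo.Account
INSTEAD OF INSERT
AS
BEGIN
    INSERT INTO dbo.Account_Hot (AccountID, Balance, Status)
    SELECT AccountID, Balance, Status FROM inserted;

    INSERT INTO dbo.Account_Cold (AccountID, Notes, LegacyRef)
    SELECT AccountID, Notes, LegacyRef FROM inserted;
END;
```

The application keeps talking to dbo.Account as if nothing changed, while the hot table stays narrow enough to be scan- and cache-friendly.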
As for the partitioning, you're doing temporal partitioning based on dates to isolate quarters. So, let me ask... how far back can a row go before you can say it will NEVER be modified ever again? I'm thinking that, compared to 4 years back, it won't be very long. Have you considered the performance and maintenance advantages of making old partitions READ_ONLY? If not, then you're actually missing out on the biggest reasons to partition to begin with.
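To make the READ_ONLY point concrete, here's roughly what that looks like when each quarter lives on its own filegroup. Database, filegroup, and path names are placeholders, and the ALTER needs the filegroup to be quiet when you run it:

```sql
USE master;

-- Once you can guarantee the rows in this quarter will never change,
-- freeze the filegroup. No more index maintenance, no more nightly
-- backups of static data, no more worrying about corruption creep.
ALTER DATABASE YourDB
    MODIFY FILEGROUP FG_2016_Q1 READ_ONLY;

-- Back it up once; after that, your regular backups can skip it.
BACKUP DATABASE YourDB
    FILEGROUP = 'FG_2016_Q1'
    TO DISK = 'E:\Backups\YourDB_FG_2016_Q1.bak';
```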
As for all the bloody damned updates that MS does, I agree that it's a real pain in the patooti but it's better than leaving something broken for a long time and it's better than waiting to be made to suffer through a successful attack. I can't speak to Oracle's update strategy because I don't know what it actually is. It's nice to think that Oracle never requires updates for bugs or security but I doubt that's true.
Last but not least, if you're having problems with front-end performance and that's part of the reason why you want to move to Oracle's multi-column and sub-column partitioning methods, have you actually looked at what is causing the performance problems? I can assure you that even on a large database, it's usually not the fault of the database or the objects that are in it, even if you have a bit of a bad design. Most of the time, it's crap code that's coming from an ORM or code designed by someone who thinks a database is just a place to store data.
We just went through a monster problem where we reached some sort of tipping point and the system would "lock" for as long as 30 minutes and then occasionally do it again 5 minutes later. Lots of people were blaming the database. Hell... lots of people were blaming me because I've not done ANY index maintenance since 17 January 2016. I stood my ground and went searching for the problem. It turned out to be two crap problems with one piece of crap ORM code. First, the generated ORM code made it impossible to use an index because the WHERE clause had more than 40 columns in it. Worse yet, the items it was looking for in the WHERE clause were in the form of literal strings, and at least one byte of the code changed every time it was executed. That also meant (and I proved it) that the code had to recompile virtually every time it was used. Because of the silly WHERE clause, it took anywhere from 2 to 22 seconds to recompile each and every time, and it was used tens of thousands of times each hour. As if that's not bad enough, and contrary to documentation on the subject, we then found out that all of the connection strings created by Entity Framework defaulted to MULTIPLE ACTIVE RESULT SETS (MARS) being on. For good code, that's great. For bad code, and to make a much more complicated explanation shorter, the batch doesn't finish before the transaction does (and that bit of information IS in the documentation) and it causes the transaction to roll back. It's a freakin' READ transaction and it rolls back. One would roll back because of the crazy-long compile time, which would cause the system to slow a bit and cause another to roll back, and I ended up with hundreds of sessions all waiting their turn to roll back, which consumed all but 1 or 2 of 32 CPUs... slammed all but those 1 or 2 at 80 to 100%.
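For anyone who wants to check their own connection strings, the MARS setting is a single keyword in the SqlClient connection string. Server and database names below are placeholders:

```
Server=myServer;Database=myDb;Integrated Security=true;MultipleActiveResultSets=False
```

If your ORM or framework is generating the connection strings, check what it actually emits; in our case the Entity Framework tooling had set `MultipleActiveResultSets=True` everywhere without anyone noticing.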
So, with all due respect to the things you've tried, 1) I don't believe that you've tried everything and 2) you haven't said that you've actually done scalable POP testing in Oracle. I could certainly be incorrect but it may be that you'll see little if any improvement in performance once you spend the time, money, and effort to do the migration.
My bottom line recommendation is that until you've done such scalability testing on similarly massive amounts of information, you have no justification for even considering the migration. "One good test is worth a thousand expert opinions".
Now, if you have done such testing on Oracle and it turned out to be a miracle of performance, then why are we still posting about problems to each other? 🙂 You should be bragging about the wicked performance improvements you got instead. But they haven't done such testing yet, have they? :ermm: :Whistling:
p.s. When we turned off MARS on all connections to the database, average CPU across all 32 CPUs dropped from 40% to 22% during the busy hours. When we fixed the root problem with a properly written and much better behaved stored procedure, the average CPU across all 32 CPUs dropped from 22% to a ripple between 6 and 8%. During the "firefights" to try to fix the problem, they had added memory to take us from 250GB to 384GB and also took us up to 48 CPUs. That actually made things a bit worse because more sessions were able to get stuck and start rolling back. My post-fix measurements were made after I turned off the 16 extra CPUs just to take the measurements.
You can bet your sweet bippy that I turned them back on when I was done. 😉 I never turn down hardware even when it doesn't make a difference.