Why SQL Server is Better Than Oracle

  • kannan_egd - Wednesday, October 4, 2017 4:08 AM

    Thom A - Wednesday, October 4, 2017 3:53 AM

    That isn't a limitation of SQL Server though, that is a limitation of a tool used to interact with SQL Server (and you don't even have to use it; it's not even available on Linux/Docker). Also, would you really want to output 2GB of data into a GUI? I suspect the answer is no. If you're going to be returning that much data, you'd have an extract process, not display it on screen. I have a text file on my PC that's over 500MB and Notepad++ has a fit when I open it. Why? Because it's a lot of data to try and load into your PC's resources. It's not Notepad++'s fault, it's just the user being silly trying to load that much raw text.

    If Oracle has a tool where you can display 4GB of data, great! But why would you want to...? I would hazard a guess that your PC probably doesn't have more than 16GB of RAM, so are you happy with your Oracle GUI using about 30% of your resources? I wouldn't be.

    Yes, I too agree, but when it comes to handling large volumes of data SQL Server still stands behind Oracle; SQL Server is not a true enterprise product and is not suitable for handling very large volumes of data. The best example is table partitioning: SQL Server still relies on range partitioning, while Oracle offers multiple choices, and multi-column partitioning is not available in SQL Server at all. Ours is a healthcare database; we are using SQL Server 2016 Enterprise, having recently migrated from 2012, and now we are migrating the database to Oracle because of this partitioning limitation. We plan to handle multi-state data for our product, and for that we are looking at list-wise, range-wise, and multi-column partitioning, but SQL Server is simply not a fit for this. Many experts have come to us with suggestions for architectural changes, but those would require a lot of resources and might break functionality, so we are still looking for composite partitioning; it is not available from Microsoft at all, and Microsoft has no plan to develop this feature.

    This is getting rather silly. You seem to prefer Oracle for the stuff you do and that is great. Your only real argument has been about the partitioning options in Oracle. For large data that is important, but for the more standard smaller datasets a behemoth like Oracle just isn't necessary. Where I work it wouldn't make sense to utilize Oracle because our data isn't massive enough to warrant the absurdly high cost of Oracle. As with everything data related, "it depends" is the only real answer. It sounds like in your environment Oracle is the best option; for plenty of other people it is not a good option because of cost, or data size, or other requirements. I would like to point out that you came to a forum dedicated to SQL Server to discuss how much better Oracle is. I am still hoping you can provide some solid evidence other than just listing some options for table partitioning.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • kannan_egd - Wednesday, October 4, 2017 4:08 AM

    Yes, I too agree, but when it comes to handling large volumes of data SQL Server still stands behind Oracle; SQL Server is not a true enterprise product and is not suitable for handling very large volumes of data. The best example is table partitioning: SQL Server still relies on range partitioning, while Oracle offers multiple choices, and multi-column partitioning is not available in SQL Server at all. Ours is a healthcare database; we are using SQL Server 2016 Enterprise, having recently migrated from 2012, and now we are migrating the database to Oracle because of this partitioning limitation. We plan to handle multi-state data for our product, and for that we are looking at list-wise, range-wise, and multi-column partitioning, but SQL Server is simply not a fit for this. Many experts have come to us with suggestions for architectural changes, but those would require a lot of resources and might break functionality, so we are still looking for composite partitioning; it is not available from Microsoft at all, and Microsoft has no plan to develop this feature.

    So, SQL Server is missing one functionality that you need. That doesn't make Oracle better, it just makes it the correct solution for your problem.
    However, if you can't make changes to adapt to some suggestions, good luck trying to migrate from one RDBMS to another.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • kannan_egd - Wednesday, October 4, 2017 4:08 AM

    Yes, I too agree, but when it comes to handling large volumes of data SQL Server still stands behind Oracle; SQL Server is not a true enterprise product and is not suitable for handling very large volumes of data. The best example is table partitioning: SQL Server still relies on range partitioning, while Oracle offers multiple choices, and multi-column partitioning is not available in SQL Server at all. Ours is a healthcare database; we are using SQL Server 2016 Enterprise, having recently migrated from 2012, and now we are migrating the database to Oracle because of this partitioning limitation. We plan to handle multi-state data for our product, and for that we are looking at list-wise, range-wise, and multi-column partitioning, but SQL Server is simply not a fit for this. Many experts have come to us with suggestions for architectural changes, but those would require a lot of resources and might break functionality, so we are still looking for composite partitioning; it is not available from Microsoft at all, and Microsoft has no plan to develop this feature.

    Maybe you've already investigated this option, but as far back as SQL Server 2008 we can use a persisted computed column as the partitioning column, and that computed column expression can be based on multiple columns and can use a broad range of T-SQL functions.

    https://docs.microsoft.com/en-us/sql/t-sql/statements/alter-table-computed-column-definition-transact-sql
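
    As a rough illustration of that approach (the table, columns, and key format below are invented for the example, not taken from the thread), a persisted computed column built from two columns can serve as the partitioning column:

    ```sql
    -- Hypothetical healthcare-style table; PartKey combines state and quarter
    -- into a single char(8) value such as 'CA2017Q1'.
    CREATE PARTITION FUNCTION pf_StateQuarter (char(8))
        AS RANGE RIGHT FOR VALUES ('CA2017Q1', 'CA2017Q2', 'NY2017Q1', 'NY2017Q2');

    CREATE PARTITION SCHEME ps_StateQuarter
        AS PARTITION pf_StateQuarter ALL TO ([PRIMARY]);

    CREATE TABLE dbo.Claim
    (
        ClaimID   bigint  NOT NULL,
        StateCode char(2) NOT NULL,
        ClaimDate date    NOT NULL,
        -- Persisted computed column derived from two columns; it becomes the
        -- partitioning column, giving a multi-column-style partitioning key.
        PartKey AS CONVERT(char(8),
                      StateCode
                      + CONVERT(char(4), YEAR(ClaimDate))
                      + 'Q'
                      + CONVERT(char(1), DATEPART(QUARTER, ClaimDate))) PERSISTED NOT NULL,
        CONSTRAINT PK_Claim PRIMARY KEY CLUSTERED (PartKey, ClaimID)
    ) ON ps_StateQuarter (PartKey);
    ```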

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • To effectively scale to TB-sized tables, an enterprise-class RDBMS like SQL Server or Oracle must be paired with an enterprise-class DBA.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • kannan_egd - Wednesday, October 4, 2017 4:08 AM

    Thom A - Wednesday, October 4, 2017 3:53 AM

    kannan_egd - Wednesday, October 4, 2017 3:41 AM

    jfogel - Tuesday, October 3, 2017 6:04 PM

    Lynn Pettis - Tuesday, October 3, 2017 3:10 PM

    jfogel - Tuesday, October 3, 2017 2:37 PM

    Thing is, back then anyway, TOAD wasn't free and Oracle didn't have a good query/result interface as part of the package. I used to use a linked server to query Oracle and that worked out pretty well.

    You are correct, Toad isn't free.  When working with Oracle it is worth the cost.

    True, but equally so is the fact that when you buy the "Rolls-Royce" of database products, it isn't too much to ask to have quality query tools. TOAD should have never been something anyone needed. I've always said that getting data out of Oracle is like getting blood from a rock, and nobody has proven me right more than Oracle.

    I still have the all-glass Chess set they gave me. I would have rather had a good query interface.

    People are looking at cosmetic items, not at quality and stability. When dealing with large volumes, SSMS is not fit to hold data in its grid once it reaches more than 2GB.

    That isn't a limitation of SQL Server though, that is a limitation of a tool used to interact with SQL Server (and you don't even have to use it; it's not even available on Linux/Docker). Also, would you really want to output 2GB of data into a GUI? I suspect the answer is no. If you're going to be returning that much data, you'd have an extract process, not display it on screen. I have a text file on my PC that's over 500MB and Notepad++ has a fit when I open it. Why? Because it's a lot of data to try and load into your PC's resources. It's not Notepad++'s fault, it's just the user being silly trying to load that much raw text.

    If Oracle has a tool where you can display 4GB of data, great! But why would you want to...? I would hazard a guess that your PC probably doesn't have more than 16GB of RAM, so are you happy with your Oracle GUI using about 30% of your resources? I wouldn't be.

    Yes, I too agree, but when it comes to handling large volumes of data SQL Server still stands behind Oracle; SQL Server is not a true enterprise product and is not suitable for handling very large volumes of data. The best example is table partitioning: SQL Server still relies on range partitioning, while Oracle offers multiple choices, and multi-column partitioning is not available in SQL Server at all. Ours is a healthcare database; we are using SQL Server 2016 Enterprise, having recently migrated from 2012, and now we are migrating the database to Oracle because of this partitioning limitation. We plan to handle multi-state data for our product, and for that we are looking at list-wise, range-wise, and multi-column partitioning, but SQL Server is simply not a fit for this. Many experts have come to us with suggestions for architectural changes, but those would require a lot of resources and might break functionality, so we are still looking for composite partitioning; it is not available from Microsoft at all, and Microsoft has no plan to develop this feature.

    Actually, it may be (I don't know for sure because I don't know what you've tried) because you don't know SQL Server well enough.

    I say that because what you're calling "Column Partitioning", we call "Properly Indexed", "Covering Indexes", "Filtered Indexes", "Column Store", etc.  Also, if you thought Table Partitioning was a panacea of performance, you were dead wrong.  It's a storage method used mostly to ease maintenance requirements.  Properly indexed monolithic tables normally beat Partitioned Tables quite handily because of the extra overhead of partitioning.  To wit, even so-called "Partition Elimination" is quite often slower than writing good code against properly indexed tables.
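
    To put a rough shape on those options (table and column names here are hypothetical, purely to show the syntax):

    ```sql
    -- Illustrative table.
    CREATE TABLE dbo.Claim
    (
        ClaimID   bigint        NOT NULL PRIMARY KEY,
        StateCode char(2)       NOT NULL,
        ClaimDate date          NOT NULL,
        Amount    decimal(12,2) NOT NULL,
        IsActive  bit           NOT NULL DEFAULT (1)
    );

    -- Covering index for a hot query path (state + date range); INCLUDE keeps
    -- the query from ever touching the base table.
    CREATE NONCLUSTERED INDEX IX_Claim_State_Date
        ON dbo.Claim (StateCode, ClaimDate)
        INCLUDE (Amount);

    -- Filtered index: only the "active" rows most queries care about.
    CREATE NONCLUSTERED INDEX IX_Claim_Active
        ON dbo.Claim (ClaimDate)
        WHERE IsActive = 1;

    -- Nonclustered columnstore for analytic scans over the same table
    -- (updatable alongside rowstore indexes from SQL Server 2016 on).
    CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Claim
        ON dbo.Claim (StateCode, ClaimDate, Amount);
    ```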

    Because partitioning is usually a bit of a performance drag when compared to monolithic tables and it still relies on good code, I think you may end up being disappointed in "Multi-Column Partitioning" but, again, I don't know for sure.

    There's also a pretty good coding paradigm shift required to migrate to Oracle (or vice versa).  The mental changes to use joined UPDATEs in Oracle alone may be enough to drive some Developers crazy.  I'm not saying it's bad... I'm saying it's totally different because the UPDATE statement in Oracle has no FROM clause and joined updates rely on correlated subqueries in Oracle (at least they did when I was working with it... I don't know if they changed that).
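
    For anyone who hasn't made that jump, the difference looks roughly like this (illustrative table names only):

    ```sql
    -- SQL Server: joined UPDATE using the proprietary FROM clause.
    UPDATE t
    SET    t.Amount = s.Amount
    FROM   dbo.Target AS t
    JOIN   dbo.Source AS s
      ON   s.ID = t.ID;

    -- Oracle: UPDATE has no FROM clause, so the same change is typically written
    -- with correlated subqueries (or MERGE).
    UPDATE target t
    SET    t.amount = (SELECT s.amount FROM source s WHERE s.id = t.id)
    WHERE  EXISTS   (SELECT 1 FROM source s WHERE s.id = t.id);
    ```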

    Since you say that you're in the process of migrating to Oracle, I have to ask the question... did you good folks actually have the opportunity to test your larger requirements in an Oracle system or is your decision based on some demo by a sales group?  I think you're going to find many performance problems in Oracle, just like you have in SQL Server because performance usually is NOT a matter of which RDBMS you're using... it's usually a matter of how much performance challenged code you have.  If you take the same bad coding methodologies to Oracle, you'll have the same performance problems there as you did with SQL Server.

    All that being said, I'll say again that I bear no malice towards Oracle or any other database engine.  It may be that the shift to Oracle is just what your particular application needs (although I have some good doubt there).  As I said before, "It Depends".  Just keep in mind that SQL Server has been very successfully used for many HUGE applications so it's really not an engine capability problem.

    I sincerely wish you the best of luck with the shift to Oracle.  I just hope you good folks aren't spending a shedload of time, money, and effort (don't forget retraining so you actually can use Oracle) for naught.  As a bit of a sidebar, you won't miss feature deprecation and discontinuation. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Eric M Russell - Wednesday, October 4, 2017 7:55 AM

    To effectively scale to TB-sized tables, an enterprise-class RDBMS like SQL Server or Oracle must be paired with an enterprise-class DBA.

    Certainly you aren't implying that doing a select * on a table with billions of rows in Oracle would have issues?

  • ZZartin - Wednesday, October 4, 2017 8:23 AM

    Eric M Russell - Wednesday, October 4, 2017 7:55 AM

    To effectively scale to TB-sized tables, an enterprise-class RDBMS like SQL Server or Oracle must be paired with an enterprise-class DBA.

    Certainly you aren't implying that doing a select * on a table with billions of rows in Oracle would have issues?

    Yes, even Oracle 12c with 256 GB RAM, 24 cores, and running on a ZFS Storage Appliance will not yield acceptable SELECT * performance. No amount of hardware investment can compensate for poor SQL coding in a TB scale database environment.

    The server is the race car, the data is the track, and the DBA is the driver behind the wheel. Expertise of the DBA is essential.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Wednesday, October 4, 2017 9:07 AM

    ZZartin - Wednesday, October 4, 2017 8:23 AM

    Eric M Russell - Wednesday, October 4, 2017 7:55 AM

    To effectively scale to TB-sized tables, an enterprise-class RDBMS like SQL Server or Oracle must be paired with an enterprise-class DBA.

    Certainly you aren't implying that doing a select * on a table with billions of rows in Oracle would have issues?

    Yes, even Oracle 12c with 256 GB RAM, 24 cores, and running on a ZFS Storage Appliance will not yield acceptable SELECT * performance. No amount of hardware investment can compensate for poor SQL coding in a TB scale database environment.

    The server is the race car, the data is the track, and the DBA is the driver behind the wheel. Expertise of the DBA is essential.

    I absolutely agree and you already know this, so I'm saying it for other folks that may read this.

    The DBA can't do it all.  Expertise on the part of the Developers is a very strong key issue, as well.  Further, support from Management for the DBA is absolutely critical.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Wednesday, October 4, 2017 11:24 AM

    Eric M Russell - Wednesday, October 4, 2017 9:07 AM

    ZZartin - Wednesday, October 4, 2017 8:23 AM

    Eric M Russell - Wednesday, October 4, 2017 7:55 AM

    To effectively scale to TB-sized tables, an enterprise-class RDBMS like SQL Server or Oracle must be paired with an enterprise-class DBA.

    Certainly you aren't implying that doing a select * on a table with billions of rows in Oracle would have issues?

    Yes, even Oracle 12c with 256 GB RAM, 24 cores, and running on a ZFS Storage Appliance will not yield acceptable SELECT * performance. No amount of hardware investment can compensate for poor SQL coding in a TB scale database environment.

    The server is the race car, the data is the track, and the DBA is the driver behind the wheel. Expertise of the DBA is essential.

    I absolutely agree and you already know this, so I'm saying it for other folks that may read this.

    The DBA can't do it all.  Expertise on the part of the Developers is a very strong key issue, as well.  Further, support from Management for the DBA is absolutely critical.

    Dear Jeff Moden,

    Thanks for all your valuable thoughts and replies. I work as a SQL Server DBA on my current project and have worked with SQL Server since SQL Server 2000; I have been a fan of SQL Server since my college days. A few people in this thread have suggested that I don't have much experience with SQL Server; thanks for the feedback, but it isn't true. I have only about six months of experience with Oracle, during which I had the opportunity to work with a senior Oracle DBA. Day by day Oracle's features keep increasing and providing a lot of flexibility, which is why my entire team keeps pushing back against me to migrate to Oracle.

    Today the Oracle DBAs came up with In-Database Archiving, i.e. the row-level archival introduced in 12c. With that feature they make rows invisible in the table so that SELECT queries work only on the non-archived rows, and the optimizer behaves accordingly. In SQL Server something similar might be approximated with Row-Level Security, but archival is an entirely different thing from row-level security. The Oracle DBAs asked me what the equivalent feature in SQL Server is. We all know archival as the process of moving old data to other storage, but Oracle handles both approaches, traditional and modern (move the data to other storage, or keep it in the same table marked inactive). We can achieve something similar by introducing a bit column and creating a view that returns only the active records, with inserts and updates going through the view, as a workaround. But if a workaround were enough, why did the Oracle development team put effort into introducing this feature? We all (sorry for that) need to think about it.

    Coming to the pricing part: Microsoft's earlier licensing strategy was entirely different; after gaining market share, Microsoft revised its licensing to be core-based. Licensing forums explain clearly that Oracle and Microsoft licensing costs differ by only a few dollars. Meanwhile Microsoft concentrates on releasing a new version every year with ever more costly features. Memory-optimized tables (with durable and non-durable options, introduced in SQL Server 2014 and enhanced in 2016) are one example. What is the use of them? A memory-optimized table is a feature that grabs memory; our VM servers are already provisioned with 256 GB of RAM, and if we implement memory-optimized tables our memory will be gone. Again, we have lots of workarounds. I am a person who has always supported Microsoft SQL Server.

    If anybody knows solutions or workarounds for my problems, kindly share your suggestions.

    1. Composite partitioning (list and range): partition by state, then create sub-partitions by date. How can this be implemented in SQL Server?
    2. Row-level archival (hide inactive rows in the table, similar to Oracle), so that I can prepare a deck and convince the team to stop migrating from SQL Server 2016 to Oracle.
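
    For reference, the bit-column-plus-view workaround described above might look roughly like this (hypothetical object names, not our actual schema):

    ```sql
    -- Soft-archive flag on the base table.
    ALTER TABLE dbo.Encounter ADD IsArchived bit NOT NULL
        CONSTRAINT DF_Encounter_IsArchived DEFAULT (0);
    GO
    -- Queries and DML go through the view, which exposes only active rows.
    CREATE VIEW dbo.ActiveEncounter
    AS
    SELECT EncounterID, StateCode, EncounterDate, Amount
    FROM   dbo.Encounter
    WHERE  IsArchived = 0;
    GO
    -- "Archiving" a row is then just flipping the flag.
    UPDATE dbo.Encounter SET IsArchived = 1 WHERE EncounterDate < '20140101';
    ```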

  • kannan_egd - Thursday, October 5, 2017 3:27 AM

    ...

    ...

    If anybody knows solutions or workarounds for my problems, kindly share your suggestions.

    1. Composite partitioning (list and range): partition by state, then create sub-partitions by date. How can this be implemented in SQL Server?
    2. Row-level archival (hide inactive rows in the table, similar to Oracle), so that I can prepare a deck and convince the team to stop migrating from SQL Server 2016 to Oracle.

    Getting Started with System-Versioned Temporal Tables
    https://docs.microsoft.com/en-us/sql/relational-databases/tables/getting-started-with-system-versioned-temporal-tables
    Stretch Database migrates your cold data transparently and securely to the Microsoft Azure cloud.
    https://docs.microsoft.com/en-us/sql/sql-server/stretch-database/stretch-database

    How to Implement an Automatic Sliding Window in a Partitioned Table on SQL Server 2005
    https://technet.microsoft.com/en-us/library/aa964122(v=sql.90).aspx

    You may use a persisted computed column for the partition function.
    https://docs.microsoft.com/en-us/sql/relational-databases/tables/specify-computed-columns-in-a-table
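
    Of those, the first option (system-versioned temporal tables) maps most directly onto the row-level archival question; a minimal sketch per the linked article, with made-up names:

    ```sql
    CREATE TABLE dbo.Member
    (
        MemberID  int         NOT NULL PRIMARY KEY CLUSTERED,
        StateCode char(2)     NOT NULL,
        PlanName  varchar(50) NOT NULL,
        ValidFrom datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
        ValidTo   datetime2 GENERATED ALWAYS AS ROW END   NOT NULL,
        PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
    )
    WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.MemberHistory));

    -- Current rows only:
    SELECT * FROM dbo.Member;

    -- Historical ("archived") versions as of a point in time:
    SELECT * FROM dbo.Member FOR SYSTEM_TIME AS OF '2017-01-01';
    ```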

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • kannan_egd - Thursday, October 5, 2017 3:27 AM

    Jeff Moden - Wednesday, October 4, 2017 11:24 AM

    Eric M Russell - Wednesday, October 4, 2017 9:07 AM

    ZZartin - Wednesday, October 4, 2017 8:23 AM

    Eric M Russell - Wednesday, October 4, 2017 7:55 AM

    To effectively scale to TB-sized tables, an enterprise-class RDBMS like SQL Server or Oracle must be paired with an enterprise-class DBA.

    Certainly you aren't implying that doing a select * on a table with billions of rows in Oracle would have issues?

    Yes, even Oracle 12c with 256 GB RAM, 24 cores, and running on a ZFS Storage Appliance will not yield acceptable SELECT * performance. No amount of hardware investment can compensate for poor SQL coding in a TB scale database environment.

    The server is the race car, the data is the track, and the DBA is the driver behind the wheel. Expertise of the DBA is essential.

    I absolutely agree and you already know this, so I'm saying it for other folks that may read this.

    The DBA can't do it all.  Expertise on the part of the Developers is a very strong key issue, as well.  Further, support from Management for the DBA is absolutely critical.

    Dear Jeff Moden,

    Thanks for all your valuable thoughts and replies. I work as a SQL Server DBA on my current project and have worked with SQL Server since SQL Server 2000; I have been a fan of SQL Server since my college days. A few people in this thread have suggested that I don't have much experience with SQL Server; thanks for the feedback, but it isn't true. I have only about six months of experience with Oracle, during which I had the opportunity to work with a senior Oracle DBA. Day by day Oracle's features keep increasing and providing a lot of flexibility, which is why my entire team keeps pushing back against me to migrate to Oracle.

    Today the Oracle DBAs came up with In-Database Archiving, i.e. the row-level archival introduced in 12c. With that feature they make rows invisible in the table so that SELECT queries work only on the non-archived rows, and the optimizer behaves accordingly. In SQL Server something similar might be approximated with Row-Level Security, but archival is an entirely different thing from row-level security. The Oracle DBAs asked me what the equivalent feature in SQL Server is. We all know archival as the process of moving old data to other storage, but Oracle handles both approaches, traditional and modern (move the data to other storage, or keep it in the same table marked inactive). We can achieve something similar by introducing a bit column and creating a view that returns only the active records, with inserts and updates going through the view, as a workaround. But if a workaround were enough, why did the Oracle development team put effort into introducing this feature? We all (sorry for that) need to think about it.

    Coming to the pricing part: Microsoft's earlier licensing strategy was entirely different; after gaining market share, Microsoft revised its licensing to be core-based. Licensing forums explain clearly that Oracle and Microsoft licensing costs differ by only a few dollars. Meanwhile Microsoft concentrates on releasing a new version every year with ever more costly features. Memory-optimized tables (with durable and non-durable options, introduced in SQL Server 2014 and enhanced in 2016) are one example. What is the use of them? A memory-optimized table is a feature that grabs memory; our VM servers are already provisioned with 256 GB of RAM, and if we implement memory-optimized tables our memory will be gone. Again, we have lots of workarounds. I am a person who has always supported Microsoft SQL Server.

    If anybody knows solutions or workarounds for my problems, kindly share your suggestions.

    1. Composite partitioning (list and range): partition by state, then create sub-partitions by date. How can this be implemented in SQL Server?
    2. Row-level archival (hide inactive rows in the table, similar to Oracle), so that I can prepare a deck and convince the team to stop migrating from SQL Server 2016 to Oracle.

    You have hit one of the points that irritates the hell out of me, and that's the release pattern that Microsoft uses and abuses, including the deprecation and discontinuation of useful tools in favor of "improved" tools that sometimes don't appear to be improvements.

    As for the two problems you listed, which seem to be the main reasons why the migration is happening, we'd need more information about the multiple facets of those problems.  For example, we don't know why there is a perceived need for partitioning in the manner you listed, what the proposed structure and relationships of the tables involved are, nor even how many rows there are.  I also don't know as much about Oracle as the folks you're working with and I certainly don't know the new features that came out in 12c that you speak of.

    What I can do is put on my practical hat and ask some questions that I believe they'll find tough to answer, maybe even for the onsite Oracle DBA.

    1.  What is the purpose of doing the partitioning?  If it's for ease of maintenance, that's the correct answer.  If it's to "improve performance", then have they done a proof of principle to prove that it actually will improve performance?  If not (and it sounds like you already have Oracle in house, so this should be easy for them), will they?  It's real nice that people make claims of performance based on supposed "expert experience", but will it actually work out that way in this particular case?  As a much smarter person alluded to, "One proper experiment is worth a thousand expert opinions".

    2.  What are the ramifications of partitioning in Oracle?  For example, in SQL Server, all unique indexes will automatically have the partitioning column added to them unless you override that as a non-aligned index, which also disables some of the features of partitioning.  On the other hand, and depending on the column(s) of the unique index and the partitioning column, having the partitioning column added to the unique index could (and frequently does) make any foreign keys that point to the partitioned table totally impossible.  If the ability to switch partitions in and out is important, then all indexes must be "aligned" with the partitioning, which means the partitioning column is also included, which makes the indexes wider, which means more pages must be read to use the index, which means less performance and more memory used.  Shifting gears a bit, I actually prefer that ancient black art of using partitioned views because I don't eventually need to restore all partitions to make a copy of the database for the Development boxes (for example) and it works just as fast as partitioned tables (and sometimes faster because I can index "closed" partitions differently than the active partition).  A rough sketch of a partitioned view follows after these numbered points.

    3.  As for the "in-table" archive that you speak of, does that actually improve performance?  Can Oracle assign those archived rows to a different lower cost disk?  What of backups?  Does the in-table method provide any relief at all insofar as the size of backups and the speed of restores?  Again, I don't know the Oracle in-table feature to be able to say one way or the other but I'd bet it does absolutely nothing for backups and may do very little for performance.
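
    Here's the rough shape of the partitioned-view idea mentioned in point 2 (all object names invented for the sketch):

    ```sql
    CREATE TABLE dbo.Claim_2016
    (
        ClaimID   bigint NOT NULL,
        ClaimDate date   NOT NULL
            CONSTRAINT CK_Claim_2016 CHECK (ClaimDate >= '20160101' AND ClaimDate < '20170101'),
        Amount    decimal(12,2) NOT NULL,
        CONSTRAINT PK_Claim_2016 PRIMARY KEY (ClaimID, ClaimDate)
    );

    CREATE TABLE dbo.Claim_2017
    (
        ClaimID   bigint NOT NULL,
        ClaimDate date   NOT NULL
            CONSTRAINT CK_Claim_2017 CHECK (ClaimDate >= '20170101' AND ClaimDate < '20180101'),
        Amount    decimal(12,2) NOT NULL,
        CONSTRAINT PK_Claim_2017 PRIMARY KEY (ClaimID, ClaimDate)
    );
    GO
    -- The CHECK constraints let the optimizer skip member tables a predicate
    -- on ClaimDate can't touch, much like partition elimination.
    CREATE VIEW dbo.ClaimAll
    AS
    SELECT ClaimID, ClaimDate, Amount FROM dbo.Claim_2016
    UNION ALL
    SELECT ClaimID, ClaimDate, Amount FROM dbo.Claim_2017;
    ```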

    I'm also of the opinion that no one has done due diligence on the SQL Server side in the form of research and proof of principle testing.  It didn't take Eric M Russell (see post above) very long to come up with a list of features and methods in SQL Server to do the same things that people think only Oracle can do (the multi-column partitioning could be done using the methods in the last link he posted).  Some of the features in the links he posted actually remove the archived rows from the table(s), which has huge advantages (backups, disk space, etc).

    Again, "One proper experiment is worth a thousand expert opinions" and, so far, I've not seen any indications from you that the murder of crows that's insisting on the migration has actually done any due diligence with any kind of testing never mind a full sized proof of principle test.  Wouldn't it be a hoot if all this effort and expense resulted in nothing but a paradigm shift with different colored buttons to push?

    And apologies for my comment on "not knowing enough about SQL Server".  I didn't mean that as a personal slam.  It was my very poor method of citing that not enough research had been done and the "you" was actually the collection of people where you work that are proposing/doing the migration.  They need to be very careful that they're not jumping from hot fat into the fire.

    And again... we don't know enough about the problems that the proposed partitioning and in-table row level archiving is supposed to solve to provide a precise answer on how to solve the problems with SQL Server.  If you could provide more information, we may be able to come up with something more concrete than opinion based on perceived experience.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Thursday, October 5, 2017 8:31 AM

    kannan_egd - Thursday, October 5, 2017 3:27 AM

    Jeff Moden - Wednesday, October 4, 2017 11:24 AM

    Eric M Russell - Wednesday, October 4, 2017 9:07 AM

    ZZartin - Wednesday, October 4, 2017 8:23 AM

    Eric M Russell - Wednesday, October 4, 2017 7:55 AM

    To effectively scale to TB-sized tables, an enterprise-class RDBMS like SQL Server or Oracle must be paired with an enterprise-class DBA.

    Certainly you aren't implying that doing a select * on a table with billions of rows in Oracle would have issues?

    Yes, even Oracle 12c with 256 GB RAM, 24 cores, and running on a ZFS Storage Appliance will not yield acceptable SELECT * performance. No amount of hardware investment can compensate for poor SQL coding in a TB scale database environment.

    The server is the race car, the data is the track, and the DBA is the driver behind the wheel. Expertise of the DBA is essential.

    I absolutely agree and you already know this, so I'm saying it for other folks that may read this.

    The DBA can't do it all.  Expertise on the part of the Developers is a very strong key issue, as well.  Further, support from Management for the DBA is absolutely critical.

    Dear Jeff Moden,

    Thanks for all your valuable thoughts and replies. I work as a SQL Server DBA on my current project and have worked with SQL Server since SQL Server 2000; I have been a fan of SQL Server since my college days. A few people in this thread have suggested that I don't have much experience with SQL Server; thanks for the feedback, but it isn't true. I have only about six months of experience with Oracle, during which I had the opportunity to work with a senior Oracle DBA. Day by day Oracle's features keep increasing and providing a lot of flexibility, which is why my entire team keeps pushing back against me to migrate to Oracle.

    Today the Oracle DBAs came up with In-Database Archiving, i.e. the row-level archival introduced in 12c. With that feature they make rows invisible in the table so that SELECT queries work only on the non-archived rows, and the optimizer behaves accordingly. In SQL Server something similar might be approximated with Row-Level Security, but archival is an entirely different thing from row-level security. The Oracle DBAs asked me what the equivalent feature in SQL Server is. We all know archival as the process of moving old data to other storage, but Oracle handles both approaches, traditional and modern (move the data to other storage, or keep it in the same table marked inactive). We can achieve something similar by introducing a bit column and creating a view that returns only the active records, with inserts and updates going through the view, as a workaround. But if a workaround were enough, why did the Oracle development team put effort into introducing this feature? We all (sorry for that) need to think about it.

    Coming to the pricing part: Microsoft's earlier licensing strategy was entirely different; after gaining market share, Microsoft revised its licensing to be core-based. Licensing forums explain clearly that Oracle and Microsoft licensing costs differ by only a few dollars. Meanwhile Microsoft concentrates on releasing a new version every year with ever more costly features. Memory-optimized tables (with durable and non-durable options, introduced in SQL Server 2014 and enhanced in 2016) are one example. What is the use of them? A memory-optimized table is a feature that grabs memory; our VM servers are already provisioned with 256 GB of RAM, and if we implement memory-optimized tables our memory will be gone. Again, we have lots of workarounds. I am a person who has always supported Microsoft SQL Server.

    If anybody knows solutions or workarounds for my problems, kindly share your suggestions.

    1. Composite partitioning (list and range): partition by state, then create sub-partitions by date. How can this be implemented in SQL Server?
    2. Row-level archival (hide inactive rows in the table, similar to Oracle), so that I can prepare a deck and convince the team to stop migrating from SQL Server 2016 to Oracle.

    You have hit one of the points that irritates the hell out of me, and that's the release pattern that Microsoft uses and abuses, including the deprecation and discontinuation of useful tools in favor of "improved" tools that sometimes don't appear to be improvements.

    As for the two problems you listed, which seem to be the main reasons why the migration is happening, we'd need more information about the multiple facets of those problems.  For example, we don't know why there is a perceived need for partitioning in the manner you listed, what the proposed structure and relationships of the tables involved are, nor even how many rows there are.  I also don't know as much about Oracle as the folks you're working with and I certainly don't know the new features that came out in 12c that you speak of.

    What I can do is put on my practical hat and ask some questions that I believe they'll find tough to answer, maybe even for the onsite Oracle DBA.

    1.  What is the purpose of doing the partitioning?  If it's for ease of maintenance, that's the correct answer.  If it's to "improve performance", then have they done a proof of principle to prove that it actually will improve performance?  If not (and it sounds like you already have Oracle in house, so this should be easy for them), will they?  It's real nice that people make claims of performance based on supposed "expert experience", but will it actually work out that way in this particular case?  As a much smarter person alluded to, "One proper experiment is worth a thousand expert opinions".

    2.  What are the ramifications of partitioning in Oracle?  For example, in SQL Server, all unique indexes will automatically have the partitioning column added to them unless you override that as a non-aligned index, which also disables some of the features of partitioning.  On the other hand, and depending on the column(s) of the unique index and the partitioning column, having the partitioning column added to the unique index could (and frequently does) make any foreign keys that point to the partitioned table totally impossible.  If the ability to switch partitions in and out is important, then all indexes must be "aligned" with the partitioning, which means the partitioning column is also included, which makes the indexes wider, which means more pages must be read to use the index, which means less performance and more memory used.  Shifting gears a bit, I actually prefer that ancient black art of using partitioned views because I don't eventually need to restore all partitions to make a copy of the database for the Development boxes (for example) and it works just as fast as partitioned tables (and sometimes faster because I can index "closed" partitions differently than the active partition).

    3.  As for the "in-table" archive that you speak of, does that actually improve performance?  Can Oracle assign those archived rows to a different lower cost disk?  What of backups?  Does the in-table method provide any relief at all insofar as the size of backups and the speed of restores?  Again, I don't know the Oracle in-table feature to be able to say one way or the other but I'd bet it does absolutely nothing for backups and may do very little for performance.

    I'm also of the opinion that no one has done due diligence on the SQL Server side in the form of research and proof of principle testing.  It didn't take Eric M Russell (see post above) very long to come up with a list of features and methods in SQL Server to do the same things that people think only Oracle can do (the multi-column partitioning could be done using the methods in the last link he posted).  Some of the features in the links he posted actually remove the archived rows from the table(s), which has huge advantages (backups, disk space, etc).

    Again, "One proper experiment is worth a thousand expert opinions" and, so far, I've not seen any indications from you that the murder of crows that's insisting on the migration has actually done any due diligence with any kind of testing never mind a full sized proof of principle test.  Wouldn't it be a hoot if all this effort and expense resulted in nothing but a paradigm shift with different colored buttons to push?

    And apologies for my comment on "not knowing enough about SQL Server".  I didn't mean that as a personal slam.  It was my very poor method of citing that not enough research had been done and the "you" was actually the collection of people where you work that are proposing/doing the migration.  They need to be very careful that they're not jumping from hot fat into the fire.

    And again... we don't know enough about the problems that the proposed partitioning and in-table row level archiving is supposed to solve to provide a precise answer on how to solve the problems with SQL Server.  If you could provide more information, we may be able to come up with something more concrete than opinion based on perceived experience.

    Dear Jeff Moden,

    Thanks for the response. Let me explain my problem statement and the need behind my request.
    1. Our product table sizes: the stage table holds 200 million to 600 million records (4 years of data, four quarters per year, split into (4*4) = 16 + 2 = 18 partitions, with the extra 2 holding dates below and above the defined partition ranges).
    Why partitions?
    1. To achieve parallelism. You may well ask how partitioning helps with this.
    Answer: For four years of data we receive files from customers in one of two forms: 18 files or 4 files. These files are loaded by the ETL in parallel. We created 18 DFTs in a container; each one reads its corresponding file and pushes it into the corresponding partition of the stage table, with the partitions residing on 18 different drives. The ETL runs these as 18 concurrent threads which, as you know, execute according to processor core availability; ours is a VM with 32 processors in total, 16 physical cores and 16 logical, across two sockets each having 8 physical cores. Please set aside the hardware, processor speed, drive IOPS, RPM, etc.
    2. Our enhancement plan: the product currently deals with healthcare records for a single state; now we are expanding it to 26 states.

    Existing design, and why a new design?
    1. For 26 states we run 26 individual instances, due to the volume of records.
    2. As you are aware, 26 individual physical machines or VMs require more maintenance, in terms of:
    1. Patching has to be done on all 26 whenever anything goes wrong. Please don't suggest auto-patching configuration tools that push updates based on configuration; for patches we follow many internal procedures, and any additional tool requires licensing costs, firewall port exceptions, etc.
    To avoid patching and maintaining 26 individual VMs, we planned to go for a single high-configuration server, a recent market release with up to 2 TB of RAM and high-throughput SSD disks. We have further plans to use this server for multiple purposes, but that is up to management decisions.
    We are consolidating the 26 states into 5 databases residing in one instance, so patching happens on one box instead of 26.
    That is roughly 600 million records * 5 states per database; here the partition strategy is list first and then range, and we generate a composite key (a persisted computed column) to achieve this in SQL Server.
    2. Why not introduce individual objects (tables) for each state?
    Answer: If we did that, our product would not scale. In the future the state count may reach 100 (a ballpark value), so 100 objects will not help us; we would need to change all our code whenever new objects were introduced.
    3. One state's partition can be running data quality checks, another state's partition a cleansing job, and another partition the stage-to-engine data load process.
    Why not indexes?
    We tried indexes; they do help improve performance, I completely agree.
    Cons of creating indexes:
    1. Building the indexes is time-consuming. Our table holds around 135 columns, and it is not possible to cover all the columns with indexes.
    2. They reduce bulk insert performance.
    3. We already introduced columnstore, covering, filtered, aligned, and non-aligned indexes to suit our needs and did a lot of POCs on the pros and cons. Less benefit, more difficulty. SQL Server 2012 has a bug affecting indexes on partitioned tables with a descending sort order, giving incorrect cardinality; please refer to Paul White's blog, whose URL I already mentioned.
    4. Columnstore indexes were read-only in 2012 (and only became updatable in 2014; Microsoft's version release strategy again).

    Why is in-database archival similar to Oracle 12c required? Does it really reduce storage or backup size?

    While executing stored procedures for calculations we delete a few records based on conditions; as you are aware, record deletions and updates write to the log even when the database is in the simple recovery model. Even with the help of indexes this is difficult. The Brent Ozar forum explains that deletes and updates will be faster using an index; we tried that too, and it helps to some extent, but not in all cases.

    But Oracle's row archival makes the change only at the metadata level to hide the rows that we don't need for calculation or SELECT query purposes.

    Oracle clearly understands the pain points of DBAs as well as developers and plans its feature enhancements accordingly. My intention here is not to showcase Oracle as the best fit. If SQL Server incorporated even some of the features available in Oracle, it would help a lot of developers reduce their coding work, and would help DBAs and enterprise companies too.

    Why multi-column partitioning?

    Partitioning may be planned on more than one column, in which case multi-column partitioning is useful. Again, introducing an additional persisted computed column to achieve this is possible, but consider that an ALTER is then required on the production table and additional storage is needed for the column. If it could all be done in the partition function, everything would be simple and only the DBA would need to be involved; a computed column requires code changes, DBA involvement, developer involvement, testing, etc.

    Conclusion: my thoughts are that technology and feature enhancements should make things simpler and help the system scale. Releasing a new version every year makes developers and DBAs think the product is not stable, and that each release is only about making money rather than supporting the development community. If Microsoft took more time to analyse competitors' features, compare them with its own product, and bring in the good things, that would definitely bring success and more market share.

    Regards.
    Kannan.C
    (Always willing to work only as a Microsoft SQL Server DBA)

    First of all, you say that the partitioning was so that you could achieve parallelism in your ETL loads to the staging table.  You also state that the partitioning was done across 18 drives.  Here's a major catch with that... unless you can guarantee that your 18 drives are actually 18 different physical drives instead of just 18 logical drives pointing to the same 1 or 2 or even 4 (for example) physical drives, your attempts at parallelism might actually be slowing the loads down because of read head movement on the physical drive.  True parallel loading can only happen if each partition is guaranteed to live on its own physical drive or its own set of physical drives. If the drives are merely logical drives, then you're likely still only doing a "time share load in a serial fashion".  You'll have the same problem no matter which RDBMS you use.

    And, surprise, surprise and depending on how the logical drives are carved out, this can also affect SSDs, although to a lesser extent because there is no physical R/W head movement.  There's still a possibility that you might end up time-sharing a pipe, which is a serial rather than parallel operation.

    You mention that the staging table can have 200-600 million rows in it.  When does it have that many rows in it?  What is the life cycle of a row in the staging table?  For example, is the staging table empty when you first start to load either the 18 files or the 4 files that you may receive?  Since it is a staging table, where and how does the data go from there? Or, are you saying that the staging table isn't actually a "staging" table and that it's really the final target table that you're loading into?  If it's the final target table, is the load an "UPSERT" (update matching rows, insert new rows) or is it an "always insert" methodology?

    You also haven't stated how many rows and GB each of the 18 or each of the 4 files you load contain.  And what are you using to load the files into SQL Server?  BCP?  BULK INSERT?  Something else and if so, what is it?  What is the format of the files you load?  True CSV with text qualifiers,  Excel-like CSV where text qualifiers only appear on the "cell" that may carry a delimiter? Tab separated?  Fixed Field?  (gasp) XML/JSON?  (double gasp) EDI?

    Heh... and you mention that you've done POP testing?  On what?  Just SQL Server?  Have you/they done the same testing on Oracle to see if there's actually been an improvement or are folks just speculating that there will be based on some anecdotal evidence?

    Also, do you good folks understand and use the concept of actual "Minimal Logging" and have verified that your loads are actually using it?  That, all by itself, will make the loads more than twice as fast.  Too many indexes for that?  Have you read the following article?  The article is a bit dated but everything in it is still applicable.

    https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx
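
    As a rough example of the kind of load that can qualify (assuming an empty target, the SIMPLE or BULK_LOGGED recovery model, and the other conditions in that guide; the path and table name are placeholders):

    ```sql
    BULK INSERT dbo.Claim_Stage
    FROM 'D:\loads\state01_2017Q3.csv'
    WITH
    (
        TABLOCK,               -- a table lock is one prerequisite for minimal logging
        FIELDTERMINATOR = ',',
        ROWTERMINATOR   = '\n',
        FIRSTROW        = 2    -- skip a header row, if the file has one
    );
    ```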

    As for the Paul White article... absolutely brilliant work on his part, as normal.  With nothing aimed at Paul or his good work there, do you have even one index on your partitioned table where the first key is sorted in descending order?  Is it likely that you ever would?  While I agree that it IS broken code on Microsoft's part and that they did do a pretty miserable job of patching the error instead of fixing the error, especially since it requires the use of THAT particular trace flag, I agree with Paul that having a descending leading key in an index on a partitioned table would be an extremely rare thing to happen and, unless such a thing is critical to the success of your project, I wouldn't give it another thought except to say, did you test Oracle for the same problem?

    Oracle's in-table, meta-data only, row-level archival method sounds really cool except (IMHO, because I don't know anything of Oracle's table structure) that it solves a problem that a trivial amount of code would solve and it still leaves the data in the table.  It really hasn't solved anything if the data is still in place.  I don't know... maybe Oracle's engine is smart enough to not load such rows into memory when adjacent data is read.  But, it's still on disk and the OS still has to read past it.  If not on physical disk, then it still takes up space on expensive SSDs and I'd think that it would still go through the throes of backups and restores if the archiving is reversible.

    As for the table holding 135 columns, I can certainly understand that for a staging table.  I can't understand that for permanent data.  Assuming that Oracle isn't a column-centric database (I'm pretty sure that it is not) and even if there is no possibility of further normalizing the data in the table, even Oracle would benefit from some thoughtful VERTICAL partitioning through the use of "sister" tables where the high usage columns are kept in a nice narrow table and the infrequently used data is in other tables.  A nice updateable view with an Instead Of trigger in SQL Server or a Before trigger in Oracle on a "ghost" table would certainly make life easy and fast.  As for indexing every column, of course you wouldn't.  You would only index for the lookups you have and possibly not all of them.  As you point out, that may be one area where partition elimination may come in handy but only if the code is written in such a fashion as to take advantage of that.  And guess what?  Not all of your queries are going to be able to take advantage of partition elimination.  It may even be that none of the code will.
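
    Roughly what that "sister table" arrangement might look like in SQL Server (names invented; a sketch only, not a drop-in design):

    ```sql
    -- Hot, narrow table with the high-usage columns.
    CREATE TABLE dbo.ClaimCore
    (
        ClaimID   bigint        NOT NULL PRIMARY KEY,
        StateCode char(2)       NOT NULL,
        ClaimDate date          NOT NULL,
        Amount    decimal(12,2) NOT NULL
    );

    -- Cold, wide table with the rarely used columns.
    CREATE TABLE dbo.ClaimExtended
    (
        ClaimID bigint       NOT NULL PRIMARY KEY
            REFERENCES dbo.ClaimCore (ClaimID),
        Notes   varchar(max) NULL,
        RawFeed varchar(max) NULL
    );
    GO
    -- A view that reassembles the original wide shape.
    CREATE VIEW dbo.ClaimWide
    AS
    SELECT c.ClaimID, c.StateCode, c.ClaimDate, c.Amount, e.Notes, e.RawFeed
    FROM dbo.ClaimCore AS c
    LEFT JOIN dbo.ClaimExtended AS e ON e.ClaimID = c.ClaimID;
    GO
    -- INSTEAD OF trigger so inserts through the view land in both tables.
    CREATE TRIGGER dbo.trg_ClaimWide_Insert ON dbo.ClaimWide
    INSTEAD OF INSERT
    AS
    BEGIN
        INSERT dbo.ClaimCore (ClaimID, StateCode, ClaimDate, Amount)
        SELECT ClaimID, StateCode, ClaimDate, Amount FROM inserted;

        INSERT dbo.ClaimExtended (ClaimID, Notes, RawFeed)
        SELECT ClaimID, Notes, RawFeed FROM inserted
        WHERE Notes IS NOT NULL OR RawFeed IS NOT NULL;
    END;
    ```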

    As for the partitioning, you're doing temporal partitioning based on dates to isolate quarters.  So, let me ask... how far back can a row go before you can say it will NEVER be modified ever again?  I'm thinking that, compared to 4 years back, it won't be very long.  Have you considered the performance and maintenance advantages of making old partitions READ_ONLY?  If not, then you're actually missing out on the biggest reasons to partition to begin with. 
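
    For example (the database and filegroup names are placeholders), once a quarter's partition will never change again:

    ```sql
    -- Mark the filegroup that holds a closed quarter as read-only
    -- (see the ALTER DATABASE docs for the access requirements while doing this).
    ALTER DATABASE HealthcareDW
        MODIFY FILEGROUP FG_2016Q1 READ_ONLY;

    -- After that, partial backups can skip the read-only filegroups entirely:
    BACKUP DATABASE HealthcareDW
        READ_WRITE_FILEGROUPS
        TO DISK = 'D:\Backup\HealthcareDW_partial.bak';
    ```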

    As for all the bloody damned updates that MS does, I agree that it's a real pain in the patooti but it's better than leaving something broken for a long time and it's better than waiting to be made to suffer through a successful attack.  I can't speak to Oracle's update strategy because I don't know what it actually is.  It's nice to think that Oracle never requires updates for bugs or security but I doubt that's true.

    Last but not least, if you're having problems with front-end performance and that's part of the reason why you want to move to Oracle's multi-column and sub column partitioning methods, have you actually looked at what is causing the performance problems?  I can assure you that even on a large database, it's usually not the fault of the database or the objects that are in it even if you have a bit of a bad design.  Most of the time, it's crap code that's coming from an ORM or code designed by someone that thinks a database is just a place to store data.

    We just went through a monster problem where we reached some sort of tipping point and the system would "lock" for as long as 30 minutes and then occasionally do it again 5 minutes later.  Lots of people were blaming the database.  Hell... lots of people were blaming me because I've not done ANY index maintenance since 17 January 2016.  I stood my ground and went searching for the problem.  It turned out to be two crap problems with one piece of crap ORM code.  First, the ORM code that was generated made it impossible to use an index because the WHERE clause had more than 40 columns in it.  Worse yet, the items it was looking for in the WHERE clause were in the form of literal strings, and at least one byte of the code changed every time it was executed.  That also meant (and I proved it) that the code had to recompile virtually every time it was used.  Because of the silly WHERE clause, it took anywhere from 2 to 22 seconds to recompile each and every time it was used, and it was used tens of thousands of times each hour.  As if that's not bad enough and contrary to documentation on the subject, we then found out that all of the connection strings created by Entity Framework defaulted to SET MULTIPLE ACTIVE RESULT SETS being on.  For good code, that's great.  For bad code, and to make a much more complicated explanation shorter, the batch doesn't finish before the transaction does (and that bit of information IS in the documentation) and it causes the transaction to roll back.  It's a freakin' READ transaction and it rolls back.  One would roll back because of the crazy long compile time, which would cause the system to slow a bit and cause another to roll back, and I ended up with hundreds of sessions all waiting their turn to roll back, and it consumed all but 1 or 2 of 32 CPUs... slammed all but those 1 or 2 at 80 to 100%.
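
    As a side note, the client-side fix is just MultipleActiveResultSets=False in the connection string, and you can spot MARS in play from the server side because the extra MARS logical connections show up in sys.dm_exec_connections with net_transport = 'Session' (worth verifying against the DMV documentation for your build):

    ```sql
    -- Connections whose net_transport is 'Session' are MARS logical connections
    -- multiplexed over a parent physical connection.
    SELECT c.session_id, c.net_transport, c.connect_time, s.program_name
    FROM sys.dm_exec_connections AS c
    JOIN sys.dm_exec_sessions    AS s ON s.session_id = c.session_id
    WHERE c.net_transport = 'Session';
    ```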

    So, with all due respect to the things you've tried, 1) I don't believe that you've tried everything and 2) you haven't said that you've actually done scalable POP testing in Oracle.  I could certainly be incorrect but it may be that you'll see little if any improvement in performance once you spend the time, money, and effort to do the migration.

    My bottom line recommendation is that until you've done such scalability testing on similarly massive amounts of information, you have no justification for even considering the migration.  "One good test is worth a thousand expert opinions".

    Now, if you have done such testing on Oracle and it turned out to be a miracle of performance, then why are we still posting about problems to each other? 🙂  You should be bragging about the wicked performance improvements you got instead. But they haven't done such testing yet, have they?

    p.s.  When we turned off MARS on all connections to the database, average CPU across all 32 CPUs dropped from 40% to 22% on the busy hours.  When we fixed the root problem by writing a properly written and much better behaved stored procedure, the average CPU across all 32 CPUs dropped from 22% to a ripple between 6 and 8%.  During the "firefights" to try to fix the problem, they had added memory to take us from 250GB to 384GB and also took us up to 48 CPUs.  It actually made things a bit worse because more sessions were able to get stuck and start rolling back.  My post-fix measurements were made after I turned off the 16 extra CPUs just to take the measurements.

    You can bet your sweet bippy that I turned them back on when I was done. 😉  I never turn down hardware even when it doesn't make a difference.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Friday, October 6, 2017 9:06 PM

    First of all, you say that the partitioning was done so that you could achieve parallelism in your ETL loads to the staging table.  You also state that the partitioning was done across 18 drives.  Here's a major catch with that... unless you can guarantee that your 18 drives are actually 18 different physical drives, instead of just 18 logical drives pointing to the same 1 or 2 or even 4 (for example) physical drives, your attempts at parallelism might actually be slowing the loads down because of read-head movement on the physical drive.  True parallel loading can only happen if each partition is guaranteed to live on its own physical drive or its own set of physical drives.  If the drives are merely logical drives, then you're likely still doing a "time-share load in a serial fashion".  You'll have the same problem no matter which RDBMS you use.

    And, surprise surprise, depending on how the logical drives are carved out, this can also affect SSDs, although to a lesser extent because there's no physical R/W head movement.  There's still a possibility that you end up time-sharing a pipe, which is a serial rather than a parallel operation.
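
    If you want to check where your partitions actually live, the catalog views will map each partition to its filegroup and physical file(s).  A rough sketch (substitute your own table name for the hypothetical one below):

    SELECT  p.partition_number,
            fg.name            AS filegroup_name,
            df.physical_name,            -- check these paths really are separate spindles/LUNs
            p.rows
    FROM    sys.partitions p
    JOIN    sys.indexes i
            ON  i.object_id = p.object_id
            AND i.index_id  = p.index_id
    JOIN    sys.partition_schemes ps
            ON  ps.data_space_id = i.data_space_id
    JOIN    sys.destination_data_spaces dds
            ON  dds.partition_scheme_id = ps.data_space_id
            AND dds.destination_id      = p.partition_number
    JOIN    sys.filegroups fg
            ON  fg.data_space_id = dds.data_space_id
    JOIN    sys.database_files df
            ON  df.data_space_id = fg.data_space_id
    WHERE   p.object_id = OBJECT_ID(N'dbo.StageTable')    -- hypothetical table name
    AND     i.index_id IN (0, 1)                          -- heap or clustered index
    ORDER BY p.partition_number;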

    You mention that the staging table can have 200-600 million rows in it.  When does it have that many rows in it?  What is the life cycle of a row in the staging table?  For example, is the staging table empty when you first start to load either the 18 files or the 4 files that you may receive?  Since it is a staging table, where and how does the data go from there? Or, are you saying that the staging table isn't actually a "staging" table and that it's really the final target table that you're loading into?  If it's the final target table, is the load an "UPSERT" (update matching rows, insert new rows) or is it an "always insert" methodology?
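
    Just so we're talking about the same thing, by "UPSERT" I mean something roughly like the following.  It's only a sketch with made-up table and column names, written as a plain UPDATE-then-INSERT rather than MERGE:

    BEGIN TRANSACTION;

        -- update the rows that already exist in the target
        UPDATE  tgt
           SET  tgt.Amount   = src.Amount,
                tgt.RowState = src.RowState
          FROM  dbo.FinalTable tgt
          JOIN  dbo.StageTable src
            ON  src.ClaimID = tgt.ClaimID;

        -- insert the rows that don't exist yet
        INSERT  dbo.FinalTable (ClaimID, Amount, RowState)
        SELECT  src.ClaimID, src.Amount, src.RowState
          FROM  dbo.StageTable src
         WHERE  NOT EXISTS (SELECT 1 FROM dbo.FinalTable tgt WHERE tgt.ClaimID = src.ClaimID);

    COMMIT TRANSACTION;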

    You also haven't stated how many rows and how many GB each of the 18 (or 4) files you load contains.  And what are you using to load the files into SQL Server?  BCP?  BULK INSERT?  Something else, and if so, what?  What is the format of the files you load?  True CSV with text qualifiers?  Excel-like CSV where text qualifiers only appear on "cells" that may carry a delimiter?  Tab-separated?  Fixed-field?  (gasp) XML/JSON?  (double gasp) EDI?
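
    The answers matter because they change how the load should be written.  A plain delimited file, for example, can go straight in with something like this rough sketch (the file path, table name, and terminators below are only placeholders):

    BULK INSERT dbo.StageTable                  -- hypothetical staging table
    FROM 'D:\loads\extract_01.csv'              -- hypothetical file path
    WITH (
            FIELDTERMINATOR = ',',
            ROWTERMINATOR   = '\n',
            FIRSTROW        = 2,                -- skip a header row if there is one
            BATCHSIZE       = 1000000
         );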

    Heh... and you mention that you've done POP testing?  On what?  Just SQL Server?  Have you/they done the same testing on Oracle to see if there's actually been an improvement or are folks just speculating that there will be based on some anecdotal evidence?

    Also, do you good folks understand and use the concept of actual "Minimal Logging", and have you verified that your loads are actually using it?  That, all by itself, will make the loads more than twice as fast.  Too many indexes for that?  Have you read the following article?  It's a bit dated but everything in it is still applicable.

    https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx
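
    As a quick illustration of what to look at (only a sketch; the article has the full list of requirements and caveats):

    -- The recovery model matters; FULL rules out minimal logging for most bulk operations.
    SELECT  name, recovery_model_desc
    FROM    sys.databases
    WHERE   name = DB_NAME();

    -- With the right preconditions (e.g. a heap or empty target and a table lock),
    -- even an INSERT...SELECT from the staging table can be minimally logged.
    INSERT  dbo.FinalTable WITH (TABLOCK)       -- hypothetical target table
            (ClaimID, Amount, RowState)
    SELECT  ClaimID, Amount, RowState
    FROM    dbo.StageTable;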

    As for the Paul White article... absolutely brilliant work on his part, as usual.  With nothing aimed at Paul or his good work there, do you have even one index on your partitioned table where the first key is sorted in descending order?  Is it likely that you ever would?  I agree that it IS broken code on Microsoft's part and that they did a pretty miserable job of patching the error instead of fixing it, especially since the patch requires the use of THAT particular trace flag.  But I also agree with Paul that having a descending leading key in an index on a partitioned table would be an extremely rare thing.  Unless such a thing is critical to the success of your project, I wouldn't give it another thought, except to ask: did you test Oracle for the same problem?
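
    Just to be clear about the shape of index being discussed, it's something like this (hypothetical names; partition-aligned, with the leading key descending):

    CREATE INDEX IX_Claims_ClaimDate_Desc
        ON dbo.Claims (ClaimDate DESC, ClaimID)         -- descending leading key
        ON ps_ClaimsByQuarter (ClaimDate);              -- hypothetical partition scheme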

    Oracle's in-table, meta-data-only, row-level archival method sounds really cool except (IMHO, because I don't know anything of Oracle's table structure) that it solves a problem that a trivial amount of code would solve, and it still leaves the data in the table.  It really hasn't solved anything if the data is still in place.  I don't know... maybe Oracle's engine is smart enough not to load such rows into memory when adjacent data is read.  But it's still on disk and the OS still has to read past it.  If not on physical disk, then it still takes up space on expensive SSDs, and I'd think that it would still go through the throes of backups and restores if the archiving is reversible.
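
    The "trivial amount of code" I'm talking about is nothing more than a soft-archive flag plus a filtered index and a view.  A rough sketch with made-up names:

    ALTER TABLE dbo.Claims
        ADD IsArchived bit NOT NULL
            CONSTRAINT DF_Claims_IsArchived DEFAULT (0);

    -- Active rows stay cheap to find even though the archived rows remain in the table.
    CREATE INDEX IX_Claims_Active
        ON dbo.Claims (ClaimDate, ClaimID)
        WHERE IsArchived = 0;
    GO
    -- Most code reads through a view that simply hides the archived rows.
    CREATE VIEW dbo.ActiveClaims
    AS
    SELECT  ClaimID, ClaimDate, Amount, RowState
    FROM    dbo.Claims
    WHERE   IsArchived = 0;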

    As for the table holding 135 columns, I can certainly understand that for a staging table.  I can't understand it for permanent data.  Assuming that Oracle isn't a column-centric database (I'm pretty sure it's not), and even if there is no possibility of further normalizing the data in the table, even Oracle would benefit from some thoughtful VERTICAL partitioning through the use of "sister" tables, where the high-usage columns are kept in a nice narrow table and the infrequently used data lives in other tables.  A nice updatable view with an Instead Of trigger in SQL Server, or a Before trigger in Oracle, on such a "ghost" table would certainly make life easy and fast.  As for indexing every column, of course you wouldn't.  You would only index for the lookups you have, and possibly not even all of them.  As you point out, that may be one area where partition elimination comes in handy, but only if the code is written in such a fashion as to take advantage of it.  And guess what?  Not all of your queries are going to be able to take advantage of partition elimination.  It may even be that none of the code will.
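
    Here's a rough sketch of the kind of thing I mean, with made-up names: a narrow "hot" table, a wide "cold" sister table, and a view with an Instead Of trigger so the application can keep pretending it's one table:

    CREATE TABLE dbo.PatientCore            -- narrow, high-usage columns
    (
        PatientID int          NOT NULL PRIMARY KEY,
        LastName  varchar(50)  NOT NULL,
        FirstName varchar(50)  NOT NULL,
        DOB       date         NOT NULL
    );

    CREATE TABLE dbo.PatientExtended        -- wide, infrequently used columns
    (
        PatientID int NOT NULL PRIMARY KEY
                  REFERENCES dbo.PatientCore (PatientID),
        Notes     varchar(8000) NULL,
        RawHL7    varchar(max)  NULL
    );
    GO
    CREATE VIEW dbo.Patient
    AS
    SELECT  c.PatientID, c.LastName, c.FirstName, c.DOB,
            e.Notes, e.RawHL7
    FROM    dbo.PatientCore c
    LEFT JOIN dbo.PatientExtended e
            ON e.PatientID = c.PatientID;
    GO
    CREATE TRIGGER trI_Patient ON dbo.Patient
    INSTEAD OF INSERT
    AS
    BEGIN
        SET NOCOUNT ON;

        INSERT dbo.PatientCore (PatientID, LastName, FirstName, DOB)
        SELECT PatientID, LastName, FirstName, DOB
        FROM   inserted;

        INSERT dbo.PatientExtended (PatientID, Notes, RawHL7)
        SELECT PatientID, Notes, RawHL7
        FROM   inserted
        WHERE  Notes IS NOT NULL OR RawHL7 IS NOT NULL;
    END;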

    As for the partitioning, you're doing temporal partitioning based on dates to isolate quarters.  So, let me ask... how far back can a row go before you can say it will NEVER be modified ever again?  I'm thinking that, compared to 4 years back, it won't be very long.  Have you considered the performance and maintenance advantages of making old partitions READ_ONLY?  If not, then you're actually missing out on the biggest reasons to partition to begin with. 
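
    The mechanics are simple if each quarter lives in its own filegroup (hypothetical names below): once a quarter can never change again, mark its filegroup read-only, back it up one last time, and stop paying for index maintenance and repeated backups on it.  Note that changing the read-only state generally needs exclusive access to the database:

    ALTER DATABASE MyBigDb MODIFY FILEGROUP FG_2016_Q1 READ_ONLY;

    -- one final backup of the now-static data
    BACKUP DATABASE MyBigDb
        FILEGROUP = 'FG_2016_Q1'
        TO DISK = 'D:\Backups\MyBigDb_FG_2016_Q1.bak';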

    As for all the bloody damned updates that MS does, I agree that it's a real pain in the patootie, but it's better than leaving something broken for a long time, and it's better than waiting around to suffer through a successful attack.  I can't speak to Oracle's update strategy because I don't know what it actually is.  It's nice to think that Oracle never requires updates for bugs or security, but I doubt that's true.

    Last but not least, if you're having problems with front-end performance and that's part of the reason why you want to move to Oracle's multi-column and sub column partitioning methods, have you actually looked at what is causing the performance problems?  I can assure you that even on a large database, it's usually not the fault of the database or the objects that are in it even if you have a bit of a bad design.  Most of the time, it's crap code that's coming from an ORM or code designed by someone that thinks a database is just a place to store data.


    Dear Jeff Moden,
    Thanks for the response. The drives we use are taken from two SAN storage arrays; each network storage drive is 150 GB (18 × 150 GB).
    1. We load the data from CSV files with delimiter-separated columns.
    2. We do full loads only: truncate the entire stage table and kick off the full load job.
    3. Why a stage table? File -> stage -> engine table (destination).
    4. Data quality and cleansing happen there; only the refined data moves on to the destination.
    From the stage table, the refined data is written out to 18 files by BCP (triggered by the ETL), and then BULK INSERT does the job into the target table. The source and target tables have a similar structure, i.e. 18 partitions.
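
    Roughly, each of the 18 file hops looks like this (the paths, names, and delimiter below are only placeholders):

    -- export one slice of refined rows from the stage table (command line):
    --   bcp "SELECT * FROM StageDb.dbo.StageTable WHERE PartitionKey = 1" queryout D:\loads\part01.dat -c -t"|" -S MyServer -T
    -- then pull it into the matching partition of the target table:
    BULK INSERT dbo.FinalTable
    FROM 'D:\loads\part01.dat'
    WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', TABLOCK);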

    A similar set of data is already used by an Oracle database. The problem we faced is also faced by the Oracle team, and they are now trying to implement the new features, i.e. composite partitioning and in-table archival. It may take months to get the performance benchmarks; the team is working on the new Oracle partitioning strategy. Let's see who the winner is, SQL Server or Oracle, with 600 million rows.

    Regards,
    Kannan.C

  • kannan_egd - Saturday, October 7, 2017 12:00 AM


    At least they're doing some tests now... before they do the migration.  And I'm glad the SAN admins realize the value of separate spindles.  I'm also right there with you on the use of a staging table for data validation before it goes to the final table.

    I'm curious... have you read the article on high performance loads that I provided the link for?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
