A little tale and seven questions about SQL Server partitioning

  • The tale

    Once upon a time, a small software shop had an application that used the Firebird DBMS.
    The small shop provided very good service, and many companies hired them and started to use the app.

    And for a long time everyone was happy

    One day the small shop started to worry about maintenance.
    New companies kept making new demands on the app, and this led to frequent database updates to add more tables, columns, SPs, views, etc.
    Each company had its own small DB, and app upgrades started to take a lot of time.
    So one day they migrated to MS SQL Server and merged all the databases into one big database; that way, the small shop reasoned, they would only need to upgrade one DB.

    And for a time everyone was happy

    More companies signed on, and both the companies and the now not-so-small shop grew, and the DB grew too, a lot.
    The DB was now one big centralized database, with online users from hundreds of companies all over the country inserting thousands of new records and printing insanely huge reports with thousands of pages all day.
    The app started to suffer from timeouts and deadlocks, so they hired a SQL guy, and together they managed to change the DB isolation level to SNAPSHOT.
    Angry customers were appeased and the now medium-sized shop got room to breathe again.

    And for a short time everyone was happy

    Bigger companies signed on and the shop started to refactor its old app into a sexy web app.
    But customers now wanted the insanely large reports to be magically printed in the blink of an eye.
    Also, a few yearly "closure procedures" take a big toll on the DB server's resources.

    The DB guy in charge discovered SQL Server partitioning and started to delve into it.
    He tested it on a backup of a production DB and was amazed at the results.
    The hired remote SQL consultant protested, said it was a bad idea, and preached about overhead and other issues.
    The DB guy in charge's manager said "I only care about results" and said goodbye to the hired remote SQL consultant.

    The shop created a "test task force" and put every developer and tester it could into it.
    And it is now on the verge of using partitioning in production.

    The questions

    I'm tasked with helping test it: finding any problems and determining whether performance really gets better.
    1) What should I be looking for?
    2) What can I do to properly test robustness?
    3) How does partitioning help with indexing?
    4) What should I expect as indexes grow fragmented? (there's a lot of fragmentation due to inserts/deletes)
    5) What should I expect as new companies and "Towns" are created?

    More on the schema
    I) We have 2 databases, one with and one without partitioning, on the same server for the sake of the test.
    II) All ID columns are... numeric(18,0)
    III) Almost all tables have a COMPANY_ID and a TOWN_ID column.
    IV) The main tables are rarely updated; the data processing is complex, and changing one record can require many other records in many tables to change.
    That's why for many complex tasks an "update" means deleting and recreating the data (leading to a lot of index fragmentation).
    V) There's no data shared between two companies.
    VI) Each company has a few hundred "Towns" and each "Town" a few hundred "Depts".
    VII) The partitioning is by "TOWN".

    Partition function:
    CREATE PARTITION FUNCTION [ID](numeric(18,0)) AS RANGE LEFT FOR VALUES (0, 1, 2, 3, ..., max TOWN_ID)

    Partition scheme:
    CREATE PARTITION SCHEME [MYSCHEME_ID] AS PARTITION [ID] TO ([0], [1], [2], ..., max TOWN_ID)

    6) Does this mean we must update these objects as more "Towns" are created?
    7) What if one is deleted?

  • Ok, bit of a short reply because I needed to leave work 10 minutes ago to miss the traffic...

    Partitioning is NOT for performance. It's for data management, for fast loads, and for archiving data quickly. It can and will make queries slower, and it won't help most queries beyond what decent indexing would have got you.
    If you're looking for good performance, stop looking at partitioning.

    https://www.simple-talk.com/sql/database-administration/gail-shaws-sql-server-howlers/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Do you have 1 database for all of the "towns" or 1 database per "town"? Your description makes it sound like 1 database that holds all of the towns.
    Rather than partitioning, 1 database per town, each on its own physical disk, will likely get you better performance. Plus it will make data security a LOT easier.

    I agree with GilaMonster (plus he is a VERY smart SQL guy based on the forum posts I've read) and if he says that partitioning is bad, I'd tend to agree with him.

    But something to watch for before jumping on any bandwagon is to research your particular problem. Your customers are saying that things are slow. What is causing the slowness? Blocking? Deadlocks? CPU? RAM? Disk I/O? Network I/O? IIS/Apache misconfiguration? First find out WHAT the problem is and then look at how best to solve it. Also, is your TEST SQL instance on a machine that is a 100% replica of the live one?

    Are you running your TEST database on the same SQL instance as your live one? If so, I'd stop doing that as quickly as you can! Grab a new computer with identical (or as close to identical as possible) specs as the live one and spin up a brand new SQL instance on it. Running 2 different databases on the same instance and trying to do performance analysis will give you very inconsistent results AND will impact your live system a lot more than you want.

    Also, generally consultants are paid the big bucks because they have a LOT of experience with the thing they were hired for.  If the consultant says "that's not a good idea", it likely is not a good idea.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • bmg002 - Tuesday, April 18, 2017 9:47 AM

    plus he is a VERY smart SQL guy based on the forum posts I've read

    Errr... (the avatar is a character from Star Wars, not a picture of me)

    Also, generally consultants are paid the big bucks because they have a LOT of experience with the thing they were hired for. If the consultant says "that's not a good idea", it likely is not a good idea.         

    And in this case the consultant is correct. Partitioning has overheads (usually in query performance) and lots of caveats, and the suggested partition design is not a good one. Yes, the scheme and function will need to be altered for every new town, and it'll give no better performance than a clustered index leading with the Town column.
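
    For reference, "altered for every new town" looks something like this (a sketch; the filegroup name and boundary value are made up):

    -- Tell the scheme which filegroup the next partition goes to,
    -- then split the function at the new town's boundary value.
    ALTER PARTITION SCHEME [MYSCHEME_ID] NEXT USED [PRIMARY];  -- filegroup assumed
    ALTER PARTITION FUNCTION [ID]() SPLIT RANGE (10001);       -- hypothetical new max TOWN_ID
    -- And removing a boundary when a town is deleted:
    ALTER PARTITION FUNCTION [ID]() MERGE RANGE (10001);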

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • heh... whoops.  Guess reading your signature would have helped.  I don't know too many men named Gail.  Sorry about that.  I think the avatar threw me off.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • I've worked at a software-as-a-service company that faced the problem of one database for all customers vs. a separate database for each customer. Even though the business people designing the system said they wanted one huge database and that it would make things easier, I had to be the one to explain the technical challenges behind that: being forced to scale up instead of having scale-out options, having to implement loads of extra logic in all your queries to handle security, needing more manual intervention instead of being able to quickly spin up or shut down customers, etc.

    Since you're already in the one-huge-database environment, a question to consider: was the testing that compared non-partitioned vs. partitioned done with just an individual user, or under a higher workload? Has analysis been done on the wait stats to determine where performance problems are occurring? Looking into index usage, the worst-performing queries, and missing indexes can lead to better results than just trying to partition all the data.
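
    A rough first look at the top waits (a sketch; real wait-stats scripts filter out many more benign wait types than this):

    SELECT TOP (10)
           wait_type,
           waiting_tasks_count,
           wait_time_ms,
           max_wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT LIKE 'SLEEP%'  -- skip some idle/benign waits
    ORDER BY wait_time_ms DESC;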

    You mentioned your partitioning was going to be on Town, but does someone from a Company ever need to query data across the multiple Towns that Company has? Those kinds of queries would probably be hurt by this partitioning scheme, and yes, it would require maintenance every time a new Town was added to the system.

    I'm also a bit concerned about how the data relationships are defined and how the data is maintained; since you say that instead of an UPDATE you perform a DELETE and then recreate the data, it sounds like there are some design issues there.

  • Thanks for the reply.

    Actually the partitioning was not my idea, and I'm too ignorant (of partitioning) to take sides.
    My gut tells me there's something else wrong, and partitioning is only masking it or helping to avoid it.

    But neither my gut nor "Gail Shaw told me not to do it" will make management change their mind, and knowing they dismissed a real DBA (the consultant) doesn't make me eager to report anything bad about the idea.

    I'm asking for advice on how to find evidence that it's a bad idea. Real evidence I can show, in performance tests for example.

    bmg002

    Actually we have one big DB with all companies and towns, but it evolved from "each company has its own small DB" as told in the little tale.
    The actual simulation server is a VM any developer can access; I was picked from another team to help search for issues.
    The consultant ran benchmarks, took a close look at the production server, helped with indexes and all that stuff, and agreed to change the isolation level to snapshot; that was a few months ago. Still, some customers hit problems at heavy-load times.
    COMPANY_ID and TOWN_ID are columns in almost every table; there are hundreds of FKs pointing to the COMPANY and TOWN tables, whose IDs are clustered PKs.
    You can think of the schema as: each customer is a COMPANY, and each company operates in a set of TOWNs; there's never data shared between two companies or between two towns, but each company keeps an eye on its own towns.

    Example

    I just started to test it today and was puzzled that it does in fact feel faster on the partitioned DB.
    As a first test I just created and deleted a town; it takes more than 30 seconds to delete a town (many millions of FK checks on hundreds of tables, I guess).
    The DB with the partitioning takes only 3 seconds, which made me check whether the constraints and FKs are all enabled. In fact I'm running a schema compare right now just in case.
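
    One quick way to check that from the catalog (a sketch; sys.foreign_keys is the standard system view):

    -- Disabled FKs are not checked at all; FKs re-enabled without WITH CHECK are
    -- marked not trusted (still enforced on DML, but ignored by the optimizer).
    SELECT name, is_disabled, is_not_trusted
    FROM sys.foreign_keys
    WHERE is_disabled = 1 OR is_not_trusted = 1;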

    For the tests, quantify how many queries will not use partition elimination. These are queries that either don't use the Town or don't filter on Town adequately, so the query hops across all the partitions to retrieve the data. If all the queries are well behaved and perfectly using partition elimination, you may be one of the rare places where partitioning helps performance (and they are exceedingly rare). Validate that you're seeing partition elimination and not just good performance because of up-to-date statistics (I assume one database was built more recently than the other, so the partitioned index's statistics are more up to date). For that matter, go to the old database and update the statistics. Does performance improve there? Maybe the problem hasn't been indexes at all, but statistics (which is shockingly common).
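
    Both checks, sketched (the table name is hypothetical): refresh statistics on the non-partitioned copy before re-testing, and inspect per-partition row counts on the partitioned copy; the "Actual Partition Count" property on each seek/scan in the actual execution plan shows whether elimination happened.

    -- Non-partitioned DB: rule out stale statistics before crediting partitioning.
    UPDATE STATISTICS dbo.SomeBigTable WITH FULLSCAN;

    -- Partitioned DB: row distribution across partitions for the heap/clustered index.
    SELECT partition_number, rows
    FROM sys.partitions
    WHERE object_id = OBJECT_ID('dbo.SomeBigTable')
      AND index_id IN (0, 1);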

    It all comes down to whether your queries can use the partitions or whether they suffer from the existence of partitions. If they are always (or at least an overwhelming majority of the time) benefiting, great. If not, partition scans are worse than table scans, and this should very quickly become evident.

    Also, in terms of testing: since the databases are side by side, please only run tests on one database at a time, or you're testing your memory and I/O setup more than your database setup.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • That is nice that you have a dedicated VM for testing. I imagine you have the databases set up so they are both running off the same VM (the partitioned and the non-partitioned) but with no actual LIVE access? I just want you to be sure that when you are doing your comparisons, it is under similar load.

    For example, if you run a town delete on the non-partitioned database at the same time as somebody is running a report, you will have blocking, which could hurt performance. If you then do the same on the partitioned one and it is faster, that isn't a very good test, no?

    I would run:
    SET STATISTICS IO ON
    SET STATISTICS TIME ON

    before your queries, and look at the execution plans for both. Also make sure your statistics are up to date in both databases.
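
    A quick way to do that database-wide (a sketch; the database names are made up):

    USE MyTestDB_Partitioned;  -- hypothetical name
    EXEC sp_updatestats;
    USE MyTestDB_Baseline;     -- hypothetical name
    EXEC sp_updatestats;
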
    Another thing to consider is that the end user doesn't really care about dropping or adding companies; they will notice no performance change in that regard. I would focus a LOT more on testing what the end users see. Deleting a company could be scheduled for some low-activity time and hardly anybody would notice. What you will want to test like crazy is: is the reporting any faster?

    Also, depending on the tools you are using for reporting, you will want to ensure that the slowness isn't caused by a specific web browser. I was loading a report in IE 11 and it sat there with a big blank page until the whole thing was loaded. It was a locally stored 150 MB HTML file, but loading it was horrendously slow. I loaded the same thing in Chrome and could see the results while it was processing the HTML. Just another thing to test.

    On another note, can you replicate the slowness that the customers experienced? Some web elements are processed client-side, so you could optimize things on your end until everything runs completely instantly, and the end user on a Windows 98 SE machine with 256 MB of RAM and a 2400-baud dial-up modem would still complain that the page loads slowly. I would try to find out the specs of the end user's machine, replicate that in a VM, and replicate the slow-running report in test if possible.
    The problem may lie outside of the SQL side of things.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Grant Fritchey - Tuesday, April 18, 2017 10:40 AM

     Validate that you're seeing partition elimination and not just good performance because of up to date statistics

    Or better indexing.

    It's very common to see a test that compares a table indexed on useless columns (or with no indexes at all) against one where the clustered index is partitioned, the partition column is a column that's frequently queried on, and hence the clustered index is now useful for the queries. In that case the partitioned table will perform better. Not because of the partitioning, but because the index is now useful.

    If you're doing performance comparisons, then everything other than the one thing you're testing MUST be identical. Indexes, statistics, queries, etc.; the only difference should be that in one case the table is ON [PRIMARY] and in the other it's ON PartitionScheme(PartitionColumn).

    See http://www.sqlservercentral.com/articles/Performance+and+Tuning/126532/ for an example
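
    Concretely, something like this (a sketch; the table layout is illustrative, and MYSCHEME_ID is the scheme from the original post):

    CREATE TABLE dbo.Orders_Baseline (
        ID      numeric(18,0) NOT NULL,
        TOWN_ID numeric(18,0) NOT NULL,
        -- ...identical remaining columns in both tables...
        CONSTRAINT PK_Orders_Baseline PRIMARY KEY CLUSTERED (TOWN_ID, ID)
    ) ON [PRIMARY];

    CREATE TABLE dbo.Orders_Partitioned (
        ID      numeric(18,0) NOT NULL,
        TOWN_ID numeric(18,0) NOT NULL,
        -- ...identical remaining columns in both tables...
        CONSTRAINT PK_Orders_Partitioned PRIMARY KEY CLUSTERED (TOWN_ID, ID)
    ) ON [MYSCHEME_ID] (TOWN_ID);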

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • bmg002 - Tuesday, April 18, 2017 10:47 AM

    I would run:
    SET STATISTICS IO ON
    SET STATISTICS TIME ON

    Exercise caution with this. STATISTICS IO can actually negatively impact performance, negating the measurements of STATISTICS TIME. I have largely stopped using these measures except for the grossest possible tests. When I'm doing full performance tuning, I set up extended events and capture the measurements there, so that they don't negatively impact my tests (or at least do so as minimally as possible).
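
    A minimal session along those lines (a sketch; the session, database, and file names are made up):

    CREATE EVENT SESSION [PartitionTest] ON SERVER
    ADD EVENT sqlserver.sql_statement_completed (
        ACTION (sqlserver.sql_text)
        WHERE sqlserver.database_name = N'MyTestDB')  -- hypothetical DB name
    ADD TARGET package0.event_file (SET filename = N'PartitionTest.xel');
    GO
    ALTER EVENT SESSION [PartitionTest] ON SERVER STATE = START;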

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey - Tuesday, April 18, 2017 12:08 PM

    bmg002 - Tuesday, April 18, 2017 10:47 AM

    I would run:
    SET STATISTICS IO ON
    SET STATISTICS TIME ON

    Exercise caution with this. STATISTICS IO can actually negatively impact performance, negating the measurements of STATISTICS TIME. I have largely stopped using these measures except for the grossest possible tests. When I'm doing full performance tuning, I set up extended events and capture the measurements there, so that they don't negatively impact my tests (or at least do so as minimally as possible).

    I do agree with that.  If you are looking at doing performance tuning, you shouldn't have any extra overhead.  But if you are doing A-B testing, it can be helpful.  Extended events are a LOT better as they have less overhead; still a little bit, but I believe they have the least overhead of all monitoring inside SQL.
    When I'm doing A-B testing, I generally turn those both on, run the query 10 times, and store the output somewhere.  Then connect to database 2 and do the same thing.  My query would be something like:
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;
    GO
    PRINT 'START TEST';
    GO
    <optimizing query goes here>
    GO 10
    PRINT 'END TEST';
    GO

    That way I have 10 sets of data to average things out, and I can compare the uncached plan to the cached plan (the first run would be the uncached plan, the others would be cached).  And I have real-world numbers I can show management to demonstrate that the optimizations helped.  The problem with that method is that it doesn't work well for INSERTs, UPDATEs or DELETEs; it's only valid testing for SELECTs.
    If I am working on getting a query to run as fast as it possibly can, then I don't use those.  But our test instances are a lot less powerful than live, so performance tuning is a bit more difficult.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • jcb - Tuesday, April 18, 2017 7:44 AM

    The tale

    II) All ID columns are... numeric(18,0)
    III) Almost all tables have a COMPANY_ID and a TOWN_ID column.
    IV) The main tables are rarely updated; the data processing is complex, and changing one record can require many other records in many tables to change.
    That's why for many complex tasks an "update" means deleting and recreating the data (leading to a lot of index fragmentation).
    V) There's no data shared between two companies.
    VI) Each company has a few hundred "Towns" and each "Town" a few hundred "Depts".
    VII) The partitioning is by "TOWN".

    Partition function:
    CREATE PARTITION FUNCTION [ID](numeric(18,0)) AS RANGE LEFT FOR VALUES (0, 1, 2, 3, ..., max TOWN_ID)

    Others can comment more intelligently on partitioning.
    But I have an observation on datatypes and size:  The IDs seem orders of magnitude too large for companies, & perhaps for towns.  There certainly aren't 999 quadrillion companies or towns in the world.  And even if you use a unique ID for every town for every different company, I doubt that would approach that limit.  You can benefit from smaller, more appropriate datatypes where possible.  An integer is probably sufficient for COMPANY_ID (saving 5 bytes/row).  
    And even where you need 18 digits, use a bigint rather than numeric(18,0) to save one byte/row.  
    The savings is important not for disk space, but because it will allow more rows per page in memory and more efficient indexing
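
    The per-value sizes are easy to verify; numeric(18,0) stores in 9 bytes, bigint in 8, int in 4:

    SELECT
        DATALENGTH(CAST(1 AS numeric(18,0))) AS numeric_18_0_bytes,  -- 9
        DATALENGTH(CAST(1 AS bigint))        AS bigint_bytes,        -- 8
        DATALENGTH(CAST(1 AS int))           AS int_bytes;           -- 4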

  • jcb - Tuesday, April 18, 2017 7:44 AM

    ...
    The questions:
    I'm tasked with helping test it: finding any problems and determining whether performance really gets better.
    1) What should I be looking for?
    2) What can I do to properly test robustness?
    3) How does partitioning help with indexing?
    4) What should I expect as indexes grow fragmented? (there's a lot of fragmentation due to inserts/deletes)
    5) What should I expect as new companies and "Towns" are created?
    ...

    You may have heard about how partition elimination can help improve the performance of queries, but that only turns table scans into partial table scans, and it requires some assumptions about your partitioning strategy and query predicates, so its usefulness in that regard is limited. It's more useful to focus on efficient indexing so you avoid table scans as much as possible. However, there are other reasons to implement partitioning. For example, it can help with the performance of index maintenance and bulk inserts, which indirectly can improve the response time of your queries.

    The ALTER INDEX statement supports a PARTITION clause, meaning that you can rebuild indexes on a specific partition, perhaps only the most recent "hot" partition. Even if you intend to re-index all partitions, it can be significantly faster to re-index one partition at a time, particularly on very large tables. Remember to partition your non-clustered indexes too, not just your clustered table index. For example:
    ALTER INDEX ALL ON <table name> REBUILD PARTITION = <partition number>;

    You can also leverage partition switching to reduce blocking on the table during bulk inserts. Keep your queries isolated from the load by first inserting into a staging table (simply another table having the same DDL as your transactional table), and then switching the new data into the matching partition of your transactional table, which takes only a few seconds.
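
    The switch itself is a metadata-only operation, something like this (a sketch; the table names are illustrative, and the staging table must match the main table's schema, sit on the target partition's filegroup, and carry a check constraint limiting it to that partition's range):

    DECLARE @newTownId numeric(18,0) = 1234;  -- hypothetical town just loaded
    -- The target partition must be empty before the switch.
    ALTER TABLE dbo.Orders_Staging
        SWITCH TO dbo.Orders PARTITION $PARTITION.ID(@newTownId);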

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Thanks everybody for the insights.

    Looks like they told me a different tale after all.

    The SQL guy gave me two databases to test performance and told me both were equal except for the partitioning.
    Comparing both databases, I found that all PKs (all tables' ID clustered indexes) have been changed to non-clustered, and TOWN_ID has been made the clustered index on every table.
    The SQL guy just said the index change was necessary for the partitioning to work. (I have my doubts, but as I said, I'm ignorant in this topic.)

    As I said before, almost all tables have a TOWN_ID column.

    Well, it happens that most queries have a TOWN_ID clause, so for some known-to-be-problematic reports and features this index change really speeds things up.
    But at the same time, others are slowed down, a lot.

    At peak times we have a thousand online users feeding some huge tables; those tables also get a lot of deletes, and there are between 10k and 500k records for each TOWN_ID.
    I'm afraid this will impact the production server (still no time for a decent load test due to other duties).

    At first I was ready to call this partitioning performance boost a miracle, but now I'm more inclined to declare "Myth Busted!"

    By the way, after talking with management and other coworkers about the performance problems, we are now seriously considering splitting the DB and taking the scale-out route.
