Commercial Software with All Tables as Heaps and Terrible Performance

  • Hello

    I recently moved to a new organisation and one of my first tasks was to work out why the main software the organisation uses is, and has always been, very slow and unpredictable.

    Every table in the database used by the system is a heap as no tables have any clustered indexes/primary keys set up and I have never seen this kind of setup before (unless it was developed by someone who was new to databases). The tables do mostly have non-clustered indexes on them.

    Is there a valid reason to use only heap tables in a large complex system with hundreds of tables where certain tables hold millions of records?

    This is software that is actively supported and used across the sector.

    Thanks

    Robin

  • robinwilson (9/12/2015)


    Hello

    I recently moved to a new organisation and one of my first tasks was to work out why the main software the organisation uses is, and has always been, very slow and unpredictable.

    Every table in the database used by the system is a heap as no tables have any clustered indexes/primary keys set up and I have never seen this kind of setup before (unless it was developed by someone who was new to databases). The tables do mostly have non-clustered indexes on them.

    Is there a valid reason to use only heap tables in a large complex system with hundreds of tables where certain tables hold millions of records?

    This is software that is actively supported and used across the sector.

    Thanks

    Robin

    Starting from scratch, no... gradually migrating from C-ISAM tables on Unix to SQL Server and needing to maintain code that works across multiple systems with different DB engines, whilst only having a few programmers... maybe financial...

    I also have to work with this (in my case Sage 1000 ERP), and I think the legacy of a 30-year-old code base is the problem.

    MM




  • Forum Etiquette: How to post Reporting Services problems
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • How to Post Performance Problems - by Gail Shaw

  • robinwilson (9/12/2015)


    Is there a valid reason to use only heap tables

    Only reason I know of is where the insert rate is astronomic. Even then, lacking a Clustered Index can cause problems, and there is a white paper by Paul Randal with some tests he ran suggesting that all tables should have clustered indexes.

    But that said "Every table in the database used by the system is a heap as no tables have any clustered indexes/primary keys" is clearly a conscious decision (even if one born out of ignorance!) as not all the tables can have astronomic insert rates, surely?!!

    work out why the main software the organisation uses is, and has always been, very slow and unpredictable.

    We have a 3rd party APP whose performance is "poor". It's a well-known name sold by a company worth $billions and, to my mind, it should be far better written than it is. The code is peppered with NOLOCK and we have sporadic "data missing from reports" issues ... hard to prove, but my gut feeling is that it relates to NOLOCK or some other poorly written code.

    Anyways, I have started logging DMVs about index usage to a table so that I can make an informed choice about which columns to use for the clustered index (when I add one to the heap-tables) - you might need to do the same??
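    In case it is useful, this is roughly what my logging looks like - the logging table name is made up for this example, and it needs scheduling as a job because the DMV counters reset whenever the SQL service restarts:

    -- One-off: a logging table of my own (the name is just an example)
    CREATE TABLE dbo.IndexUsageLog
    (
        LogDate      datetime NOT NULL CONSTRAINT DF_IndexUsageLog_LogDate DEFAULT (GETDATE()),
        TableName    sysname  NOT NULL,
        IndexName    sysname  NULL,          -- NULL = the heap itself
        UserSeeks    bigint   NOT NULL,
        UserScans    bigint   NOT NULL,
        UserLookups  bigint   NOT NULL,
        UserUpdates  bigint   NOT NULL
    );

    -- Scheduled snapshot of the usage DMV for the current database
    INSERT INTO dbo.IndexUsageLog (TableName, IndexName, UserSeeks, UserScans, UserLookups, UserUpdates)
    SELECT  OBJECT_NAME(us.object_id),
            i.name,
            us.user_seeks, us.user_scans, us.user_lookups, us.user_updates
    FROM    sys.dm_db_index_usage_stats AS us
    JOIN    sys.indexes AS i
              ON  i.object_id = us.object_id
              AND i.index_id  = us.index_id
    WHERE   us.database_id = DB_ID();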

    A key problem we have is that there is a hierarchy of folders, and each folder that a user inserts / moves causes a count column on all the child rows "in between" (in the hierarchy) to be decremented / incremented. This runs in a tight loop, some nights for 10 hours, and a typical update statement (i.e. one iteration of the RBAR loop) updates 20,000 rows. The COUNT column being updated is part of the Clustered Index ... and there are about 15 non-clustered indexes - all of which have to be changed, because the clustered index key column changes, for all 20,000 rows ... on every iteration of the loop ... for 10 hours ... every night 🙁

    If your APP has updates which cause records to physically move in the heap (e.g. the record size increases and there is not enough space to write it back to its original location) then, without a clustered index, SQL Server leaves a forwarding pointer behind and every non-clustered index lookup has to chase that extra hop - the non-clustered indexes contain the "disk address" of the record, whereas if you had a clustered index they would contain the Clustered Index Key column(s), so a record moving to a different page would not affect them (provided you don't have the problem I do of a clustered index key column being updated!!)
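    You can also get a feel for how bad the record-movement problem already is on your heaps with something like this (it can be slow on a big database, so run it out of hours, or use 'SAMPLED' instead of 'DETAILED'):

    SELECT  OBJECT_NAME(ps.object_id)   AS TableName,
            ps.page_count,
            ps.forwarded_record_count    -- pointers left behind when a row outgrew its page
    FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
    WHERE   ps.index_id = 0              -- heaps only
      AND   ps.forwarded_record_count > 0
    ORDER BY ps.forwarded_record_count DESC;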

  • robinwilson (9/12/2015)


    I recently moved to a new organisation and one of my first tasks was to work out why ...

    You are probably already aware of it, but in case not: I have found running Brent Ozar's "Blitz" Server Health Check script invaluable in discovering what sort of mess the databases created/used by 3rd party APPs are in.

    http://www.brentozar.com/blitz/

  • robinwilson (9/12/2015)


    Hello

    I recently moved to a new organisation and one of my first tasks was to work out why the main software the organisation uses is, and has always been, very slow and unpredictable.

    Every table in the database used by the system is a heap as no tables have any clustered indexes/primary keys set up and I have never seen this kind of setup before (unless it was developed by someone who was new to databases). The tables do mostly have non-clustered indexes on them.

    Is there a valid reason to use only heap tables in a large complex system with hundreds of tables where certain tables hold millions of records?

    This is software that is actively supported and used across the sector.

    Thanks

    Robin

    I recently went through this with a 3rd party vendor that my company hired to write some code for us in a database designed for them. When I cornered the lead "developer", his answer was that it was so his code would always be portable, because clustered indexes are proprietary.

    Unfortunately, the code is like what Kristen describes, except it's some really poorly written document processing code that hits the database with more than 150,000 SELECTs every 32 seconds - and that's just on one of the tables, and just to process 15 documents. I'll also tell you that the code doesn't work and the table in question has grown to 4 million rows in just a couple of months.

    It's a really good thing that I don't live within driving distance of that, ummm... person.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Using a heap to speed up the inserts is like a very bad credit card deal: no transaction fees but an X000% annual interest rate :pinch:

    😎

    Jeff Moden (9/12/2015)


    It's a really good thing that I don't live within driving distance of that, ummm... person.

    Jeff, what happened to the PoIP (Porkchop over IP) you were working on?:-D

  • Hello all

    Thanks for the replies.

    The software in question was developed many years ago and, whilst it is still supported and maintained, it does look and feel very old-fashioned, with a confusing UI full of small grey dialog boxes.

    Whilst inserts do occur, we are mostly reading the data, and inserts end up slow as well because the database is so taxed trying to cope with all the reads. An external IT company has had a look and said we are doing over 10 table scans per second - presumably due to us not having any clustered indexes/primary keys.

    Whilst I cannot see the SQL in the software to see if they use NOLOCK, they do have an equally bad drag-and-drop query writer which generates all the SQL on one line, adds NOLOCK to every table, and puts the ON part of each join in a different place from the rest of the JOIN statement, so I'm not even sure how SQL Server can interpret it - I know I can't!

    I will have a look at Brent Ozar's "Blitz" Server Health Check as I was not aware of this.

    We struggle with things which I feel are fairly basic such as returning a list of 5,000 students with their highest Maths, English and Science results (the organisation in question is a college) - this task takes 8 hours!

    I did ask the software supplier why they had not added clustered indexes to the tables and they told me it was because they couldn't know how I wanted to query the data and it was up to me if I wanted to create the indexes myself. It seems strange that they leave this kind of thing to the customer, and since queries always join/group on the PK/FK relationships, surely clustered indexes would always be beneficial (as we are not doing a large amount of inserts).

    Robin

  • robinwilson (9/13/2015)


    I cannot see the SQL in the software to see if they use NOLOCK

    SQL Profiler, with a filter on TextData for %NOLOCK% (perhaps also a filter on Database Name, although I would want to check all DBs ...) - but if you find some there is nothing you can do about it (i.e. you cannot modify the source code) except add it to your sleepless-nights-worry-list 🙁 or [definitely my preferred choice!!] make a nuisance of yourself at a User Group Meeting 🙂
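    A quick-and-dirty alternative to Profiler is to rummage in the plan cache for statements containing NOLOCK - crude, and it only shows what has run recently enough to still be cached, but something like:

    SELECT  TOP (50)
            qs.execution_count,
            st.text
    FROM    sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    WHERE   st.text LIKE '%NOLOCK%'
    ORDER BY qs.execution_count DESC;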

    But as you "we are mostly reading the data" it may not be something to worry too much about. But there again, if inserts are rare I am not sure why NOLOCK would be chosen in the first place - as presumable not much [performance/blocking] benefit in the first place?

    We struggle with things which I feel are fairly basic such as returning a list of 5,000 students with their highest Maths, English and Science results (the organisation in question is a college) - this task takes 8 hours!

    Boy are you going to be popular when you add some Clustered Indexes and tuning and cut these reports from Hours to Milliseconds!
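    That 8-hour report smells like RBAR to me. The usual set-based pattern for "highest result per student per subject" is something like this - the table and column names below are pure guesses, obviously, not your vendor's schema:

    SELECT  StudentID, Subject, Result
    FROM    (
            SELECT  r.StudentID, r.Subject, r.Result,
                    ROW_NUMBER() OVER (PARTITION BY r.StudentID, r.Subject
                                       ORDER BY r.Result DESC) AS rn
            FROM    dbo.ExamResult AS r
            WHERE   r.Subject IN ('Maths', 'English', 'Science')
            ) AS x
    WHERE   x.rn = 1
    ORDER BY StudentID, Subject;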

    it was up to me if I wanted to create the indexes myself

    Good news ... as you are not in breach of their T&C's 🙂
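    When you do get to that point it is usually a one-liner per table - hypothetical names again, ONLINE = ON needs Enterprise Edition, and bear in mind that building the clustered index also rebuilds all the non-clustered indexes on that table, so pick a quiet time:

    CREATE UNIQUE CLUSTERED INDEX CX_Student_StudentID
        ON dbo.Student (StudentID)
        WITH (ONLINE = ON, SORT_IN_TEMPDB = ON);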

  • We struggle with things which I feel are fairly basic such as returning a list of 5,000 students with their highest Maths, English and Science results (the organisation in question is a college) - this task takes 8 hours!

    is this using the software's reports as designed, or are you writing your own query and getting this poor response?

    out of interest....how many rows in the largest tables?

    assume this is a SQL 2000 database?

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

    Thanks - I didn't think of looking at SQL Profiler; even if I can't do anything about it, at least I can complain to them and maybe influence them to change it.

    I do actually notice locking, and the queries even seem to lock themselves when there is nothing else running on the server (i.e. through the night, outside the backup window).

    In regards to the poor performance it is mainly in trying to report from the system although the system itself is slow (saving a learner taking 25 seconds is considered fairly quick).

    They don't really provide any reports though so you are expected to write your own.

    The database was running on 2008 R2 but it has been moved to a new 2014 server.

    The largest table is the audit table, which currently has 29 million rows and is much larger than the rest, but it was originally 5 times the size and had some corruption, so I fixed it and trimmed it down. You couldn't select from it at all to start with.

    Other main tables are around 100,000 - 500,000 records.

    Another issue I have discovered is that, rather than using an auto-number field or even a sequence, the system has a single table with one row which holds the next ID. Every table gets its next ID from this table, which involves first locking the ID table, getting the next ID, incrementing it, and finally unlocking the table again afterwards. They said it was a feature but I have no idea why this would be a good thing to do.
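    I don't have their exact code, but from their description it behaves roughly like this (the table and column names are my guesses):

    BEGIN TRANSACTION;

    DECLARE @NextID numeric(18, 0);

    SELECT  @NextID = NextID
    FROM    dbo.NextAvailableID WITH (UPDLOCK, HOLDLOCK);  -- blocks every other session wanting an ID

    UPDATE  dbo.NextAvailableID
    SET     NextID = NextID + 1;

    -- ... the actual INSERT into whichever table then uses @NextID ...

    COMMIT TRANSACTION;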

    Surely this will mean that in a multi-user environment each insert transaction will be suspended whilst others complete - thus negating any argument for a heap table?

    Thanks

    Robin

  • Eirikur Eiriksson (9/13/2015)


    Using a heap to speed up the inserts is like a very bad credit card deal: no transaction fees but an X000% annual interest rate :pinch:

    😎

    Jeff Moden (9/12/2015)


    It's a really good thing that I don't live within driving distance of that, ummm... person.

    Jeff, what happened to the PoIP (Porkchop over IP) you were working on?:-D

    I decided to go whole hog on a new porkchop launcher project, instead 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (9/14/2015)


    Eirikur Eiriksson (9/13/2015)


    Using a heap to speed up the inserts is like a very bad credit card deal: no transaction fees but an X000% annual interest rate :pinch:

    😎

    Jeff Moden (9/12/2015)


    It's a really good thing that I don't live within driving distance of that, ummm... person.

    Jeff, what happened to the PoIP (Porkchop over IP) you were working on?:-D

    I decided to go whole hog on a new porkchop launcher project, instead 😀

    That's quite "striking":hehe:

    😎

  • robinwilson (9/13/2015)


    In regards to the poor performance it is mainly in trying to report from the system although the system itself is slow (saving a learner taking 25 seconds is considered fairly quick).

    Scary ... I suppose I am in a similar boat because all of my database life I have spent 90% of my time looking after our own APPs. We have written them using the experience we have gained, changing our procedures as we discover A Better Way, so I have never had any nasty surprises with performance. But lately I have been spending 90% of my time trying to improve the performance of poorly written 3rd party APPs (from vendors who should know better, or who have clients who don't complain enough to get them to improve!) (as well as being expected to spend the other 90% of my time on our own in-house APPs 😀 )

    I'm finding improving the 3rd party APPs much harder, and I have spent a LOT of time writing logging for DMV stats, and reports on those log files, to try to get enough info to improve them (and then finding that my skills are lacking in this area so I then have to Rinse and Repeat ... which is what contributes to the Elapsed Time).

    I have had a single person do a single task in the middle of a weekend when the system was unused to try to capture what was going on in SQL Profiler, with an expectation that I could then fix that one dire problem easily, but all that did was to open a whole new can of worms! Hence I am at the "log everything and only then move forwards" point now.

    But that apart, it might be worth trying to get a SQL Profiler snapshot of a single, poorly performing and frequently used task, with a view to improving that for users. If I were the Business Manager for a process that was taking 25 seconds I'd be jumping up and down ... we have issues here where a process taking 5 seconds to save a record is seriously slowing down the work-rate for data entry staff.

    The database was running on 2008 R2 but it has been moved to a new 2014 server.

    Dunno if you will need it, but new versions of SQL have the ability to force a Plan Guide onto a query. I've been looking at that for some queries with disastrous query plans but I'm still not sure if that is the right solution to the right problem (in my case), as I think that improving all the indexes on all the tables might be better, hence I'm still walking rather than running ...

    The largest table is the audit table

    That's true of our APPs too ... our Audit Table is always huge compared to any actual client data 🙂 But we built it as, essentially, a write-only table as we only need to query it when something goes wrong, or overnight for STATS.

    If I had a problem with the STATS query I'd be happy to restore last-night-backup to another server and run the reports there (with better indexes / whatever)

    the system has a single table with one row which holds the next ID and every table gets its next ID from this table ... I have no idea why this would be a good thing to do.

    Two benefits I can see:

    Every ID in the system is unique, so if you have an ID of 1234 you are only going to find a matching row in one table. I can see that being helpful in debugging (and maybe users find it helpful too, if a 3rd party quotes an ID without the "context"?). We've had major cockups in the past in DEV where we wrongly joined TableA.ID to TableB.ID - instead of TableC.ID - and because we only had IDs 1-10 in DEV :hehe: the JOIN worked. Now we set the initial value of the IDENTITY 1,000 apart on each new table we create so there is no chance of an accidental JOIN (well, in DEV at least; in the real world the record counts are higher of course).
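    i.e. something like this (made-up table names), so a wrong JOIN in DEV finds nothing instead of accidentally matching:

    CREATE TABLE dbo.Customer ( CustomerID int IDENTITY(1000, 1) NOT NULL PRIMARY KEY, CustomerName nvarchar(100) NOT NULL );
    CREATE TABLE dbo.Invoice  ( InvoiceID  int IDENTITY(2000, 1) NOT NULL PRIMARY KEY, CustomerID int NOT NULL );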

    The second benefit is that you can say "reserve me 1,000 IDs" and the central ID table will be incremented by 1,000. This can be handy for getting the ID range ahead of any inserts when writing parent / child data, as opposed to inserting the parent rows and then trying to find out what IDs they were allocated. Modern SQL versions have much better ways of solving that problem, but I have some sympathy for legacy systems. Of course YOUR system may never use the central table for bulk ID allocations 🙂

    Other than that it stinks as a bottleneck, as you said; I doubt it is part of your 25-second delay though. I don't know much about in-memory tables (in SQL 2014, rather than SQL 2016, I think??) or whether you can just move an existing table to be in-memory ... it should be faster for your ID table, if you can. There may be other ways to improve performance if it is just a single-row table?

    I'd be very interested to hear how you solve these issues, over time. Seeing other people's SQL Profiler logs, the problem they are tackling, and the line of attack they take would almost certainly help me with attacking my own PP3PP (Poorly Performing 3rd Party Programs - best I could do for an acronym, sorry!)

  • Hello Kristen

    Thanks for your response. I agree I do find it much harder trying to fix other people's code and code in 3rd party apps.

    The database in question also doesn't seem to be correctly normalised: some data almost seems too normalised (where I need to pull data from many tables) and in other places the same data is repeated in more than one location.

    The first advantage you mention is actually how the software supplier tried to sell it to me too, when I asked why they had done it that way. Also, these ID fields are numeric rather than just int, which may be a way of making the database work on other DBMSs, but I'm not sure whether that also causes a performance hit.

    I ran Brent Ozar's stored procedure and it returned about 4,000 results saying the non-clustered indexes we had were not valid: "The index X is a leftover hypothetical index from the Index Tuning Wizard or Database Tuning Advisor. This index is not actually helping performance and should be removed."
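    In case anyone else hits the same thing, the leftovers are easy enough to list with something along these lines:

    SELECT  OBJECT_NAME(i.object_id) AS TableName,
            i.name                   AS IndexName
    FROM    sys.indexes AS i
    WHERE   i.is_hypothetical = 1;   -- leftovers from the Index Tuning Wizard / DTA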

    It also confirmed that clustered indexes were missing on commonly used tables.

    There was also this one which was quite funny: "X has user-created statistics. This indicates that someone is being a rocket scientist with the stats, and might actually be slowing things down, especially during stats updates."

    Maybe Poorly Performing Legacy 3rd Party Pile of Pants Program PPL3PPPP is more appropriate!

  • robinwilson (9/14/2015)


    Also, these ID fields are numeric rather than just int, which may be a way of making the database work on other DBMSs, but I'm not sure whether that also causes a performance hit.

    I swear we are working on the same shoddy 3rd party APP!! I too am confronted with NUMERIC IDs (although IDENTITY in my case). It does have a whiff of Oracle about it ...

    Definitely causes a performance hit 🙁 the key size of NUMERIC is wider than INT, so fewer keys per page and thus more pages / more frequent page splits.
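    A quick illustration of the width difference (numeric(18, 0) here is just an example precision):

    SELECT  DATALENGTH(CAST(1 AS int))            AS IntBytes,      -- 4
            DATALENGTH(CAST(1 AS numeric(18, 0))) AS NumericBytes;  -- 9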

    Pretty good chance that you can swap NUMERIC for INT in this situation though ... but whether it is a good idea to do that I'm less sure. If the Vendor approves it, after you've tested that it appears OK, then I reckon that would be worth doing - after all the other massive improvements have been made and you have got bored with polishing your halo!!

    I was wondering whether you could replace your NextAvailableID table with something that intercepted the calls and handled them better - e.g. using SEQUENCE instead. I haven't thought about it very deeply, other than to convince myself it would not be possible(!), but the idea of having a trigger that rolled back the attempted UPDATE, and a VIEW instead of the table which somehow magically hooked up to a SEQUENCE ...
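    For reference, if the vendor ever lets you near it, the SEQUENCE equivalent of a central ID table is simple (SQL 2012 onwards), and it even supports the "reserve me a block" idea via sp_sequence_get_range - a rough sketch:

    CREATE SEQUENCE dbo.GlobalID
        AS bigint
        START WITH 1
        INCREMENT BY 1
        CACHE 100;   -- pre-allocates a block in memory rather than hammering a single-row table

    -- A single ID
    SELECT NEXT VALUE FOR dbo.GlobalID;

    -- Reserve a contiguous range of 1,000 IDs
    DECLARE @first sql_variant;
    EXEC sys.sp_sequence_get_range
            @sequence_name     = N'dbo.GlobalID',
            @range_size        = 1000,
            @range_first_value = @first OUTPUT;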

    "X has user-created statistics. This indicates that someone is being a rocket scientist with the stats, and might actually be slowing things down, especially during stats updates."

    I enjoy the humour Brent Ozar's crew manage to inject into their products. Glad it has proved useful, even if only to clarify the scale of the problem ...

    "Maybe Poorly Performing Legacy 3rd Party Pile of Pants Program PPL3PPPP is more appropriate!"

    Ah ... it's got 6 P's in it so could well be related to Proper Planning Prevents Piss Poor Performance? 🙂
