Database Queries and Scalability

  • Comments posted to this topic are about the item Database Queries and Scalability

  • I think that an essential point is that scalability is considered in the design.

    If everyone agrees on that point then we are then down to selecting appropriate designs and implementation that are specific for each set of circumstances. Having said that there will be some designs/implementations that will suit many circumstances (dare I call them patterns?) and it is a good thing that we actively seek them, document them and discover the caveats of their use.

    I am not going to comment on the implementation discussed because I will end up either being told or telling someone "it depends" πŸ˜‰

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • I don't think the article is applicable to the kind of sites and development I do, is the way I would put this one.

    I can see that for certain circumstances it would be fine but isn't that always the way?

  • Two things:

    - Maybe my browser rendered the page incorrectly, but I had a hard time reading it as it was all black text on a dark grey background.

    - The author seemed to be talking more about the efficiency (or lack thereof) of caching strategies and blaming it on joins. At the risk of oversimplification, it's about when and where joins are performed rather that whether or not they are performed. To little old me over here in the cheap seats of the peanut gallery, it seems more of an architecture issue than a design pattern per se as the author puts it.

    ____________
    Just my $0.02 from over here in the cheap seats of the peanut gallery - please adjust for inflation and/or your local currency.

  • lshanahan (10/21/2013)


    ...To little old me over here in the cheap seats of the peanut gallery, it seems more of an architecture issue than a design pattern per se as the author puts it.

    Architectural pattern / design pattern: just differences in granularity on the way to implementation πŸ˜‰ (Oh, OK there is more to it than that!!!)

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Most of the time I try to design things assuming the scale will increase, and may increase significantly. To me it seems to depend more on how many columns of data we're keeping rather than how many rows. Harder to consider scalability for hundreds of columns verses only a few.

    Tony
    ------------------------------------
    Are you suggesting coconuts migrate?

  • Gary Varga (10/21/2013)


    lshanahan (10/21/2013)


    ...To little old me over here in the cheap seats of the peanut gallery, it seems more of an architecture issue than a design pattern per se as the author puts it.

    Architectural pattern / design pattern: just differences in granularity on the way to implementation πŸ˜‰ (Oh, OK there is more to it than that!!!)

    Point. Still, the article just seemed to harp overmuch on just not using JOINs when his real issue is finding more efficient ways of moving data from front- to back-end and vice-versa while maintaining data quality and integrity among other things.

    ____________
    Just my $0.02 from over here in the cheap seats of the peanut gallery - please adjust for inflation and/or your local currency.

  • I certainly don't like the article referenced, it seems to me that it is a heap of assertions for whose credence no basis is offered. Yes, scalability needs to be a factor in design and in architecture and in selection of technologies, to the extent that scalability is a requirement. But claiming that the use of databases is the one thing that prevents scalability of websites is just unvarnished rubbish; almost always the main cause of lack of scalability is bad design. Bad design such as, for example, using 16 byte binary string identifiers for everything, instead of whichever of 16 bit, 32 bit, and 64 bit integer values provides enough range (even 8 bit in rare cases). Conversion between natural key and these identifiers through look-up in the database should be done rarely, at most once per session for each one, with the results cached in session data, which should not be in the database - - and of course consideration should be given to simplifying the design by using natural keys, which are unlikely to cause any more overhead than the 16-byte monstrosities advocated in the article. Another example of bad design which Mr Ozar seems to believe is a consequence of having a database is that session data is held in the database, something which I have never had the misfortune to see. At least I've learnt something useful from the article: that one should never allow PHP to be used for a website (at least not if what Mr Ozar says it does can't be switched off). Why does Mr Ozar translate that into "never use a database for website"? I can see nothing in use of a database that mandates use of PHP. I've been in the IT business since long before there were any web servers; even in the 1960s we didn't go back to the platform's security database and revalidate the login for every command issued by a MAC user; neither did we ask the platform software to store the current context of each logged in user, instead of having the MAC application do it for itself; the distributed TP systems for which I contributed parts of the infrastructure back in the 70s didn't make the mistake of putting the session data in the database (IDMSX, not an RdBMS at that date) either. Perhaps Mr Ozar's belief is that web designers with access to a database always make mistakes that we already knew, half a century ago for MAC systems and at least 40 years ago for TP, should be avoided.

    Tom

  • L' Eomot InversΓ© (10/21/2013)


    I certainly don't like the article referenced, it seems to me that it is a heap of assertions for whose credence no basis is offered. Yes, scalability needs to be a factor in design and in architecture and in selection of technologies, to the extent that scalability is a requirement. But claiming that the use of databases is the one thing that prevents scalability of websites is just unvarnished rubbish; almost always the main cause of lack of scalability is bad design. Bad design such as, for example, using 16 byte binary string identifiers for everything, instead of whichever of 16 bit, 32 bit, and 64 bit integer values provides enough range (even 8 bit in rare cases). Conversion between natural key and these identifiers through look-up in the database should be done rarely, at most once per session for each one, with the results cached in session data, which should not be in the database - - and of course consideration should be given to simplifying the design by using natural keys, which are unlikely to cause any more overhead than the 16-byte monstrosities advocated in the article. Another example of bad design which Mr Ozar seems to believe is a consequence of having a database is that session data is held in the database, something which I have never had the misfortune to see. At least I've learnt something useful from the article: that one should never allow PHP to be used for a website (at least not if what Mr Ozar says it does can't be switched off). Why does Mr Ozar translate that into "never use a database for website"? I can see nothing in use of a database that mandates use of PHP. I've been in the IT business since long before there were any web servers; even in the 1960s we didn't go back to the platform's security database and revalidate the login for every command issued by a MAC user; neither did we ask the platform software to store the current context of each logged in user, instead of having the MAC application do it for itself; the distributed TP systems for which I contributed parts of the infrastructure back in the 70s didn't make the mistake of putting the session data in the database (IDMSX, not an RdBMS at that date) either. Perhaps Mr Ozar's belief is that web designers with access to a database always make mistakes that we already knew, half a century ago for MAC systems and at least 40 years ago for TP, should be avoided.

    +1000. I just went through some real hooie on one of the apps designed before the current team started. One of the apps checks for the credentials of the user on ever screeen and every object touched. Some of the pull downs check for credentials for every item in the pull down. One click would generate a dozen calls to the same credential proc (and we have a LOT of credentials. It's one of the worst designs I've ever seen but not the only one I've seen like it. Of course, the outcry was "the database has a performance problem". It sure did... the problem was the application using it. πŸ˜‰

    It'll become a problem in the future again because their "fix" was to cache the credentials. It's still making a dozen checks per object clicked. Time for the webserver to start taking the blame for crap code. :hehe: It's a real shame that they don't want to spend the time to fix the bloody root problem!!! :sick:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/21/2013)


    +1000. I just went through some real hooie on one of the apps designed before the current team started. One of the apps checks for the credentials of the user on ever screeen and every object touched. Some of the pull downs check for credentials for every item in the pull down. One click would generate a dozen calls to the same credential proc (and we have a LOT of credentials. It's one of the worst designs I've ever seen but not the only one I've seen like it. Of course, the outcry was "the database has a performance problem". It sure did... the problem was the application using it. πŸ˜‰

    It'll become a problem in the future again because their "fix" was to cache the credentials. It's still making a dozen checks per object clicked. Time for the webserver to start taking the blame for crap code. :hehe: It's a real shame that they don't want to spend the time to fix the bloody root problem!!! :sick:

    This idea reminds me that Mongo DB Is Web Scale. Making a website try to act like a miniDB is actually sort of scary. Is it guaranteed to be written into the DB at some point? Will you be sure that the the user didn't loose connectivity and connect back to a different web server. I have an aircard at home for my web experience. I'm not sure if it is the aircard, the provider network, or my laptop's USB ports that are screwing up but I am having weak signals and dropouts on a regular basis for the last few weeks. Am I guaranteed to go back to the same web server? Can the cache on webserver1 overwrite webserver2?

    Another bad design example is the healthcare.gov website. They needed it to eventually get over 16M people to use it. Could you imagine designing a website that you are dumping every bit of your personal financial and health information into and it is trying to store it in live memory and can only handle 50-60K users at a time[/url]?



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.

  • Sometimes the problem is what the team have implemented.

    Sometimes the problem is the team.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • From the article:

    Queries and joins are bad for your website performance. So is connecting to a database in the first place.

    When I read this I thought of a quote a friend of mine made: "We put the 'DUH' in 'datacenter'." Kinda like the difference between (to borrow the author's waiter analogy) pulling leftovers from the fridge vs a full seven-course meal at an exclusive restaurant. "DUH", it's tons faster to pull from the fridge, but you might not want to eat it depending upon how long it's been there.

    ____________
    Just my $0.02 from over here in the cheap seats of the peanut gallery - please adjust for inflation and/or your local currency.

  • Gary Varga (10/22/2013)


    Sometimes the problem is what the team have implemented.

    Sometimes the problem is the team.

    Gaz - Love it!

    I led a project back in the 80's where I was the project manager and the team architect. We had a large number of people on board and due to a very tight timeline not everything got reviewed as it should by me and parts of the QA was delegated out to a senior developer. One of the dev staff wanted to do things an alternative way and in review the senior developer and I both said no. The developer went ahead on his own and developed it using the unapproved technology. Once his first major component was installed in the development environment we saw that the internals of the machine were showing a huge workload with a small number of transactions being active. Actually one system subroutine was running constantly consuming huge resource allotments.

    When I investigated this in detail by reading believe it or not a huge internal core dump I was able to determine that the problem started and ended with the use of a self-defining data structure in an IBM CICS world. FYI this was mainframe PL/1. When I found this I called the developer in and we discussed the issue and got to the point where he admitted that he went ahead on his own initiative and did it his way even after being told no.

    Sometimes the problem is the team!

    Not all gray hairs are Dinosaurs!

  • The majority of the people I have worked with, worked for or have worked for me have been, at the very least, cooperative team players with reasonable communication skills. It has been a pleasure to know them and work with them but, unfortunately, there are always exceptions.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply