• craig 81366 (11/1/2012)


    L' Eomot Inversé (10/31/2012)


    craig 81366 (10/31/2012)


    Then why when I run the following by itself, does the plan not spool?

    SELECT AVG(COL) FROM TABLE_NAME

    Because a displayed MS SQL query plan only shows spool operations when they are not blatantly obvious (which I think is eminently sensible); no other reason.

    Whereas I can agree to disagree on the previous point, here I must step in and call a spade a spade.

    I can only assume your insistence on pursuing this is that you don't fully understand what a spool operation does.

    Not pulling punches: I consider your claim to be utterly unsubstantiated, unfounded rubbish!

    There were "spool operations" way back in ancient times, many years before I got involved in computing (I started computing back 1963 at Oxford with some of Lucius Fox's numerical methods stuff, then some more odd bits in 1964 at STC, worked seriously on it in 1966 at RHEL, played with it while a research student at Bristol U in 1966/67, but only switched from being a mathematician to being a computer scientist some time after I joined NRL at the end of that academic year). What a spool operation did then - and still does now - is read some data from permanent store or working store, execute some transformations on it (the transformations may be null) and place the result or those transformations in some temporary store which is preserved until all subsequent already planned operations requiring the transformed form of that data are completed, at which point that transformed data (usually called "the spool") is discarded. This is a general computing concept dating from the 1950s, long before the existence of any RDBMS. The idea that term "spool operation" has changed its meaning just to cope with RDBMS is complete insanity, since the term is still used every day in connection with passing data to networked destinations (eg to a networked printer). The purposes of spooling are at least threefold: (i) to avoid unwanted repeat computation; (ii) as a direct consequence of (i), to ensure that a single value (the spooled value) of the (derived) data is used for all dependent tasks; and (iii) to allow use to be made of the spool even though the original data may be (temporarily) inaccessible (probably irrelevant today, but enormously important when I first worked in data communications). It is of course irrelevant whether the only use of the spool before destroying it is to format it and output to a console (particularly when that output is asynchronous). If you have a different understanding of "spool", then you are just plain wrong - in fact if you want to disagree with that definition then you are (to use the style of personal abuse that you have chosen to introduce into this dialogue) an ignoramus, and your insistence that your utter misunderstanding of the term is correct while my accurate understanding is incorrect is a very clear indication that you are not just an ignoramus but an arrogant and self-important ignoramus. Of course all of us, not just MS, normally ignore the trivial spool where output is to a single console which just happens to be active when the spool is created and has a reasonably close to synchronous (albeit asynchronous) connection to the task creating the spool.

    It effectively reads all relevant data into temporary storage before proceeding to the next step. This isolates the data from changes that could be effected in said step. It can also be a relatively expensive operation, and so is not nearly as trivial as you seem to think.

    It doesn't read all relevant data and put it into temporary storage. It reads all relevant data and puts some value computed from it (perhaps as little as 1 bit computed from a terabyte of original data) into temporary storage. It matters not one jot (just to correct something else you got wrong elsewhere, and save me the trouble of quoting it) whether it computes the derived data after reading all the original data or computes it on the fly.

    As for being expensive and not as trivial as I think, I should perhaps give you some of my history: in the late 80s I was chief architect of a declarative system R&D project that involved a relational (but definitely NOT SQL) database. One of the concepts being pushed around was that the DML/DDL should be declarative except at transaction boundaries (ie transactions should never see their own afterlooks); I killed that precisely because I understood the cost of spooling. Single statements we could afford (maybe - but if not, we didn't have a viable project); whole multi-statement transactions, no bl***y way! I think that that suggests I understood the cost a few decades ago. Maybe you think I've forgotten?

    The fact is you do not need to spool (first read all data into temporary storage) to calculate an average.

    Ye, you do. It doesn't all have to be there simultaneously, of course. Nobody but you has sugested that for an average one spools the original data, rather than the single datum the average.

    I could carry on by introducing an attack on your method of calculating averages (which any mathematician knows doesn't work unless you have exact numerics with adequate accuracy throughout) but I won't as MS SqQL seems to do it that way. I could be even nastier with your method of recalculating the average as elements are deleted, but I don't see the point in doing so - after all, it is utterly unimaginable that MS is stupid enough to allow the on-the-fly correction as elements are deleted to be done by the method you suggest (if I discovered that it did I'd be recommending to HMG to instruct its suppliers to ban the use of aggregates in SQL Server in any software it commissions except where the ranges and the data types are such that the calculation is provably safe, and I believe that there are enough mathematically qualified civil servants that my recommendation would stick; and it would probably spread to US Gov too; but it won't happen, because MS is not that stupid; in fact it uses what you want to miscall "Halloween Protection" to ensure that that never happens).

    Tom