• It's an interesting question.

    Or is it really two quite distinct questions?

    (1) Will there be a great increase in the proportion of data in relational databases that is big chunks of character or binary data? And

    (2) will there be a great proliferation of types recognised by relational database systems?

    Perhaps these two questions have different answers: I suspect the answers are (1) Yes and (2) No, but would be much happier if I thought that system designers in general were bright enough to ensure that the answers were (1) Yes but not as much as Steve suggests and (2) No.

    I think it's important to keep a distinction between operations that ought to be provided by the database and operations that ought to be provided outside the database. People have different takes on where the boundary lies, of course, but I imagine a majority are sufficiently rational to conclude, for example, that natural language translation, summary generation, and Bowdlerisation (three perfectly reasonable collections of operations on a properly defined type that I guess I had better not call "text", as T-SQL already misuses that term for something else, so I'll call it "sdws" instead) should not be database functions, so the database will just see that text as character string objects and will never have a "sdws" (or whatever you want to call it) type. With some large binary object types there are, I think, similar boundaries: if we look at image types, will the format conversion between 24 bit colour and 16 bit colour be a database operation? I think not, because this seems irrelevant to data management. Will resizing, including of course anamorphic resizing, be a database operation? Again I think not. How about gamma adjustment, colour adjustment, contrast adjustment, cropping? I find it hard to see any of those as database operations. Will the database do format conversion between JPEG and PNG and TIFF and BMP and so on? How about masking, layer-transparency adjustment, overlaying, twisting, red-eye removal, fish-eye effect, reflection? Will it handle projective distortion? The list of operations that have no place in a database, but without which an image type is a pretty poor imitation of a real image type, is rather long even for straightforward 2D images, and a decent 3D image type seems to be far beyond the scope of a database.

    That isn't to say we won't store images in relational databases – just that the database won't see them as images but as strings of bits (or of bytes – we already have the required type for this, it's called varbinary). So images won't contribute to an explosion of types, and neither will movies, sound tracks, and so on, but that leaves open the question of how often these things are going to be held as binary objects in the relational database.

    Nor does it mean that there will not be new types in databases, just that the new types won't be for the big binary objects that Steve mentioned. Since T-SQL's % (modulus) doesn't function on float, we could maybe do with an angle type; maybe we will get a complex number type; we certainly should get a longer float type (128 bits); and perhaps we will get an unbounded integer type and an exact rational type; what else? I don't know: maybe someone has a use for a point type for the complex projective plane (I certainly don't), but I suspect it wouldn't be popular enough to be worth a manufacturer's while to implement. Maybe some types that represent the truth values of various logic systems? I see no sign of a great proliferation of types, and no reason for it to happen.
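
    To make the angle idea concrete, here is a minimal sketch of what such a type might look like. It is written in Python rather than T-SQL, precisely because Python's % does work on floats; the class name Angle and the choice of degrees normalised into [0, 360) are my own assumptions for illustration, not anything a database vendor has proposed.

```python
class Angle:
    """A hypothetical angle type: degrees, always normalised into [0, 360)."""

    def __init__(self, degrees):
        d = float(degrees) % 360.0
        # A tiny negative input such as -1e-20 rounds up to exactly 360.0,
        # so fold that one case back to 0.0 to keep the range half-open.
        if d >= 360.0:
            d = 0.0
        self.degrees = d

    def __add__(self, other):
        # Addition wraps around the circle.
        return Angle(self.degrees + other.degrees)

    def __sub__(self, other):
        # Subtraction wraps too, so 10 - 20 gives 350 rather than -10.
        return Angle(self.degrees - other.degrees)

    def __eq__(self, other):
        return isinstance(other, Angle) and self.degrees == other.degrees

    def __repr__(self):
        return f"Angle({self.degrees})"
```

    The point of having a type rather than a bare float is that the wrap-around is applied on every operation, so nobody has to remember to do the modulus by hand.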

    The next question is whether images, movies, books, and so on will be in the relational databases. We'll certainly have information about subscriptions and licences and access rights and other meta-data (plot summaries, cast, composer, size, duration) in the databases, but do we want the things themselves? I can think of several reasons why not, if one thinks of entertainment systems serving large numbers of users (especially a mixture of transient users and subscribers). Some of these reasons (all of which are familiar to everyone who has worked in this field) are:

    A) Sometimes the content owner will not give you the right to store the data on your system – you have to stream it from his system; so your db holds a resource identifier for the data, not the data itself.

    B) Sometimes you want to hold different numbers of copies of different films and/or the same number of copies but distributed differently over different streaming servers; in addition, streaming to some clients will need more read-ahead than streaming to others, and some films will be encoded at different bit rates than others (the rights holder usually puts restrictions on encoding methods and bit rates, and different rights holders impose different rules) so your database holds sets of resource indicators, not copies of the data, because it is quite hard (perhaps impossible) to organise the load-sharing for streaming using SQL DDL.

    C) Sometimes (actually very frequently) the content owner wants things securely deleted from your system within some short time (maybe a month) of licence expiry (securely means that the pages it occupied are first overwritten with all 1 bits and then overwritten several times with different garbage patterns and finally overwritten with all 0 bits before being returned to the free space list); it's currently much easier to do this outside the database than inside it (that of course could change in the future).

    D) The rights holder would probably have a fit if his valuable movies were backed up (even in encrypted form) in the manner that database data is usually backed up. He expects recovery – if you lose the movie – to be by getting the original from him again and reprocessing it (including decrypting from his encrypted form and encrypting in yours without the in-clear version ever existing in filestore, and securely clearing the storage that was used for the new holder-encrypted original); the thought that you could restore it from off-site backups would ensure that he would never give you a licence. So you either have inadequate backup for the rest of your data or you don't put this stuff in the database.

    E) How many database server licences do you want to buy? Streaming stuff through the database service instead of direct from the storage media seems somewhat uneconomical (and may have some rather unpleasant bandwidth implications for parts of your network).
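
    For what it's worth, the overwrite sequence described under C is simple enough to sketch when the content lives in an ordinary file outside the database. The following is only an illustration, in Python, with function names (secure_overwrite, secure_delete) and a default of three garbage passes that are my own assumptions. It also assumes the filesystem really does overwrite in place; on SSDs, journalling filesystems, and copy-on-write filesystems that assumption may not hold, so a real system would need more than this.

```python
import os
import tempfile


def secure_overwrite(path, garbage_passes=3, chunk_size=1 << 20):
    """Overwrite a file in place: all 1 bits, random garbage, then all 0 bits."""
    size = os.path.getsize(path)

    def one_pass(make_chunk):
        # "r+b" opens for update without truncating, so writes land on
        # the existing contents from offset 0 onwards.
        with open(path, "r+b") as f:
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(make_chunk(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass to storage

    one_pass(lambda n: b"\xff" * n)      # first pass: all 1 bits
    for _ in range(garbage_passes):      # several passes of different garbage
        one_pass(os.urandom)
    one_pass(lambda n: b"\x00" * n)      # final pass: all 0 bits


def secure_delete(path, garbage_passes=3):
    """Overwrite as above, then hand the space back by removing the file."""
    secure_overwrite(path, garbage_passes)
    os.remove(path)


if __name__ == "__main__":
    # Stand-in for an expired title; a real file would be far larger.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"pretend this is an encrypted movie")
    secure_delete(tmp.name)
    print(os.path.exists(tmp.name))
```

    Doing the equivalent inside a database would mean controlling exactly which pages a value occupied, plus any copies in logs and backups, which is why it is so much easier to do outside.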

    Tom