SQL future

  • David Portas (11/18/2009)


    ... They include Dataphor, Muldis and various other open source efforts.

    Interesting that you mention Dataphor. There have been others that have been religous zealots about Dataphor and I did look at it a while back. What was really interesting about it, it was nothing more than a layer of abstraction over (get this) MS SQL Server. How does that solve anything?

    Things may have changed since I last looked at Dataphor, but what good is another layer over something you thing is flawed and broken?

  • A number of RDBMSs and languages have been built and are being used. They include Dataphor, Muldis and various other open source efforts. I'm not sure what problems you are referring to with those systems when compared to SQL DBMSs. Maybe you could elaborate.

    I'd like to know what alternatives you think there are. The deficiencies of SQL are a pressing and very topical problem for a great many people right now. That's why there is so much interest and investment in alternative non-relational models. How else do you propose to improve the capabilities of our DBMSs? Do you really think current problems can be solved within the SQL model and if so, how? Or do you propose using alternative (non-relational) data models instead?

    First, Dataphor isn't a database engine. It's a C# abstraction layer that runs on top of other databases. Second, if you look at the discussion forums for the product, it has a number of major issues. The only places it receives unending praise are its own marketing pages on its own site.

    Can't say I'm familiar with Muldis. I searched for it, and it looks like it has verbosity issues (a simple left outer join takes 14 lines of code, as written by it's original developer), but I'm not familiar enough with it to judge the viability of it one way or the other.

    Personally, I don't think there is a viable alternative to SQL at this time. I think one or more will evolve, and SQL will go the way of COBOL, PASCAL and FORTRAN. But keep in mind that COBOL still has a simply titanic install-base.

    My assertion is not that SQL is perfect. Far from it. For one thing, OOP devs often seem to have horrific problems with comprehending the paradigm at all. That barrier is a real barrier, is not an indictment of OOP devs, and needs to be solved. But it needs to be solved in such a way that it doesn't create more problems than it solves.

    Quite likely, as with most human disciplines, specialization will evolve enough that OOP devs won't need to work with SQL, not because SQL will go away, but because it will be hidden behind the scenes enough to effectively disappear.

    Take a look at metalworking for an example of this sort of evolution. It used to be, way back when, that the same person mined, refined, and worked the metal into final products. Very inefficient. Now, the mining, refining, working, etc., are all done by specialized industries that very rarely overlap. I seriously doubt that any of the welders who work for Toyota would be able to identify a potential iron mine, much less work out how to best extract the ore from the ground, nor do they need to know that. As well, the tools for it have evolved into a tremendous diversity of highly specialized machines.

    Every human industry I can think of works that way. Starts out with a lot of ineffiency and everyone has to know everything in order to get anything done at all, using tools that are barely adequate to the job, then evolves towards more and more specialization and more and more efficiency, with better and better tools that automate more and more of the process.

    I think some people are still looking for a magical transformation, not an evolution. I expect an evolution. And I expect that it will be incremental, and will end up somewhere none of us could have anticipated.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Lynn Pettis (11/18/2009)


    David Portas (11/18/2009)


    ... They include Dataphor, Muldis and various other open source efforts.

    Interesting that you mention Dataphor. There have been others that have been religous zealots about Dataphor and I did look at it a while back. What was really interesting about it, it was nothing more than a layer of abstraction over (get this) MS SQL Server. How does that solve anything?

    Things may have changed since I last looked at Dataphor, but what good is another layer over something you thing is flawed and broken?

    Oh!! Snap!

    Sorry, that was just the first thing that came to mind..

    CEWII

  • GSquared (11/18/2009)


    I think some people are still looking for a magical transformation, not an evolution. I expect an evolution. And I expect that it will be incremental, and will end up somewhere none of us could have anticipated.

    I think this is probably the way it will end up being as well.. The end result will be nothing like we are arguing about now..

    CEWII

  • David Portas (11/18/2009)


    GSquared (11/18/2009)


    Interesting problem, but easily solved by adding a foreign key constraint to the Orders table, referencing the OrdersItems table (a join table between Orders and Items).

    Then how would you insert a new Order or populate the OrderItems table?

    Null does not (accurately) represent the property of being unknown. That's just your assumption or convention, not anything actual or implied by standard SQL. That is demonstrable by the straightforward fact that x = x is True in mathematics and reality if x is unknown but is NOT true in SQL.

    If you are seriously going to use quantum mechanics as a criteria for the design of database systems then I have nothing much more to say! If you are really going to be that silly then you would have to disallow ALL deterministic results of any kind from the database wouldn't you? No, don't bother to answer that... :crying:

    I didn't say null equals nothing. What I meant was that SQL wrongly returns a null as the sum of an empty set, instead of the mathematical and real world answer of zero.

    Null is exactly defined as being an unknown value.

    <programming> A special value used in several languages to represent the thing referred to by an uninitialised pointer.

    <database> A special value that may be stored in some database columns to represent an unknown, missing, not applicable, or undefined value. Nulls are treated completely differently from ordinary values when evaluating SQL expressions and there are several SQL constructs for dealing with nulls.

    Reference: Free Online Dictionary of Computing

    Check ANY reference on SQL and you'll find this is far from some assumption I came up with on my own. It's the standard, accepted definition. Check a good dictionary.

    On the point of x=x, yes, that's mathematically true. Ask any decent mathematician if you can validly compute 5 + x without knowing what the value of x is. Most high school algebra graduates can answer that one correctly. You have the option in SQL of allowing that null = null, but unless you are comparing the same two nulls, you can't be sure.

    For example, if you have an item on an unknown order. You know that Joe is going to order a sandwich for lunch, but you don't have an order number yet for the cash register, and don't yet know what else, if anything, will be ordered, can you really assign that sandwich to EVERY order that has an unknown cash register receipt ID? If you use null = null in that real-life situation, you're going to create HUGE problems for your business. Yet, per your assertion, null = null because x = x, and you're going to add a sandwich to every order for every customer who's still in line to go to the register.

    With your empty set example, there you go again assuming that null means something other than unknown value. What's the sum of the weight of all wamingungumas in the world? Per your logic, it's zero, because the weight is currently null, because you can't possibly know what the unit weight is.

    My statement about mathematics being an abstract that any decent scientist understands isn't the real thing is not limited to quantum physics. That's an absurd statement. If you have an apple in your left hand and an apple in your right hand, you don't have to go down to the level of them being made of different energy quanta to know that they aren't truly identical. Put them on a ballance, and odds are, they won't even weigh the same amount. We're talking grams here, not electron-volts. Take a good look at them in reasonable light, and I'll bet you can, by eyeball, identify differences in shape, color, length of stem, dimensions. Bite into them, and by that crude sense known as taste, you'll probably find that one is sweeter than the other, one perhaps more tart. If you didn't get that when you took grade school science, and weren't taught that math is an artificial abstraction, then you missed out on a very, very important fundamental to all human knowledge.

    Mocking me by making asinine statements about what I wrote doesn't actually win the argument for you. It doesn't make your point. You need to actually defend the statement that a "true relational database" really solves real world situations. You haven't done so yet. Haven't even tried.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • All my replies were intended in a positive and constructive spirit of contribution. I hope we can keep it that way. But as I said, I really didn't want to get drawn into a tedious discussion about nulls because it isn't the main topic of this thread.

    The dictionary definition you posted I think confirms my point. Null doesn't mean "unknown". It means (your quote) "unknown, missing, not applicable, or undefined value". In other words it can mean lots of different things and sometimes it means different things to different people. Inapplicable is not the same as unknown. The key point surely is that SQL does not treat null as meaning "unknown", as evidenced by "x=x" or "SELECT SUM(x) sumx FROM tbl WHERE 1=0;" among many, many other examples.

    GSquared (11/18/2009)


    With your empty set example, there you go again assuming that null means something other than unknown value.

    I didn't assume that but SQL does. The example I had in mind was this one:

    SELECT SUM(x) sumx FROM tbl WHERE 1=0;

    which is the sum of an empty set. The correct answer is clearly not unknown. It is a known, definite value: Zero. Therefore SQL is using null to represent something other than "unknown".

    GSquared (11/18/2009)


    You need to actually defend the statement that a "true relational database" really solves real world situations. You haven't done so yet. Haven't even tried.

    I already posted a very real example of one such a problem, one that you thought could be solved in SQL. But you haven't provided a working SQL solution or any other alternative yet.

  • Lynn Pettis (11/18/2009)


    David Portas (11/18/2009)


    ... They include Dataphor, Muldis and various other open source efforts.

    Interesting that you mention Dataphor. There have been others that have been religous zealots about Dataphor and I did look at it a while back. What was really interesting about it, it was nothing more than a layer of abstraction over (get this) MS SQL Server. How does that solve anything?

    Things may have changed since I last looked at Dataphor, but what good is another layer over something you thing is flawed and broken?

    Yes of course Dataphor is only an abstraction layer. But at least it's a relational abstraction layer. You could do a lot without having to deal with the SQL underneath and then when a better alternative engine comes along you can potentially slot that in without too much trouble.

    The fact is that thousands of organisations are migrating their systems to non-relational abstraction layers or non-relational DBMSs because they are sick of dealing with the complexities and limitations of SQL. One day they will be able to ditch SQL altogether and move on to a better DBMS. Unfortunately if they are using a non-relational abstraction layer then that new underlaying data store very likely may not be relational either. I think this is a shame because the relational model has a lot to offer.

    This is the reality at the moment and perhaps it is happening at least partly because SQL professionals spend too much time defending the status quo without proposing any new solutions to the real problems that people have with SQL DBMSs!

    I'm definitely not advocating Dataphor or even any database abstraction layers generally. I'm saying we need to recognise the limitations of SQL and encourage and support its replacement by non-SQL Relational DBMSs. If we don't do that then there's a danger that the future will be non-relational as well as non-SQL because non-relational is stealing all the limelight at the moment.

  • I already posted a solution to your "impossible problem". A foreign key, or a simple trigger, or one line of code in the insert proc, or any number of other solutions. You expressed that solving this would have a heavy cost or have to be done outside the database. I'm asserting simply that it can be done very easily inside the database using simple SQL tools.

    Here's one simple way to solve this. Took me about 2 minutes to write all of this:

    create table dbo.Orders (

    OrderID uniqueidentifier primary key,

    Account uniqueidentifier not null);

    go

    create table dbo.Orders_Items (

    OrderID uniqueidentifier not null,

    ItemID uniqueidentifier not null,

    primary key (OrderID, ItemID);

    go

    create proc dbo.OrdersCreate

    (@Account_in uniqueidentifier,

    @ItemIDs_in XML)

    as

    set nocount on;

    declare @OrderID uniqueidentifier;

    select @OrderID = newid();

    insert into dbo.Orders_Items (OrderID, ItemID)

    select @OrderID, Item.ID.value('(/Items/@ID)[1]','uniqueidenfifier')

    from @ItemsIDs_in.node('Items') Item(ID);

    if @@rowcount = 0

    begin

    raiserror('Failed to create items for order', 16, 1);

    rollback;

    end;

    else

    insert into dbo.Orders (OrderID, Account)

    select @OrderID, @Account_in;

    Want a foreign key from Orders_Items to Orders? Easy, insert into Orders first, and issue a rollback if you get zero rowcount in Orders_Items.

    Want to prevent deleting the last item from an order? Easy, have an On Delete trigger.

    Quite obviously, whomever is claiming this is near-impossible, or difficult, or expensive, just doesn't know SQL very well. I don't know who made that original assertion, since you seem to be stating that you read about it in a textbook. Whoever they are, I doubt their expertise on the subject.

    As to the bit about the sum of an empty set, I'll grant that a poorly written query like that will give you Null in T-SQL when maybe it should give you zero. Again, this isn't a real-world issue that can't be solved very, very easily. Wrap the query in Coalesce/IsNull.

    My challenge to you was to come up with something that (a) can't be solved easily, AND (b) where the "true relational" solution doesn't cause more problems than it solves.

    Everything you've given so far is very, very easy to solve in SQL. Or it's just spin (like your crying emote), which doesn't contribute to the discussion.

    It would take a HUGE expense and effort to replace all SQL databases with something "truely relational". Over a large amount of time, that expense can be spread out to where it doesn't overcost the benefits. But, so far, in this discussion and every other I've seen on the subject, the benefits are tiny, imaginary, or expensive, while the cost is much higher.

    My point, again, is that I have yet to see an argument made on the subject that actually shows a positive cost/benefit. I firmly believe there will eventually be one. Very few technologies withstand the true test of time. Tools like knives, clubs, fire and the wheel are persistent. Not much else is.

    SQL will, I'm sure, go the route of the buggy whip one day. But the replacement technology hasn't, to my knowledge, been invented yet. I expect that it will be something that will evolve from SQL, as opposed to something completely different.

    You're making a radical claim. You need to back it up with something more solid than "It takes a few seconds of work to make this happen in SQL".

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Your SQL solution does not solve the problem because it is still possible to create an order in the Orders table without any corresponding Order_Item.

    You suggested using a trigger but a trigger is not a constraint, which is what the problem specified. A trigger is procedural code rather than declarative. So this is one example of the point I made earlier in reply to Elliott: SQL's limitations are to blame for the amount of procedural code and row-by-row processing that programmers are forced to write. Not all DBMSs even have set-based triggers. Set-based or not, a trigger is just a procedural crutch to fix a problem caused because SQL cannot enforce a very simple constraint.

  • GSquared (11/18/2009)


    SQL will, I'm sure, go the route of the buggy whip one day. But the replacement technology hasn't, to my knowledge, been invented yet. I expect that it will be something that will evolve from SQL, as opposed to something completely different.

    You don't think SQL will be replaced by RDBMS? That would be a pity.

  • David Portas (11/18/2009)


    Your SQL solution does not solve the problem because it is still possible to create an order in the Orders table without any corresponding Order_Item.

    You suggested using a trigger but a trigger is not a constraint, which is what the problem specified. A trigger is procedural code rather than declarative. So this is one example of the point I made earlier in reply to Elliott: SQL's limitations are to blame for the amount of procedural code and row-by-row processing that programmers are forced to write. Not all DBMSs even have set-based triggers. Set-based or not, a trigger is just a procedural crutch to fix a problem caused because SQL cannot enforce a very simple constraint.

    That's a straw man argument.

    An SQL trigger is no more nor less procedural than a constraint. Both are executed by procedural code in the datbase engine. It's not like I'd be writing the trigger in machine-code in any database product, relational, OO, hierarchical, or otherwise. Not because that can't theoretically be done, but because I don't write machine code.

    Saying, "SQL is flawed because constraints can't be used outside of the scope for which they were designed", is like saying toothpicks are flawed because you can't drive nails with them.

    You posed an engineering problem, inserting a row into an orders table after enforcing that there be at least one item in a table that defines what items go with what orders. You said this problem cannot be easily solved in SQL. Yes, given the rule that it must be done purely with constraints, you're right. By that logic, if I tell you to solve pi to the billionth decimal place just using your fingers, your hands are flawed and need to be replaced with a supercomputer. You'll no longer be able to brush your teeth, but that won't matter much because, being immobilized by being grafted to a ton or more of metal, you probably won't be in a position where fresh breath and good oral hygiene matter.

    You posed an engineering problem. I solved it with the tools available. I'm sorry I used a screwdriver to drive a screw, and didn't use a hammer for that. But that's NOT a valid argument that the toolbox is flawed.

    As for row-by-row processing, there is none in what I wrote. It's very, very rare that it is "required", as per your post. In almost every case I've seen where data manipulation was being done on a row-by-row basis, it was because the developer didn't know that screwdrivers exist and was thus using a hammer (to stretch that analogy).

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • David Portas (11/18/2009)


    GSquared (11/18/2009)


    SQL will, I'm sure, go the route of the buggy whip one day. But the replacement technology hasn't, to my knowledge, been invented yet. I expect that it will be something that will evolve from SQL, as opposed to something completely different.

    You don't think SQL will be replaced by RDBMS? That would be a pity.

    Who knows? Since there really aren't any, it's impossible to say.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Leaving aside the well-known disadvantages of procedural code, a trigger is not a substitute for a constraint because it does not enforce the rule. It allows the data to get into an invalid, inconsistent state and then corrects it.

    The advantages of declarative referential integrity over procedural code are pretty well known I think. Significantly, declarative code is available to the optimiser for optimising other queries and performing query rewrites. Procedural code cannot do that. "Declarative, Not Procedural" is in fact one of the guiding principles of enforcing business rules (the Business Rules Manifesto). See: http://www.brcommunity.com

    So doing it declaratively is a legitimate requirement and a workaround using procedural code is not nearly equivalent.

    GSquared (11/18/2009)


    David Portas (11/18/2009)

    You don't think SQL will be replaced by RDBMS? That would be a pity.

    Who knows? Since there really aren't any, it's impossible to say.

    There are though! They exist already and there are bigger and better ones in development. The relational model needs advocates now, not tomorrow. It doesn't benefit from discussions about the finer points of SQL because that argument is already lost. Customers are and will abandon SQL where they can even if they have to replace it with non-relational systems. That's why I think there is an important debate to be had about the future, and SQL database professionals should be leading that debate.

  • You still haven't posted the non-SQL relational solution to the problem that you posed.

    I looked at the sample "true relational" databases you posted earlier. They are only "true relational" by redefining the term "relational", so far as I can tell.

    Also, based on adoption volume, it's really hard to say that companies are abandoning SQL databases in droves. Take a look at non-SQL DBA jobs available, and you probably won't find any anywhere. I sure don't see any. (I'm including all SQL-based databases in that, not just MS-SQL. An Oracle job is still SQL by the definitions used in this discussion.) At the same time, I've had six employers approach me about jobs in the last two months. This while unemployment in the US is over 17% (real unemployment) per the federal government.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (11/18/2009)


    You still haven't posted the non-SQL relational solution to the problem that you posed.

    Here's one I did using Rel (http://dbappbuilder.sourceforge.net/Rel.php)

    // Create OrderItem and Order

    VAR PurchaseOrder REAL RELATION {

    OrderNum INTEGER,

    CustomerNum INTEGER} KEY{OrderNum};

    VAR OrderItem REAL RELATION {

    OrderNum INTEGER,

    ItemNum INTEGER,

    ItemQty INTEGER} KEY{OrderNum,ItemNum};

    // Constraint: Order must have items

    CONSTRAINT OrderHasItems (PurchaseOrder{OrderNum} = OrderItem{OrderNum});

    INSERT PurchaseOrder RELATION {

    TUPLE {OrderNum 1, CustomerNum 123}

    },

    INSERT OrderItem RELATION {

    TUPLE {OrderNum 1, ItemNum 1001, ItemQty 100}

    };

    I looked at the sample "true relational" databases you posted earlier. They are only "true relational" by redefining the term "relational", so far as I can tell.

    What makes you say that? An RDBMS is one whose data structures are relations and whose data operators are relational operators (equivalent to the relational algebra).

    I don't think searching for DBA jobs is any good indication of non-SQL DBMS adoption. There's a list of some of the non-relational stuff at: http://nosql-databases.org. These all seem to be getting adoption by big name firms in real systems judging by the feedback I've seen and heard. That is only the tip of the iceberg however. My company sells some XML database software that has been around for quite a few years. There are lots of other schema-less alternatives.

    Another hugely important factor is the rapid adoption of persistence-ignorant frameworks and O/R Mappers. These basically represent a SQL exit strategy for future development and their advocates are quite open about that goal. They can and will plug and play those apps with non-SQL DBMSs and I'd prefer those replacements to be relational rather than not.

Viewing 15 posts - 16 through 30 (of 36 total)

You must be logged in to reply to this topic. Login to reply