trubolotta (10/15/2010) ... The "corporate" goal to reduce admin user time per technical report succeeded (10 mins. to 7 mins.) while ignoring professional user time (30 mins. to 120 mins.). ... The net result, in my opinion, is to produce low quality apps with the illusion of lower costs.
Ahhh yes the illusion of lower costs. Another perpetual example of this is what I call "silo management" (others might call it "left hand doesn't know what the right is doing").
Two of the cost centers in every company are "capital equipment" and "human resources". Each seems to have a separate mandate to reduce its own costs. The capital equipment people are responsible for buying servers. When it comes time for new ones, they gauge the current disk space requirement and then buy new machines with the minimum disk space that will meet that requirement. Of course by the time the new machines arrive, the requirement is obsolete and developers are out of disk space. The developers then proceed to burn their hours daily and miss deadlines monthly and lose business yearly because they must spend their time archiving and restoring lower-priority data instead of developing applications. But the capital equipment expense has been minimized, you betcha.
I cannot remember the last time I worked on a project where the staff was not doing the "disk space disco" on at least one machine. Every medium-sized company with software developers, I would guess, saves thousands on hardware every year while spending hundreds of thousands or perhaps millions paying developers to shuffle data back and forth. I am reminded of Dr. Evil, whose staff had to politely explain to him that "trillions" are bigger than "billions".
David Ziffer (10/13/2010)
RAP is intended for use primarily by small to medium projects ...
I have worked on several large projects where the project is informally storing every version of every record ever entered in many of its tables, but it's always done badly. Every one of these projects has accomplished this by using "effective date" fields or "deleted" flags. These mechanisms are horrendous because 1) every query that looks into any table has to account for the fact that there are possibly multiple versions of every record and it has to find the latest (or valid) one, 2) the presence of such data requires that keys and indexes accommodate this "archiving" (rather than representing the natural current structure of the problem) and 3) of course the performance is a disaster because in addition to every query having to do the extra query work described in "1" above, the vast majority of data in the tables is obsolete, yet the server must look through it on every single query for current data (which is 99% of the queries).
Hopefully this will only be used on small projects that do not have to be maintained and do not have to scale.
Having been on the receiving end of consultant\vendor created apps that are built using techniques very similar to this, it is definitely not an approach that should be followed for anything other than small, quick and dirty, one off apps that won't be around for very long. Trying to transfer the whole RAD\Agile philosophy into the database design and implementation world is at best just lazy, at worst a huge waste of time and money as the project ends up being reworked from the ground up. Maybe there will come a time where CPUs are so fast and memory so cheap that the extremely poor performance from databases designed using these techniques won't matter, but that time is not now. The performance and quality of any serious enterprise application depend on the design of the database and the abilities\experience of those who are writing the SQL to access the databases. Some may think that designing a database can be written into a few "rules" to follow and then code those rules into a generator, but that just reveals how little is actually known about the database design process. I am not going to rehash the issues and problems with using synthetic keys as these have been gone over time and time again. Just because they make it easy for a programmer does not make them right for a database design. The one thing programmers\developers refuse to think about is that the data is not part of the program, the data is part of the corporation and that data and hence the database design needs to last long after the program\application is obsolete. As such, the database design requirements should be analyzed and worked out independently of the application, not as part of coding the application.
With regards to your second remark about date fields and keeping all data in the "current" table, I would suggest that there is nothing at all wrong with that approach depending upon the business requirements. I, also, have worked in a place where all data was kept in the "current" table and no rows were updated, other than to expire or logically delete them. Keeping them in the "current" table was the only way to meet the business requirements that they be able to see the data as if they were "traveling in time", what the organization called "full point-in-time". To briefly explain it, they needed to be able to display, for legal reasons, exactly what a user would have seen at any point in the past. But the full concept is much more complex than that. Suffice it to say that four different dates are required for each row,(created, effective,expired ,logically deleted) and while the SQL was complex, it was by no means a performance disaster. Every query was significantly sub-second as the SLA between the application group and the business was that every screen change had to be sub-second. This is a very large scale government application with over 40,000 online users. I cannot say the same for the "generated" database designs and applications that we have receiving lately from vendors and contractors. They are the ones that have been performance nightmares and, because the developers don't go anywhere near the database but leave their "middle tier" framework to generate every thing for them, they haven't a slightest clue about how to tune the code or design. The "get the historical data out of the current table" response to poor performing queries is typical of quick and dirty solutions that end up becoming a nightmare down the road. We have a couple of legacy systems where someone decided that was the way to go. Now, every single query has to run against both the current and historical databases because the business needs to access data that was not considered "current" by the developers. The problem, as I see it these days, is that people are so intent on getting something done quickly, they look for and use quick solutions to build things out their realm of expertise. There is a reason why people specialize in certain areas and that is because the complexity of the field requires the expertise of a specialist for anything beyond a very simple application that may have only a couple of users.
No one has ever been able to convince me that any of these code-generating\database-generating solutions is solid enough for serious use. They might be good enough to generate a prototype\mockup, but I would never stake my career or reputation on the use of such a product for a production quality application\database. As such, I will always recommend against them when management comes asking for input from the senior IT staff. I understand that those working contracts want to get the code\apps done quickly and get their pay check, but those of us who work for the company want to receive products that perform, are maintainable, and have vendors that are able to support performance\scalability issues when they arise. When we receive an app that is obviously generated and not thoughtfully coded, that vendor\contractor is noted and will not receive my recommendations in the future.
Senior IT Analyst - Database
Manitoba Public Insurance Corporation
David Ziffer (10/13/2010)
Unfortunately in this day & age we still don't have relational databases that allow us to divide our various objects into different name spaces, so this task is left to the developer. At some point we are potentially going to want to name one thing "table X" and another thing "view X", and the only difference in the names will be whatever we choose to designate something as being a "view" vs. "table". If we choose nothing to designate tables from views, then we have to start making arbitrary name choices in order to accomplish this, but the whole point of RAP is consistency, and its code generators never make arbitrary choices.
If you have a relation (table) and a derived relation (view) of the same thing, your design is nonsense. So is a view and a table need their names distinguishing by |TBL and VW prefixes you need to wake up and fix the problem that you have represented the same thing twice instead of once.
Tom.Thomson (10/17/2010)[hrIf you have a relation (table) and a derived relation (view) of the same thing, your design is nonsense. So is a view and a table need their names distinguishing by |TBL and VW prefixes you need to wake up and fix the problem that you have represented the same thing twice instead of once.
Nowhere did I propose that anyone construct a view and a table which when queried directly display the same content. However in a separate post I did discuss a useful concept that RAP actually uses, which is to have a stored procedure and also a user function that both perform the same operation (and consequently should have identical names except for whatever designates them as being a procedure vs. a function).
David Ziffer (10/17/2010)
Nowhere did I propose that anyone construct a view and a table which when queried directly display the same content. .
No. However you did say
At some point we are potentially going to want to name one thing "table X" and another thing "view X", and the only difference in the names will be whatever we choose to designate something as being a "view" vs. "table".
and it seems absolutely clear to me that if you do that you have some real confusion about what your "X" is.
Having been on the receiving end of consultant\vendor created apps that are built using techniques very similar to this, it is definitely not an approach that should be followed for anything other than small, quick and dirty, one off apps that won't be around for very long.
Dear Mr. Dundas: It appears that you are dismissing RAP based upon prior experiences with other products or concepts, and that you presume that because past efforts at higher levels of automation have not met your expectations, all future efforts are doomed to failure. I suggest that eventually someone will succeed.
Trying to transfer the whole RAD\Agile philosophy into the database design and implementation world is at best just lazy, at worst a huge waste of time and money as the project ends up being reworked from the ground up.
I am not an advocate of Agile philosophy. I am an advocate of generating mundane code that otherwise would almost certainly be poorly written and maintained by hand, and of pre-designing things that most people would design poorly given the chance (like auditing for example). Eliminating useless handwork does not constitute an advocacy of Agile methodology, any more than advocating the use of any other code-generating technology (such as compilers vs. assemblers) does.
With regards to your second remark about date fields and keeping all data in the "current" table, ...
RAP does not keep "all data" in the "current table". My article discussing this hasn't even appeared yet. RAP keeps only current data in the "current" table, allowing the application to properly delete obsolete records (and thereby respect referential integrity rules). All other auditing designs I have seen are considerably less systematic and rigorous, requiring the pollution of current data with former data in the same table. The RAP auditing design is discussed fully in the upcoming Part 3 of this article series.
I, also, have worked in a place where all data was kept in the "current" table and no rows were updated, other than to expire or logically delete them .... and while the SQL was complex, it was by no means a performance disaster
I am confused here by your use of my RAP term ("current table") to describe an application that is completely opposite to what RAP does. In RAP, the "current table" stores only current data and such data is indeed both updated and deleted.
I am glad that you were able to manage this design you describe and get good performance with it, but good performance and good design are not the norms that I have observed with such projects. Since your project never truly deleted records, presumably you could not then have the database assist you in maintaining referential integrity. And you yourself say the SQL was complex. Well the RAP code for accessing both current and archived data is both fast and simple (this is described fully in the upcoming Part 4 and Part 5 articles). In RAP, the SQL author is absolved of knowing whether data is current or archived, yet nonetheless his code can access both types of data with no more effort than an ordinary query would require to simply access current data.
Obviously it is possible to create a system where the database server must look through tons of obsolete data in order to find current data, and where the queries to access current data must be always written so as to eliminate the deleted data from the results, but why would you want do to this?
Now, every single query has to run against both the current and historical databases because the business needs to access data that was not considered "current" by the developers.
RAP does not have this problem; it runs against one or the other and it knows ahead of time which one to hit. The mechanism is described thoroughly in upcoming article series parts 4 and 5..
No one has ever been able to convince me that any of these code-generating\database-generating solutions is solid enough for serious use.
I cannot speak for your experience or the quality of what other vendors. I can say only that I have never seen anyone or any product solve the problems that RAP does in the way that RAP solves them. Never, ever, anywhere. Please take a look and see if it at least warrants some consideration.
I am taking a fundamentally different approach here than the people you've dealt with in the past. First of all I'm not forcing it on you, so you can feel free to ignore it. I deliver it in source form so if you don't like the way it does something you can change it. But most of all, it is being presented to the development community (via this article series) for everyone to look at it, play with it, and decide whether it at least contains some components or ideas worth using.
You are of course free to bet against the further future systemization and automation of the code generation process, and I am free to bet against you. Certainly RAP might not survive the test of popularity, and perhaps even my methods will not be the choice of the future, but in my opinion somebody is going to automate code generation to a far higher level of abstraction than what we are doing today. RAP is just my entry into the game.
Could you comment on how RAP may or may not use datamodels? I currently have a love-hate relationship with ERwin. But I can't really imagine wanting to start or maintain a database without ERwin. One of the things that I like about ERwin is that I can design the tables graphically, quickly enter certain "data types" consistently, create relationships consistently with a quick drag and drop, etc. And when it is all done, I click a button and all the code I need (for my tables) is generated automatically.
Of course, ERwin does not automatically generate the audit tables or the basic stored procedures that RAP does. I haven't downloaded your RAP app yet, but from reading your descriptions, it sounds like for RAP to work, you still have to manually write out a basic table definition and joins, etc, only with hand-written shortcuts for the primary key and audit fields. I'd hate to write it all out. So, I want to know if there is a way to use ERwin and get it's benefits along with RAP.
I'd fully understand if the answer is, "wait for the rest of the series." or "not really". 🙂 Just wondering if you see a place for both.
JJ B (10/18/2010)
Could you comment on how RAP may or may not use datamodels? I currently have a love-hate relationship with ERwin. But I can't really imagine wanting to start or maintain a database without ERwin....
Actually RAP is ideally suited for use with data modeling tools like Erwin because its main code generator works from the schema that you actually have in the database ... so all you have to do is get your "current" tables in there (i.e. the tables you'd normally include in a your design - no need for you to try to generate RAP's "archive" tables), skipping the RAP preprocessor (described in Part 2) and then continuing on with the rest of the RAP system, starting with the code generator that will be discussed in part 3.
It takes a little work though. For example to make Erwin generate RAP-consistent tables, you have to properly design the primary keys to match the RAP key design (this isn't hard because it's just an autoincrement bigint).
Furthermore I believe that you can simulate all the RAP data types using Erwin "domains".
I recommend that you NOT define the status fields explicitly in your Erwin model because they're exactly the same in every table, so you end up trashing up your schema diagrams with all these redundant status fields that don't really relate to your actual design. My approach would be to write a macro that generates the status fields for every table whenever you generate the physical model from the logical model (Erwin has triggers that let you do this). I did this myself on a project that wasn't using RAP, so I can't say for sure how easy it would be to match RAP's status field design ... but I do know that it's pretty easy to get an Erwin macro to generate consistent status fields for all physical tables.
Once you have the physical tables with the proper consistent primary keys and status fields you're golden. The rest of the RAP system should work perfectly with your Erwin-designed schema.
JJ B (10/18/2010)
I can't really imagine wanting to start or maintain a database without ERwin.
Actually it's kind of interesting that you mention Erwin. In 2007 I was working on a project that used Erwin. The project was NOT using RAP, but I was already developing RAP and its concepts so I wanted the project to have consistent status fields across all tables.
I was expecting that in most professional projects people would have goals like mine and that Erwin would be well equipped to produce identical sets of status fields on all tables. This would require two things:
1. a mechanism for defining common sets of fields that could be incorporated into any or all tables,
2. and a mechanism for hiding such sets (as in the case of status fields, which you really don't need to look at in your diagrams).
To my chagrin I discovered that Erwin didn't seem to support either concept. There is an Erwin concept called "domain", which is analogous to data types in SQL server in that it can define a single field and its attributes for consistent reuse in multiple tables. But I could not find any mechanism for defining groups of fields. So that's when I came up with the idea of writing a macro to define my status fields consistently across all tables.
So I was amazed. The notion of having a consistent set of status fields across all tables (or even a subset of tables) was apparently so foreign to professional schema designers in 2007 that a super-popular professional schema designer like Erwin apparently didn't have any mechanism for supporting the concept.
It's still not there. And you are right, there is a whole lot left to be desired about ERwin. The glaring lacks are shocking really, given the software cost.
That said, I do use those domains/data types that you talk about, and it really helps a lot--not only for creating consistently defined fields across tables, but also when writing stored procs and functions. There are also other aspects that I like about having a professional data model, both at the beginning of the project and over time. I appreciate your tips for how to make RAP work with ERwin.
Thanks again for sharing. You sure are putting yourself out there and taking a lot of you-know-what for it. Takes courage.
As far as auditing, I have used http://autoaudit.codeplex.com/ very successfully (code generated DML and DDL triggers). Its auditing approach is not individual archive tables, however, but a similar technique could probably be used for that paradigm.
I'm all for standardization and code generation.
There are still no silver bullets, though. The intrinsic difficulties of software engineering remain; what techniques like this do is simply stop us from shooting ourselves in the feet or dragging ourselves down with grunt work and manual errors.
I have actually found that a better naming convention is as follows (using Organization as the basic entity):
Table - Organization
View - OrganizationView
Because of the way we use tables and views, this is easily enough to distinguish them and it reads well. As for your categorization of tables, I would suggest just using schemas. So, you would have "adm", "sec", and "crm" schemas. This also has the benefits of security within the schemas.
Actually the generator that we use is CodeSmith and NetTiers. It is quite extensible and powerful. We have seen no issues with performance, and it is enterprise worthy. It does a great job of getting us through the ORM part of development. We focus on DB, BL, and UI design, getting the data in and out of the database is quick for us (as it should be since it is plumbing).
If you want, NetTiers also has the ability to create business rules to be applied to the entities and build workflows.
Viewing 13 posts - 31 through 42 (of 42 total)