• roger.plowman (10/13/2010)


    Ok, I have a problem with the approach from the word go, and it's this:

    "implement "ideal" auditing, where the app itself can recall all versions of the data that were ever entered into it;"

    *All* versions? 🙂

    You do realize this is physically impossible, yes? The storage requirements for a CRM system are already astronomical, and that's just for the *current* state of the data. Add in a requirement that you keep EVERY SINGLE CHANGE (presumably reversibly) and you've moved from engineering to magic.

    Given that, the rest of the approach is seriously suspect.

    Unless, of course, I misunderstood "all versions of data ever entered"...

    RAP is intended primarily for small to medium projects, so given today's enormous storage capacities, storing every past version of every record, including deleted ones, isn't really that big of an issue. But aside from that, RAP uses two distinct sets of tables: the primary or "current" tables, which hold current data, and the "archive" (or "audit" or "shadow") tables, in which all past versions of records are stored. At some point a large project could decide that all data older than a particular date is permanently obsolete and simply delete that data from the archive tables. So even for a large project, RAP could easily handle the space-constraint problem, because it neatly segregates past data for you. And since RAP keeps only current data in its "current" tables, your app never has to wade through past data to find current data.
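To make the current/archive split concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not RAP's actual implementation: the table names (`customer`, `customer_archive`), the `archived_at` column, the triggers, and the `purge_archive` helper are all assumptions made up for illustration. It only shows the general idea of keeping past versions in a separate table that can be trimmed by date.

```python
import sqlite3

def make_db():
    """Create a hypothetical 'current' table plus a separate archive table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (            -- current data only
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE customer_archive (    -- past versions only
            id          INTEGER NOT NULL,
            name        TEXT NOT NULL,
            archived_at TEXT NOT NULL
        );
        -- Copy the old version into the archive on every update or delete.
        CREATE TRIGGER customer_au AFTER UPDATE ON customer
        BEGIN
            INSERT INTO customer_archive (id, name, archived_at)
            VALUES (OLD.id, OLD.name, strftime('%Y-%m-%d %H:%M:%f', 'now'));
        END;
        CREATE TRIGGER customer_ad AFTER DELETE ON customer
        BEGIN
            INSERT INTO customer_archive (id, name, archived_at)
            VALUES (OLD.id, OLD.name, strftime('%Y-%m-%d %H:%M:%f', 'now'));
        END;
    """)
    return conn

def purge_archive(conn, cutoff):
    """Delete archived versions older than `cutoff`; return how many were removed."""
    cur = conn.execute(
        "DELETE FROM customer_archive WHERE archived_at < ?", (cutoff,)
    )
    return cur.rowcount

conn = make_db()
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
conn.execute("UPDATE customer SET name = 'Acme Corp' WHERE id = 1")
conn.execute("UPDATE customer SET name = 'Acme Corporation' WHERE id = 1")

# Queries for current data touch only the small "current" table.
current = conn.execute("SELECT name FROM customer").fetchall()

# Past versions live only in the archive table, oldest first.
history = [row[0] for row in
           conn.execute("SELECT name FROM customer_archive ORDER BY rowid")]
```

In this sketch, disposing of old history is a single call such as `purge_archive(conn, "2005-01-01")`, and it never touches the current table.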

    I have worked on several large projects that informally store every version of every record ever entered in many of their tables, and it's always done badly. Every one of these projects did it with "effective date" fields or "deleted" flags. These mechanisms are horrendous because 1) every query against any such table has to account for possibly multiple versions of every record and find the latest (or valid) one; 2) keys and indexes have to accommodate this "archiving" rather than represent the natural current structure of the problem; and 3) performance is a disaster, because on top of the extra query work described in 1), the vast majority of the data in the tables is obsolete, yet the server must look through it on every single query for current data (which is 99% of the queries).
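For anyone who hasn't lived with the "effective date" plus "deleted" flag approach, here is a rough sketch of what it does to even the simplest query. The `customer_versions` table, its columns, and the sample rows are invented for illustration and do not come from any particular project. Note that the composite key and the correlated subquery exist only to cope with the versioning.

```python
import sqlite3

versions_db = sqlite3.connect(":memory:")
versions_db.executescript("""
    CREATE TABLE customer_versions (
        id             INTEGER NOT NULL,
        name           TEXT NOT NULL,
        effective_date TEXT NOT NULL,
        deleted        INTEGER NOT NULL DEFAULT 0,
        -- The key must include the date, because one customer has many rows.
        PRIMARY KEY (id, effective_date)
    );
    INSERT INTO customer_versions VALUES
        (1, 'Acme',      '2009-01-01', 0),
        (1, 'Acme Corp', '2010-06-01', 0),
        (2, 'Globex',    '2009-03-01', 0),
        (2, 'Globex',    '2010-02-01', 1);   -- customer 2 was "deleted"
""")

# Every query for current data has to repeat this filtering.
CURRENT_CUSTOMERS = """
    SELECT v.id, v.name
    FROM customer_versions AS v
    WHERE v.deleted = 0
      AND v.effective_date = (
          SELECT MAX(latest.effective_date)
          FROM customer_versions AS latest
          WHERE latest.id = v.id
      )
    ORDER BY v.id
"""
current_customers = versions_db.execute(CURRENT_CUSTOMERS).fetchall()
```

With a separate current table, the same question would be a plain `SELECT id, name FROM customer`, with no subquery and no obsolete rows to scan.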

    RAP eliminates the arbitrary nature of deciding which tables get audited and which don't. It eliminates the need for developers to design specialized auditing mechanisms (both storage and retrieval) because it implements both the storage and retrieval mechanisms for you automatically. It separates current from past data, making it both easy and efficient to either keep or dispose of the past data as you wish.

    So please, stick with the series and give this a chance. It is really well thought out. I don't think I've missed a trick here, and at the end I'm hoping you'll agree.