need hlep to decide the apdesign approach

  • I have requirement to design tables for archive records, maintaining historical data and auditing.

    The historical data requirement is also coupled with reporting requirements. I apporached below desing concept but i am stuck in design for Historical Data as its not meeting reporting requirements.

    1. Data Archival requirement-[/b] as per business I need to archive the data mathcing the buinsess rules. for this I have adopted strategy to create the copy of base transactional table and moving data from base table to archive table.

    2. Audit - I have created verticle design having columns - Key ID, Table Name, Column Name, Old Value, New Value, Modified Date. Let's name the table as "AuditLog". In search also I have requirement to show audited information in verticle format. The user can search on field name also.

    3. Historical data- The requriement says system should capture each modification details of Customer (master+Orders) which I am achieving via AuditLog table but in verticle format. If user generate reports he should be able to see same transactional data at any given time.

    We have reporting requirement where each month the different cusomter reports has been generated by groupin/aggregating data along with Custoemr Details.

    e.g. If user generates Customer details with Sales information report in March 2013 and customer xyz has 100 Orders, system should generate the report with 100 orders.

    but If user genrates Customer Sales information in Oct 2013 for the month of March 2013 (user can select month and year as report parameter). the system should generate only 100 Orders along with same Customer details enough though there is modification happen in between.

    Basically we have to maintain history for each data.

    Thinking about about requirement I came across creating History table same as base tables. But I am not sure whether I am approaching in right direction.

    Appreicate in your opinions on above design apporach or new design apporach.

    Thanks,

    Abhijit More

    Abhijit - http://abhijitmore.wordpress.com

  • This is a problem which Date's 6NF is ideal for. The vertical design is a big problem, history shouldn't be a record of changes if you want to be able to reproduce old reports and so forth, it should be a record of sates at particular times. In order to avoid this blowing up into an enormous mass of data, you have to tag things with srat or end dates at fine granularity. I sugest you look up sixth normal form and anchor modeling, and see if that gives you an idea for a different approach to your data that will resolve some of your problems.

    Warning: there are two different things both called 6th Normal Form; the one relevant to keeping histories is Dates 6NF, not Ron Fagin's 6NF.

    Tom

  • Thank you L' Eomot Inversé , but as per our requirement a separate history table for each attribute table could result in a massive amount of table. Additionally with such deisng I wouldn't achive auti requirement of showing change log in verticle manner(table, col, old value, new value)

    I came up with few other apporach may you can validate and tell me which one can be consider...

    1. Apporach 1 - Created Single Report Data table - With this apporach whenever the report is generated I will generate the report data and store it in the table henceforth, whenever user access that report data will be displayed from this single table. The advantage is its one time processing, my requrirement is getting fullfill i.e. each time user display the report system should generate back dated data that was genrated earlier.

    2. Instead of having Single Report Data tabl, creating One report - One table. By this way, I am offloading load from single table into multiple tables.

    3. Is there any way if i create History tables (replica of main table) I can generate audit report in verticle manner for all columns from entity? - (Table Name, Field Name, Old Value, New Value, Change Start Date, Chnage End Date)

    please provide your views....

    Abhijit - http://abhijitmore.wordpress.com

  • The entire structure depends on your data, what you're keeping, etc.

    The way we have ours designed is by using versioning. We keep a "main" table that is narrow and contains information that does not change. Then a versioning table that is connected via Foreign Key and has the changable columns.

    Using AdventureWorks Employee table, here's how it would break down:

    CREATE TABLE [HumanResources].[Employee](

    [EmployeeID] [int] IDENTITY(1,1) NOT NULL,

    [NationalIDNumber] [nvarchar](15) NOT NULL,

    [ContactID] [int] NOT NULL

    CONSTRAINT [PK_Employee_EmployeeID] PRIMARY KEY CLUSTERED

    (

    [EmployeeID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    CREATE TABLE [HumanResources].[EmployeeVersion](

    [EmployeeVersionID] [int] IDENTITY(1,1) NOT NULL,

    [EmployeeID] INT NOT NULL,

    [LoginID] [nvarchar](256) NOT NULL,

    [ManagerID] [int] NULL,

    [Title] [nvarchar](50) NOT NULL,

    [BirthDate] [datetime] NOT NULL,

    [MaritalStatus] [nchar](1) NOT NULL,

    [Gender] [nchar](1) NOT NULL,

    [HireDate] [datetime] NOT NULL,

    [SalariedFlag] [dbo].[Flag] NOT NULL,

    [VacationHours] [smallint] NOT NULL,

    [SickLeaveHours] [smallint] NOT NULL,

    [CurrentFlag] [dbo].[Flag] NOT NULL,

    [rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL,

    [EffectiveDate] [datetime] NOT NULL,

    CONSTRAINT [PK_Employee_EmployeeVersionID] PRIMARY KEY CLUSTERED

    (

    [EmployeeVersionID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    The EffectiveDate in the version table gives us the date the record was altered. By using the latest EffectiveDate, one could find the current version (or last changed) record. Or one could search on EffectiveDate between a certain range to pull up older records.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Some of it depends on the data. Dimensional might be better treated differently than a summable field.

    For summable fields, you could have for example, line number, order number, current record flag, effectivity date (date of change), revision number, etc.

    Say the value starts at 4 units, then is changed to 2 units.

    Rev 1 shows 4 units, rev 2 shows -4 units, rev 3 (current record) shows 2 units.

    Showing all records, you can see the progression, but summing still shows 2 units, the current value.

    Not real elegant, but is one way of keeping the math right while showing changes in a verticle fashion.

    Another thing that can become a factor - if you are tracking order lines as they come in, then they change over time, especially if they go through statuses and lines ship partially, change, then create new lines / sublines.

    Looks like you have a couple of ideas, and something to read. Then run a couple examples by your users. Be sure to cover both fields you can aggregate, and other customer details that might be subject to change.

    There are a few different ways of tracking changes, each with some pluses and minuses.

  • Abhijit More (10/9/2013)


    2. Audit - I have created verticle design having columns - Key ID, Table Name, Column Name, Old Value, New Value, Modified Date. Let's name the table as "AuditLog". In search also I have requirement to show audited information in verticle format. The user can search on field name also.

    You've forgotten the all-important "Modified By" column.

    My question would be, how are you going to populate this table? Before you answer, it should be done by trigger but, based on the fact that you were complaining about the number of tables for "like" audit tables, I'm hoping you haven't made the extremely severe mistake of using a "Generic CLR Audit Trigger" for this. We had those on our system (previous guard put them there) and it caused a huge bogging down of the system (I fixed it with some T-SQL, instead). Depending on the width of the source table, it would take up to 4 minutes just to audit a 4 column update on a lousy 10K rows because CLR triggers (usually caused by scope loss of the INSERTED and DELETED logical tables of the trigger).

    Also and in case you haven't thought about it, you might want to consider using SQL_Variant for the "Old Value" and "New Value" columns. The reason is that SQL_Variant stores metadata about the data such as what the type of data stored is.

    Last but not least, consider NOT logging any data for INSERTs and DELETEs other than just marking the event with a single row in the audit table. Logging all values for all columns for such things can easily cause your database to grow by a factor of 2 to 14 depending on the data in the tables.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Abhijit More (10/9/2013)


    1. Data Archival requirement-[/b] as per business I need to archive the data mathcing the buinsess rules. for this I have adopted strategy to create the copy of base transactional table and moving data from base table to archive table.

    ... {snip} ...

    3. Historical data- The requriement says system should capture each modification details of Customer (master+Orders) which I am achieving via AuditLog table but in verticle format. If user generate reports he should be able to see same transactional data at any given time.

    We have reporting requirement where each month the different cusomter reports has been generated by groupin/aggregating data along with Custoemr Details.

    e.g. If user generates Customer details with Sales information report in March 2013 and customer xyz has 100 Orders, system should generate the report with 100 orders.

    but If user genrates Customer Sales information in Oct 2013 for the month of March 2013 (user can select month and year as report parameter). the system should generate only 100 Orders along with same Customer details enough though there is modification happen in between.

    Basically we have to maintain history for each data.

    Thinking about about requirement I came across creating History table same as base tables. But I am not sure whether I am approaching in right direction.

    Appreicate in your opinions on above design apporach or new design apporach.

    Thanks,

    Abhijit More

    Be advised that "reporting requirements" and the "archival" of data creates a huge paradox (especially if you want to see the womb-to-tomb life of rows and still avoid full columnar audits of inserts and deletes) along with some real PITA code. Rather than archiving (it never really happens, either), consider partitioning the data by month to 1 filegroup per month and 1 file perfile group. This will greatly ease backup requirements of huge amount of Audit Data that is never supposed to change and make doing peicemeal restores a possibility.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply