SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Worst Practice - Not Having An Archive Plan For Each Table


Worst Practice - Not Having An Archive Plan For Each Table

Author
Message
Andy Warren
Andy Warren
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: Moderators
Points: 44918 Visits: 2783
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/awarren/worstpracticenothavinganarchiveplanforeachtable.asp>http://www.sqlservercentral.com/columnists/awarren/worstpracticenothavinganarchiveplanforeachtable.asp

Andy
SQLAndy - My Blog!
Connect with me on LinkedIn
Follow me on Twitter
Frank Marzan
Frank Marzan
Grasshopper
Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)

Group: General Forum Members
Points: 18 Visits: 26
Glad to see someone brought this up, I must say this is a very important topic Andy.

After working from a financial institution who had money to blow on DASD and coming to a company which have a limited space and no archiving implemented/designed. A solution had to be developed. I came up with about 3 ways which I think are good enough solutions.

In this scenario, the company would like to keep the data as far back as they can and only about 3 mths of live/production data. So ...


overview

First, you could make a copy of the live db and name it arch_db and write to 2 places live db and arch_db. And just do a monthly cull, no worries.

Second, your developers may not necessarily want to change their code, or some dont even use SPs and use embeded sql in their code. So... if you have no choice, you can use Triggers to keep tab of the changes, deletes, inserts. And apply these to the archive db, nightly, 5 mins or as needed.

Thirdly, if you're only in the design stage, if you know that the data is going to grow exponentially, you could break up the tables into Months ... therefore it's easier to move old tables into an archive db and searching in the current db is as fast as you have tuned your server.

Any questions please post, be glad to help and share the codes to my arch procedure.

Cheers,

-¢ödêmån-


-¢ödêmån-
Dave Poole
Dave Poole
SSC-Dedicated
SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)

Group: General Forum Members
Points: 30002 Visits: 3607
Great Andy,

There is an issue here of "who owns the data".

Your particular example is something that can be decided by the IT department as it is their own internal logging process.

The problem comes when ownership is unclear and no-one is prepared to say "Yeah, its old, we don't need it" even though no-one has used it, knows anyone who has used it, has heard of anyone using it.

I think any project that involves storing data needs a particular piece in the functional requirement specification that deals with this issue before the project is built.

In the case of contacts databases over in the UK we have this thing called the data protection act. One of the things it states is that data must not be stored for an excessive amount of time. Failing to have a decent archiving procedure is actually in breach of the law.

I worked for an organisation that was scared to archive data because we never knew when we would be asked to plot trends in customer behaviour over 'n' years. 'n' being the amount of data you finally decided to retain in your system plus one.

The way that I got around the problem was to have a set of archive tables with no indices that simply contained data over 12 months old.

The users knew that if they needed to search this old data they had to specifically query the old data and that it would be slow to retrieve. Surprise surprise, they rarely bothered.

I tended to archive the archive tables into MS Access and burn this into a CD and duplicate it. AS MS Access could store up to 2Gb and the archive per month was under 650Mb this was feasible. Storing it in Access on a CD-ROM meant that a user who specifically needed the archive could receive a copy and query it at their leisure.
It also made retrieval of data back into SQL Server easy.

LinkedIn Profile
www.simple-talk.com
PeterLivesey
PeterLivesey
SSC Journeyman
SSC Journeyman (82 reputation)SSC Journeyman (82 reputation)SSC Journeyman (82 reputation)SSC Journeyman (82 reputation)SSC Journeyman (82 reputation)SSC Journeyman (82 reputation)SSC Journeyman (82 reputation)SSC Journeyman (82 reputation)

Group: General Forum Members
Points: 82 Visits: 49
As a developer I have got to commend a DBA for bringing this up. I very rarely deal with DBA's and am mainly responsible for my own databases.

In my experience most developers develop a product without thought to Entity Life History which is a very simple check on whether data should be created and checks that it is later destroyed (archived).
This is fine at the development stage and will normally stand up to expected usage testing, but the knock on affects of creating data that
1. is never used so there was never a good reason to create it except it existed in the old system
2. is never archived (destroyed) so it slows down querying for more recent useful data.

are larger than most developers give credit to.

Mr Peter Livesey


Mr Peter Livesey
Clive Strong
Clive Strong
SSCarpal Tunnel
SSCarpal Tunnel (4.7K reputation)SSCarpal Tunnel (4.7K reputation)SSCarpal Tunnel (4.7K reputation)SSCarpal Tunnel (4.7K reputation)SSCarpal Tunnel (4.7K reputation)SSCarpal Tunnel (4.7K reputation)SSCarpal Tunnel (4.7K reputation)SSCarpal Tunnel (4.7K reputation)

Group: General Forum Members
Points: 4741 Visits: 6643
Great article Andy.

I fully agree. With more and more archive tables, the resources on the Server do increase and who knows, if you have a fair number, it could even impact on overall performance...although you'd need to be doing a lot of archive processing.

We do archiving, but all we really do is copy the days orders and all it's other associated information that we download from JDE into history tables within SQL. This has no impact on performance as we do this around 10pm before we do backups so were ok. We only have 2 tables which track changes/updates throughout the day. As the changes/updates are small we have no issues with this.

And the other issue you raised is how long is the data valid? Yes, the cost of disks is next to nothing, but why use up resources when you don't have to? Personally, I'd keep as much space free as possible.

Fortunately, the management here agreed that we only need four months history within SQL Server. With that, we run a job on the first of each month to delete all history over 4 months. On first run, this removed almost 2 million rows!!


Clive Strong
clivestrong@btinternet.com
Jonr
Jonr
SSC Eights!
SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)

Group: General Forum Members
Points: 976 Visits: 65
One of my clients had a customer complaints system that was killed every Monday morning because (badly written) management reports were being run against the database server whilst it was being used to log customer calls from 90 service desk people. Used to grind to a halt and drop connections regularly. This was a third party app, so changing the (dynamic...) SQL or archiving a chunk of the data was a non-starter and would have invalidated the support contract.

Various solutions had been suggested by the vendor: putting a copy of the database onto a separate server purely for reporting ; upgrading the network ; upgrading the client PCs (odd suggestion as it was evident from the start that it was a server performance problem, even though they had visited the site on a number of occassions to diagnose the problem); and upgrading the server to quad 2.4Ghz Pentium Xeons (yep, we were considering going into weather forecasting as a side line...)

Once the previous DBA left, we got into analysing the problem properly, and it transpired that virtually all of the queries were triggering table scans on big tables containing 18 months of archive data which couldn't be moved. So we set about creating some decent indexes and giving the supplier a roasting over the lack of them in the original product.

Guess what? Reports ran in seconds not hours(!), users did't get booted off anymore, and customers didn't have to wait 5 minutes for their details to appear on the operators' screen. I guess the lesson of this is that if archiving isn't possible, then look at performance tuning next, before anything else. Costing it out, the time spent tuning worked out 20 times less expensive than upgrading the server - a very sensitive point to management in these times of cost cutting. So if you can't archive, tune. And if you can archive, look at tuning anyway Smile

Jon Reade

Edited by - jonreade on 11/12/2002 05:47:43 AM


Jon
Dave Poole
Dave Poole
SSC-Dedicated
SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)

Group: General Forum Members
Points: 30002 Visits: 3607
..and of course simply bunging in another processor adds to your licensing costs, never mind the cost of RAM, disk and tapes.

LinkedIn Profile
www.simple-talk.com
ckempste
ckempste
SSCertifiable
SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)

Group: General Forum Members
Points: 5331 Visits: 1
Hi

Its far from worst practice, but is something to consider. Archiving off data and making it subsequently accessible for historical reporting (typicaly old user queries and auditing) is difficult to plan and manage. I have worked on a lot of databases in my time as a DBA, and I have to say, very few analysts really take the time out to address this issue. I must say also that with the modern DBMS, large amounts of data can be searched quickly and effectively, server (cou, IO etc) performance is excellent and with the advent of grid-computing and highly distibuted systems with super quick periperials, archiving in MANY cases tends to be a non-issue and not worth the added costs in terms of application complexity.

Saying all of that, auditing tables and logs (where the friggen developers log everything you can possibly think of in their apps) is really tough and is the bane of a DBA's life, especially in terms of # indexes (yes - queriyign auditing tables is just as complex as the live ones, resulting in more indexing and speed issues).

Cheers

Chris K


Chris Kempster
www.chriskempster.com
Author of "SQL Server Backup, Recovery & Troubleshooting"
Author of "SQL Server 2k for the Oracle DBA"
Jonr
Jonr
SSC Eights!
SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)SSC Eights! (976 reputation)

Group: General Forum Members
Points: 976 Visits: 65
quote:

..and of course simply bunging in another processor adds to your licensing costs, never mind the cost of RAM, disk and tapes.

Yes, that was pointed out to them as it was per-processor licencing they were using. Something the vendor ommitted to mention in their 'recommendations'.



Jon Reade


Jon
joshcsmith13
joshcsmith13
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1701 Visits: 254
This topic is one that I've had in the back of my mind for months now. We're currently mid-stream in a total rewrite of a major application that involves transferring data from a Sybase SQL Anywhere DB to MS SQL Server 2000. (that is an issue in itself, but one one I'm too worried about right now) What has been a concern for me is the need to trim the size of our result sets when desktop applications are querying the server for records. Current design, as well as business process and legal issues, have meant that we must keep all historical data accessable. When someone goes to run our app, they're most likely only concerned with the last weeks worth of data. But, when someone goes to run reports they're going to want the last 3 months of data, and that could be in the 100s of thousands of rows, all the way up to current. I guess my biggest obstacle is what to do with foreign-keys.?? there is always a possibility that a user will need to update an old record, and I don't want to lose data integrity. so, maybe we have been doing worst practices, but I don't know a solution. ? does anybody else?



Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum







































































































































































SQLServerCentral


Search