Worst Practice - Not Having An Archive Plan For Each Table
Posted Wednesday, November 6, 2002 12:00 AM
SSCertifiable

Group: Moderators
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/awarren/worstpracticenothavinganarchiveplanforeachtable.asp

Andy
SQLAndy - My Blog!
Connect with me on LinkedIn
Follow me on Twitter
Post #7989
Posted Sunday, November 10, 2002 5:25 PM
Grasshopper

Group: General Forum Members
Glad to see someone brought this up; I must say this is a very important topic, Andy.

After working at a financial institution that had money to blow on DASD, I came to a company with limited space and no archiving implemented or designed, so a solution had to be developed. I came up with about three ways which I think are good enough solutions.

In this scenario, the company would like to keep data going back as far as they can, and only about three months of live/production data. So...


Overview

First, you could make a copy of the live db, name it arch_db, and write to both places, the live db and arch_db, then just do a monthly cull. No worries.
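If the apps write through stored procedures, the dual write can be as simple as the sketch below (Live_DB, Arch_DB, dbo.Orders and the parameters are all placeholder names for the illustration, not from my actual procedure):

-- Sketch only: write each new row to the live db and the archive db in one go.
-- Live_DB, Arch_DB, dbo.Orders and the parameters are placeholder names.
CREATE PROCEDURE dbo.usp_InsertOrder
    @OrderID    int,
    @CustomerID int,
    @OrderDate  datetime,
    @Amount     money
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO Live_DB.dbo.Orders (OrderID, CustomerID, OrderDate, Amount)
    VALUES (@OrderID, @CustomerID, @OrderDate, @Amount);

    INSERT INTO Arch_DB.dbo.Orders (OrderID, CustomerID, OrderDate, Amount)
    VALUES (@OrderID, @CustomerID, @OrderDate, @Amount);
END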

Second, your developers may not necessarily want to change their code, or some don't even use SPs and have embedded SQL in their code instead. So, if you have no choice, you can use triggers to keep tabs on the changes (updates, deletes, inserts) and apply these to the archive db nightly, every 5 minutes, or as needed.
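As a rough illustration of the trigger route, a delete trigger along these lines could push removed rows into the archive database (dbo.Orders, Archive_DB.dbo.Orders_Archive and the column names are placeholders, not real objects):

-- Sketch: copy deleted rows into an archive database from a trigger.
-- dbo.Orders, Archive_DB.dbo.Orders_Archive and the columns are placeholder names.
CREATE TRIGGER trg_Orders_ArchiveDelete
ON dbo.Orders
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO Archive_DB.dbo.Orders_Archive (OrderID, CustomerID, OrderDate, Amount, ArchivedOn)
    SELECT OrderID, CustomerID, OrderDate, Amount, GETDATE()
    FROM deleted;
END

Insert and update triggers would follow the same shape, reading from the inserted pseudo-table instead of deleted.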

Third, if you're still in the design stage and you know the data is going to grow exponentially, you could break the tables up by month. That makes it easy to move old tables into an archive db, and searching the current db is as fast as you have tuned your server to be.
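With monthly tables, archiving a month becomes a matter of moving a whole table, roughly like this (Archive_DB and dbo.Sales_200208 are invented names for the example):

-- Sketch: with one table per month, archiving means moving the whole table.
-- Archive_DB and dbo.Sales_200208 are placeholder names, not from the post.
SELECT *
INTO Archive_DB.dbo.Sales_200208
FROM dbo.Sales_200208;

DROP TABLE dbo.Sales_200208;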

Any questions, please post. I'd be glad to help and share the code for my archive procedure.

Cheers,

-¢ödêmån-



Post #45853
Posted Monday, November 11, 2002 3:42 AM
SSCrazy

Group: General Forum Members
Great, Andy.

There is an issue here of "who owns the data".

Your particular example is something that can be decided by the IT department as it is their own internal logging process.

The problem comes when ownership is unclear and no one is prepared to say "Yeah, it's old, we don't need it", even though no one has used it, knows anyone who has used it, or has even heard of anyone using it.

I think any project that involves storing data needs a specific section in the functional requirements specification that deals with this issue before the project is built.

In the case of contact databases, over in the UK we have the Data Protection Act. One of the things it states is that data must not be stored for an excessive amount of time, so failing to have a decent archiving procedure is actually a breach of the law.

I worked for an organisation that was scared to archive data because we never knew when we would be asked to plot trends in customer behaviour over 'n' years, 'n' being the number of years of data you finally decided to retain in your system, plus one.

The way that I got around the problem was to have a set of archive tables with no indices that simply contained data over 12 months old.

The users knew that if they needed to search this old data they had to query those tables specifically, and that it would be slow to retrieve. Surprise, surprise, they rarely bothered.
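In outline, a move like that boils down to something along these lines (the table and column names are made up for the example, and it assumes the live and archive tables share the same column list):

-- Sketch: move rows older than 12 months into an unindexed archive table.
-- dbo.CustomerActivity, dbo.CustomerActivity_Archive and ActivityDate are placeholders.
BEGIN TRANSACTION;

INSERT INTO dbo.CustomerActivity_Archive
SELECT *
FROM dbo.CustomerActivity
WHERE ActivityDate < DATEADD(month, -12, GETDATE());

DELETE FROM dbo.CustomerActivity
WHERE ActivityDate < DATEADD(month, -12, GETDATE());

COMMIT TRANSACTION;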

I tended to archive the archive tables into MS Access, burn this onto a CD and duplicate it. As MS Access could store up to 2GB and the archive per month was under 650MB, this was feasible. Storing it in Access on a CD-ROM meant that a user who specifically needed the archive could receive a copy and query it at their leisure.
It also made retrieval of data back into SQL Server easy.




LinkedIn Profile
Newbie on www.simple-talk.com
Post #45854
Posted Monday, November 11, 2002 5:01 AM
SSC Rookie

Group: General Forum Members
As a developer, I have to commend a DBA for bringing this up. I very rarely deal with DBAs and am mainly responsible for my own databases.

In my experience most developers build a product without any thought to Entity Life History, which is a very simple check that data should be created and that it is later destroyed (archived).
This is fine at the development stage and will normally stand up to expected usage testing, but the knock-on effects of creating data that
1. is never used, so there was never a good reason to create it except that it existed in the old system, and
2. is never archived (destroyed), so it slows down querying for more recent, useful data

are larger than most developers give credit for.

Mr Peter Livesey



Post #45855
Posted Monday, November 11, 2002 5:16 AM


Ten Centuries

Group: General Forum Members
Great article, Andy.

I fully agree. With more and more archive tables, the resource usage on the server does increase, and if you have a fair number it could even impact overall performance, although you'd need to be doing a lot of archive processing.

We do archiving, but all we really do is copy the day's orders, and all the other associated information that we download from JDE, into history tables within SQL. This has no impact on performance as we do it around 10pm, before we do backups, so we're OK. We only have two tables which track changes/updates throughout the day, and as the changes/updates are small we have no issues with this.

The other issue you raised is how long the data is valid. Yes, the cost of disks is next to nothing, but why use up resources when you don't have to? Personally, I'd keep as much space free as possible.

Fortunately, the management here agreed that we only need four months of history within SQL Server. With that, we run a job on the first of each month to delete all history over four months old. On its first run, this removed almost 2 million rows!
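For anyone wanting to do something similar, the job itself can be as simple as the sketch below, although with a couple of million rows it may be worth deleting in batches so the transaction log stays manageable (the table and column names are just placeholders, not our real ones):

-- Sketch: purge history older than four months in batches of 10,000 rows.
-- dbo.OrderHistory and HistoryDate are placeholder names.
SET ROWCOUNT 10000;

WHILE 1 = 1
BEGIN
    DELETE FROM dbo.OrderHistory
    WHERE HistoryDate < DATEADD(month, -4, GETDATE());

    IF @@ROWCOUNT = 0 BREAK;
END

SET ROWCOUNT 0;  -- switch row limiting back off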


Clive Strong
clivestrong@btinternet.com
Post #45856
Posted Monday, November 11, 2002 5:19 AM
SSC-Enthusiastic

Group: General Forum Members
One of my clients had a customer complaints system that was killed every Monday morning because (badly written) management reports were being run against the database server while it was being used to log customer calls from 90 service desk people. It used to grind to a halt and drop connections regularly. It was a third-party app, so changing the (dynamic...) SQL or archiving a chunk of the data was a non-starter and would have invalidated the support contract.

Various solutions had been suggested by the vendor: putting a copy of the database onto a separate server purely for reporting; upgrading the network; upgrading the client PCs (an odd suggestion, as it was evident from the start that it was a server performance problem, even though they had visited the site on a number of occasions to diagnose it); and upgrading the server to quad 2.4GHz Pentium Xeons (yep, we were considering going into weather forecasting as a sideline...).

Once the previous DBA left, we got into analysing the problem properly, and it transpired that virtually all of the queries were triggering table scans on big tables containing 18 months of archive data which couldn't be moved. So we set about creating some decent indexes and giving the supplier a roasting over the lack of them in the original product.
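The fix itself needn't be anything exotic; conceptually it is along these lines (the table, column and index names here are invented for illustration, not the real ones):

-- Sketch: a nonclustered index to stop the reporting queries table-scanning.
-- dbo.CustomerCalls, CallDate and CustomerID are placeholder names.
CREATE INDEX IX_CustomerCalls_CallDate_CustomerID
ON dbo.CustomerCalls (CallDate, CustomerID);

Checking the query plans (or SET STATISTICS IO ON) before and after shows whether the scans have actually gone.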

Guess what? Reports ran in seconds, not hours(!), users didn't get booted off any more, and customers didn't have to wait five minutes for their details to appear on the operators' screens. I guess the lesson of this is that if archiving isn't possible, then look at performance tuning next, before anything else. Costing it out, the time spent tuning worked out 20 times less expensive than upgrading the server - a very sensitive point for management in these times of cost cutting. So if you can't archive, tune. And if you can archive, look at tuning anyway :)

Jon Reade

Edited by - jonreade on 11/12/2002 05:47:43 AM



Jon
Post #45857
Posted Monday, November 11, 2002 5:40 AM
SSCrazy

Group: General Forum Members
..and of course simply bunging in another processor adds to your licensing costs, never mind the cost of RAM, disk and tapes.



LinkedIn Profile
Newbie on www.simple-talk.com
Post #45858
Posted Monday, November 11, 2002 7:28 AM
SSC Eights!

Group: General Forum Members
Hi

It's far from a worst practice, but it is something to consider. Archiving off data and making it subsequently accessible for historical reporting (typically old user queries and auditing) is difficult to plan and manage. I have worked on a lot of databases in my time as a DBA, and I have to say, very few analysts really take the time out to address this issue. I must also say that with a modern DBMS, large amounts of data can be searched quickly and effectively, server performance (CPU, IO, etc.) is excellent, and with the advent of grid computing and highly distributed systems with super-quick peripherals, archiving in MANY cases tends to be a non-issue and not worth the added cost in terms of application complexity.

Saying all of that, auditing tables and logs (where the friggen developers log everything you can possibly think of in their apps) are really tough and are the bane of a DBA's life, especially in terms of the number of indexes (yes, querying auditing tables is just as complex as querying the live ones, resulting in more indexing and speed issues).

Cheers

Chris K






Chris Kempster
www.chriskempster.com
Author of "SQL Server Backup, Recovery & Troubleshooting"
Author of "SQL Server 2k for the Oracle DBA"
Post #45859
Posted Tuesday, November 12, 2002 5:50 AM
SSC-Enthusiastic

Group: General Forum Members
quote:

..and of course simply bunging in another processor adds to your licensing costs, never mind the cost of RAM, disk and tapes.

Yes, that was pointed out to them, as it was per-processor licensing they were using. That was something the vendor omitted to mention in their 'recommendations'.



Jon Reade



Jon
Post #45860
Posted Tuesday, November 12, 2002 4:31 PM


Mr or Mrs. 500

Group: General Forum Members
This topic is one that I've had in the back of my mind for months now. We're currently mid-stream in a total rewrite of a major application that involves transferring data from a Sybase SQL Anywhere DB to MS SQL Server 2000. (That is an issue in itself, but not one I'm too worried about right now.)

What has been a concern for me is the need to trim the size of our result sets when desktop applications are querying the server for records. The current design, as well as business process and legal issues, means that we must keep all historical data accessible. When someone goes to run our app, they're most likely only concerned with the last week's worth of data. But when someone goes to run reports, they're going to want the last three months of data, and that could be hundreds of thousands of rows, all the way up to current.

I guess my biggest obstacle is what to do with foreign keys. There is always a possibility that a user will need to update an old record, and I don't want to lose data integrity. So maybe we have been doing worst practices, but I don't know a solution. Does anybody else?




Post #45861