July 25, 2009 at 8:39 pm
I am building an app that will back up files and directories. After the first backup, I only want to back up the changed files and delete the files that no longer exist. This means that I need to store information about the files that have been backed up so that it can be compared to the current info on each file. That being said, I am wondering if SQL Server is an appropriate tool for this purpose?
If it is, what would be the recommended structure for the stored data? The path format is server/drive/directory/file name. The app will be processing 20 - 25 million files, so I think that performance should be a major consideration from the start. The data will be encrypted and condensed prior to being transmitted to a remote site for storage. I am currently favoring the idea of a total backup every weekend, with just the changes being backed up during the week. I am not sure if the total db should be sent for backup or if separate files or groups of files should be saved independently, not in a db.
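To make the comparison step concrete, here is a minimal sketch in Python. It assumes the stored information is just a (size, modification time) pair per full path; the name `diff_manifests` and the sample paths are made up for illustration and are not part of any existing tool.

```python
from typing import Dict, List, Tuple

# Assumed record per file: (size in bytes, modification time as a Unix timestamp).
FileInfo = Tuple[int, float]


def diff_manifests(
    previous: Dict[str, FileInfo], current: Dict[str, FileInfo]
) -> Tuple[List[str], List[str], List[str]]:
    """Compare the stored manifest with a fresh scan.

    Returns three sorted lists of paths: new files, changed files,
    and files that no longer exist in the current scan.
    """
    new = sorted(p for p in current if p not in previous)
    changed = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    deleted = sorted(p for p in previous if p not in current)
    return new, changed, deleted


# Hypothetical sample data in the server/drive/directory/file name format.
previous = {
    "srv1/d/docs/a.txt": (100, 1000.0),
    "srv1/d/docs/b.txt": (200, 1000.0),
    "srv1/d/docs/c.txt": (300, 1000.0),
}
current = {
    "srv1/d/docs/a.txt": (100, 1000.0),  # unchanged
    "srv1/d/docs/b.txt": (250, 2000.0),  # size and time differ
    "srv1/d/docs/d.txt": (400, 2000.0),  # not seen before
}

new_files, changed_files, deleted_files = diff_manifests(previous, current)
```

With 20 - 25 million files the manifest would probably not be held in memory like this, but the same three-way split (new, changed, deleted) applies whichever store is used.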
The "pipe" for data transfer is a dedicated T1.
If a file needs to be retrieved from the backup, having to retrieve only a subset of the total will reduce both the retrieval time and the amount of data that is transferred. I think that retrieving the entire db should be avoided, except in the case of a total restore.
As I understand things now, the db should be clustered.
Since this app will run on a machine with a four-core CPU that will be dedicated to this process for the duration of the application's operation, and because any type of db process is relatively slow, I am planning on running multiple threads. However, I will have to experiment with what the threads should be doing: encrypting, condensing, or data transfer.
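As a starting point for that experiment, here is a rough Python sketch that hands the condensing step to a small thread pool. It assumes zlib compression stands in for whatever condensing method is finally chosen, and the function names are hypothetical. Encryption and transfer are left out, since the standard library has no encryption module to use here.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def compress_chunk(data: bytes) -> bytes:
    """Compress one block of file data with zlib at compression level 6."""
    return zlib.compress(data, 6)


def compress_all(chunks: List[bytes], workers: int = 4) -> List[bytes]:
    """Compress every chunk using a pool of worker threads.

    pool.map returns results in the same order as the input chunks.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_chunk, chunks))


# Made-up sample data; real input would be blocks read from the files.
chunks = [b"a" * 1000, b"b" * 2000, b"abc" * 500]
compressed = compress_all(chunks, workers=4)
restored = [zlib.decompress(c) for c in compressed]
```

Timing `compress_all` with different `workers` values on real data is one simple way to see whether four threads actually help on the four-core machine.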
Any assistance you can offer will be appreciated.
TH
July 25, 2009 at 9:08 pm
...and delete the files that no longer exist
This is a bit unclear to me. How will you delete files which do not exist?
I am wondering if SQL Server is an appropriate tool for this purpose?
SQL Server isn't a tool. It's a database management system.
What I understand is that you have many files, some of which may change. You need to back up the entire set of files first and, going forward, only those files which have changed since the last full backup.
I'm unclear on how you plan to achieve this. Do you want to have a list of all files in a table and then back up files from your hard drive based on that list? In that case, you may want to store the files' details, including the Modified Date. Initially, you fire a query against this table to fetch all file names and then physically back the files up. You record the backup date in a column in the same table. In the future, you query the table to find out which files have been modified after their last backup date, get the names of those files, and then back them up.
Did I understand your requirement correctly?
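The table-driven approach described above could look roughly like this. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere, and the table name, column names, and helper functions are all made up for illustration. Timestamps are stored as ISO 8601 text, which compares correctly as plain strings.

```python
import sqlite3
from typing import List


def open_catalog(path: str = ":memory:") -> sqlite3.Connection:
    """Open the catalog and create the (hypothetical) file table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS backed_up_file (
               full_path      TEXT PRIMARY KEY,
               modified_at    TEXT NOT NULL,   -- file's Modified Date
               last_backup_at TEXT             -- NULL until first backup
           )"""
    )
    return conn


def files_needing_backup(conn: sqlite3.Connection) -> List[str]:
    """Return paths never backed up or modified after their last backup."""
    rows = conn.execute(
        "SELECT full_path FROM backed_up_file "
        "WHERE last_backup_at IS NULL OR modified_at > last_backup_at "
        "ORDER BY full_path"
    ).fetchall()
    return [row[0] for row in rows]


def mark_backed_up(conn: sqlite3.Connection, full_path: str, when: str) -> None:
    """Record the backup date for one file."""
    conn.execute(
        "UPDATE backed_up_file SET last_backup_at = ? WHERE full_path = ?",
        (when, full_path),
    )


conn = open_catalog()
conn.executemany(
    "INSERT INTO backed_up_file VALUES (?, ?, ?)",
    [
        ("srv1/d/docs/a.txt", "2009-07-20T10:00:00", "2009-07-25T02:00:00"),
        ("srv1/d/docs/b.txt", "2009-07-26T09:30:00", "2009-07-25T02:00:00"),
        ("srv1/d/docs/c.txt", "2009-07-26T11:00:00", None),
    ],
)
pending = files_needing_backup(conn)
for path in pending:
    mark_backed_up(conn, path, "2009-07-27T02:00:00")
pending_after = files_needing_backup(conn)
```

Here `pending` holds the modified file and the never-backed-up file, and `pending_after` is empty once both have been marked. Keeping the table in step with the disk (rescanning Modified Date, spotting deleted files) is a separate step that this sketch does not cover.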