Document Management

  • I am creating a document management system that will be accessible through a GIS interface over the web. You will click on a "polygon" and through a relation have a SQL Server 2005 Database show all of the documents that are tied to that polygon. I want the end user to be able to upload pdf documents to the SQL Server database, so I figured the best way to accomplish this is to have the pdf documents stored in the SQL Server Database. I've been reading that having the pdf in the Sql database will severly hamper performance, but I think it will be easier to to upload to a database instead of uploading to a folder and typing the path in the SQL Database.

    Any suggestions?

  • "Severely Hamper Performance" is an open-ended statement. Performance of what? You can use a blob field and upload the document into it and it will not hurt database performance in other areas at all. If you need to, for some reason, open and scan through these documents regularly, you may have a problem, but I don't think you are trying to search in the documents for any reason. If you are just looking for a storage place for a PDF and link it to a record, putting it in your database will work fine.

    In many cases, when pdf documents get stored in a database, they need to be regularly streamed to a file and then the file is opened to view it This is often inefficient since you have to repeatedly stream the file contents to a file location and then delete it when done. On the other side of the argument, the file system is not a database and does not often get backed up at the same time, will not enlist in a transaction, and has a general risk of being different than the table you keep in your database. I am typically willing to accept a pretty big performance hit for data integrity, but many situations can be drastically different.

  • The performance is nicely summarized above and it's really a good debate. But the answer is that it depends.

    I know, that sucks. But it depends on how you architect things. I think you can get some performance benefits through good design by putting large documents/images in the file system, but then you have an admin issue of making sure those are backed up and somehow associated with the database. This is especially problematic when you migrate servers or have a DR situation.

    You also have issues if someone renames, moves, permissions the file system.

    Microsoft is adding extensions in SQL Server 2008 to make this smoother. They'll have ways to stream to the filesystem, but retrieve from T-SQL.

    What's large? If these are documents that easily fit inside your database without causing issues on the disk system (in terms of gross space) and don't overwhelm your backups, I'd lean that way. It's simpler to me.

    I might also set up a separate filegroup for the images just so you can grab backups of one or the other if you need to.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply