Best way to store images

  • Hi.

    I recognized if we store an image in binary type in SQL SERVER they take more space. For example a 700kb image take about 900kb space.

    With this condition which way is better to store images?

    ___________________________________
    Computer Enterprise Masoud Keshavarz
    I don't care about hell.
    If I go there I've played enough Diablo to know how to fight my way out.

  • masoudk1990 (3/9/2013)


    Hi.

    I recognized if we store an image in binary type in SQL SERVER they take more space. For example a 700kb image take about 900kb space.

    With this condition which way is better to store images?

    Generally speaking, I'll allow storing of small images (such as small badge photographs) in a database and store large binaries in a file-path reference only.

    In all practicality, I'll usually only use the file-path reference method even for small binaries for two reasons.

    1. The data normally originates in a file. If we use the file-path reference method, you can avoid the bit of pain it takes to load a binary into the database. This is especially important if changes to the binary are made. No database action is required, really. Just overwrite the file and you're done.

    2. Since the data normally does originate as a file, most people end up wanting to keep the file even if they don't use it for other things. It seems a bit wasteful to me to have the data stored in two places.

    3. Of course, there's the front end and related security aspects. It's sometimes a whole lot easier on both if the binary data is stored in the database. Of course, if the binary data will never change, it seems a waste to constantly backup potentially voluminous static data over and over. It also makes restores take much longer should a restore need to be accomplished. If the binary data were stored in files instead of the database, restores would be much quicker. It will also distribute the I/O to possibly more than one machine (a file server and SQL Server) which would usually enhance performance.

    So the answer is and although I lean toward the file-path reference method, "It Depends". Every situation is a bit different and I wouldn't rely on any rule of thumb here. Sit down and make a list of requirements vs advantages/disadvantages of each method and then make a decision.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/9/2013)


    masoudk1990 (3/9/2013)


    Hi.

    I recognized if we store an image in binary type in SQL SERVER they take more space. For example a 700kb image take about 900kb space.

    With this condition which way is better to store images?

    ...

    So the answer is and although I lean toward the file-path reference method, "It Depends". Every situation is a bit different and I wouldn't rely on any rule of thumb here. Sit down and make a list of requirements vs advantages/disadvantages of each method and then make a decision.

    Agreed. I have done it both ways. The file path presents an issue with double-hop sometimes and can be a pain to troubleshoot. One thing I sometimes do is to separate the images into a separate filegroup for backup purposes (when stored in the database). If they are static, you could even turn the filegroup into a read-only filegroup. The backup of that filegroup becomes less painful and requires less overall maintenance.

    That said, I too have often found it easier to just store the filepath in the database - especially with frequently changing images or developer design.

    In the end, It Depends.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • This is more of a question that it is a solution but, Jeff, with the advent of filestream technology in more recent versions of sql server have you used that approach or is that more or less what you are refering to by the "file path reference approach"?

  • OTF (3/11/2013)


    This is more of a question that it is a solution but, Jeff, with the advent of filestream technology in more recent versions of sql server have you used that approach or is that more or less what you are refering to by the "file path reference approach"?

    I'd have to include FileStream in the same bucket as the "File Path Reference" approach with some reservations. To be sure, I've not used FileStream, yet (too much on my plate for now), but I like what I see in the documentation. One of the things that would be important to me is that other applications should be able to use the files in the data containers that FileStream uses without having to go through anything having to do with SQL Server or FileStream. For example, I'd like folks to be able to get to the files using only, say, Windows Explorer and be able to open up a PDF or a JPG or whatever just by double-clicking on it. I also need to be able to "update" the content of a file simply by replacing it through something as simple as the Command Prompt. If that can't be done in FileStream data containers directly (and I haven't found such explicit information in the documentation, yet), then Filestream wouldn't work so well for the things I need to do.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/11/2013)


    OTF (3/11/2013)


    This is more of a question that it is a solution but, Jeff, with the advent of filestream technology in more recent versions of sql server have you used that approach or is that more or less what you are refering to by the "file path reference approach"?

    I'd have to include FileStream in the same bucket as the "File Path Reference" approach with some reservations. To be sure, I've not used FileStream, yet (too much on my plate for now), but I like what I see in the documentation. One of the things that would be important to me is that other applications should be able to use the files in the data containers that FileStream uses without having to go through anything having to do with SQL Server or FileStream. For example, I'd like folks to be able to get to the files using only, say, Windows Explorer and be able to open up a PDF or a JPG or whatever just by double-clicking on it. I also need to be able to "update" the content of a file simply by replacing it through something as simple as the Command Prompt. If that can't be done in FileStream data containers directly (and I haven't found such explicit information in the documentation, yet), then Filestream wouldn't work so well for the things I need to do.

    Thanks for the response.

  • I've never used the filestream so I won't comment on it, but I have stored images (and lots of other file types, such as AutoCad) in Oracle and SQL Server in the past. I've regretted it every time. The database files become huge and fragmentation was always a concern. They're already in a physical file and they're presumably stored so the users can get them again, so I just leave them that way. Live and learn.

  • Storing files on File Server is better option if file size is huge.

  • Ed Wagner (3/12/2013)


    I've never used the filestream so I won't comment on it, but I have stored images (and lots of other file types, such as AutoCad) in Oracle and SQL Server in the past. I've regretted it every time. The database files become huge and fragmentation was always a concern. They're already in a physical file and they're presumably stored so the users can get them again, so I just leave them that way. Live and learn.

    I've recently discovered a reason to store even large images in the database. We have call recordings that get stored on disk and then imported into SQL Server. Don't ask me why but I decided to doa reconciliation between the call files on disk and SQL Server. It turned out that people had moved the files several times, left some behind, moved some to different directories (some of which were never backed up), had lost many files in the process, etc, etc, ad infinitum.

    None of that happened with the calls stored in the database even after several moves of the database. The cool part is that the files are right there with the keys to look them up. The "bad" part is that backups have gotten huge and, of course, a restore would take hours. There's a solution for that, though. These calls are, by their very nature, "read only". We're partitioning the call recording table by month with each partition in its own read-only file group. We'll only keep the current month and the previous 3 months in the PRIMARY file group to make restores a no brainer and to allow for "piece-meal" restores of the other file groups (again, 1 per month). This means instead of a 5.5 hour restore in a DR situation, we can get "back in business" in less than a half hour and take our sweet time restoring the older months and in an online fashion. Much like an archive bit in a file system, it allows us to really only backup the data that has changed (new call recordings) or that is super important even if it hasn't changed (previous 3 months).

    I used to be one that said "Hell no, not in my database" to large binary "files" but I'm really starting to see some of the advantages to storing such data in the database when you combine that with some of the tools for handling larger tables. The added bonus is that the front-enders love it, management loves it (no lost or difficult to find recordings), and it has become a whole lot easier to manage the calls because they're guaranteed to all be in the same place.

    One final thing... we keep the informational data for each call in a separate table and have a "sister" table for the recordings. This is kind of the best of both worlds because we can very quickly search the information table (very narrow row sizes and easy to index) and the recordings are "pulled up" only when the correct information is found. Of course, the recording table has the row ID of the informational table in it to provide the link between the two.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Snigdha Vartak (8/20/2013)


    Storing files on File Server is better option if file size is huge.

    When you make such a statement, it would be good if you provided a reason why.

    Shifting gears, I used to think what you stated but, as you can see from my previous post, I'm having a serious change of heart about that. If the files are less than 2GB, they fit quite nicely in the database. Obviously, if they won't fit into 2GB per "file" then you will have to restort to the file system instead. If that's what you meant by "huge", then I agree but not because it's a "better" option... if the files are > 2GB, they won't fit into the MAX datatypes and that eliminates that particular option. Filestream of the old fashioned "store the file path" option would be a way around that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/11/2013)


    I'd have to include FileStream in the same bucket as the "File Path Reference" approach with some reservations. To be sure, I've not used FileStream, yet (too much on my plate for now), but I like what I see in the documentation. One of the things that would be important to me is that other applications should be able to save image[/url] in the data containers that FileStream uses without having to go through anything having to do with SQL Server or FileStream. For example, I'd like folks to be able to get to the files using only, say, Windows Explorer and be able to open up a PDF or a JPG or whatever just by double-clicking on it. I also need to be able to "update" the content of a file simply by replacing it through something as simple as the Command Prompt. If that can't be done in FileStream data containers directly (and I haven't found such explicit information in the documentation, yet), then file saving[/url] Filestream wouldn't work so well for the things I need to do.

    Right now, I am saving file in a table and I know it will suck when the image file size grows bigger. Thus, I am trying to find another way for storing image in sql server. And I am very interested in your file path reference approach. You mentioned the documentation and I wonder could you please recommend that documentation for me? so I can know more about this kind of file saving method. Thanks you so much.

  • aframiller326 (2/13/2014)


    Jeff Moden (3/11/2013)


    I'd have to include FileStream in the same bucket as the "File Path Reference" approach with some reservations. To be sure, I've not used FileStream, yet (too much on my plate for now), but I like what I see in the documentation. One of the things that would be important to me is that other applications should be able to save image[/url] in the data containers that FileStream uses without having to go through anything having to do with SQL Server or FileStream. For example, I'd like folks to be able to get to the files using only, say, Windows Explorer and be able to open up a PDF or a JPG or whatever just by double-clicking on it. I also need to be able to "update" the content of a file simply by replacing it through something as simple as the Command Prompt. If that can't be done in FileStream data containers directly (and I haven't found such explicit information in the documentation, yet), then file saving[/url] Filestream wouldn't work so well for the things I need to do.

    Right now, I am saving file in a table and I know it will suck when the image file size grows bigger. Thus, I am trying to find another way for storing image in sql server. And I am very interested in your file path reference approach. You mentioned the documentation and I wonder could you please recommend that documentation for me? so I can know more about this kind of file saving method. Thanks you so much.

    I believe this is the documentation referred to:

    http://technet.microsoft.com/en-us/library/bb933993(v=sql.105).aspx

  • I'm inclined to go with an "It Depends" answer. In the past I've worked a lot with binary files outside the database, in filestore, and it's been fine; sound files, image files, movie files, executable files, library files - all handled much the same way. Small files and large files - why treat them diferently.

    But I can see cases where I would want to have binary data involved in transactional consistency with other data; that's something I've worked around in the past (using multi-phase transactions, which I generally don't like) but I can see that filestore data might provide a better handle on that. Of course the idea of manipulating some forms of binary data within the database is a bit odd - DML is not really well suited to image processing, for example; writing XPs to do that sort of thing is not much fun, but maybe the ability to use CLI has made that easier.

    The point Jeff makes about people messing around with filestore so that the database references to files are broken is a valid one - it is essential to have the discipline in place that ensures that doesn't happen. Of course if the data is held both inside and outside the database that's still just as important, since it is important to keep the two copies in step, and doing that would mean keeping track of which file corresponds to which blob (or set of blobs) in the database, so I guess if I wanted the stuff in the database I wouldn't want it outside too, there's too much to go wrong, except that I'd probably want to pull it out now and again in order to use standard tools to update it rather than having to reinvent the wheel (and that of course is entirely analagous to letting an accountancy application pull data out and update some of it before putting it back). Of course it's also valid in some cases to have references in the database to files that are absent - it's sometimes contractually essential in a system where one is handling licensed accessed binary material that the material not licensed for use in a country is not held on the servers in that country, but highly desirable that all material is referenced in the databases on those machines so that when a license becomes avaialble (after the usual silly bureaucratic delays created by content-provider obduracy) it can be made available immediately without requiring database upgrades on a subset of machines; that too, of course, requires some pretty solid displine in the approach to file distribution.

    On balance, I would usually just put references to external files in the database; but I imagine there would be cases where I would want the files to be inside the database and have them outside only transiently (just as for any other data in the database), provided I could cope with the backup and recovery problems entailed by having a lot of infrequently updated data in the DB.

    Tom

  • aframiller326 (2/13/2014)


    Jeff Moden (3/11/2013)


    I'd have to include FileStream in the same bucket as the "File Path Reference" approach with some reservations. To be sure, I've not used FileStream, yet (too much on my plate for now), but I like what I see in the documentation. One of the things that would be important to me is that other applications should be able to save image[/url] in the data containers that FileStream uses without having to go through anything having to do with SQL Server or FileStream. For example, I'd like folks to be able to get to the files using only, say, Windows Explorer and be able to open up a PDF or a JPG or whatever just by double-clicking on it. I also need to be able to "update" the content of a file simply by replacing it through something as simple as the Command Prompt. If that can't be done in FileStream data containers directly (and I haven't found such explicit information in the documentation, yet), then file saving[/url] Filestream wouldn't work so well for the things I need to do.

    Right now, I am saving file in a table and I know it will suck when the image file size grows bigger. Thus, I am trying to find another way for storing image in sql server. And I am very interested in your file path reference approach. You mentioned the documentation and I wonder could you please recommend that documentation for me? so I can know more about this kind of file saving method. Thanks you so much.

    Sorry for the late reply. The link behind the bold lettering in my previous reply apparently didn't work for click-ability. Here's the plain link for the "file path reference approach".

    http://www.rasteredge.com/how-to/vb-net-imaging/open-save-file/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/17/2014)


    Here's the plain link for the "file path reference approach".

    http://www.rasteredge.com/how-to/vb-net-imaging/open-save-file/

    Ugh! :sick:

    Couldn't you point to something that doesn't reference that abominable language VB? Can't RasterEdge Image SDK for .NET be used when programming in other languages which support .NET?

    Also, if I do a google search for "RasterEdge Image SDK for .NET" the first non-advert hit is on soft112.com and that's a site which McAfee suggests you don't visit. So probably if anyone wants to download the SDK they should get it from rasteredge.com, not from google's first hit.

    Tom

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply