Retrieving a Binary File from Your Database

  • Comments posted to this topic are about the item Retrieving a Binary File from Your Database

  • Great article, 5 stars

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • In the last code block I think you're missing an apostrophe, or have an extra one on the first line. Makes the whole block red...

    declare @filePath varchar(128)=<your source file path plus filename>'

  • Hi Eric. As I mentioned in my comment on your previous article (regarding reading the file), I think this is a good topic and concept, but there are some aspects of the implementation that are concerning.

    • !! CRITICAL ISSUE !!: As stated in my comment on the prior article (linked above), there is no error handling--either "using" block or explicit try / catch / finally--around the FileStream in the GetBytesFromFile method. This is a huge problem as it means that the fs.Dispose() method won't be called if an exception occurs. While that method is not the focus of this article, that code is shown in this article and people copy-and-paste examples from articles into their code all of the time.

      So again, until the code in the article is updated to trap errors to ensure proper disposal, I would have to recommend against anyone using this in a Production capacity.

      I had previously thought (and had written here) that a similar problem existed with the FileStream object in the SaveFileToDisk method due to the "catch" block swallowing the error. However, this actually shouldn't be a problem due to the compiler putting in a "finally" block with a call to "file.Dispose()" that should get processed no matter what.

    • There is no reason to do the following:

      byte[] Buffer = new byte[Blob.Length];

      for (int i = 0; i < Blob.Length; i++) Buffer = Blob;

      Please see the MSDN page for SqlBytes.Buffer as it is a byte[] already, and a streamed one. By copying the data from the "Blob" SqlBytes input param into the "Buffer" byte[], you are duplicating the data in memory but for no benefit. So if the file was 10 Mb, you now have taken up 20 Mb of memory. And wasting CPU to make that copy.

    • And you definitely don't need:

      System.IO.MemoryStream stream = new System.IO.MemoryStream(Buffer);

      byte[] bytes = new byte[stream.Length];

      stream.Read(bytes, 0, bytes.Length);

      since you already had it in a byte[] called "Buffer" (which again, wasn't needed due to SqlBytes having the "Buffer" property). Copying the data into the MemoryStream created a 3rd copy of it, and then reading that Stream into the "bytes" byte[] created a 4th copy of it in memory. So that 10 Mb file is now taking up 40 Mb of memory.

    • Why would you copy the DLL to “C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL”? I don't see any benefit to that as the DLL isn't needed after it is imported into SQL Server.

    I hope this info helps. Take care, Solomon..

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Solomon Rutzky (5/28/2015)


    Hi Eric. As I mentioned in my comment on your previous article (regarding reading the file), I think this is a good topic and concept, but there are some aspects of the implementation that are concerning.

    • !! CRITICAL ISSUE !!
    • While the FileStream is being handled better this time since it is in a "using" block, the fle.Dispose() call that is being automatically generated by the compiler (via the "using" construct) won't actually get called if an exception is throw. The reason for this is that you have a try / catch encapsulating all of the logic in the "using" block and the "catch" block doesn't throw an error of its own: it swallows the error and returns "0". The only way the FileStream for the file variable will get disposed of in the case of an exception is if the code within the "catch" block has an exception itself.

      So again, until the code in the article is updated to trap errors to ensure proper disposal, I would have to recommend against anyone using this in a Production capacity.[/li]

      I mentioned in the other thread about using ReadAllBytes, in the case of creating the file, you can use WriteAllBytes. Wrapping in a "using" is unnecessary if you do that, and the whole method is reduced to something like this:

      public static int MakeFileFromBytes(string filePath, SqlBytes data)

      {

      int returnValue = 0;

      try

      {

      System.IO.File.WriteAllBytes(filePath, data.Buffer);

      }

      catch (System.IO.IOException ex)

      {

      // Do something with/about the exception

      returnValue = ex.HResult;

      }

      return returnValue;

      }

      Probably better to trap for a general Exception instead of IOException though since a number of things could go wrong.

    • dmbaker (5/28/2015)


      I mentioned in the other thread about using ReadAllBytes, in the case of creating the file, you can use WriteAllBytes. Wrapping in a "using" is unnecessary if you do that, and the whole method is reduced to something like this:

      public static int MakeFileFromBytes(string filePath, SqlBytes data)

      {

      int returnValue = 0;

      try

      {

      System.IO.File.WriteAllBytes(filePath, data.Buffer);

      }

      catch (System.IO.IOException ex)

      {

      // Do something with/about the exception

      returnValue = ex.HResult;

      }

      return returnValue;

      }

      Probably better to trap for a general Exception instead of IOException though since a number of things could go wrong.

      First, I need to say that I realized after posting my first comment that the SaveFileToDisk method should actually not have the unmanaged resource cleanup issue that I originally thought was there (i.e. the part of my post that you quoted above). I have since updated my post above to remove that concern.

      Next, I agree with both this suggestion and the related suggestion on the prior article (i.e. using ReadAllBytes), or at least exploring the possibility of using either or both of those methods. But personally, I do like having the option of setting FileMode and FileAccess.

      Along those same lines, I would also suggest replacing the following code in the "catch" block:

      using (System.IO.StreamWriter sw = new System.IO.StreamWriter(@"C:\temp\error.txt", true))

      {

      sw.Write(ex.Message);

      }

      with the following:

      File.WriteAllText(@"C:\temp\error.txt", ex.Message);

      OR, to get around any possible file system permission issues, just return the error message as the return value of the function. Meaning, instead of returning "int" (which should actually be "SqlInt32"), return SqlString and upon success return an empty string "", or else return ex.Message.

      Take care,

      Solomon..

      SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
      Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
      Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
      Info sitesCollations     •     Module Signing     •     SQLCLR

    • But personally, I do like having the option of setting FileMode and FileAccess.

      Sure, nothing wrong with wanting more control. It might be better to change this up entirely though, and go ahead and make it fully streaming, rather than trying to read/write the whole file contents. If you're dealing with a really, really big file then ReadAllBytes/WriteAllBytes might cause problems, as would the existing code (since it's reading/writing the entire file in one big chunk).

      OR, to get around any possible file system permission issues, just return the error message as the return value of the function. Meaning, instead of returning "int" (which should actually be "SqlInt32"), return SqlString and upon success return an empty string "", or else return ex.Message.

      Yeah, I don't think I'd do that. Might as well just forgo the catch altogether and just let it error like any other TSQL statement would. If the code needs to clean up any resources then you can always try/finally and just let any exception bubble back up (and catch it with a TSQL TRY/CATCH).

    • I tried this example and found out that Excel and Word documents cannot open and it says they are corrupt.

    • rasika.nanayakkara (5/29/2015)


      I tried this example and found out that Excel and Word documents cannot open and it says they are corrupt.

      Since I did not see any mention of file encoding I was wondering if that would rear its ugly head.

      Simple test is to read a large binary file to a blob and then write it right back out using a different name. Then do a binary compare on the two files. They must be identical. Has that been done?

      ATBCharles Kincaid

    • I want to thank everyone for the comments. One of the ways we can learn about different technical solutions to various problems is to submit articles like this in order to have commenters identify technical issues with the proposed solution and offer alternatives. In this way we all learn.

      Solomon suggested that error handling was needed; I agree. This solution was designed as a proof of concept rather than a production application. I would never incorporate code without significant error handling.

      Another commenter to my part one of this series questioned posting the dll to a Sql Server folder. I do this so if the server crashes and has to be rebuilt DBAs always know where to find the dll for a CLR.

      As a final comment, I sensed a level of harshness in the banter between some commenters. Harsh comments even between friends has been known to drive individuals away from a site like this. I have found that graciousness even in differing opinions draws people in.

    • Charles Kincaid (5/29/2015)


      rasika.nanayakkara (5/29/2015)


      I tried this example and found out that Excel and Word documents cannot open and it says they are corrupt.

      Since I did not see any mention of file encoding I was wondering if that would rear its ugly head.

      Simple test is to read a large binary file to a blob and then write it right back out using a different name. Then do a binary compare on the two files. They must be identical. Has that been done?

      I think there was a bug in the original code that was chopping off the last byte in the file? In any case, you don't need to encode/decode binary data, you're simply reading and writing the raw bytes so there's nothing to encode or decode. Unless you're going across different operating systems maybe? I don't know about that though.

      I did test ReadAllBytes/WriteAllBytes methods (with a text file, with an Excel file) and they worked fine.

    • dmbaker (5/28/2015)


      But personally, I do like having the option of setting FileMode and FileAccess.

      Sure, nothing wrong with wanting more control. It might be better to change this up entirely though, and go ahead and make it fully streaming, rather than trying to read/write the whole file contents. If you're dealing with a really, really big file then ReadAllBytes/WriteAllBytes might cause problems, as would the existing code (since it's reading/writing the entire file in one big chunk).

      Full streaming is only really an option for retrieval, and that can be accomplished via the SqlBytes.Buffer property, as in:

      int _ReturnValue = 0;

      try

      {

      using (System.IO.FileStream file = new System.IO.FileStream(sFileName.Value,

      System.IO.FileMode.CreateNew, System.IO.FileAccess.Write))

      {

      file.Write(Blob.Buffer, 0, Blob.Length);

      }

      }

      catch()

      {

      _ReturnValue = 1;

      throw;

      }

      return _ReturnValue;

      That should pretty much be the entire method. Please note that I did reverse the return values since it is customary for 0 to be "success" while positive values indicate various errors (each positive value being a different specific error).

      OR, to get around any possible file system permission issues, just return the error message as the return value of the function. Meaning, instead of returning "int" (which should actually be "SqlInt32"), return SqlString and upon success return an empty string "", or else return ex.Message.

      Yeah, I don't think I'd do that. Might as well just forgo the catch altogether and just let it error like any other TSQL statement would. If the code needs to clean up any resources then you can always try/finally and just let any exception bubble back up (and catch it with a TSQL TRY/CATCH).

      I think that depends on how it is being used. For a single call then yes, it does make more sense to simply allow the method to bubble up the error. But when doing set-based operations, I tend to prefer to simply return the error message such that an error on saving one file doesn't break the operation for the other files. And of course, since that might still be desirable in some cases, I really prefer to pass in a parameter to allow the user to determine how to handle errors: whether to hard-break with raising an exception or continue with returning the error message as the return value.

      Take care,

      Solomon...

      SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
      Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
      Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
      Info sitesCollations     •     Module Signing     •     SQLCLR

    • dmbaker (5/29/2015)


      Charles Kincaid (5/29/2015)


      rasika.nanayakkara (5/29/2015)


      I tried this example and found out that Excel and Word documents cannot open and it says they are corrupt.

      Since I did not see any mention of file encoding I was wondering if that would rear its ugly head.

      Simple test is to read a large binary file to a blob and then write it right back out using a different name. Then do a binary compare on the two files. They must be identical. Has that been done?

      I think there was a bug in the original code that was chopping off the last byte in the file?

      Correct.

      In any case, you don't need to encode/decode binary data, you're simply reading and writing the raw bytes so there's nothing to encode or decode.

      Agreed. Encoding is only an issue for text files.

      Unless you're going across different operating systems maybe? I don't know about that though.

      When moving data between OS's, binary data is still binary data. But text data can be represented differently on different systems, and they might handle newlines differently, so this is where encodings would potentially come into play. This is the difference when using FTP between ASCII and BINARY modes. BINARY mode just transfers the bytes as they are, which will keep text files with their original encoding and newlines which might not be readable on the destination system. ASCII mode will translate things like ASCII character set to EBCDIC (for some IBM systems) and convert any newlines to the destination system's preferred value. And that is why using ASCII mode to transfer a binary type file (e.g. images, zip/gzip/tar/etc archives, docx/xlsx/etc files, and so on), will destroy the file.

      Take care,

      Solomon...

      SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
      Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
      Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
      Info sitesCollations     •     Module Signing     •     SQLCLR

    • eric.notheisen (5/29/2015)


      I want to thank everyone for the comments. One of the ways we can learn about different technical solutions to various problems is to submit articles like this in order to have commenters identify technical issues with the proposed solution and offer alternatives. In this way we all learn.

      You are welcome and I absolutely agree with this.

      Solomon suggested that error handling was needed; I agree. This solution was designed as a proof of concept rather than a production application. I would never incorporate code without significant error handling. (emphasis mine)

      So this is really the main issue: you did not state that you were presenting a proof of concept and that it wasn't Production-ready. Though, even if you had there is still the issue that once you show somebody either code or the working program, then it is at least difficult, if not impossible, to keep it as "proof of concept" because people see something that appears to be working and "working = working". And while you would "never incorporate code without significant error handling", the audience of this article is generally someone who is much more familiar with T-SQL and usually has little to no knowledge of C# / .NET, etc. Meaning, a large portion of your audience has no idea that there is no error handling in the "GetBytesFromFile" method. And how many readers are aware that a "FileStream" object needs to be disposed? How many are even aware that a "using(x) { code }" block is a compiler macro that rewrites the code into a "try { code } finally { x.Dispose(); }" structure? It should be assumed that the audience generally won't be able to identify that a) things are missing, and b) what to do about the things that are missing.

      And to be fair, this is not an issue specific to these two articles. This is unfortunately an issue with far too many articles and blog posts, etc, where proper error and resource handling is omitted, and best-practices are not followed, in the interest of time since it is just a demo, but people copy and paste this stuff into Production far too often. Sure, people that copy and paste without having someone knowledgeable review it first certainly share in the responsibility when things go wrong, but there really needs to be a very large / clearly visible warning in the article that what is being presented is not Production-ready. And this is why I still feel that these two articles should be updated at the very least to include such warnings, if not to update the code itself to be Product-ready (especially with regards to simplifying the code in the "SaveFileToDisk" method such that it doesn't quadruple the memory usage per call).

      Another commenter to my part one of this series questioned posting the dll to a Sql Server folder. I do this so if the server crashes and has to be rebuilt DBAs always know where to find the dll for a CLR.

      Actually, that was also me ;-). Again, explaining your reasoning in the article would help. However, it is unnecessary to have the DLL laying around for 2 reasons:

      1) Assemblies are data in the database, just like all other object definitions, and get backed up with the database. Restoring the database will restore the Assembly and any of the T-SQL wrapper objects that have been created (i.e. the CREATE FUNCTION ... AS EXTERNAL NAME ...)

      2) If the server crashed and needs to be rebuilt, you might not have access to any folders.

      So, a better location to store such things is wherever you keep your release scripts. This is also why I prefer to not use DLLs at all, but instead use the "CREATE ASSEMBLY FROM 0x...." syntax since everything is encapsulated in the .sql script with no external dependencies.

      As a final comment, I sensed a level of harshness in the banter between some commenters. Harsh comments even between friends has been known to drive individuals away from a site like this. I have found that graciousness even in differing opinions draws people in.

      I completely agree. And sadly I think I am involved, at least a little bit, in that so I do apologize and will do better.

      Take care,

      Solomon..

      SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
      Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
      Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
      Info sitesCollations     •     Module Signing     •     SQLCLR

    • Some good discussion on this one.

    Viewing 15 posts - 1 through 14 (of 14 total)

    You must be logged in to reply to this topic. Login to reply