Extract/export embedded files from binary in SQL 2017 Standard

  • Hi Everyone !

    If I run

    SELECT * FROM [QPulse5Training].[dbo].[RevisionAttachmentItem]

    I get a generated list of 218 records with a small number of NULL in mixed formats, including PDF, Word, Excel and image files.

    If I run the following (taken from this web page https://www.c-sharpcorner.com/blogs/export-blobbinary-or-varbinary-from-sql-table-and-save-it-as-a-file - credit where it is due), I get 102 documents exported, although the query completes with 218 rows affected:

    ============

    USE [QPulse5Training]

    GO

    DECLARE @outPutPath varchar(50) = 'C:\xp'

    , @i bigint

    , @init int

    , @data varbinary(max)

    , @fPath varchar(max)

    , @folderPath varchar(max)

    --Get Data into temp Table variable so that we can iterate over it

    DECLARE @Doctable TABLE (id int identity(1,1), [FileName] varchar(100), [Doc_Content] varBinary(max) )

    INSERT INTO @Doctable( [FileName],[Doc_Content])

    Select [DisplayFileName],[FileData] FROM [dbo].[RevisionAttachmentItem]

    --SELECT * FROM @table

    SELECT @i = COUNT(1) FROM @Doctable

    WHILE @i >= 1

    BEGIN

    SELECT

    @data = [Doc_Content],

    @fPath = @outPutPath + '\' +[FileName],

    @folderPath = @outPutPath

    FROM @Doctable

    WHERE id = @i

    --Create folder first

    EXEC sp_OACreate 'ADODB.Stream', @init OUTPUT; -- An instance created

    EXEC sp_OASetProperty @init, 'Type', 1;

    EXEC sp_OAMethod @init, 'Open'; -- Calling a method

    EXEC sp_OAMethod @init, 'Write', NULL, @data; -- Calling a method

    EXEC sp_OAMethod @init, 'SaveToFile', NULL, @fPath, 2; -- Calling a method

    EXEC sp_OAMethod @init, 'Close'; -- Calling a method

    EXEC sp_OADestroy @init; -- Closed the resources

    print 'Document Generated at - '+ @fPath

    --Reset the variables for next use

    SELECT @data = NULL

    , @init = NULL

    , @fPath = NULL

    , @folderPath = NULL

    SET @i -= 1

    END

    ============

    Obviously there is a difference between the number of records in the DB and the number of exported documents, and I will do an actual count minus NULL entries to figure out whether that is in fact correct.

    However, if I try to run a join and export approved documents only, using WHERE [dbo].[Revision].[StatusID] = 2, it goes south.

    This is obviously not my area of expertise...

    What I need to achieve is to extract/export documents from [dbo].[RevisionAttachmentItem] using [DisplayFileName] to name the exported files, and the [FileData] field to extract the file data itself, and limit the export to approved documents only which is where [dbo].[Revision].[StatusID] = 2 comes in.

    I have tried various JOINs along the lines of the following, but I om obviously not getting something right.

    INNER JOIN [dbo].[RevisionAttachmentPdf] ON [dbo].[RevisionAttachmentItem].[PdfId] = [dbo].[RevisionAttachmentPdf].[ID]

    INNER JOIN [dbo].[Revision] ON [dbo].[RevisionAttachmentItem].[RevisionID] = [dbo].[Revision].[ID]

    INNER JOIN [dbo].[Document] ON [dbo].[Revision].[DocumentID] = [dbo].[Document].[ID]

    INNER JOIN [dbo].[yDocumentType] ON [dbo].[Document].[DocumentTypeID] = [dbo].[yDocumentType].[ID]

    WHERE [dbo].[Revision].[StatusID] = 2 AND [dbo].[RevisionAttachmentItem].[ID] = @i

    Any thoughts on how I can achieve this ?

    So far I have found an amazing number of ways for it not to work 🙂

    Thanks for looking !

  • The first problem that we have in trying to answer this is that we cannot see your database and therefore it is difficult to advise on the validity of JOINs.

    Regarding this statement:

    However, if I try to run a join and export approved documents only, using WHERE [dbo].[Revision].[StatusID] = 2, it goes south.

    What do you mean, exactly, by "it goes south"? Does it generate an error? If so, what is it?

    Have you detected a pattern when comparing the 108 successes against the failures? There must be something common in the data to cause this.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • To make things a little clearer, here is that code again, but reformatted:

    USE QPulse5Training;
    GO

    DECLARE @outPutPath VARCHAR(50) = 'C:\xp'
    ,@i BIGINT
    ,@init INT
    ,@data VARBINARY(MAX)
    ,@fPath VARCHAR(MAX)
    ,@folderPath VARCHAR(MAX);

    --Get Data into temp Table variable so that we can iterate over it
    DECLARE @Doctable TABLE
    (
    id INT IDENTITY(1, 1)
    ,FileName VARCHAR(100)
    ,Doc_Content VARBINARY(MAX)
    );

    INSERT INTO @Doctable
    (
    FileName
    ,Doc_Content
    )
    SELECT DisplayFileName
    ,FileData
    FROM dbo.RevisionAttachmentItem;

    --SELECT * FROM @table
    SELECT @i = COUNT(1)
    FROM @Doctable;

    WHILE @i >= 1
    BEGIN
    SELECT @data = Doc_Content
    ,@fPath = @outPutPath + '\' + FileName
    ,@folderPath = @outPutPath
    FROM @Doctable
    WHERE id = @i;

    --Create folder first
    EXEC sys.sp_OACreate 'ADODB.Stream', @init OUTPUT; -- An instance created

    EXEC sys.sp_OASetProperty @init, 'Type', 1;

    EXEC sys.sp_OAMethod @init, 'Open'; -- Calling a method

    EXEC sys.sp_OAMethod @init, 'Write', NULL, @data; -- Calling a method

    EXEC sys.sp_OAMethod @init, 'SaveToFile', NULL, @fPath, 2; -- Calling a method

    EXEC sys.sp_OAMethod @init, 'Close'; -- Calling a method

    EXEC sys.sp_OADestroy @init; -- Closed the resources

    PRINT 'Document Generated at - ' + @fPath;

    --Reset the variables for next use
    SELECT @data = NULL
    ,@init = NULL
    ,@fPath = NULL
    ,@folderPath = NULL;

    SET @i -= 1;
    END;

    Note that @folderPath is declared but not used.

     

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Thanks for your replies !

    Images of the relevant tables/fields now included. I could sanitise the test database I am using and provide a copy of that as well, if required.

    Basically if I adjust the scipr to pull direct from the source table and not use a temp table I still get file exports, but if I try to get clever and use JOINs it repeatedly fails, seeming unable to use:

    WHERE [dbo].[Revision].[StatusID] = 2  AND id = @i

    Basically it can't work out the AND id = @i part, which I suspect (but don't know) could be down to not knowing which id to reference, given id features in other tables within the JOIN.

    It's a long time since I fooled with sub-queries (which I know are very different to the above) but I seem to recall having to declare tables and same-name fields to avoid confusion along these lines.

     

    Example:

    FROM [dbo].[RevisionAttachmentItem]

    INNER JOIN [dbo].[RevisionAttachmentPdf] ON [dbo].[RevisionAttachmentItem].[PdfId] = [dbo].[RevisionAttachmentPdf].[ID]

    INNER JOIN [dbo].[Revision] ON [dbo].[RevisionAttachmentItem].[RevisionID] = [dbo].[Revision].[ID]

    INNER JOIN [dbo].[Document] ON [dbo].[Revision].[DocumentID] = [dbo].[Document].[ID]

    INNER JOIN [dbo].[yDocumentType] ON [dbo].[Document].[DocumentTypeID] = [dbo].[yDocumentType].[ID]

    WHERE [dbo].[Revision].[StatusID] = 2 AND id = @i

    I hope that makes it clearer 🙂

    Attachments:
    You must be logged in to view attached files.
  • I haven't looked at the tables, but I think a CURSOR may be worth considering here. I've rewritten your query to use a cursor – untested, see what you think:

    USE QPulse5Training;
    GO

    DECLARE @outPutPath VARCHAR(50) = 'C:\xp'
    ,@init INT
    ,@fPath VARCHAR(MAX);

    DECLARE @FileName VARCHAR(100)
    ,@Doc_Content VARBINARY(MAX);

    DECLARE DocsToExport CURSOR LOCAL FAST_FORWARD FORWARD_ONLY FOR
    SELECT DisplayFileName
    ,FileData
    FROM dbo.RevisionAttachmentItem;

    OPEN DocsToExport;

    FETCH NEXT FROM DocsToExport
    INTO @FileName
    ,@Doc_Content;

    WHILE @@FETCH_STATUS = 0
    BEGIN
    SELECT @fPath = @outPutPath + '\' + @FileName;

    EXEC sys.sp_OACreate 'ADODB.Stream', @init OUTPUT; -- An instance created

    EXEC sys.sp_OASetProperty @init, 'Type', 1;

    EXEC sys.sp_OAMethod @init, 'Open'; -- Calling a method

    EXEC sys.sp_OAMethod @init, 'Write', NULL, @Doc_Content; -- Calling a method

    EXEC sys.sp_OAMethod @init, 'SaveToFile', NULL, @fPath, 2; -- Calling a method

    EXEC sys.sp_OAMethod @init, 'Close'; -- Calling a method

    EXEC sys.sp_OADestroy @init; -- Closed the resources

    PRINT 'Document Generated at - ' + @fPath;

    FETCH NEXT FROM DocsToExport
    INTO @FileName
    ,@Doc_Content;
    END;

    CLOSE DocsToExport;
    DEALLOCATE DocsToExport;

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Many thanks for your considered and interesting reply.

     

    I have just run your code and I get the same number of documents exported.

  • Xunil wrote:

    Many thanks for your considered and interesting reply.

    I have just run your code and I get the same number of documents exported.

    Can we recap on exactly what the problem is, please?

    Your StatusId = 2 requirement is easily accommodated in my later post – by changing this part as shown:

    DECLARE DocsToExport CURSOR LOCAL FAST_FORWARD FORWARD_ONLY FOR
    SELECT DisplayFileName
    ,FileData
    FROM dbo.RevisionAttachmentItem
    WHERE StatusId = 2;

    Does that get you what you want? If not, please explain again what is missing/wrong with what is happening, thanks.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • StatusId is in dbo.Revision, so I think you'd need to adjust slightly to something like the following

    DECLARE DocsToExport CURSOR LOCAL FAST_FORWARD FORWARD_ONLY FOR
    SELECT RevisionAttachmentItem.DisplayFileName
    ,RevisionAttachmentItem.FileData
    FROM dbo.RevisionAttachmentItem
    INNER JOIN dbo.Revision ON Revision.Id = RevisionAttachmentItem.RevisionID
    WHERE Revision.StatusId = 2;
  • ratbak wrote:

    StatusId is in dbo.Revision, so I think you'd need to adjust slightly to something like the following

    Well spotted, thanks for fixing that up.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Wow!

     

    That actually appears to work !!!

     

    Thank you so much for your collective input. I will run some tests ansd make sure it is extracting the sub-set I need but, right now, it looks very solid.

     

    Medals and promotions all round 🙂

     

     

  • This was removed by the editor as SPAM

  • This was removed by the editor as SPAM

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply