SELECT INTO and VARBINARY(max)

  • Hello everyone,

    I am working with some legacy code for my client. They have a table, let's call it MyDocuments, which has the primary key for the Document entity, and also includes the contents of a file in binary format using VARBINARY(MAX), plus some other data. It has about 50,000 rows. I'm not a big fan of the design, I'd prefer to see them in the file system. But it can't be changed at this point (per the client).

    Yesterday I was looking around, and decided to run a query like this, as part of my research on data cleanup:

    SELECT TOP 100 * FROM MyDocuments WHERE Foo IS NOT NULL

    The column Foo is not indexed, so the query plan called for a clustered index scan. However, the query was taking longer than expected to run, so I cancelled it after about 5 minutes. When I looked at the sp_who2 result it showed a "SELECT INTO" operation. After cancel it seemed to take quite a while to roll back, also.

    It seems I made a mistake by selecting * (all columns) from the table, because the rows are quite wide with the VARBINARY included. So I suppose it was trying to move a lot of those VARBINARY values into tempdb. (Which is seriously impractical, because some document contents are quite large.) Can someone confirm this is what I did wrong?

    And my next question: In the future if I include only needed columns in a SELECT list, and don't include the VARBINARY one -- is there any possibility that MSSQL will still internally need to shuttle around the entire contents of that table's rows, including the VARBINARY -- thus still having the same type of performance problem? Or will I be safe with specifying only the narrower columns?

    Thanks!

    Jordan

  • after omiting varbinary column from SELECT list, you will stil get the clustered index scan only.....

    _______________________________________________________________
    To get quick answer follow this link:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Yes, I understand about the clustered index scan, of course without indexing that will happen. I am just concerned whether this type of sluggishness will happen, as I said above, to ensure that MSSQL isn't trying to move that VARBINARY data around, log it, put into tempdb etc. Anyone?

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply