Starting a new FileTable

  • I have a database with two versions - a production version and a development version. As I fix and upgrade, I regularly copy data from the production version into the development version, so that I have current data to experiment on, and when it's time for a new rollout, I then delete the production verion and restore the upgraded development version to become the current (newest) production version. All this works fine - I have been doing this since around 2008, and using this backup/restore method this way also has me regularly verifying that my backups are, in fact, fully functional.

    One of the tables is a table of filenames. There is an associated directory, with files and subdirectories that are 'related' to file records. Users may add files using the database app, or copy them in directly using a shared network path. There is no real binding - the records' ID must appear as the first part of the filename, then any character that cannot be part of the ID, then anything else. So, record E47 may have associated files E47.jpg, E47.pdf, E47 fossil.png (the space may not exist in an ID), E047.docx (leading zeros are not significant), E00047.gif, E47;Let It Be.txt (the semicolon may not be part of the ID), and so on. Badly constructed names are not a problem, they just don't appear with any record. I occasionally run a manual scan for any filenames that do not map to an existing record ID. It's not an error - files may appear before a record is created, but I want to keep an eye on it, and make sure there is not a serious accumulation of such orphans.

     

    Initially, I had the DB scan the proper subfolder looking for each file that matched the name when the user wanted to examine a record in detail. That eventually became too slow, as the number of such files increased - it is now almost 75,000 such files. I then added a table of filenames, and the database updates it when the user adds a file using the app, and there is a stored procedure that runs every night to pick up any new files that have been added directly into the share.

     

    Again, all this works, but it's kind of a Rube Goldberg setup, and I now want to migrate this entire apparatus to a FileTable.

     

    My question (finally) - if I put the FileTable into a database, is there a way for that to work with both databases? Will SQL Server allow two databases to simultaneously work with one storage location? My first guess is not, since you have to CREATE a FileTable, with the proper path, and not use an existing folder on the machine. If I restore a copy of a database that has such a FileTable to a different name (as I now do, when my new and improved development version becomes the production version), will that restored FileTable in the production version conflict with original FileTable created in the development version? Will such a database even restore properly?

    I've been experimenting with a test FileTable, and it works about like I'd expect - adding and deleting files is instantly reflected in the table, and deleting a record in the table also deletes the file, so I could see a potential for serious conflict if two databases were trying to maintain the same external filestructure.

    My guess is that the best way to deal with this is to put the FileTable in a separate database in the instance, then either refer to it by the database name, or set up a synonym in each database pointing to the database containing the FileTable.

  • Update - I put a test FileTable into a separate database, backed it up and restored it to a different name. The existing FileTable restored, fully and with no error message, but the link to the file share is broken. When I add or remove files in the share, the FileTable in the original database reflects the changes, and deleting a line in the table deletes the corresponding file. But the FileTable in the restored copy is a static snapshot of the state at the moment the backup was created - changes in the fileshare do nothing, and deletes from the table have no effect on the share. So I guess I answered my own question - I have to use either a separate database for the FileTable, or mung up some complex mechanism to detect whether code is running in the production version or the development version - maybe a configurable synonym or something similar. But that's getting off the original topic.

  • I've not used "File Table" but I want to say "Thank you" for taking the time to share what you're doing.  You've always been one of the "good guys" on this forum and I'm glad to see that you're still around.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for the kind words. I'm one of those accidental DBAs, and I have a great deal of other work, so I come and go in various forums, as my workflow dictates. And I was out of the country for half of 2022, so a lot of my stuff was on hold for that half-year.

     

    I now wonder about backing up the FileTable database. I tried detaching the original, then completely deleting all the files and folders of the share. When I then restored the FileTable database,  using its original name again, it put everything back, specifically the share and all files/folders therein. If I do a full backup, I will apparently be making a full copy of the entire share, which is by now almost 300 GB. I have a regular backup job, which does transaction log backups every fifteen minutes, differential backups on the hour and a full backup every night. There is a test of the last change date/time, so that  if there was no activity during the preceding 15 min / hour / day, no backup is performed, to keep from making pointless copies of unchanged data.

     

    But if I employ a similar strategy for this FileTable database, I'm going to quickly run out of room, not to mention beating up the drives by constantly going over the same stuff for no reason. I wonder if I should use the same strategy, but with vastly extended intervals (like week / month / year, instead of 15 min, hour / day), or if I would be better off not backing up the FileTable database at all, and instead backing up the files and folders directly, via the share path. The FileTable can be created any time, by simply making one and copying the files into the created share. It seems like a recreation and reload from backup might be easier than slogging through the accumulated log and diff backups.

     

    And I'm not sure what are the effects of a delete on such backups. If a file is deleted, will an earlier restore restore the file, then a later restore remove it again? It seems like it would, but what if the reason I'm going back is to restore just one accidentally deleted file? Can I pick that one out of the FileTable somehoew and restore only it, or would I have to delete the entire share, restore piece by piece until the file appears, copy it out, continue the restore through the subsequent delete and on up to get back to 'the present', then copy the file back in? That seems like a huge amount of work, and I cannot find anything in BOL that goes into that much detail on the subject. And what if a file is added and then accidentally removed again all between backup cycles? Is that file gone forever? I suppose it's no different from the same thing happening in one day, between nightly file backups,

     

    It looks like I have a lot more testing to do before I implement this fully.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply