Need to store .wav files out on disk, but the metadata should remain inside the database

  • One of my clients have a project requirement, where they need to store the .wav files out of the database onto the storage subsystem, and have the pointers or metadata stored inside the database. I know FileStream is a compelling option, but I am not sure whether it can be used to server the above purpose. Also, client asked me to look into RBS (Remote Blob Storage) as an option, but nevertheless when I googled it, each, and every article points me towards how that is implemented for Sharepoint, and Content Databases. Now, I am not certain whether RBS can be used to provide a solution for non Sharepoint databases. Since, I haven't had a single article justifying the fact I can't tell the client if that is possible.

    Could you'll please provide me an answer or a mechanism to deal with it. Ya, and I am not sure how the front end would deal with it, but I am more concerned how it would look in the back end, and how I would end up implementing it. Any comments or suggestions is highly appreciated.

    Thanks

    Regards,

    Feivel

  • Bear in mind that the situation you describe is complex enough that vendors such as Hyland sell software up into the six figure range (OnBase) to do exactly what you describe. This is a basic document management situation, even if the "document" is an audio file. If you don't want to use filestream, you'll have to code a custom app to locate the file in the filesystem and then store the metadata in the database, same as any other document management system. There's nothing out of the box for SQL other than filestream to manage something like this.

  • You can use FileStream concepts here...

    However when you want to open specific file , add with file name type like .wav or .txt or .jpeg etc...

    I got similar requirement for long back to store Case documents, retrieving is need to cross check once, if you do direct access it could not open.

    Thanks,

    Sasidhar P

  • jeff.mason (5/3/2016)


    Bear in mind that the situation you describe is complex enough that vendors such as Hyland sell software up into the six figure range (OnBase) to do exactly what you describe. This is a basic document management situation, even if the "document" is an audio file. If you don't want to use filestream, you'll have to code a custom app to locate the file in the filesystem and then store the metadata in the database, same as any other document management system. There's nothing out of the box for SQL other than filestream to manage something like this.

    Thanks! Jeff. I do not think client would invest in any outside software app for this. Basically, one of the Solution Architect proposed this solution to use RBS (Remote Blob Store). However, since I haven't worked on that piece I was not really sure if I should say yes or should come up with a different alternative. RBS does use Filestream underneath it, infact that is a requirement for RBS to work. But, as I mentioned previously whenever I googled it all the examples that I am getting for this stuff relates to Sharepoint, and I am not sure how to implement it without it.

    It would be helpful if you or someone in the forum could provide me a step by step approach or at least point me in a right direction where I do not see that being implemented using Sharepoint.

    Thanks

  • I think you're overcomplicating the matter.

    There must be already some software which creates and/or saves the sound files in the file storage.

    Without it there would not be .wav files.

    So, your task in only to modify this software by adding a call to a DB procedure updating a record about the .wav file with its current location.

    Procedure should be very simple: update a record for a file by replacing the old location (1st parameted) with a new one (2nd parameter). If there is no record in DB which matches the 1st parameter (this includes the situation when 1st parameter is NULL) it creates a new record. If 2nd parameter is null it deletes a record matching the 1st parameter.

    If you cannot modify the application responsible for storing the file you can do it even with a simple SQLCMD command in a batch file.

    _____________
    Code for TallyGenerator

  • Sergiy (5/4/2016)


    I think you're overcomplicating the matter.

    There must be already some software which creates and/or saves the sound files in the file storage.

    Without it there would not be .wav files.

    So, your task in only to modify this software by adding a call to a DB procedure updating a record about the .wav file with its current location.

    Procedure should be very simple: update a record for a file by replacing the old location (1st parameted) with a new one (2nd parameter). If there is no record in DB which matches the 1st parameter (this includes the situation when 1st parameter is NULL) it creates a new record. If 2nd parameter is null it deletes a record matching the 1st parameter.

    If you cannot modify the application responsible for storing the file you can do it even with a simple SQLCMD command in a batch file.

    Brother,

    I don't think you've understood my question correctly or have seen my last comments. I apologize upfront if I am sounding complicated here, but my question revolves around using RBS as a solution. Client only needs RBS, but I do not have specific implementation steps for it, and from Microsoft I could only find the implementation of RBS using Sharepoint Content databases, so I would repeat my question now...do you have a step by step approach on implementing RBS for a non-Sharepoint databases. In case if you have implemented anything of this nature in the past, could you please provide me the details...it would be really helpful.

  • We might be using different Googles.

    Because mine one has very little presence of SharePoint in relation to RBS.

    First of all, it tells me that:

    "Remote Blob Storage (RBS) is a library API set "

    https://blogs.msdn.microsoft.com/sqlrbs/page/2/

    And then it sends me to number of codeplex pages with manual instructions, starting with this one:

    http://sqlrbs.codeplex.com/

    So, as I said - you need to connect your sound file producing application to RBS (using API provided).

    If you cannot modify the application - create you own to access the storage.

    And you still need to read status confirmation about stored files from RBS interfaces and update records in SQL database correspondingly.

    _____________
    Code for TallyGenerator

  • Sergiy (5/4/2016)


    We might be using different Googles.

    Because mine one has very little presence of SharePoint in relation to RBS.

    First of all, it tells me that:

    "Remote Blob Storage (RBS) is a library API set "

    https://blogs.msdn.microsoft.com/sqlrbs/page/2/

    And then it sends me to number of codeplex pages with manual instructions, starting with this one:

    http://sqlrbs.codeplex.com/

    So, as I said - you need to connect your sound file producing application to RBS (using API provided).

    If you cannot modify the application - create you own to access the storage.

    And you still need to read status confirmation about stored files from RBS interfaces and update records in SQL database correspondingly.

    I don't know what to tell you if you think you're Google is responding to you with something that my Google does not. Anyways, these links I already know about them, so there is nothing new in there. I can also redirect you to a website with a list of links, and which doesn't even make sense for me to do. I guess you talk a lot about implementing API etc, and I told you clearly that I am not concerned how they are going to handle this from the front end, but I would like to focus more on how to implement things at the back end. Anyways, thanks! for the help. I thought the way you are explaining it...seems that you might have worked on it or for that matter I even ended up thinking that you might actually provide me with an implementation documentation...my bad. If you have time go through this link below, and click on the whitepaper link in that webpage, and read it if possible...and yes this you can get from whichever Google you use.

    https://msdn.microsoft.com/en-us/library/gg638709(v=sql.110).aspx

  • ffarouqi (5/2/2016)


    One of my clients have a project requirement, where they need to store the .wav files out of the database onto the storage subsystem, and have the pointers or metadata stored inside the database. I know FileStream is a compelling option, but I am not sure whether it can be used to server the above purpose. Also, client asked me to look into RBS (Remote Blob Storage) as an option, but nevertheless when I googled it, each, and every article points me towards how that is implemented for Sharepoint, and Content Databases. Now, I am not certain whether RBS can be used to provide a solution for non Sharepoint databases. Since, I haven't had a single article justifying the fact I can't tell the client if that is possible.

    Could you'll please provide me an answer or a mechanism to deal with it. Ya, and I am not sure how the front end would deal with it, but I am more concerned how it would look in the back end, and how I would end up implementing it. Any comments or suggestions is highly appreciated.

    Thanks

    Regards,

    Feivel

    Hmmm... ".WAV" files. Is this for a telephone system by any chance?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/5/2016)


    ffarouqi (5/2/2016)


    One of my clients have a project requirement, where they need to store the .wav files out of the database onto the storage subsystem, and have the pointers or metadata stored inside the database. I know FileStream is a compelling option, but I am not sure whether it can be used to server the above purpose. Also, client asked me to look into RBS (Remote Blob Storage) as an option, but nevertheless when I googled it, each, and every article points me towards how that is implemented for Sharepoint, and Content Databases. Now, I am not certain whether RBS can be used to provide a solution for non Sharepoint databases. Since, I haven't had a single article justifying the fact I can't tell the client if that is possible.

    Could you'll please provide me an answer or a mechanism to deal with it. Ya, and I am not sure how the front end would deal with it, but I am more concerned how it would look in the back end, and how I would end up implementing it. Any comments or suggestions is highly appreciated.

    Thanks

    Regards,

    Feivel

    Hmmm... ".WAV" files. Is this for a telephone system by any chance?

    Yes...that's correct. It is for their Avaya systems.

  • ffarouqi (5/5/2016)


    Jeff Moden (5/5/2016)


    ffarouqi (5/2/2016)


    One of my clients have a project requirement, where they need to store the .wav files out of the database onto the storage subsystem, and have the pointers or metadata stored inside the database. I know FileStream is a compelling option, but I am not sure whether it can be used to server the above purpose. Also, client asked me to look into RBS (Remote Blob Storage) as an option, but nevertheless when I googled it, each, and every article points me towards how that is implemented for Sharepoint, and Content Databases. Now, I am not certain whether RBS can be used to provide a solution for non Sharepoint databases. Since, I haven't had a single article justifying the fact I can't tell the client if that is possible.

    Could you'll please provide me an answer or a mechanism to deal with it. Ya, and I am not sure how the front end would deal with it, but I am more concerned how it would look in the back end, and how I would end up implementing it. Any comments or suggestions is highly appreciated.

    Thanks

    Regards,

    Feivel

    Hmmm... ".WAV" files. Is this for a telephone system by any chance?

    Yes...that's correct. It is for their Avaya systems.

    Heh... oh my. And for them to be Avaya systems, as well. Small world. The only thing that would make it even smaller is if they were using PollCat ™ PBX Data Recorders by the WTI company, as well. I wrote a world-wide telephone accounting package for a company called RedSky Tech. American Airlines, Grainger, Cook County Illinois, and Grant Thornton were just some of the customers. We also did what is known as "Enhanced 911" more commonly known simply as "E911". I also used data by CCMI.com, which includes the Telcordia V&H coordinate system, which makes distance calculations as easy as the Pythagorean Theorem thanks to the "Donald Projection" (Please! No Donald Trump jokes!). That was more than a decade ago.

    Shifting gears, WAV files are the result of a "lossless" recording method and, as a result, are quite large. Some companies use them as "perfect masters" to record much smaller MP3 and other compressed audio file formats. We have an IC3 telephone system that generates WAV files and compresses them using a Codex Converter. I have more than 7 years of calls stored, with meta-data, in my "Calls" database.

    When I first arrived at my current company, I found that they had the compressed WAV files stored both in the database and in separate files. The recordings in the database also had the path and file name associated with them. My goal was to remove such "nonsense" as stored WAV files from the database. However, doing due diligence, I used the store path and file name to try to find and confirm that all the files existed. Much to my horror, 10% of the files couldn't be found and another 10% were corrupted and couldn't be played back. An impressive random sampling of the WAV file data in the database showed no such corruption so new lesson learned even though many of the files were quite large.

    Still, 7 years of call recordings is a hell of a lot of storage and full backups were taking more than 10 hours. I developed a partitioning system that allowed me to only have to backup the current month of data dropping the backups down to between a minute (beginning of the month) to just 8 minutes (end of month) and an equally impressive DR restoration method that would allow us to get back in business in about 10 minutes and then take our time doing online piecemeal restores.

    There are a couple of points that I'm trying to make here.

    1. I used to believe and fully agree with the idea of storing only paths and file names to such files. No longer do I believe that. No one takes care of data better than a DBA with a bat. 😛 We haven't lost a single call from the database, ever. The files they came from, however, still occasional disappear.

    2. It's not as bad as you think. Properly partitioned data (and I'll tell you do NOT use real live Enterprise Edition Table Partitioning for this... it sucks for this in more ways than you can imagine) is super easy to maintain (it's JUST data!) and the old partitions can be set to READ_ONLY and backed up only once (twice for a "safety" copy). Indexing maintenance is a snap, backups and restores are lightning quick, and real live "partial" restores for dev and test systems are a snap.

    ... and it can all be done on the Standard Edition using technology from SQL Server version 7! In fact and with little ingenuity required, it can actually be done on SQL Server Express even though I have a 500GB table that the recordings are stored in and SQL Server Express has a 10GB per database limit. Of course, I recommend something a bit beefier than SQL Server Express but the methods are so simple they could be used on Express.

    The Avaya switches are great, especially if you hook them up with a PollCat. I will say that I didn't have the opportunity to capture WAV files from them between 2000 and 2003 when I worked with them but I would imagine they did a pretty good job with those like they did with the rest of the box.

    As a bit of a sidebar, it would be worth it (though not essential) to invest in a good, fast, accurate WAV-to-MP3 converter just to reduce the size of the WAV files.

    And, to be honest, I don't bother with things like FILESTREAM so, again, this could actually be done using something as simple as SQL Server Express (but use the Standard Edition or better... you'll need it for performance of call accounting reports).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/5/2016)


    ffarouqi (5/5/2016)


    Jeff Moden (5/5/2016)


    ffarouqi (5/2/2016)


    One of my clients have a project requirement, where they need to store the .wav files out of the database onto the storage subsystem, and have the pointers or metadata stored inside the database. I know FileStream is a compelling option, but I am not sure whether it can be used to server the above purpose. Also, client asked me to look into RBS (Remote Blob Storage) as an option, but nevertheless when I googled it, each, and every article points me towards how that is implemented for Sharepoint, and Content Databases. Now, I am not certain whether RBS can be used to provide a solution for non Sharepoint databases. Since, I haven't had a single article justifying the fact I can't tell the client if that is possible.

    Could you'll please provide me an answer or a mechanism to deal with it. Ya, and I am not sure how the front end would deal with it, but I am more concerned how it would look in the back end, and how I would end up implementing it. Any comments or suggestions is highly appreciated.

    Thanks

    Regards,

    Feivel

    Hmmm... ".WAV" files. Is this for a telephone system by any chance?

    Yes...that's correct. It is for their Avaya systems.

    Heh... oh my. And for them to be Avaya systems, as well. Small world. The only thing that would make it even smaller is if they were using PollCat ™ PBX Data Recorders by the WTI company, as well. I wrote a world-wide telephone accounting package for a company called RedSky Tech. American Airlines, Grainger, Cook County Illinois, and Grant Thornton were just some of the customers. We also did what is known as "Enhanced 911" more commonly known simply as "E911". I also used data by CCMI.com, which includes the Telcordia V&H coordinate system, which makes distance calculations as easy as the Pythagorean Theorem thanks to the "Donald Projection" (Please! No Donald Trump jokes!). That was more than a decade ago.

    Shifting gears, WAV files are the result of a "lossless" recording method and, as a result, are quite large. Some companies use them as "perfect masters" to record much smaller MP3 and other compressed audio file formats. We have an IC3 telephone system that generates WAV files and compresses them using a Codex Converter. I have more than 7 years of calls stored, with meta-data, in my "Calls" database.

    When I first arrived at my current company, I found that they had the compressed WAV files stored both in the database and in separate files. The recordings in the database also had the path and file name associated with them. My goal was to remove such "nonsense" as stored WAV files from the database. However, doing due diligence, I used the store path and file name to try to find and confirm that all the files existed. Much to my horror, 10% of the files couldn't be found and another 10% were corrupted and couldn't be played back. An impressive random sampling of the WAV file data in the database showed no such corruption so new lesson learned even though many of the files were quite large.

    Still, 7 years of call recordings is a hell of a lot of storage and full backups were taking more than 10 hours. I developed a partitioning system that allowed me to only have to backup the current month of data dropping the backups down to between a minute (beginning of the month) to just 8 minutes (end of month) and an equally impressive DR restoration method that would allow us to get back in business in about 10 minutes and then take our time doing online piecemeal restores.

    There are a couple of points that I'm trying to make here.

    1. I used to believe and fully agree with the idea of storing only paths and file names to such files. No longer do I believe that. No one takes care of data better than a DBA with a bat. 😛 We haven't lost a single call from the database, ever. The files they came from, however, still occasional disappear.

    2. It's not as bad as you think. Properly partitioned data (and I'll tell you do NOT use real live Enterprise Edition Table Partitioning for this... it sucks for this in more ways than you can imagine) is super easy to maintain (it's JUST data!) and the old partitions can be set to READ_ONLY and backed up only once (twice for a "safety" copy). Indexing maintenance is a snap, backups and restores are lightning quick, and real live "partial" restores for dev and test systems are a snap.

    ... and it can all be done on the Standard Edition using technology from SQL Server version 7! In fact and with little ingenuity required, it can actually be done on SQL Server Express even though I have a 500GB table that the recordings are stored in and SQL Server Express has a 10GB per database limit. Of course, I recommend something a bit beefier than SQL Server Express but the methods are so simple they could be used on Express.

    The Avaya switches are great, especially if you hook them up with a PollCat. I will say that I didn't have the opportunity to capture WAV files from them between 2000 and 2003 when I worked with them but I would imagine they did a pretty good job with those like they did with the rest of the box.

    As a bit of a sidebar, it would be worth it (though not essential) to invest in a good, fast, accurate WAV-to-MP3 converter just to reduce the size of the WAV files.

    And, to be honest, I don't bother with things like FILESTREAM so, again, this could actually be done using something as simple as SQL Server Express (but use the Standard Edition or better... you'll need it for performance of call accounting reports).

    Thanks! Jeff. It is definitely very informative, but unfortunately it still keeps me thinking on how the implementation steps look like.

  • ffarouqi (5/6/2016)


    Thanks! Jeff. It is definitely very informative, but unfortunately it still keeps me thinking on how the implementation steps look like.

    Understood. The key here will now be for you and your team to decide where you want to store the files because the "implementation steps" will rely on that bit of knowledge.

    If you don't already have it, you need to get the Avaya documentation on the switch itself (and find out if the have an external buffer or not) because marrying the call meta-data to the WAV file in a reliable fashion starts there. You'll need the documentation on how to setup to export the data and only then will you be able to really nail down an implementation plan.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply