Alternative to filestream BLOB storage in Azure PaaS SQL?

  • (first post in a squillion years)

    I'm assisting with a project to migrate a solution to PaaS SQL DB and have just discovered that the current system is maintaining documents as BLOBs in order to retain their formatting etc., as their formatted content is used in reporting.

    Aside from storing the documents in SQL Server as HTML I can't come up with any way to provide this capability if the DB is implemented as PaaS. I wondered if you could confirm that I'm dead right or whether anyone has found another solution?

    Thanks

  • MissTippsInOz - Tuesday, June 12, 2018 4:06 PM

    (first post in a squillion years)

    I'm assisting with a project to migrate a solution to PaaS SQL DB and have just discovered that the current system is maintaining documents as BLOBs in order to retain their formatting etc., as their formatted content is used in reporting.

    Aside from storing the documents in SQL Server as HTML I can't come up with any way to provide this capability if the DB is implemented as PaaS. I wondered if you could confirm that I'm dead right or whether anyone has found another solution?

    Thanks

    Another option may be to use Azure Data Warehouse with Azure Blob Storage or Azure Data Lake Store. The purpose of Azure Data Warehouse is to leverage Polybase with your documents and or other unstructured data. Polybase allows you to create EXTERNAL TABLES on top of those documents where you can define a table/schema on top of them without having to use say, FILESTREAM with MAX datatypes. It also helps keep the documents OUT OF THE DATABASE where you can call them as needed.

    Azure SQL DB can take advantage of this through Elastic Queries. You can create an EXTERNAL TABLE that can be queried using elastic query into Azure SQL DB. Essentially, separating your hot and cold data. Use Azure Data Warehouse to store all the data, computate it and provide a window to your unstructured BLOB storage and then source your data marts used for reporting.

    While this is adding yet another service or few services to the stack here. You can just pause the warehouse when not in use. Blob storage also provides another accessible way to land that data with a publically documented API to boot. 

    I personally use both Blob storage and Azure Data Lake Store with Azure Data Warehouse. This allows me to store all my documents outside the DB where I can call to them as needed. As I do this for reporting, both BLOB and Data Lake is still accessible to PowerBI reporting until I can get to the data later on in the cycle. Saves me a lot of time and gives me flexibility to focus on other things until I can ETL it into the DB.

  • There's nothing that keeps your db synced with files.You'll need a process that can verify what's out of sync.

    I'd look at the suggestions above, especially Data Lake.

  • xsevensinzx - Tuesday, June 12, 2018 7:39 PM

    MissTippsInOz - Tuesday, June 12, 2018 4:06 PM

    (first post in a squillion years)

    I'm assisting with a project to migrate a solution to PaaS SQL DB and have just discovered that the current system is maintaining documents as BLOBs in order to retain their formatting etc., as their formatted content is used in reporting.

    Aside from storing the documents in SQL Server as HTML I can't come up with any way to provide this capability if the DB is implemented as PaaS. I wondered if you could confirm that I'm dead right or whether anyone has found another solution?

    Thanks

    Another option may be to use Azure Data Warehouse with Azure Blob Storage or Azure Data Lake Store. The purpose of Azure Data Warehouse is to leverage Polybase with your documents and or other unstructured data. Polybase allows you to create EXTERNAL TABLES on top of those documents where you can define a table/schema on top of them without having to use say, FILESTREAM with MAX datatypes. It also helps keep the documents OUT OF THE DATABASE where you can call them as needed.

    Azure SQL DB can take advantage of this through Elastic Queries. You can create an EXTERNAL TABLE that can be queried using elastic query into Azure SQL DB. Essentially, separating your hot and cold data. Use Azure Data Warehouse to store all the data, computate it and provide a window to your unstructured BLOB storage and then source your data marts used for reporting.

    While this is adding yet another service or few services to the stack here. You can just pause the warehouse when not in use. Blob storage also provides another accessible way to land that data with a publically documented API to boot. 

    I personally use both Blob storage and Azure Data Lake Store with Azure Data Warehouse. This allows me to store all my documents outside the DB where I can call to them as needed. As I do this for reporting, both BLOB and Data Lake is still accessible to PowerBI reporting until I can get to the data later on in the cycle. Saves me a lot of time and gives me flexibility to focus on other things until I can ETL it into the DB.

    Sounds like an article or series of articles for those that would like to learn more and could use guidance.

  • Lynn Pettis - Friday, June 15, 2018 10:37 AM

    xsevensinzx - Tuesday, June 12, 2018 7:39 PM

    MissTippsInOz - Tuesday, June 12, 2018 4:06 PM

    (first post in a squillion years)

    I'm assisting with a project to migrate a solution to PaaS SQL DB and have just discovered that the current system is maintaining documents as BLOBs in order to retain their formatting etc., as their formatted content is used in reporting.

    Aside from storing the documents in SQL Server as HTML I can't come up with any way to provide this capability if the DB is implemented as PaaS. I wondered if you could confirm that I'm dead right or whether anyone has found another solution?

    Thanks

    Another option may be to use Azure Data Warehouse with Azure Blob Storage or Azure Data Lake Store. The purpose of Azure Data Warehouse is to leverage Polybase with your documents and or other unstructured data. Polybase allows you to create EXTERNAL TABLES on top of those documents where you can define a table/schema on top of them without having to use say, FILESTREAM with MAX datatypes. It also helps keep the documents OUT OF THE DATABASE where you can call them as needed.

    Azure SQL DB can take advantage of this through Elastic Queries. You can create an EXTERNAL TABLE that can be queried using elastic query into Azure SQL DB. Essentially, separating your hot and cold data. Use Azure Data Warehouse to store all the data, computate it and provide a window to your unstructured BLOB storage and then source your data marts used for reporting.

    While this is adding yet another service or few services to the stack here. You can just pause the warehouse when not in use. Blob storage also provides another accessible way to land that data with a publically documented API to boot. 

    I personally use both Blob storage and Azure Data Lake Store with Azure Data Warehouse. This allows me to store all my documents outside the DB where I can call to them as needed. As I do this for reporting, both BLOB and Data Lake is still accessible to PowerBI reporting until I can get to the data later on in the cycle. Saves me a lot of time and gives me flexibility to focus on other things until I can ETL it into the DB.

    Sounds like an article or series of articles for those that would like to learn more and could use guidance.

    You hinting that to me? Thought that was already covered here?

  • It's not really covered, or not covered like that.

  • xsevensinzx - Saturday, June 16, 2018 6:42 AM

    Lynn Pettis - Friday, June 15, 2018 10:37 AM

    Sounds like an article or series of articles for those that would like to learn more and could use guidance.

    You hinting that to me? Thought that was already covered here?

    Even if it has been having another persons perspective is always welcome.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply