Data population on SQL filestream enabled table

Question

Post reply

Data population on SQL filestream enabled table

Mia2022

Old Hand

Points: 342
More actions
June 10, 2021 at 11:25 pm
Go to Answer
#3894688

Hi team,
We wanted to do load test using the sql filestream enabled table. It is an audit tablr which captures the request and response API.
We have enabled filestream on request and response fields.
I understand the sql filestream will create the respective files for each record behind the scene.
Now i want to pre-populate the table with around 5 Million records at one stretch, however this might not be the case in prod where the data gets loaded over the period of time.
Consider the request and response as 5 MB.
Constraints i could see,
Time taken to perform insert - high
Any problem with the backup file going to create for that day?
Any other problems ?
How we can do this with minimal impact ?.
Steve Jones - SSC Editor

SSC Guru

Points: 738337
More actions
June 14, 2021 at 7:36 pm
Answer
#3896142

You haven't really explained what a "problem" is to you. You mean the server will run slower? It could with any insert in bulk.
Filestream has tradeoffs, which is what you are load testing. Test the load as you need to with a normal table and inserting large volumes of data and then test with Filestream. Most of the Filestream benefits are when reading. When writing, it really depends on how your file system is structured and the underlying hardware.
If you run a batch of 5mm records, you will get load from writing to the filesystem outside the MDF, as well as log records.
As with any load, if you spread this out over time, the impact on the workload is less than if you try to run a large batch at once.

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Site Owners SSC Guru Points: 80342 More actions · Answer 1

Thanks for posting your issue and hopefully someone will answer soon.

This is an automated bump to increase visibility of your question.

Steve Jones - SSC Editor SSC Guru Points: 738337 More actions · Answer 2

I'm not quite sure what you mean, but with your questions.

The backup file will contain this data inserted into a FILESTREAM table, so it will be larger.
Not sure what you mean by problems. Filestream allows some data to be visible through a file system API, which means that access and xfer can be faster than T-SQL. I believe that with files above 1MB, FS is definitely faster. Below 256KB, it's not, and in between "it depends".
Do what with minimal impact? It's unclear what you mean here.

Mia2022 Old Hand Points: 342 More actions · Answer 3

Ok. To make it clear, if i insert around 5 million records in one stretch for load testing, will it be a problem however it might not be the case in prod where the data gets inserted over the period of time..

**AGL enabled for database..

Any suggestions for data insert in batches in filestream enabled table

This reply was modified 4 years, 1 month ago by Mia2022. Reason: Added details

Mia2022 Old Hand Points: 342 More actions · Answer 4

Thanks.

Yes we are inserting records in batches not bulk insert.

What i understood so far is, we need to focus more on normal table rather on focusing on filestream enabled tables for load testing.

The load testing which i mean is to simulate production like scenario of large volume. In order to do that, i need to insert records i large volume..

AGL enabled for database..

This reply was modified 4 years, 1 month ago by Mia2022.

Steve Jones - SSC Editor SSC Guru Points: 738337 More actions · Answer 5

I'd watch both. Filestream has performed well for many customers. The downside is usually the code changes to take advantage of the feature. If you are storing documents that are on average, >1MB, it should work well. If they are small, it might not be better and not worth the effort.

Run your load test a few different ways to see how things might perform. I'd try to get average times for queries in here, so you better understand what is changing, not just a total workload time. I'd be sure I tested some INSERT times, not just SELECTs.

Data population on SQL filestream enabled table

Cookies on SQLServerCentral