January 16, 2012 at 2:35 am
hi,
will the select become slow if huge number of records inserted through ETL?
thanks,
regards,
ami
January 16, 2012 at 11:19 am
Hi, yes selective queries will certainly suffer as the data grows in an unindexed table. So if you have no indexes yet, you should definitely consider creating some.
For more detailed help, post the table create script here along with the select queries, at least the where conditions.
January 16, 2012 at 11:35 am
That's a little bit of a vague question, can you clarify a little please?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 16, 2012 at 10:05 pm
hi,
I have a fact table which is getting loaded with 1 GB of data every day.
Instead of running the ETL once in a day, there is a proposal to run it for 15 minutes once, to get the latest in place.
the data might be reduced will not be 1GB but I'm not sure about the size of the data for 15 mins.
The same fact table might be queried during the ETL load. What could be the effect on select queries? I'm planning to give nolock hint so that the select will not depend on inserts.
now the question is
1. will the inserts affect the select performance?
2. What could be the other precautions I should make to get better performance from select?
Thanks,
Regards,
Ami
January 17, 2012 at 2:16 am
Anamika (1/16/2012)
hi,I have a fact table which is getting loaded with 1 GB of data every day.
Instead of running the ETL once in a day, there is a proposal to run it for 15 minutes once, to get the latest in place.
the data might be reduced will not be 1GB but I'm not sure about the size of the data for 15 mins.
Well, if the 1GB is spread evenly over the day, that's 10 or 20 MB per run. That should take seconds (but you should test that before you make this change
The same fact table might be queried during the ETL load. What could be the effect on select queries? I'm planning to give nolock hint so that the select will not depend on inserts.
So no locking, just a fair chance of incorrect data due to the nolock hint. Sure that your users are OK with their queries possibly returning duplicate data or missing entire pages of rows if they run them while the insert is in progress, because I don't know many users who want incorrect data fast vs correct data slightly slower.
now the question is
1. will the inserts affect the select performance?
2. What could be the other precautions I should make to get better performance from select?
1. Yes, they probably will, but if the inserts are fast (and with small data volumes they should be), you may notice less overall impact than from the 1GB bulk load
2. Consider one of the snapshot isolation levels if there is unacceptable blocking, consider updating statistics after the inserts, especially if you're often querying for the latest rows.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 17, 2012 at 2:38 am
Hi,
The snapshot isolation sounds very cool. I have never actually seen it in action in live systems. I think it must put an incredible pressure on tempdb as all the temporary objects are created there also and there is only one tempdb per instance. So to avoid being a serious bottleneck, there must be a serious review of the hardware design.
Considering this and assuming that read committed isolation would be a sufficient alternative, is the snapshot isolation really better in practice and what are the considerations before introducing it?
Regards
Istvan
January 17, 2012 at 3:15 am
Brigadur (1/17/2012)
I think it must put an incredible pressure on tempdb as all the temporary objects are created there also and there is only one tempdb per instance.
No. Pressure, yes, but unless TempDB's already under serious load or there's an absolute huge volume of changes it's not that much of an issue
Considering this and assuming that read committed isolation would be a sufficient alternative, is the snapshot isolation really better in practice and what are the considerations before introducing it?
Maybe. As always, test carefully, evaluate tradeoffs.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 17, 2012 at 3:31 am
Thank you Gail :).
Cheers
Istvan
Viewing 8 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply