Todd, let me make sure that I understand your senario before I post an answer. Am I right in saying that you have Parent table A which can or does have multiple Child table B records. Each Parent table A record requires that a single file be created to contain...
Are you saying there are around 3 million of these parent records?
I think that the slowdown in your file creation rate "sounds" like mem-to-leave or other memory pressure.
Can you correct or confirm my summation of the issue before we go on?
Zach, you have this correct.
To flesh out the scenario a bit, this is a conversion process from one application to another involving as much as 10-15 years worth of data. In the original application, every time a real-life event happened (which could be tens or hundreds of time per day) a report (or narrative) was written. These narratives vary in from one 70-character line to as many as 1000 70-character lines.
The database in the new application is designed differently, so these old narratives are converted to text documents and then “attached” to the converted event in the new application. This allows users to see all the old information, especially when the old info doesn’t fit well into the new database.
This conversion process will happen when the customer goes live on the new application. They will stop the old system, the data will be converted, and the customer will go live on the new system that now contains their converted data. The text documents don’t have to be generated exactly within that window, but will come very shortly thereafter. So we are attempting to create these documents as quickly as possible.
For the particular customer I am currently helping, they have roughly 3 million events for which the narratives have to be converted to text files.
Any thoughts/suggestions are welcome.
I agree that this sounds like some sort of memory-pressure issue. I'm working to diagnose 1) exactly what the cause is, and 2) how I can fix it/work around it.
Very good summary of methods, but you forgot the oldest one in the book. We routinely use CMD calls to write files, but surprisingly it's just as fast to write using the ECHO command to write any amount of data necessary. Something like:
Declare @cmd varchar(255), Declare @data varchar(1000), Declare @append varchar(2), Declare @path varchar(255), Declare @filename varchar(255), Declare @output varchar(2000)
Set @cmd = 'echo'
Set @data = 'some sort of data'
Set @append = '>>' -- to create a file
Set @append = '>' -- to append a file
Set @path = '\\someservershare\share' or 'c:\localpath\localdir'
Set @filename = 'somefilename'
Set @output = @cmd + ' ' + @data + ' ' + @append + ' ' + @path + @filename
EXEC master..xp_cmdshell @output
There are other techniques to just create an empty file by using the @echo command, and dozens of other things you can do with echo. Plus, if you place the whole thing in params in a Store Procedure you can pass in other commands and do most any DOS command you want. Of course be sure the security is set correctly so it can't be abused. This method outputs about a 1.5MB text file in about 30 seconds.
I'm thining the above are backwards... >> is the append symbol and > is the create symbol in DOS (Cmd Window)...
I just tried the first osql cmd you wrote in SQL 2005 and replaced osql with sqlcmd
osql command runs in 7 seconds
sqlcmd runs in 2 minutes 26
SQL 2005 docs say that sqlcmd is the replacement for osql. Can this be right? Obviously they are structured very differently, but wouldn't you expect that sontinuity of performance might be desirable?
I have discovered the problem with my text file generation degradation.
It all comes back to the file system. As files are being generated and more and more files are generated into one folder, the subsequent file generates get slower. By moving to a new folder, the file generates pick right up.
In my final test scenario I did the following:
Step 1 ran in 25 minutes with the file generation degrading from 16800/min down to 600/min.
Step 2 ran in 5 minutes with no noticeable degradation, averaging about 17000/min.
Step 3 ran in 24 minutes with the file generation degrading from 18000/min down to 350/min.
So I will include multiple subfolders in my file generation process.
I modified my process to include subfolders and ran a partial test. 446,000 documents were generated in 68 minutes, giving an average rate of over 6500 documents generated per minute.
I modified a stored procedure that moved DTS packages from one server to another. It is from http://www.dbazine.com/sql/sql-articles/larsen8
It would stop after moving 255 packages, everytime, no mateter how the packages were sorted.
It would always stop at sp_OAMethod @object, 'LoadFromSQLServer'. Make of it what you will. Smells like a memory leak.