Instead of having each thread do all the work of fetching the data, you might want to consider using a disconnected recordset (you would probably still want a stage flag on your data) held in storage common to all of your threads.
Then, as each thread works through it, it can mark a record as in use with the stage flag and as complete when done. This way you know which records are being processed by other threads and which have already been processed.
This is just a thought off the top of my head.
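
Here is a rough sketch of the idea in Python, just to show the shape of it. Everything in it is made up for illustration: the stage values (PENDING, IN_USE, DONE), the helper names (load_records, claim_next, worker, run) and the record layout are all assumptions, and a plain list of dicts stands in for the disconnected recordset. The one detail that matters is that checking and setting the stage flag happens under a lock, so two threads cannot claim the same record.

```python
import threading

# Hypothetical stage flag values; the names are illustrative only.
PENDING, IN_USE, DONE = "pending", "in_use", "done"


def load_records(count=20):
    """Stand-in for fetching the data once into a disconnected recordset."""
    return [
        {"id": i, "value": i * 10, "stage": PENDING, "result": None}
        for i in range(count)
    ]


def claim_next(records, lock):
    """Find the next pending record and flag it as in use, atomically."""
    with lock:
        for rec in records:
            if rec["stage"] == PENDING:
                rec["stage"] = IN_USE
                return rec
    return None  # nothing left to claim


def worker(records, lock, processed_ids):
    """Claim records one at a time until none are pending."""
    while True:
        rec = claim_next(records, lock)
        if rec is None:
            return
        # Placeholder for the real per-record work.
        rec["result"] = rec["value"] + 1
        with lock:
            rec["stage"] = DONE
            processed_ids.append(rec["id"])


def run(num_threads=4, count=20):
    """Load the data once, then let all threads share it."""
    records = load_records(count)
    lock = threading.Lock()
    processed_ids = []
    threads = [
        threading.Thread(target=worker, args=(records, lock, processed_ids))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return records, processed_ids
```

Calling run() returns the shared records and the list of ids in the order they were finished. At any point while the threads are running, a look at the stage flags tells you which records are untouched, which are in progress, and which are done.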