September 13, 2011 at 1:32 am
Hi all
First of all, I'm fairly new to Integration Services, so any advice you can give me is highly appreciated.
Here's the situation:
I'm doing an internship at a company as part of my Bachelor's degree. I'm going to be put on a major (for me, anyway) project pretty soon, but until then, I have this lovely little assignment. I have a folder containing three groups of files. First are the .ses files, which we wish to preserve. Those are no problem, as they were easy to filter out with a precedence constraint inside my Foreach loop. Next up are the files with no extension at all, which Windows simply lists with the type "File". You know the ones: enable the option to show extensions, remove the extension, and all you are left with is a file (something.txt, a text file, ends up being something, a file). Anyway, we wish to preserve those files as well, and that is what's bugging me. I cannot, for the life of me, find a guide on how to avoid deleting those files too. I've tried precedence constraints like RIGHT(@[User::strFilePathWithName], LEN(@[User::strFileType])) != @[User::strFileType], where strFileType has been ".", "" and " " in turn, but no matter what, it still takes everything. It even takes the .ses files that I could otherwise filter out.
The third and final group in the folder consists of temporary files. "Then why not enumerate on them?" I've tried that. But their extensions run from .__1 to .5df or so, and potentially all the way up to .xxx. Since the extensions differ, I have no way of enumerating on them (at least I don't have one; that's not to say that you guys don't). These temp files are the ones we wish to delete, if they are older than 28 days.
I hope I've explained my situation well enough. It's not an important assignment, so there's no rush, as I was given it to have something to do while we wait for the project. But if you guys have any idea how to do this, I'll gladly listen. Just make sure to take it in baby steps, okay?
If I missed any information that you need, let me know.
- Henrik
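To make the three groups concrete, here is a minimal sketch in Python (not SSIS expression syntax) of the rule being asked for: keep .ses files, keep files with no extension, and treat everything else as a temp file. The helper name `classify` and the sample file names are hypothetical. The point is only that an extensionless file is easier to detect by splitting the name at its last dot than by comparing the right-most characters against a fixed string.

```python
import os


def classify(filename, keep_extensions=(".ses",)):
    """Return 'keep' or 'temp' for a bare file name (hypothetical helper)."""
    # splitext splits at the last dot: "something" -> ("something", "")
    _, ext = os.path.splitext(filename)
    if ext in ("", "."):
        # No extension at all (or only a trailing dot): preserve.
        return "keep"
    if ext.lower() in keep_extensions:
        # Known extension we want to preserve, compared case-insensitively.
        return "keep"
    # Anything else (.__1, .5df, ...) is assumed to be a temp file.
    return "temp"


if __name__ == "__main__":
    for name in ["something", "session.ses", "abc.__1", "abc.5df"]:
        print(name, "->", classify(name))
```

The same test could presumably be expressed inside a Script task, where a path helper can return the extension as an empty string for such files.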
September 13, 2011 at 3:00 am
Those files with no extensions are a pain.
I'd be tempted to do this processing with a Windows batch file - here is one way:
1) Rename files with no extension (this is possible - try rename *. *.keep, for example)
2) Move files to be retained to another folder (use robocopy)
3) Delete everything that remains (delete...)
4) Move the files back (robocopy again)
5) Rename your renamed files back to how they were (if required).
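The five steps above can be sketched as follows. This is an illustration in Python rather than an actual batch file, and it makes several assumptions: the function name `purge_keeping` is made up, the holding folder is a fresh temporary directory, only .ses and extensionless files are retained, no file in the folder already ends in .keep, and subfolders are left alone. As in the outline, step 3 deletes everything that remains regardless of age.

```python
import os
import shutil
import tempfile


def purge_keeping(folder, keep_ext=".ses", marker=".keep"):
    """Delete every file in `folder` except *.ses and extensionless files."""

    def files():
        # Plain files only; subfolders are ignored.
        return [n for n in os.listdir(folder)
                if os.path.isfile(os.path.join(folder, n))]

    # 1) Tag extensionless files with a marker extension.
    for name in files():
        if os.path.splitext(name)[1] == "":
            os.rename(os.path.join(folder, name),
                      os.path.join(folder, name + marker))

    # 2) Move the files to retain into a holding folder.
    holding = tempfile.mkdtemp()
    for name in files():
        if os.path.splitext(name)[1].lower() in (keep_ext, marker):
            shutil.move(os.path.join(folder, name),
                        os.path.join(holding, name))

    # 3) Delete whatever is left.
    for name in files():
        os.remove(os.path.join(folder, name))

    # 4) Move the retained files back and drop the holding folder.
    for name in os.listdir(holding):
        shutil.move(os.path.join(holding, name),
                    os.path.join(folder, name))
    os.rmdir(holding)

    # 5) Strip the marker extension again.
    for name in files():
        base, ext = os.path.splitext(name)
        if ext == marker:
            os.rename(os.path.join(folder, name),
                      os.path.join(folder, base))
```

Trying it on a copy of the folder first is the safer route, since step 3 is not reversible.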
September 13, 2011 at 3:14 am
Hi Phil
I'd love to try that. I found a guide on how to create batch files, but one thing still eludes me. How do I ensure that only the files I want get an additional extension, namely the .ses files and the files without any? As far as I can tell, your first rename command would rename ALL the files, including the temp files, and that leaves me right back where I started...
Got any advice for that one?
Henrik
September 13, 2011 at 3:19 am
Never mind, Phil. I found a small guide and it actually said what you said - I just didn't see the space between *. and *.keep.
I'll try and see if I can get this to work. Thanks again.
Henrik
September 13, 2011 at 3:31 am
No problem. Let us know how you get on and post back with any additional questions.
September 13, 2011 at 4:51 am
Hi Phil
I think I got it now. Granted, what I've made is only a small test on my own laptop, and I still need to build the package on the right server, which already has most of the things needed, but at least I'll have a model to build it from.
What I did (am going to?) was (will be?) to first execute a batch file that renames (*.) to (*.keep). Then, in a Foreach loop, I loop over all files in the folder, putting the file name into a variable. In a Script task I find the file and determine its age; if it's older than 28 days (not used in my laptop example right now, but I know that part works from previous tests), it is allowed to proceed down the flow. My precedence constraints now ensure that a file with a .ses or .keep extension won't be processed (possibly .bat too, as I can't remember whether there are subfolders, so I still need to see exactly where my batch files will be located). If it is not one of those files, it will be deleted. I'll start by copying them first, though, to make sure I've got everything in order. Once the Foreach loop is done, another Execute Process task ensures that all (*.keep) files are renamed back to (*.), just to be sure.
I think it's what you suggested, but I'm not entirely sure. In any case, it seems to be working in my own tests. Once it's done, though, I'll have to take a look at making the package configurable, as it would be nice if/when the locations are altered.
Once again, many many thanks. I've never used batch-files before (that I can remember, anyway), so this solution would never have occurred to me.
September 13, 2011 at 5:11 am
Sounds slightly more complicated than you suggested in your first post. But, as this is purely a file-handling problem - and not a data import/export problem, I don't think you need SSIS at all, to be honest.
In particular, the Foreach loop is unlikely to be the optimum solution - purely on the grounds that it will process one file at a time. My idea is to do any rename/delete/move operations on sets of files to get things done quicker.
If you would rather use SSIS than Windows batch files, I understand. But a single Script task will be all that you need to do all of the file handling.
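As a rough illustration of doing all the file handling in one place, here is a sketch in Python of the logic such a task might contain. A real SSIS Script task would be written in C# or VB.NET, so this shows the approach only. Nothing is renamed: extensionless and .ses files are skipped, and other files are deleted once they are older than 28 days. The function name `delete_old_temp_files` is hypothetical, and measuring age by last-modified time is an assumption, since the thread does not say which timestamp is used.

```python
import os
import time

SECONDS_PER_DAY = 86400


def delete_old_temp_files(folder, max_age_days=28, keep_ext=".ses", now=None):
    """Delete old temp files in `folder`; return the names that were removed."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY

    # Read the directory listing up front so that deleting does not
    # interfere with the iteration.
    with os.scandir(folder) as it:
        entries = list(it)

    deleted = []
    for entry in entries:
        if not entry.is_file():
            continue  # leave subfolders alone
        ext = os.path.splitext(entry.name)[1].lower()
        if ext == "" or ext == keep_ext:
            continue  # extensionless and .ses files are preserved
        # Assumption: "older than 28 days" means last modified before cutoff.
        if entry.stat().st_mtime < cutoff:
            os.remove(entry.path)
            deleted.append(entry.name)
    return sorted(deleted)
```

Returning the list of deleted names makes a dry run easy: replace the `os.remove` call with a copy, as planned in the post above, and compare the list against what was expected.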
September 13, 2011 at 5:39 am
I might not have been entirely clear in that last one.
The two batch files operate outside of the Foreach loop - one before and one after. Inside the Foreach loop, it is determined whether the file the loop is looking at should be deleted or not. Once the loop is through with all the files, the second batch file renames them back.
Now, I'm sure there are plenty of ways to do it, but they asked that it be done as an SSIS package, as there are multiple folders and even a few temporary tables in a database that need to be removed. And if nothing else, I can use the practice.
Henrik