This is my first time here, so I hope I am posting this in the right place. I recently started a new job that involves SQL scripting and loading tables into a database using SSIS within SSDT.
Is there a way to load multiple CSV files into a database such that a new table is created for every CSV file loaded? I hope the following scenario paints a clearer picture.
Every month or so I receive a large number of Excel files (anywhere from 40 to 100) from the client. I convert these Excel files into CSVs after removing some unwanted header rows.
The files fall into different groups, where each group contains information for a different department or business. The structure of the files within a group is the same but varies from group to group, i.e. all Group1 files share one structure, and so on. The files are named accordingly, e.g. Filename1_Group1.csv.
Once I have the CSVs ready, the following are the steps I perform using SSIS in SSDT. (Note that I repeat these for every single file; it has become muscle memory by now, but it is extremely time-consuming!)
Step 1: I add a single Data Flow Task and click on the Data Flow tab in SSDT.
Step 2: I then drag and drop a "Flat File Source" and an "OLE DB Destination" onto the canvas. I connect them by dragging the blue arrow from the Flat File Source to the OLE DB Destination. I name the destination "Filename1_Group1".
Step 3: I then double-click on the Flat File Source, add Filename1_Group1.csv, and change the Text Qualifier to ". Next, I go to the Advanced page of the Flat File Connection Manager Editor, change the output column width to 800 for all the columns, and click "OK".
Step 4: In the Advanced Editor for Flat File Source window, I click on the Component Properties tab, enter "OrgFilename" in the FileNameColumnName property, and click OK.
Step 5: I open the OLE DB Destination Editor, click New, and select the relevant database (the database remains the same for all the files). I then select "Table or view - fast load" from the Data access mode drop-down and click New, which opens the Create Table window.
Step 6: In the Create Table window I add an identifier column at the beginning of the generated query by writing Id_num int IDENTITY(1,1) and click OK.
Step 7: I click on the Mappings page to check that the columns are mapped correctly, then click OK.
Step 8: I click Start, which executes the package and loads Filename1_Group1.csv into the specified database. This creates a new table named Filename1_Group1 in the database.
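To make the repetition concrete: conceptually, Steps 2 through 8 boil down to "read the CSV, tag each row with its source filename, and write it to a new table named after the file". Here is a rough Python sketch of that single-file load (hypothetical, not SSIS; it assumes pandas and SQLAlchemy are installed, and uses SQLite in place of the real SQL Server database purely so the snippet is self-contained):

```python
from pathlib import Path

import pandas as pd
from sqlalchemy import create_engine


def load_csv(csv_path, engine):
    """Load one CSV into a new table named after the file (Steps 2-8 in one call)."""
    path = Path(csv_path)
    # Read every column as text, mirroring the wide string columns from Step 3.
    df = pd.read_csv(path, dtype=str, quotechar='"')
    # Record the source file, like the OrgFilename column added in Step 4.
    df["OrgFilename"] = path.name
    # to_sql creates the table; the written index plays the role of the
    # Id_num surrogate key from Step 6 (a true IDENTITY needs SQL Server).
    df.to_sql(path.stem, engine, if_exists="fail",
              index=True, index_label="Id_num")


# For the real database the engine would be something like
# create_engine("mssql+pyodbc://...") instead of SQLite.
engine = create_engine("sqlite://")
```

With SQL Server the same call works once the engine points at the actual database; the table name comes from the file name, just like naming the destination after the file.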
What I have described here is just for one file. I have to do this for a very large number of files and, as you can see, that is simply not feasible.
Is there a way to improve this process so that the data flow performs these steps for every file in a specific folder? Since different groups have different structures, would this still be possible? Even performing these tasks in batches would save me a lot of time and effort.
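For reference, the batch version I am after would look something like this outside SSIS: a Python sketch (hypothetical; assumes pandas and SQLAlchemy, with SQLite standing in for the SQL Server database so the snippet runs on its own) that loops over every CSV in a folder and creates one table per file. Because each file's columns are inferred independently, the differing group structures should not be a problem:

```python
from pathlib import Path

import pandas as pd
from sqlalchemy import create_engine


def load_folder(folder, engine):
    """Create one table per CSV in `folder`; return the table names created."""
    loaded = []
    for path in sorted(Path(folder).glob("*.csv")):
        # Columns are inferred per file, so Group1 and Group2 layouts can differ.
        df = pd.read_csv(path, dtype=str, quotechar='"')
        df["OrgFilename"] = path.name  # keep track of the source file
        df.to_sql(path.stem, engine, if_exists="replace",
                  index=True, index_label="Id_num")
        loaded.append(path.stem)
    return loaded


engine = create_engine("sqlite://")  # stand-in for the real SQL Server engine
```

Each run replaces the per-file tables, so a monthly delivery can be reloaded by pointing the function at the new folder.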
Thanks in advance!