SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Extract data from CSV file - SSIS


Extract data from CSV file - SSIS

Author
Message
var05
var05
SSC-Enthusiastic
SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)

Group: General Forum Members
Points: 183 Visits: 437
Hi all,

I have a csv file with data from 2005...and on daily basis the data is getting appended in the csv file.. Im able to use a conditional split to include data for current 2 years and exclude the rest...


But I just wanted to know if something can be done in the script task(control flow) itself, so that the flat file connection manager itself picks up the data for the current year only. I don wanna pass all the rows to the conditional split and exclude the rows...any help on this?

Thanks
Phil Parkin
Phil Parkin
SSCoach
SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)

Group: General Forum Members
Points: 18714 Visits: 20451
Why not archive the file (or the very old data in the file) rather than letting it grow?


Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.

Please surround any code or links you post with the appropriate IFCode formatting tags. It helps readability a lot.
var05
var05
SSC-Enthusiastic
SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)

Group: General Forum Members
Points: 183 Visits: 437
archiving the file is not in my scope...the file is being generated by the third party vendor on a daily basis...im just accessing their back up csv files to feed data for my database.

Is there by anyways to read the data from CSV file for current year or last 30 days or so...instead reading everything and using the split?

thanks
Orlando Colamatteo
Orlando Colamatteo
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15035 Visits: 14396
How are your C# or VB.net skills? I am thinking pre-processing the file might be one way. You could use a Script Task to read the file in reverse order until you found data that has already been imported. Depending on how much data you expect and how much memory you have available you could store the data as you reversed through the file and then write it in one shot when you found old data, or, you could not store the data as you reversed and just changed course to read the file in forward-order to the end when you found old data, writing a new file to disk as you went. Either way you would end up with a file that you could import straightaway using a Data Flow Task. It would be tedious but not terribly difficult if you have some .NET skills.

See the second example in this article to see how to read a file in reverse order. And the first example shows you how to write a file byte by byte. It is for .NET 4.0 but these techniques will work in 2.0, it's just that the 2.0 article does not have the nice example to show you how to read a file in reverse using Seek.
http://msdn.microsoft.com/en-us/library/system.io.filestream.seek(v=vs.100).aspx

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Phil Parkin
Phil Parkin
SSCoach
SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)

Group: General Forum Members
Points: 18714 Visits: 20451
opc.three (4/8/2013)
How are your C# or VB.net skills? I am thinking pre-processing the file might be one way. You could use a Script Task to read the file in reverse order until you found data that has already been imported. Depending on how much data you expect and how much memory you have available you could store the data as you reversed through the file and then write it in one shot when you found old data, or, you could not store the data as you reversed and just changed course to read the file in forward-order to the end when you found old data, writing a new file to disk as you went. Either way you would end up with a file that you could import straightaway using a Data Flow Task. It would be tedious but not terribly difficult if you have some .NET skills.

See the second example in this article to see how to read a file in reverse order. And the first example shows you how to write a file byte by byte. It is for .NET 4.0 but these techniques will work in 2.0, it's just that the 2.0 article does not have the nice example to show you how to read a file in reverse using Seek.
http://msdn.microsoft.com/en-us/library/system.io.filestream.seek(v=vs.100).aspx


Neat idea. Maybe using a Script Component source would mean that this could be done in one hit in a data flow (though I'm not sure how you would gracefully 'escape' from the data flow when the old records are reached).


Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.

Please surround any code or links you post with the appropriate IFCode formatting tags. It helps readability a lot.
Orlando Colamatteo
Orlando Colamatteo
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15035 Visits: 14396
Phil Parkin (4/9/2013)
opc.three (4/8/2013)
How are your C# or VB.net skills? I am thinking pre-processing the file might be one way. You could use a Script Task to read the file in reverse order until you found data that has already been imported. Depending on how much data you expect and how much memory you have available you could store the data as you reversed through the file and then write it in one shot when you found old data, or, you could not store the data as you reversed and just changed course to read the file in forward-order to the end when you found old data, writing a new file to disk as you went. Either way you would end up with a file that you could import straightaway using a Data Flow Task. It would be tedious but not terribly difficult if you have some .NET skills.

See the second example in this article to see how to read a file in reverse order. And the first example shows you how to write a file byte by byte. It is for .NET 4.0 but these techniques will work in 2.0, it's just that the 2.0 article does not have the nice example to show you how to read a file in reverse using Seek.
http://msdn.microsoft.com/en-us/library/system.io.filestream.seek(v=vs.100).aspx


Neat idea. Maybe using a Script Component source would mean that this could be done in one hit in a data flow (though I'm not sure how you would gracefully 'escape' from the data flow when the old records are reached).

Good thought. I think a Script Component setup as a Source could keep it all in one Data Flow, which might avoid an intermediate file. The only tradeoff I can think of would be having to manually parse the row in the .NET or maybe a Derived Column Transform if pushing a single-column row down. I have not used a Script Component Source in a long time but I think after we pushed all the new data onto the pipeline as new rows and found the first old data then we would just set "DTS task = Success" and the Data Flow would happily move on.

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Gosta Munktell
Gosta Munktell
Mr or Mrs. 500
Mr or Mrs. 500 (519 reputation)Mr or Mrs. 500 (519 reputation)Mr or Mrs. 500 (519 reputation)Mr or Mrs. 500 (519 reputation)Mr or Mrs. 500 (519 reputation)Mr or Mrs. 500 (519 reputation)Mr or Mrs. 500 (519 reputation)Mr or Mrs. 500 (519 reputation)

Group: General Forum Members
Points: 519 Visits: 2161
You don't tell us how big the file is.
You might not be able to store it in the internal memory and read it reversed?
Anyhow you can read the file per line and write it to another file.
There must be something in the file which tells the date or year of a record.
If the file is sorted you can use a pretty simple and fast method like:


Imports System
Imports System.IO

Public Class testa
Public Shared Sub test()

Dim line As String
Dim reader As StreamReader = New StreamReader("C:\temp\input.txt")
Dim writer As StreamWriter = New StreamWriter("C:\temp\output.txt")

line = "start"

' Loop over each line in file, While line is Not Nothing.
Do While (Not line Is Nothing)
line = reader.ReadLine
If Mid(line, 1, 9) <= "something" Then ' Adjust
Else
writer.WriteLine(line)
End If

Loop

writer.Close()
reader.Close()

End Sub
End Class

Thanks
Gosta
Orlando Colamatteo
Orlando Colamatteo
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15035 Visits: 14396
Gosta Munktell (4/10/2013)
You don't tell us how big the file is.
You might not be able to store it in the internal memory and read it reversed?

You do not need to store it as you read through in in reverse, or you could, both options were mentioned above and size of file and available memory would likely dictate the choice.

Anyhow you can read the file per line and write it to another file.
There must be something in the file which tells the date or year of a record.
If the file is sorted you can use a pretty simple and fast method like:


Imports System
Imports System.IO

Public Class testa
Public Shared Sub test()

Dim line As String
Dim reader As StreamReader = New StreamReader("C:\temp\input.txt")
Dim writer As StreamWriter = New StreamWriter("C:\temp\output.txt")

line = "start"

' Loop over each line in file, While line is Not Nothing.
Do While (Not line Is Nothing)
line = reader.ReadLine
If Mid(line, 1, 9) <= "something" Then ' Adjust
Else
writer.WriteLine(line)
End If

Loop

writer.Close()
reader.Close()

End Sub
End Class

Thanks
Gosta

What do you think using a Flat-File Manager to read the file and then applying the Conditional Split in the Data Flow accomplishes? Whistling

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
var05
var05
SSC-Enthusiastic
SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)SSC-Enthusiastic (183 reputation)

Group: General Forum Members
Points: 183 Visits: 437
Hi All,

Thanks for the replies.

As you say to keep it simple I use conditional split to filter the records from falt file and load it. But the proble is the flat file reads all the records.. I wanted to restrict the flat file to read only the reqd records and not the entire one..Probably I got to implete some logic within the script component to read olny the reqd data from flat file.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search