Dynamic ETL with SSIS

  • Comments posted to this topic are about the item Dynamic ETL with SSIS

  • I don't know why you'd want to go to such lengths to use SSIS, just use Pervasive (formerly Data Junction). It's a far superior way of getting data into SQL.

  • Martyn Hughes (10/19/2010)


    I don't know why you'd want to go to such lengths to use SSIS, just use Pervasive (formerly Data Junction). It's a far superior way of getting data into SQL.

    I agree with you to a point but there are more elegant ways of doing this in SSIS

    Sarah, you mention that SSIS requires a flat file format to load the data properly into SQL Server... so why not create the bulk load format file on the fly using a script? It's fairly easy to parse the file and determine the number of columns, data types, etc., and then build the format file and the destination table from that information (a rough sketch of the table-building half follows below).
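
    Something along those lines might look like the sketch below (just a sketch: the dbo.FileColumnConfig metadata table, its columns, and the Load_SupplierFile name are made-up stand-ins for whatever your parsing step produces).

    DECLARE @createSql NVARCHAR(MAX),
            @tableName SYSNAME = N'Load_SupplierFile';   -- hypothetical destination table name

    -- Build a CREATE TABLE statement from the column metadata discovered by the parsing step
    SELECT @createSql =
        N'CREATE TABLE dbo.' + QUOTENAME(@tableName) + N' (' +
        STUFF((SELECT N', ' + QUOTENAME(ColumnName)
                      + N' VARCHAR(' + CAST(ColumnWidth AS VARCHAR(10)) + N') NULL'
               FROM dbo.FileColumnConfig               -- hypothetical metadata table
               ORDER BY ColumnPosition
               FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, N'') + N')';

    EXEC sys.sp_executesql @createSql;

    -- The non-XML bcp format file for the bulk load could be generated from the same
    -- metadata and written out alongside the data file.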

  • Great work, if you have the spare time to do so. I think SSIS is designed to take care of the work done in the back end. Storing the file name and other details makes sense, but the rest is really confusing for anyone who has to maintain or troubleshoot the package. I would say simple is better.

  • The assumption that files coming in are of fixed length is a little hard to generalize.

    In that case, the Configuration table values are hard to determine.

    Great work in getting it up and running.
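
    Where the incoming files really are fixed width, a configuration table along the lines sketched below (a purely hypothetical layout, not the article's) is one way to capture the offsets and lengths so they can be looked up at run time.

    CREATE TABLE dbo.FileColumnConfig (             -- hypothetical name and columns
        FileColumnConfigId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        FilePattern        VARCHAR(100) NOT NULL,   -- e.g. 'SupplierA_*.txt'
        ColumnName         VARCHAR(128) NOT NULL,
        ColumnPosition     INT NOT NULL,            -- ordinal position within the row
        StartPosition      INT NOT NULL,            -- 1-based character offset
        ColumnWidth        INT NOT NULL             -- fixed length of the field
    );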

  • Martyn Hughes (10/19/2010)


    I don't know why you'd want to go to such lengths to use SSIS, just use Pervasive (formerly Data Junction). It's a far superior way of getting data into SQL.

    Not everyone can afford third party tools, or get the approval to buy them. Knowing how they could perform this in SSIS is a piece of knowledge that many can benefit from.

    If Pervasive is a superior way, perhaps you would want to mention specifically why rather than just adding a snarky comment.

  • Thanks and nice work, Sarah! But can you update it with a link to the source code? I know all the SQL scripts are in the content, but we need the SSIS files too. Thanks in advance!

  • Nitya (10/19/2010)


    The assumption that files coming in are of fixed length is little hard to generalize.

    In that case the Configuration table values are hard to determine.

    Great work in getting it up and running

    It's not much of a stretch if your suppliers have given you a data dictionary. Many of our data suppliers do the same. The files they give us are always formatted with those fixed field lengths in mind, which we must then TRIM before using (a small sketch of that step follows below). The only difference in our packages is that we handle erroneous row lengths in the package.

    Very interesting way of importing flat files. We also build specific packages for each of our files coming in from different suppliers, and it's a real pain. SSIS can handle a lot of different scenarios, but IMO it's not a simple enough process, as you have mentioned.

    Good article. ... Need to think about the pros and cons of adopting it here. Thanks for sharing!
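
    As a rough illustration of that trimming step (the offsets and column names are made up, and it assumes the raw fixed-width rows have already been landed into a single-column staging table such as the Load_RNFile one mentioned later in this thread):

    SELECT
        RTRIM(SUBSTRING(EverythingElse, 1, 10))  AS SupplierCode,   -- positions 1-10
        RTRIM(SUBSTRING(EverythingElse, 11, 30)) AS SupplierName,   -- positions 11-40
        RTRIM(SUBSTRING(EverythingElse, 41, 8))  AS OrderDate       -- positions 41-48
    FROM dbo.Load_RNFile
    WHERE DATALENGTH(EverythingElse) = 48;   -- reject or redirect rows with an unexpected length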

    -------------------------------------------------------------------------------------------------
    My SQL Server Blog

  • Steve, I've been a great fan of SSC for many years; I've used a number of Redgate tools and LiteSpeed to compress some rather large DBs, especially when using a SAN. I think Sarah made a good contribution to the forum.

    My comment is not meant to be 'snarky'; I'm always troubled when I have to fight within an organization to get useful tools. Pervasive does cost several thousand, but the amount of time saved using appropriate tools, rather than paying FTEs to 'reinvent the wheel', is certainly a better use of company resources.

    Pervasive has data connectors for just about anything, even really old 132-column report-type stuff! (Only useful if you're dealing with old mainframe systems.)

    I've used it to convert Progress to an Oracle DW, Progress to SQL and vice versa, and FoxPro to SQL. On a smaller scale, I used it to convert QuickBooks to SQL.

    SSIS is certainly better than the old DTS but far behind third-party tools.

  • I appreciated your article on SSIS. It is time-consuming to use SSIS in a lot of situations like yours.

    You might be able to eliminate SSIS completely. You can use xp_cmdshell to run operations like FTP directly on the operating system, and you can unzip files the same way. Just build the commands you want.

    Use INSERT ... EXEC to pull the directory contents into a table in your stored procedure.

    Use BULK INSERT to load the data into a database table. The format file option is great for fixed-length records, and it's the fastest way I know to load data. A rough sketch of the whole approach follows below.
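
    A minimal sketch (the paths, file names, and supplier.fmt format file are made up, and xp_cmdshell has to be enabled on the instance):

    -- Capture the directory listing with INSERT ... EXEC
    CREATE TABLE #DirListing (line VARCHAR(1000) NULL);

    INSERT INTO #DirListing (line)
    EXEC master..xp_cmdshell 'dir /b C:\Inbound\*.txt';

    -- FTP and unzip can be driven the same way, straight from the operating system
    EXEC master..xp_cmdshell 'ftp -s:C:\Scripts\get_supplier_files.ftp';
    EXEC master..xp_cmdshell '"C:\Program Files\7-Zip\7z.exe" e C:\Inbound\supplier.zip -oC:\Inbound';

    -- BULK INSERT with a format file handles the fixed-length records
    BULK INSERT dbo.Load_RNFile
    FROM 'C:\Inbound\Supplier_20101019.txt'
    WITH (FORMATFILE = 'C:\Formats\supplier.fmt', TABLOCK);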

  • Martyn Hughes (10/19/2010)


    My comment is not meant to be 'snarky'; I'm always troubled when I have to fight within an organization to get useful tools. Pervasive does cost several thousand, but the amount of time saved using appropriate tools, rather than paying FTEs to 'reinvent the wheel', is certainly a better use of company resources.

    ...

    Thanks, and I would agree with you. There are many tools that are worth buying that will help you do a better job with a good ROI. It's good to hear where other tools might fill a gap.

    My comment was there because it appeared you were knocking this article's way of accomplishing the task without providing any reasoning. Thanks for the clarification.

  • It's good info. Gives me some new ideas on how to handle text files.

    I have always used a User Defined Function to get text files into a table.

    --------------------------------------------------
    -- USAGE:
    --   SELECT line FROM dbo.uftReadfileAsTable('MyPath', 'MyFileName')
    --------------------------------------------------
    USE [master]
    GO

    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO

    CREATE FUNCTION [dbo].[uftReadfileAsTable]
    (
        @path VARCHAR(255),
        @filename VARCHAR(100)
    )
    RETURNS @file TABLE
    (
        [LineNo] INT IDENTITY(1,1),
        line VARCHAR(8000)
    )
    AS
    BEGIN
        DECLARE @objFileSystem INT,
                @objTextStream INT,
                @objErrorObject INT,
                @strErrorMessage VARCHAR(1000),
                @command VARCHAR(1000),
                @HR INT,
                @String VARCHAR(8000),
                @YesOrNo INT

        -- Create the FileSystemObject and open the file for reading
        SELECT @strErrorMessage = 'opening the File System Object'
        EXECUTE @HR = sp_OACreate 'Scripting.FileSystemObject', @objFileSystem OUT

        IF @HR = 0
            SELECT @objErrorObject = @objFileSystem,
                   @strErrorMessage = 'Opening file "' + @path + '\' + @filename + '"',
                   @command = @path + '\' + @filename
        IF @HR = 0
            EXECUTE @HR = sp_OAMethod @objFileSystem, 'OpenTextFile',
                    @objTextStream OUT, @command, 1, false, 0  -- for reading, FormatASCII

        -- Read the file one line at a time until AtEndOfStream
        WHILE @HR = 0
        BEGIN
            IF @HR = 0
                SELECT @objErrorObject = @objTextStream,
                       @strErrorMessage = 'finding out if there is more to read in "' + @filename + '"'
            IF @HR = 0
                EXECUTE @HR = sp_OAGetProperty @objTextStream, 'AtEndOfStream', @YesOrNo OUTPUT
            IF @YesOrNo <> 0 BREAK

            IF @HR = 0
                SELECT @objErrorObject = @objTextStream,
                       @strErrorMessage = 'reading from the output file "' + @filename + '"'
            IF @HR = 0
                EXECUTE @HR = sp_OAMethod @objTextStream, 'ReadLine', @String OUTPUT

            INSERT INTO @file (line) SELECT @String
        END

        IF @HR = 0
            SELECT @objErrorObject = @objTextStream,
                   @strErrorMessage = 'closing the output file "' + @filename + '"'
        IF @HR = 0
            EXECUTE @HR = sp_OAMethod @objTextStream, 'Close'

        -- Report any OLE Automation error as a row in the result set
        IF @HR <> 0
        BEGIN
            DECLARE @source VARCHAR(255),
                    @Description VARCHAR(255),
                    @Helpfile VARCHAR(255),
                    @HelpID INT

            EXECUTE sp_OAGetErrorInfo @objErrorObject,
                    @source OUTPUT, @Description OUTPUT, @Helpfile OUTPUT, @HelpID OUTPUT
            SELECT @strErrorMessage = 'Error whilst '
                   + COALESCE(@strErrorMessage, 'doing something')
                   + ', ' + COALESCE(@Description, '')
            INSERT INTO @file (line) SELECT @strErrorMessage
        END

        EXECUTE sp_OADestroy @objTextStream

        RETURN
    END

    Your method works well also. Thanks.

    http://www.simple-talk.com/sql/t-sql-programming/reading-and-writing-files-in-sql-server-using-t-sql/

  • Thanks - I needed something like this.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Hi Sarah

    While testing, I found that dbo.[Load_RNFile] is not defined anywhere in your script and I'm a bit lost (or maybe I'm totally wrong). Could you please review and provide the schema of dbo.[Load_RNFile]?

    Thanks

  • The format for the table would look like this:

    CREATE TABLE [dbo].[Load_RNFile](
        [Id] [int] IDENTITY(1,1) NOT NULL,
        [EverythingElse] [varchar](500) NULL,
        CONSTRAINT [PK_Load_RN] PRIMARY KEY CLUSTERED
        (
            [Id] ASC
        )
    ) ON [PRIMARY]
    GO
