Automating SSIS Design Pattern - Stage Delimited Flat File

  • Comments posted to this topic are about the item Automating SSIS Design Pattern - Stage Delimited Flat File

  • Very interesting but why someone would spend so much time on such trivial tasks? It would be whole lot easier just do it in SSIS. If you want to do that for multiple files and multiple destination tables, you simply duplicate the package and update the connection managers. SSIS is created to release people from coding c# or whatever programming language so why go back? If you really like programming, you can c# your entire ETL project, which is a lot more powerful and far more flexible and most c# programmers can understand the code.

  • ^I agree; this is interesting but I can't see your average developer using this tool since one of the points of SSIS is to get away from C#and XML. Copying and pasting a template package and then adjusting the values 50 times may be monkey work but hey at least you don't have to explain to your boss that your packages don't work because of a non native tool you've decided to try.

  • nzhang6666 - Huh! My attempt to write an elaborate article so even people new to Biml would be follow may have caused you to feel that this is complicated. This is actually quite the opposite. And staging is only the beginning on an ETL project, you can use Biml to manage the entire ETL, not just this task.

    May be I should do a better job communicating the simplicity behind this 🙂

  • TheComedian (7/13/2015)


    ^I agree; this is interesting but I can't see your average developer using this tool since one of the points of SSIS is to get away from C#and XML. Copying and pasting a template package and then adjusting the values 50 times may be monkey work but hey at least you don't have to explain to your boss that your packages don't work because of a non native tool you've decided to try.

    But if I was asked to choose between doing money work and using a new tool that would reduce time and improve quality, I'd always choose the later.

    With this "new tool", all you need to do is to fool-proof the template package. Then all the other packages will exactly be the same, same as the packages created by copying and pasting them. Moreover, if you have to change something, say add an email task, you simply have to add a few lines of code and regenerate the packages. If you were copy-pasting, you'd have to manually do that 50 more times. What if something else changes 6 months down the line?

    Hope I convinced you a little bit now!

  • This tool is no doubt useful and I would definitely include it in my future projects that involve hundreds of similar packages but what I'm saying is that your average developer wouldn't want to go back to XML and C# coding after using SSIS. He will most likely be more comfortable with copying and pasting.

    Another question: Have you had any experience using this tool with Theobald components? These are third party components used to query SAP databases.

  • Thanks for the all the work putting together this article.

  • This article is excellent. After managing hundreds of packages through monthly updates that require metadata changes to support all types of changes to database schema and package code this is nothing short of revolutionary. Finally I can check in code into subversion that can actually be compared! I have played around with generating packages with C# with some success but this seem much more intuitive. Well done!

  • TheComedian - No, I don't have experience with Theobald. If you need help, try posting on Biml forums and I'm sure one of the guys will help you. Link is here: https://www.varigence.com/Forums?forumName=Biml

    Iwas Bornready - Thanks for the comment!

    kgresham - I'm glad you liked the article! Yes, Biml is revolutionary, and you're spot on with source control too!

  • Nice Idea. We use an SQL-Script that scan the folder with the Flatfiles (csv). Every csv-File in this folder will be load with BULK INSERT into the database.

    For this we only add one row in the head with the forced datatypes (DECIMAL, INT, NVARCHAR, etc.).

    The Job looks in the first row with the column-names an load the data from the flatfile in a temp-table.

    In the next step the job looks in the secound row with the data types an build the final table (named like the flatfile).

    Then we copy and convert the data from the temp-table into the final table. After this we erase the temp table.

    So we can easy add columns in the flatfile oder put a new flatfile in the folder. Errors like truncation we catch in the script.

    With this script we don't need to make a flatfile connector.

    Alter load we make some checks for consistecy an content. Errors where reported by mail.

    Greeting from germany

    Klaus

  • I've used BIML for a large scale project before

    ~200+ ETL interfaces which included CSV, XML, Oracle etc...

    My concerns are:

    1. BIML might be free but to use it in anger you will need to install Mist which is not cheap at $4000 per licence plus annual support costs.

    https://varigence.com/Store

    2. There is a very steep learning curve. Ideally you want to be expert in SSIS and SSIS design patterns before using BIML

    3. Debugging is a nightmare. First you have to write a valid BIMLScript which then generates valid BIML which is then transformed into a valid SSIS Package/Project. You can then run the SSIS to check it does what it's supposed to do.

    4. Annoyingly there is the feeling that BIML was invented to fix a problem which shouldn't exist in the first place. SSIS Should be dynamic out of the box.

    I'd say that if you do need a dynamic metadata driven solution then it will be less painful to forget BIML and SSIS and write the solution in pure C#.

  • Samuel Vella (7/14/2015)


    I've used BIML for a large scale project before

    ~200+ ETL interfaces which included CSV, XML, Oracle etc...

    My concerns are:

    1. BIML might be free but to use it in anger you will need to install Mist which is not cheap at $4000 per licence plus annual support costs.

    https://varigence.com/Store

    not sure what "use it in anger" means but I agree about the cost of Mist. I work for a big, very profitable financial institution and it took me months to get a $200 package approved and purchased that is mission-critical to our work. A $4000 request would get me laughed out of the office.

    2. There is a very steep learning curve. Ideally you want to be expert in SSIS and SSIS design patterns before using BIML

    No kidding! Not to mention being moderately proficient in C#.

    3. Debugging is a nightmare. First you have to write a valid BIMLScript which then generates valid BIML which is then transformed into a valid SSIS Package/Project. You can then run the SSIS to check it does what it's supposed to do.

    quite true. Even worse if you have any dynamism in the package (e.g. conditional execution, variables for table names, expressions in tasks, etc.)

    4. Annoyingly there is the feeling that BIML was invented to fix a problem which shouldn't exist in the first place. SSIS Should be dynamic out of the box.

    Well, it is dynamic to a point. Expressions, conditional execution and constraints, variables for table names or sql commands in the OLEDB components, etc.

    I'd say that if you do need a dynamic metadata driven solution then it will be less painful to forget BIML and SSIS and write the solution in pure C#.

    though you'd lose the out-of-the-box multithreading, buffer management, etc.

    Gerald Britton, Pluralsight courses

  • Hi,

    I am curious.

    Are you able to automate the creation of an Excel file that preserves data types?

  • scorchpc 96170 (7/14/2015)


    Hi,

    I am curious.

    Are you able to automate the creation of an Excel file that preserves data types?

    depends what you mean by preserves data types. Can you post some examples of what that would look like? Note that Biml can drive everything that is possible with SSIS but cannot create new transforms or enhance transforms added to the resulting packages.

    Gerald Britton, Pluralsight courses

  • g.britton (7/14/2015)


    scorchpc 96170 (7/14/2015)


    Hi,

    I am curious.

    Are you able to automate the creation of an Excel file that preserves data types?

    depends what you mean by preserves data types. Can you post some examples of what that would look like? Note that Biml can drive everything that is possible with SSIS but cannot create new transforms or enhance transforms added to the resulting packages.

    Verbal example:

    Extract data from SQL to Excel

    Preserve data types such as Number and Dates. Make it so Numbers are treated like Numbers (for example when sorting or doing calculations), rather than Text.

    I have some jobs that do this via the ACE engine, but it does NOT preserve the data types, so a Number is treated like Text.

    Here is a code snippet for the current method:

    insert into OPENROWSET(

    'Microsoft.ACE.OLEDB.12.0',

    'Excel 12.0;Database=s:\temp\tempEngineer.xlsx',

    'SELECT [VarianceID],[Data Source],[Report Month],[Region],[Sub Region],[Sub Sub Region],[Country],[Part Number],[Part Description],[Part Serial Number],[HP Product Number],[Unit],[Product Serial Number],[CT Label Number],[Security Label],[StdCost],[Commodity],[Product Line],[Tower],[Subtower],[Business Unit],[Engineer Name],[DEMAND QTY],[DOA Qty],[DOA Rate],[Case ID],[Subcase ID],[PRU Parts],[DOA Code],[DOA Code Description],[Return Reason Codes],[Return Reason Type Cd from GCSN],[Failure Code],[Failure Code Description],[Quality Code],[Quality Code Description],[Compaq Error Code],[Compaq Error Code Description],[Sales Order],[Case Del Alt],[Order Reason],[Case Title],[Open Date],[Close Date],[Customer Name],[Customer Comment],[Technician Name],[DOA Return Tracking No(RMA)],[Flow Tag Number],[Actual Disposition],[Special Procurement Type Code],[Site Code of the Part],[DOA Record Originator],[Ship To Address],[Ship To City],[Ship To State],[Ship To Country],[SAP Supplying Plant],[Ship Date],[Repair Supplier Name],[Prime Vendor Name],[X Line Number],[Source Location],[X Part Status],[MC_FLAG],[Service_Flag],[Miscoded DOA Qty],[Ext Mat Group]

    FROM [Raw$]')

    select v.VarianceID, v.[Data Source], v.[Report Month], v.[Region], v.[Sub Region], v.[Sub Sub Region], v.[Country],

    v.[Part Number], v.[Part Description], v.[Part Serial Number], v.[HP Product Number], v.[Unit], v.[Product Serial Number],

    v.[CT Label Number], v.[Security Label], v.[StdCost], v.[Commodity], v.[Product Line], v.[Tower], v.[Subtower], v.[Business Unit],

    v.[Engineer Name], v.[DEMAND QTY], v.[DOA Qty], v.[DOA Rate], v.[Case ID], v.[Subcase ID], v.[PRU Parts], v.[DOA Code],

    v.[DOA Code Description], v.[Return Reason Codes], v.[Return Reason Type Cd from GCSN], v.[Failure Code],

    v.[Failure Code Description], v.[Quality Code], v.[Quality Code Description], v.[Compaq Error Code],

    v.[Compaq Error Code Description], v.[Sales Order], v.[Case Del Alt], v.[Order Reason],

    REPLACE(REPLACE(REPLACE(v.[Case Title], CHAR(13), ''), CHAR(10), ''), '|', '') as [Case Title], v.[Open Date], v.[Close Date],

    REPLACE(REPLACE(REPLACE(v.[Customer Name], CHAR(13), ''), CHAR(10), ''), '|', '') as [Customer Name],

    REPLACE(REPLACE(REPLACE(v.[Customer Comment], CHAR(13), ''), CHAR(10), ''), '|', '') as [Customer Comment], v.[Technician Name],

    v.[DOA Return Tracking No(RMA)], v.[Flow Tag Number], v.[Actual Disposition], v.[Special Procurement Type Code],

    v.[Site Code of the Part], v.[DOA Record Originator], v.[Ship To Address], v.[Ship To City], v.[Ship To State], v.[Ship To Country],

    v.[SAP Supplying Plant], v.[Ship Date], v.[Repair Supplier Name], v.[Prime Vendor Name], v.[X Line Number], v.[Source Location],

    v.[X Part Status], v.[MC_FLAG], v.[Service_Flag], v.[Miscoded DOA Qty], v.[Ext Mat Group]

    from datamart.datamart.dbo.variance v

    inner join ##HP_DOA_Top5 h on v.[Part Number] = h.[Part Number]

    where v.[Close Date] >= @StartDate and v.[Close Date] < @EndDate

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply