SSIS and BIML

  • I'm looking at a way to import many tables (about 350), from each of many sources (about 300), so 105,000 individual loads nightly into a staging database. 

    I currently use a home-grown framework, but am looking to use SSIS. BIML may be the solution to address this, but I have some concerns about scalability and volume.

    I'm just getting started looking at this have some questions. Apologies if they are very basic. I'm looking at the Stairway to BIML, but I'm still on the basic stuff. 

    1) I see where I can extract all tables from a database, but I want only a selection of tables (the original database has 7000+ tables). The table list would be stored in a database, so BIML would either need to read from a SQL database or I'd have to figure out a hack to restrict the tables. 

    2) I would need 2 types of extraction patterns: The easy one is the initial load, where all data is extracts. the harder one is based on changes logged in an audit table. Rows would be extracted based on an input parameter of a CRUD date (which is stored in the audit tables). Is that kind of pattern possible within BIML. Templating could be done if the table names can be fed into the process. 

    3) Loading the data into staging database is most convenient if the source database can be written as a "source int" in the staging table. That way I only have 350 tables to deal with instead of 105,000. But I would be able to still determine the source based on the "source int" that is obtained from a cross-reference table. I don't know yet how much flexibility have with BIML Express and passing parameters or looping through multiple connection strings for pulling the same table data from different sources. 

    Also, any suggestions before I embark on this would be helpful.
    Thanks

  • BIML can definitely help with everything you've listed here. To give you an idea, I currently use an Excel document (source-to-target mapping) as source for the tables/attributes I want to extract and import into my staging database. I have also extended it to contain filters and flags that control whether a filter is implemented in the source query and whether the package generation is included when the BIML script is executed. 

    My BIML script reads from this Excel document and performs the necessary logic to do what it needs to do. There's no reason why you wouldn't be able to read from a table to do the same. I can see why you would want to include a "source" attribute to reduce the number of packages. I typically like to keep the packages as granular as possible for control, and then have a script that aggregates the data (and then including a "source" attribute) into a consolidated table. I can see how you wouldn't want to necessarily do that with so many sources, but it's worth considering before embarking on the exercise.

    Hope this helps.

  • i do the same as what Martin does, but I am using extended properties for my source of tables instead of something external to the database, 
    the advantage is, when i need to add a table to the process, I can simply add an extended property, and regenerate  with BIML.
    i just like it that way, as it is self containe din the database.
    Un fortuneately, it is also not obvious, as extended properties  are fairly well hidden from every day reviews.

    so my loading query is something like this:
    SELECT
    SCHEMA_NAME(objz.schema_id) AS [Object_Schema],
    objz.name       AS [Object_Name],
    objz.type_desc     AS [ObjectType],
    ''         AS [ColumnName],
    propz.[class],
    propz.[class_desc],
    propz.[major_id],
    propz.[minor_id],
    propz.[name],
    propz.[value]
    FROM sys.objects objz
    LEFT OUTER JOIN sys.extended_properties propz
    on objz.object_id = propz.major_id
    WHERE propz.[class] = 1
    AND propz.[minor_id] = 0
    AND propz.[name] = 'BIMLProcess2016'

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Do either of you know any consultants in the Seattle, WA area that could help us get started so that we can incorporate best practices from the start?

  • screenshot - Tuesday, March 12, 2019 3:04 PM

    Do either of you know any consultants in the Seattle, WA area that could help us get started so that we can incorporate best practices from the start?

    I am not in the Seattle area, but a consultant and happy to chat. I think Mark Ginnebaugh's company DesignMind is in Seattle if you need somebody local.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply