Data Profiling Tool Suggestion

  • I am in the design phase of a new data warehouse and am trying to weigh the costs and benefits of the profiling tools out there.  I know the actual cost will depend on our needs and intended use, but I'd like to gather some information without starting a conversation with a product salesman.

    If anyone has experience with a data profiling tool, will you please share with me the following:

    1. What tool?
    2. What was the price (range)?
    3. Would you recommend it?

    Thanks in advance for the info!

    Dan

  • The choice of tool depends heavily on the specifics of your environment. I've used both ERwin (from CA) and ERStudio (from Embarcadero) with great success. However, for light usage, the tools built into SQL Server 2005 will let you do a decent job. Basic questions:

    - How much documentation do you plan on doing? The tools really shine in that area, allowing you to communicate your designs in a consistent, effective manner.

    - How much "standardization" are you attempting to implement? Again, the tools allow you to define certain standards and reuse them over and over.

    - What about trigger and/or stored procedure generation? Do you want to implement standard templates?

    For data warehousing, the basic structures are generally simpler and easier to understand and review than those of normal OLTP databases.

    In general, you can't go wrong recommending a modelling tool that will improve documentation and standards. But a lot depends on your budget.

  • Thanks for the feedback.  I've actually started using Datiris Profiler.  It's fairly light on features, but it does what I need and it's not as expensive as the others I've looked at.

    Again, thanks for the info!

  • Try http://www.predictivedatamanagement.com - you will find free data flows and data quality data flows. For the full feature set you do need to buy a low-cost engine. The basic data profiling data flow is free and self-contained.

    Good Luck

  • Check out my downloadable freeware data profiling toolkit w/source at: http://www.ipcdesigns.com/data_profiling/

    I have also published a free date dimension toolkit at:

    http://www.ipcdesigns.com/dim_date/

    I am currently developing a freeware ETL metadata management toolkit. My goal is to have it available for download by 2/1/2008. Check either of the pages above for a link to it.

    Good luck in '08 and best regards,

    Don McMunn, IPC Designs

  • Heh... I use the Mil-spec Mark I Mod I Grey Matter Data Integrator... works every time, and any costs incurred can be simply resolved by a good night's sleep and maybe a Scotch or two. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


