Bash for ETL pre-processing

  • Ed Wagner (8/15/2016)


    Nice article, David. Like Jeff pointed out, if people have an irrational fear of xp_cmdshell, what are they going to think about Bash? I don't have this fear and I find your article very intriguing. It also brought back memories (some good, some not) of Unix days.

    David.Poole (8/15/2016)


    I'm beginning to think that common sense as a resource is becoming rarer.

    Agreed. I'm reminded of Lowell's signature about common sense actually being considered a superpower. 😉

    I was going to post a response this morning about how it's rational to have some measure of concern over xp_cmdshell, but the IPS at my workplace completely shut down my internet connection, because my comment contained "xp_cmdshell".

    :blink:

  • I personally use Python rather than Bash in Linux environments. I find it just as simple yet more powerful, because there are modules available that enhance the ETL pipeline in many ways, most of all by enabling rapid development.

    The real power comes from distributing ETL workloads across multiple machines, using a completely open-source stack tied together with queuing modules. Your ETL becomes massively parallel processing (MPP), and data starts to flow faster than what you can achieve with SSIS, at the cost of more complexity and more difficult management.
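
    As a rough illustration of the queue-based pattern described above, here is a minimal sketch, not the poster's actual setup. It assumes a single machine, with Python's standard-library `queue` and `threading` modules standing in for a real message broker and worker machines. The `transform` step, the row layout, and the `run_etl` helper are all hypothetical names chosen for the example.

```python
import queue
import threading

SENTINEL = None  # marks the end of the work stream for one worker


def transform(row):
    """Hypothetical transform step: trim whitespace and upper-case the name."""
    ident, name = row
    return ident, name.strip().upper()


def worker(tasks, results):
    """Pull rows off the task queue until the sentinel arrives."""
    while True:
        row = tasks.get()
        if row is SENTINEL:
            break
        results.put(transform(row))


def run_etl(rows, n_workers=4):
    """Fan rows out to n_workers via a queue and collect the transformed rows."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for row in rows:
        tasks.put(row)
    for _ in threads:
        tasks.put(SENTINEL)  # one sentinel per worker so each one stops
    for t in threads:
        t.join()
    out = []
    while not results.empty():  # safe here: all producers have finished
        out.append(results.get())
    return sorted(out)  # workers finish in arbitrary order, so sort for stability


if __name__ == "__main__":
    print(run_etl([(2, " bob "), (1, "alice")]))
```

    In a genuinely distributed setup the in-process queues would be replaced by a networked queue and the threads by worker processes on separate machines, which is where the extra management complexity mentioned above comes in.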
