April 3, 2025 at 7:01 pm
Hi as we replace on prem machines , sometimes not with vm's, what should we be thinking about for etl?
from my little corner of the world where we use both talend and ssis, and where talend is getting costly, and where ssis is slowly being squeezed into a call from data factory to an on prem machine, and where data factory is expensive and overkill for non big data platforms, what should we be thinking about?
one of my peers wants to move toward c# which blows the whole no code paradigm. I told him about pentaho but i dont know if its going to blow his socks off. and im sure there are plenty of pentaho competitors.
i suppose if we paid more (we're limited to pro licenses right now across the company) we could use power bi datasets etc. maybe in a limited way.
April 3, 2025 at 9:32 pm
I've now tried migrating from SSIS to Data Factory twice. The first time (three years ago) failed because DF could not do the things we needed. The second time was in the past six months and was quite successful, except for the cost, which was rocketing upwards as we tried to ramp up performance. We headed back to SSIS in Azure VMs, which has a much better price/performance ratio
I know plenty about dataflows and semantic models, but not datasets, so I'm interested to hear about how you (or others) have been using those for ETL.
April 4, 2025 at 12:56 pm
As SSIS get a litte love lately, thinking of moving to c#, powershell. We don't have very complicated workloads and excel is being read by a c# script inside SSIS
April 4, 2025 at 1:04 pm
Powershell & Python.
April 4, 2025 at 3:28 pm
thx phil we didnt use datasets in etl. its all talk/pontificating. and i'd be more prone to trying pentaho etc than doing that or reinventing some pretty well established wheels by trying to home grow etl using c#. can you imagine coding in c# the normal capability of all etl tools in iterating over whatever. the tco seems prohibitive.
If memory serves, i sat with MS 2-3 years ago griping about how difficult it was to understand data factory. My take away was that off the shelf one can plumb popular big data products to DF once you kind of understand their gui. And i might be mixing this up, but if you are small data and dont want to call ssis , you can plumb a dataset created in a pbi data flow to DF. Dont quote me on that, its been quite while. The problem there (which i learned a couple of years later) is the cost of being able to use DF and the cost of creating datasets. we are limited to pro licenses across the company right now. and i suspect once we saturate the company with access to our sales warehouse (26 erps i think when we get them all in our rollups) , we will start seeing big data in other areas like mfg and maybe procurement.
April 8, 2025 at 6:55 pm
I see some SSIS people moving to ADF, but mostly just running SSIS from the connector.
Some non MSSQL people seem to like dbt, but that looks like a wild, crazy way of working to me.
April 9, 2025 at 3:49 pm
Is the PowerQuery source ever going to work in SSIS? If not, how do I call the PowerQuery refresh in Excel from inside SSIS?
April 9, 2025 at 4:28 pm
PowerQuery source works in SSIS - just needs to be done correctly (which you on your thread about it seem to have done all required EXCEPT 1 very important step - not that reading PDF files with it should be done at all - source should be changed to something else)
April 9, 2025 at 5:27 pm
What "one very important step" am I missing? Please elaborate!
Seems to me, I should use a For Each File loop in SSIS, and then inside that, pass the file name to the PowerQuery, then use the output of the PowerQuery to append that result to a table in my database.
April 16, 2025 at 6:54 am
We're using SSIS with CozyRoc components on the old warehouse and AWS tools - Step Functions and Glue with Python and Athena SQL on the new Data Lake.
The rest of the business is now to move to MuleSoft for integrations between systems. Whether we also do has not been mentioned.
We're also moving the one or two people who do heavy Power Query in PowerBI to mangle spreadsheets together to use warehouse prepared data because of the cost of the Fabric Capacity it needs and the aim for consistent sources of data.
May 7, 2025 at 12:50 pm
thx steve, it appears this post went off track with power query discussions.
anyway something occurred to me the other day. i forget what its called but there is /was a free version of sql server years ago. i cannot recall if it included ssis. but im thinking if keeping one vm or on prem machine (dont know what those cost when equipped with just the minimum horsepower) around for just that purpose, maybe that is yet another alternative to what i think is a pending etl crisis for some of us over small data. the only thing saving us right now is that we use ssrs but i think its days are numbered one way or another. with it being an anchor preventing us from migrating our engine to the cloud (not vm), its kind of like a last line of defense.
May 7, 2025 at 1:20 pm
Migrating your SSRS reports to Power BI Paginated reports is straightforward & sounds like a better solution to me. Or are you staying away from Power BI?
May 7, 2025 at 2:31 pm
Express is free, but no SSIS/SSRS with this. I don't think there are free replacements for them
I might aim for a VM in the cloud I can shut down. Storage is rarely expensive.
May 7, 2025 at 7:03 pm
good question phil. i think my company is averse to putting them up there right now because of 1) the variety of erp's sourcing those reports , 2) the number of reports/datasets , 3) the many hooks between .net and ssrs where .net "generates" many reports for us thru RM/ssrs, 4) our users otherwise being used to going to RM directories (not workspaces) to run all the other reports, 5) at some point my company decided to store word docs , spreadsheets, pdfs etc up on ssrs's directories also. I'm kind of skeptical too. But have to believe ssrs's days are numbered anyway.
May 7, 2025 at 7:04 pm
thx steve.
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply