Azure Automation

  • Hello everyone,

    I am looking for the best way to accomplish the request below.
    > Resume Azure Data Warehouse, run the batch process in ADF, stop Azure Data Warehouse

    For this, I have created:
    a Runbook to resume ADW
    ADF pipelines using a trigger
    a Runbook to stop ADW

    Now I am looking for suggestions on how to run them sequentially using one schedule.

    Thanks in advance!

  • Pretty sure every one of these can be a command from PowerShell. That's pretty much all you need.
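
    For instance, assuming the AzureRM modules and placeholder names ($rgName, $serverName, $dwName, $dfName, $pipelineName are stand-ins, swap in your own), a rough sketch of the individual commands might look like this:

    # Resume the data warehouse (AzureRM.Sql module)
    Resume-AzureRmSqlDatabase -ResourceGroupName $rgName -ServerName $serverName -DatabaseName $dwName

    # Kick off the ADF V2 pipeline (AzureRM.DataFactoryV2 module); returns a run ID
    $runId = Invoke-AzureRmDataFactoryV2Pipeline -ResourceGroupName $rgName -DataFactoryName $dfName -PipelineName $pipelineName

    # ...wait for the pipeline run to finish, then pause the warehouse again
    Suspend-AzureRmSqlDatabase -ResourceGroupName $rgName -ServerName $serverName -DatabaseName $dwName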

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks Grant.

    That's what I figured, and I created Resume/Pause ADW runbooks using PowerShell. But calling ADF pipelines from a runbook is giving me a hard time, and I am very new to PowerShell.

    Get-AzureRmDataFactoryV2Pipeline -ResourceGroupName $rgn -DataFactoryName $dfn is failing with the error "Get-AzureRmDataFactoryV2Pipeline : Method not found: 'Newtonsoft.Json.Serialization.IAttributeProvider'".
    I also have to figure out how to invoke the pipelines one after the other... I am looking for a PowerShell template or script to accomplish this task.
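
    Roughly, what I am trying to get to (the pipeline names here are just made up) is something like:

    $runId = Invoke-AzureRmDataFactoryV2Pipeline -ResourceGroupName $rgn -DataFactoryName $dfn -PipelineName 'Pipeline1'

    # Poll until the first run reaches a terminal state before starting the next pipeline
    do {
        Start-Sleep -Seconds 30
        $run = Get-AzureRmDataFactoryV2PipelineRun -ResourceGroupName $rgn -DataFactoryName $dfn -PipelineRunId $runId
    } while ($run.Status -in 'Queued', 'InProgress')

    $runId = Invoke-AzureRmDataFactoryV2Pipeline -ResourceGroupName $rgn -DataFactoryName $dfn -PipelineName 'Pipeline2'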

  • kishoremania - Tuesday, April 17, 2018 5:09 PM

    Thanks Grant.

    That's what I figured, and I created Resume/Pause ADW runbooks using PowerShell. But calling ADF pipelines from a runbook is giving me a hard time, and I am very new to PowerShell.

    Get-AzureRmDataFactoryV2Pipeline -ResourceGroupName $rgn -DataFactoryName $dfn is failing with the error "Get-AzureRmDataFactoryV2Pipeline : Method not found: 'Newtonsoft.Json.Serialization.IAttributeProvider'".
    I also have to figure out how to invoke the pipelines one after the other... I am looking for a PowerShell template or script to accomplish this task.

    I have not programmed against Data Factory yet, so I'm unsure. However, if you have the AzureRM PowerShell modules installed, you should be able to use the ISE to browse the commands. If that method is not found, you should be able to locate the correct one easily enough, or figure out if there's a typo, etc.
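
    For example, something along these lines in the ISE will list what the Data Factory module exposes and which module versions you actually have (assuming the AzureRM flavour of the modules):

    # List the cmdlets the Data Factory V2 module provides
    Get-Command -Module AzureRM.DataFactoryV2

    # Check which versions of the relevant modules are installed
    Get-Module -ListAvailable AzureRM.DataFactoryV2, AzureRM.Profile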

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • In your Automation Account, have you imported the Data Factory modules?
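
    For reference, one quick way to check from PowerShell (resource names here are placeholders) might be:

    # List the modules imported into the Automation account and look for the Data Factory one
    Get-AzureRmAutomationModule -ResourceGroupName $rgName -AutomationAccountName $automationAccountName |
        Where-Object { $_.Name -like '*DataFactory*' }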

  • I'm pretty sure you can start, scale, and pause ADW now with T-SQL. I will check later today; I say this because I'm fairly sure I redesigned one of our Python scripts to do so, but I can't think straight right now as I just woke up and am completely tired from a conference I was at all day yesterday.

    For example, I am fairly positive the query below will resume your data warehouse if you execute it while the warehouse is paused. The master database is always online regardless of the ADW state.


    ALTER DATABASE mySampleDataWarehouse
    MODIFY (SERVICE_OBJECTIVE = 'DW300');

    Outside of that, in case I'm totally wrong, we use the API to fully control the data warehouse with Python. We start and pause the data warehouse on a schedule each day from a simple Python script run by the Windows Task Scheduler. Easy peasy.

    To execute T-SQL once resumed, we use pyodbc (the Python module we use for all our SQL Servers) to execute stored procedures as well as to load data via PolyBase external tables. The issue here is that PolyBase returns rejected rows as output to the console, which causes pyodbc to roll back the transaction. Microsoft is working on ways to suppress this output or redirect it to a file to help people collect the failed records for logging/investigation purposes.

  • higgim - Thursday, April 19, 2018 4:15 AM

    In your Automation Account, have you imported the Data Factory modules?

    Yes, I imported both of them.

  • xsevensinzx - Thursday, April 19, 2018 5:55 AM

    I'm pretty sure you can start, scale, and pause ADW now with T-SQL. I will check later today; I say this because I'm fairly sure I redesigned one of our Python scripts to do so, but I can't think straight right now as I just woke up and am completely tired from a conference I was at all day yesterday.

    For example, I am fairly positive the query below will resume your data warehouse if you execute it while the warehouse is paused. The master database is always online regardless of the ADW state.


    ALTER DATABASE mySampleDataWarehouse
    MODIFY (SERVICE_OBJECTIVE = 'DW300');

    Outside of that, in case I'm totally wrong, we use the API to fully control the data warehouse with Python. We start and pause the data warehouse on a schedule each day from a simple Python script run by the Windows Task Scheduler. Easy peasy.

    To execute T-SQL once resumed, we use pyodbc (the Python module we use for all our SQL Servers) to execute stored procedures as well as to load data via PolyBase external tables. The issue here is that PolyBase returns rejected rows as output to the console, which causes pyodbc to roll back the transaction. Microsoft is working on ways to suppress this output or redirect it to a file to help people collect the failed records for logging/investigation purposes.

    Thank you!

    Pausing and resuming the Azure data warehouse is not my concern; I am doing that easily through PowerShell runbooks under the Azure Automation account. I also know it can be done in other ways, like T-SQL and the REST APIs.
    My actual task is to run the three tasks below together as one sequence:
    1) Resume Azure Data Warehouse
    2) Run the batch process (ADF pipelines)
    3) Pause Azure Data Warehouse

    For 1 & 3, I created PowerShell runbooks and they are running fine. For 2, I am trying to create another PowerShell runbook that calls the ADF pipelines, so I can finally create one main runbook calling all three of them.
    It is while working on 2 that I am seeing the error mentioned in my post above!
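
    What I have in mind for the main runbook, very roughly (the runbook names below are placeholders for my actual ones), is to attach the single schedule to a parent runbook like this:

    $params = @{
        ResourceGroupName     = $rgName
        AutomationAccountName = $automationAccountName
    }

    # -Wait blocks until each child runbook job completes, so the steps run in sequence
    Start-AzureRmAutomationRunbook @params -Name 'Resume-ADW' -Wait
    Start-AzureRmAutomationRunbook @params -Name 'Run-ADFPipelines' -Wait
    Start-AzureRmAutomationRunbook @params -Name 'Pause-ADW' -Wait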

  • Ahh, gotcha. I can't help you there. I put tasks 1 and 3 in with my ETL, which is all in Python. As the Python scripts are easy to automate, there is no need for ADF in my case, so I don't use it for anything at the moment. The same would be true for PowerShell if I used that instead.
