Troubleshooting a pipeline error

  • I am creating a pipeline in a data factory per this page's instructions, https://docs.microsoft.com/en-us/azure/data-factory/tutorial-incremental-copy-change-data-capture-feature-portal. I am a beginner with this and am trying to follow the instructions as closely as possible. The difference, in my case, is that I am using a SQL Server installation on an Azure VM, not a managed instance. I have successfully "previewed" the data from the Lookup object, and it returns a value of 3 for the number of rows that have changed.

    I am at step 8 of the "Create a pipeline to copy the changed data" section. When I run the pipeline in debug mode, I get a message that the pipeline failed, with no useful troubleshooting information at all. As I was gathering info for this post, I viewed the code of the pipeline, cancelled out of that, and then attempted to run the pipeline again in debug mode. Now I get: "HasChangedRows

    The output of activity 'GetChangeCount' can't be referenced since it is either not an ancestor to the current activity or does not exist"

    Where do I go from here?

    My code:

    {
        "name": "IncrementalCopyPipeline",
        "properties": {
            "activities": [
                {
                    "name": "GetChangeCount",
                    "type": "Lookup",
                    "dependsOn": [],
                    "policy": {
                        "timeout": "7.00:00:00",
                        "retry": 0,
                        "retryIntervalInSeconds": 30,
                        "secureOutput": false,
                        "secureInput": false
                    },
                    "userProperties": [],
                    "typeProperties": {
                        "source": {
                            "type": "SqlServerSource",
                            "sqlReaderQuery": "DECLARE @from_lsn binary(10), @to_lsn binary(10); \nSET @from_lsn =sys.fn_cdc_get_min_lsn('dbo_Acknowledgement_Timeliness'); \nSET @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', GETDATE());\nSELECT count(1) changecount FROM cdc.fn_cdc_get_net_changes_dbo_Acknowledgement_Timeliness(@from_lsn, @to_lsn, 'all')",
                            "queryTimeout": "02:00:00",
                            "partitionOption": "None"
                        },
                        "dataset": {
                            "referenceName": "ZTWSHRDSQL08C_TestReplicationToDataLake",
                            "type": "DatasetReference"
                        }
                    }
                },
                {
                    "name": "HasChangedRows",
                    "type": "IfCondition",
                    "dependsOn": [],
                    "userProperties": [],
                    "typeProperties": {
                        "expression": {
                            "value": "@greater(int(activity('GetChangeCount').output.firstRow.changecount),0)",
                            "type": "Expression"
                        },
                        "ifTrueActivities": [
                            {
                                "name": "Wait1",
                                "type": "Wait",
                                "dependsOn": [],
                                "userProperties": [],
                                "typeProperties": {
                                    "waitTimeInSeconds": 1
                                }
                            }
                        ]
                    }
                }
            ],
            "annotations": []
        }
    }

    Thanks,

    Chris

    Learning something new on every visit to SSC. Hoping to pass it on to someone else.

  • Finally stumbled across the solution. You have to connect the objects with a workflow arrow, something the documentation on the MS page did not mention. Now to work through the rest of it.

    Microsoft: Halfway on every solution.
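    For anyone who hits the same error: dragging the green tab from one activity to the next populates the downstream activity's "dependsOn" array in the JSON. A sketch of what the HasChangedRows activity should look like once it is connected (activity names taken from the pipeline above; the rest of the activity is unchanged):

    ```json
    {
        "name": "HasChangedRows",
        "type": "IfCondition",
        "dependsOn": [
            {
                "activity": "GetChangeCount",
                "dependencyConditions": [ "Succeeded" ]
            }
        ]
    }
    ```

    With the dependency in place, GetChangeCount becomes an ancestor of HasChangedRows, so the activity('GetChangeCount') reference in the If Condition expression can resolve.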


    Chris


  • Thanks, how did you couple it with a workflow?

  • Amazing... you had to write all of that for a pipeline?  Guess I'm glad I don't have to work with such a thing.

    Since I don't know about pipelines, would it have been possible to have your code call a stored procedure instead of inlining the code like you did?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • No, not much of it had to be written; just parameter values had to be entered. This is the code behind the GUI, kind of like an SSIS GUI that generates JSON code for you. I am a beginner at this and do not know how it will act when returning record sets from procedures, so I'm keeping it basic for now. A better design may come to mind as I get more experience. The way this behaves, which is nothing like SSIS, is difficult enough to deal with right now. It's taking a lot of googling to find out how to make the pieces work.

    I connected the objects by clicking on the little green tab of a given object and dragging it onto the next object.


    Thanks,

    Chris



  • Jeff Moden wrote:

    Amazing... you had to write all of that for a pipeline?  Guess I'm glad I don't have to work with such a thing.

    Since I don't know about pipelines, would it have been possible to have your code call a stored procedure instead of inlining the code like you did?


    Yes, you can call stored procedures to get or save data. Data Factory pipelines are typically built in the web GUI, so a lot of it is drag and drop, after which you specify the parameters. You can get into the JSON behind it if you want, but you need to be careful editing it. There are cases where I have found it quicker to edit the JSON. Fortunately, we have all of ours connected to a git repo, so if you err badly you can get back to where you were.
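    As a sketch of what that would look like for the Lookup activity above: if the inline CDC query were moved into a stored procedure (the name dbo.GetNetChangeCount and its parameter are hypothetical, not from the tutorial), the activity's source block would change to something along these lines:

    ```json
    "source": {
        "type": "SqlServerSource",
        "sqlReaderStoredProcedureName": "dbo.GetNetChangeCount",
        "storedProcedureParameters": {
            "captureInstance": {
                "type": "String",
                "value": "dbo_Acknowledgement_Timeliness"
            }
        },
        "queryTimeout": "02:00:00",
        "partitionOption": "None"
    }
    ```

    The procedure would need to return a single-row result set (the changecount column) for the Lookup's firstRow reference to keep working.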
