ETL/SSIS/Azure Data Factory

Technical Article

Real-Time SQL Server to BigQuery Streaming ETL using CDC

  • Script

CDC Changes: The script queries the CDC tables in SQL Server to retrieve the changes (inserts, updates, deletes) since the last sync. Each change is processed with a mapped operation type (INSERT, UPDATE, DELETE).
Real-Time Streaming to BigQuery: The captured changes are streamed directly to BigQuery with the Python client's insert_rows_json method (streaming inserts), avoiding the need for batch uploads via Google Cloud Storage.
Tracking Last Sync Time: The script tracks the last synchronization time and updates it only after a successful sync, so a failed run is retried from the same point instead of skipping changes.
Low Latency: By continuously querying the CDC tables and streaming the changes, the script achieves near real-time data synchronization.
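
The four points above can be sketched in a few lines of Python. This is a minimal illustration, not the published script: the names to_bigquery_rows, sync_once, fetch_changes, stream_rows and the operation_type column are hypothetical, and the SQL Server query and BigQuery client are passed in as callables so the logic can be read and tested on its own. It assumes fetch_changes returns the CDC rows as dictionaries with JSON-compatible values, together with the upper bound that the query used.

```python
# SQL Server CDC stores the change type in the __$operation column:
# 1 = delete, 2 = insert, 3 = update (before image), 4 = update (after image).
CDC_OPERATIONS = {1: "DELETE", 2: "INSERT", 4: "UPDATE"}


def to_bigquery_rows(changes):
    """Turn raw CDC rows (dicts) into rows labelled with an operation type."""
    rows = []
    for change in changes:
        operation = CDC_OPERATIONS.get(change["__$operation"])
        if operation is None:
            # Code 3 is the pre-update image; only the post-update row is kept.
            continue
        # Drop CDC metadata columns, which all start with "__$".
        row = {k: v for k, v in change.items() if not k.startswith("__$")}
        row["operation_type"] = operation  # hypothetical target column name
        rows.append(row)
    return rows


def sync_once(fetch_changes, stream_rows, last_sync):
    """Run one sync cycle and return the watermark for the next cycle.

    fetch_changes(last_sync) -> (changes, high_watermark)   (assumed shape)
    stream_rows(rows)        -> list of errors, empty when every row succeeded
    """
    changes, high_watermark = fetch_changes(last_sync)
    rows = to_bigquery_rows(changes)
    if rows:
        errors = stream_rows(rows)
        if errors:
            # Keep the old watermark so the same changes are retried next time.
            return last_sync
    return high_watermark


if __name__ == "__main__":
    sample = [
        {"__$operation": 2, "id": 1, "name": "Ada"},
        {"__$operation": 3, "id": 1, "name": "Ada"},
        {"__$operation": 4, "id": 1, "name": "Ada L."},
    ]
    next_sync = sync_once(
        lambda since: (sample, "2024-11-13T00:00:05"),  # stand-in for the CDC query
        lambda rows: [],                                # stand-in for BigQuery
        "2024-11-13T00:00:00",
    )
    print(next_sync)
```

In a real run, stream_rows could wrap the google-cloud-bigquery client, for example `lambda rows: client.insert_rows_json(table_id, rows)`, which returns an empty list when all rows were accepted. Calling sync_once in a loop with a short sleep gives the near real-time behaviour described above. Because the watermark only moves after a successful insert, a retried cycle can deliver the same change twice, so the target table should tolerate duplicates.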


2024-11-13

605 reads
