Technical Article

Real-Time SQL Server to BigQuery Streaming ETL using CDC

CDC Changes: The script queries the CDC tables in SQL Server to retrieve the changes (inserts, updates, deletes) since the last sync. Each change is processed with a mapped operation type (INSERT, UPDATE, DELETE).
Real-Time Streaming to BigQuery: The captured changes are streamed directly to BigQuery through the client library's insert_rows_json method (streaming inserts), avoiding the need for batch uploads via Google Cloud Storage.
Tracking Last Sync Time: The script tracks the last synchronization time and updates it only after a successful sync, so each run resumes from where the previous one ended and no changes are skipped.
Low Latency: By continuously querying the CDC tables and streaming the changes, the script achieves near real-time data synchronization.
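
The four points above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the article's actual script: `fetch_changes` is a hypothetical callable standing in for the CDC query, `operation_type` is a hypothetical column name, and `bq_client` is any object exposing `insert_rows_json(table, json_rows)`, such as `google.cloud.bigquery.Client`, which returns an empty list when every row is accepted. The operation codes are the ones SQL Server CDC writes to the `__$operation` column.

```python
from datetime import datetime, timezone

# SQL Server CDC stores the change type in the __$operation column:
# 1 = delete, 2 = insert, 3 = update (before image), 4 = update (after image).
CDC_OPERATIONS = {1: "DELETE", 2: "INSERT", 4: "UPDATE"}


def to_bigquery_rows(cdc_rows):
    """Turn CDC rows (dicts) into rows labelled with an operation type."""
    out = []
    for row in cdc_rows:
        op = CDC_OPERATIONS.get(row["__$operation"])
        if op is None:
            continue  # skip the before-image of an update (code 3)
        # Drop CDC metadata columns such as __$operation and __$start_lsn.
        payload = {k: v for k, v in row.items() if not k.startswith("__$")}
        payload["operation_type"] = op  # hypothetical column name
        out.append(payload)
    return out


def sync_once(fetch_changes, bq_client, table_id, last_sync):
    """Run one sync cycle and return the new last-sync time.

    fetch_changes(since, until) is a hypothetical callable returning the
    CDC rows captured in that window; values must be JSON-serializable.
    """
    now = datetime.now(timezone.utc)
    rows = to_bigquery_rows(fetch_changes(last_sync, now))
    if rows:
        errors = bq_client.insert_rows_json(table_id, rows)
        if errors:
            # Raise before the sync time advances so the window is retried.
            raise RuntimeError(f"BigQuery rejected rows: {errors}")
    return now
```

Calling `sync_once` in a loop with a short sleep, and persisting the returned timestamp between iterations, would give the continuous polling behaviour described above. Because the timestamp is returned only when the insert reports no errors, a failed cycle leaves the previous sync time in place.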


2024-11-13

