SQLServerCentral is supported by Redgate
 
Brush Up on Your ETL Skills



Steve Jones
SSC Guru (539K reputation)

Group: Administrators
Points: 539030 Visits: 20703
Comments posted to this topic are about the item Brush Up on Your ETL Skills

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
xsevensinzx
SSCoach (18K reputation)

Group: General Forum Members
Points: 18074 Visits: 5127
IP addresses, cookies, gender, zip codes, even religious views, to name a few.

87% of American adults could be accurately and uniquely identified using just three data points (date of birth, gender, and a five-digit zip code) combined with publicly available census data. That sobering statistic highlights why robust pseudonymization measures are needed, particularly in light of large-scale data breaches such as the Equifax security incident.
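
To make the re-identification point concrete, here is a minimal Python sketch that measures how many records in a data set are unique on a chosen set of quasi-identifiers. The function name, field names, and sample rows are all made up for illustration; nothing here comes from the census study quoted above.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Return the fraction of records whose quasi-identifier combination occurs exactly once."""
    keys = [tuple(record[name] for name in quasi_identifiers) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(keys) if keys else 0.0


# Hypothetical sample data: names are already removed, yet some rows remain unique.
people = [
    {"dob": "1980-01-15", "gender": "F", "zip": "30301"},
    {"dob": "1980-01-15", "gender": "F", "zip": "30301"},
    {"dob": "1975-06-02", "gender": "M", "zip": "30301"},
    {"dob": "1990-11-23", "gender": "F", "zip": "98101"},
]

print(unique_fraction(people, ["dob", "gender", "zip"]))  # 0.5
print(unique_fraction(people, ["gender"]))                # 0.25
```

A high fraction on a small set of columns suggests that dropping or hashing the name column alone would not be enough to pseudonymize the data.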


In my world, it's a lot of ETL. Luckily, as I'm working mostly with data scientists, both the machine learning and the ETL can coexist in Python. Even before then, I was eager to use Python over T-SQL and SSIS for the simple reason that you can very easily set up distributed processing in Python, with ETL pipelines where each data stream is processed in a shared-nothing environment while the streams still work together as a whole. The only issue is that some of these scripting languages are not the fastest tools in the box compared with other options, but it generally works out in the end because they can scale horizontally where others can only scale up.
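
As a rough illustration of the shared-nothing idea, the sketch below partitions rows by a key and hands each partition to its own worker process using only the Python standard library. It is a single-machine stand-in for a real distributed pipeline, and the function names, the `amount` field, and the cents conversion are assumptions made up for the example.

```python
from concurrent.futures import ProcessPoolExecutor
from zlib import crc32


def partition_by_key(rows, key, n_partitions):
    """Route each row to exactly one partition using a stable hash of its key."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        index = crc32(str(row[key]).encode("utf-8")) % n_partitions
        partitions[index].append(row)
    return partitions


def transform_partition(rows):
    """Pure transform: reads only the rows it is given, so workers share no state."""
    return [{**row, "amount_cents": round(row["amount"] * 100)} for row in rows]


def run_parallel(rows, key, workers=4):
    """Transform each partition in a separate process and merge the results."""
    partitions = partition_by_key(rows, key, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(transform_partition, partitions))
    return [row for partition in results for row in partition]
```

Call `run_parallel(rows, "customer_id")` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Because every row with the same key lands in the same partition, adding workers (or machines, in a real cluster) scales the job out without the partitions needing to coordinate.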

Dave Poole
SSC Guru (54K reputation)

Group: General Forum Members
Points: 54303 Visits: 3914
GDPR Article 17, "Right of Erasure", may require you to have a mechanism to delete all forum posts and private messages for a particular user.
I don't think "Right of Erasure" would cover articles written by someone, in the unlikely event of an author requesting erasure.
"Right of Erasure" does not trump the legal requirement to keep financial records for the legally mandated time period.
If Redgate haven't done so already, it would be wise to get advice from legal.
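
A minimal sketch of what such an erasure mechanism could look like, using an in-memory SQLite database as a stand-in. The table and column names (`forum_posts`, `private_messages`, `invoices`) are hypothetical and are not the real SQLServerCentral schema; the point is only that the deletes run in one transaction while financial records are left alone.

```python
import sqlite3


def create_demo_schema(conn):
    """Create hypothetical tables for the demonstration."""
    conn.executescript("""
        CREATE TABLE forum_posts (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT);
        CREATE TABLE private_messages (id INTEGER PRIMARY KEY, sender_id INTEGER, body TEXT);
        CREATE TABLE invoices (id INTEGER PRIMARY KEY, user_id INTEGER, amount_cents INTEGER);
    """)


def erase_user_content(conn, user_id):
    """Delete a user's posts and sent messages atomically; invoices are not touched."""
    with conn:  # commits on success, rolls back if either statement fails
        posts = conn.execute(
            "DELETE FROM forum_posts WHERE user_id = ?", (user_id,)
        ).rowcount
        messages = conn.execute(
            "DELETE FROM private_messages WHERE sender_id = ?", (user_id,)
        ).rowcount
    return {"forum_posts": posts, "private_messages": messages}
```

Whether messages the user received should also go, and whether posts should be deleted or merely anonymized, are policy questions this sketch does not try to answer.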

A subject access request for subscribers would cover anything in their profile, but as the site provides the mechanism to see this, it is effectively self-service. If there is nothing beyond what people can self-serve, then it may be as simple as having an explicit GDPR page that states how a requester can retrieve their own data.

Article 20 "Right to data portability" is an interesting one. It doesn't limit its scope, but I think historically it reinforces consumers' rights to swap energy suppliers, broadband/mobile providers and now banking providers. In the context of SQLServerCentral it could be a mechanism for a subscriber to download their own profile.
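
One way a profile download could work is to serialize the subscriber's own fields to JSON, which is a structured, commonly used, machine-readable format. This is only a sketch: the function name, the profile fields, and the choice of which fields count as exportable are assumptions for illustration.

```python
import json


def export_profile(profile, exportable_fields):
    """Serialize the listed profile fields to a JSON string, skipping fields that are absent."""
    portable = {name: profile[name] for name in exportable_fields if name in profile}
    return json.dumps(portable, indent=2, sort_keys=True, ensure_ascii=False)


# Hypothetical profile record; the internal id is deliberately left out of the export.
profile = {
    "internal_id": 12345,
    "display_name": "example_user",
    "email": "user@example.com",
    "signature": "we travel not to escape life",
}

print(export_profile(profile, ["display_name", "email", "signature"]))
```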

Another interesting wrinkle is what you do when not all your data is in SQL Server. Does something like Presto (the open-source query engine that AWS Athena is built on) provide an answer to this and, serendipitously, to a general business problem?

As general advice to people facing GDPR, I would say take a good hard look at any company file shares, email inboxes, Dropbox/OneDrive-type accounts, workstations, SharePoint, etc. In the SQL Server world we have a structured data store with a defined retention strategy, purge, archive and backup. On company file shares and mailboxes there is God knows what, God knows where and in God knows what format.
If your HR department takes a scan of your passport when you first join the company, then they need to have defined processes in place to purge those images when they are no longer in use. Unless you have some form of auditing software, such as www.groundlabs.com, which has the capability to perform OCR on images, it is going to be very hard to identify what your exposure and risk are.
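
As a very rough first pass at a file share, the sketch below lists image-like files that have not been modified for a given number of days. It does no OCR and cannot tell a passport scan from a holiday photo, so it is no substitute for proper auditing software; the suffix list, the function name, and the use of modification time as a proxy for "no longer in use" are all assumptions.

```python
import time
from pathlib import Path

# Assumed set of file types that might hold scanned documents.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".pdf"}


def find_stale_scans(root, max_age_days, now=None):
    """Return image-like files under root whose last modification is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for path in Path(root).rglob("*"):
        if (
            path.is_file()
            and path.suffix.lower() in IMAGE_SUFFIXES
            and path.stat().st_mtime < cutoff
        ):
            stale.append(path)
    return sorted(stale)
```

The resulting list is a review queue for a human, not a delete list.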

LinkedIn Profile
www.simple-talk.com
chrisn-585491
SSChampion (12K reputation)

Group: General Forum Members
Points: 12130 Visits: 2896
You say "ETL specialist", I say "Data Janitor". :D
below86
SSCertifiable (7.1K reputation)

Group: General Forum Members
Points: 7068 Visits: 4699
We use SSIS for all of our ETL. I would prefer that SSIS be used more for the EL and not the (T)ransform: set up SSIS to extract and load the data into a 'work' table, then use SQL to transform the data. In everything I've done so far in my career, I haven't found any 'Transform' that I couldn't do in SQL.
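
The extract-and-load-then-transform pattern described above can be sketched as follows. SQLite (via Python's standard library) stands in for SQL Server and SSIS here, and the table names, columns, and cleanup rules are invented for the example; the shape to notice is that the load step copies raw text untouched and all cleanup lives in one SQL statement.

```python
import sqlite3

# Hypothetical raw feed: untrimmed names, amounts as text, one unusable row.
RAW_ROWS = [
    ("  Alice ", "2018-02-21", "19.99"),
    ("BOB", "2018-02-21", "5"),
    ("", "2018-02-22", "n/a"),
]


def extract_and_load(conn, rows):
    """EL step: land the raw values as-is in a work table, with no cleanup."""
    conn.execute("CREATE TABLE work_orders (customer TEXT, order_date TEXT, amount TEXT)")
    conn.executemany("INSERT INTO work_orders VALUES (?, ?, ?)", rows)


def transform(conn):
    """T step: trim, normalize, convert, and filter entirely in SQL."""
    conn.execute(
        "CREATE TABLE orders (customer TEXT NOT NULL, order_date TEXT NOT NULL, "
        "amount_cents INTEGER NOT NULL)"
    )
    conn.execute("""
        INSERT INTO orders (customer, order_date, amount_cents)
        SELECT UPPER(TRIM(customer)),
               order_date,
               CAST(ROUND(CAST(amount AS REAL) * 100) AS INTEGER)
        FROM work_orders
        WHERE TRIM(customer) <> ''
          AND amount GLOB '[0-9]*'
    """)


conn = sqlite3.connect(":memory:")
extract_and_load(conn, RAW_ROWS)
transform(conn)
print(conn.execute("SELECT * FROM orders ORDER BY customer").fetchall())
```

Keeping the raw work table around also makes it easy to rerun or adjust the transform without extracting from the source again.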

-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Steve Jones
SSC Guru (539K reputation)

Group: Administrators
Points: 539030 Visits: 20703
David.Poole - Wednesday, February 21, 2018 1:48 AM
GDPR Article 17, "Right of Erasure", may require you to have a mechanism to delete all forum posts and private messages for a particular user.
I don't think "Right of Erasure" would cover articles written by someone, in the unlikely event of an author requesting erasure.
"Right of Erasure" does not trump the legal requirement to keep financial records for the legally mandated time period.
If Redgate haven't done so already, it would be wise to get advice from legal.

Article 20 "Right to data portability" is an interesting one. It doesn't limit its scope, but I think historically it reinforces consumers' rights to swap energy suppliers, broadband/mobile providers and now banking providers. In the context of SQLServerCentral it could be a mechanism for a subscriber to download their own profile.

Maybe. Our business is providing answers to people. It's possible an entity could get a right of erasure, but I doubt it for the things we share. We'd want a legal decision to the contrary.

For Article 20, the profile is a good example where we might need to provide that for someone, though I doubt we'd get a request. We keep fairly little information here that isn't public.


ManicStar
SSChampion (13K reputation)

Group: General Forum Members
Points: 13371 Visits: 5590
chrisn-585491 - Wednesday, February 21, 2018 6:08 AM
You say "ETL specialist", I say "Data Janitor". :D

Yep...

mjh 45389
SSCarpal Tunnel (4.4K reputation)

Group: General Forum Members
Points: 4449 Visits: 3899
"Right of Erasure" is interesting (maybe not)! One of my best friends died suddenly and very, very unexpectedly a few years ago. However, he lived on online for a long time as his family struggled to get him removed from social media sites like Friends Reunited and LinkedIn. They were less bothered about professional forums similar to this one...