SQLServerCentral is supported by Redgate
 
 
 
 


The Data Scientist


Phil Factor
SSCertifiable (6.8K reputation)

Group: General Forum Members
Points: 6833 Visits: 3050
Comments posted to this topic are about the item The Data Scientist


Best wishes,

Phil Factor
Simple Talk
rp-sqlsc
SSC Rookie (49 reputation)

Group: General Forum Members
Points: 49 Visits: 545
It would be nice to see how others have integrated R with SQL Server. Using the RODBC package is great for initiating DB interactions from R but there didn't seem to be any way to call R from SQL Server (2008).

We needed to call R scripts from within SQL but didn't want the effort and overhead of an xp_cmdshell approach, which raises security concerns and carries the performance cost of starting a new process for each request.

What we came up with as a prototype was a subscribe/publish approach using request/response tables in SQL Server and one or more continuously running R processes to service the requests and return results in other tables. This would also allow us to have distributed farms of R processes as needed. Not real-time but responsive enough for our needs.
Carla Wilson-484785
SSCrazy (2.5K reputation)

Group: General Forum Members
Points: 2544 Visits: 1951
Thank you for an interesting topic. I have only recently heard people use the term "data scientist", and yet I think it applies well to what I do. I have always called it "knowing the data". When I "pull" data from the database, I always run histogram frequencies or other stats to see if the data make sense and to look for outliers. Are the outliers real, or caused by typos? (very frequently they are!)
Thanks for pointing out that this is a real skill and not just an obsession on my part!
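
As a rough illustration of that kind of sanity check, the sketch below counts how often each value occurs and flags numeric outliers using the common 1.5 x IQR rule. It is an assumption about what such a check might look like, not the poster's actual method, and the sample values are invented for the example.

```python
from collections import Counter
from statistics import quantiles


def frequency_table(values):
    """Return (value, count) pairs sorted by value, for eyeballing odd entries."""
    return sorted(Counter(values).items())


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Invented sample data: one age looks like a typo (350 instead of 35),
# and one gender code was entered in lower case.
ages = [34, 29, 41, 38, 33, 36, 31, 350, 35, 30]
genders = ["M", "F", "F", "m"]

print(frequency_table(genders))
print(iqr_outliers(ages))
```

The frequency table exposes inconsistent codes at a glance, while the outlier list only says which values deserve a second look. Deciding whether a flagged value is real or a typo still takes knowledge of the data.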
davoscollective
SSCrazy (2.5K reputation)

Group: General Forum Members
Points: 2521 Visits: 1008
"Stairway to R" interactive lesson, great place to start & totally free.

http://tryr.codeschool.com/
Nadrek
SSCertifiable (7.1K reputation)

Group: General Forum Members
Points: 7078 Visits: 2741
One major role of an aggregate data report writer (i.e. a "Data Scientist") has always been that of Business Analyst. You can run all the tests you want, but suppose the actual users are deliberately using data item A, which is supposed to mean B (and, if you have documentation at all, is documented as meaning B), to record Q or T instead, depending on circumstances. It's entirely possible that your tests will show A to be about as valid as your other data. Without knowing that A is actually sometimes Q and sometimes T, using A to report on B generates incorrect results.
NULLgarity
SSChasing Mays (642 reputation)

Group: General Forum Members
Points: 642 Visits: 534
How dare you besmirch the good name of "actionable insights" sir! Take it back! Take back what you said!

Sincerely,
Dog & Pony Shows
;-)



Sineetha Parveen
SSC-Addicted (432 reputation)

Group: General Forum Members
Points: 432 Visits: 119
Thanks davoscollective for the link
mschreud 78386
Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 68
Is this not just a fancy new name for what was already happening?

Business Intelligence teams should have Statisticians who will analyse and provide insight into the data. If you don't, how do you know the data you are pulling and the conclusions reached are relevant and meaningful?
Phil Factor
SSCertifiable (6.8K reputation)

Group: General Forum Members
Points: 6833 Visits: 3050
You'd expect BI people to have a proper grounding in statistics, but they haven't. It is very rare for someone who gives themselves a fancy title like this to be able to explain what a normal distribution is, or whether a finding is statistically significant.
You'll get a BI person to come up with an 'actionable insight' (whatever the hell they think that is).
'Oh? Is that statistically significant?' I ask.
'Yeah,' they say.
'Interesting, can I see the calculations, please?'
'Calculations?'
'Yes, what is the probability level?' (blank look from BI person) 'So what do you mean by "significant"?'
'Well, management will really go for it. It will get them really interested.'
(Phil Factor storms off bad-temperedly)
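
For readers wondering what "the calculations" might look like in the simplest case, here is a minimal sketch of one conventional approach, a two-sided two-proportion z-test using the normal approximation. It is offered as an illustration only: the function name is hypothetical, the counts are invented, and the right test always depends on how the data were collected.

```python
from math import erfc, sqrt


def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for the difference between two proportions.

    Uses the pooled-variance z-test, which assumes independent samples
    that are large enough for the normal approximation to hold.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Every observation is identical, so there is no difference to test.
        return 1.0
    z = (p_a - p_b) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# Invented example: 120 of 1000 customers responded to the old offer,
# 150 of 1000 to the new one.
p = two_proportion_p_value(120, 1000, 150, 1000)
print(f"p-value: {p:.4f}")
```

The p-value is the "probability level" in question: the chance of seeing a difference at least this large if the two groups really behaved the same. A finding is conventionally called significant only when that value falls below a threshold chosen in advance, often 0.05.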


Best wishes,

Phil Factor
Simple Talk