The Data Scientist
Posted Saturday, January 19, 2013 3:06 AM


Mr or Mrs. 500


Group: General Forum Members
Last Login: Sunday, April 20, 2014 11:44 AM
Points: 561, Visits: 2,417
Comments posted to this topic are about the item The Data Scientist


Best wishes,

Phil Factor
Simple Talk
Post #1409169
Posted Saturday, January 19, 2013 9:24 AM
Forum Newbie


Group: General Forum Members
Last Login: Monday, April 07, 2014 12:27 PM
Points: 8, Visits: 477
It would be nice to see how others have integrated R with SQL Server. Using the RODBC package is great for initiating DB interactions from R but there didn't seem to be any way to call R from SQL Server (2008).

We needed to call R scripts from within SQL but didn't want the effort and overhead of an xp_cmdshell approach, which raises security concerns and carries the performance cost of starting a new process for each request.

What we came up with as a prototype was a subscribe/publish approach using request/response tables in SQL Server and one or more continuously running R processes to service the requests and return results in other tables. This would also allow us to have distributed farms of R processes as needed. Not real-time but responsive enough for our needs.
Post #1409204
Posted Sunday, January 20, 2013 1:57 PM
SSCommitted


Group: General Forum Members
Last Login: Yesterday @ 9:00 AM
Points: 1,526, Visits: 1,834
Thank you for an interesting topic. I have only recently heard people use the term "data scientist", and yet I think it applies well to what I do. I have always called it "knowing the data". When I "pull" data from the database, I always run histogram frequencies or other stats to see if the data make sense and to look for outliers. Are the outliers real, or caused by typos? (very frequently they are!)
Thanks for pointing out that this is a real skill and not just an obsession on my part!
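
For anyone who wants to try that kind of sanity check, here is a rough sketch using only the Python standard library. The function names and the 1.5 × IQR cut-off are illustrative assumptions, not a description of the poster's own method, and a flagged value still needs a human to decide whether it is real or a typo.

```python
from collections import Counter
from statistics import quantiles


def frequency_table(values):
    # Count how often each distinct value occurs, most common first.
    # Handy for spotting misspelt categories or unexpected codes.
    return Counter(values).most_common()


def iqr_outliers(values, k=1.5):
    # Flag values lying more than k interquartile ranges outside the
    # middle 50% of the data (the usual box-plot rule of thumb).
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]
```

For example, iqr_outliers([34, 29, 41, 38, 35, 33, 36, 340, 31, 37]) flags the 340, which in a column of ages is far more likely to be a keying error than a real observation.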
Post #1409303
Posted Sunday, January 20, 2013 8:27 PM


SSC-Addicted


Group: General Forum Members
Last Login: Monday, April 21, 2014 8:42 PM
Points: 444, Visits: 825
"Stairway to R" interactive lesson, great place to start & totally free.

http://tryr.codeschool.com/
Post #1409340
Posted Monday, January 21, 2013 8:14 AM
SSC Eights!


Group: General Forum Members
Last Login: Wednesday, April 16, 2014 8:46 AM
Points: 845, Visits: 2,331
One major role of an aggregate data reports writer (i.e. "Data Scientist") was Business Analyst. You can run all the tests you want, but suppose the actual users are deliberately using field A, which is supposed to mean B (and, if you have documentation at all, is documented as meaning B), to record Q or T instead, depending on the situation. It's entirely possible that your tests will show A to be about as valid as your other data, and without knowing that it's actually sometimes Q and sometimes T, using A to show B generates incorrect results.
Post #1409576
Posted Tuesday, January 22, 2013 6:47 AM
Old Hand


Group: General Forum Members
Last Login: Tuesday, February 11, 2014 9:41 AM
Points: 368, Visits: 525
How dare you besmirch the good name of "actionable insights" sir! Take it back! Take back what you said!

Sincerely,
Dog & Pony Shows



Post #1410005
Posted Wednesday, January 23, 2013 2:43 AM
SSC-Addicted


Group: General Forum Members
Last Login: Friday, August 16, 2013 2:26 AM
Points: 424, Visits: 119
Thanks, davoscollective, for the link.
Post #1410425
Posted Wednesday, January 23, 2013 9:37 AM
Forum Newbie


Group: General Forum Members
Last Login: Wednesday, March 19, 2014 11:08 AM
Points: 1, Visits: 27
Is this not just a fancy new name for what was already happening?

Business Intelligence teams should have Statisticians who will analyse and provide insight into the data. If you don't, how do you know the data you are pulling and the conclusions reached are relevant and meaningful?
Post #1410682
Posted Saturday, January 26, 2013 7:48 AM


Mr or Mrs. 500


Group: General Forum Members
Last Login: Sunday, April 20, 2014 11:44 AM
Points: 561, Visits: 2,417
You'd expect BI people to have a proper grounding in statistics but they haven't. It is very rare for someone who gives themselves a fancy title like this to be able to explain what a normal distribution is or whether a finding is statistically significant.
You'll get a BI person to come up with an 'actionable insight' (whatever the hell they think that is).
'Oh? Is that statistically significant?' I ask.
'Yeah' they say.
'Interesting, can I see the calculations, please?'
'Calculations?'
'Yes, what is the probability level?' (blank look from BI person) 'So what do you mean by "significant"?'
'Well, management will really go for it. It will get them really interested'
(Phil Factor storms off bad-temperedly)
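
For the record, the sort of calculation being asked for need not be elaborate. As a hedged illustration only: a two-sided z-test for the difference between two proportions (say, conversion rates for two groups), using the normal approximation. The function name and the example figures are made up for this sketch, and the approximation is only reasonable for largish samples.

```python
from math import erfc, sqrt


def two_proportion_p_value(success_a, n_a, success_b, n_b):
    # Observed proportions in each group.
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion under the null hypothesis of "no difference".
    pooled = (success_a + success_b) / (n_a + n_b)
    # Standard error of the difference under the null hypothesis.
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))
```

With 55 successes out of 100 against 50 out of 100, the p-value comes out well above 0.05, so a five-point difference on samples that small is not evidence of anything much, however interested management might be.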



Best wishes,

Phil Factor
Simple Talk
Post #1412031