Citizen Scientists

  • Comments posted to this topic are about the item Citizen Scientists

  • It sounds like someone is trying to make an argument for having two people do one person's job. Better yet, three people if you include a tech resource, because believe it or not, most data scientists have huge tech gaps and/or are not the right people to do what a DBA, BI Dev, ETL Dev, or even data engineer should be doing alongside the data scientist or data analyst.

    I do have pretty strong opinions on the data scientist role though. I don't think it's right to reduce these guys to the languages and tools of the trade. It's much more than just learning how to speak English and knowing all the fancy algorithms. The math portion, being able to prove what you have done with math, on top of explaining it in a way that makes sense, and actually knowing which method to go with based on that science, is really what distinguishes a data scientist from a data analyst or even a visualization expert.

    However, learning the language and the tools is a good start. It's just not everything, nor is it likely to be for the time being, no matter what new tool lets you push a button and output a predictive model. When it comes to making critical business decisions, people still want to know how and why you came up with your result. A data analyst or a citizen data scientist is not going to be able to explain that. Likely, only the data scientist can.

  • First off, "cobbling together" a nice-looking report means nothing. Artists can do that, or even amateur photographers. Any business managers drawing conclusions from such reports (of which there are plenty) will last maybe a few months or, if the results are disastrous, just days.
    The ability to draw nice reports to show data is not what makes a data scientist.
    Besides, the inherent inability of business managers to articulate their requirements beyond "I need a golden goose!" makes the life of any data professional interesting.

    The ability to understand data and turn it into information requires in my opinion two main skills:
    * Understand data lineage to find out where problems lurk underneath
    * Understand the business the information is for
    Anything else, like gathering business requirements, creating reports, analysing data and presenting results, builds on top of these two skills.

    Not sure how automation of data capture and transformation will play out. People much cleverer than I will probably come up with some really brilliant implementation. Until then, people like me stumble around companies that are trying to make sense of the data they have in order to gather some information.

  • Surely in most domains there should be professionals for whom this should be a large part of their remit?

    What exactly are we hiring accountants, quantity surveyors, actuaries, mathematicians, geographers, physicists, chemists, engineers and statisticians for?

    I would probably go for a domain specialist with a PhD in their particular area. Maybe I'm out of touch, but I steer clear of jobs with the title Data Scientist.

  • I am not convinced there is any such thing as a "Data Scientist" but I'm a firm believer in a "Data Science Team".

    The role of the Data Science Team is to provide actionable insight.  For insight to be actionable the decision maker must first believe in it.  It can be as demonstrably correct as you like but if no-one believes it then you are on a fool's errand.  That is why a Data Science Team needs a mix of people, skills and personalities.

    • Someone who has the gift of the gab, who can talk to and is welcomed by senior decision makers
    • Someone who is happy to be the equivalent of a data sewage works
    • Someone with a gift for statistics, logic and reasoning
    • Someone with imagination who will ask the questions that no-one else thought of
    • Someone with common sense (the world's greatest oxymoron)
    • Someone whose ambition is to be the world's best librarian
    • ...etc

    There is no way you are going to find all that in a single person.

  • Dave is right about data sewage. I often tell folks that ETL is the equivalent of a sewer/water treatment plant.  😀

    There's a lot of folks that like to fling the Data Scientist and other titles around. But the problem with many of the "citizens" and "amateurs" is they don't know much, they don't know what they don't know and they get offended when a professional tries to offer guidance. Even though the tools are better/easier and information is much more available, we work in a field that requires time and hard work to master.  It gets to the point where I don't volunteer advice much, I just point them to sites and make appropriate noises.

    And the other issue is the promotion of "Big Data" and "Data Science" by those selling books and conferences, along with some of the lower-end tools that aren't suitable for handling anything other than small data sets and "toy" examples. Lots of snake oil out there...

  • Tend very much to agree with the previous comments. Terms like "Big data", "Data mining" and "Data scientist" are used far too randomly these days, often with no real understanding. One project I am involved with involves three of us: I understand the data structure and write the queries, a medically qualified person raises the queries and a statistician checks that they are statistically relevant. As we have all seen with medical data, the media can easily distort statistics, even on a weekly basis!

  • I've often thought I was a data janitor, cleaning up messes, keeping the basic bits of humanity moving.

  • David.Poole - Tuesday, October 10, 2017 4:38 AM

    I am not convinced there is any such thing as a "Data Scientist" but I'm a firm believer in a "Data Science Team".

    The role of the Data Science Team is to provide actionable insight.  For insight to be actionable the decision maker must first believe in it.  It can be as demonstrably correct as you like but if no-one believes it then you are on a fool's errand.  That is why a Data Science Team needs a mix of people, skills and personalities.

    • Someone who has the gift of the gab, who can talk to and is welcomed by senior decision makers
    • Someone who is happy to be the equivalent of a data sewage works
    • Someone with a gift for statistics, logic and reasoning
    • Someone with imagination who will ask the questions that no-one else thought of
    • Someone with common sense (the world's greatest oxymoron)
    • Someone whose ambition is to be the world's best librarian
    • ...etc

    There is no way you are going to find all that in a single person.

    Sure, teams are essentially what drives data science. We have them here at the company I work for. We always pair data engineers (e.g., DBAs, ETL devs, etc.) with data scientists to help ensure success. But make no mistake, just because it's a team effort does not mean the data scientist is voided. You still need that statistician on the team who is fluent in statistics, probability, and complex math and who also has the domain experience. What separates a statistician from a data scientist is the domain experience. Likewise, that's also the difference between me (the data engineer) and the scientist, because like most tech resources that wrangle data, we do not specialize in statistics and probability in our day-to-day.

    While I know it's easy to shove this off like it's just another trendy title for doing the same things others have already been doing, there is a pretty clear difference between machine learning and canned reporting, just as there is a pretty clear difference between using SSAS Data Mining and using R for predictive analytics. You can't just lump these guys into the same bucket as if their roles do not exist. If you're going to employ machine learning techniques then you will likely need a data scientist on your team regardless of your opinion. That, or entrust your programmer or DBA to be good enough in this field even though it's not their specialty... :crazy:

  • I wouldn't describe the role of a data scientist as voided and neither would I say that the underlying role is vacuous.
    I think we are violently agreeing over the differences between Canned Reporting and Machine Learning.

    I don't agree that the discriminator between a Data Scientist and a Statistician is that the former has domain experience.  I would put the discriminator as being imagination and curiosity leading to proactive application of skills.  It's the difference between being able to answer a question someone asks you and being able to answer a question that you are one of the few who would think to ask.

    I think learning about data science won't make you a data scientist but it will give you insight on how you can help turbo-charge your imaginative statistician.

  • David.Poole - Thursday, October 12, 2017 5:29 AM

    I wouldn't describe the role of a data scientist as voided and neither would I say that the underlying role is vacuous.
    I think we are violently agreeing over the differences between Canned Reporting and Machine Learning.

    I don't agree that the discriminator between a Data Scientist and a Statistician is that the former has domain experience.  I would put the discriminator as being imagination and curiosity leading to proactive application of skills.  It's the difference between being able to answer a question someone asks you and being able to answer a question that you are one of the few who would think to ask.

    I think learning about data science won't make you a data scientist but it will give you insight on how you can help turbo-charge your imaginative statistician.

    But, when you say things like "the underlying role is vacuous", you're basically saying the role can't really be defined and that there is no real planning or thought behind it. It's just something someone fills, and neither they nor the business even know what it means. It's like you're saying, "you're not real." While I agree there is not a standard across all organizations, there is enough to define a role that specifically fits a set of skills that most of our common roles are not focused on day-to-day (e.g., machine learning). To say the role is just fluff is something I would strongly have to disagree with.

  • David.Poole - Thursday, October 12, 2017 5:29 AM

    I don't agree that the discriminator between a Data Scientist and a Statistician is that the former has domain experience.  I would put the discriminator as being imagination and curiosity leading to proactive application of skills.  It's the difference between being able to answer a question someone asks you and being able to answer a question that you are one of the few who would think to ask.

    I think learning about data science won't make you a data scientist but it will give you insight on how you can help turbo-charge your imaginative statistician.

    Well said

  • xsevensinzx - Thursday, October 12, 2017 7:25 AM

    But, when you say things like "the underlying role is vacuous", you're basically saying the role can't really be defined and that there is no real planning or thought behind it. It's just something someone fills, and neither they nor the business even know what it means. It's like you're saying, "you're not real." While I agree there is not a standard across all organizations, there is enough to define a role that specifically fits a set of skills that most of our common roles are not focused on day-to-day (e.g., machine learning). To say the role is just fluff is something I would strongly have to disagree with.

    I'm saying the role is NOT vacuous, and NOT fluff.
    The title for the role is the bit that makes me nervous.  If you were asked to recruit a statistician you'd have expectations of high mathematical competence: degree level and probably higher.  A few of my colleagues are PhD-level maths and stats bods.
    I don't know how to be clearer about my personal expectations of what a data scientist should be.  My issue is that the title seems to be used as a broad catch-all for a spectrum of abilities and skills, and when that happens the title becomes meaningless even though the role REMAINS MEANINGFUL.

  • David.Poole - Thursday, October 12, 2017 11:27 AM

    xsevensinzx - Thursday, October 12, 2017 7:25 AM

    But, when you say things like "the underlying role is vacuous", you're basically saying the role can't really be defined and that there is no real planning or thought behind it. It's just something someone fills, and neither they nor the business even know what it means. It's like you're saying, "you're not real." While I agree there is not a standard across all organizations, there is enough to define a role that specifically fits a set of skills that most of our common roles are not focused on day-to-day (e.g., machine learning). To say the role is just fluff is something I would strongly have to disagree with.

    I'm saying the role is NOT vacuous, and NOT fluff.
    The title for the role is the bit that makes me nervous.  If you were asked to recruit a statistician you'd have expectations of high mathematical competence: degree level and probably higher.  A few of my colleagues are PhD-level maths and stats bods.
    I don't know how to be clearer about my personal expectations of what a data scientist should be.  My issue is that the title seems to be used as a broad catch-all for a spectrum of abilities and skills, and when that happens the title becomes meaningless even though the role REMAINS MEANINGFUL.

    Sorry, I totally misread your sentence there, because your prior point was that there is no data scientist per se, it's a team effort. Yet there are of course areas of that team effort that fit the role of a data scientist, which I believe is absolutely true in most cases, and therefore it is not a catch-all. The data scientist is not doing all of those roles, only one specific role within a team collaboration. So, we may have to agree to disagree here that it's a catch-all.

    And forgive me here. I try not to put too much emphasis on what OTHERS are saying and try to put more emphasis on what makes sense to the business and myself. That's why, if I ask myself, "Does a data science role make sense? Does it have a place on my team and in our organization regardless of the buzz?" the answer is almost always yes, regardless of whether Joe Blow thinks it's trendy and regardless of whether other organizations think it's a catch-all. All that matters is what has been a success and what has been a failure in my experience. And I've got to tell you, it's been widely successful from what I've seen as a role ON a team, but not a catch-all.
