• To be honest, I'm rather suspicious of the whole "data science" aspect.  In my humble opinion, much of what I've done over my career could be considered "data science": advanced data analysis, result aggregations and projections, comparative analysis and even "what-if" scenarios.  All of that was done with familiar tools such as T-SQL, SSRS and occasionally Excel.

    Lately, I've seen multiple examples of queries from "data scientists" that were simply abysmal.  Either they were written by someone who lacks even a basic grasp of relational concepts, of how the data is organized and of how to write a moderately efficient query, or they were generated by a tool from obscure criteria supplied by someone with the same lack of understanding of the data.  Not only were the queries horribly inefficient (for some of them I could write a whole paper on everything that was done wrong and why), but they were executed on a production server while the primary daily job was running.  This affected the system in general and, specifically, the very tables the primary job needed to update.  As expected, this had a severe negative impact on meeting the Service Level Agreement (SLA) the customer expected of the primary job.  The fact that the SLA takes no account of whatever other customer activity may occur on their production server is a topic for another discussion.

    I'm sure the "data scientist" had no inkling of the impact that he/she had on the system and merely wanted to get the data into a location for further analysis.  Yet it seems prudent to hold a "data scientist" to a higher standard than a junior level database developer.  While statistics are often employed in data science, the discipline is much more than merely applying statistical models to data to see what the outcome may be.  Much of it is truly understanding the data, reviewing it for patterns (or the lack thereof), identifying trends and determining how to turn that knowledge into useful activity that supports the business.  In that very sense, many of us are already data scientists.

    Data scientist or not, we all have a responsibility to be prudent in how we perform our duties.  That includes running data analysis on a copy of production rather than on production itself, to avoid an adverse impact on production operations.  We need to be mindful of others and not operate in isolation.  In truth, we always need to consider the impact that we have on each other and the business.  After all, if we're not making a positive impact, why are we there?