Why no posts?

  • heb1014

    Hall of Fame

    Points: 3772

    I've heard Microsoft promote their concept of the "modern data warehouse" and their tools to support machine learning.  In my mind, there are at least 2 big components embedded in what they say: marketing to promote adoption of their cloud services and genuine opportunities to add value to the community.

    When I hear the word "modern" in "modern data warehouse", I also think about the words "traditional" and "old-school" when considering the on-prem warehouse workloads I currently support.  My question is: why is there significantly less activity in the Cloud Computing forums on this site compared to the SQL Server 20xx forums?  I think Microsoft would like us to feel pressure to quickly move forward with the cool kids and adopt all of their services, but I wonder what everyone else is doing.  Is your workload mostly on-prem or cloud?  Are you using Azure Machine Learning?

  • Jeff Moden

    SSC Guru

    Points: 994858

    To answer both of your questions for what I'm currently doing... 100% on-prem for SQL Server.

    As for the MS versions of machine learning in general, the only thing I've done with it is watch a couple of really lame demos by MS people and others alike.  When I went back through the code of one individual who claimed strong ties with MS, I found that it wasn't actually a reasonable example because that person made no real attempt at using the "machine learning" part of the demo and it came up with wrong answers both for history and for the predictions.  A simple GROUP BY did a better job in the area of history, and a simple use of a Tally Table and a single formula did a much more accurate job of predicting.

    I'm not saying that "Machine Learning" isn't a good thing.  In fact, I embrace it.  I've just not seen anyone do it right yet and the lame demos are leaving a really bad taste in my mouth.  Full disclosure:  I've been too busy to give it a fair chance by teaching myself from the documentation and I've not looked much online for good examples either because I also don't need it right now.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBA's job. 😉

  • heb1014

    Hall of Fame

    Points: 3772

    Thanks Jeff!

  • Chris Harshman

    SSC-Forever

    Points: 41813

    Where I'm at, we do about 80% of our data warehouse work on premises and 20% in the cloud.  I've worked in IT for 25 years, and it's important to be able to separate the true trends from the hype.  Yes, there are cases where cloud makes sense, depending on what the original source of the data is and who needs to access it, but honestly, to me it seems the more things change, the more they stay the same.  What's important is to weigh the pros and cons of each method and find what works best for the situation you are in and the problem you're trying to solve.

  • DinoRS

    SSCrazy

    Points: 2533

    I actually would love to play around with Machine Learning, but not on Azure (unless it's about training very big models), as that would make me poor rather quickly.

    The issue for me is that I haven't figured out yet how to use it in a sensible way, because if I had, I would skip all this Node-RED / Apache Spark / Flink / Kafka stuff and point some sensors directly at MSSQL Server. I do have a potential use case, but at the current stage of the project and with our current SQL Server version I wouldn't want to invest the time into this. SQL 2019 Big Data Cluster might change that a bit, but right now I'm not exactly holding my breath about eventually running a big fat SQL Server with GPGPUs in it.

    It looks more like using single-board computers (like Raspberry Pi, Nvidia AGX Xavier) and TPUs (like Intel Neural Compute Stick, Google Coral USB Accelerator) is the way to go, which leaves not many things you might want to process somewhere else, at least in the case of sensor data, I believe. A GIS where your trucks' route to the next manufacturing plant could be changed in real time to avoid traffic jams is something I think Machine Learning on R & MSSQL Services is suitable for.

    My workload is 100% on-prem, too. Well, mostly. There is a Power BI project coming up which brings in the possibility of Azure, even though we're definitely going to deploy local Reporting Services.

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 716216

    Our warehouse is in the cloud, but really in an IaaS VM that runs a warehouse database.

    ML? It's hard, and as Jeff mentioned, T-SQL often finds similar results more easily. I think ML works well in some domains, like imaging and speech, but in data warehouse analysis I'm not yet sold that it is better, mostly because I think we, or data scientist(ish) people, struggle to know how to frame a question that isn't easily answered with traditional analysis.