Why no posts?

Question

heb1014

Hall of Fame

Points: 3790
October 1, 2019 at 8:25 pm

#3685127

I've heard Microsoft promote their concept of the "modern data warehouse" and their tools to support machine learning. In my mind, there are at least 2 big components embedded in what they say: marketing to promote adoption of their cloud services and genuine opportunities to add value to the community.
When I hear the word "modern" in "modern data warehouse", I also think about the words "traditional" and "old-school" when considering the on-prem warehouse workloads I currently support. My question is why is there significantly less activity in the Cloud Computing forums on this site compared to the SQL Server 20xx forums? I think Microsoft would like us to feel pressure to quickly move forward with the cool kids and adopt all of their services, but I wonder what everyone else is doing? Is your workload mostly on-prem or cloud? Are you using Azure Machine Learning?

Jeff Moden SSC Guru Points: 1004685 · Answer 1

To answer both of your questions for what I'm currently doing... 100% on-prem for SQL Server.

As for the MS versions of machine learning in general, the only thing I've done with it is watch a couple of really lame demos by MS people and others alike. When I went back through the code of one individual that claimed strong ties with MS, I found that it wasn't actually a reasonable example because that person made no real attempt at using the "machine learning" part of the demo and it came up with wrong answers both for history and for the predictions. A simple GROUP BY did a better job in the area of history and a simple use of a Tally Table and a single formula did a much more accurate job of predicting.
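To make the "GROUP BY plus a single formula" comparison concrete, here is a minimal sketch. It is not the code from the demo: plain Python stands in for T-SQL, the sample rows are invented, and the post does not say which formula was used, so a least-squares straight line is assumed purely for illustration. The `range()` sequence plays the role a Tally Table would play in T-SQL.

```python
from collections import defaultdict

def totals_by_period(rows):
    """Rough equivalent of: SELECT period, SUM(amount) ... GROUP BY period."""
    totals = defaultdict(float)
    for period, amount in rows:
        totals[period] += amount
    return dict(sorted(totals.items()))

def linear_forecast(history, periods_ahead):
    """Fit y = intercept + slope * x over x = 0..n-1, then project forward.

    ASSUMPTION: a straight-line trend; the original post does not name
    the formula. Needs at least two historical values.
    """
    n = len(history)
    xs = range(n)  # sequence of integers, like a Tally Table
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [intercept + slope * (n + i) for i in range(periods_ahead)]

# Hypothetical (period, amount) rows, invented for this sketch.
sales = [(1, 100), (1, 20), (2, 130), (3, 50), (3, 90)]
history = totals_by_period(sales)
forecast = linear_forecast(list(history.values()), periods_ahead=2)
```

The point is only that both the history and a simple projection can come from an aggregate and one formula, with no model training involved.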

I'm not saying that "Machine Learning" isn't a good thing. In fact, I embrace it. I've just not seen anyone do it right yet and the lame demos are leaving a really bad taste in my mouth. Full disclosure: I've been too busy to give it a fair chance by teaching myself from the documentation and I've not looked much online for good examples either because I also don't need it right now.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

heb1014 Hall of Fame Points: 3790 · Answer 2

October 2, 2019 at 12:06 pm

#3685244

Thanks Jeff!

Chris Harshman SSC-Forever Points: 42192 · Answer 3

Where I'm at, we do about 80% of our data warehouse work on premises and 20% in the cloud. I've worked in IT for 25 years, and it's important to be able to separate the true trends from the hype. Yes, there are cases where cloud makes sense, depending on what the original source of the data is and who needs to access it, but honestly, to me it seems the more things change, the more they stay the same. What's important is to weigh the pros and cons of each method and find what works best for the situation you are in and the problem you're trying to solve.

DinoRS SSCrazy Points: 2683 · Answer 4

I would actually love to play around with Machine Learning, but not on Azure (unless it's about training very big models), as that would make me poor rather quickly.

The issue for me is that I haven't yet figured out how to use it in a sensible way. If I had, I would skip all this Node-RED / Apache Spark / Flink / Kafka stuff and point some sensors directly at MSSQL Server. I do have a potential use case, but at the current stage of the project and SQL Server version I wouldn't want to invest the time in it. SQL 2019 Big Data Cluster might change that a bit, but right now I'm not exactly holding my breath about eventually running a big fat SQL Server with GPGPUs in it.

It looks more like using Single Board Computers (like Raspberry Pi, Nvidia AGX Xavier) and TPUs (like Intel Neural Compute Stick, Google Coral USB Accelerator) is the way to go, which leaves not many things you might want to process somewhere else, at least in the case of sensor data, I believe. A GIS where your trucks' route to the next manufacturing plant could be changed in real time to avoid traffic jams is something I think Machine Learning on R & MSSQL Services is suitable for.

My workload is 100% on-prem too. Well, mostly. There is a Power BI project coming up that brings in the possibility of Azure, even though we're definitely going to deploy local Reporting Services.

Steve Jones - SSC Editor SSC Guru Points: 740271 · Answer 5

Our warehouse is in the cloud, but really in an IaaS VM that runs a warehouse database.

ML? It's hard, and as Jeff mentioned, T-SQL often finds similar results more easily. I think ML works well in some domains, like imaging and speech, but in data warehouse analysis I'm not yet sold that it is better, mostly because I think we, or data scientist (ish) people, struggle to know how to frame a question that isn't easily answered with traditional analysis.

xsevensinzx One Orange Chip Points: 25560 · Answer 6

I've been highly inactive for like a year it seems, but here is my input.

I am still 100% invested in Azure Data Warehouse and Azure Data Lake Store with Azure Data Lake Analytics.

Data -> Store -> Analytic Engine -> Warehouse -> SQL DB -> Power BI

When it comes to activity, you've got to remember that a lot of what applies to Azure Data Warehouse also applies to SQL Server. There are a few differences in what is available between cloud and on-prem, but there are many things in common too. Thus, you see a lot of questions that may be tied to on-prem but actually apply to both.

For Azure Machine Learning, I have used it a lot. The main benefit of Azure Machine Learning is taking the ML out of your application (e.g., hard-coded) and putting it somewhere else where your app can interact with it via APIs (or embedding it). When I went down the path of exploring Azure Machine Learning, I quickly realized I am not developing applications for my data. It's mostly for analytical and operational reporting use cases.
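As a rough illustration of what "taking the ML out of your application" can look like from the app's side, here is a minimal sketch using only the Python standard library. Everything specific in it is an assumption: the URL, the key, and the `{"data": [...]}` payload shape are hypothetical placeholders, not the documented contract of any particular Azure ML endpoint.

```python
import json
from urllib import request

# ASSUMPTION: placeholder endpoint and key, invented for this sketch.
SCORING_URL = "https://example.invalid/score"
API_KEY = "replace-me"

def build_scoring_request(features, url=SCORING_URL, api_key=API_KEY):
    """Package one row of features as a JSON POST to the scoring service."""
    body = json.dumps({"data": [features]}).encode("utf-8")  # hypothetical payload shape
    return request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def score(features, opener=request.urlopen):
    """Send the request and decode the JSON reply.

    The app only knows the request/response shape; the model behind the
    endpoint can be retrained or replaced without touching this code.
    `opener` is injectable so the function can be exercised offline.
    """
    with opener(build_scoring_request(features)) as resp:
        return json.loads(resp.read())
```

The design choice mirrors the stored-procedure analogy: the application depends on a stable interface, and whoever owns the model maintains what sits behind it.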

Many of the tools we use have ML features now. Power BI, for example, has plenty of ML features that do not require Azure ML. Azure Data Lake Analytics (the analytics engine in my flow) also has ML features, such as a couple of options to wrap or upload full ML modules/code as part of the U-SQL jobs. Again, not needing Azure ML to function.

Outside of that, I do love Azure ML. It allows for approaches similar to the ones you may take when utilizing stored procs versus other approaches with your apps. Having the ML completely separate, outside of the raw code, allows the data scientist to update and maintain that ML package more easily. It's a really nice feature if you really want to enable your applications to have supervised or unsupervised learning on top of just providing ML to Excel and other apps your team may be using.

The only downside is that a way to extract the coefficients seems non-existent in Azure ML, which can be a pain for the DS teams.

Jeff Moden SSC Guru Points: 1004685 · Answer 7

It's nice to see someone that's eyeballs deep in it in a good way. With that in mind, I'm incredibly curious as to what questions you're using ML to answer.

xsevensinzx One Orange Chip Points: 25560 · Answer 8

Jeff Moden wrote:

It's nice to see someone that's eyeballs deep in it in a good way. With that in mind, I'm incredibly curious as to what questions you're using ML to answer.

Well, I work in advertising. There are plenty of use cases for ML in that industry. The first being forecasting how ads will perform. Others may be forecasting spend-to-conversions. Trying to find different data points that drive sales up or down etc.

Other use cases are using ML to analyze creatives: looking at RGB, brand detection, landmarks, whether a photo is racy or not, etc., to understand what is driving interaction or sales.

Another more common one is just to help classify new datasets. For example, you can use K-Means Clustering to identify combinations of data points that have some type of relationship, which may allow you as the database developer to create a new classification based on that output. That classification can then be used to make decisions.
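To show the K-Means idea in miniature, here is a sketch of the standard Lloyd's algorithm in plain Python. The rows are invented (spend, conversions) pairs, not real advertising data, and the function name and parameters are illustrative; in practice a library implementation would normally be used instead.

```python
import random
from math import dist

def k_means(points, k, max_iter=50, seed=0):
    """Lloyd's algorithm: returns (centroids, labels) for a list of tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct rows
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave a centroid in place if its cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels

# Hypothetical (spend, conversions) rows, invented for this sketch.
rows = [(10, 1), (12, 2), (11, 1), (9, 2),
        (100, 30), (105, 28), (98, 33), (102, 31)]
centroids, labels = k_means(rows, k=2)
```

The resulting `labels` list is the kind of output that could be written back to a table as a new classification column and then used in ordinary queries.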

Let's not forget the targeting, audience segmentation, bidding, etc. that has to happen in real time based on all this data. ML is used to compute, build, and automate these datasets, whereas AI is used to make the decision with the result set.

This reply was modified 6 years, 1 months ago by xsevensinzx.

Jeff Moden SSC Guru Points: 1004685 · Answer 9

Awesome! I think you're the first person I've talked to that's actually using it for something that I think it's useful for. As soon as you said that you work in advertising, I knew what was coming.

You should pick one of the many things you're using it for and write an article about how you did it. I know that I'd be a reader of it.

xsevensinzx One Orange Chip Points: 25560 · Answer 10

I mean, there are many other uses in other industries for sure. I had an interview with a wood pellet company for alternative fuel. They wanted to explore using ML on the data gathered from their many plants and equipment. The data, if used correctly, could help them make their products more efficiently.

I mean, in theory, that's all we are doing in advertising. The machine is Google and the product is the ad. We use things like attribution to see how that ad is working across all channels, not just Google. Then optimize according to what ML is telling us.

Steve Jones - SSC Editor SSC Guru Points: 740271 · Answer 11

We used to use ML type systems a decade ago to read marks on wood from a scanner. These were used to grade the wood and help workers reduce the manual work they needed to do. This was rudimentary by today's standards, but I could see this working much better now with better equipment.

The best use I've seen is the jet engine twins from GE. The twins run models based on telemetry from the engines, including inspection videos, to help determine which ones need service now. It has both improved efficiency and better targeted maintenance by catching items that inspectors sometimes miss.