SQL Server Integrates Hadoop and Spark out-of-the box: The Why?


Frank A. Banin

September 9, 2019 at 12:00 am


Comments posted to this topic are about the item SQL Server Integrates Hadoop and Spark out-of-the box: The Why?
Frank Banin
BI and Advanced Analytics Professional.


Chris Clements · Answer 1

Fantastic information here, but there were so many grammar errors that they distracted me while reading. Sorry... I'm not trying to be a jerk here, but I thought it needed to be pointed out for the future.

Have you ever imagined a world without hypothetical situations?

Jeff Moden · Answer 2

Absolutely fascinating article... especially the "in a nutshell" history and "The WHY" as to what's happening with 2019. Looks like I have a lot of reading to do thanks to all of the links you provided.

It also explains a recent surge in questions on Spark SQL. A lot of people are trying to apply what they know about T-SQL (and other flavors of SQL) to Spark and failing because they don't realize that SQL <> SQL. For example, the DATEDIFF function in Spark SQL is relatively crippled in comparison to the T-SQL version, so it can't be used in the same manner for much. However, if people take the time to look up the relatively good documentation on the various functions in Spark SQL, they'll find a wealth of computational power in other functions that can, for example, greatly simplify the kinds of computations we use DATEDIFF for in T-SQL. It's a powerful "SQL" (in my "first blush" examination of the documentation), but it's a different "SQL". Knowing that up front will greatly reduce the anxieties of learning a different flavor of SQL.
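
To make the DATEDIFF point concrete, here is a minimal Python sketch (not Spark or T-SQL itself) that emulates the documented behavior of the functions involved. It assumes the two-argument Spark SQL form datediff(endDate, startDate), which returns whole days only, and the T-SQL form DATEDIFF(datepart, startdate, enddate), which counts datepart boundaries crossed. The helper names tsql_datediff, spark_datediff, and spark_months_between are hypothetical and exist only for this illustration; newer Spark releases may offer additional forms.

```python
import calendar
from datetime import date


def tsql_datediff(datepart: str, start: date, end: date) -> int:
    """Emulate T-SQL DATEDIFF for year/month/day: count boundaries crossed."""
    if datepart == "year":
        return end.year - start.year
    if datepart == "month":
        return (end.year - start.year) * 12 + (end.month - start.month)
    if datepart == "day":
        return (end - start).days
    raise ValueError(f"unsupported datepart in this sketch: {datepart}")


def spark_datediff(end: date, start: date) -> int:
    """Emulate two-argument Spark SQL datediff: whole days, end date first."""
    return (end - start).days


def _is_last_day_of_month(d: date) -> bool:
    return d.day == calendar.monthrange(d.year, d.month)[1]


def spark_months_between(end: date, start: date) -> float:
    """Emulate Spark SQL months_between for plain dates (no time of day)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    same_day = end.day == start.day
    both_last = _is_last_day_of_month(end) and _is_last_day_of_month(start)
    if same_day or both_last:
        return float(months)
    # Otherwise the fraction is based on 31-day months, rounded to 8 digits.
    return round(months + (end.day - start.day) / 31.0, 8)


if __name__ == "__main__":
    start, end = date(2018, 12, 31), date(2019, 1, 1)
    print(tsql_datediff("year", start, end))  # 1: a year boundary was crossed
    print(spark_datediff(end, start))         # 1: days only, no datepart
    print(spark_months_between(end, start))   # a small fraction of a month
```

Note the reversed argument order and the missing datepart in the Spark-style call. Under these assumptions, a T-SQL-style year or month count has to be built from other Spark SQL functions, for example by subtracting year() values or by using months_between.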

Thanks again for the article and "well done"! I'm looking forward to reading the articles at the other end of all the links.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

tablesneer · Answer 3

I have found the article to be really helpful. All of the information that is included in the post about Spark SQL is really detailed, and I find it highly impressive.