Absolutely fascinating article... especially the "in a nutshell" history and "The WHY" as to what's happening with 2019. Looks like I have a lot of reading to do thanks to all of the links you provided.
It also explains a recent surge in questions on Spark SQL. A lot of people are trying to apply what they know about T-SQL (and other flavors of SQL) to Spark and failing because they don't realize that SQL <> SQL. For example, the DATEDIFF function in Spark SQL is relatively crippled in comparison to the T-SQL version and so can't be used in the same manner for much. However, if people take the time to look up the relatively good documentation on the various functions in Spark SQL, they'll find a wealth of computational power in other functions that can (for example) greatly simplify the kinds of computations that we use DATEDIFF for in T-SQL. It's a powerful "SQL" (in my "first blush" examination of the documentation) but it's a different "SQL". Knowing that up front will greatly reduce the anxieties of learning a different flavor of SQL.
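A small sketch of the kind of difference meant here, assuming only Spark SQL's built-in to_date, datediff and months_between functions. The dates are made up for illustration, and the T-SQL line is shown as a comment for comparison only.

```sql
-- Spark SQL: datediff takes (endDate, startDate) and returns whole days.
-- There is no datepart argument as in T-SQL's DATEDIFF(datepart, start, end).
SELECT datediff(to_date('2019-03-01'), to_date('2019-01-01')) AS days_apart;      -- 59

-- For a month count, a separate function does the job.
-- Note the argument order is again (later date, earlier date).
SELECT months_between(to_date('2019-03-01'), to_date('2019-01-01')) AS months_apart;  -- 2.0

-- The rough T-SQL counterpart of the month calculation, for comparison:
-- SELECT DATEDIFF(month, '2019-01-01', '2019-03-01');  -- 2
```

Note that months_between returns a fractional value when the days of the month differ, whereas T-SQL's DATEDIFF(month, ...) counts month boundaries crossed, so the two are not drop-in replacements for each other.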
Thanks again for the article and "well done"! I'm looking forward to reading the articles at the other end of all the links.
is pronounced "ree-bar" and is a "Modenism" for R
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".
"If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)