Why PolyBase matters

APSolutely

Old Hand

Points: 390
August 9, 2015 at 3:53 pm

#319573

Comments posted to this topic are about the item Why PolyBase matters

eagleua SSC Rookie Points: 49 · Answer 1

Hi Alain,

Thanks for the thoughtful article.

I have one question: does PolyBase in your case generate MapReduce jobs to process the data, or is it still "move the data into SQL" and then process it?

Basically, do you need to transfer your whole dataset again when you use PolyBase, or is the query transformed into MapReduce so that only the result set is returned to you?

Thanks and regards!

Andrii

APSolutely Old Hand Points: 390 · Answer 2

Hi Andrii,

Glad you enjoyed the article. Part 2, which goes into more technical detail, will be available tomorrow.

To answer your question, though: an HDI region (whether on-premises in the appliance or in Azure) should be seen as additional hardware and software, and it is a separate engine from the PDW. Data movement between the HDI and PDW regions is a fully parallelized process controlled by the Data Movement Service (DMS), but PolyBase is still responsible for predicate pushdown, which uses MapReduce to reduce the amount of data moved between the two regions based on the query predicates. The final result set produced by the MapReduce job is then moved by DMS into temporary tables in the PDW.

One important note, though: this only happens when the amount of data that would otherwise be landed in the PDW region exceeds 1 GB per distribution, and only if you have enabled the functionality by providing the resource manager location in the external data source configuration.
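For illustration, here is a rough sketch of what supplying the resource manager location in an external data source can look like. The data source name, addresses, and ports are placeholders I made up for the example, not values from the article, and the exact option names may differ between APS/PDW versions, so check the documentation for your release before using it.

```sql
-- Hypothetical external data source pointing at a Hadoop cluster.
-- The name, IP address, and ports below are placeholders only.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
    TYPE = HADOOP,
    -- Name node address used to read the files themselves
    LOCATION = 'hdfs://10.10.10.10:8020',
    -- Supplying the resource manager address is what makes
    -- MapReduce-based predicate pushdown available to PolyBase
    RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
);
```

If the resource manager location is left out, queries against external tables on this data source should still work, but all filtering happens after the data has been moved into the PDW region.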

I want to credit a friend of mine and APS guru, James Rowland-Jones, for providing the specific details and exact context for this answer.

I'll also refer you to a very good article by another APS guru, James Serra, "PolyBase Explained", where he explains exactly how the current implementation of PolyBase works and what its limitations are.

h.tobisch SSCommitted Points: 1671 · Answer 3

August 25, 2017 at 2:39 am

#1956760

After all, what IS PolyBase?

chudman SSCrazy Points: 2453 · Answer 4

Very interested in follow-up articles, Alain. Keep them coming!

One question: With the creation of the Azure data warehouse, don't you make the DR set-up you had before obsolete?

Thanks

Jeff Bennett
Missouri, US