On August 4, 2020, I presented this on the weekly Pragmatic Works webinar series. You can view that presentation here.
As part of that presentation, I committed to give you access to the databricks notebooks so you can run through this as well. You can find the notebook on my Github. It is stored as a dbc (Databricks notebook) file. You will need Databricks to open the file.
Two questions were asked during the session that I wanted to handle here. The first was related to connecting to relational databases. The short answer is yes. You can use the JDBC driver to work with SQL Server or Snowflake for instance. Details can be found in the data sources article on the Databricks site.
The next question was related to Databricks ability to work with SFTP. While I cannot speak to the specifics, I was able to find the following Spark library that may be able to provide the support you need. To be clear, I have not implemented this myself but wanted to provide a potential resource to help with this implementation. I found this in a Databricks forum and it may work for you: https://github.com/springml/spark-sftp. If one of you finds this useful, feel free to post a comment here for others to refer to.
Thanks again for everyone who was able to attend. We look forward to continuing to work with Databricks more.