A growing number of articles demonstrate that obfuscated data isn't as obfuscated as it might first appear, the Netflix example being just one.
Tools like Redgate SQL Data Generator can be used to generate sufficient quantities of realistic, but not real, data. I had a lot of fun with the tool and wrote it up at https://www.red-gate.com/hub/product-learning/sql-data-generator/how-to-generate-various-forms-of-realistic-data-for-testing-development-and-prototypes
I do agree that generating realistic data isn't easy once you step beyond "Here is a table. Fill it". Getting the proportions and relationships right becomes ever more important as the sophistication of testing and demonstrations increases. As the article points out, it isn't the job of an afternoon! Setting it up correctly can be a project in its own right. The setup needs to be treated as a product and nurtured as such.
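To make the "proportions and relationships" point concrete, here is a rough Python sketch of the idea, not how any particular tool does it. The segment names, weights, and orders-per-customer figures are all made up for illustration; in practice they would come from profiling the real data. It uses only the standard library and a fixed seed so the output is repeatable.

```python
import random
from collections import Counter

# Hypothetical target proportions for customer segments (illustrative only).
SEGMENT_WEIGHTS = {"retail": 0.70, "small_business": 0.25, "enterprise": 0.05}

# Hypothetical typical number of orders per customer, by segment.
ORDERS_PER_CUSTOMER = {"retail": 2, "small_business": 8, "enterprise": 40}


def generate_customers(n, rng):
    """Create n customers whose segments follow SEGMENT_WEIGHTS on average."""
    segments = rng.choices(
        list(SEGMENT_WEIGHTS),
        weights=list(SEGMENT_WEIGHTS.values()),
        k=n,
    )
    return [{"customer_id": i + 1, "segment": s} for i, s in enumerate(segments)]


def generate_orders(customers, rng):
    """Create orders that always point at an existing customer (a valid foreign key)."""
    orders = []
    for customer in customers:
        typical = ORDERS_PER_CUSTOMER[customer["segment"]]
        # Vary the count around the typical value so the data isn't uniform.
        for _ in range(rng.randint(0, 2 * typical)):
            orders.append({
                "order_id": len(orders) + 1,
                "customer_id": customer["customer_id"],
                "amount": round(rng.uniform(5, 500), 2),
            })
    return orders


def generate_dataset(n_customers, seed=42):
    """Build a repeatable customers/orders dataset from a fixed seed."""
    rng = random.Random(seed)
    customers = generate_customers(n_customers, rng)
    orders = generate_orders(customers, rng)
    return customers, orders


if __name__ == "__main__":
    customers, orders = generate_dataset(1000)
    print(Counter(c["segment"] for c in customers))
    print(len(orders), "orders generated")
```

Even this toy version has to decide the segment mix, how order volume depends on segment, and how to keep the keys consistent, which is exactly where the effort goes once there are dozens of tables.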
Putting large test datasets into some form of shared binary repository is a good strategy. The need to refresh such data is likely to be infrequent in most cases.