As I have mentioned in prior blog posts, I have been writing a data architecture book, which I started last November. The title of the book is “Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh” and it is being published by O’Reilly. They have made the first two chapters and the preface available in their Early Release program. It’s 32 printed pages. Check it out here! You can expect to see 1-2 additional chapters appear each month. This is a great way to start reading the book without having to wait until the entire book is done. Note that you need an O’Reilly subscription to access it, or you can start a free 10-day trial. The site has the release date for the full book as September 2024, but I’m expecting it to be available by the end of this year. Please send any feedback on the book to firstname.lastname@example.org. I would love to hear what you think!
Here is the abstract of the book:
Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they’re also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of each architecture to help data professionals understand its pros and cons.
In the process, James Serra, big data and data warehousing solution architect at Microsoft, examines common data architecture concepts, including how data warehouses have had to evolve to work with data lake features. You’ll learn what data lakehouses can help you achieve, and how to distinguish data mesh hype from reality. Best of all, you’ll be able to determine the most appropriate data architecture for your needs. By reading this book, you’ll:
- Gain a working understanding of several data architectures
- Know the pros and cons of each approach
- Distinguish data architecture theory from the reality
- Learn to pick the best architecture for your use case
- Understand the differences between data warehouses and data lakes
- Learn common data architecture concepts to help you build better solutions
- Alleviate confusion by clearly defining each data architecture
- Know what architectures to use for each cloud provider
And here is the table of contents (subject to change):
A brief description of how the book publishing process works, for those interested: You submit a book proposal (for O’Reilly go here). You discuss your abstract with an acquisitions editor, and if the proposal is accepted, you sign a contract that contains a timeline for when the chapters are due. Then you start writing! It could take a year or more to finish a book, hence the benefit of the Early Release program.
Here is how it works: you write a chapter and submit it to a development editor, who makes edits and suggests changes and then sends it back to you (the edits focus on structure and content rather than grammar). You make some or all of the suggested changes and submit the chapter back to the development editor. You may repeat this cycle a couple of times for each chapter. Once you have done this for two chapters (they don’t have to be the first two chapters in the book), you are ready for the Early Release program. Those two chapters are sent to a production editor, who publishes them to the O’Reilly site. After that, you try to write 1-2 chapters approximately every month that can be edited and posted to the site.
The content is considered “unedited,” but as I explained above, it has been edited for structure and content; the grammar editing will be done by a copy editor after all the chapters are posted in the Early Release program. So the chapters you read in the Early Release program will still see some changes, but usually not many. Most of what I’m describing happens before your book gets to production (if you’re curious about that, I’d recommend checking out O’Reilly’s guide to Production). The Early Release process is pretty O’Reilly-specific, and different authors and development editors will manage the revisions and the expected number of chapters differently. The level of editing in production copyedits and proofreads will depend on a number of factors as well.