weekly #16: A framework for understanding the data journey, The evolution of Data Engineering, Maths for ML and DS foundations...


nibble dispatch

May 22 · Issue #16

Curated essays about the future of Data Science. Production Data Science and learning resources for continuous learning. Covers Data Science, Data Engineering, MLOps & DataOps. Curated by people at

Our top pick this week is a16z’s practical framework for understanding the data journey, in which they question the value of “data network effects” as a defensive strategy for companies.
Data + network effects ≠ data network effects

From Jupyter to Prod (see below for an overview of the different approaches to putting ML models in production)
With so many people trying to enter the field of data science, it’s easy to be misled and focus on the wrong things. Peter Scobas shares how he made the jump from economics to data science:
[…] what I’ve learned is that it is a bad idea for aspiring data science and analytics applicants to immediately jump into trying to learn fancy machine learning or deep learning models.
The evolution of data engineering
I stumbled upon this great post on the evolution of data engineering, which led me to the Medium blog of Maxime Beauchemin, the creator of Apache Airflow. His posts are all very good, like this one on functional data engineering.
The role of the data engineer is no longer to just provide support for analytics purpose but to be the owner of data-flows and to be able to serve data both to production and for analytics purpose.
Human-Centered Artificial Intelligence
Google just released their People + AI Guidebook, which covers a multidisciplinary, human-centered approach to designing with machine learning and AI.
Learning resources
Essential Maths for Machine Learning
It’s not entirely clear what level of mathematics is necessary to get started in machine learning. If you need to brush up on your math skills, this course by Microsoft on edX takes a hands-on approach to topics like equations, functions, vectors, matrices, statistics and probability.
Foundations of Data Science
Microsoft Research just released a free book that covers the theory for data science they expect to be useful in the next 40 years. The book is available for free as a PDF here, and you can also watch the presentation video.
Data Engineering Cookbook
Andreas Kretz, the host of Plumbers of Data Science, just released a first draft of his Data Engineering Cookbook.
It’s not only useful for beginners; professionals will definitely like the case study section too.
You can download the PDF for free here.
🍪 Bonus
I haven’t watched it yet, but this talk from the last PyCon about strategies for testing async code sounds super interesting.
See you next week! 🙂