Building Better ML Systems — Chapter 4. Model Deployment and Beyond
Deploying models and supporting them in production is more about engineering and less about machine learning.
When an ML project approaches production, more and more people get involved: Backend Engineers, Frontend Engineers, Data Engineers, DevOps Engineers, Infrastructure Engineers...
They choose data stores, introduce workflows and pipelines, integrate the service into the backend and UI codebases, automate releases, set up backups and rollbacks, decide on compute instances, set up monitoring and alerts… Today, no one expects a Data Scientist / ML Engineer to do it all. Even in a tiny startup, people are specialized to some extent.
“Why should a Data Scientist / ML Engineer know about production?” — you may ask.
Having the model in production does not mean we are done with all ML-related tasks. Ha! Not even close. Now it’s time to tackle a whole new set of challenges: how to evaluate your model in production and monitor whether its accuracy is still satisfactory, how to detect data distribution shifts and deal with them, how often to retrain the model, and how to make sure a newly trained model is actually better. There are ways to handle all of this, and we are going to discuss them extensively.
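As a quick taste of what’s ahead, here is a minimal sketch of one common way to flag a distribution shift on a single numeric feature: a two-sample Kolmogorov–Smirnov test comparing training data against a recent production window. The function name, threshold, and sample sizes are illustrative assumptions, not prescriptions.

```python
# Minimal sketch: flag drift in one numeric feature with a two-sample KS test.
# The p-value threshold and sample sizes are illustrative, not recommendations.
import numpy as np
from scipy.stats import ks_2samp


def feature_drifted(train_values: np.ndarray,
                    prod_values: np.ndarray,
                    p_threshold: float = 0.01) -> bool:
    """Return True if the production sample looks unlikely to come
    from the same distribution as the training sample."""
    result = ks_2samp(train_values, prod_values)
    return result.pvalue < p_threshold


# Toy example: simulate a shift in the feature's mean.
rng = np.random.default_rng(42)
train = rng.normal(loc=0.0, scale=1.0, size=5_000)  # training distribution
prod = rng.normal(loc=0.5, scale=1.0, size=1_000)   # recent production window
print(feature_drifted(train, prod))  # True -> investigate, consider retraining
```

In practice you would run a check like this per feature, on a schedule, and route the result into whatever monitoring and alerting you already have in place.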
In this post, I intentionally focus only on ML topics and either omit engineering concepts or cover them at a high level, to keep things simple and understandable for people with varying levels of experience.