9/17/2020

Data Science in Production: Building Scalable Model Pipelines with Python

Ben Weber, Data Science in Production: Building Scalable Model Pipelines with Python, Independently published, 2020.

Putting predictive models into production is one of the most direct ways that data scientists can add value to an organization. By learning how to build and deploy scalable model pipelines, data scientists can own more of the model production process and more rapidly deliver data products.

This book takes a hands-on approach to scaling up Python code so that it runs in distributed environments and supports robust pipelines. Readers will learn how to set up machine learning models as web endpoints, serverless functions, and streaming pipelines using multiple cloud environments. It is intended for analytics practitioners who have hands-on experience with Python libraries such as Pandas and scikit-learn, and it focuses on scaling up prototype models to production.

From startups to trillion-dollar companies, data science plays an important role in helping organizations maximize the value of their data. This book helps data scientists level up their careers by taking ownership of data products, with applied examples that demonstrate how to:

  • Translate models developed on a laptop to scalable deployments in the cloud
  • Develop end-to-end systems that automate data science workflows
  • Own a data product from conception to production

The accompanying Jupyter notebooks provide examples of scalable pipelines across multiple cloud environments, tools, and libraries (github.com/bgweber/DS_Production).
