Machine Learning in Production — Book Overview
After teaching our Machine Learning in Production class (formerly “Software Engineering for AI-Enabled Systems”) four times, we stupidly made a decision we are so going to regret when there are still so many chapters left: we are going to write a book with our collected material.
We will eventually release the book publicly under a Creative Commons license. While we work on it, we are releasing individual chapters here, one at a time. We hope that more of the chapters below will be filled with links soon.
Table of Contents
Part 1: ML in Production: Going beyond the model
- Introduction
- From model to production system
- Challenges of production machine learning systems
Part 2: Engineering Production AI Systems
- Requirements engineering
  - 1. System and model goals
  - 2. Excursion: Measurement
  - 3. Model qualities
  - 4. The world and the machine (short version)
  - 5. Risk analysis
  - 6. Planning for mistakes
- Architecture and design
  - 1. Inference service
  - 2. ML tradeoffs
  - 3. Model deployment
  - 4. Model composition
  - 5. Telemetry
  - 6. Big data
  - 7. Monitoring
  - 8. Pipelines and automation
  - 9. Evolution
  - 10. Human-AI interaction
  - 11. Other architectural drivers: privacy, data volume, operating cost
- Quality assurance
  - 1. Model Quality: Defining Correctness and Fit
  - 2. Model Quality: Measuring Prediction Accuracy
  - 3. Model Quality: Slicing, Capabilities, Invariants, and other Testing Strategies
  - 4. Data quality
  - 5. QA automation
  - 6. Quality assurance in production
  - 7. Infrastructure quality
  - 8. Debugging
- Operations
- Process and teams
  - 1. Data science and software engineering process models (short version)
  - 2. Interdisciplinary teams
  - 3. Technical debt
  - 4. DevOps to MLOps
Part 3: Responsible AI Engineering
- Responsible Engineering
- Versioning, provenance, and reproducibility
- Safety
- Security and privacy
- Fairness
- Interpretability and explainability
- Transparency and trust
- Regulation is coming