Managing the Complete Machine Learning Lifecycle with MLflow

Abstract: 

ML development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models.

To solve these challenges, MLflow, an open-source project, simplifies the entire ML lifecycle. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size.

What You Will Learn
* Understand the four main components of open source MLflow (MLflow Tracking, MLflow Projects, MLflow Models, and MLflow Registry) and how each helps address challenges of the ML lifecycle.
* How to use MLflow Tracking to record and query experiments: code, data, config, and results.
* How to use the MLflow Projects packaging format to reproduce runs.
* How to use the MLflow Models general format to send models to diverse deployment tools.
* How to use MLflow Registry for collaborative model lifecycle management.
* How to use the MLflow UI to visually compare experimental runs with different tuning parameters and to evaluate metrics.

Prerequisites
– A fully-charged laptop (8-16GB memory) with Chrome or Firefox
– Pre-register for Databricks Community Edition
– Basic knowledge of the Python programming language
– Basic understanding of machine learning concepts and supervised learning algorithms
– Some basic familiarity with scikit-learn and Keras/TensorFlow

Bio: 

Jules S. Damji is an Apache Spark community and developer advocate at Databricks. He’s a hands-on developer with over 20 years of experience. Previously, he worked at leading companies such as Sun Microsystems, Netscape, @Home, LoudCloud/Opsware, Verisign, ProQuest, and Hortonworks, building large-scale distributed systems. He holds a BSc and an MSc in computer science and an MA in political advocacy and communication from Oregon State University, California State University, and Johns Hopkins University, respectively.