Advanced Machine Learning with scikit-learn: Imbalanced Classification and Text Data

Abstract: 

scikit-learn is a machine learning library for Python that has become a valuable tool for many data science practitioners. This training will cover some advanced topics in using scikit-learn and show how to build your own models or feature extraction methods that are compatible with scikit-learn. We will also discuss different approaches to feature selection and resampling methods for imbalanced data. Finally, we'll discuss how to classify text data using the bag-of-words model and its variants.

Bio: 

Andreas Mueller is an Associate Research Scientist at the Data Science Institute at Columbia University and author of the O'Reilly book "Introduction to Machine Learning with Python". He is one of the core developers of the scikit-learn machine learning library and has co-maintained it for several years.

His mission is to create open tools that lower the barrier to entry for machine learning applications, promote reproducible science, and democratize access to high-quality machine learning algorithms.