Chaos and Pain in Machine Learning, and the ‘DevOps for ML Manifesto’

Abstract: 

Most AI/ML projects start shipping models into production, where they can deliver business value, using the “no-process process”: people just do their best by creating an ad-hoc process with familiar tools. This works for tiny teams at first, but as the team grows you'll discover significant chaos and pain in trying to operationalize AI.

We’ve been here before. Software development in the 90s was a lot like the “no-process process” for ML today. And just as the paradigm shift known as DevOps brought reproducibility, collaboration and continuous delivery to software, applying the same principles to ML can bring the same benefits to AI projects. Without these principles, AI projects fail and create financial and reputational risk.

Nick will dive deep into the important differences between software development and ML which mean you can't just reuse DevOps tools. He will share his team’s research comparing the evolution of Software Development & DevOps with that of Machine Learning. Nick will then present a manifesto, and propose an architecture and a set of open-source tools to make ML reproducible, accountable, collaborative and continuous.

Bio: 

Nick has been a data scientist since the early 2000s. After obtaining an undergraduate degree in geology at Cambridge University in England (2000), he completed Master's (2001) and PhD (2004) degrees in Astronomy at the University of Sussex. He then moved to North America, completing postdoctoral positions in Astronomy at the University of Illinois at Urbana-Champaign (2004-2009, joint with the National Center for Supercomputing Applications) and the Herzberg Institute of Astrophysics in Victoria, BC, Canada (2009-2013). He joined Skytree, a startup company specializing in machine learning, in 2012, and in 2017 the Skytree technology and team were acquired by Infosys. Machine learning has been part of his work since 2000, first applied to large astronomical datasets and then to a wide range of applications as a generalist data scientist at Skytree, Infosys, Oracle, and now Dotscience.