Margaret Mitchell, PhD AI Researcher, Founding Member | Google Research and Machine Intelligence, Microsoft Research’s “Cognition” group
Margaret Mitchell, PhD
Margaret is a Senior Research Scientist in Google’s Research & Machine Intelligence group, working on artificial intelligence. Her research generally involves vision-language and grounded language generation, focusing on how to evolve artificial intelligence towards positive goals. This includes research on helping computers to communicate based on what they can process, as well as projects to create assistive and clinical technology from the state of the art in AI. Her work combines computer vision, natural language processing, social media, many statistical methods, and insights from cognitive science.
Michael Kearns, PhD Professor, Author of The Ethical Algorithm, National Center Chair, Founding Director | Warren Center for Network and Data Sciences, University of Pennsylvania
Michael Kearns, PhD
Michael Kearns is a professor in the Computer and Information Science department at the University of Pennsylvania, where he holds the National Center Chair and has joint appointments in the Wharton School. He is the founder of Penn’s Networked and Social Systems Engineering (NETS) program and director of Penn’s Warren Center for Network and Data Sciences. Michael is also the co-author of the book The Ethical Algorithm, which covers the science of designing algorithms that embed social values like privacy and fairness. His research interests include topics in machine learning, algorithmic game theory, social networks, and computational finance. He has worked and consulted extensively in the technology and finance industries. He is a fellow of the American Academy of Arts and Sciences, the Association for Computing Machinery, and the Association for the Advancement of Artificial Intelligence.
John Montgomery Corporate Vice President, Program Management, AI Platform | Microsoft
John leads Program Management for Microsoft Azure AI and is responsible for designing products and services that data scientists and ML experts around the world love and use. He leads a team of program managers, researchers, and designers responsible for products and services including Azure Machine Learning, Azure Cognitive Services, ML.NET, and ONNX Runtime. Prior to this role, John led the Program Management team for Microsoft’s Developer Division, including Visual Studio, Visual Studio Code, and Azure Notebooks.
As stream processing frameworks like Apache Flink become the norm for real-time analytics and event-driven applications, SQL is making a comeback as a way to lower the entry barrier to streaming. But how do you take the leap from analytics on historical data to real-time insights on streams?
This talk will walk you through the full development pipeline, from initial interactive analysis to continuous production deployment using Flink SQL’s unique approach to unified batch and stream processing. You will learn how to use some of Flink’s most powerful features such as temporal table joins for working with historical data and complex pattern detection using MATCH_RECOGNIZE.
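As a rough illustration of the idea behind a temporal table join, the plain-Python sketch below looks up, for each event, the version of a slowly changing table that was valid at the event's timestamp. This is not Flink SQL or any Flink API; the names RateVersion, rate_as_of, and temporal_join and the sample data are hypothetical, chosen only for this example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class RateVersion:
    """One version of a changing lookup table (hypothetical example type)."""
    valid_from: int  # timestamp at which this rate takes effect
    rate: float


def rate_as_of(versions, event_time):
    """Return the rate in effect at event_time, or None if no version applies yet.

    `versions` must be sorted by valid_from in ascending order.
    """
    starts = [v.valid_from for v in versions]
    idx = bisect_right(starts, event_time) - 1  # last version starting at or before event_time
    return versions[idx].rate if idx >= 0 else None


def temporal_join(orders, versions):
    """Join each (event_time, amount) order with the rate valid at its event time."""
    joined = []
    for event_time, amount in orders:
        rate = rate_as_of(versions, event_time)
        if rate is not None:  # inner-join behaviour: skip orders with no applicable rate
            joined.append((event_time, amount, rate, amount * rate))
    return joined


# Made-up sample data: three rate versions and three orders.
rates = [RateVersion(0, 1.25), RateVersion(100, 1.5), RateVersion(200, 2.0)]
orders = [(50, 10.0), (100, 20.0), (250, 30.0)]
result = temporal_join(orders, rates)
```

Each order is matched against the rate that was current when the order happened rather than the latest rate, which is the property that makes this kind of join useful for combining a stream with historical reference data.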
Seth Wiesman is a Solutions Architect at Ververica, where he works with engineering teams inside of various organizations to build the best possible stream processing architecture for their use cases.
People Analytics: How to train, explain and operationalize your HR analytics solution with Azure Machine Learning
Francesca Lazzeri, PhD Senior Lead Machine Learning Scientist, Cloud Advocate at Microsoft
In this session we will show how to train and explain an employee attrition model, and how to deploy the trained model and its corresponding explainer to Azure Container Instances (ACI). We will demonstrate the API calls that you need to make to submit a run for training and explaining a model to AMLCompute, download the computed explanations, and visualize the global and local explanations via a visualization dashboard that provides an interactive way of discovering patterns in model predictions and downloaded explanations. Finally, we will show how to use Azure Machine Learning MLOps capabilities to deploy your model and its corresponding explainer.
Francesca Lazzeri, PhD
Francesca Lazzeri, PhD is an experienced scientist and machine learning practitioner with over 12 years of both academic and industry experience. She is the author of a number of publications, including articles in technology journals, conference papers, and books. She currently leads an international team of cloud advocates, developers and data scientists at Microsoft. Before joining Microsoft, she was a research fellow at Harvard University in the Technology and Operations Management Unit. Find her on Twitter: @frlazzeri and Medium: @francescalazzeri
This webinar is an introduction to the hands-on workshop that will take place at ODSC Virtual Conference. The workshop will demonstrate how to gain observability (monitoring & alerting) for production machine learning pipelines. We will provide background on why observability is important to run successful MLOps, then walk through in detail how to set up a robust observability system.
Without a proper observability system, it is impossible to scale a successful machine learning effort. During the webinar, we will discuss the workshop's agenda and learning outcomes. The workshop will provide ML engineering teams with the tools they need (all available in the open-source ecosystem) to solve major visibility gaps in the machine learning lifecycle, including monitoring data quality, job statuses, ML model performance, and retraining.
The content covered will be of interest to data engineers and data scientists, including anyone who is working on machine learning projects.
Josh is Cofounder of Databand, an APM and observability solution for data engineering teams. Prior to founding Databand, Josh was a Product Manager at Sisense, a business analytics software startup. Josh led product on Sisense’s ETL and database integration technologies as the startup scaled to over 700 team members and over 1,000 clients. Before Sisense, Josh worked in venture capital at Bessemer Venture Partners, where he focused on cloud infrastructure and machine learning investments.
Evgeny is Cofounder of Databand, an APM and observability solution for data engineering teams. Evgeny is a data architect and engineer by background. Prior to Databand, Evgeny was the first employee, data architect, and team lead at Crosswise, a big data startup acquired by Oracle Data Cloud. Before Crosswise and ODC, Evgeny was a senior developer, software engineering team lead, and researcher at various startups.
Data Science Learnathon: From Raw Data to Deployment
Kathrin Melcher Data Scientist at KNIME
In this Learnathon you will learn more about the data science cycle – data access, data blending, data preparation, model training, optimization, testing, and deployment. We will work in groups to create a workflow-based solution to guided exercises.
The tool of choice for this Learnathon is the open-source, GUI-driven KNIME Analytics Platform. Because KNIME is open, it offers great integrations with an IDE environment for R, Python, SQL, and Spark.
In this webinar, we will start with an introduction to KNIME Analytics Platform followed by a short presentation about the data science cycle.
Kathrin Melcher is a data scientist at KNIME. She holds a master's degree in Mathematics. She has a strong interest in data science, machine learning and algorithms, and enjoys teaching and sharing her knowledge about them.
Paige Roberts Open Source Relations Manager at Vertica
Python + MPP Database = Large Scale AI/ML Projects in Production Faster
In this talk, you will learn about combination architectures that can get your work into production, shorten development time, and provide the performance and scale advantages of an MPP database with the convenience and power of Python. The use-case examples rely on the open source Vertica-Python project created by Uber with contributions from Twitter, Palantir, Etsy, Vertica, Kayak and Gooddata.
In two decades in the data management industry, Paige Roberts has worked as an engineer, a trainer, a support technician, a technical writer, a marketer, a product manager, and a consultant. She has built data engineering pipelines and architectures, documented and tested large scale open source analytics implementations, spun up Hadoop clusters from bare metal, picked the brains of some of the stars in the data analytics and engineering industry, championed data quality when that was supposedly passé, worked with a lot of companies in a lot of different industries, and questioned a lot of people’s assumptions. Now, she promotes understanding of Vertica, MPP data processing, open source, high scale data engineering, and how the analytics revolution is changing the world.
Reinhold Beckmann Attorney & Lecturer at RA Reinhold Beckmann
GDPR in Action: Does it Work?
In 2018, companies faced massive pressure to handle private data in the European Union carefully. This was triggered by the new E.U. data privacy regulations, called GDPR. We will show that implementing these requirements to do business in Europe turned out to be much simpler than expected. For this, we will give real-life insight, accompanied by legal expertise, into how we implemented GDPR requirements concretely at Digital Farming. We are part of the BASF group, which is the largest chemical company in the world.
Reinhold is a lawyer from Germany specializing in Internet law and international IT law. His main focus is advising companies, e.g. BASF Digital Farming in Germany, on ensuring their GDPR compliance under the new European data privacy regulations. This includes international aspects of personal data security. Reinhold also teaches and speaks on these topics at international conferences. After finishing his law studies in Muenster, Germany, Reinhold worked for more than 20 years in the enterprise software industry, mainly for North American software vendors, leading their European organizations.
Marinela Profi Global Product Marketing Manager at SAS
Managing Open Source Models Just Got a Lot Easier: SAS Open Model Manager®
Without a structured process to coordinate all the different pieces, model management can feel more like the wild, wild west – no governance, no oversight, and no strategy for putting analytic insights to work. This is where SAS® Open Model Manager can help. This new offering from SAS can wrangle the model management process by enabling IT and analytics professionals to register, deploy, and monitor open-source models in a consistent, repeatable manner. SAS Open Model Manager integrates with Python and R and is intended for organizations that use open-source tools to build models and need model management capabilities.
After working as a Customer Advisor for Analytics at SAS Europe, Marinela became Global Product Marketing Manager at SAS. She is responsible for Model Management and Open Source Integration Solutions. Her background is a mix of Business Administration, Statistics and Marketing. After a Bachelor’s degree in Economics and Management, Marinela completed her first Master in Business Administration and her second Master in Statistics at the University of Rome Tor Vergata. In her free time, she enjoys riding her motorbike and, as a good Italian, cooking and eating.
Joy Payton Supervisor, Data Education, Children’s Hospital of Philadelphia
Data Science and Machine Learning in the Cloud for Cloud Novices
In this hands-on training at the conference, we will use free-tier resources in the Google Cloud Platform (GCP) to introduce learners to the practical use of cloud computing resources in data science and machine learning. This training will be useful for those considering cloud adoption, interested in data engineering, or interested in working with public data as citizen scientists. Topics covered will include: Cloud computing concepts and vocabulary; Cloud providers; Free tier and cost considerations; Public datasets and citizen science; Redundancy, security, and privacy; Continuum of management levels; Cloud data storage and analytics; Machine learning in the cloud.
Joy Payton is a cloud engineer, data scientist, and adjunct professor who specializes in helping biomedical professionals conduct reproducible computational research. In addition to moving medicine forward through principles of open science and reproducibility, Joy also enjoys teaching citizen scientists how to use public data repositories to understand their own communities better and advocate for change from a data-centric perspective. Her various roles allow Joy to lead efforts to teach people how to write their first line of code and help anyone who’s interested climb the data science learning curve. Currently employed by the Children’s Hospital of Philadelphia and Yeshiva University, Joy is always open to hearing about open-source, data-centric volunteer opportunities for herself and her students.
Patrick Buehler, PhD Principal Data Scientist, Microsoft
How to Solve Real-World Computer Vision Problems Using Open-source
At the conference, the workshop will begin with an overview of common real-world tasks in the CV domain, including examples of problems our customers have faced in recent years. We will then give a brief introduction to deep learning models for CV. The main part of this session will demonstrate how to train and evaluate CV models by executing notebooks based on the Fast.ai and Torchvision libraries for PyTorch. We will start with image classification, showing how to fine-tune a pre-trained ImageNet model on a custom dataset and how to deploy the model to the cloud. Next, we will train an object detection model and extend the model to segmentation masks and keypoints. Finally, we will build an image similarity system and demo a fast image retrieval solution that can handle large numbers of images.
Patrick Buehler, PhD
Patrick Buehler is a principal data scientist at Microsoft’s Cloud AI Group. He obtained his PhD from the Oxford VGG group in Computer Vision with Prof. Andrew Zisserman. He has over fifteen years of working experience in academic settings and with various external customers spanning a wide range of Computer Vision problems.
Tom Goldenberg Junior Principal Data Engineer, QuantumBlack
Kedro + MLflow – Reproducible and Versioned Data Pipelines at Scale
The aim of this tutorial, which we will host at ODSC Boston, is to demonstrate how Kedro (a development workflow tool open-sourced by QuantumBlack, a McKinsey company) and MLflow fit together in a scalable AI architecture. To start, we will give an overview of Kedro and an overview of MLflow: what they are used for, what functionality they provide, and how they compare as tools. Next, we will walk through a demo of a Kedro project that has MLflow integrated into it. Finally, we will go over deployment options. At the webinar you will learn more about this tutorial. There will be time allocated at the end for Q&A.
Tom is a Junior Principal Data Engineer at QuantumBlack, a McKinsey Company. Prior to consulting, Tom was CTO and co-founder of Commandiv, a wealth management startup.
ODSC Webinar: Rapid AI-powered Apps Prototyping, and Deep Learning Model to Detect Fraudulent Attacks
Zero to Production: Rapid AI-powered apps prototyping and development
Building AI-powered applications is a complex and challenging process fraught with unforeseen costs and setbacks. But what if any developer could build and prototype a sophisticated and functional AI-powered application as quickly and efficiently as they could launch a new webpage on Wix or Squarespace? This would be a game-changer in the world of AI, and it’s exactly what Baha Abu Nojaim, Cofounder @ Baseet.ai, is going to show.
Baha Abu Nojaim
Baha is a serial entrepreneur, technologist, MIT Innovator Under 35, and co-founder of Baseet.ai, solving complex problems at the intersection of the digital and physical worlds, with a passion for tackling world-class challenges in transformative tech.
Nicola Corradi Deep Learning Research Engineer at DataVisor
Deep Learning Model to Detect Fraudulent Attacks
Fraudulent attacks such as application fraud, fake reviews, and promotion abuse have to automate the generation of user content to scale; this creates latent patterns shared among the coordinated malicious accounts. Nicola Corradi digs into a deep learning model to detect such patterns for the identification of coordinated content abuse attacks on social, ecommerce, financial platforms, and more.
Nicola Corradi, PhD
Nicola Corradi is a Research Scientist at DataVisor, where he uses his vast experience with neural networks to design and train deep learning models to recognize malicious patterns in user behaviour. He earned a PhD in cognitive science (University of Padua) and did a post-doc at Cornell in computational neuroscience and computer vision, focusing on the integration of computational models of neurons with neural networks.
Dr. Jon Krohn Chief Data Scientist at untapt and author of Deep Learning Illustrated
Deep Learning (with TensorFlow 2)
Relatively obscure a few short years ago, Deep Learning is ubiquitous today across data-driven applications as diverse as machine vision, natural language processing, and super-human game-playing. This Deep Learning primer brings the revolutionary machine-learning approach behind contemporary artificial intelligence to life with interactive demos featuring TensorFlow 2.0, the major, cutting-edge revision of the world’s most popular Deep Learning library.
Jon Krohn is Chief Data Scientist at the machine learning company untapt. He presents an acclaimed series of tutorials published by Addison-Wesley, including Deep Learning with TensorFlow and Deep Learning for Natural Language Processing. Jon teaches his deep learning curriculum in-classroom at the New York City Data Science Academy and guest lectures at Columbia University. He holds a doctorate in neuroscience from the University of Oxford and, since 2010, has been publishing on machine learning in leading peer-reviewed journals. His book, Deep Learning Illustrated, was published by Pearson in 2019.
Ali Vanderveld, PhD Director of Data Science at ShopRunner
Using Deep Learning to Build a Unified E-commerce Marketplace
ShopRunner is an e-commerce company that receives feeds of product data from many different retailer partners, including large department stores and retailers that specialize in electronics, appliances, nutritional products, and more. In order to provide a great user experience on our website and in our mobile app, we need to have one easy-to-navigate product taxonomy. We also would like to have sets of attribute tags that make it easy to filter down to exactly what any shopper is looking for. In this talk I will describe how we are using computer vision and natural language processing to place all of the products from our retailer partners into one easy-to-navigate shopping experience.
Ali Vanderveld is Head of Data Science at ShopRunner, where her team leverages data from a network of over 140 retailers to build products for their 6 million members. Prior to ShopRunner, she was a staff data scientist at Civis Analytics, a consulting and software startup that helps companies, nonprofits, and political organizations better utilize their data. She has also worked at Groupon and as a technical mentor for the Data Science for Social Good Fellowship. Ali has a PhD in theoretical astrophysics from Cornell University and got her start working as an academic researcher at Caltech, the NASA Jet Propulsion Laboratory, and the University of Chicago, working on the development teams for several space telescope missions, including ESA’s Euclid.
Veysel Kocaman, PhD Senior Data Scientist at John Snow Labs
Spark NLP for Healthcare: Lessons Learned Building Real-World Healthcare AI Systems
At the conference, the speaker will review case studies from real-world projects that built AI systems using Natural Language Processing (NLP) in healthcare. These case studies cover projects that deployed automated patient risk prediction, automated diagnosis, clinical guidelines, and revenue cycle optimization. He will also cover why and how NLP was used, what deep learning models and libraries were used, and what was achieved. Key takeaways for the conference attendees will include important considerations for NLP projects, including how to build domain-specific healthcare models and how to use NLP as part of larger, scalable machine learning and deep learning pipelines in a distributed environment.
Veysel Kocaman is a Senior Data Scientist and ML Engineer at John Snow Labs and has a decade of industry experience. He is pursuing his PhD in CS and lectures at Leiden University (NL), and he holds an MS degree in Operations Research from Penn State University. He is affiliated with Google as a Developer Expert in Machine Learning.