Abstract: AI is introducing new frontiers in engineering and science. A fleet of cars can be monitored in real time, and problems can be detected before they cause failures. Smart medical devices and diagnostics can help save lives. However, there are many complexities associated with acquiring and managing engineering data, training predictive models, and deploying these models for use in near real time. This talk will discuss these complexities in the context of building a system to perform streaming predictions on sensor data.
First, you must have failure data to train a good model, but you don’t want to break equipment for the sake of building a data set! Instead, physical simulations can be used to create large, synthetic data sets with various failure conditions to train a machine learning model. These systems also involve high-frequency data from many sensors and components, each reporting at different times. The data must be time-aligned before calculations can be applied, which makes it difficult to design a streaming architecture. These challenges can be addressed with Apache Kafka and a stream processing framework that incorporates time-windowing and manages out-of-order data. The sensor data must then be synchronized for further signal processing before being passed to a machine learning model.
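To make the time-windowing idea concrete, here is a minimal sketch in plain Python. It is not the system presented in the talk and does not use Kafka: the class name `TumblingWindowAligner`, its parameters, and the sample readings are all hypothetical, chosen only to illustrate how fixed windows plus an allowed-lateness threshold can tolerate out-of-order sensor readings and emit one time-aligned record per window.

```python
from collections import defaultdict
from statistics import mean


class TumblingWindowAligner:
    """Hypothetical helper: buffers out-of-order readings in fixed windows
    and emits one per-sensor mean per window once the window is closed."""

    def __init__(self, window_s=1.0, allowed_lateness_s=0.5):
        self.window_s = window_s
        self.allowed_lateness_s = allowed_lateness_s
        self.max_event_time = float("-inf")
        # window index -> sensor id -> list of values
        self.open_windows = defaultdict(lambda: defaultdict(list))
        self.last_closed_index = float("-inf")
        self.dropped = 0  # readings that arrived after their window closed

    def add(self, sensor_id, timestamp, value):
        """Add one reading; return the windows that are now complete as a
        list of (window_start, {sensor_id: mean_value}) tuples."""
        index = int(timestamp // self.window_s)
        if index <= self.last_closed_index:
            self.dropped += 1  # too late: its window was already closed
            return []

        self.open_windows[index][sensor_id].append(value)
        self.max_event_time = max(self.max_event_time, timestamp)

        # The watermark trails the newest event time by the allowed lateness.
        # Every window that ends at or before the watermark can be closed.
        watermark = self.max_event_time - self.allowed_lateness_s
        closable = int(watermark // self.window_s) - 1

        emitted = []
        for i in sorted(k for k in self.open_windows if k <= closable):
            readings = self.open_windows.pop(i)
            aligned = {s: mean(v) for s, v in readings.items()}
            emitted.append((i * self.window_s, aligned))
        self.last_closed_index = max(self.last_closed_index, closable)
        return emitted


# Made-up readings from two sensors, arriving out of timestamp order.
stream = [
    ("temp", 0.1, 10.0),
    ("vib", 0.4, 1.0),
    ("temp", 1.2, 12.0),
    ("vib", 0.9, 3.0),   # out of order, but within the allowed lateness
    ("temp", 1.6, 14.0),
    ("vib", 0.95, 9.0),  # arrives after window 0 has closed, so it is dropped
    ("vib", 1.7, 5.0),
    ("temp", 2.6, 15.0),
]

aligner = TumblingWindowAligner(window_s=1.0, allowed_lateness_s=0.5)
results = []
for sensor_id, timestamp, value in stream:
    results.extend(aligner.add(sensor_id, timestamp, value))

for window_start, aligned in results:
    print(window_start, aligned)
```

Each emitted record holds one value per sensor for the same window, which is the time-aligned form that downstream signal processing and a model would consume. Averaging within the window is only one possible way to synchronize the sensors; interpolation or resampling would fit the same structure.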
This session will focus on building a system to address these challenges using MATLAB, Simulink, Python, Apache Kafka, and Microsoft Azure. We will start with a physical model and walk through the process of generating sensor data, performing signal processing and managing time, and developing and deploying a machine learning model for streaming data.
Bio: Heather Gorr holds a Ph.D. in Materials Science Engineering from the University of Pittsburgh and a Master of Science and a Bachelor of Science in Physics from Penn State University. Since 2013, she has supported MATLAB users in the areas of mathematics, data science, deep learning, and application deployment. Prior to joining MathWorks, she was a Research Fellow focused on machine learning for prediction of fluid concentrations.