Beyond Word Embeddings: BERT, ELMo, and ULMFiT NLP Models - A New Era in Neural Natural Language Processing

Abstract: Big changes are underway in the world of Natural Language Processing (NLP). The long reign of word vectors as NLP’s core representation technique has seen an exciting new line of challengers emerge: ELMo, the OpenAI Transformer, ULMFiT, Facebook’s PyText, and Google’s BERT.
In this talk, the audience will get a detailed understanding of the past, present, and future of deep learning in NLP, and will learn some of the current best practices for applying it. Topics include: the rise of distributed representations (e.g., word2vec); convolutional, recurrent, and recursive neural networks; recent developments in unsupervised sentence representation learning; and combining deep learning models with memory-augmenting strategies. Our conceptual understanding of how to represent words and sentences in a way that best captures underlying meanings and relationships is rapidly evolving. Moreover, the NLP community has been putting forward incredibly powerful components that you can freely download and use in your own models and pipelines. This talk will introduce them to the audience.
ELMo, ULMFiT, the OpenAI Transformer, and BERT made headlines by demonstrating that pretrained language models can be used to achieve state-of-the-art results on a wide range of NLP tasks. Such methods herald a watershed moment: they may have the same wide-ranging impact on NLP as pretrained ImageNet models had on computer vision.
Language understanding is a challenge for computers. Subtle nuances of communication that human toddlers can understand still confuse the most powerful machines. Even though advanced techniques like deep learning can detect and replicate complex language patterns, machine learning models still lack a fundamental conceptual understanding of what our words really mean.
Understanding context has broken down barriers that previously prevented NLP techniques from making headway.

Tools: NLTK, spaCy, Google Colab, pandas, Gensim, Polyglot, scikit-learn, GloVe, word2vec, Word Embedding, WEVI, Google TensorFlow Projector, TensorFlow Keras

Languages: Python, R, Jupyter Notebook

Learning Outcomes:
Learn text mining and ways of extracting and reading data from common file types, including NLTK corpora
Understand some ways of text extraction and cleaning using NLTK
Analyze sentence structure, showing how groups of words form phrases and sentences, using NLP and the rules of English grammar
Explore text classification, vectorization techniques and processing using scikit-learn
Build a machine learning classifier for text classification
Word Embedding
Deep Learning Concepts
Language Modeling
The new era of pretrained NLP language models such as Google BERT, Facebook PyText, and ELMo

Bio: Bhairav Mehta is a Senior Data Scientist at Apple Inc. with extensive professional experience and academic background.

Bhairav Mehta is an experienced engineer, business professional, and seasoned statistician and programmer with 19 years of combined progressive experience: data science in the consumer electronics industry (7 years at Apple Inc.), yield engineering in semiconductor manufacturing (6 years at Qualcomm and an MIT startup), and quality engineering in the automotive industry (3 years with OEMs, Tier 2 suppliers, and Ford Motor Company). In 2014, Bhairav founded DataInquest Inc., a startup that specializes in training and consulting in Artificial Intelligence, Machine Learning, Blockchain, and Data Science.

Bhairav Mehta has an MBA from the Johnson School of Management at Cornell University, a Master's in Computer Science from Georgia Tech (expected 2018), a Master's in Statistics from Cornell University, a Master's in Industrial Systems Engineering from Rochester Institute of Technology, and a BS in Production Engineering from Mumbai University.