Course Detail Information

MSE6501J – Machine Learning in Molecular and Materials Sciences


Instructors:

Wendong Wang

Credits: 3 credits

Pre-requisites: Graduate standing. Upper-year undergraduates may enroll with permission.

Description:

Machine learning has a growing impact on molecular and materials sciences. On the one hand,
machine learning provides new perspectives on how we record and analyze the structures and
properties of molecules and materials. On the other hand, molecular and materials sciences are fertile
ground for applying a wide range of machine learning techniques. This course aims to be an
interdisciplinary bridge between data science and molecular/materials sciences. You will learn or
review core machine learning techniques and concepts, including neural networks, convolutional
neural networks, graphs, featurization, regression, and classification. You will also learn or review
techniques and concepts in molecular and materials sciences, including molecular conformation,
chirality, protein folding, symmetry groups, and electron diffraction. Some familiarity with Python is
recommended but not required. We will use DeepChem as the main package for the course.

Course Topics:

Introduction to Python & your first DeepChem model and solubility
Introduction to neural networks & understanding datasets
More on neural networks & introduction to MoleculeNet
Basics of NumPy and molecular fingerprints
Introduction to convolutional neural networks & creating models with TensorFlow
More on convolutional neural networks & molecules as graphs
Introduction to matplotlib & more on molecular featurization
Building a neural net from scratch
Backpropagation, splitters, and hyperparameter tuning
Introduction to pandas and PubChemPy, working with experimental data, introduction to model interpretability
Introduction to scikit-learn, unsupervised learning, putting multitask learning to work, paper presentations from the literature
Unsupervised embeddings for molecules, modeling protein-ligand interactions, generative adversarial networks
Large-scale chemical screens, introduction to bioinformatics, synthetic feasibility scoring
Uncertainty in deep learning, transfer learning with ChemBERTa, reinforcement learning, paper presentations from the literature
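
To give a flavor of the "Building a neural net from scratch" and backpropagation topics listed above, here is a minimal sketch in plain Python. It is not course material: the 2-3-1 network shape, the toy dataset, the learning rate, and every function name are assumptions made for illustration. It computes gradients by backpropagation, checks one of them against a finite-difference estimate, and then runs a few hundred steps of stochastic gradient descent.

```python
import copy
import math
import random


def init_params(seed=0):
    """Small random weights for a hypothetical 2-input, 3-hidden, 1-output net."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)]
    b1 = [0.0] * 3
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(3)]
    b2 = 0.0
    return [W1, b1, W2, b2]


def forward(params, x):
    """Return hidden activations (tanh) and the linear output."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, y


def loss(params, x, t):
    """Squared-error loss for one sample."""
    return 0.5 * (forward(params, x)[1] - t) ** 2


def backward(params, x, t):
    """Backpropagation: apply the chain rule from the output back to W1."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    dy = y - t                                   # dL/dy
    dW2 = [dy * hi for hi in h]                  # dL/dW2
    # tanh'(z) = 1 - tanh(z)^2, so dL/dz = dL/dh * (1 - h^2)
    dz = [dy * w * (1.0 - hi * hi) for w, hi in zip(W2, h)]
    dW1 = [[d * xi for xi in x] for d in dz]     # dL/dW1
    return [dW1, dz, dW2, dy]                    # dz is also dL/db1, dy is dL/db2


def sgd_step(params, grads, lr):
    """Update the parameters in place with one gradient-descent step."""
    W1, b1, W2, b2 = params
    dW1, db1, dW2, db2 = grads
    for i in range(len(W1)):
        for j in range(len(W1[i])):
            W1[i][j] -= lr * dW1[i][j]
        b1[i] -= lr * db1[i]
        W2[i] -= lr * dW2[i]
    params[3] = b2 - lr * db2


def numerical_grad_W1(params, x, t, i, j, eps=1e-6):
    """Central finite-difference estimate of dL/dW1[i][j]."""
    plus, minus = copy.deepcopy(params), copy.deepcopy(params)
    plus[0][i][j] += eps
    minus[0][i][j] -= eps
    return (loss(plus, x, t) - loss(minus, x, t)) / (2 * eps)


# Toy dataset (an assumption for illustration): target = x1 - x2.
DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], -1.0),
        ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]


def total_loss(params):
    return sum(loss(params, x, t) for x, t in DATA)


params = init_params()

# Gradient check on a single weight before training.
x0, t0 = DATA[2]
analytic = backward(params, x0, t0)[0][0][0]
numeric = numerical_grad_W1(params, x0, t0, 0, 0)

loss_before = total_loss(params)
for _ in range(500):
    for x, t in DATA:
        sgd_step(params, backward(params, x, t), lr=0.1)
loss_after = total_loss(params)

print("gradient check difference:", abs(analytic - numeric))
print("total loss before and after training:", loss_before, loss_after)
```

The gradient check is the useful habit here: if the analytic and finite-difference values disagree, the backward pass has a bug. Libraries such as DeepChem and TensorFlow compute these gradients automatically, so hand-written backpropagation like this is mainly a learning exercise.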