1. Introduction to Deep Learning

Deep learning is a category of machine learning. Machine learning is a category of artificial intelligence. These notes are mostly about deep learning, thus the name of the book. Deep learning is the use of neural networks to classify and regress data (this is too narrow, but a good starting place). I am a chemical engineering professor though, so writing an introduction to deep learning is a hopeless task for me. I found the introduction from Ian Goodfellow’s book to be a good intro. If you’re more visually oriented, Grant Sanderson has made a short video series specifically about neural networks that gives an applied introduction to the topic. DeepMind has a high-level video showing what can be accomplished with deep learning & AI. When people write “deep learning is really cool” in their research papers, they typically cite this Nature paper by Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Zhang, Lipton, Li, and Smola have written a practical and example-driven online book that gives each example in TensorFlow, PyTorch, and MXNet. You can find many chemistry-specific examples and information about deep learning in chemistry via the excellent DeepChem project.

The main advice I would give to beginners in deep learning is to focus less on the neurologically-inspired language (i.e., connections between neurons), and instead view deep learning as a series of linear algebra operations where many of the matrices are filled with adjustable parameters. There are of course a few non-linear functions (activations) here and there, but deep learning is essentially linear algebra operations specified via a “computation graph” (a network) that vaguely looks like neurons connected in a brain.
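To make that concrete, here is a minimal NumPy sketch (my own illustration, not from the readings above) of a tiny two-layer network written purely as matrix operations; the sizes 4, 8, and 1 are arbitrary choices:

import numpy as np

x = np.random.rand(4)      # input features
W1 = np.random.rand(8, 4)  # adjustable parameters, layer 1
W2 = np.random.rand(1, 8)  # adjustable parameters, layer 2

h = np.tanh(W1 @ x)        # non-linearity (activation) between the matrix multiplies
y = W2 @ h                 # output: just more linear algebra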

1.1. Neural Networks

The deep in deep learning means we have many layers in our neural networks. What is a neural network? Without loss of generality, we can view a neural network as two components: (1) a non-linear function \(g(\cdot)\) that operates on our input features \(\mathbf{X}\) and outputs a new set of features \(\mathbf{H} = g(\mathbf{X})\), and (2) a linear model like the one we saw in our Machine Learning chapter. Our model equation for deep learning regression is:

(1.20)\[\begin{equation} y = \vec{w}g(\vec{x}) + b \end{equation}\]

One of the main discussion points in our ML chapters was how arcane and difficult it is to choose features. Here, we have replaced our features with a set of trainable features \(g(\vec{x})\) and then use the same linear model as before. So how do we design \(g(\vec{x})\)? That is the deep learning part. \(g(\vec{x})\) is a differentiable function we design composed of layers, which are themselves differentiable functions each with trainable weights (free variables). Deep learning is a mature field and there is a set of standard layers, each with a different purpose. For example, convolution layers look at a fixed neighborhood around each element of an input tensor. Dropout layers randomly inactivate inputs as a form of regularization. The most commonly used and basic layer is the dense or fully-connected layer.

A dense layer is defined by two things: the desired output feature shape and the activation. The equation is:

(1.21)\[\begin{equation} \vec{h} = \sigma(\mathbf{W}\vec{x} + \vec{b}) \end{equation}\]

where \(\mathbf{W}\) is a trainable \(F \times D\) matrix, where \(D\) is the input vector (\(\vec{x}\)) dimension and \(F\) is the output vector (\(\vec{h}\)) dimension, \(\vec{b}\) is a trainable \(F\)-dimensional vector, and \(\sigma(\cdot)\) is the activation function. \(F\) is an example of a hyperparameter: it is not trainable but is a problem-dependent choice. \(\sigma(\cdot)\) is another hyperparameter. In principle, any differentiable function with a domain of \((-\infty, \infty)\) can be used for activation. However, just a few activations have been found empirically to balance computational cost and effectiveness. One example we’ve seen before is the sigmoid. Another is the hyperbolic tangent, which behaves similarly to the sigmoid but has outputs in \((-1, 1)\). The most commonly used activation is the rectified linear unit (ReLU), which is

(1.22)\[\begin{equation} \sigma(x) = \left\{\begin{array}{lr} x & x > 0\\ 0 & \textrm{otherwise}\\ \end{array}\right. \end{equation}\]
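As a concrete illustration (a sketch of mine, not a standard layer implementation), a dense layer with ReLU activation is only a few lines of NumPy; the dimensions D = 4 and F = 3 are illustrative, and \(\mathbf{W}\) and \(\vec{b}\) would normally be trained rather than random:

import numpy as np

D, F = 4, 3
x = np.random.rand(D)
W = np.random.rand(F, D)       # F x D so that W @ x is F-dimensional
b = np.random.rand(F)

h = np.maximum(0, W @ x + b)   # ReLU applied elementwise, as in Eq. (1.22)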

1.1.1. Universal Approximation Theorem

One of the reasons that neural networks are a good choice for approximating unknown functions (\(f(\vec{x})\)) is that a neural network can approximate any continuous function given a large enough network depth (number of layers) or width (size of hidden layers). To be more specific, any one-dimensional function can be approximated by a depth-5 neural network with ReLU activation functions. The universal approximation theorem shows that neural networks are, in the limit of large depth or width, expressive enough to fit any function.
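As a quick, non-rigorous sketch of this idea, the toy example below fits \(\sin(x)\) on \([-3, 3]\) with a stack of five ReLU hidden layers plus a linear output in Keras; the width of 16, the Adam optimizer, and the 500 epochs are my own illustrative choices:

import numpy as np
import tensorflow as tf

x = np.linspace(-3, 3, 256)[:, None]
y = np.sin(x)

approx = tf.keras.Sequential(
    [tf.keras.layers.Dense(16, activation='relu') for _ in range(5)]
    + [tf.keras.layers.Dense(1)]
)
approx.compile(optimizer='adam', loss='mean_squared_error')
approx.fit(x, y, epochs=500, verbose=0)  # the loss should drop close to zero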

1.2. Frameworks

Deep learning has lots of “gotchas”: easy-to-make mistakes that make it difficult to implement things yourself. This is especially true with numerical stability issues, which only reveal themselves when your model fails to learn. We will move to a slightly more abstract software framework than JAX for some examples. We’ll use Keras, which is one of many possible choices for deep learning frameworks.

1.3. Discussion

When it comes to introducing deep learning, I will be as terse as possible. There are good learning resources out there. You should use some of the reading above and tutorials put out by Keras (or PyTorch) to get familiar with the concepts of neural networks and learning.

1.4. Revisiting the Solubility Model

We’ll see our first example of deep learning by revisiting the solubility dataset with a two-layer dense neural network.

1.5. Running This Notebook

Click the launch icon above to open this page as an interactive Google Colab notebook. See details below on installing packages, either in your own environment or on Google Colab.

The hidden cells below set up our imports and install the necessary packages.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib as mpl
import tensorflow as tf
import numpy as np
np.random.seed(0)
import warnings
warnings.filterwarnings('ignore')
sns.set_context('notebook')
sns.set_style('dark',  {'xtick.bottom':True, 'ytick.left':True, 'xtick.color': '#666666', 'ytick.color': '#666666',
                        'axes.edgecolor': '#666666', 'axes.linewidth':     0.8 })
color_cycle = ['#1BBC9B', '#F06060', '#5C4B51', '#F3B562', '#6e5687']
mpl.rcParams['axes.prop_cycle'] = mpl.cycler(color=color_cycle) 

1.5.1. Load Data

We download the data and load it into a Pandas data frame and then standardize our features as before.

soldata = pd.read_csv('https://dataverse.harvard.edu/api/access/datafile/3407241?format=original&gbrecs=true')
features_start_at = list(soldata.columns).index('MolWt')
feature_names = soldata.columns[features_start_at:]
# standardize the features
soldata[feature_names] -= soldata[feature_names].mean()
soldata[feature_names] /= soldata[feature_names].std()

1.6. Prepare Data for Keras

The deep learning libraries simplify many common tasks, like splitting data and building layers. The code below builds our dataset from NumPy arrays.

full_data = tf.data.Dataset.from_tensor_slices((soldata[feature_names].values, soldata['Solubility'].values))
N = len(soldata)
test_N = int(0.1 * N)
test_data = full_data.take(test_N).batch(16)
train_data = full_data.skip(test_N).batch(16)

Notice that we used skip and take to split our dataset into two pieces and then created batches of data.
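One caveat (my own note, not part of the original code): take and skip do not shuffle, so this split simply takes the first 10% of rows as the test set. If you want a random split instead, a minimal sketch is to shuffle once with a fixed seed before splitting:

shuffled = full_data.shuffle(buffer_size=N, seed=0, reshuffle_each_iteration=False)
test_data = shuffled.take(test_N).batch(16)
train_data = shuffled.skip(test_N).batch(16)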

1.7. Neural Network

Now we build our neural network model. In this case, our \(g(\vec{x}) = \sigma\left(\mathbf{W}^0\vec{x} + \vec{b}\right)\). We will call the function \(g(\vec{x})\) a hidden layer. This is because we do not observe its output. Remember, the solubility will be \(y = \vec{w}g(\vec{x}) + b\). We’ll choose our activation, \(\sigma(\cdot)\), to be tanh and the output dimension of the hidden layer to be 32. You can read more about this API here; however, you should be able to understand the process from the function names and comments.

# our hidden layer
# We only need to define the output dimension - 32.
hidden_layer =  tf.keras.layers.Dense(32, activation='tanh')
# Last layer - which we want to output one number
# the predicted solubility. 
output_layer = tf.keras.layers.Dense(1)

# Now we put the layers into a sequential model
model = tf.keras.Sequential()
model.add(hidden_layer)
model.add(output_layer)

# our model is complete

# Try out our model on first few datapoints
model(soldata[feature_names].values[:3])
WARNING:tensorflow:Layer dense is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because its dtype defaults to floatx.

If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.

To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.
<tf.Tensor: shape=(3, 1), dtype=float32, numpy=
array([[-1.0884409 ],
       [ 0.42499703],
       [-0.07747264]], dtype=float32)>

We can see our model predicting the solubility for 3 molecules above. There is a warning that our Pandas data is float64 (double precision floating point numbers) but our model is float32 (single precision), which doesn’t matter much here. It warns us because we are technically throwing out a little bit of precision, but the variance in our solubility data is much larger than the difference between 32-bit and 64-bit floating point numbers.
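If the mismatch bothers you, the warning itself suggests the fix. A minimal sketch of two options (use one, not both):

# option 1: make Keras default to double precision (call before building the model)
# tf.keras.backend.set_floatx('float64')

# option 2: cast the Pandas values down to single precision yourself
model(soldata[feature_names].values[:3].astype(np.float32))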

At this point, we’ve defined how our model should work and it can be called on data. Now we need to train it! We prepare the model for training by calling compile, which is where we define our optimization method (a flavor of stochastic gradient descent) and our loss function.

model.compile(optimizer='SGD', loss='mean_squared_error')
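The string arguments are just shorthand. An equivalent, more explicit form (a sketch; the learning rate shown is Keras’ default for SGD) would be:

model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
    loss=tf.keras.losses.MeanSquaredError(),
)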

Look back at how much work it took to set up the loss and optimization process previously! Now we can train our model.

model.fit(train_data, epochs=50)
Epoch 1/50
562/562 [==============================] - 0s 601us/step - loss: 2.0186
Epoch 2/50
562/562 [==============================] - 0s 610us/step - loss: 1.6199
Epoch 3/50
562/562 [==============================] - 0s 587us/step - loss: 1.5534
...
Epoch 48/50
562/562 [==============================] - 0s 652us/step - loss: 1.2093
Epoch 49/50
562/562 [==============================] - 0s 613us/step - loss: 1.2070
Epoch 50/50
562/562 [==============================] - 0s 604us/step - loss: 1.2048
<tensorflow.python.keras.callbacks.History at 0x7f53d80530d0>

That was quite simple!

For reference, the lowest loss we reached in our previous work was about 3. Training was also much faster here, thanks to the framework’s optimizations. Now let’s see how our model did on the test data.

# get model predictions on test data and get labels
# squeeze to remove extra dimensions
yhat = np.squeeze(model.predict(test_data))
test_y = soldata['Solubility'].values[:test_N]
plt.plot(test_y, yhat, '.')
plt.plot(test_y, test_y, '-')
plt.xlabel('Measured Solubility $y$')
plt.ylabel(r'Predicted Solubility $\hat{y}$')
plt.text(min(test_y) + 1, max(test_y) - 2, f'correlation = {np.corrcoef(test_y, yhat)[0,1]:.3f}')
plt.text(min(test_y) + 1, max(test_y) - 3, f'loss = {np.sqrt(np.mean((test_y - yhat)**2)):.3f}')
plt.show()
[Figure: parity plot of predicted vs. measured solubility on the test data, annotated with the correlation coefficient and loss.]

This performance is better than our simple linear model.
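If you prefer a number computed by the framework itself, a one-line sketch is to ask Keras for the compiled loss (mean squared error, so the square of the plotted loss) on the test batches:

test_loss = model.evaluate(test_data, verbose=0)
print(test_loss)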

1.8. Chapter Summary

  • Deep learning is a category of machine learning that utilizes neural networks for classification and regression of data.

  • Neural networks are a series of operations with matrices of adjustable parameters.

  • A neural network transforms input features into a new set of features that can be subsequently used for regression or classification.

  • The most common layer is the dense layer. Each input element affects each output element. It is defined by the desired output feature shape and the activation function.

  • With enough layers or wide enough hidden layers, neural networks can approximate unknown functions.

  • Hidden layers are called hidden because we do not directly observe their outputs.

  • Libraries such as TensorFlow make it easy both to split data into training and testing sets and to build the layers of a neural network.

  • Building a neural network allows us to predict various properties of molecules, such as solubility.