# Printing the loss during TensorFlow training

I am looking at the TensorFlow "MNIST For ML Beginners" tutorial, and I want to print out the training loss after every training step.

My training loop currently looks like this:

``````
for i in range(100):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
``````

Now, `train_step` is defined as:

``````
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
``````

Where `cross_entropy` is the loss which I want to print out:

``````
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
``````

One way to print this would be to explicitly compute `cross_entropy` in the training loop:

``````
for i in range(100):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
    print('loss = ' + str(cross_entropy))
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
``````

I now have two questions regarding this:

1. Given that `cross_entropy` is already computed during `sess.run(train_step, ...)`, it seems inefficient to compute it twice, requiring twice the number of forward passes of all the training data. Is there a way to access the value of `cross_entropy` when it was computed during `sess.run(train_step, ...)`?

2. How do I even print a `tf.Variable`? Using `str(cross_entropy)` gives me an error...

Thank you!

You can fetch the value of `cross_entropy` by adding it to the list of arguments to `sess.run(...)`. For example, your `for`-loop could be rewritten as follows:

``````
for i in range(100):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # Fetch the existing cross_entropy tensor in the same call that runs train_step.
    _, loss_val = sess.run([train_step, cross_entropy],
                           feed_dict={x: batch_xs, y_: batch_ys})
    # loss_val is a NumPy scalar, so format it rather than concatenating with +.
    print('loss = %s' % loss_val)
``````

The same approach can be used to print the current value of a variable. Suppose that, in addition to the value of `cross_entropy`, you wanted to print the value of a `tf.Variable` called `W`. You could do the following:

``````
for i in range(100):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # Fetch the loss and the current value of W in the same call as the training step.
    _, loss_val, W_val = sess.run([train_step, cross_entropy, W],
                                  feed_dict={x: batch_xs, y_: batch_ys})
    print('loss = %s' % loss_val)
    print('W = %s' % W_val)
``````
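
To see why fetching `cross_entropy` alongside `train_step` need not cost a second forward pass, here is a toy pure-Python model of a dataflow graph. It is only an illustrative sketch and not how TensorFlow is implemented: the `Node` class, the `run` function, and the `eval_count` counter are hypothetical names invented for this example. Within one `run` call, each node is computed at most once and its result is shared by every fetch that depends on it.

```python
class Node(object):
    """A toy graph node: a function applied to the values of its input nodes."""

    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn
        self.inputs = list(inputs)
        self.eval_count = 0  # how many times this node has been computed


def run(fetches, feed):
    """Evaluate every node in `fetches`, computing each node once per call."""
    cache = {}  # per-call results, keyed by node name

    def evaluate(node):
        if node.name in feed:  # fed values play the role of placeholders
            return feed[node.name]
        if node.name not in cache:
            args = [evaluate(dep) for dep in node.inputs]
            cache[node.name] = node.fn(*args)
            node.eval_count += 1
        return cache[node.name]

    return [evaluate(node) for node in fetches]


# Hypothetical stand-ins for the tutorial's graph: x -> loss -> train_step.
x = Node("x", None)  # always fed, so it needs no function of its own
loss = Node("loss", lambda v: (v - 3.0) ** 2, [x])
train_step = Node("train_step", lambda l: "updated using loss %.1f" % l, [loss])

_, loss_val = run([train_step, loss], feed={"x": 5.0})
print("loss = %s" % loss_val)
print("loss evaluated %d time(s)" % loss.eval_count)
```

Here `loss` is needed both by `train_step` and as a fetch in its own right, yet its counter only goes up by one per `run` call, which mirrors the behaviour described for `sess.run([train_step, cross_entropy], ...)`.
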
• Thanks. So every time I call `sess.run([train_step, cross_entropy])`, it still only computes `cross_entropy` once, right? It doesn't do an additional forward pass for each of the variables I pass?
• That's right - it executes the exact same subgraph (because `cross_entropy` is already calculated as part of the training step), and just adds an extra node to fetch the value of `cross_entropy` back to your Python program.
• Thanks. As a side point, after updating my code as you suggested, the value of `cross_entropy` does, on average, decrease over the loop. However, sometimes it actually increases from one training iteration to the next. This happens for a range of step sizes in the gradient descent. Is this expected? Wouldn't the loss always decrease after each iteration, because you are moving the weights in a direction which should reduce this loss? The graph of loss vs iteration is here: i.stack.imgur.com/f8B80.png
• That's to be expected - the loss will fluctuate as you pass in different training examples, but it should have an overall downward trend. If it starts to increase again, then you may be overfitting so you should investigate early stopping: en.wikipedia.org/wiki/Early_stopping
• Is this loss for a batch of 100 records? Should we divide by 100?
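
On the last question in the comments: `-tf.reduce_sum(y_ * tf.log(y))` sums over every example in the batch as well as every class, so the printed number is the total for the batch. Dividing by the batch size gives a per-example average that is easier to compare across batch sizes. A minimal pure-Python sketch of that arithmetic follows. The labels and probabilities are made-up illustrative values, and `summed_cross_entropy` is a hypothetical helper written for this example, not a TensorFlow function.

```python
import math


def summed_cross_entropy(labels, probs):
    """Return -sum(y_ * log(y)) over all examples and classes."""
    total = 0.0
    for label_row, prob_row in zip(labels, probs):
        for y_true, y_pred in zip(label_row, prob_row):
            total -= y_true * math.log(y_pred)
    return total


# Made-up batch of 2 examples with 3 classes (one-hot labels, softmax-like outputs).
labels = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]
probs = [[0.5, 0.25, 0.25],
         [0.25, 0.5, 0.25]]

batch_loss = summed_cross_entropy(labels, probs)  # total over the whole batch
mean_loss = batch_loss / len(labels)              # average per example

print("batch loss = %.4f" % batch_loss)
print("mean loss  = %.4f" % mean_loss)
```

In the training loop itself, the same effect can be had by fetching the loss as before and dividing `loss_val` by the batch size (100 here) in Python before printing it.
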

Instead of running only `train_step`, also run the `cross_entropy` node so that its value is returned to you. Remember that:

``````
var_as_a_python_value = sess.run(tensorflow_variable)
``````

will give you what you want, so you can do this:

``````
[_, cross_entropy_py] = sess.run([train_step, cross_entropy],
                                 feed_dict={x: batch_xs, y_: batch_ys})
``````

to both run the training step and pull out the value of the cross entropy as it was computed during that iteration. Note that I turned both the argument to `sess.run` and its return value into lists, so that the training step runs and the loss is returned in the same call.
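
Once the loss comes back from `sess.run` as an ordinary numeric value (a NumPy scalar), it can be logged or post-processed like any other number. As a sketch, the hypothetical helper below, which is not part of TensorFlow, keeps an exponential moving average of per-step losses. This can make the overall trend easier to read when individual batches are noisy. The loss values in the list are made up for illustration; in a real loop each one would be the `cross_entropy_py` returned by `sess.run`.

```python
def smooth(losses, decay=0.9):
    """Return the exponential moving average of a sequence of loss values."""
    smoothed = []
    average = None
    for loss in losses:
        if average is None:
            average = loss  # start from the first observation
        else:
            average = decay * average + (1.0 - decay) * loss
        smoothed.append(average)
    return smoothed


# Made-up per-step losses: noisy from step to step, but trending downward.
step_losses = [230.0, 180.0, 195.0, 150.0, 160.0, 120.0]

for step, (raw, avg) in enumerate(zip(step_losses, smooth(step_losses, decay=0.5))):
    print("step %d: loss = %.1f, smoothed = %.2f" % (step, raw, avg))
```

A larger `decay` smooths more aggressively but reacts more slowly to real changes in the loss.
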