TensorFlow: Difference between revisions

(21 intermediate revisions by the same user not shown)

Line 2:

==Install==

* Install CUDA and CuDNN

* Create a conda environment with python 3.5+

** <code>conda create -n my_env python=3.8</code>

* Install with pip

===Install TF2===

See https://www.tensorflow.org/install/pip

Install tensorflow and [https://www.tensorflow.org/addons/overview tensorflow-addons]

<pre>

~~# Install cuda and cudnn if necessary~~

pip install tensorflow-addons

~~conda install cudatoolkit=11.0.221~~

~~pip install tensorflow tensorflow-addons~~

</pre>

~~* Run <code>conda search cudatoolkit</code> to see other versions of cuda available~~

;Notes

~~* On Windows, there is no cudnn available for cuda 11 in conda's repos. You will need to install this manually by downloading [https://developer.nvidia.com/cuDNN cudnn] and copying the binaries to the environment's <code>Library/bin/</code> directory.~~

* Note that [https://anaconda.org/anaconda/tensorflow anaconda/tensorflow] does not always have the latest version.

* If you prefer, you can install only cuda and cudnn from conda:

** See [https://www.tensorflow.org/install/source#linux https://www.tensorflow.org/install/source#linux] for a list of compatible Cuda and Cudnn versions.

** Run <code>conda search cudatoolkit</code> to see which versions of cuda are available

** Download [https://developer.nvidia.com/cuDNN cudnn] and copy the binaries to the environment's <code>Library/bin/</code> directory.

===Install TF1===

~~Note: You will only need TF1 if working with a TF1 repo.~~

The last official version of TensorFlow v1 is 1.15. This version does not work on RTX 3000+ (Ampere) GPUs. Your code will run but output bad results.<br>

~~If migrating your old code, you can install TF2 and use:~~

If you need TensorFlow v1, see [https://github.com/NVIDIA/tensorflow nvidia-tensorflow].

~~* <code>import tensorflow.compat.v1 as tf</code>~~

~~* See [https://www.tensorflow.org/guide/migrate TF Guide Migrate]~~

~~See [https://www.tensorflow.org/install/source#linux https://www.tensorflow.org/install/source#linux] for a list of compatible Cuda and Cudnn versions.~~

<pre>

~~# Install compatible cuda and cudnn versions.~~

pip install nvidia-pyindex

~~conda install cudatoolkit=10.0.130 cudnn=7.6.5~~

pip install nvidia-tensorflow

~~# Install tensorflow~~

~~pip install tensorflow-gpu==1.15~~

~~# Test GPU support~~

~~python -c "import tensorflow as tf;print(tf.test.is_gpu_available())"~~

</pre>

~~;Notes~~

* Sometimes, I get <code>CUDNN_STATUS_INTERNAL_ERROR</code>. This is fixed by setting the environment variable <code>TF_FORCE_GPU_ALLOW_GROWTH=true</code> in my conda env. See [https://stackoverflow.com/questions/46826497/conda-set-ld-library-path-for-env-only Add env variables to conda env]
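As an alternative to storing the variable in the conda environment, it can also be set from Python. The following is a minimal sketch, assuming it runs before TensorFlow is imported so the setting is in place when the GPU is first initialized. The helper name <code>enable_gpu_memory_growth</code> is hypothetical and not part of TensorFlow.

```python
import os


def enable_gpu_memory_growth() -> str:
    """Set TF_FORCE_GPU_ALLOW_GROWTH unless it is already defined."""
    # setdefault keeps any value already exported in the shell
    # or configured in the conda environment.
    return os.environ.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")


# Call this before `import tensorflow` so the variable is visible
# when TensorFlow first initializes the GPU.
value = enable_gpu_memory_growth()
print("TF_FORCE_GPU_ALLOW_GROWTH =", value)
```

Using <code>setdefault</code> means an explicit value from the shell still wins over the script.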

==Usage (TF2)==

Here we'll cover usage of TensorFlow 2, which has eager execution.<br>

This is using the Keras API in tensorflow.keras.

===~~Basics~~===

===Keras Pipeline===

[https://www.tensorflow.org/api_docs/python/tf/keras/Model tf.keras.Model]

The general pipeline using Keras is:

* Define a model, typically using [https://www.tensorflow.org/api_docs/python/tf/keras/Sequential tf.keras.Sequential]

* Call [https://www.tensorflow.org/api_docs/python/tf/keras/Model#compile <code>model.compile</code>]

** Here you pass in your optimizer, loss function, and metrics.

* Train your model by calling <code>model.fit</code>

* Train your model by calling [https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit <code>model.fit</code>]

** Here you pass in your training data, batch size, number of epochs, and training callbacks

** For more information about callbacks, see [https://www.tensorflow.org/guide/keras/custom_callback Keras custom callbacks].

After training, you can use your model by calling <code>model.evaluate</code>

After training, you can use your model by calling [https://www.tensorflow.org/api_docs/python/tf/keras/Model#evaluate <code>model.evaluate</code>]
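The pipeline above can be sketched end to end as follows. This is only an illustration: the synthetic data, layer sizes, learning rate, and the choice of <code>EarlyStopping</code> as the callback are assumptions made for the example, not values from this page.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Synthetic data (assumed for illustration): 256 samples, 20 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 20)).astype("float32")
y = rng.integers(0, 3, size=256)

# 1. Define a model.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(3),  # raw logits, no activation
])

# 2. Compile: pass in the optimizer, loss function, and metrics.
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# 3. Train: pass in the data, batch size, number of epochs, and callbacks.
history = model.fit(
    x, y,
    batch_size=32,
    epochs=2,
    callbacks=[keras.callbacks.EarlyStopping(monitor="loss", patience=1)],
    verbose=0,
)

# 4. Evaluate: returns the loss followed by each compiled metric.
loss, accuracy = model.evaluate(x, y, verbose=0)
print("loss:", loss, "accuracy:", accuracy)
```

In practice you would evaluate on held-out data rather than on the training set used here.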

===Custom Models===

Line 68:

Line 54:

You can write your own training loop by doing the following:

import tensorflow as tf

from tensorflow import keras

my_model= keras.Sequential([

my_model = keras.Sequential([

~~keras.layers.Dense(400, input_shape=400, activation='relu'),~~

keras.Input(shape=(400,)),

~~keras.layers.Dense(400, activation='relu'),~~

~~keras.layers.Dense(400, activation='relu'),~~

keras.layers.Dense(400, activation='relu'),

Line 159:

Line 145:

==Usage (TF1)==

~~In TF1, you first build a computational graph by chaining commands with placeholder.~~

In TF1, you first build a computational graph by chaining commands with placeholders and constant variables.

~~Then, you execute the graph in a tf session.~~

Then, you execute the graph in a <code>tf.Session()</code>.

{{hidden | TF1 MNIST Example |

<syntaxhighlight lang="python">
import tensorflow as tf
from tensorflow import keras
import numpy as np

NUM_EPOCHS = 10
BATCH_SIZE = 64

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

rng = np.random.default_rng()

classification_model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Conv2D(16, 3, padding="SAME"),
    keras.layers.ReLU(),
    keras.layers.Conv2D(16, 3, padding="SAME"),
    keras.layers.ReLU(),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='relu'),
])

x_in = tf.compat.v1.placeholder(dtype=tf.float32, shape=(None, 28, 28, 1))
logits = classification_model(x_in)
gt_classes = tf.compat.v1.placeholder(dtype=tf.int32, shape=(None,))

loss = tf.losses.softmax_cross_entropy(tf.one_hot(gt_classes, 10), logits)
optimizer = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(loss)

with tf.compat.v1.Session() as sess:
    sess.run(tf.compat.v1.global_variables_initializer())
    global_step = 0
    for epoch in range(NUM_EPOCHS):
        x_count = x_train.shape[0]
        image_ordering = rng.choice(range(x_count), x_count, replace=False)
        current_idx = 0
        while current_idx < x_count:
            my_indices = image_ordering[current_idx:min(current_idx + BATCH_SIZE, x_count)]
            x = x_train[my_indices]
            x = x[:, :, :, None] / 255
            logits_val, loss_val, _ = sess.run((logits, loss, optimizer), {
                x_in: x,
                gt_classes: y_train[my_indices]
            })
            if global_step % 100 == 0:
                print("Loss", loss_val)
            current_idx += BATCH_SIZE
            global_step += 1
</syntaxhighlight>

}}
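The shuffled mini-batch indexing used inside the training loop of the MNIST example can be isolated as a small NumPy sketch. The generator name <code>iterate_minibatches</code> is hypothetical, and <code>rng.permutation</code> is swapped in for <code>rng.choice(..., replace=False)</code>; both produce a random ordering of all indices.

```python
import numpy as np


def iterate_minibatches(num_items, batch_size, rng):
    """Yield index arrays that cover every item exactly once in random order."""
    ordering = rng.permutation(num_items)
    for start in range(0, num_items, batch_size):
        # Slicing past the end is safe, so the last batch may be smaller.
        yield ordering[start:start + batch_size]


rng = np.random.default_rng(0)
batches = list(iterate_minibatches(num_items=10, batch_size=4, rng=rng))
print([len(b) for b in batches])  # batch sizes: [4, 4, 2]
```

Each yielded array can be used directly as a fancy index, for example <code>x_train[my_indices]</code>.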

===Batch Normalization===

See [https://www.tensorflow.org/api_docs/python/tf/compat/v1/layers/batch_normalization <code>tf.compat.v1.layers.batch_normalization</code>]

When training with batchnorm, you need to run the ops in the <code>tf.GraphKeys.UPDATE_OPS</code> collection in your session; otherwise the batchnorm moving mean and variance are never updated.

These variables do not contribute to the loss when <code>training</code> is true, so they will not be updated by the optimizer.

<syntaxhighlight lang="python">

update_ops = tf.compat.v1.get_collection(tf.GraphKeys.UPDATE_OPS)

train_op = optimizer.minimize(loss)

train_op = tf.group([train_op, update_ops])

</syntaxhighlight>
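To see why the optimizer alone never touches these variables, here is a minimal NumPy sketch of one training-mode batchnorm step. It is not TensorFlow's implementation: the function name <code>batchnorm_train_step</code> is hypothetical, the learned scale and offset are omitted, and the <code>momentum</code> and <code>eps</code> values are assumed for illustration.

```python
import numpy as np


def batchnorm_train_step(x, moving_mean, moving_var, momentum=0.99, eps=1e-3):
    """Normalize x with batch statistics and return updated moving statistics."""
    batch_mean = x.mean(axis=0)
    batch_var = x.var(axis=0)
    # The training output depends only on the batch statistics,
    # so the moving statistics receive no gradient from the loss.
    y = (x - batch_mean) / np.sqrt(batch_var + eps)
    # The moving statistics are refreshed as a side effect (an exponential
    # moving average); this is the role played by the update ops.
    new_mean = momentum * moving_mean + (1.0 - momentum) * batch_mean
    new_var = momentum * moving_var + (1.0 - momentum) * batch_var
    return y, new_mean, new_var


x = np.array([[1.0, 2.0], [3.0, 6.0]])
y, mean, var = batchnorm_train_step(x, np.zeros(2), np.ones(2))
print("moving mean:", mean)
print("moving variance:", var)
```

If the side-effect update is skipped, the moving statistics keep their initial values, which is what happens in TF1 when the update ops are not run.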

@@ Line 2: / Line 2: @@
 ==Install==
-* Install CUDA and CuDNN
-* Create a conda environment with python 3.5+
-** <code>conda create -n my_env python=3.8</code>
-* Install with pip
 ===Install TF2===
+See https://www.tensorflow.org/install/pip
 Install tensorflow and [https://www.tensorflow.org/addons/overview tensorflow-addons]
 <pre>
-# Install cuda and cudnn if necessary
+pip install tensorflow-addons
-conda install cudatoolkit=11.0.221
-pip install tensorflow tensorflow-addons
 </pre>
-* Run <code>conda search cudatoolkit</code> to see other versions of cuda available
+;Notes
-* On Windows, there is no cudnn available for cuda 11 in conda's repos. You will need to install this manually by downloading [https://developer.nvidia.com/cuDNN cudnn] and copying the binaries to the environment's <code>Library/bin/</code> directory.
+* Note that [https://anaconda.org/anaconda/tensorflow anaconda/tensorflow] does not always have the latest version.
+* If you prefer, you can install only cuda and cudnn from conda:
+** See [https://www.tensorflow.org/install/source#linux https://www.tensorflow.org/install/source#linux] for a list of compatible Cuda and Cudnn versions.
+** Run <code>conda search cudatoolkit</code> to see which versions of cuda are available
+** Download [https://developer.nvidia.com/cuDNN cudnn] and copy the binaries to the environment's <code>Library/bin/</code> directory.
 ===Install TF1===
-Note: You will only need TF1 if working with a TF1 repo.
+The last official version of TensorFlow v1 is 1.15. This version does not work on RTX 3000+ (Ampere) GPUs. Your code will run but output bad results.<br>
-If migrating your old code, you can install TF2 and use:
+If you need TensorFlow v1, see [https://github.com/NVIDIA/tensorflow nvidia-tensorflow].
-* <code>import tensorflow.compat.v1 as tf</code>
-* See [https://www.tensorflow.org/guide/migrate TF Guide Migrate]
-See [https://www.tensorflow.org/install/source#linux https://www.tensorflow.org/install/source#linux] for a list of compatible Cuda and Cudnn versions.
 <pre>
-# Install compatible cuda and cudnn versions.
+pip install nvidia-pyindex
-conda install cudatoolkit=10.0.130 cudnn=7.6.5
+pip install nvidia-tensorflow
-# Install tensorflow
-pip install tensorflow-gpu==1.15
-# Test GPU support
-python -c "import tensorflow as tf;print(tf.test.is_gpu_available())"
 </pre>
-;Notes
-* Sometimes, I get <code>CUDNN_STATUS_INTERNAL_ERROR</code>. This is fixed by setting the environment variable <code>TF_FORCE_GPU_ALLOW_GROWTH=true</code> in my conda env. See [https://stackoverflow.com/questions/46826497/conda-set-ld-library-path-for-env-only Add env variables to conda env]
 ==Usage (TF2)==
 Here we'll cover usage of TensorFlow 2, which has eager execution.<br>
 This is using the Keras API in tensorflow.keras.
-===Basics===
+===Keras Pipeline===
+[https://www.tensorflow.org/api_docs/python/tf/keras/Model tf.keras.Model]
 The general pipeline using Keras is:
 * Define a model, typically using [https://www.tensorflow.org/api_docs/python/tf/keras/Sequential tf.keras.Sequential]
 * Call [https://www.tensorflow.org/api_docs/python/tf/keras/Model#compile <code>model.compile</code>]
 ** Here you pass in your optimizer, loss function, and metrics.
-* Train your model by calling <code>model.fit</code>
+* Train your model by calling [https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit <code>model.fit</code>]
 ** Here you pass in your training data, batch size, number of epochs, and training callbacks
 ** For more information about callbacks, see [https://www.tensorflow.org/guide/keras/custom_callback Keras custom callbacks].
-After training, you can use your model by calling <code>model.evaluate</code>
+After training, you can use your model by calling [https://www.tensorflow.org/api_docs/python/tf/keras/Model#evaluate <code>model.evaluate</code>]
 ===Custom Models===
@@ Line 68: / Line 54: @@
 You can write your own training loop by doing the following:
 <syntaxhighlight lang="python">
+import tensorflow as tf
+from tensorflow import keras
-my_model= keras.Sequential([
+my_model = keras.Sequential([
-     keras.layers.Dense(400, input_shape=400, activation='relu'),
+     keras.Input(shape=(400,)),
-    keras.layers.Dense(400, activation='relu'),
-    keras.layers.Dense(400, activation='relu'),
      keras.layers.Dense(400, activation='relu'),
      keras.layers.Dense(400, activation='relu'),
@@ Line 159: / Line 145: @@
 ==Usage (TF1)==
-In TF1, you first build a computational graph by chaining commands with placeholder.
+In TF1, you first build a computational graph by chaining commands with placeholders and constant variables.
-Then, you execute the graph in a tf session.
+Then, you execute the graph in a <code>tf.Session()</code>.
+{{hidden | TF1 MNIST Example |
 <syntaxhighlight lang="python">
 import tensorflow as tf
+from tensorflow import keras
+import numpy as np
+NUM_EPOCHS = 10
+BATCH_SIZE = 64
+(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
+rng = np.random.default_rng()
+classification_model = keras.Sequential([
+    keras.Input(shape=(28, 28, 1)),
+    keras.layers.Conv2D(16, 3, padding="SAME"),
+    keras.layers.ReLU(),
+    keras.layers.Conv2D(16, 3, padding="SAME"),
+    keras.layers.ReLU(),
+    keras.layers.Flatten(),
+    keras.layers.Dense(10, activation='relu'),
+])
+x_in = tf.compat.v1.placeholder(dtype=tf.float32, shape=(None, 28, 28, 1))
+logits = classification_model(x_in)
+gt_classes = tf.compat.v1.placeholder(dtype=tf.int32, shape=(None,))
+loss = tf.losses.softmax_cross_entropy(tf.one_hot(gt_classes, 10), logits)
+optimizer = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(loss)
+with tf.compat.v1.Session() as sess:
+    sess.run(tf.compat.v1.global_variables_initializer())
+    global_step = 0
+    for epoch in range(NUM_EPOCHS):
+        x_count = x_train.shape[0]
+        image_ordering = rng.choice(range(x_count), x_count, replace=False)
+        current_idx = 0
+        while current_idx < x_count:
+            my_indices = image_ordering[current_idx:min(current_idx + BATCH_SIZE, x_count)]
+            x = x_train[my_indices]
+            x = x[:, :, :, None] / 255
+            logits_val, loss_val, _ = sess.run((logits, loss, optimizer), {
+                x_in: x,
+                gt_classes: y_train[my_indices]
+            })
+            if global_step % 100 == 0:
+                print("Loss", loss_val)
+            current_idx += BATCH_SIZE
+            global_step += 1
+</syntaxhighlight>
+}}
+===Batch Normalization===
+See [https://www.tensorflow.org/api_docs/python/tf/compat/v1/layers/batch_normalization <code>tf.compat.v1.layers.batch_normalization</code>]
+When training with batchnorm, you need to run the ops in the <code>tf.GraphKeys.UPDATE_OPS</code> collection in your session; otherwise the batchnorm moving mean and variance are never updated.
+These variables do not contribute to the loss when <code>training</code> is true, so they will not be updated by the optimizer.
+<syntaxhighlight lang="python">
+update_ops = tf.compat.v1.get_collection(tf.GraphKeys.UPDATE_OPS)
+train_op = optimizer.minimize(loss)
+train_op = tf.group([train_op, update_ops])
 </syntaxhighlight>