TensorFlow

 


==Install==
* Install CUDA and cuDNN
* Create a conda environment with python 3.5+
** <code>conda create -n my_env python=3.8</code>
* Install with pip


===Install TF2===
See https://www.tensorflow.org/install/pip
Install tensorflow and [https://www.tensorflow.org/addons/overview tensorflow-addons]:
<pre>
conda install tensorflow-gpu
pip install tensorflow-addons
</pre>
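After installing, you can sanity-check that TensorFlow actually sees the GPU (a minimal check, assuming a TF2 install):

```python
import tensorflow as tf

# Returns a (possibly empty) list of PhysicalDevice objects;
# an empty list means the CUDA/cuDNN setup was not picked up.
gpus = tf.config.list_physical_devices("GPU")
print(gpus)
```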


;Notes
* Note that [https://anaconda.org/anaconda/tensorflow anaconda/tensorflow] does not always have the latest version.
* If you prefer, you can install only CUDA and cuDNN from conda:
** See [https://www.tensorflow.org/install/source#linux the tested build configurations] for a list of compatible CUDA and cuDNN versions.


===Install TF1===
The last official version of TensorFlow v1 is 1.15. This version does not work on RTX 3000+ (Ampere) GPUs: your code will run but silently produce incorrect results.<br>
If you need TensorFlow v1, see [https://github.com/NVIDIA/tensorflow nvidia-tensorflow].
<pre>
# Stock TF1 (does not work on Ampere GPUs):
conda install tensorflow-gpu=1.15
# Or NVIDIA's maintained fork:
pip install nvidia-pyindex
pip install nvidia-tensorflow
</pre>
;Notes
* Conda will automatically install a compatible CUDA and cuDNN into the conda environment. Your host OS only needs a sufficiently new version of the NVIDIA drivers installed.
* Sometimes, I get <code>CUDNN_STATUS_INTERNAL_ERROR</code>. This is fixed by setting the environment variable <code>TF_FORCE_GPU_ALLOW_GROWTH=true</code> in my conda env. See [https://stackoverflow.com/questions/46826497/conda-set-ld-library-path-for-env-only Add env variables to conda env]
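As an alternative to setting the variable on the conda env, the same setting can be applied from Python (a sketch; it must run before TensorFlow initializes the GPU, ideally before importing tensorflow):

```python
import os

# Must be set before the first GPU op runs,
# or TensorFlow will have already allocated all GPU memory.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
```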


==Usage (TF2)==
Here we'll cover usage of TensorFlow 2, which has eager execution.<br>
This uses the Keras API in <code>tf.keras</code>.
===Keras Pipeline===
[https://www.tensorflow.org/api_docs/python/tf/keras/Model tf.keras.Model]
 
The general pipeline using Keras is:
* Define a model, typically using [https://www.tensorflow.org/api_docs/python/tf/keras/Sequential tf.keras.Sequential]
* Call [https://www.tensorflow.org/api_docs/python/tf/keras/Model#compile <code>model.compile</code>]
** Here you pass in your optimizer, loss function, and metrics.
* Train your model by calling [https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit <code>model.fit</code>]
** Here you pass in your training data, batch size, number of epochs, and training callbacks.
** For more information about callbacks, see [https://www.tensorflow.org/guide/keras/custom_callback Keras custom callbacks].

After training, you can evaluate your model on test data by calling [https://www.tensorflow.org/api_docs/python/tf/keras/Model#evaluate <code>model.evaluate</code>]
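The pipeline above can be sketched end to end; the architecture, data, and hyperparameters here are stand-ins, not a recommendation:

```python
import numpy as np
import tensorflow as tf

# Define a small classifier with the Sequential API.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Pass in the optimizer, loss function, and metrics.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train on random stand-in data.
x = np.random.rand(64, 4).astype("float32")
y = np.random.randint(0, 3, size=(64,))
model.fit(x, y, batch_size=8, epochs=1, verbose=0)

# evaluate returns the loss followed by each compiled metric.
loss, acc = model.evaluate(x, y, verbose=0)
```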


===Custom Models===


==Usage (TF1)==
In TF1, you first build a computational graph by chaining operations on placeholders, variables, and constants.
Then, you execute the graph in a <code>tf.Session()</code>.
{{hidden | TF1 MNIST Example |
<syntaxhighlight lang="python">
import numpy as np
import tensorflow as tf

# Load MNIST and flatten the images.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

# Build the graph: placeholders for inputs, Variables for parameters.
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.int64, [None])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(x, W) + b

loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
accuracy = tf.reduce_mean(
    tf.cast(tf.equal(tf.argmax(logits, axis=1), y), tf.float32))

# Execute the graph in a session.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        idx = np.random.choice(len(x_train), 100)
        sess.run(train_op, feed_dict={x: x_train[idx], y: y_train[idx]})
    print("test accuracy:",
          sess.run(accuracy, feed_dict={x: x_test, y: y_test}))
</syntaxhighlight>
}}
===Batch Normalization===
See [https://www.tensorflow.org/api_docs/python/tf/compat/v1/layers/batch_normalization <code>tf.compat.v1.layers.batch_normalization</code>]
When training with batchnorm, you need to run the ops in the <code>tf.GraphKeys.UPDATE_OPS</code> collection in your session, or the batchnorm moving statistics will never be updated.
These moving averages do not contribute to the loss during training, so the optimizer will not update them on its own.
<syntaxhighlight lang="python">
# Collect the moving-average update ops created by batch_normalization
# and group them with the train op so they run on every step.
update_ops = tf.compat.v1.get_collection(tf.compat.v1.GraphKeys.UPDATE_OPS)
train_op = optimizer.minimize(loss)
train_op = tf.group([train_op] + update_ops)
</syntaxhighlight>


==Estimators==