TensorBoard

Revision as of 14:22, 11 November 2020 by David (talk | contribs) (→‎CLI Usage)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
\( \newcommand{\P}[]{\unicode{xB6}} \newcommand{\AA}[]{\unicode{x212B}} \newcommand{\empty}[]{\emptyset} \newcommand{\O}[]{\emptyset} \newcommand{\Alpha}[]{Α} \newcommand{\Beta}[]{Β} \newcommand{\Epsilon}[]{Ε} \newcommand{\Iota}[]{Ι} \newcommand{\Kappa}[]{Κ} \newcommand{\Rho}[]{Ρ} \newcommand{\Tau}[]{Τ} \newcommand{\Zeta}[]{Ζ} \newcommand{\Mu}[]{\unicode{x039C}} \newcommand{\Chi}[]{Χ} \newcommand{\Eta}[]{\unicode{x0397}} \newcommand{\Nu}[]{\unicode{x039D}} \newcommand{\Omicron}[]{\unicode{x039F}} \DeclareMathOperator{\sgn}{sgn} \def\oiint{\mathop{\vcenter{\mathchoice{\huge\unicode{x222F}\,}{\unicode{x222F}}{\unicode{x222F}}{\unicode{x222F}}}\,}\nolimits} \def\oiiint{\mathop{\vcenter{\mathchoice{\huge\unicode{x2230}\,}{\unicode{x2230}}{\unicode{x2230}}{\unicode{x2230}}}\,}\nolimits} \)

TensorBoard is a way to visualize your model and various statistics during or after training.

CLI Usage

CLI

tensorboard --logdir [logs]
Flags
  • --samples_per_plugin indices the number of samples to show for each tab. Non-scalar objects are sampled using reservoir sampling.
    • --samples_per_plugin images=10000 samples approximately 10000 images.

Training Usage

If you're using a custom training loop (i.e. gradient tape), then you'll need to set everything up manually.

First create a SummaryWriter

train_log_dir = os.path.join(args.checkpoint_dir, "logs", "train")
train_summary_writer = tf.summary.create_file_writer(train_log_dir)

Scalars

Add scalars using tf.summary.scalar:

with train_summary_writer.as_default():
  tf.summary.scalar("training_loss", m_loss.numpy(), step=int(ckpt.step))

PyTorch

PyTorch also supports output tensorboard logs.
See https://pytorch.org/docs/stable/tensorboard.html.
There is also lanpa/tensorboardX but I haven't tried it.

from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter(log_dir="./runs")

writer.add_scalar("train_loss", loss_np, step)

# Optionally flush e.g. at checkpoints
writer.flush()

# Close the writer (will flush)
writer.close()

Resources