** If you have a list of modules, wrap them in <code>nn.ModuleList</code> or <code>nn.Sequential</code> so their parameters are properly registered with the parent module.
* Write a <code>forward</code> method for your model; see the sketch below.
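A minimal sketch of such a module; the <code>MyModel</code> name, the layer sizes, and the number of blocks are made up for illustration:

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self, num_blocks=3, hidden=64):
        super().__init__()
        # Wrapping the list in nn.ModuleList registers every block's
        # parameters with the parent module (a plain Python list would not).
        self.blocks = nn.ModuleList(
            [nn.Linear(hidden, hidden) for _ in range(num_blocks)]
        )
        self.activation = nn.ReLU()

    def forward(self, x):
        # The forward pass defines how data flows through the registered modules.
        for block in self.blocks:
            x = self.activation(block(x))
        return x

model = MyModel()
out = model(torch.randn(8, 64))
</syntaxhighlight>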
==Multi-GPU Training==
See [https://pytorch.org/tutorials/beginner/former_torchies/parallelism_tutorial.html Multi-GPU Examples].

The basic idea is to wrap blocks in [https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html#torch.nn.DataParallel <code>nn.DataParallel</code>].
This automatically replicates the module on each available GPU and splits each input batch across them during the forward pass.
However, the wrapper hides the underlying module, so custom methods and attributes are only reachable through <code>model.module</code>.
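A minimal sketch of the wrapping step, using a throwaway <code>nn.Sequential</code> model with arbitrary sizes for illustration:

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

# Throwaway model; the layer sizes are arbitrary.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

if torch.cuda.device_count() > 1:
    # Replicates the module on every visible GPU and splits each input
    # batch along dim 0 during the forward pass.
    model = nn.DataParallel(model)
if torch.cuda.is_available():
    model = model.cuda()

x = torch.randn(32, 64)
if torch.cuda.is_available():
    x = x.cuda()
out = model(x)

# After wrapping, the original module is reachable as model.module,
# which is where any custom methods and attributes now live.
</syntaxhighlight>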
===nn.parallel.DistributedDataParallel===
[https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html#torch.nn.parallel.DistributedDataParallel nn.parallel.DistributedDataParallel]
[https://pytorch.org/docs/stable/notes/cuda.html#cuda-nn-ddp-instead DistributedDataParallel vs DataParallel]

The PyTorch documentation recommends using this instead of <code>nn.DataParallel</code>, even for single-machine training.
The main difference is that it uses one process per GPU instead of multithreading, which sidesteps contention on the Python Global Interpreter Lock (GIL).
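A minimal single-machine sketch following the pattern in the linked documentation; it assumes one process per GPU launched with <code>torchrun --nproc_per_node=NUM_GPUS script.py</code> (which sets <code>RANK</code>, <code>LOCAL_RANK</code>, and <code>WORLD_SIZE</code>) and the NCCL backend, with a placeholder model:

<syntaxhighlight lang="python">
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # One process per GPU; torchrun provides the rendezvous environment
    # variables that init_process_group reads by default.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model; replace with your own module.
    model = nn.Linear(64, 10).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    # Training loop goes here; gradients are all-reduced across
    # processes automatically during backward().

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
</syntaxhighlight>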
==Memory Usage==