==Multi-GPU Training==
See [https://pytorch.org/tutorials/beginner/former_torchies/parallelism_tutorial.html Multi-GPU Examples].
===nn.DataParallel===
The basic idea is to wrap blocks in [https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html#torch.nn.DataParallel <code>nn.DataParallel</code>].
However, doing so causes you to lose direct access to custom methods and attributes; they remain reachable through <code>model.module</code>.
To save and load the model, use <code>torch.save(model.module.state_dict(), ...)</code> and <code>model.module.load_state_dict(...)</code> so that the checkpoint is not tied to the <code>DataParallel</code> wrapper.
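A minimal sketch of the whole pattern (the toy model and the <code>checkpoint.pth</code> file name are illustrative):
<syntaxhighlight lang="python">
import torch
import torch.nn as nn

# Any nn.Module works the same way; this toy model is just for illustration.
model = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
).cuda()

# Replicates the model on each visible GPU and splits each batch across them.
model = nn.DataParallel(model)
outputs = model(torch.randn(64, 128).cuda())  # batch of 64 split across GPUs

# The wrapped module (with its custom methods and attributes) is model.module.
# Saving its state_dict keeps the checkpoint free of the DataParallel wrapper.
torch.save(model.module.state_dict(), "checkpoint.pth")
model.module.load_state_dict(torch.load("checkpoint.pth"))
</syntaxhighlight>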
===nn.parallel.DistributedDataParallel===
[https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html#torch.nn.parallel.DistributedDataParallel nn.parallel.DistributedDataParallel]
[https://pytorch.org/docs/stable/notes/cuda.html#cuda-nn-ddp-instead DistributedDataParallel vs DataParallel]
[https://pytorch.org/tutorials/intermediate/ddp_tutorial.html DDP tutorial]
The PyTorch documentation recommends using this instead of <code>nn.DataParallel</code>, even on a single machine: it runs one process per GPU, which avoids GIL contention and the per-batch model replication that <code>nn.DataParallel</code> incurs.
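For reference, a minimal single-node sketch following the pattern in the linked tutorial, assuming the script is launched with <code>torchrun</code> (which sets the <code>RANK</code>, <code>WORLD_SIZE</code>, and <code>LOCAL_RANK</code> environment variables); the toy model and hyperparameters are illustrative:
<syntaxhighlight lang="python">
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each process builds its own replica and pins it to one GPU.
    model = nn.Linear(128, 10).cuda(local_rank)
    ddp_model = DDP(model, device_ids=[local_rank])

    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    # One illustrative training step; gradients are all-reduced across
    # processes automatically during backward().
    inputs = torch.randn(32, 128, device=local_rank)
    targets = torch.randn(32, 10, device=local_rank)
    loss = loss_fn(ddp_model(inputs), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()  # e.g. torchrun --nproc_per_node=4 this_script.py
</syntaxhighlight>
As with <code>nn.DataParallel</code>, the original module is available as <code>ddp_model.module</code>; checkpoints are typically saved from rank 0 only, so the processes do not all write the same file.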