==Multi-GPU Training==
See [https://pytorch.org/tutorials/beginner/former_torchies/parallelism_tutorial.html Multi-GPU Examples].
===nn.DataParallel===


The basic idea is to wrap blocks in [https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html#torch.nn.DataParallel <code>nn.DataParallel</code>].


However, doing so causes you to lose access to custom methods and attributes; they must be reached through <code>model.module</code>.
To save and load the model, use <code>torch.save(model.module.state_dict(), path)</code> and <code>model.module.load_state_dict(torch.load(path))</code>, so the checkpoint does not depend on the <code>DataParallel</code> wrapper.
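A minimal sketch of the idea, assuming at least one GPU is visible; the model class and checkpoint path are placeholders:
<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class Net(nn.Module):  # placeholder model
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 2)

    def forward(self, x):
        return self.fc(x)

model = Net().cuda()
if torch.cuda.device_count() > 1:
    # Replicates the model across all visible GPUs and splits each batch among them.
    model = nn.DataParallel(model)

# Custom attributes/methods now live on the wrapped module:
# model.fc fails on the wrapper; use model.module.fc instead.
net = model.module if isinstance(model, nn.DataParallel) else model

# Save and load via the underlying module so the checkpoint
# works with or without the DataParallel wrapper.
torch.save(net.state_dict(), "checkpoint.pth")
net.load_state_dict(torch.load("checkpoint.pth"))
</syntaxhighlight>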


===nn.parallel.DistributedDataParallel===
[https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html#torch.nn.parallel.DistributedDataParallel nn.parallel.DistributedDataParallel]
[https://pytorch.org/docs/stable/notes/cuda.html#cuda-nn-ddp-instead DistributedDataParallel vs DataParallel]
[https://pytorch.org/tutorials/intermediate/ddp_tutorial.html DDP tutorial]


The PyTorch documentation recommends using this instead of <code>nn.DataParallel</code>, even on a single machine, since it runs one process per GPU rather than a single multithreaded process.
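A minimal single-node sketch along the lines of the DDP tutorial linked above; the model, master port, and dummy batch are placeholders:
<syntaxhighlight lang="python">
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(rank, world_size):
    # One process per GPU; processes communicate via the NCCL backend.
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "29500"  # placeholder port
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = nn.Linear(10, 2).cuda(rank)    # placeholder model
    model = DDP(model, device_ids=[rank])  # gradients are all-reduced across processes

    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.randn(8, 10, device=rank)    # dummy batch
    loss = model(x).sum()
    loss.backward()
    opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
</syntaxhighlight>
In a real training loop, wrap the dataset in a <code>DistributedSampler</code> so each process sees a different shard of the data.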