The PyTorch documentation suggests using this instead of <code>nn.DataParallel</code>.
The main difference is that it uses multiple processes instead of multithreading, working around Python's Global Interpreter Lock.
It also supports training on GPUs across multiple ''nodes'', or computers.
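A minimal sketch of how <code>DistributedDataParallel</code> is typically used: each process initializes a process group, wraps its model in DDP, and gradients are automatically all-reduced across processes during <code>backward()</code>. The model, hyperparameters, and the use of the <code>gloo</code> backend (so the sketch also runs on CPU-only machines; multi-GPU setups would normally use <code>nccl</code>) are illustrative assumptions, not a definitive recipe.

```python
# Illustrative sketch: one process per device, simulated here with two
# CPU processes. In practice, torchrun usually handles process launch.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(rank, world_size):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    # "gloo" backend chosen so this runs without GPUs; use "nccl" for GPUs.
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = DDP(torch.nn.Linear(10, 1))  # hypothetical toy model
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    for _ in range(3):
        opt.zero_grad()
        loss = model(torch.randn(8, 10)).sum()
        loss.backward()  # DDP all-reduces gradients across processes here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)
```

Because each process owns its own Python interpreter, there is no GIL contention between devices, unlike the single-process multithreading used by <code>nn.DataParallel</code>.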
==Optimizations==