===Float16===
Float16 uses half the memory of float32.  
Newer Nvidia GPUs also have dedicated hardware units called tensor cores that accelerate float16 matrix multiplication.  
Typically it's still best to train using float32 for numerical stability.  
You can then truncate a trained model to float16 and run inference in float16.
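A minimal sketch of this workflow, assuming a small stand-in model (the architecture and sizes here are hypothetical); <code>half()</code>, <code>element_size()</code>, and <code>no_grad()</code> are standard PyTorch APIs:

<syntaxhighlight lang="python">
import torch
from torch import nn

# Hypothetical stand-in for any model trained in float32.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

# float32 stores 4 bytes per element.
print(model[0].weight.element_size())  # 4

# Convert all parameters and buffers to float16, halving memory use.
model = model.half()
print(model[0].weight.element_size())  # 2

# float16 matmuls are fastest on a GPU with tensor cores; fall back to CPU otherwise.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Inputs must match the model's dtype; skip gradient tracking for inference.
x = torch.randn(1, 64, device=device, dtype=torch.float16)
with torch.no_grad():
    out = model(x)
print(out.dtype)  # torch.float16
</syntaxhighlight>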