===Float16===
Float16 uses half the memory of float32.
Newer Nvidia GPUs also have dedicated hardware units called tensor cores that speed up float16 matrix multiplication.
It's typically best to train in float32, though, for numerical stability.
A trained model can then be truncated to float16 and used for inference at half precision.
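As a minimal sketch (using NumPy rather than any particular deep learning framework), truncating a float32 weight matrix to float16 halves its memory footprint at the cost of some precision:

```python
import numpy as np

# Weights as they might exist after training in float32.
weights_fp32 = np.random.default_rng(0).random((1024, 1024), dtype=np.float32)

# Truncate to float16 for inference: same shape, half the memory.
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4194304 bytes (4 MiB)
print(weights_fp16.nbytes)  # 2097152 bytes (2 MiB), exactly half
```

In frameworks like PyTorch the equivalent one-liner is casting the model to half precision (e.g. `model.half()`), after which inference runs in float16.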