Google Colab

From David's Wiki

Google Colab is a free online Jupyter notebook service that you can use for small ML training exercises.
It supports Python and R.


Google Drive

Colab supports mounting Google Drive.
However, files are cached locally on the VM, which only has ~100 GB of disk.
This means you cannot directly write files larger than ~100 GB (e.g. via curl) to your mounted Google Drive folder.
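Mounting is done with the google.colab module. A minimal sketch (this only runs inside a Colab runtime; the mount point path is your choice):

```python
from google.colab import drive

# Prompts for authorization in the notebook, then mounts your Drive
# under the given directory. Files written here sync back to Drive
# via the local VM cache described above.
drive.mount('/content/drive')
```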

There are some tricks you can try though:

curl
  • Run curl -r 0-4999999999 [url] -o part1 to download the first ~5 GB to your Google Drive
  • Wait ~10 minutes for the local cache to clear and the file to sync to Drive
  • Run curl -r 5000000000-9999999999 [url] -o part2
  • Repeat until you have every part of the file

Limitations

  • You can't create non-empty files in a folder that is shared with you.
  • Colab struggles to copy large files and to handle folders containing many files in Google Drive.

Workarounds

R Runtime

There are two ways to use R with Google Colab:

  • Use %load_ext rpy2.ipython to enable R and add %%R to each R cell.
  • Use colab.to/r
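With the rpy2 approach, a notebook might look like the following sketch (cell magics only work inside a notebook; rpy2 is preinstalled on Colab):

```python
# Cell 1: load the rpy2 extension (run once per session)
%load_ext rpy2.ipython

# Cell 2: any cell starting with %%R is executed as R code
%%R
x <- c(1, 2, 3)
mean(x)
```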

Colab Pro


Colab Pro claims faster GPUs, longer runtimes, and more memory.
However, nothing is guaranteed and VMs are still preemptible.
Runtimes are limited to 24 hours, as opposed to 12 hours for free users.