UMIACS Servers

==Python==
Do not install anaconda in your home directory. You will run out of space.<br>
Load the Python 3 module by adding the following to your .bashrc file:
<syntaxhighlight lang="bash">
module load Python3/3.7.6
</syntaxhighlight>
* If you install packages with <code>pip install --user</code>, add the user site-packages to your <code>PYTHONPATH</code>:
** <code>export PYTHONPATH="${PYTHONPATH}:/nfshomes/$(whoami)/.local/lib/python3.7/site-packages/"</code>
* You can also install into a custom folder using <code>pip install --target=/my-libs-folder/</code> (see the sketch below).
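For example, a minimal sketch of the <code>--target</code> workflow; the folder <code>/my-libs-folder/</code> and the package name are placeholders:
<syntaxhighlight lang="bash">
# install a package into a folder that has more space (path is a placeholder)
pip install --target=/my-libs-folder/ requests

# tell Python to look there, e.g. in your .bashrc
export PYTHONPATH="${PYTHONPATH}:/my-libs-folder/"
</syntaxhighlight>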
===Conda===
If you must install conda, install it somewhere with a lot of space like scratch.
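A minimal sketch with Miniconda; the scratch path below is a placeholder for whatever scratch directory you have access to:
<syntaxhighlight lang="bash">
# download the Miniconda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# install into scratch instead of your home directory (-b = batch mode, -p = install prefix)
bash Miniconda3-latest-Linux-x86_64.sh -b -p /scratch0/$(whoami)/miniconda3

# activate it in the current shell
source /scratch0/$(whoami)/miniconda3/bin/activate
</syntaxhighlight>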


===Install PyTorch===
REMOTE_PORT=22350
REMOTE_SSH_PORT=22450
REMOTE_ADDR=$(echo "$SSH_CONNECTION" | awk '{print $1}')


/usr/sbin/sshd -D -f sshd_config & \
Proxy the sshd from the local docker to your localhost.
Connect to the sshd on the cluster.
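For example, a sketch of the proxy and connect steps from your local machine; the hostnames are placeholders and the port should match the <code>REMOTE_PORT</code> you chose above:
<syntaxhighlight lang="bash">
# forward a local port through the submission node to the sshd running in the job
ssh -N -L 22350:<compute-node>:22350 <username>@<submission-node> &

# connect to the job's sshd through the tunnel
ssh -p 22350 <username>@localhost
</syntaxhighlight>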
==Class Accounts==
See [https://wiki.umiacs.umd.edu/umiacs/index.php/ClassAccounts UMIACS Wiki: ClassAccounts] 
Class accounts have the lowest priority. If GPUs are available, you can access 1 GPU for up to 48 hours.
However, your home disk only has 18GB and installing PyTorch takes up ~3GB.
You cannot fit a conda environment in there, so just use the Python module.
The ssh endpoint is
<pre>
class.umiacs.umd.edu
</pre>
Start a job with:
<pre>
srun --pty --account=class --partition=class --gres=gpu:1 --mem=16G --qos=default --time=47:59:00 -c4 bash
</pre>
{{hidden | My .bashrc |
<pre>
#PS1='\w$ '
PS1='\[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$'
# Modules
module load tmux
module load cuda/10.0.130
module load cudnn/v7.5.0
module load Python3/3.7.6
alias python=python3
export PATH="${PATH}:${HOME}/bin/"
export PATH="${PATH}:${HOME}/.local/bin/"
</pre>
}}


==<code>.bashrc</code>==
if command_exists module ; then
   module load tmux
   module load cuda/10.2.89
   module load cudnn/v8.0.4
   module load Python3/3.7.6
   module load git/2.25.1
   module load gitlfs
   module load gcc/8.1.0
   module load openmpi/4.0.1
   module load ffmpeg
   module load rclone
fi
if command_exists python3 ; then
   alias python=python3
fi


==Copying Files==
There are 3 ways that I use to copy files:
* For small files, you can copy to your home directory under <code>/nfshomes/</code> via SFTP to the submission node. I rarely do this because the home directory is only a few gigs.
* For large files and folders, I typically use [[rclone]] to copy to the cloud and then copy back to the scratch drives with a cpu-only job (see the sketch after this list).
** You can store project files on Google Drive or the UMIACS object storage.
** Note that Google Drive has a limit on files per second and a daily limit of 750GB in transfers.
* For mounting, I have a convoluted system where I start SSHD in a job and port-forward the SSH port to my local PC. See above for more details.
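For example, a minimal sketch of the rclone round trip; the remote name <code>gdrive</code>, the folder names, and the scratch path are placeholders for your own setup:
<syntaxhighlight lang="bash">
# one-time setup: configure a cloud remote interactively
rclone config

# push a folder to the cloud from wherever the data currently lives
rclone copy /path/to/my-dataset gdrive:my-dataset --progress

# later, inside a cpu-only job on the cluster, pull it back onto scratch
rclone copy gdrive:my-dataset /scratch0/$(whoami)/my-dataset --progress
</syntaxhighlight>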