UMIACS Servers: Difference between revisions

From David's Wiki

Revision as of 18:07, 17 October 2020

Notes on using UMIACS servers


Modules

Use modules to load programs you need to run.

Notes
  • You can load modules in your .bashrc file
# List loaded modules
module list

# Load a module
module load [my_module]

# List all available modules
module avail

Some useful modules in my .bashrc file

module load tmux
module load cuda/10.0.130
module load cudnn/v7.5.0
module load Python3/3.7.6
module load git
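
Since .bashrc may also be sourced on machines where the module system is missing, the loads can be guarded. This is a minimal sketch, not part of the original setup; the helper name load_module_if_possible is made up for illustration.

```shell
# Load a module if the `module` command exists; otherwise warn and continue.
# `load_module_if_possible` is a hypothetical helper name.
load_module_if_possible() {
    if command -v module >/dev/null 2>&1; then
        module load "$1"
    else
        echo "module command not found; skipping $1" >&2
        return 1
    fi
}

# Same modules as in the list above; a failed load does not abort the rest.
for m in tmux cuda/10.0.130 cudnn/v7.5.0 Python3/3.7.6 git; do
    load_module_if_possible "$m" || true
done
```

On a host without the module system this only prints warnings to stderr.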

Python

Do not install anaconda in home. You will run out of space.
Load the Python 3 module adding the following to your .bashrc file

module load Python3/3.7.6
export PATH="${PATH}:$(python3 -c 'import site; print(site.USER_BASE)')/bin"

Then run the following to get pip installed

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3 get-pip.py --user
Notes
  • You will need to install packages with pip install --user
  • You may need to add your local site-packages to your PYTHONPATH environment variable
    • Add this to .bashrc:
    • export PYTHONPATH="${PYTHONPATH}:/nfshomes/$(whoami)/.local/lib/python3.7/site-packages/"
  • You can also install using pip install --target=/my-libs-folder/
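
The PYTHONPATH line above hardcodes python3.7 and appends again every time .bashrc is re-sourced (e.g. inside tmux). A possible alternative, sketched here with a made-up helper name (append_to_pythonpath), asks Python for the user site-packages directory and only appends it once:

```shell
# Append a directory to PYTHONPATH unless it is already listed.
# `append_to_pythonpath` is a hypothetical helper, not an existing command.
append_to_pythonpath() {
    case ":${PYTHONPATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) PYTHONPATH="${PYTHONPATH:+${PYTHONPATH}:}$1" ;;
    esac
    export PYTHONPATH
}

# `python3 -m site --user-site` prints the per-user site-packages directory
# for whichever Python 3 is currently loaded.
if command -v python3 >/dev/null 2>&1; then
    append_to_pythonpath "$(python3 -m site --user-site)"
fi
```

The same helper should also work for a --target folder, e.g. append_to_pythonpath /my-libs-folder.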

Conda

If you must install conda, install it somewhere with a lot of space like scratch.
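
A rough sketch of a Miniconda install into scratch. The location /scratch1/<user>/miniconda3 and the names MINICONDA_DIR and install_miniconda are assumptions for illustration; check where your scratch space actually lives first.

```shell
# Assumption: scratch space is under /scratch1/<user>; override MINICONDA_DIR if not.
MINICONDA_DIR="${MINICONDA_DIR:-/scratch1/$(whoami)/miniconda3}"

# Download the installer to a temp file and install without prompts.
# Defined here but not called automatically.
install_miniconda() {
    installer="$(mktemp)"
    curl -fsSL https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o "$installer" &&
        bash "$installer" -b -p "$MINICONDA_DIR"   # -b: batch mode, -p: install prefix
    status=$?
    rm -f "$installer"
    return "$status"
}
```

Run install_miniconda once, then activate with: . "$MINICONDA_DIR/bin/activate"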

Install PyTorch

pip install --user torch===1.3.1 torchvision===0.4.2 -f https://download.pytorch.org/whl/torch_stable.html

Installing Packages to a Directory

pip install geographiclib -t /scratch1/davidli/python/

MBRC Cluster

See UMIACS MBRC

SLURM Job Management

See https://docs.rc.fas.harvard.edu/kb/convenient-slurm-commands/

1 GPU
srun --pty --gres=gpu:1 --mem=16G --qos=high --time=47:59:00 -w mbrc00 bash
2 GPUs on mbrc00
srun --pty --gres=gpu:2 --mem=16G --qos=default --time=23:59:00 -w mbrc00 bash
CPU-only on scavenger QOS
srun --pty --account=scavenger --partition=scavenger \
     --time=3:59:00 \
     --mem=1G -c1 -w mbrc00 bash
Notes
  • Use -w mbrc01 instead of -w mbrc00 to pick mbrc01
  • Use -c 4 to request 4 cores
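
To avoid retyping the flags, the srun line can be built by a small function. This is only a sketch: gpu_shell_cmd is a made-up name, and the defaults (16G, default QOS, 4 cores) are copied or assumed from the examples above. It prints the command rather than running it.

```shell
# Print an srun command for an interactive GPU shell.
# Arguments (all optional): number of GPUs, node name, CPU cores.
gpu_shell_cmd() {
    gpus="${1:-1}"
    node="${2:-mbrc00}"
    cores="${3:-4}"
    printf 'srun --pty --gres=gpu:%s --mem=16G --qos=default --time=23:59:00 -c %s -w %s bash\n' \
        "$gpus" "$cores" "$node"
}

# Show the command for 2 GPUs on mbrc01
gpu_shell_cmd 2 mbrc01
```

To actually start the session: eval "$(gpu_shell_cmd 2 mbrc01)"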

See Jobs

See my own jobs
squeue -u <user> -o "%8i %10P %8j %10u %10L %5b"
Formatting
  • %L is remaining time
  • %b is the generic resources (GRES) requested by the job, e.g. the number of GPUs
See all jobs
squeue
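
A quick way to see who is using the cluster is to count jobs per user. The sketch below assumes squeue -h (no header) with -o "%u" (user name only); count_jobs_by_user is a made-up helper that just reads user names from stdin.

```shell
# Count jobs per user from a list of user names on stdin (one per line),
# sorted with the busiest user first.
count_jobs_by_user() {
    awk 'NF { count[$1]++ } END { for (u in count) print count[u], u }' |
        sort -k1,1nr -k2,2
}

# On the cluster:
#   squeue -h -o "%u" | count_jobs_by_user
```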

SFTP

Note: If you know of an easier way, please tell me.

On your PC
Start an sshd for forwarding. You can do this in a docker container for privacy purposes.

On the cluster:
Generate an sshd host key:

ssh-keygen -t ed25519 -a 100 -f /nfshomes/dli7319/ssh/ssh_host_ed25519_key

Create the following sshd_config file

#	$OpenBSD: sshd_config,v 1.103 2018/04/09 20:41:22 tj Exp $
Port 5981
HostKey /nfshomes/dli7319/ssh/ssh_host_ed25519_key
AuthorizedKeysFile	.ssh/authorized_keys
Subsystem	sftp	/usr/libexec/openssh/sftp-server

Start the sshd daemon and proxy the port to your local sshd. You can make a script like this:

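#!/bin/bash

LOCAL_PORT=5981         # port of the sshd started on the cluster (matches sshd_config)
REMOTE_PORT=22350       # port opened on your PC that forwards back to LOCAL_PORT
REMOTE_SSH_PORT=22450   # port of the sshd running on your PC
# IP address you connected from, taken from the current SSH session
REMOTE_ADDR=$(echo "$SSH_CONNECTION" | awk '{print $1}')

# Start the cluster-side sshd in the background, then open the reverse tunnel
/usr/sbin/sshd -D -f sshd_config &
ssh -R "$REMOTE_PORT:localhost:$LOCAL_PORT" "root@$REMOTE_ADDR" -p "$REMOTE_SSH_PORT"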

On your PC:
Proxy the sshd from the local docker to your localhost.
Connect to the sshd on the cluster
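
One way the two PC-side steps might look, as a sketch only. It assumes the container's sshd is reachable from the PC at localhost on REMOTE_SSH_PORT and accepts root logins, and it reuses the port numbers from the cluster script. pc_side_cmds is a made-up helper that prints the commands instead of running them.

```shell
REMOTE_PORT=22350       # port the cluster script forwards back to its own sshd
REMOTE_SSH_PORT=22450   # port of the sshd in the container on the PC (assumed reachable on localhost)

# Print the two PC-side commands for a given cluster user name.
pc_side_cmds() {
    cluster_user="$1"
    # 1. Expose the container's forwarded port on the PC's localhost
    printf 'ssh -N -L %s:localhost:%s -p %s root@localhost\n' \
        "$REMOTE_PORT" "$REMOTE_PORT" "$REMOTE_SSH_PORT"
    # 2. Connect through the tunnel to the sshd running in the cluster job
    printf 'sftp -P %s %s@localhost\n' "$REMOTE_PORT" "$cluster_user"
}

pc_side_cmds your_username
```

Run the first command in one terminal and leave it open, then run the second in another.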

.bashrc

My .bashrc

Software

git

The MBRC cluster has git available as a module.
You can then download a precompiled git-lfs binary and drop it in ~/bin/.
Make sure ${HOME}/bin is in your PATH and run git lfs install.

Notes
  • Make sure you have a recent version of git
    • E.g. module load git/2.25.1

Copying Files

There are 3 ways that I copy files to the scratch drives:

  • For small files, you can copy to your home directory under /nfshomes/ via SFTP to mbrcsub00. I rarely do this because the home directory is only a few gigabytes.
  • For large files, I typically use rclone to copy to my terpmail Google Drive and then copy back to the scratch drives with a CPU-only job. Do not do this with thousands of small files; it will take forever since Google Drive limits the number of files per second. Also note that Google Drive has a daily upload limit of 750 GB.
  • For mounting, I have a convoluted system where I start SSHD in a job and port forward the SSH port to my local PC. See above for more details.
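
For the Google Drive route, one workaround for the many-small-files problem is to pack them into a single archive before uploading. A minimal sketch: pack_dir is a made-up helper, and the rclone remote name gdrive and the scratch path in the comments are assumptions.

```shell
# Pack a directory of many small files into one .tar.gz so that
# Google Drive sees a single upload instead of thousands.
pack_dir() {
    src_dir="$1"   # directory to pack
    archive="$2"   # output .tar.gz path
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Hypothetical usage ("gdrive" is whatever remote name you set up in `rclone config`):
#   pack_dir /path/to/dataset /tmp/dataset.tar.gz
#   rclone copy /tmp/dataset.tar.gz gdrive:datasets/
# Then, inside a CPU-only job on the cluster:
#   rclone copy gdrive:datasets/dataset.tar.gz /scratch1/$(whoami)/
#   tar -xzf /scratch1/$(whoami)/dataset.tar.gz -C /scratch1/$(whoami)/
```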