Introduction to High-Performance Computing: Glossary

Key Points

Why Use a Cluster?
  • High Performance Computing (HPC) typically involves connecting to very large computing systems elsewhere in the world.

  • These other systems can be used to do work that would be either impossible or much slower on smaller systems.

  • The standard method of interacting with such systems is via a command-line shell such as Bash.

Connecting to the cluster
  • To connect to a cluster using SSH: ssh yourUsername@remote.computer.address

Scripts, variables, and loops
  • A shell script is just a list of bash commands in a text file.

  • chmod +x script.sh gives the script permission to execute.
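
As a minimal sketch of the two points above, the following creates a small script, marks it executable, and runs it. The temporary directory, the variable names, and the loop contents are assumptions chosen for illustration only.

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# A shell script is a text file of bash commands; this one uses a variable and a loop.
cat > "$workdir/script.sh" <<'EOF'
#!/bin/bash
greeting="Hello"
for name in alpha beta; do
    echo "$greeting, $name"
done
EOF

# Give the file permission to execute, then run it by its path.
chmod +x "$workdir/script.sh"
output=$("$workdir/script.sh")
echo "$output"
```

Without the chmod step, running the file by its path would normally fail with a "Permission denied" error, although bash script.sh would still work.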

Working on a cluster
  • A cluster is a set of networked machines.

  • Clusters typically provide a login node and a set of worker nodes.

  • Files saved to a shared filesystem from one node are available on every other node.

Accessing software
  • Load software with module load softwareName

  • Unload all loaded software with module purge

  • The module system handles software versioning and package conflicts for you automatically.

  • You can edit your .bashrc file to automatically load a software package.
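
One possible way to do this is sketched below, as lines that might be appended to ~/.bashrc. softwareName is the placeholder used above, not a real module, and the modules_available variable is a hypothetical name added only to make the outcome visible.

```shell
# Lines that could be appended to ~/.bashrc.
# "softwareName" is a placeholder; replace it with a module your cluster provides.
if command -v module >/dev/null 2>&1; then
    # The module command exists, so load the package at every login.
    module load softwareName
    modules_available=yes
else
    # No module system here (for example on a laptop); skip quietly.
    modules_available=no
fi
```

The guard keeps the same .bashrc usable on machines that have no module system installed.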

Transferring files
  • wget downloads a file from the internet.

  • sftp/scp transfer files to and from your computer.

  • You can use an SFTP client like FileZilla to transfer files through a GUI.
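
The sketch below shows the general shape of an scp upload. The upload function, the DRY_RUN variable, the file name results.txt, and the remote directory project/ are all hypothetical; the address is the placeholder from the connection example. By default it only prints the command it would run.

```shell
# Hypothetical helper: print (or run) the scp command that copies a local file
# to a directory on the cluster. DRY_RUN defaults to 1, so nothing is
# transferred unless it is explicitly set to 0.
upload() {
    local file=$1 remote_dir=$2
    local target="yourUsername@remote.computer.address:${remote_dir}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "scp ${file} ${target}"
    else
        scp "$file" "$target"
    fi
}

# Show the command for copying results.txt into project/ on the cluster.
upload results.txt project/
```

Swapping the two arguments to scp (remote path first, local path second) copies in the other direction, from the cluster to your computer.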

Using resources effectively
  • The fewer resources your job requests, the sooner it will generally be scheduled.

Be a responsible cluster user
  • Don’t run stuff on the login node.

  • Again, don’t run stuff on the login node.

  • Don’t be a bad person and run stuff on the login node.

Glossary

FIXME