HPC Architecture
A High-Performance Computing (HPC) cluster is a collection of computers/servers and storage connected via a network (Ethernet/InfiniBand). It provides a pool of compute resources (CPUs, GPUs, memory) for researchers to run their experiments.
Our cluster consists of a frontend, compute nodes, and I/O nodes with storage.
- Frontend : The login node. Users manage their workspace, submit their jobs, and install software here.
- Compute nodes : Nodes that execute the code in submitted jobs.
- I/O nodes + Storage : Store and serve files for the cluster.
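As a rough illustration of this division of roles, the following Python sketch maps each node type to the tasks listed above. The names `NodeRole`, `RESPONSIBILITIES`, and `role_for` are hypothetical, chosen only for this example, and are not part of any cluster software.

```python
from enum import Enum


class NodeRole(Enum):
    """Node types in the cluster (illustrative names, not a real API)."""
    FRONTEND = "frontend"
    COMPUTE = "compute"
    IO_STORAGE = "io_storage"


# Tasks handled by each role, taken from the list above.
RESPONSIBILITIES = {
    NodeRole.FRONTEND: ["log in", "manage workspace", "submit jobs", "install software"],
    NodeRole.COMPUTE: ["execute submitted jobs"],
    NodeRole.IO_STORAGE: ["store files", "serve files"],
}


def role_for(task: str) -> NodeRole:
    """Return the node role responsible for the given task."""
    for role, tasks in RESPONSIBILITIES.items():
        if task in tasks:
            return role
    raise ValueError(f"unknown task: {task!r}")


print(role_for("submit jobs").value)
```

The point of the sketch is that users interact only with the frontend, while the compute and I/O nodes do their work on behalf of submitted jobs.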
Workflow
The basic workflow for running a program on the cluster:
- Log in to the frontend
- Organize your workspace
- Install/Manage software
- Test job
- Submit job
- Monitor job
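The steps above are meant to be followed in order. A minimal Python sketch of that ordering is shown here; the names `Step` and `next_step` are hypothetical and serve only as a checklist model. The actual commands for each step depend on the cluster's scheduler and tools, which this section does not specify.

```python
from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    """Workflow steps in the order they are normally performed."""
    LOGIN = 1
    ORGANIZE_WORKSPACE = 2
    MANAGE_SOFTWARE = 3
    TEST_JOB = 4
    SUBMIT_JOB = 5
    MONITOR_JOB = 6


def next_step(current: Step) -> Optional[Step]:
    """Return the step that follows `current`, or None after the last step."""
    if current is Step.MONITOR_JOB:
        return None
    return Step(current + 1)


# Print the workflow as a numbered checklist.
for step in Step:
    print(step.value, step.name)
```

Testing a job before submitting it sits deliberately ahead of submission in this ordering, so that errors are caught before compute resources are requested.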