High Performance Computing (HPC)
High Throughput Computing (HTC)
AKA "Supercomputing"
Reduce time to result
Increase accuracy
Increase throughput
Pressure from computing requirements
Pressure from physics
Physics limits GHz speed
Heat and power-draw increase super-linearly with speed
Bottlenecks everywhere
Limits of RAM
Serial Process:
A process whose sub-processes happen sequentially in time.
Only one sub-process is active at any given time.
Parallel Process:
A process in which several sub-processes can be active at the same time.
Example: Searching a database of web sites for some text (e.g. Google)
Searching sequentially through large volumes of text is too time-consuming.
Multiple servers hold different pages.
Each server can report the result of each individual search.
The more servers you add the quicker the search is.
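The search example can be sketched in a few lines of Python. This is only an illustration: the `SHARDS` data, the URLs and the function names are made up, and threads stand in for separate servers. Each "server" searches only the pages it holds, and the partial results are merged at the end.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each dict plays the role of one server holding some pages.
SHARDS = [
    {"a.example/1": "parallel computing with many servers",
     "a.example/2": "weather simulation on a grid"},
    {"b.example/1": "searching text with many servers",
     "b.example/2": "cooking recipes"},
    {"c.example/1": "batch scheduling on clusters"},
]


def search_shard(shard, term):
    """Search one server's pages and return the URLs that contain the term."""
    return [url for url, text in shard.items() if term in text]


def parallel_search(shards, term):
    """Send the query to every shard at once, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial = list(pool.map(search_shard, shards, [term] * len(shards)))
    return sorted(url for hits in partial for url in hits)


if __name__ == "__main__":
    print(parallel_search(SHARDS, "servers"))
```

No shard needs anything from any other shard, which is why this kind of problem scales so easily with the number of servers.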
Example: Large scale weather simulation
A detailed description of the atmosphere is too large to run on today's desktop or server PCs.
Multiple servers are needed to hold all grid data in memory.
Servers need to communicate quickly to synchronise work over the entire grid.
Communication between servers can become a bottleneck.
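A toy version of the grid example, assuming a 1D grid and a simple diffusion update (neither is from the slides). The grid is split into one chunk per simulated server. To update its edge cells, each chunk needs one value from each neighbouring chunk: this "halo" value is the data that real servers would have to send over the network at every step. Here the chunks are processed in an ordinary loop, so the sketch shows the data dependencies rather than any real speed-up.

```python
def step_serial(grid, alpha=0.1):
    """One diffusion step on the whole grid (the two end values stay fixed)."""
    new = grid[:]
    for i in range(1, len(grid) - 1):
        new[i] = grid[i] + alpha * (grid[i - 1] - 2 * grid[i] + grid[i + 1])
    return new


def step_decomposed(grid, n_servers, alpha=0.1):
    """Same step with the grid split into one chunk per simulated server.

    Assumes len(grid) >= n_servers.
    """
    n = len(grid)
    size = n // n_servers
    bounds = [(k * size, n if k == n_servers - 1 else (k + 1) * size)
              for k in range(n_servers)]
    result = []
    for lo, hi in bounds:
        chunk = grid[lo:hi]
        # Halo exchange: one cell from each neighbouring chunk.
        left_halo = grid[lo - 1] if lo > 0 else None
        right_halo = grid[hi] if hi < n else None
        new = chunk[:]
        for j in range(len(chunk)):
            i = lo + j
            if i == 0 or i == n - 1:
                continue  # global boundary cells stay fixed
            left = chunk[j - 1] if j > 0 else left_halo
            right = chunk[j + 1] if j < len(chunk) - 1 else right_halo
            new[j] = chunk[j] + alpha * (left - 2 * chunk[j] + right)
        result.extend(new)
    return result


if __name__ == "__main__":
    grid = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 5.0, 0.0]
    print(step_decomposed(grid, 3) == step_serial(grid))
```

Each chunk does work proportional to its size but always exchanges two halo values, so the more servers share a fixed grid, the larger the share of time spent communicating.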
But...
1 Processor
2 Processors
10 Processors
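Amdahl's law: speed-up \(S(N) = \frac{1}{\left(1-P\right)+\frac{P}{N}}\), where \(P\) is the fraction of the work that can be parallelised and \(N\) is the number of processors.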
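A short worked example of the formula. The parallel fraction P = 0.9 is an assumed value chosen for illustration, and `amdahl_speedup` is just a name for this sketch.

```python
def amdahl_speedup(p, n):
    """Speed-up on n processors when a fraction p of the work is parallel."""
    return 1.0 / ((1.0 - p) + p / n)


if __name__ == "__main__":
    p = 0.9  # assumed: 90% of the run time can be parallelised
    for n in (1, 2, 10, 1000):
        print(f"{n:5d} processors: speed-up {amdahl_speedup(p, n):.2f}")
```

With P = 0.9 the speed-up is about 1.82 on 2 processors and 5.26 on 10, and it can never exceed 1/(1-P) = 10 however many processors are added, because the serial 10% always has to run on one processor.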
Processor farms, pipelining, divide/conquer, geometric decomposition, cellular automata, algorithmic parallelism
Interconnect | Typical MPI latency (microseconds) | Typical bandwidth (MB/s) |
---|---|---|
1 Gb/s Ethernet | 60-90 | 90 |
10 Gb/s Ethernet | 12-20 | 800 |
InfiniBand | 2-4 | 250-1200 |
NUMAlink 4 | ~1 | 3000 |
QPI | ~0.5 | 20000 |
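To see what these figures mean in practice, a common first-order estimate is: time to send a message = latency + size / bandwidth. The sketch below applies that model to the table. Where the table gives a range, a single representative value has been picked (an assumption), and the model ignores contention, protocol overhead and message-size effects.

```python
# (latency in microseconds, bandwidth in MB/s); single values picked from the
# table's ranges purely for illustration.
INTERCONNECTS = {
    "1 Gb/s Ethernet": (75.0, 90.0),
    "10 Gb/s Ethernet": (16.0, 800.0),
    "InfiniBand": (3.0, 1200.0),
    "NUMAlink 4": (1.0, 3000.0),
    "QPI": (0.5, 20000.0),
}


def transfer_time_us(size_bytes, latency_us, bandwidth_mb_s):
    """Estimated time in microseconds to send one message.

    Taking 1 MB = 10**6 bytes, 1 MB/s is exactly 1 byte per microsecond,
    so size / bandwidth is already in microseconds.
    """
    return latency_us + size_bytes / bandwidth_mb_s


if __name__ == "__main__":
    for name, (latency, bandwidth) in INTERCONNECTS.items():
        small = transfer_time_us(8, latency, bandwidth)          # one double
        large = transfer_time_us(1_000_000, latency, bandwidth)  # 1 MB block
        print(f"{name:17s} 8 B: {small:8.2f} us   1 MB: {large:10.1f} us")
```

Small messages cost almost exactly the latency and large ones are dominated by bandwidth, so codes that exchange many small messages benefit most from a low-latency interconnect.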
Not all problems can be parallelised
Not all parallel problems can be ported to accelerators
CPU architecture continues to develop: Moore’s law still holds
Improvements to CPU architecture consist of increasing internal parallelism: hyper-threading, wider SIMD units, more cores per chip
Accelerators are bringing more computing cycles to the table (albeit of a special type).
Data storage and Network infrastructure need to keep up with computation and other data producing technologies...
upgrading network infrastructure
re-thinking data storage
SSDs
parallel file systems
Multi-tiered storage
Map-Reduce data platforms
Used efficiently, supercomputers let you get more done faster.
They are useful for many different types of work.
Compute Node
Just computes - little else
Private IP address - no user access
Login Node
User login
Interaction with job scheduler
Public IP address - connects to external network
The simulation of hard physical materials, e.g. metal, plastic
Crash test, product design, suitability for purpose
Examples: MSC Nastran, Ansys, LS-Dyna, Abaqus, ESI PAMCrash, Radioss
CFD - Computational Fluid Dynamics
The simulation of soft physical materials, gases and fluids
Engine design, airflow, oil reservoir modelling
Examples: Fluent, Star-CD, CFX
Geophysical Sciences
Seismic Imaging - taking echo traces and building a picture of the sub-earth geology
Reservoir Simulation - CFD specific to oil asset management
Examples: Omega, Landmark VIP and Pro/Max, Geoquest Eclipse
Life Sciences
Understanding the living world - genome matching, protein folding, drug design, bioinformatics, organic chemistry
Examples: BLAST, Gaussian, LAMMPS, Trinity, Amber, NAMD
High Energy Physics
Understanding the atomic and sub-atomic world
Software from Fermi-Lab or CERN, or home-grown
Financial Modelling
The vast majority of clusters in the world use some flavour of Unix or Linux as their OS.
The most common form of interaction with these systems is a "shell" or "command line".
(We are going to learn about this using Legion’s login nodes.)
Clusters are very frequently used as a shared facility.
As such, work needs to be scheduled via a batch system.
Jobs are queued and prioritised based on requested resources.
(These are the focus of tomorrow’s session.)
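The idea of queueing jobs by requested resources can be illustrated with a toy priority queue. This is not how any particular batch system works: the `ToyScheduler` class and its policy (fewest requested core-hours first, ties broken by submission order) are invented for this sketch, and real schedulers weigh many more factors.

```python
import heapq
from itertools import count


class ToyScheduler:
    """Hypothetical scheduler: smallest requested core-hours runs first."""

    def __init__(self):
        self._queue = []
        self._order = count()  # tie-breaker: first come, first served

    def submit(self, name, cores, hours):
        """Queue a job, ranked by the core-hours it requests."""
        cost = cores * hours
        heapq.heappush(self._queue, (cost, next(self._order), name))

    def next_job(self):
        """Return the name of the next job to run, or None if the queue is empty."""
        if not self._queue:
            return None
        return heapq.heappop(self._queue)[2]


if __name__ == "__main__":
    sched = ToyScheduler()
    sched.submit("big_simulation", cores=64, hours=48)
    sched.submit("quick_test", cores=1, hours=1)
    sched.submit("medium_run", cores=16, hours=2)
    print([sched.next_job() for _ in range(3)])
```

Under this made-up policy, asking only for the resources a job really needs gets it started sooner.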