High Performance Computing

I am interested in setting up some High Performance Computing clusters and would like to get people's views and experiences on this.

I have 2 requirements:

  1. Compute clusters for fast, CPU-intensive computations
  2. Storage clusters providing parallel, extensible filesystems spread across many nodes

Both of these should run across multiple commodity hardware nodes and ideally be Linux/Unix based and open source.

Any feedback welcome.

see
Beowulf.org: The Beowulf Cluster Site
and

Thanks for the response. I did start reading those 2 sites.

I was also interested in people's opinions and experiences of any of the technologies surrounding Linux High Performance Clusters and parallel filesystems.

Is anybody out there using these technologies in production? What kinds of things are you doing, and how?

Think carefully about what sort of problems you want to solve. For example:
parallel computation or task farming?
If the former, are the communications latency-bound or bandwidth-bound? Are collective communications important? Will you need full switching for remote communication, or just nearest-neighbour exchange (see the sketch below)?
Is the workload CPU-bound, memory-bound, or I/O-bound?

These factors are not necessarily mutually exclusive, and inevitably there are trade-offs, but one size does not fit all.
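To make that distinction concrete, here is a minimal sketch (assuming mpi4py and an MPI runtime are available; run with something like mpirun -n 4 python comm_patterns.py, where the filename is just an example) contrasting a collective operation with a nearest-neighbour exchange. A collective like allreduce stresses the switch's all-to-all capacity, while a halo exchange mostly stresses per-message latency.

[code]
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each rank holds a chunk of work.
local = np.full(1_000_000, rank, dtype=np.float64)

# Collective pattern: every rank needs the global sum, so traffic crosses
# the whole switch fabric at once.
global_sum = comm.allreduce(local.sum(), op=MPI.SUM)

# Nearest-neighbour pattern: exchange a small halo with adjacent ranks in a
# ring; per-message latency matters more than bisection bandwidth here.
left, right = (rank - 1) % size, (rank + 1) % size
halo = comm.sendrecv(local[:10], dest=right, source=left)

if rank == 0:
    print("global sum:", global_sum, "first halo values:", halo[:3])
[/code]

Profiling which of these two patterns dominates your real code is what tells you whether cheap Ethernet will do or whether a low-latency interconnect earns its price.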

  1. CPU intensive computation of a single task
  2. Parallel computation of a task broken down into pieces
  3. Storage across many commodity nodes with scalability and i/o performance
  4. The solutions do not need to be geographically dispersed; the same server room is fine.

So this problem used to be rather "simple": take some CPU metric (MIPS, FLOPS, SPECint, SPECfp, whatever) and divide it by the cost of the computer. Back then the choice was clear: 2 CPUs per "1U" system. Now the choice has expanded to cores per chip, and we have 2-, 4-, and even 8-way systems (AMD). You could build a rack of Sun x6400s, each containing 64 cores. And we shouldn't forget Sun's T1 processor line, with 128 "virtual" cores.
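As a toy version of that performance-per-dollar arithmetic (the GFLOPS and price figures below are hypothetical placeholders, not quotes):

[code]
# Toy "performance per dollar" comparison; plug in real benchmark numbers
# and vendor quotes for anything meaningful.
nodes = {
    # name: (sustained GFLOPS per node, cost per node in USD) -- assumed values
    "2-socket 1U box":   (80.0,   4000.0),
    "8-socket big iron": (640.0, 45000.0),
}
for name, (gflops, cost) in nodes.items():
    print(f"{name:18s} {gflops / cost:.3f} GFLOPS per dollar")
[/code]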

Further complicating the issue: the cost is no longer just the compute node. Now you have to consider the networking costs between nodes. 100 Mbit Ethernet switches are cheap but may not be suitable for a cluster of very fast machines. InfiniBand gives you great performance, but scaling it is very expensive -- the cabling alone can cost as much as your CPUs!
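A quick sketch of how the interconnect shifts the per-node economics (all prices here are assumed placeholders, not real quotes):

[code]
# Compare total cluster cost for two hypothetical interconnect options.
node_cost      = 4000.0   # assumed cost of one compute node
gige_port_cost = 50.0     # switch port + cable, assumed
ib_port_cost   = 1200.0   # HCA + switch port + cable, assumed
nodes          = 256

for name, port_cost in [("Gigabit Ethernet", gige_port_cost),
                        ("InfiniBand", ib_port_cost)]:
    total = nodes * (node_cost + port_cost)
    print(f"{name:17s} ${total:,.0f} total, network is "
          f"{port_cost / (node_cost + port_cost):.0%} of each node's cost")
[/code]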

Further complications: the operating costs of cooling and electricity are not insignificant. For every watt used by the CPU, you can count on needing roughly 2 watts to cool it (it depends on the climate you're in). Thus, if every compute node draws 1.5 A and you have 256 compute nodes, you will need 1.5 * 256 * 3 = 1152 amps of power and maybe two 30-ton chillers.
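The same arithmetic spelled out (the supply voltage is an assumed figure; the factor of 3 is 1 part compute plus 2 parts cooling, as above):

[code]
# Power and cooling budget for the cluster described above.
amps_per_node  = 1.5
nodes          = 256
cooling_factor = 3      # 1 (compute) + 2 (cooling), climate dependent
volts          = 208    # assumed supply voltage

total_amps  = amps_per_node * nodes * cooling_factor          # 1152 A
total_watts = amps_per_node * volts * nodes * cooling_factor
print(f"{total_amps:.0f} A total, roughly {total_watts / 1000:.0f} kW including cooling")
[/code]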

Q1: What percentage of the operations are floating point? Do you need double precision? (Usually the answer is yes.)
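A tiny illustration of why the answer is usually yes (assuming NumPy is available): single precision simply cannot represent the result of adding 1 to 10^8, while double precision can.

[code]
import numpy as np

a32 = np.float32(1e8)
a64 = np.float64(1e8)
print((a32 + np.float32(1.0)) - a32)   # 0.0 -- the increment is lost in single precision
print((a64 + np.float64(1.0)) - a64)   # 1.0 -- double precision resolves it
[/code]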

What's the expected ratio between computation time and communication time between the pieces? Medium ratio: do some computation, then send intermediate results to all neighbours, then do some more computation. Low ratio: compute, send a result, wait for a message, compute, send a result, and so on. High ratio: the CPUs crunch, crunch, crunch, then finally send results to a central task which does a final computation.

This is important in deciding what kind of network capacity you will need.
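A back-of-the-envelope way to estimate which regime you are in (every number below is a hypothetical placeholder -- measure your own code and network):

[code]
# Estimate the computation-to-communication ratio per iteration.
flops_per_step    = 2e9      # work each node does per step (assumed)
node_flops        = 10e9     # sustained FLOP/s of one node (assumed)
bytes_per_step    = 8e6      # data exchanged per step (assumed)
link_bandwidth    = 125e6    # ~1 Gbit/s Ethernet, in bytes/s
link_latency      = 50e-6    # seconds per message (assumed)
messages_per_step = 4

t_comp = flops_per_step / node_flops
t_comm = messages_per_step * link_latency + bytes_per_step / link_bandwidth
print(f"compute {t_comp * 1e3:.1f} ms, communicate {t_comm * 1e3:.1f} ms, "
      f"ratio {t_comp / t_comm:.1f}")
# A ratio well above 1 tolerates cheap Ethernet; near or below 1 is where a
# low-latency interconnect starts to pay for itself.
[/code]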

How about reliability? Commodity nodes mean a high rate of disk and/or node failures. Can you put up with frequent filesystem downtime, or will you need high availability on this filesystem?
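Some rough expected-failure arithmetic (the node count, disk count, and annual failure rate below are assumed for illustration; published field studies put commodity disk AFR at a few percent):

[code]
# Expected disk failures per year across a commodity storage cluster.
nodes               = 64
disks_per_node      = 4
annual_failure_rate = 0.03   # 3% per disk per year, assumed

expected_failures = nodes * disks_per_node * annual_failure_rate
print(f"plan for roughly {expected_failures:.0f} disk failures per year")
# At that rate you want replication or erasure coding in the filesystem,
# plus hot spares, rather than hoping the hardware stays quiet.
[/code]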

Does your budget include lifetime operating costs? Does your server room have a specification for lb/ft^2? One institution I worked at discovered that the building was designed for a certain weight density -- even in the server room. It turned out that putting more than about 8 computer racks in the room exceeded this density! So we had the room, but adding more racks might have made the floor unstable, especially given that the building was in a seismically active area (about one magnitude-4+ quake every 2 to 3 years).
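A floor-loading sanity check along the same lines (rack weight, footprint, and floor rating below are assumed values -- use your own rack specs and the building's rated load):

[code]
# Compare the point load under a loaded rack with the floor's rating.
rack_weight_lb          = 2000.0   # fully loaded rack, assumed
rack_footprint_ft2      = 4.0      # roughly 2 ft x 2 ft, assumed
floor_rating_lb_per_ft2 = 150.0    # assumed structural rating

point_load = rack_weight_lb / rack_footprint_ft2
print(f"{point_load:.0f} lb/ft^2 under each rack vs a rating of "
      f"{floor_rating_lb_per_ft2:.0f} lb/ft^2 -- spread the load or cap the rack count")
[/code]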