Research Group of Prof. Dr. M. Griebel
Institute for Numerical Simulation

History

Cluster computing has been established at the research group since the mid-1990s. The first cluster, Parnass, was created by connecting the department's SGI workstations. It comprised a total of 49 MIPS R10K processors and 4.2 GByte of main memory, distributed among 33 SGI O2 workstations and 8 SGI Origin 200 servers, yielding a peak performance of 22.4 GFlop/s. In addition to the usual Fast Ethernet network used for standard network services, two additional interconnect technologies were installed: First, the O200 nodes were connected pairwise with CrayLink cables, which turned the eight O200s into four 4-processor shared-memory machines. Second, a Myrinet Gigabit network (nominal bandwidth 1.28 GBit/s, two-stage fat-tree topology) allowed fast communication for the message-passing libraries.
In 1998, Parnass2 was installed at the department. In its initial configuration, it contained 128 Pentium II 400 MHz CPUs distributed among 64 SMP nodes with 512 MByte of main memory each. Again, the nodes were connected using both Fast Ethernet (for standard network services) and Myrinet (for message passing) hardware. A three-level fat-tree network topology (full bisection bandwidth 82 GBit/s) allowed communication between any two nodes at the full bandwidth of 1.28 GBit/s as well as collision-free all-to-all routing. Parnass2 reached a Linpack performance of 29.6 GFlop/s, corresponding to an efficiency of 58% with respect to the theoretical peak performance of 51.2 GFlop/s. The system was therefore listed at rank 362 on the June 1999 TOP500 list. Later that year, the cluster was extended by another 8 dual-processor 400 MHz nodes. In this configuration, the cluster achieved a Linpack performance of 34.23 GFlop/s (59% of the 57.6 GFlop/s peak performance) and was ranked 454 on the November 1999 list. Parnass2 remained in service until November 2004.
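As a quick check of the peak and efficiency figures quoted above (assuming the usual counting of one floating-point operation per CPU per clock cycle; the text does not state the counting convention explicitly):

128 CPUs × 400 MHz × 1 Flop/cycle = 51.2 GFlop/s, and 29.6 GFlop/s / 51.2 GFlop/s ≈ 0.58, i.e. 58%.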
During 2003 and 2004, preparations were started to install a successor for Parnass2. First, some of the workstations at the institute were equipped with Myrinet cards and Gigabit Ethernet devices. Using this small workstation cluster, different networking hardware and several cluster software packages were investigated. In its final configuration, this cluster, called Eifel, consisted of 32 Dell Precision 620 workstations, each equipped with two Xeon 933 MHz processors and 2 GBytes of main memory. All nodes were directly connected to a single Myrinet M3-32E switch. After the installation of Himalaya, the Eifel cluster was mainly used for educational purposes. As of August 2006, this cluster is out of service.
In the spring of 2005, Himalaya was installed at the Institute for Numerical Simulation. Each cluster node (Dell PowerEdge 1850) contains two Xeon EM64T 3.2 GHz processors and, in its initial configuration, 2 GBytes of main memory (later upgraded to 4-6 GBytes). The nodes are equipped with Myrinet/XP network interfaces and connected to a Clos256 switch. Himalaya reached a Linpack performance of 1269 GFlop/s, corresponding to an efficiency of 77% with respect to the theoretical peak performance of 1638.4 GFlop/s. Himalaya was listed at rank 428 on the June 2005 TOP500 list (press release, certificate).
In the fall of 2007, the grid was extended by 19 nodes, which form the Eifel II cluster. Each node contains two quad-core Intel Xeon 2.66 GHz processors. Three of them (Dell PowerEdge 2950) are equipped with 16 GBytes of memory; the others (Dell PowerEdge 1950) initially contained 4 GBytes of RAM, later upgraded to 12 GBytes. Again, the nodes are connected using Myrinet. While the whole Eifel II cluster reaches a Linpack performance of 519.1 GFlop/s (32% of peak), a single PowerEdge 2950 already achieves 58.44 GFlop/s, i.e. it is faster than the entire Parnass2!
In 2010, a third cluster called Siebengebirge was installed. Although it consists of only five Dell PowerEdge R910 nodes, four eight-core Intel Xeon 2.26 GHz processors and 512 GB of RAM per node allow a theoretical peak performance of 1453 GFlop/s. We measured a Linpack performance of 1349 GFlop/s, which means that this cluster is faster than Himalaya, although the latter takes up eight times the rack space of Siebengebirge. In this cluster, the high-speed interconnect is provided by an InfiniBand network.
A Dell PowerEdge 2900 III serves as frontend node for Eifel II and Himalaya. In addition to the NFS disk space (around 27 TByte), it also provides standard network services to the clusters. Users may log on to this machine to compile their code and submit it to the batch system, which in turn distributes the work among the clusters. Himalaya runs the Ubuntu 10.04 operating system, which makes it easy for users to port their code from the department's workstations to the clusters. OpenMPI is used as the message-passing library.
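To illustrate the workflow described above, the following is a minimal sketch of an MPI program as a user might build it on the frontend with OpenMPI's standard mpicc wrapper and test it with mpirun. The file name hello_mpi.c and all program details are purely illustrative, and since the batch system is not named here, no submission script is shown.

/* hello_mpi.c -- minimal OpenMPI example (illustrative only).
 * Compile on the frontend:  mpicc -O2 hello_mpi.c -o hello_mpi
 * Small test run by hand:   mpirun -np 4 ./hello_mpi
 * (In production, the binary would be handed to the batch system instead.)
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char node[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                 /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    MPI_Get_processor_name(node, &len);     /* cluster node this rank runs on */

    printf("rank %d of %d on node %s\n", rank, size, node);

    MPI_Finalize();                         /* shut down the MPI runtime */
    return 0;
}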


Acknowledgements

The support of the DFG through the HBFG as well as the SFB 611 and SFB 1060 programs is gratefully acknowledged.

Disclaimer

All trademarks used are properties of their respective owners.