Parallelization of MPI application with Star Cluster


From: Torstein Fjermestad <no email>
Date: Thu, 8 May 2014 19:30:15 +0200

Dear all,

I am planning to use Star Cluster to run Quantum Espresso
(http://www.quantum-espresso.org/) calculations. For those who are not
familiar with Quantum Espresso, it is a code for running quantum mechanical
calculations on materials. For these types of calculations to scale well
with the number of CPUs, fast communication hardware is necessary.

For this reason, I configured a cluster based on the HVM-EBS image:

[1] ami-ca4abfbd eu-west-1 starcluster-base-ubuntu-13.04-x86_64-hvm (HVM-EBS)

Then I followed the instructions on this site

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html#test-enhanced-networking

to check that "enhanced networking" was indeed enabled. Running the
suggested commands gave me the same output as in the examples, which
indicates that "enhanced networking" is enabled in the image.

On this image I installed Quantum Espresso (using apt-get install) and
created a new, modified image from which I launched the final cluster.

On this cluster, I carried out some parallelization tests by running the
same Quantum Espresso calculation on different numbers of CPUs. I present
the results below:

# proc    CPU time    wall time
4         4m23.98s    5m 0.10s
8         2m46.25s    2m49.30s
16        1m40.98s    4m 2.82s
32        0m57.70s    3m36.15s

Except for the test run with 8 CPUs, the wall time is significantly longer
than the CPU time. This is usually an indication of slow communication
between the CPUs/nodes.
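
To make the gap easier to read, the wall/CPU ratio and the parallel
efficiency relative to the 4-process run can be computed directly from the
timings above. The following is a minimal Python sketch; the names TIMINGS,
to_seconds and summarize are hypothetical helpers introduced only for this
illustration, and taking the 4-process run as the baseline is an assumption.

```python
import re

# Timings copied from the results above: processes -> (CPU time, wall time).
TIMINGS = {
    4: ("4m23.98s", "5m 0.10s"),
    8: ("2m46.25s", "2m49.30s"),
    16: ("1m40.98s", "4m 2.82s"),
    32: ("0m57.70s", "3m36.15s"),
}


def to_seconds(text):
    """Convert a string such as '4m23.98s' or '5m 0.10s' to seconds."""
    match = re.fullmatch(r"\s*(\d+)m\s*([\d.]+)s\s*", text)
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def summarize(timings):
    """Return rows of (procs, cpu_s, wall_s, wall/cpu, speedup, efficiency).

    Speedup and efficiency are measured against the smallest run in the
    table (an assumption; no single-process timing is available).
    """
    base_procs = min(timings)
    base_wall = to_seconds(timings[base_procs][1])
    rows = []
    for procs in sorted(timings):
        cpu = to_seconds(timings[procs][0])
        wall = to_seconds(timings[procs][1])
        speedup = base_wall / wall
        efficiency = speedup / (procs / base_procs)
        rows.append((procs, cpu, wall, wall / cpu, speedup, efficiency))
    return rows


if __name__ == "__main__":
    print("procs  cpu[s]  wall[s]  wall/cpu  speedup  efficiency")
    for procs, cpu, wall, ratio, speedup, eff in summarize(TIMINGS):
        print(f"{procs:5d} {cpu:7.2f} {wall:8.2f} {ratio:9.2f} "
              f"{speedup:8.2f} {eff:11.2f}")
```

With these numbers the wall/CPU ratio is about 1.02 on 8 processes but
about 2.4 on 16 and about 3.7 on 32, so most of the elapsed time in the
larger runs is spent outside the computation itself.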

My question is therefore whether there is a way to check the communication
speed between the nodes / CPUs.
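
One rough way to get a first number is to time small TCP round trips
between two nodes. The sketch below uses only the Python standard library
and is an assumption-laden illustration, not an MPI benchmark: it measures
plain TCP over whatever route the two hosts use, the function names
serve_echo, recv_exact and measure_round_trips are hypothetical, and a
dedicated MPI ping-pong test would be more representative of what Quantum
Espresso actually experiences.

```python
import socket
import threading
import time


def serve_echo(server_sock):
    """Accept one connection and echo back everything until the peer closes."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            conn.sendall(data)


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock, or raise if the peer closes early."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def measure_round_trips(host, port, message_size=1024, repeats=100):
    """Return a list of round-trip times in seconds for small messages.

    Keep message_size modest: the client sends the whole message before
    reading, so very large messages could stall on full socket buffers.
    """
    payload = b"x" * message_size
    times = []
    with socket.create_connection((host, port)) as sock:
        # Disable Nagle's algorithm so small messages are sent immediately.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(repeats):
            start = time.perf_counter()
            sock.sendall(payload)
            recv_exact(sock, message_size)
            times.append(time.perf_counter() - start)
    return times


if __name__ == "__main__":
    # Loopback self-test; on a cluster, run serve_echo on one node and
    # call measure_round_trips with that node's address from another.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    threading.Thread(target=serve_echo, args=(server,), daemon=True).start()
    rtts = sorted(measure_round_trips("127.0.0.1", port))
    print(f"median round trip: {rtts[len(rtts) // 2] * 1e6:.1f} us")
```

Comparing the median between two cluster nodes with the loopback value on
a single node gives a crude sense of how much the network adds.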

The large difference between the CPU time and wall time may also be caused
by an incorrect configuration of the cluster. Is there something I have
done wrong / forgotten?

Does anyone have suggestions on how I can fix this parallelization issue?

Thanks in advance for your help.

Regards,
Torstein Fjermestad
Received on Thu May 08 2014 - 13:30:17 EDT
