Hi Rayson
I have written a plugin that runs the Intel installer from a tgz file on S3. It takes a while but seems to work.
I am using one of the stock public images:
[0] ami-3393a45a us-east-1 starcluster-base-ubuntu-13.04-x86_64 (EBS)
Should I be using the HVM image? I understood that HVM images are only needed for GPU computing.
How can I tell whether the enhanced networking driver is set up correctly? On the paravirtual instance, lspci and similar tools show nothing.
Thanks for the suggestions!
David Stuebe
Scientist & Software Engineer – RPS ASA
55 Village Square Drive
South Kingstown, RI 02879-8248
Tel: +1 (401) 789-6224
Email: David.Stuebe_at_rpsgroup.com
www: http://www.asascience.com/ | http://www.rpsgroup.com/
A member of the RPS Group plc
From: Rayson Ho <raysonlogin_at_gmail.com>
Date: Fri, 16 May 2014 09:13:03 -0400
To: David Stuebe <dstuebe_at_asascience.com>
Cc: starcluster_at_mit.edu
Subject: Re: [StarCluster] MPICH Fabric
How are you deploying the Intel Cluster Compiler Suite? If you are using a custom AMI, make sure that the AWS enhanced networking NIC driver is set up correctly, and that your instances are all in a placement group and in a VPC (StarCluster should set those up if you are using the latest stable version).
We benchmarked AWS enhanced networking on the C3 family a few months ago, and the latency is around 20% better on a pair of C3.8xlarge instances in a placement group with AWS enhanced networking enabled:
http://blogs.scalablelogic.com/2014/01/enhanced-networking-in-aws-cloud-part-2.html
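One way to check the driver from inside a running instance is to look at which kernel driver is bound to the network interface. The following is a minimal sketch, not an official AWS or StarCluster tool. It assumes a Linux guest with sysfs mounted, that the interface is named eth0, and that enhanced networking on C3 instances uses the Intel 82599 virtual function, whose Linux driver is ixgbevf. The function names are hypothetical. As far as I know, enhanced networking also requires an HVM AMI, so a paravirtual instance would be expected to report the Xen "vif" driver instead.

```python
import os


def nic_driver(interface="eth0", sysfs_root="/sys/class/net"):
    """Return the name of the kernel driver bound to `interface`, or None.

    The driver is read from the sysfs symlink
    <sysfs_root>/<interface>/device/driver, the same information that
    `ethtool -i <interface>` reports in its "driver:" field.
    """
    link = os.path.join(sysfs_root, interface, "device", "driver")
    if not os.path.exists(link):
        return None
    # The symlink target ends in the driver name, e.g. ".../drivers/ixgbevf".
    return os.path.basename(os.path.realpath(link))


def has_enhanced_networking(interface="eth0", sysfs_root="/sys/class/net"):
    """True if the interface is driven by ixgbevf.

    Assumption: ixgbevf is the driver used for enhanced networking on the
    C3 family; other instance families may use a different driver.
    """
    return nic_driver(interface, sysfs_root) == "ixgbevf"


if __name__ == "__main__":
    print("driver:", nic_driver())
    print("enhanced networking driver loaded:", has_enhanced_networking())
```

If this reports a driver other than ixgbevf, the instance is most likely still using the standard virtual NIC, and the AMI type or the instance's SR-IOV attribute would be the next things to check.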
Rayson
==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/
http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html
On Thu, May 15, 2014 at 3:18 PM, David Stuebe <DStuebe_at_asascience.com> wrote:
Hi Starcluster
Does anyone have advice on which fabric to use when running on AWS?
I know the interconnect is supposed to be 10 GigE, but my model depends more on latency than on throughput.
I have had to use the Intel Cluster Compiler Suite rather than the built-in OpenMPI. I hope to resolve those issues and compare the two; I am interested to see the performance differences.
Currently the model actually gets slower as I add processors beyond a single node.
Model performance running on Amazon:
C3.8xlarge - 1 instance, 32 cores
!  IINT    SIMTIME(UTC)                FINISH IN      SECS/IT  PERCENT COMPLETE
!8396282   2014-03-18T00:18:02.000000  0000:07:54:19  0.1103   | |
C3.8xlarge - 2 instances, 64 cores
!8395221   2014-03-18T00:00:21.000000  0000:20:27:58  0.2843   | |
C3.8xlarge - 3 instances, 96 cores
!8395273   2014-03-18T00:01:13.000000  0007:18:33:19  2.5918   | |
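To put the SECS/IT figures above in terms of scaling, here is a small sketch that computes speedup and parallel efficiency relative to the single-instance run. The timings are copied from the log lines above; the function name and the definition of efficiency (speedup divided by the ratio of instance counts) are my own choices for illustration, not something produced by the model.

```python
# Seconds per iteration reported above, keyed by number of C3.8xlarge instances.
secs_per_iter = {1: 0.1103, 2: 0.2843, 3: 2.5918}


def scaling(timings):
    """Return (instances, speedup, efficiency) tuples relative to the smallest run.

    speedup    = baseline time per iteration / time per iteration
    efficiency = speedup / (instances / baseline instances)
    """
    base_n = min(timings)
    base_time = timings[base_n]
    rows = []
    for n in sorted(timings):
        speedup = base_time / timings[n]
        efficiency = speedup * base_n / n
        rows.append((n, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for n, speedup, efficiency in scaling(secs_per_iter):
        print(f"{n} instance(s): speedup {speedup:.3f}x, efficiency {efficiency:.1%}")
```

With these numbers the speedup is roughly 0.39x on two instances and roughly 0.04x on three, so each added node makes every iteration slower rather than faster.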
David Stuebe
_______________________________________________
StarCluster mailing list
StarCluster_at_mit.edu
http://mailman.mit.edu/mailman/listinfo/starcluster
Received on Fri May 16 2014 - 12:45:48 EDT