First off, I'm not familiar with WRF, smpar, or dmpar, but I'm guessing
they come from:
Also, I would recommend Open MPI over MPICH2 if possible, given that
Open MPI integrates nicely with SGE.
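For reference, a job submission script for an Open MPI job under SGE typically looks something like the sketch below. This is only an illustration: the parallel environment name (`orte`), the slot count, and the binary path are assumptions that depend on how the cluster is configured.

```shell
#!/bin/bash
#$ -cwd            # run the job from the submission directory
#$ -N wrf_run      # job name; stderr lands in wrf_run.eNN
#$ -pe orte 8      # request 8 slots via the Open MPI parallel environment (name is an assumption)
# When Open MPI is built with SGE support, mpirun picks up the
# host/slot list from SGE automatically, so no -hostfile is needed.
mpirun -np "$NSLOTS" ./wrf.exe
```

You would then submit it with something like `qsub wrf_run.sh`. Since Open MPI launches via SGE's own remote-startup mechanism, the processes it starts stay under SGE's control, which is what makes the integration "nice" compared with a hand-maintained hostfile.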
With that said, can you please provide me with:
1. The commands you used to submit the job (i.e. qsub ...)
2. The error output from the job. It is written either in the cluster
user's $HOME folder or in the directory you submitted the job from.
It's usually a file like myjob.e5, where the number is the job number.
3. Are you sure you installed the software on all nodes? In other
words, are you able to log in to each individual node and see/run WRF?
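For point 3, one quick way to check is to loop over the nodes from the master and test for the binary. This is a sketch only: the node names follow StarCluster's default naming convention, and the install path is an assumption you would need to adjust.

```shell
# Check that the WRF binary exists and is executable on every node.
# Node names and the install path are assumptions; adjust to your setup.
for host in master node001 node002 node003 node004 node005 node006 node007; do
    if ssh "$host" 'test -x $HOME/WRF/run/wrf.exe'; then
        echo "$host: OK"
    else
        echo "$host: wrf.exe missing or not executable"
    fi
done
```

If any node reports the binary missing, that alone would explain MPI ranks dying on those hosts, since each rank needs to find the executable locally (unless $HOME is NFS-shared across the cluster, as it is in a default StarCluster setup).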
On 09/13/2012 10:04 PM, Itesh Dash wrote:
> Dear All,
> I tried to install WRF for the first time on an Amazon EC2
> cluster and use it with StarCluster nodes to run in parallel.
> I have installed it with both smpar, using the OpenMP already
> present on the cluster, and separately with dmpar, using an
> MPICH2 install.
> But it seems that the model is not running in parallel mode. Most
> of the time the process gets force-killed, and even when it runs,
> the job is submitted to only a single node, even though I have
> specified the nodes/hosts in the hostfile.
> For information: I have 8 nodes in total, including the master,
> running 64-bit Ubuntu, and the model was built with
> gcc/gfortran/mpif90.
> I submitted the jobs following the StarCluster user guide.
> Is there something missing? Please let me know if more information
> is needed.
> I will appreciate any help in this regard.
> Regards, Itesh
> _______________________________________________
> StarCluster mailing list
> StarCluster@mit.edu
Received on Wed Oct 17 2012 - 13:43:23 EDT