Specifying the number of engines per node

From: Cory Dolphin <no email>
Date: Fri, 7 Mar 2014 00:54:44 -0500

Hello,

I am new to StarCluster, but have found the package extremely useful for
running a parameter-sweep grid search that trains sklearn models. In my
particular problem, each job requires a large amount of memory relative to
the number of CPUs. Compensating with memory-optimized instances is not
sufficient, since each job takes ~40GB of memory when training a model. Thus,
I needed to limit the number of ipengines on each node in the cluster. I
edited the ipcluster plugin so that it supports this as an optional parameter,
with the default behavior matching that of the original implementation.

I believe that others may find this modification useful, and I would love
feedback on whether or not such a change is interesting to the team.

There are two oddities with the implementation that I wish to discuss:

1. It requires the IPClusterRestartEngines plugin to also specify the number
   of engines.
2. The setting likely needs to change depending on the instance type.
   Alternatively, it would be trivial to specify an amount of memory per
   engine, i.e. start an engine for each 40GB of memory; this, however, may
   be difficult to explain.
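
To make the memory-per-engine alternative concrete, here is a minimal
sketch of the arithmetic. It is not the code in the pull request: the
function name and its parameters are hypothetical, and it assumes the
default is one engine per CPU when no memory limit is given.

```python
def engines_per_node(total_memory_gb, num_cpus, memory_per_engine_gb=None):
    """Return how many engines to start on a single node.

    Hypothetical helper for illustration only; not part of StarCluster.
    """
    if memory_per_engine_gb is None:
        # Assumed default: one engine per CPU.
        return num_cpus
    if memory_per_engine_gb <= 0:
        raise ValueError("memory_per_engine_gb must be positive")
    # Number of engines that fit in memory, rounded down.
    by_memory = int(total_memory_gb // memory_per_engine_gb)
    # Never start more engines than there are CPUs.
    return min(num_cpus, by_memory)


# Example: a node with 244GB of memory and 32 CPUs, 40GB per engine.
print(engines_per_node(244, 32, memory_per_engine_gb=40))
```

With these example numbers, memory rather than CPU count is the limiting
factor, and the same setting would carry over to other instance types
without being changed by hand.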

I have created a pull request for this change,
#379<https://github.com/jtriley/StarCluster/pull/379>,
but I wanted to reach out to the mailing list for discussion and feedback.

Thanks for sharing this wonderful project, and I hope others find the
limit on the number of engines useful.
Cory
Received on Fri Mar 07 2014 - 00:54:45 EST
