Re: Starcluster - Taking advantage of multiple cores on EC2
Grid Engine just executes jobs and manages resources.
It's up to your code to use more than one core.
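If the library underneath does not thread for you, one option is to split the work across processes yourself. The following is only a minimal sketch using the standard-library multiprocessing module; the function names and the sum-of-squares workload are made up for illustration and stand in for whatever independent chunks your real computation has:

```python
import multiprocessing


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        stop = start + step + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_sum(bounds):
    """CPU-bound stand-in for real work: sum of squares over one chunk."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=None):
    """Run one chunk per worker process and combine the partial results."""
    workers = workers or multiprocessing.cpu_count()
    with multiprocessing.Pool(workers) as pool:
        return sum(pool.map(partial_sum, split_range(n, workers)))


if __name__ == "__main__":
    print(parallel_sum_of_squares(1000000))
```

This only helps when the work splits into independent pieces. A single large SVD generally does not, which is why the threading of the linear algebra library matters.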
Maybe there is a config difference between your local scipy/numpy etc.
install and how StarCluster deploys its version?
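One way to compare the two installs is to run the same small report on the laptop and on the instance. This is a rough sketch, not a complete diagnostic: threading_report is a hypothetical helper name, and the two environment variables are the ones commonly honoured by OpenMP-based and MKL builds, which may or may not apply to an ATLAS build:

```python
import multiprocessing
import os


def threading_report():
    """Collect a few facts that often explain single-core BLAS behaviour."""
    report = {
        "cpu_count": multiprocessing.cpu_count(),
        # Unset (None) means the library picks its own default.
        "OMP_NUM_THREADS": os.environ.get("OMP_NUM_THREADS"),
        "MKL_NUM_THREADS": os.environ.get("MKL_NUM_THREADS"),
    }
    try:
        import numpy
        report["numpy_version"] = numpy.__version__
    except ImportError:
        report["numpy_version"] = None
    return report


if __name__ == "__main__":
    for key, value in sorted(threading_report().items()):
        print("%s: %s" % (key, value))
    try:
        import numpy
        # Prints the BLAS/LAPACK libraries numpy was built against.
        numpy.show_config()
    except ImportError:
        print("numpy is not installed")
```

If the laptop reports MKL and the instance reports ATLAS, that difference alone may account for the behaviour you are seeing.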
Grid Engine assumes by default a 1:1 ratio between job and CPU core
unless you are explicitly submitting to a parallel environment.
If you are the only user on a small cluster you probably don't have to
do much. The worst that could happen is that SGE queues up and runs
more than one of your threaded app jobs on the same host, and they
end up competing for CPU/memory resources to the detriment of all.
One way around that would be to configure exclusive job access and
submit your job with the "exclusive" request. That will ensure that your
job, when it runs, gets an entire execution host to itself.
Another way is to fake up a parallel environment. For your situation it
is very common for people to build a parallel environment called
"Threaded" or "SMP" so that they can run threaded apps without
oversubscribing an execution host.
With a threaded PE set up you'd submit your job:
$ qsub -pe threaded <# CPU> my-job-script.sh
... and SGE would account for your single job using more than one CPU on
a single host.
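Inside a parallel-environment job, Grid Engine exports the number of granted slots in the NSLOTS environment variable, so the job can size its thread pool to match. Here is a small sketch of how a Python program started from my-job-script.sh might do that. The helper names are hypothetical, and I am assuming the numerical library honours OMP_NUM_THREADS or MKL_NUM_THREADS, which depends on how it was built:

```python
import os


def slots_granted(default=1):
    """Return the slot count SGE granted (NSLOTS), or `default` outside SGE."""
    try:
        return max(1, int(os.environ.get("NSLOTS", default)))
    except ValueError:
        # NSLOTS was set to something that is not a number.
        return default


def limit_threads(n):
    """Ask OpenMP- and MKL-based libraries to use at most n threads."""
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(n)


if __name__ == "__main__":
    n = slots_granted()
    # Set the limits before importing numpy/scipy so the libraries see them.
    limit_threads(n)
    print("using %d thread(s)" % n)
```

That keeps the number of threads the job actually runs in line with what the scheduler has accounted for.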
FYI Grid Engine has recently picked up some Linux core binding
enhancements that make it easier to pin jobs and tasks to specific
cores. I'm not sure if the version of GE that is built into StarCluster
today has those features yet but it should gain them eventually.
Bill Lennon wrote:
> Dear Starcluster Gurus,
> I've successfully loaded the Starcluster AMI onto a single high-memory
> quadruple extra large instance and am performing an SVD on a large
> sparse matrix and then performing k-means on the result. However, I'm
> only taking advantage of one core when I do this? On my laptop (using
> scipy numpy, intel MKL), on a small version of this, all cores are taken
> advantage of automagically. Is there an easy way to do this with a
> single starcluster instance with Atlas? Or do I need to explicitly write
> my code to multithread?
> My thanks,
Received on Wed Aug 31 2011 - 13:29:49 EDT