I wanted to second the interest in a UI for StarCluster and in the ability to run in scenarios similar to those described by Michael Shamberger and others.
I am new to StarCluster and to Python, but a many-year Java veteran (maybe survivor? graduate? :) )
A little more context for my use cases:
We have recently started using StarCluster at Adchemy to manage machine learning model training workloads. Our main datastores are in on-premises Hadoop HDFS and in various databases that are either on premises or in an AWS VPC. Just the other day I switched to the develop branch to pick up the VPC capability, and so far it's working pretty well.
My goal with StarCluster is to build a more-or-less turnkey platform for our researchers to build and test new algos with ready access to our data resources, and also to serve as the production platform for the same algos without much change.
I would be interested in helping develop and/or test the following:
- Local Master seems like an interesting approach, which looks like a special case of the "cloud burst" capability Michael describes.
- UI is definitely a missing feature. I am also thinking of Django as the platform. Hopefully both a RESTful API and the UI would come out of this.
Another potential direction is to base it on / copy / get inspired by Netflix Genie (https://github.com/Netflix/genie).
- GCE is interesting, but probably longer range for us (we haven't outgrown AWS yet, but if the pricing changes it might become important)
- Software Plugins - if you mean plugins that are separately distributable, then yes, that would be very interesting. I suppose this is already possible, no?
Some other things on my wish / todo list:
- integration with Route53 (or other DNS), especially for VPC scenarios
Assigning sub-domains to clusters and automatically registering the master with DNS (less so for the nodes)
- integrating the load balancer into the cluster master itself, so that it's not necessary to have another machine / process for the cluster to be elastic
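To make the Route53 item a bit more concrete, here is a minimal sketch of the DNS record I have in mind for a cluster master. It only builds the change payload. The function name, the cluster tag, the domain and the IP address are all made up for illustration, and none of this is existing StarCluster functionality.

```python
def build_master_dns_change(cluster_tag, master_ip, base_domain, ttl=60):
    """Build a Route53 change batch mapping <cluster_tag>.<base_domain> to the master.

    Hypothetical helper: the naming scheme (one sub-domain per cluster) is an
    assumption, not something StarCluster does today.
    """
    # Route53 record names are fully qualified, so normalise the trailing dot.
    fqdn = "{0}.{1}.".format(cluster_tag, base_domain.rstrip("."))
    return {
        "Comment": "StarCluster master for cluster {0}".format(cluster_tag),
        "Changes": [
            {
                # UPSERT creates the record, or overwrites it if the cluster
                # was restarted and the master got a new address.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": fqdn,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": master_ip}],
                },
            }
        ],
    }


if __name__ == "__main__":
    # Example values only; nothing here talks to AWS.
    batch = build_master_dns_change("mlcluster", "10.0.0.12", "clusters.example.com")
    print(batch["Changes"][0]["ResourceRecordSet"]["Name"])
```

With boto3 (my assumption for the client library), a payload of this shape is what `change_resource_record_sets(HostedZoneId=..., ChangeBatch=batch)` on the Route53 client expects. A short TTL seems sensible here because the master address changes whenever a cluster is recreated.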
Finally, I want to say a big Thank You to all the folks that have contributed to StarCluster already! It's a great piece of software and already solves a great need for us very nicely!
Thanks very much! And I'm excited to help more!
Date: Wed, 11 Dec 2013 13:05:00 +0200
From: Michael Shamberger <shambergerm_at_gmail.com>
Subject: [StarCluster] (no subject)
To introduce myself, I am a University of Cincinnati engineering graduate
and will start my masters at Georgia Tech in HPC in the Spring after 15
years of IT work.
I am trying to see if there is interest in the following capabilities for
- User interface - I am planning to open source a Django user interface for
StarCluster in early January. It will have a Python-based API and also
provide the capability of dynamically launching web-facing applications
with StarCluster providing the compute behind them.
- Google Compute Engine - If you are not aware, GCE has launched a
competitor service to Amazon (
The offering is interesting because they charge by the minute after ten
minutes instead of by the hour, and reviews say instances come up faster. This
could potentially make your workload cheaper than even Amazon spot pricing.
I would like to hear opinions/ideas on the best way of enabling GCE to be
a backend cloud option to StarCluster. A fork of the boto package could do
the job. There was already a rejected pull request last year to add GCE
capability to boto.
- private cloud - Run StarCluster bare metal on own hardware with
- Software plugins - I am developing Docker-based, MKL-compiled software
plugins for various HPC software stacks (SciPy, R, Bioperl, OpenFoam, etc)
to run on top of grid workers and also plan to release those as open
source. These packages can be notoriously hard to compile and install
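
As a rough illustration of the billing point in the GCE item above, the sketch below compares whole-hour billing with per-minute billing that has a ten-minute minimum. The hourly rate is a made-up placeholder, not a real AWS or GCE price, and the function names are hypothetical.

```python
import math


def hourly_cost(minutes, rate_per_hour):
    """Cost when every started hour is billed in full."""
    return math.ceil(minutes / 60.0) * rate_per_hour


def per_minute_cost(minutes, rate_per_hour, minimum_minutes=10):
    """Cost when usage is billed per minute, with a minimum charge."""
    billed_minutes = max(math.ceil(minutes), minimum_minutes)
    return billed_minutes * rate_per_hour / 60.0


if __name__ == "__main__":
    rate = 0.60  # placeholder hourly rate, identical for both models
    for minutes in (3, 25, 70):
        print(minutes, hourly_cost(minutes, rate), per_minute_cost(minutes, rate))
```

At the same hourly rate, the per-minute model is never more expensive, and the gap is largest for jobs that end shortly after an hour boundary. Whether that beats spot pricing depends on the actual rates, which this sketch does not model.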
Please reply to this email or send mail to me directly.
Received on Wed Dec 11 2013 - 15:06:53 EST