Re: How does StarCluster track the clusters it's managing?
I actually joined the mailing list today because of this issue with spot
instances dying. I stumbled across this fork that promises to handle it.
https://github.com/datacratic/StarCluster
Anyone have experience using it?
On Mon, Mar 16, 2015 at 4:39 PM, Steve Darnell <darnells_at_dnastar.com> wrote:
> Hi Raj,
>
>
>
> Thanks for the reply. Manual clean-up is indeed required to deal with
> these rogue instances. It would be really convenient if loadbalancer
> resolved this scenario automatically once an hour. One can dream (or
> implement)…
>
>
>
> Best regards,
>
> Steve
>
>
>
> *From:* rqbanerjee_at_gmail.com [mailto:rqbanerjee_at_gmail.com] *On Behalf Of *Rajat
> Banerjee
> *Sent:* Monday, March 16, 2015 2:04 PM
> *To:* Steve Darnell
> *Cc:* Eduardo Gurgel Valente; Nicholas Chammas; starcluster_at_mit.edu
>
> *Subject:* Re: [StarCluster] How does StarCluster track the clusters it's
> managing?
>
>
>
> Sorry for the super-slow response.
>
> The elastic load balancer parses the output of 'qhost' on the cluster:
>
>
> https://github.com/jtriley/StarCluster/blob/develop/starcluster/balancers/sge/__init__.py#L59
>
> I don't remember the exact reason for using that instead of the same logic
> as 'listclusters' above, but here's my guess a few years after the fact:
>
> - Avoids another remote API call to AWS' tagging service to retrieve the
> tags for all instances within an account. This needs to be called every
> minute, so a speedy call to your cluster instead of to a remote API is
> beneficial.
>
> - qhost outputs the number of machines correctly configured and able to
> process work. If a machine shows up in 'listclusters' but not in 'qhost',
> it's likely not usable to process jobs, and would probably need manual
> cleanup.
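
To make the second point concrete, here is a minimal Python sketch of the comparison described above: nodes that the tag-based listing reports, minus the hosts that qhost knows about. It is not StarCluster's actual code. `parse_qhost`, `unusable_nodes`, and the sample output are hypothetical names and data made up for illustration, and the sketch assumes the plain-text column layout that `qhost` typically prints (the real balancer may parse a different format).

```python
def parse_qhost(text):
    """Return the set of execution host names found in plain-text qhost output."""
    hosts = set()
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # blank line
        name = fields[0]
        # Skip the header row, the dashed separator, and the "global" pseudo-host.
        if name in ("HOSTNAME", "global") or set(name) == {"-"}:
            continue
        hosts.add(name)
    return hosts


def unusable_nodes(listed_nodes, qhost_text):
    """Nodes reported by the instance listing that SGE does not know about."""
    return sorted(set(listed_nodes) - parse_qhost(qhost_text))


# Illustrative sample only (assumed layout), not captured from a real cluster.
SAMPLE_QHOST = """\
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
master                  linux-x64       2  0.01    7.3G  167.4M     0.0     0.0
node001                 linux-x64       2  0.01    7.3G  139.6M     0.0     0.0
"""

rogue = unusable_nodes(["master", "node001", "node002"], SAMPLE_QHOST)
print(rogue)
```

Run against the sample, this flags node002: it appears in the instance listing but is unknown to SGE, which is the case that currently needs manual cleanup.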
>
> HTH
>
> Raj
>
>
>
> On Tue, Mar 10, 2015 at 4:04 PM, Steve Darnell <darnells_at_dnastar.com>
> wrote:
>
> On a related topic, does anyone know how the load balancing feature tracks
> the cluster and its compute nodes? I have gotten into situations where
> listclusters correctly reports that a cluster and its nodes are running (I
> can ssh into master and the nodes, etc.); however, loadbalance reports that
> the cluster is not running and refuses to balance the cluster.
>
>
>
> Best regards,
>
> Steve
>
>
>
> *From:* starcluster-bounces_at_mit.edu [mailto:starcluster-bounces_at_mit.edu] *On
> Behalf Of *Eduardo Gurgel Valente
> *Sent:* Tuesday, March 10, 2015 2:08 PM
> *To:* Nicholas Chammas
> *Cc:* starcluster_at_mit.edu
> *Subject:* Re: [StarCluster] How does StarCluster track the clusters it's
> managing?
>
>
>
> Hi Nick,
>
> Look at the security group it creates. It follows a naming
> convention. In addition, there are tags with encrypted information at play.
>
> Eduardo
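
As a rough illustration of the naming-convention idea, the Python sketch below recovers cluster names from a list of security group names. The `@sc-` prefix is an assumption about the convention and should be checked against the StarCluster source; `cluster_names` and the sample group names are hypothetical. In practice the group names would come from an EC2 DescribeSecurityGroups call rather than a hard-coded list.

```python
# Assumed prefix for StarCluster-managed security groups (verify in the source).
SC_GROUP_PREFIX = "@sc-"


def cluster_names(group_names, prefix=SC_GROUP_PREFIX):
    """Return the cluster names encoded in prefixed security group names."""
    names = []
    for group in group_names:
        # Keep only groups that follow the convention and carry a non-empty name.
        if group.startswith(prefix) and len(group) > len(prefix):
            names.append(group[len(prefix):])
    return sorted(names)


# Hypothetical group names, as they might appear in one AWS account.
groups = ["default", "@sc-mycluster", "web-servers", "@sc-spot"]
clusters = cluster_names(groups)
print(clusters)
```

Because the cluster name lives in the group name itself, a listing like this needs no per-instance tag lookups, only one call to enumerate security groups.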
>
>
>
> On Mon, Mar 9, 2015 at 11:16 PM, Nicholas Chammas <
> nicholas.chammas_at_gmail.com> wrote:
>
> Howdy!
>
> At this point in the StarCluster demo video
> <http://youtu.be/vC3lJcPq1FY?t=7m20s>, the presenter runs the following
> command to list all the clusters being managed by StarCluster:
>
> starcluster listclusters
>
> How does StarCluster track all the clusters it’s managing? Is it through
> the use of EC2 instance tags? A pointer to the relevant code would also be
> helpful.
>
> I’m looking to implement a feature similar to listclusters but for
> spark-ec2 <http://spark.apache.org/docs/1.2.1/ec2-scripts.html>. Tagging
> seems like the way to go to do that, but we had some issues with it
> <https://issues.apache.org/jira/browse/SPARK-3332> when we used it with
> spark-ec2.
>
> So I’m curious to know how StarCluster did things.
>
> Nick
>
>
>
>
> _______________________________________________
> StarCluster mailing list
> StarCluster_at_mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster
>
>
>
>
Received on Mon Mar 16 2015 - 17:34:31 EDT