Re:  processes still running after qdel
 
I've seen this before myself.  Make sure the ENABLE_ADDGRP_KILL=TRUE
option is set in execd_params in your SGE configuration. You can check
with: qconf -sconf | grep execd_params
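
A minimal sketch of that check, assuming qconf is on the PATH of the master
node. The helper name has_addgrp_kill is made up for illustration, and it
assumes the value is spelled TRUE/true or 1:

```shell
#!/bin/sh
# Hypothetical helper: reads `qconf -sconf` output on stdin and succeeds
# (exit 0) only if the execd_params line enables ENABLE_ADDGRP_KILL.
has_addgrp_kill() {
    grep -E '^execd_params[[:space:]]' \
        | grep -Eqi 'ENABLE_ADDGRP_KILL=(true|1)([,[:space:]]|$)'
}

# Only query the cluster when qconf is actually installed.
if command -v qconf >/dev/null 2>&1; then
    if qconf -sconf | has_addgrp_kill; then
        echo "ENABLE_ADDGRP_KILL is set"
    else
        echo "ENABLE_ADDGRP_KILL is not set in execd_params"
    fi
fi
```

If it reports "not set", the execd_params line of the global configuration
can be edited with qconf -mconf (as an SGE admin user).
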
On Thu, Jun 18, 2015 at 5:37 PM, David Koppstein <david.koppstein_at_gmail.com>
wrote:
> Thanks. I've used qdel -f in the past and it's very useful. Again, sorry
> for not being more specific -- the issue I'm having is that after I use
> qdel and can verify the job is no longer in the queue using qstat, the
> process is still running when I log in to the node and run ps -ef.
>
> It could be that there is some problem with the process that causes it to
> hang, but I would still expect that doing qdel would kill the process -- I
> have to log in to the nodes and kill them myself.
>
> On Thu, Jun 18, 2015 at 5:33 PM Jacob Barhak <jacob.barhak_at_gmail.com>
> wrote:
>
>> Hi David,
>>
>> There is a qdel force delete option you can run as an administrator to
>> help the situation. Check out the -f option in the qdel man page for
>> details.
>>
>> However, in some cases there may be issues with qdel, some of which have
>> been reported on this list. Job dependencies may complicate deletion: I
>> found myself restarting machines to clear out deleted jobs, and deleting
>> specific jobs by number one at a time, especially jobs that depend on many
>> others - those should be deleted first.
>>
>> Hopefully, this will give you some ideas to try.
>>
>>              Jacob
>> On Jun 18, 2015 9:16 AM, "David Koppstein" <david.koppstein_at_gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> I recently noticed that a lot of processes are still running on the
>>> compute nodes, even after I delete them using qdel. Is there a way to
>>> prevent this from happening?
>>>
>>> Thanks,
>>> David
>>>
>>> _______________________________________________
>>> StarCluster mailing list
>>> StarCluster_at_mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/starcluster
>>>
>>>
Received on Fri Jun 19 2015 - 10:08:38 EDT
 