Bug 1331 - more informative job id

Status CLOSED FIXED
Reported 2012-02-15 08:45:00 +0100
Modified 2012-04-11 16:48:28 +0200
Product: FieldTrip
Component: qsub
Version: unspecified
Hardware: All
Operating System: All
Importance: P3 enhancement
Assigned to: Robert Oostenveld

Gio Piantoni - 2012-02-15 08:45:43 +0100

The job name in SGE is now the automatically generated jobid (qsubfeval.m @ 123):

    jobid = generatejobid(batch);

In my case, it consists of username_servername_pXXX_bX_jXXX. However, when checking qstat with many jobs, it's not clear which process is running. We could add the name of the function to the jobid:

    jobid = [varargin{1} '_' generatejobid(batch)];

I cannot think of any problem with it (maybe function names with spaces?) and it'd make qstat more informative.


Robert Oostenveld - 2012-02-15 09:20:47 +0100

Good idea. For torque it does not help that much, because in the qstat display the job name is truncated after 12 characters or so, so you don't see the latter part.

The only thing that I am concerned about is that we plan to move to function handles instead of strings. That will allow execution of stuff like this (think of this being a single m-file):

    function [y] = mainfunction(x)
    y = @subfunction;
      function z = subfunction(x)
      z = x^2;
      end % subfunction
    end % mainfunction

If I save this to test_bug1331.m, then

    >> test_bug1331
    ans =
        @test_bug1331/subfunction
    >> func2str(test_bug1331)
    ans =
    test_bug1331/subfunction

Similar issues might arise for functions inside objects. So it is doable, but would require something like

    private/warning_once.m:function out = fixname(toolbox)

to replace the '/' by a valid character for the filenames.
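The '/' replacement described above could look roughly like the sketch below. The helper name funname2jobprefix is hypothetical and not part of FieldTrip; it simply maps every character that is not a letter, digit or underscore to an underscore, which is one possible choice of "valid character".

```matlab
function name = funname2jobprefix(fun)
% FUNNAME2JOBPREFIX (hypothetical helper, not part of FieldTrip) converts a
% function name or function handle into a string that can safely be used
% as part of a job name and file name.
if isa(fun, 'function_handle')
  fun = func2str(fun);   % e.g. 'test_bug1331/subfunction'
end
% replace every character that is not a letter, digit or underscore
name = regexprep(fun, '[^a-zA-Z0-9_]', '_');
end
```

With this, funname2jobprefix('test_bug1331/subfunction') returns 'test_bug1331_subfunction'. Anonymous function handles would also be reduced to underscores and letters, although the result is not very readable.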


Gio Piantoni - 2012-02-15 09:46:43 +0100

Hi Robert,

Thanks for the feedback! Indeed the truncation is the problem, because qstat only shows:

    username_server...
    username_server...
    username_server...
    username_server...

That's why the varargin{1} would go first, so that you see:

    ft_freqanalysis...
    ft_dipolefittin...
    ft_sourceanalys...
    power_username_...

But you're right about subfunctions etc., I did not think about that. That would make things more complicated.

Best,
G


Marcel Zwiers - 2012-02-15 10:36:32 +0100

Robert, with torque, you see the end of the jobname with qstat (or qstat -l), and the start when you use qstat -a, right?


Gio Piantoni - 2012-02-15 10:44:04 +0100

You're right, truncation is not a big problem. In SGE, you can see the full job name with:

    qstat -r

But the main question is whether to include the Matlab function name in the job name at all.


Robert Oostenveld - 2012-02-15 11:37:38 +0100

(In reply to comment #3)

That is correct, see below.

1004 # qstat -a

dccn-l014.dccn.nl:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
373530.dccn-l014     eminij   interact STDIN                --     1   0   30gb 48:00 Q    --
373540.dccn-l014     eminij   interact STDIN                --     1   0   16gb 48:00 Q    --
373544.dccn-l014     eminij   interact STDIN                --     1   0    2gb 48:00 Q    --
373586.dccn-l014     eminij   long     eminij_mentat318  11486    --  -- 161061 14:03 R 10:33
373587.dccn-l014     eminij   long     eminij_mentat318  11552    --  -- 161061 14:03 R 10:33
373589.dccn-l014     eminij   long     eminij_mentat318  11914    --  -- 161061 14:03 R 10:33
373592.dccn-l014     eminij   long     eminij_mentat318  12459    --  -- 161061 14:03 R 10:33
373627.dccn-l014     stewhi   interact STDIN             14828    --  --   15gb 48:00 R 00:38

root@dccn-l014:~
1005 # qstat -l
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
373530.dccn-l014.dccn.nl  STDIN            eminij                 0 Q interactive
373540.dccn-l014.dccn.nl  STDIN            eminij                 0 Q interactive
373544.dccn-l014.dccn.nl  STDIN            eminij                 0 Q interactive
373586.dccn-l014.dccn.nl  ...25397_b2_j017 eminij          10:33:07 R long
373587.dccn-l014.dccn.nl  ...25397_b2_j018 eminij          10:33:09 R long
373589.dccn-l014.dccn.nl  ...25397_b2_j020 eminij          10:32:28 R long
373592.dccn-l014.dccn.nl  ...25397_b2_j023 eminij          10:32:29 R long
373627.dccn-l014.dccn.nl  STDIN            stewhi          00:00:20 R interactive

root@dccn-l014:~


Robert Oostenveld - 2012-02-15 11:44:07 +0100

(In reply to comment #2)

As long as the job id can serve as a unique filename on a shared NFS file system, it can be something else. That is why we have user, hostname and process ID (in case the user has two matlabs on one linux computer).

Should we make it an option, e.g. either

    username_servername_pXXX_bX_jXXX

or

    funname_username_servername_pXXX_bX_jXXX

Now that I think of it: the username is not required. That is also visible elsewhere in qstat. So

    funname_servername_pXXX_bX_jXXX

would be sufficient as a unique, filename-compatible job id.

What do you think about it being an option? Or should we just always do funname_servername_pXXX_bX_jXXX?

In case funname="subfunction/function" or "function/object", it can be

    subfunname_funname_servername_pXXX_bX_jXXX


Gio Piantoni - 2012-02-15 12:02:17 +0100

(In reply to comment #6)

No strong preference. SGE qstat always shows the username as well, so it's not necessary to include it in the jobname. My preference goes to:

    funname_servername_pXXX_bX_jXXX
    subfunname_funname_servername_pXXX_bX_jXXX

Thanks a lot!
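The naming scheme preferred above could be sketched as follows. This is only an illustration: namedjobid is a hypothetical function, the host name, process ID, batch and job number are passed in as plain arguments, and the zero-padded three-digit job number is an assumption based on the j017-style names in the qstat output.

```matlab
function jobid = namedjobid(funname, host, pid, batch, job)
% NAMEDJOBID (hypothetical, not part of FieldTrip) builds a job id of the form
%   funname_servername_pXXX_bX_jXXX
% or, for a name such as 'mainfunction/subfunction',
%   subfunname_funname_servername_pXXX_bX_jXXX
parts  = regexp(funname, '/', 'split');     % {'mainfunction', 'subfunction'}
% reverse the order so that the most specific name comes first;
% every part is followed by an underscore, including the last one
prefix = sprintf('%s_', parts{end:-1:1});   % 'subfunction_mainfunction_'
jobid  = sprintf('%s%s_p%d_b%d_j%03d', prefix, host, pid, batch, job);
end
```

For example, namedjobid('test_bug1331/subfunction', 'mentat318', 25397, 2, 17) gives 'subfunction_test_bug1331_mentat318_p25397_b2_j017'. The sketch does not clean up other characters in the function or host name that may be invalid in a file name.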


Robert Oostenveld - 2012-03-11 11:11:55 +0100

Instead of changing the default, I leave it to the user to think of a better name. I added the following option to qsubcellfun

    % batchid = string, to identify the jobs in the queue (default is user_host_pid_batch)

which allows you to override the default user_host_pid_batch. This is also supported in qsubfeval and qsubcompile. For qsubcompile it is interesting because it allows you to name the executable (and possibly make reuse easier).
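A call using the new option might look like the sketch below. It assumes that the FieldTrip qsub directory is on the path and that a Torque or SGE cluster is reachable. The memreq (bytes) and timreq (seconds) values and the batch name 'power_demo' are made up for illustration, and the exact job names that end up in the queue depend on generatejobid.

```matlab
% square four numbers on the cluster, one job per element
x = {1, 2, 3, 4};
p = {2, 2, 2, 2};

% 'batchid' replaces the default user_host_pid_batch prefix, so the jobs
% can be recognized in the qstat listing (the name is an arbitrary example)
y = qsubcellfun('power', x, p, ...
  'memreq',  1024^3, ...   % assumed memory requirement per job, in bytes
  'timreq',  60, ...       % assumed time requirement per job, in seconds
  'batchid', 'power_demo');
```

Since the batchid is chosen by the user, it is also up to the user to keep it unique among jobs that share the same file system.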


Robert Oostenveld - 2012-03-11 11:33:24 +0100

Committed three improvements to qsub in one go:

enhancement - this addresses bug #1254, bug #1331 and bug #1361. Allow the user to specify toolbox names when compiling. Allow the user to specify a custom batchid instead of user_host_pid_bM. Implemented backend=local using a call to qsubexec inside qsubfeval (i.e. really local execution), renamed the existing implementation to backend=system.

    mbp> svn commit
    Sending        qsub/private/generatejobid.m
    Sending        qsub/qsubcellfun.m
    Sending        qsub/qsubcompile.m
    Sending        qsub/qsubfeval.m
    Transmitting file data ....
    Committed revision 5435.


Robert Oostenveld - 2012-04-11 16:48:28 +0200

I cleaned up my bugzilla list by changing the status from resolved (either fixed or wontfix) into closed. If you don't agree, please reopen the bug. Robert