Long running job.


Message boards : Number crunching : Long running job.

alexander

Joined: 28 Apr 13
Posts: 87
Credit: 26,716,176
RAC: 19
Message 6115 - Posted: 29 Dec 2018, 12:33:10 UTC
Looks like they are running now. Re-enabled my computers for Asteroid.
Mike Thomson

Joined: 3 Apr 13
Posts: 3
Credit: 45,909,867
RAC: 588
Message 6152 - Posted: 12 Jan 2019, 23:40:28 UTC
New work units are available, but the same problem persists: some units take 20+ hours, and if BOINC is stopped and restarted, the offending work unit starts back at zero. The offending units are ps_181122; ps_190110 units seem to reach 100% without a problem. Does anyone know what's going on?
Steve Dodd

Joined: 23 Aug 12
Posts: 8
Credit: 10,001,880
RAC: 0
Message 6161 - Posted: 30 Jan 2019, 0:31:06 UTC - in response to Message 6152.  
Don't know what's going on, but I'm experiencing the same symptoms on certain WUs. I've had to abort 4 so far. Several others have run w/o issue.
JStateson
Joined: 16 Jan 14
Posts: 17
Credit: 27,435,207
RAC: 18,366
Message 6162 - Posted: 31 Jan 2019, 3:29:18 UTC
Same problem here. I aborted a 21-hour task that should have taken 21 minutes. I looked at my wingman's result and saw that the same task was automatically terminated with "EXIT_TIME_LIMIT_EXCEEDED" after 10 days. Truly a waste of computer power. I respectfully request that the limit behind "EXIT_TIME_LIMIT_EXCEEDED" be set to some reasonable value. I could not find that option under preferences, so I assume it is coded into the program. I suggest an hour or two.
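For anyone wondering where such a limit could come from: as far as I understand it, the BOINC client compares a task's elapsed time against the work unit's `rsc_fpops_bound` divided by the estimated speed of the app version, and both values are set on the project side rather than in user preferences. The sketch below only illustrates that arithmetic. The function names and every number in it are hypothetical, chosen to match the 21-minute expectation mentioned above, and are not taken from the project's actual configuration.

```python
def max_elapsed_seconds(rsc_fpops_bound: float, app_flops: float) -> float:
    """Upper bound on run time: operation bound divided by estimated speed."""
    if app_flops <= 0:
        raise ValueError("app_flops must be positive")
    return rsc_fpops_bound / app_flops


def exceeds_time_limit(elapsed_seconds: float,
                       rsc_fpops_bound: float,
                       app_flops: float) -> bool:
    """True if a task has run longer than its bound allows."""
    return elapsed_seconds > max_elapsed_seconds(rsc_fpops_bound, app_flops)


# Hypothetical values, for illustration only.
app_flops = 1.0e12                    # assumed GPU app speed (FLOPS)
expected_seconds = 21 * 60            # a task that "should" take 21 minutes
tight_bound = 6 * expected_seconds * app_flops  # bound worth about 2 hours

print(max_elapsed_seconds(tight_bound, app_flops))            # seconds allowed
print(exceeds_time_limit(21 * 3600, tight_bound, app_flops))  # 21-hour task
```

With a bound of a few multiples of the expected run time, a 21-hour task would be cut off long before it wasted days of GPU time.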
JStateson
Joined: 16 Jan 14
Posts: 17
Credit: 27,435,207
RAC: 18,366
Message 6163 - Posted: 31 Jan 2019, 5:19:28 UTC - in response to Message 6162.  
Found a way to at least stop long tasks from executing. I set a rule in BoincTasks to suspend any Asteroids GPU task that runs for more than 1 hour and 30 minutes, entered as "0d,01:30:00" (my screenshot shows 08:00 because I did not want to wait that long). The rule reads: "After 1:30 hours, wait 10 seconds, then suspend the task for 99 days".

This allows other tasks to use the GPU. Note that the task status is "suspended by user" and the debug log shows the rule was executed. This rule was applied to a remote computer, ms-7593-1060, which is nice.
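The selection logic of a rule like this can be sketched in a few lines of Python. This is not how BoincTasks implements its rules; the `Task` record, its fields, and the function names are hypothetical. The sketch only decides which tasks are over the limit. Actually suspending them (through BoincTasks itself or `boinccmd`, for example) is left out.

```python
import re
from dataclasses import dataclass


def parse_duration(text: str) -> int:
    """Convert a 'Dd,HH:MM:SS' string such as '0d,01:30:00' to seconds."""
    match = re.fullmatch(r"(\d+)d,(\d{1,2}):(\d{2}):(\d{2})", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    days, hours, minutes, seconds = (int(g) for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


@dataclass
class Task:
    # Hypothetical record; a real client reports many more fields.
    name: str
    project: str
    elapsed_seconds: float
    is_gpu: bool


def tasks_to_suspend(tasks, project: str, limit: str):
    """Return the GPU tasks of `project` that have run longer than `limit`."""
    limit_seconds = parse_duration(limit)
    return [
        t for t in tasks
        if t.project == project and t.is_gpu
        and t.elapsed_seconds > limit_seconds
    ]


# Made-up sample data: only the first task is over 1h30m on the GPU.
sample = [
    Task("wu_a", "Asteroids", 6000.0, True),
    Task("wu_b", "Asteroids", 600.0, True),
    Task("wu_c", "Asteroids", 9000.0, False),
]
for task in tasks_to_suspend(sample, "Asteroids", "0d,01:30:00"):
    print("would suspend:", task.name)
```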

HTH

adrianxw

Joined: 5 Dec 12
Posts: 46
Credit: 9,905,488
RAC: 838
Message 6457 - Posted: 3 Mar 2020, 14:37:22 UTC
Almost a year now, did this problem ever get fixed? I've not run it since November 2018.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
