Posts by rhb

1) (Message 2183)
Posted 13 Dec 2013 by rhb
I would recommend you check the manufacturer's specs for your specific CPU model. I can assure you that 35 deg C is generally quite low for almost all devices. Common running temperatures are in the range 45 - 65 deg C, and many (not all) can run reliably up to 90 deg C.

Generally a component will start to malfunction long before it will be permanently damaged by heat. On the other hand, all the chips in your computer may wear out somewhat sooner if they run very hot -- but in today's world it is hard to wear them out before they become obsolete.
2) (Message 335)
Posted 26 Oct 2012 by rhb
Other tasks are running fine. I think the errant task probably had not completed but failed during processing, because its time is significantly shorter than that of any of the other tasks. I have no idea why no files or memory map showed up, but that must have been misleading anyway, because the task printed to stderr when aborted.
3) (Message 334)
Posted 25 Oct 2012 by rhb
I'm quite certain the task was in the process of finishing up, as you suggest. I would have thought the client would send an abort request to the task, but all we see is the no-heartbeat message. That message occurred less than a minute before the task was reported (4pm EDT, 8pm UTC). I suggest that either the client terminates tasks by failing to send a heartbeat, or (more likely?) the task failed to get the abort request but was aware immediately that no heartbeat was present. It is also possible that the client sent a signal, which the task caught, reporting the missing heartbeat instead. If so, the no-heartbeat condition might have persisted for a long time, as you suggest.

I don't know the IPC design of BOINC, but it probably doesn't matter. The task appears to have got stuck while exiting for unknown reasons, possibly a race condition. I did stop and continue the task before aborting it, hoping that might shake something up. In any case, I suspect the error is random and not likely to happen again. I will release the others one-by-one just in case.
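
To make the heartbeat idea concrete, here is a rough sketch of how a task-side timeout of this kind could work. It is only an illustration of the general technique: the class and function names are made up, the clock is simulated, and it is not taken from the BOINC source. The 30 second limit is the only number borrowed from the stderr message.

```python
import time

HEARTBEAT_TIMEOUT_SEC = 30  # from the stderr message; everything else is assumed


class HeartbeatWatchdog:
    """Hypothetical task-side watchdog; not BOINC's actual implementation."""

    def __init__(self, timeout=HEARTBEAT_TIMEOUT_SEC, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_beat = clock()

    def beat(self):
        """Record that the client has just sent a heartbeat."""
        self.last_beat = self.clock()

    def expired(self):
        """True once no heartbeat has arrived for longer than the timeout."""
        return self.clock() - self.last_beat > self.timeout


def demo():
    # Simulated clock so the example runs instantly.
    now = [0.0]
    dog = HeartbeatWatchdog(clock=lambda: now[0])
    now[0] = 10.0
    dog.beat()            # last heartbeat seen at t = 10 s
    now[0] = 35.0         # 25 s of silence: still within the limit
    alive = not dog.expired()
    now[0] = 41.0         # 31 s of silence: the task would give up here
    timed_out = dog.expired()
    return alive, timed_out
```

With a scheme like this the task needs no explicit abort request: if the client simply stops sending heartbeats, the task notices on its own once the limit passes.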
4) (Message 331)
Posted 24 Oct 2012 by rhb
Task 88762 was sleeping and not using any CPU time after using about 3 hours of CPU time. I looked at it with system monitor, and it had no open files and no memory map. Perhaps it got stuck as it was about ready to exit.

I will suspend the project for now, in case the others I have queued are similar. Let me know if some tasks are known to be bad, or if there is anything I can do to help in solving the problem. If it appears to be just my system, I will try the others and see what happens.

Name	ps_121018_9005_232_0
Workunit	36604
Created	18 Oct 2012 | 5:16:13 UTC
Sent	21 Oct 2012 | 6:01:13 UTC
Received	24 Oct 2012 | 19:50:31 UTC
Server state	Over
Outcome	Computation error
Client state	Aborted by user
Exit status	203 (0xcb) Unknown error number
Computer ID	362
Report deadline	31 Oct 2012 | 18:01:13 UTC
Run time	63,090.56
CPU time	10,696.53
Validate state	Invalid
Credit	0.00
Application version	Period Search Application v101.00 

Stderr output
aborted by user
15:49:49 (17224): No heartbeat from core client for 30 sec - exiting
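
As a quick sanity check on the numbers in this report, the short calculation below converts the run and CPU times to hours and measures the gap between the no-heartbeat line and the Received time. It assumes the stderr timestamp is local EDT (UTC-4) on 24 Oct 2012, which the log itself does not state.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the stderr timestamp is local EDT (UTC-4) on 24 Oct 2012.
EDT = timezone(timedelta(hours=-4))

heartbeat_lost = datetime(2012, 10, 24, 15, 49, 49, tzinfo=EDT)
received = datetime(2012, 10, 24, 19, 50, 31, tzinfo=timezone.utc)

# Seconds between the no-heartbeat message and the server receiving the result.
gap_sec = (received - heartbeat_lost).total_seconds()

# Values copied from the "CPU time" and "Run time" fields, in seconds.
cpu_hours = 10696.53 / 3600
run_hours = 63090.56 / 3600

print(f"gap: {gap_sec:.0f} s, CPU: {cpu_hours:.2f} h, run: {run_hours:.2f} h")
```

Under that assumption the gap is well under a minute, and the CPU time is only a small part of the wall-clock run time, which fits a task that sat idle for most of its run.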