Posts by Jeff Buck
1)
(Message 3860)
Posted 5 Dec 2014 by Jeff Buck Post: I've been getting a few 1-3 a day (that I notice). This is the event log entry of one of the recent ones:

Seems to me it's more than just a pain in the neck. True, for each individual user/host it may just be a minor nuisance, but from what I've seen in the last several days, once one task for a WU gets a download error, all the resends get essentially the same error, resulting in the entire WU finally ending up in a "Too many errors (may have bug)" status. That often leaves one host, which actually successfully processed the unit, with zero credit for the work done, such as on this WU, and the project with no usable results for that WU.

Last weekend I only got a few of the D/L errors, but the last few days it seems to be running around 40% of the tasks sent. It's not just the "permanent HTTP error" either. Many are checksum errors, like this recent one:

12/4/2014 8:10:26 PM | Asteroids@home | Started download of input_43848_5
12/4/2014 8:10:27 PM | Asteroids@home | Finished download of input_43848_5
12/4/2014 8:10:27 PM | Asteroids@home | [error] MD5 check failed for input_43848_5
12/4/2014 8:10:27 PM | Asteroids@home | [error] expected c64f0e4a6d04f542258293e8e22d10bb, got 764d56d0874e7d21f8e56fea9faa142b
12/4/2014 8:10:27 PM | Asteroids@home | [error] Checksum or signature error for input_43848_5

or this one:

12/4/2014 8:31:11 PM | Asteroids@home | Started download of input_43869_10
12/4/2014 8:31:16 PM | Asteroids@home | Finished download of input_43869_10
12/4/2014 8:31:16 PM | Asteroids@home | [error] File input_43869_10 has wrong size: expected 20889, got 20890
12/4/2014 8:31:16 PM | Asteroids@home | [error] Checksum or signature error for input_43869_10

I have no idea what ultimately happens to these WUs that max out on errors, but it doesn't seem like it would be particularly good for the project, and they're really starting to pile up.
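
For anyone wondering what the two kinds of failure in those log entries amount to, here is a rough Python sketch of a size check followed by an MD5 check on a downloaded file. This is only an illustration and not the BOINC client's actual code; the function name check_download and the wording of its messages are made up for the example.

```python
import hashlib
import os


def check_download(path, expected_size, expected_md5):
    """Return a list of problems with a downloaded file (empty list if it looks OK).

    Hypothetical helper for illustration only; not part of BOINC.
    """
    problems = []

    # Size check: even a one-byte difference (20889 vs 20890) is a failure.
    actual_size = os.path.getsize(path)
    if actual_size != expected_size:
        problems.append(
            f"wrong size: expected {expected_size}, got {actual_size}"
        )

    # MD5 check: hash the file in chunks so large inputs are not read at once.
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            md5.update(chunk)
    actual_md5 = md5.hexdigest()
    if actual_md5 != expected_md5.lower():
        problems.append(
            f"MD5 check failed: expected {expected_md5}, got {actual_md5}"
        )

    return problems
```

If the expected size and MD5 recorded for a file no longer match what the server actually hands out, a check like this would fail the same way on every host, which would be consistent with all the resends for a WU erroring out.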
2)
(Message 3841)
Posted 2 Dec 2014 by Jeff Buck Post: Just noticed that my box tried to download a bunch of WUs.

I've seen 3 of these myself in the last couple days. All of them seem to be WUs where one host has abandoned or aborted its task. Then all the resends get download errors until the whole WU goes into a "Too many errors (may have bug)" status. Doesn't really hurt the hosts that get the download errors, but it sure hurts the one original host who actually did expend the time to successfully complete its processing for the WU. Looks to me like something gets screwed up when creating task files for resends.