Posts by Jeff Buck

1) (Message 3860)
Posted 5 Dec 2014 by Jeff Buck
Post:
I've been getting a few, 1-3 a day (that I notice). This is the event log entry of one of the recent ones:

12/3/2014 11:53:40 PM | Asteroids@home | Giving up on download of input_44151_7: permanent HTTP error


It's not really a problem, more of a pain in the neck, as the Project just has to send another unit when that one didn't get thru to you. I'm thinking, since this is a long-standing problem here, that the Project has a standard home type internet connection and it is not metered like a business one. A business one can stay with you even thru bad moments, while a home one can drop the whole thing if the connection goes bad. A business type one is more expensive.

Seems to me it's more than just a pain in the neck. True, for each individual user/host it may just be a minor nuisance, but from what I've seen in the last several days, once one task for a WU gets a download error, all the resends get essentially the same error, resulting in the entire WU finally ending up in a "Too many errors (may have bug)" status. That often leaves one host, which actually successfully processed the unit, with zero credit for the work done, such as on this WU, and the project with no usable results for that WU.

Last weekend I only got a few of the D/L errors, but the last few days it seems to be running around 40% of the tasks sent. It's not just the "permanent HTTP error" either. Many are checksum errors, like this recent one:

12/4/2014 8:10:26 PM | Asteroids@home | Started download of input_43848_5
12/4/2014 8:10:27 PM | Asteroids@home | Finished download of input_43848_5
12/4/2014 8:10:27 PM | Asteroids@home | [error] MD5 check failed for input_43848_5
12/4/2014 8:10:27 PM | Asteroids@home | [error] expected c64f0e4a6d04f542258293e8e22d10bb, got 764d56d0874e7d21f8e56fea9faa142b
12/4/2014 8:10:27 PM | Asteroids@home | [error] Checksum or signature error for input_43848_5

or this one:

12/4/2014 8:31:11 PM | Asteroids@home | Started download of input_43869_10
12/4/2014 8:31:16 PM | Asteroids@home | Finished download of input_43869_10
12/4/2014 8:31:16 PM | Asteroids@home | [error] File input_43869_10 has wrong size: expected 20889, got 20890
12/4/2014 8:31:16 PM | Asteroids@home | [error] Checksum or signature error for input_43869_10
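For anyone who wants to test a suspect file by hand, the two checks in those log entries (file size and MD5) amount to roughly the sketch below. This is just my own illustration in Python, not the BOINC client's actual code; check_download and its parameters are made-up names, and the expected values would have to come from the event log.

```python
import hashlib
import os


def check_download(path, expected_md5, expected_size):
    """Return a list of problems found with a downloaded file (empty if none).

    Hypothetical helper for illustration only; it mirrors the size and MD5
    comparisons reported in the event log, not the client's real logic.
    """
    problems = []

    # Size check: compare the byte count on disk with the expected size.
    actual_size = os.path.getsize(path)
    if actual_size != expected_size:
        problems.append(
            f"wrong size: expected {expected_size}, got {actual_size}"
        )

    # MD5 check: hash the file in chunks so large files are not read at once.
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            md5.update(chunk)
    actual_md5 = md5.hexdigest()
    if actual_md5 != expected_md5.lower():
        problems.append(
            f"MD5 check failed: expected {expected_md5}, got {actual_md5}"
        )

    return problems
```

Calling check_download with the file path plus the "expected" hash and size from the log returns an empty list for a good file, or one message per failed check. A file that is off by a single byte, like input_43869_10 above, would fail the size check.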

I have no idea what ultimately happens to these WUs that max out on errors, but it doesn't seem like it would be particularly good for the project, and they're really starting to pile up.

2) (Message 3841)
Posted 2 Dec 2014 by Jeff Buck
Post:
Just noticed that my box tried to download a bunch of WUs.

Every single one had a checksum error.

Was this just part of a bad batch?


TIA
ken

I've seen 3 of these myself in the last couple days. All of them seem to be WUs where one host has abandoned or aborted its task. Then all the resends get download errors until the whole WU goes into a "Too many errors (may have bug)" status. Doesn't really hurt the hosts that get the download errors, but it sure hurts the one original host who actually did expend the time to successfully complete its processing for the WU.

Looks to me like something gets screwed up when creating task files for resends.