Why so many "Error while downloading..."
Message boards :
Number crunching :
Why so many "Error while downloading..."
Author | Message |
---|---|
Send message Joined: 13 Mar 13 Posts: 8 Credit: 5,995,680 RAC: 0 |
|
Send message Joined: 16 Nov 12 Posts: 24 Credit: 23,025,000 RAC: 0 |
Last modified: 17 Oct 2014, 16:16:09 UTC
Hi Yank, I've invested in several "larger" systems and I've just looked at the results for this one system. I've got 1200 "Error while downloading" results too!!! Is this only during download, maybe? Or is it jobs which have been done? I know downloads are new jobs, but my completed jobs have declined during the last month, even though I added another new system/cruncher with additional GPUs ;) !?!? And it's not just one box, it's all of them. Have you heard anything???
Error (1200) errors Too many errors (may have bug)
50619812 21755352 82538 12 Oct 2014, 23:04:54 UTC 12 Oct 2014, 23:06:36 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50614690 21756353 80263 12 Oct 2014, 14:17:48 UTC 12 Oct 2014, 15:21:50 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50614686 21756022 80263 12 Oct 2014, 14:17:47 UTC 12 Oct 2014, 15:21:50 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50614687 21756352 80263 12 Oct 2014, 14:17:47 UTC 12 Oct 2014, 15:21:50 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50614691 21710021 80263 12 Oct 2014, 14:17:47 UTC 12 Oct 2014, 15:21:50 UTC Error while downloading 0.00 0.00 --- Period Search Application v102.10 (sse2)
50614692 21754593 80263 12 Oct 2014, 14:17:47 UTC 12 Oct 2014, 15:21:50 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50614693 21756343 80263 12 Oct 2014, 14:17:47 UTC 12 Oct 2014, 15:21:50 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50614062 21757770 115283 12 Oct 2014, 13:57:42 UTC 12 Oct 2014, 14:31:33 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50614063 21760582 115283 12 Oct 2014, 13:57:42 UTC 12 Oct 2014, 14:31:33 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50614066 21755357 115283 12 Oct 2014, 13:57:42 UTC 12 Oct 2014, 14:31:33 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50611960 21762768 115283 12 Oct 2014, 12:57:05 UTC 12 Oct 2014, 13:57:41 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50611204 21762682 82538 12 Oct 2014, 12:32:07 UTC 12 Oct 2014, 18:34:26 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50611221 21704050 82538 12 Oct 2014, 12:32:07 UTC 12 Oct 2014, 18:34:26 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50609210 21762345 115283 12 Oct 2014, 11:21:02 UTC 12 Oct 2014, 12:57:05 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50609214 21760854 115283 12 Oct 2014, 11:21:02 UTC 12 Oct 2014, 12:57:05 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50609215 21760856 115283 12 Oct 2014, 11:21:02 UTC 12 Oct 2014, 12:57:05 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50609221 21762284 115283 12 Oct 2014, 11:21:02 UTC 12 Oct 2014, 12:57:05 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50609244 21728928 115283 12 Oct 2014, 11:21:02 UTC 12 Oct 2014, 12:57:05 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50609257 21759957 115283 12 Oct 2014, 11:21:02 UTC 12 Oct 2014, 12:57:05 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50609260 21759964 115283 12 Oct 2014, 11:21:02 UTC 12 Oct 2014, 12:57:05 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
Name ps_140922_323932_2
Application Period Search Application
Created 27 Sep 2014, 6:55:28 UTC
minimum quorum 2
initial replication 2
max # of error/total/success tasks 7, 20, 20
errors Too many errors (may have bug)
50246333 106458 12 Oct 2014, 5:20:12 UTC 12 Oct 2014, 12:47:33 UTC Error 0.00 0.00 --- Period Search Application v102.10 (avx)
50246334 8801 12 Oct 2014, 5:19:57 UTC 12 Oct 2014, 23:02:41 UTC Error 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50611663 57748 12 Oct 2014, 12:48:37 UTC 13 Oct 2014, 10:54:10 UTC Error 0.00 0.00 --- Period Search Application v102.10 (sse3)
50619807 124147 12 Oct 2014, 23:02:50 UTC 12 Oct 2014, 23:03:17 UTC Error while downloading 0.00 0.00 --- Period Search Application v102.10
50619809 52719 12 Oct 2014, 23:03:22 UTC 12 Oct 2014, 23:04:44 UTC Error while downloading 0.00 0.00 --- Period Search Application v102.10 (sse3)
50619812 82538 12 Oct 2014, 23:04:54 UTC 12 Oct 2014, 23:06:36 UTC Error while downloading 0.00 0.00 --- Period Search Application v101.12 (cuda55)
50619818 95802 12 Oct 2014, 23:06:42 UTC 12 Oct 2014, 23:07:09 UTC Error while downloading 0.00 0.00 --- Period Search Application v102.10 (sse2)
50619820 51607 12 Oct 2014, 23:07:18 UTC 13 Oct 2014, 0:07:50 UTC Error while downloading 0.00 0.00 --- Period Search Application v102.10 (sse3)
50620138 102510 13 Oct 2014, 0:07:58 UTC 13 Oct 2014, 0:43:38 UTC Error while downloading 0.00 0.00 --- Period Search Application v102.10 (sse2)
Project Headless CLI Linux Multiple GPU Boinc Servers Ubuntu Server 14.04.1 64bit Kernel 3.13.0-32-generic CPU's i5-4690K GPU's GT640/GTX750TI Nvidia v.340.29 BOINC v.7.2.42 |
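For anyone wondering why that workunit ends up as "Too many errors (may have bug)", here is a minimal sketch that tallies the nine task statuses listed for ps_140922_323932_2 and compares the count with the first value of "max # of error/total/success tasks 7, 20, 20". The function names are made up for illustration, and the rule that the workunit is flagged once the error count exceeds the limit is an assumption; the listing itself only shows the limit and the final state.

```python
# Statuses copied from the workunit listing above:
# three plain "Error" results and six "Error while downloading" results.
statuses = ["Error"] * 3 + ["Error while downloading"] * 6

# First value of "max # of error/total/success tasks 7, 20, 20".
MAX_ERROR_TASKS = 7


def count_errors(task_statuses):
    """Count tasks whose status starts with 'Error' (download errors included)."""
    return sum(1 for status in task_statuses if status.startswith("Error"))


def too_many_errors(task_statuses, max_errors):
    """Assumption: the workunit is flagged once errors exceed the limit."""
    return count_errors(task_statuses) > max_errors


print(count_errors(statuses), too_many_errors(statuses, MAX_ERROR_TASKS))
```

Under that assumption, the download errors count against the workunit just like computation errors, which would explain why a workunit can be given up on without ever having been crunched.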
Send message Joined: 19 Jun 12 Posts: 221 Credit: 623,640 RAC: 0 |
Last modified: 18 Oct 2014, 7:29:15 UTC "Is this only during download, maybe? Or is it jobs which have been done? I know downloads are new jobs, but my completed jobs have declined during the last month." Yes, if you see 'Error while downloading', the task never reached your computer, so it can't be "jobs which have been done". As for "declined during the last month", it is probably more like "the last week". See the list of your tasks and look at the 'Sent' column. You will see a gap of no (or very few) tasks in the period ~10-16 Oct 2014 (the server had no tasks to send to anyone), and many of the sent tasks give 'Error while downloading', which may be because: - the file does not exist on the server (or is in the wrong directory, or has the wrong filename?) - the server was overloaded (network/database/disk overload): http://asteroidsathome.net/boinc/forum_thread.php?id=373&postid=3672#3672 If you (everyone) have no tasks to work on, no wonder your (everyone's) RAC goes down. You may also read: http://asteroidsathome.net/boinc/forum_thread.php?id=377 http://asteroidsathome.net/boinc/forum_thread.php?id=263 - ALF - "Find out what you don't do well ..... then don't do it!" :) |
Send message Joined: 16 Nov 12 Posts: 24 Credit: 23,025,000 RAC: 0 |
Hi Bilbg, Thanks for your reply. "If you (everyone) have no tasks to work on, no wonder your (everyone's) RAC goes down." I think that once the systems have run for more than 30 days without interruption, the numbers will increase ;) . Project Headless CLI Linux Multiple GPU Boinc Servers Ubuntu Server 14.04.1 64bit Kernel 3.13.0-32-generic CPU's i5-4690K GPU's GT640/GTX750TI Nvidia v.340.29 BOINC v.7.2.42 |
Send message Joined: 1 Nov 13 Posts: 1 Credit: 7,680 RAC: 0 |
Last modified: 4 Nov 2014, 12:11:08 UTC I've got some WUs with that problem and decided to take a look at the forums to see if there's any help, and I see that I'm not the only one with this download error; maybe it is a bad server configuration. Any suggestions about how to solve it? Edit: About 2 attempted WUs, both failed |
Send message Joined: 19 Nov 14 Posts: 93 Credit: 30,066,240 RAC: 0 |
Last modified: 5 Dec 2014, 9:17:47 UTC
Hi, Well, download errors abound:
05/12/2014 08:53:38 | Asteroids@home | Requesting new tasks for CPU and NVIDIA GPU
05/12/2014 08:53:40 | Asteroids@home | Scheduler request completed: got 17 new tasks
05/12/2014 08:53:42 | Asteroids@home | Started download of input_47693_10
05/12/2014 08:53:42 | Asteroids@home | Started download of input_47691_19
05/12/2014 08:53:43 | Asteroids@home | Finished download of input_47693_10
05/12/2014 08:53:43 | Asteroids@home | Finished download of input_47691_19
05/12/2014 08:53:43 | Asteroids@home | Started download of input_47691_18
05/12/2014 08:53:43 | Asteroids@home | Started download of input_47691_17
05/12/2014 08:53:44 | Asteroids@home | Finished download of input_47691_18
05/12/2014 08:53:44 | Asteroids@home | Finished download of input_47691_17
05/12/2014 08:53:44 | Asteroids@home | Started download of input_47691_16
05/12/2014 08:53:44 | Asteroids@home | Started download of input_47693_11
05/12/2014 08:53:45 | Asteroids@home | Finished download of input_47691_16
05/12/2014 08:53:45 | Asteroids@home | Finished download of input_47693_11
05/12/2014 08:53:45 | Asteroids@home | Started download of input_47691_15
05/12/2014 08:53:45 | Asteroids@home | Started download of input_43934_5
05/12/2014 08:53:46 | Asteroids@home | Finished download of input_47691_15
05/12/2014 08:53:46 | Asteroids@home | Finished download of input_43934_5
05/12/2014 08:53:46 | Asteroids@home | Started download of input_47510_1
05/12/2014 08:53:46 | Asteroids@home | Started download of input_47691_14
05/12/2014 08:53:46 | Asteroids@home | [error] MD5 check failed for input_43934_5
05/12/2014 08:53:46 | Asteroids@home | [error] expected 712f7783d8ff43aafeb5bad488dc7233, got 0a474a13da63e5d5d92f6ab7597b5fb1
05/12/2014 08:53:46 | Asteroids@home | [error] Checksum or signature error for input_43934_5
05/12/2014 08:53:47 | Asteroids@home | Giving up on download of input_47510_1: permanent HTTP error
05/12/2014 08:53:47 | Asteroids@home | Finished download of input_47691_14
05/12/2014 08:53:47 | Asteroids@home | Started download of input_47693_6
05/12/2014 08:53:47 | Asteroids@home | Started download of input_43934_4
05/12/2014 08:53:48 | Asteroids@home | Finished download of input_43934_4
05/12/2014 08:53:48 | Asteroids@home | Started download of input_47430_24
05/12/2014 08:53:48 | Asteroids@home | [error] MD5 check failed for input_43934_4
05/12/2014 08:53:48 | Asteroids@home | [error] expected 6ae345124a1674833e0a52f8ed30036b, got c2fd07b896d0b45a7694124e87b6c3e0
05/12/2014 08:53:48 | Asteroids@home | [error] Checksum or signature error for input_43934_4
05/12/2014 08:53:49 | Asteroids@home | Finished download of input_47693_6
05/12/2014 08:53:49 | Asteroids@home | Giving up on download of input_47430_24: permanent HTTP error
05/12/2014 08:53:49 | Asteroids@home | Started download of input_47692_6
05/12/2014 08:53:49 | Asteroids@home | Started download of input_47141_11
05/12/2014 08:53:50 | Asteroids@home | Finished download of input_47692_6
05/12/2014 08:53:50 | Asteroids@home | Giving up on download of input_47141_11: permanent HTTP error
05/12/2014 08:53:50 | Asteroids@home | Started download of input_47691_12
End result: I have a lot fewer than 17 tasks ready to crunch, and a fair few messed-up downloads. I've set NNT (No New Tasks) for the time being; I'll resume and try again later. Regards, |
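The "MD5 check failed ... expected ..., got ..." lines in the log above mean that the file which arrived does not hash to the value the server announced. A minimal sketch of that kind of check, using Python's standard `hashlib`, is below. The function names `md5_of_file` and `download_is_intact` are made up for illustration, and whether the BOINC client does it exactly this way is an assumption; the log only shows that two MD5 values were compared.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_is_intact(path, expected_md5):
    """True if the file on disk hashes to the expected MD5 value."""
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    # Demo on a throwaway file; the contents are arbitrary placeholder bytes.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"example input data")
    try:
        matching = md5_of_file(tmp.name)
        print(download_is_intact(tmp.name, matching))
        # An "expected" value taken from the log above, which this file does not match.
        print(download_is_intact(tmp.name, "712f7783d8ff43aafeb5bad488dc7233"))
    finally:
        os.remove(tmp.name)
```

A mismatch like the ones in the log says the bytes received differ from what the server expected; it does not by itself say whether the file was wrong on the server or damaged on the way.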
Send message Joined: 1 Jan 14 Posts: 302 Credit: 32,671,868 RAC: 0 |
"I've got some WUs with that problem and decided to take a look at the forums to see if there's any help, and I see that I'm not the only one with this download error; maybe it is a bad server configuration. Any suggestions about how to solve it?" The problem is NOT on your end, so it is not fixable by you. |
Send message Joined: 17 Aug 14 Posts: 49 Credit: 5,225,280 RAC: 0 |
Looks like you need a server application to inspect every workunit, and if any of the input files are not in the right places on the server, mark that workunit as Download failed without actually sending it out to any clients. I might volunteer to write it, except that I suspect that your server uses Linux, and I've never used Linux at all. |
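A rough sketch of the kind of check suggested above: given each workunit's input file names and the directory the server serves downloads from, report the workunits whose files are not there. Everything here is hypothetical, including the function name `find_missing_inputs`, the flat directory layout, and the idea that the workunit-to-file mapping is available as a plain dictionary; the real server may store files and workunit records quite differently.

```python
from pathlib import Path


def find_missing_inputs(workunits, download_dir):
    """Return {workunit name: [missing file names]} for incomplete workunits.

    `workunits` maps a workunit name to the list of input file names it
    needs (a hypothetical structure). Files are looked up directly inside
    `download_dir`, which is an assumed flat layout.
    """
    root = Path(download_dir)
    missing = {}
    for wu_name, input_files in workunits.items():
        absent = [name for name in input_files if not (root / name).is_file()]
        if absent:
            missing[wu_name] = absent
    return missing
```

Called as `find_missing_inputs({"wu_a": ["input_a"]}, "/path/to/download")`, it returns an empty dictionary when every file is present. Workunits that show up in the result could then be held back instead of being sent out, which is the step that would need real server knowledge.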
Send message Joined: 1 Jan 14 Posts: 302 Credit: 32,671,868 RAC: 0 |
"Looks like you need a server application to inspect every workunit, and if any of the input files are not in the right places on the server, mark that workunit as Download failed without actually sending it out to any clients." I am NOT an admin; I am just reporting what I see like everyone else. The problem seems to be in the transmission; whether that is a bad file being sent, as you say, or a bad connection along the way, or even too many connections at once, is beyond my expertise to figure out. I do know that Kyong is aware of the problem and seems to have accepted that it is going to happen and learned to live with a certain number of failures. As I said before, the server will automatically put the workunits back into the queue to be resent to someone else, so no work is needed on his part. |
Send message Joined: 9 Jun 12 Posts: 584 Credit: 52,667,664 RAC: 0 |
Last modified: 9 Dec 2014, 10:31:16 UTC Hi, I am aware of this; unfortunately, I am dealing with another problem right now. I am changing jobs and temporarily have two jobs at once, so things are very complicated for me at the moment. From January I will have only one job, so I will be able to take care of the server much more again. I am sorry about this. I will try to solve the download errors as soon as possible.
|