Download failed
Message boards : Problems and bug reports : Download failed
Joined: 21 Dec 12 | Posts: 176 | Credit: 136,462,135 | RAC: 0
Last modified: 5 Sep 2013, 22:23:21 UTC

For everyone who wants to help us flush the invalid tasks, or does not want to click Update every second: create a cmd file like the one below and run it. It updates the project every minute. Add the full path before boinccmd if required. Thanks a lot for your patience.
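A minimal sketch of a cmd file along those lines (not necessarily HA-Soft's exact script). It assumes boinccmd is on the PATH, otherwise prepend its full path as noted above, that the project URL is http://asteroidsathome.net/boinc/, and it uses the Windows timeout command for the one-minute pause; any other delay command would do:

```bat
:1
rem Ask the local BOINC client to contact the Asteroids@home scheduler now
boinccmd --project http://asteroidsathome.net/boinc/ update

rem Wait roughly one minute before the next update request
timeout /t 60 /nobreak >nul

goto :1
```

Stop it with Ctrl+C once the bad work units have been flushed. If boinccmd reports an authorization error, run the file from the BOINC data directory (where gui_rpc_auth.cfg lives) or pass the GUI RPC password with --passwd.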
Joined: 16 Aug 12 | Posts: 293 | Credit: 1,116,280 | RAC: 0
> For every one who wants to help us flush invalid tasks or do not want to click update every second:

Good idea, but this strategy has been attempted at other projects and it's not as effective as one might think. The problem is that it flushes invalid tasks only until the host's task cache fills. At that point the host won't request more tasks until it crunches enough tasks to drain the cache to the host's minimum cache setting. Then the host will request more tasks, but soon its cache is full again. In the end the flushing is severely constrained by how fast the host crunches tasks. It's better than nothing, but it may take a long time to flush the tasks from the server unless many hosts run the script.

What the script needs to do before every update is abort any task the host has not started crunching. Yes, that will abort tasks that downloaded properly, but those will be sent to another host, and you can be fairly sure that host will not be running the flush script and that it will crunch the task. Remember the bad tasks still have a max errors setting of 20. Only by aborting all tasks that have not started will the host be forced to request more tasks on each and every update.

If there are only 1,000 more bad tasks then it's probably not worth putting any effort into improving the flush script. If there are 10,000 more bad tasks then maybe it's worth improving it. How many more bad tasks remain, and how long will it take at the current flush rate?
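The abort-then-update idea described above boils down to two boinccmd calls per task. The following is only a hedged sketch of those building blocks, in the same Windows cmd style as HA-Soft's file, not a finished flush script: deciding which tasks have not started yet is left to the reader (for example by checking BOINC Manager or the listing from boinccmd --get_tasks), and TASK_NAME_HERE is a made-up placeholder, not a real task name.

```bat
rem TASK_NAME_HERE is a placeholder; substitute a task that has not started yet,
rem taken from BOINC Manager or from "boinccmd --get_tasks"
set TASKNAME=TASK_NAME_HERE

rem Abort that task, then immediately ask the scheduler for replacement work
boinccmd --task http://asteroidsathome.net/boinc/ %TASKNAME% abort
boinccmd --project http://asteroidsathome.net/boinc/ update
```

Each abort frees a slot in the cache, so the following update can fetch fresh tasks instead of waiting for the cache to drain on its own.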
Joined: 16 Aug 12 | Posts: 293 | Credit: 1,116,280 | RAC: 0
Last modified: 6 Sep 2013, 11:53:25 UTC

Doan be worry, Sonoraguy (Sonora, California?). It's only computers, and we are still their masters for at least another year. No guarantees after that.

I dug up a script I used a few months ago to flush bad tasks at another project. I'm pretty sure I can adapt it to work here too. If so, it should flush a few thousand bad tasks per day, but hmmmm... with 20 errors required for every task it might take a while. Wanna help?
Joined: 16 Aug 12 | Posts: 293 | Credit: 1,116,280 | RAC: 0
I think my flusher script aborted about 1,000 tasks. That might be just a drop in the bucket. I shut the script down because I reached my daily task limit and can't get any more tasks today.

Of the last bunch of tasks I received, about 50% were failed downloads. Before I hit the daily limit I received batches that had no failed downloads, only to be followed by batches with roughly 75% download failures, so I'm not so sure this is over yet.

That was my Linux host. I think I'll put the script on my Windows host later and see how many failed downloads I get on that one before it hits the daily limit.

Also, I've been wondering... what harm do these failed downloads really do? You get some failed downloads, but when your host needs more tasks it requests more, and it keeps requesting until it receives some good tasks, then it goes back to work. Is anybody experiencing any real problems because of that? Any major dead time, or hosts languishing with nothing to do for more than 10 minutes? Sorry, I'm tied up with other stuff and am unable to keep a close watch on my machines, so I could be missing the obvious here.
Joined: 11 Jun 13 | Posts: 8 | Credit: 15,481,080 | RAC: 0
Question: Has anyone taken notice of the fact that the Download Failed issue seems to have returned? After a day of relatively clean downloads, we're back to seeing about 60% - 80% of them fail again. I've had about 1,000 downloads fail in the last day or so.

I don't expect this happens often, but I've also had work units declared invalid even though I processed my copy, because there were 16 or 18 "Error while downloading" results and the WU was dropped when it reached the 20-failure limit. Examples:

http://asteroidsathome.net/boinc/workunit.php?wuid=4842346
http://asteroidsathome.net/boinc/workunit.php?wuid=4842338

Just a thought, but it might be an idea to clear the database of all those download failures.
Joined: 15 Jan 13 | Posts: 12 | Credit: 904,320 | RAC: 0
> Question: Has anyone taken notice of the fact that the Download Failed issue seems to have returned? After a day of relatively clean downloads, we're back to seeing about 60% - 80% of them fail again. I've had about 1,000 downloads fail in the last day or so.

I can't get any tasks. They all fail to download.
Joined: 27 Jun 12 | Posts: 129 | Credit: 62,725,780 | RAC: 0
> Just a thought, but it might be an idea to clear the database of all those download failures.

They should be able to identify the failed downloads by their error count being > 5 and then mark them all as cancelled, to spare everyone the download failures. Just a thought Kyong might want to consider.

I am still getting attempted downloads of ones up to _19, so I don't think setting them to a max of 3 errors worked.

BOINC blog
Joined: 16 Aug 12 | Posts: 293 | Credit: 1,116,280 | RAC: 0
Last modified: 8 Sep 2013, 3:57:29 UTC

> Question: Has anyone taken notice of the fact that the Download Failed issue seems to have returned? After a day of relatively clean downloads, we're back to seeing about 60% - 80% of them fail again. I've had about 1,000 downloads fail in the last day or so.

Kyong knows the reason for this recurring problem best, but my hunch is the bad downloaders are not sprinkled evenly throughout the database. Instead they were injected in relatively large blocks. We worked through a bad block 2 days ago, then hit a good block, but now we're into another bad block. Or something like that. I've seen similar problems at a few other projects too. It'll work out fairly soon.

> I don't expect this happens often, but I've also had work units declared invalid even though I processed my copy, because there were 16 or 18 "Error while downloading" results and the WU was dropped when it reached the 20-failure limit.

It's possible that as we continue to purge these bad downloaders we're going to see that happen more often. It almost happened to me too, except I was running a script that automatically requests an Asteroids project update every 2 minutes, so the server spotted the task and asked my host to abort it. My host complied with the abort request and I lost only 2 minutes of crunch time on that task.

If you want those kinds of tasks aborted on your host too, then check this post from HA-Soft, in which he provides a small and easily installed Windows batch file (script) that makes your host auto-update the Asteroids project. Batch files are not something the average computer user needs to learn, so it's possible you don't know how to implement it. The thing is, BOINCing isn't exactly "average sort of computer activity", as you well know, right? If you don't know but would like to learn, then ask and someone will guide you through it. It's a very handy thing to know how to do, because frequently we can use batch files and scripts like HA-Soft's to correct various problems BOINCers run into.

If you don't mind installing Python on your host, you can implement a script I'm using that flushes the bad downloaders from the server even faster than HA-Soft's script and causes tasks to auto-abort just as his script does. Python is a very powerful scripting language. It's safe and secure in this case because, if you or anybody else is interested in using it, I will publish the script here in a post where others will vet it. If it poses any security risk to anyone, they'll say so and the post will be deleted very quickly.

> Just a thought, but it might be an idea to clear the database of all those download failures.

I'm not so sure the bad downloaders are causing us volunteers any real grief that we cannot avoid with minimal effort. Weigh that against the fact that messing around with the database has, in the past, caused some projects major grief. The likelihood of major grief depends on a number of factors I won't bother going into, but let me assure you that Kyong and HA-Soft are not a couple of inexperienced amateurs bumbling their way through this. I am quite sure they've weighed their options and think the present course of action is the best one for the project. I trust them and I hope we will all trust them.

Remember this... nobody is going to criticize you or anybody else if you just suspend Asteroids for however long it takes to work this out. Probably nobody will even know if you decide to do that.
Joined: 27 Jun 12 | Posts: 129 | Credit: 62,725,780 | RAC: 0
Last modified: 8 Sep 2013, 10:59:48 UTC

> I have again decreased max_error to 3. I thought that the bad WUs were 17001 - 18000, not even up to 19000.

I have computed some that have max errors set to 3, but the wingman got a computing error. Hopefully we won't waste these work units. Example: ps_130831_18208_165 (Compute error).

As Dagorath says, it's probably best not to fiddle with the database and just let them fail naturally now.

BOINC blog
Joined: 18 Jun 12 | Posts: 8 | Credit: 5,274,731 | RAC: 0
Of 59 WUs, 39 failed with a permanent HTTP error.

08.09.2013 16:50:15 | Asteroids@home | Scheduler request completed: got 59 new tasks
08.09.2013 16:50:16 | Asteroids@home | work fetch suspended by user
08.09.2013 16:50:17 | Asteroids@home | Started download of period_search_10100_windows_intelx86__sse2.exe
08.09.2013 16:50:17 | Asteroids@home | Started download of input_18357_95
08.09.2013 16:50:19 | Asteroids@home | Finished download of input_18357_95
08.09.2013 16:50:19 | Asteroids@home | Started download of input_18357_65
08.09.2013 16:50:20 | Asteroids@home | Finished download of period_search_10100_windows_intelx86__sse2.exe
08.09.2013 16:50:20 | Asteroids@home | Finished download of input_18357_65
08.09.2013 16:50:20 | Asteroids@home | Started download of input_18356_195
08.09.2013 16:50:20 | Asteroids@home | Started download of input_18355_149
08.09.2013 16:50:21 | Asteroids@home | Giving up on download of input_18356_195: permanent HTTP error
08.09.2013 16:50:21 | Asteroids@home | Giving up on download of input_18355_149: permanent HTTP error
08.09.2013 16:50:21 | Asteroids@home | Started download of input_18355_156
08.09.2013 16:50:21 | Asteroids@home | Started download of input_18355_172
08.09.2013 16:50:22 | Asteroids@home | Giving up on download of input_18355_156: permanent HTTP error
08.09.2013 16:50:22 | Asteroids@home | Giving up on download of input_18355_172: permanent HTTP error
08.09.2013 16:50:22 | Asteroids@home | Started download of input_18356_17
08.09.2013 16:50:22 | Asteroids@home | Started download of input_18357_101
08.09.2013 16:50:24 | Asteroids@home | Giving up on download of input_18356_17: permanent HTTP error
08.09.2013 16:50:24 | Asteroids@home | Started download of input_18356_123
08.09.2013 16:50:25 | Asteroids@home | Finished download of input_18357_101
08.09.2013 16:50:25 | Asteroids@home | Giving up on download of input_18356_123: permanent HTTP error

says Tommy the Wettermann