all downloaded tasks starting at the same time - and crash
Author | Message |
---|---|
Send message Joined: 12 Jan 24 Posts: 17 Credit: 1,592,428 RAC: 0 |
Yesterday I attached the Asteroids@home project to 3 of my computers. In the settings I specified that only the GPU crunches tasks, which works perfectly on all three machines. Today I tried to attach a fourth PC, but I ran into the following problem: all downloaded tasks seem to start at the same time. Only the first one proceeds and finishes normally after a few minutes; all the others produce a "computation error" right at the moment they try to start. That makes sense, because 20 or 30 tasks surely cannot be crunched at the same time. What's going wrong? FYI: the CPU is an Intel Xeon E5 2667v4, the GPU is an Nvidia Quadro P5000. The stderr of these failed tasks looks like: https://asteroidsathome.net/boinc/result.php?resultid=434995002 |
Send message Joined: 1 Jan 13 Posts: 90 Credit: 10,397,766 RAC: 8,451 |
are you running 20/30 gpu tasks in parallel? that's not a good idea and it won't be any better. perhaps try with recent drivers? 516.94 is a bit older https://www.nvidia.com/Download/driverResults.aspx/216860/en-us/ |
Send message Joined: 23 Apr 21 Posts: 85 Credit: 115,470,505 RAC: 203,869 |
|
Send message Joined: 12 Jan 24 Posts: 17 Credit: 1,592,428 RAC: 0 |
"are you running 20/30 gpu tasks in parallel? that's not a good idea" No, it was NOT my intention to run more than 1 task at a time. I just meant to say that when I pushed the UPDATE button in the BOINC manager after attaching to the project, some 20/30 tasks were downloaded - with the expectation that they would be processed one after the other. |
Send message Joined: 12 Jan 24 Posts: 17 Credit: 1,592,428 RAC: 0 |
"sounds like something wrong with your boinc configuration." No, on this PC I did not even put in an app_config.xml. I guess you are referring to my other posting from yesterday, where I asked for the correct wording of an app_config.xml in order to run 2 tasks in parallel. I finally figured it out, but it turned out that the runtime of each task doubled, so it did not make any sense to run 2 such tasks in parallel, and I removed the app_config.xml again. For exactly this reason I did not even install one on the PC I attached today. BTW, all other GPU tasks which I crunch on this PC, e.g. WCG, GPUGRID, Primegrid, Einstein ..., work well. So everything seems to be all right with the BOINC configuration. No idea why all of a sudden Asteroid wants to crunch all downloaded tasks together. |
Send message Joined: 23 Apr 21 Posts: 85 Credit: 115,470,505 RAC: 203,869 |
still sounds like something wrong with the BOINC configuration, since it's BOINC that controls how many tasks run at a time, not the application itself. you might want to actually inspect the project folder to make sure there isn't some app_config file that you're not aware of. if there is indeed no app_config, then I would uninstall BOINC and reinstall the latest version you can. |
Send message Joined: 12 Jan 24 Posts: 17 Credit: 1,592,428 RAC: 0 |
Last modified: 13 Jan 2024, 19:08:21 UTC "still sounds like something wrong with the BOINC configuration. since it's BOINC that controls how many tasks run at a time, not the application itself." I double-checked: there is no app_config.xml in the project folder. So it might not be a bad idea to install the latest BOINC version. Still strange though: the problem occurs only with Asteroid and with no other project. Also, I will update the driver. 516.94 is pretty old. |
Send message Joined: 12 Jan 24 Posts: 17 Credit: 1,592,428 RAC: 0 |
"...So it might not be a bad idea to install the latest BOINC version." I have now updated both BOINC and the NVIDIA driver. Unfortunately, I cannot make a test run for Asteroid now, since there are no new tasks available :-( P.S. As mentioned before, I am new to this project, so my question: does it happen frequently that no tasks are available? I have been participating in projects where this happens quite often, and in others where this almost never happens. |
Send message Joined: 23 Apr 21 Posts: 85 Credit: 115,470,505 RAC: 203,869 |
|
Send message Joined: 12 Jan 24 Posts: 17 Credit: 1,592,428 RAC: 0 |
|
Send message Joined: 12 Jan 24 Posts: 17 Credit: 1,592,428 RAC: 0 |
"...So it might not be a bad idea to install the latest BOINC version." Now new tasks are available, so I downloaded a few. And again, all downloaded tasks started at the same time, and except for one, all failed immediately, of course. So neither the latest version of BOINC nor the latest version of the GPU driver solved the problem. As said before, this happens with no other GPU project. So it's bound to have something to do with Asteroid specifically. |
Send message Joined: 23 Apr 21 Posts: 85 Credit: 115,470,505 RAC: 203,869 |
it's some problem with your boinc configuration. it is not possible for the app itself to behave the way you are describing. it only runs once. if you have multiple copies running, it's because BOINC told it to. remove the asteroids project completely from BOINC, and re-add it. |
Send message Joined: 12 Jan 24 Posts: 17 Credit: 1,592,428 RAC: 0 |
|
Send message Joined: 12 Jan 24 Posts: 17 Credit: 1,592,428 RAC: 0 |
good news: a friend gave me the advice to try the app_config.xml `<app_config> <project_max_concurrent>1</project_max_concurrent> </app_config>` which I have known for a long time and have used with some other projects now and then. I simply did not think about it. And it works - now only 1 task is being processed at a time :-) Sometimes solutions to problems are simpler than one might think. |
Send message Joined: 23 Apr 21 Posts: 85 Credit: 115,470,505 RAC: 203,869 |
|
Send message Joined: 12 Jan 24 Posts: 17 Credit: 1,592,428 RAC: 0 |
|
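
For readers who land here with the same symptom, the workaround reported in the thread can be written out as a file. This is only a sketch of what the original poster described: the element names are taken from that post, and the file is assumed to be named app_config.xml and placed in the Asteroids@home project folder inside the BOINC data directory (the folder that was inspected earlier in the thread).

```xml
<app_config>
    <project_max_concurrent>1</project_max_concurrent>
</app_config>
```

After saving the file, the client should pick it up via Options → Read config files in the BOINC Manager, or after a client restart. Note that this only caps the number of tasks from this project that run at once; the thread does not establish why the client tried to start all of them in the first place.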