WUs stuck at the end, keeps running until manual abort


KLiK

Joined: 3 Apr 14
Posts: 16
Credit: 19,567,302
RAC: 1,913
Message 7042 - Posted: 6 Sep 2020, 21:42:08 UTC

Last modified: 6 Sep 2020, 21:42:27 UTC
Hi,
Recently I came across a WU that got stuck at the end of its calculation. WUs usually finish in about 40 minutes on my 1050 Ti card, i.e. about 2,400 seconds.

This WU ran for more than 32,000 seconds before I aborted it. Check here:
- http://asteroidsathome.net/boinc/result.php?resultid=315123559
- http://asteroidsathome.net/boinc/workunit.php?wuid=132451878

Can we get a timer in the app that aborts a WU once it has run for more than 2x its projected time? That would speed up the calculations overall, as a WU can sometimes stay stuck like this for days.
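To make the request concrete, here is a rough sketch of the check I have in mind. It is only an illustration: the function name `should_abort` and the default factor of 2 are my own assumptions, not anything taken from the project's app or from BOINC itself.

```python
def should_abort(elapsed_s: float, projected_s: float, factor: float = 2.0) -> bool:
    """Return True when a task has run longer than `factor` times its projected time.

    Hypothetical helper for illustration only; not part of any real app.
    """
    if projected_s <= 0:
        # Without a usable estimate there is nothing to compare against,
        # so never abort on that basis.
        return False
    return elapsed_s > factor * projected_s


# The stuck WU from this post: 32,000 s elapsed against a usual 2,400 s.
stuck = should_abort(32_000, 2_400)
# A WU that finished in its normal time.
normal = should_abort(2_400, 2_400)
```

With the numbers above, the stuck WU is flagged and a normally running one is not.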

Thanks,




non-profit org. Play4Life in Zagreb, Croatia, EU
Profile Georgi Vidinski
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 22 Nov 17
Posts: 159
Credit: 13,180,466
RAC: 47
Message 7058 - Posted: 13 Sep 2020, 10:27:33 UTC - in response to Message 7042.  
Hi KLiK,

By design, all data processed by this project goes through a lot of preliminary checks before it is sent to contributors. However, it is still possible for some faulty Work Units (WUs) to slip through. We are aware of how the old applications behave when they receive such a faulty task.

From what I can see, your computer is running Windows 10 x64 with an NVIDIA GeForce GTX 1050 Ti (4095MB) GPU and driver 432.00. I strongly suggest upgrading your NVIDIA driver to at least version 441.22, so that your BOINC client can receive and run the newest "cuda102_win10" application. That new app handles faulty WUs better.

The "cuda55" application is deprecated and will soon be removed from our project for good.
We still keep it for backward compatibility with older devices and 32-bit OSes. However, the CUDA v5.x library is no longer supported by NVIDIA, so our team does not have the resources to support that application anymore either.
The new application is built using the CUDA v10.2 library, which supports every device (running under an x64 OS) with Compute Capability (CC) greater than or equal to 3.0 and less than 8.0.
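The requirements above can be sketched as a small check. This is only an illustration of the rule as stated in this post, not the scheduler's actual logic: the function names are hypothetical, the 441.22 minimum is the Windows driver version mentioned above, and the Compute Capability of 6.1 for the GTX 1050 Ti is the value published by NVIDIA.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '441.22' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def can_run_cuda102(compute_capability: str, driver_version: str, is_64bit_os: bool) -> bool:
    """Hypothetical check: x64 OS, 3.0 <= CC < 8.0, driver at least 441.22."""
    cc = parse_version(compute_capability)
    driver = parse_version(driver_version)
    return is_64bit_os and (3, 0) <= cc < (8, 0) and driver >= (441, 22)


# GTX 1050 Ti (CC 6.1) with the driver currently installed, then after the upgrade.
with_current_driver = can_run_cuda102("6.1", "432.00", True)
with_new_driver = can_run_cuda102("6.1", "441.22", True)
```

With driver 432.00 the check fails only on the driver version; with 441.22 or newer the same card passes.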

You can find more information HERE.
For the latest drivers for your NVIDIA card, check THIS link.

I hope that helps.

Regards,
Georgi