Posts by Jesse Viviano

1) (Message 6339)
Posted 29 Aug 2019 by Jesse Viviano
Post:
The long-term solution would be to port the application to OpenCL. When Pascal was released, it initially supported only OpenCL until Nvidia released a new version of CUDA. From what I have read, CUDA encourages developers to target specific hardware. A developer could target one type of Nvidia GPU at the expense of others, which either forces the CUDA driver to emulate the code when it finds a mismatched GPU or makes it refuse to run the code if the mismatch cannot be resolved. OpenCL lets the developer write generic code that runs roughly equally well on many different hardware architectures. OpenCL also allows AMD GPUs to participate. Nvidia has been sluggish to adopt new versions of OpenCL, but it does not skimp on supporting OpenCL 1.2 or older. Nvidia's drivers also deprecate, and sometimes remove, support for running old CUDA versions on newer GPUs.
2) (Message 6054)
Posted 20 Nov 2018 by Jesse Viviano
Post:
This might be the same issue: Work unit 96813621 put no load on my GPU at all. I aborted it after a reboot failed to solve the issue. Also, someone else crunched this work unit on a CPU, and the work unit timed out as seen in https://asteroidsathome.net/boinc/result.php?resultid=225982292. My result was at https://asteroidsathome.net/boinc/result.php?resultid=226803514.
3) (Message 5985)
Posted 7 Oct 2018 by Jesse Viviano
Post:
The upload server is out of disk space again and therefore unable to accept uploads.
4) (Message 5983)
Posted 6 Oct 2018 by Jesse Viviano
Post:
Why do you think that this is the wrong computing platform? Maybe the administrator needs to do a hardware upgrade, do a major operating system upgrade, or vacuum the dust out of the servers. We have had times when the storage filled up and users could not upload work, so maybe the administrator will use this window to add more storage.
5) (Message 5975)
Posted 3 Oct 2018 by Jesse Viviano
Post:
I can confirm that I cannot upload again because the server is out of disk space. Also, the db_purge process is not running again.
6) (Message 5965)
Posted 26 Sep 2018 by Jesse Viviano
Post:
The server status page states that the db_purge process is not running as of this writing. Could this be related to this issue?
7) (Message 5893)
Posted 8 Jul 2018 by Jesse Viviano
Post:
I noticed that the db_purge process in the server status page at https://asteroidsathome.net/boinc/server_status.php as of this writing is marked as "Not Running". Could this be the root cause of the problem?
8) (Message 5892)
Posted 8 Jul 2018 by Jesse Viviano
Post:
I noticed that the db_purge process in the server status page at https://asteroidsathome.net/boinc/server_status.php as of this writing is marked as "Not Running". Could this be the root cause of the problem?
9) (Message 5472)
Posted 9 Sep 2017 by Jesse Viviano
Post:
It is likely that the scheduling server is programmed for Bulldozer family CPUs when it sees AMD, because AVX is slower than SSE3 on AMD Bulldozer and Piledriver CPUs.
10) (Message 5323)
Posted 9 Apr 2017 by Jesse Viviano
Post:
AMD GPUs are not supported because the only GPU app for this project uses legacy CUDA code instead of OpenCL, which both Nvidia and AMD GPUs can use.
11) (Message 5215)
Posted 10 Mar 2017 by Jesse Viviano
Post:
The servers have some serious issues that are now being repaired. You will have to wait for the repairs to be finished to allow your BOINC client to declare that it has finished the tasks that it has uploaded to the scheduler.
12) (Message 5162)
Posted 5 Mar 2017 by Jesse Viviano
Post:
The server is full because its assimilator and the process to purge old database entries are down. The rest of the server's processes are up. The server status page is at https://asteroidsathome.net/boinc/server_status.php.
13) (Message 5153)
Posted 4 Mar 2017 by Jesse Viviano
Post:
Telling your mouse utility to use a lower report rate can solve this issue. Switching the report rate from 1 kHz to 125 Hz (the de facto default report rate for non-gaming USB mice) on my Logitech G502 Proteus Core solved it for me.
14) (Message 4048)
Posted 20 Feb 2015 by Jesse Viviano
Post:
I just got another permanent HTTP error, so the problem still exists. I am also getting failed MD5 checks (which means that the file or the MD5 hash is corrupt) and wrong-sized file errors.
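The two checks mentioned above (MD5 and file size) can be sketched as follows. This is only an illustration of the idea, not the BOINC client's actual code: the function name `verify_download` and its parameters are hypothetical, and it assumes the expected MD5 hash and size are already known from the server.

```python
import hashlib
import os


def verify_download(path, expected_md5, expected_size):
    """Return a list of problems found; an empty list means the file looks intact.

    Hypothetical helper for illustration only; not part of BOINC.
    """
    problems = []

    # Wrong-sized file check: compare the size on disk with the declared size.
    actual_size = os.path.getsize(path)
    if actual_size != expected_size:
        problems.append(f"size mismatch: expected {expected_size}, got {actual_size}")

    # MD5 check: hash the file in chunks so large files do not fill memory.
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_md5.lower():
        problems.append("MD5 mismatch: the file or the declared hash is corrupt")

    return problems
```

A mismatch only shows that the file and its declared hash or size disagree; it cannot tell which of the two is wrong.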
15) (Message 4040)
Posted 17 Feb 2015 by Jesse Viviano
Post:
I am starting to think that the Asteroids@home server's file system might be corrupted, or its hard drive might be failing. For an example, see http://asteroidsathome.net/boinc/workunit.php?wuid=26733314 (work unit 26733314). Results 0 and 1 failed due to being sent to Android Lollipop devices that can't handle this project's pre-Lollipop-only Android application. Result 2 managed to get all of the required files and completed successfully. All subsequent results are unable to get one of the required files due to permanent HTTP errors. This makes me suspect either a corrupted file system or a failing hard drive, because one of the required files was downloaded successfully once and can no longer be downloaded.
16) (Message 4039)
Posted 17 Feb 2015 by Jesse Viviano
Post:
Is the Asteroids@home server's file system corrupted, or is its hard drive failing? For an example, see http://asteroidsathome.net/boinc/workunit.php?wuid=26733314 (work unit 26733314). Results 0 and 1 failed due to being sent to Android Lollipop devices that can't handle this project's pre-Lollipop-only Android application. Result 2 managed to get all of the required files and completed successfully. All subsequent results are unable to get one of the required files due to permanent HTTP errors. This suggests either a corrupted file system or a failing hard drive.
17) (Message 4034)
Posted 16 Feb 2015 by Jesse Viviano
Post:
Your issue is the same one I wrote about in http://asteroidsathome.net/boinc/forum_thread.php?id=418, which I opened because I realized my issue was different from the original poster's. Our issue is that our BOINC clients were directed to download files that do not exist on the server. The issue the original poster wrote about is that the BOINC client detects signs of file corruption because the downloaded files differ in size from their declared sizes, and therefore declares download errors.
18) (Message 4028)
Posted 16 Feb 2015 by Jesse Viviano
Post:
A run of bad work units has apparently been generated. I have just downloaded five work units. Four immediately failed with permanent HTTP download errors, which I reported by pushing the Update button as soon as I spotted them. Two of those failures have had multiple failed results from other users as of this writing.
19) (Message 4006)
Posted 10 Feb 2015 by Jesse Viviano
Post:
I was not trying to disparage the AMD or ATI GPUs, but was providing a heads-up because other projects had developers who complained about buggy AMD OpenCL drivers which might be solved by passing the correct OpenCL compiler flags. Nvidia's GPUs from Fermi onwards default to IEEE 754 when running CUDA applications and degrade to gaming precision only if the programmer requests that. (Pre-Fermi Nvidia GPUs are stuck at gaming precision and cannot become IEEE 754 compliant in single precision.)
Personally, I think that OpenCL should have defaulted to IEEE 754 compliance and required developers to explicitly request gaming accuracy to get faster but less accurate results, instead of requiring programmers to explicitly request IEEE 754 accuracy. I feel that this is a fault of the OpenCL standard rather than of AMD. If I had written the standard, such flags would be required to run on hardware that is incapable of IEEE 754 compliance, so that developers have no illusions about what they are programming.
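As a rough sketch of what "passing the correct OpenCL compiler flags" could look like on the host side, the logic below decides whether to request correctly rounded single-precision divides and square roots. It is only an illustration under stated assumptions: the function name is hypothetical, the version tuple and the bitfield are assumed to have been read beforehand with clGetDeviceInfo (CL_DEVICE_VERSION and CL_DEVICE_SINGLE_FP_CONFIG), and the bit value of CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT is what I recall from the Khronos headers and should be checked against cl.h.

```python
# Capability bit in CL_DEVICE_SINGLE_FP_CONFIG (value assumed from the Khronos cl.h header).
CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT = 1 << 7

# Build option introduced in OpenCL 1.2.
STRICT_FLAG = "-cl-fp32-correctly-rounded-divide-sqrt"


def strict_fp32_build_option(opencl_version, single_fp_config):
    """Return the build option to pass to the OpenCL compiler, or None.

    opencl_version   -- (major, minor) tuple, e.g. (1, 2)
    single_fp_config -- integer bitfield reported by the device

    Hypothetical helper for illustration only.
    """
    if opencl_version < (1, 2):
        # The flag does not exist before OpenCL 1.2.
        return None
    if not single_fp_config & CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT:
        # The device cannot do correctly rounded fp32 divide/sqrt.
        return None
    return STRICT_FLAG
```

When the function returns None, a scientific application could fall back to double precision or decline to use that device rather than silently accept less accurate results.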
20) (Message 4000)
Posted 9 Feb 2015 by Jesse Viviano
Post:
One thing I found out is that AMD only guarantees its cards' accuracy to be equal to the OpenCL specification. The problem with that definition is that single precision floating point divides and square roots are not guaranteed to conform to IEEE 754 specifications according to the OpenCL specifications listed at https://www.khronos.org/registry/cl/ unless the OpenCL version used is at least 1.2, the GPU being targeted is capable of correctly performing divides and square roots according to the IEEE 754 specification, and the compiler flag "-cl-fp32-correctly-rounded-divide-sqrt" is passed to the compiler. (The flag does not exist in OpenCL 1.0 or 1.1.) This last flag forces the GPU to correctly perform divides and square roots if it is capable of performing those operations correctly, and terminate the program before attempting to run it if the GPU is incapable of doing correct IEEE 754 divides or square roots. Without the flag, a GPU is allowed to be off by up to 2.5 units in the last place for divides and up to 3 units in the last place for square roots in the standard profile. This inaccuracy can be crucial to graphics speed, but is not welcome in many scientific computations where accuracy is more important than speed. This matter is moot if your program does not use single precision divides or square roots. These operations are always correctly performed in double precision if double precision math is supported.
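To make "units in the last place" concrete, here is a minimal sketch that measures the error of a single-precision result in ULPs. It runs on the CPU with the standard library only and does not query any GPU; the helper names are my own, the "exact" reference is really a double-precision approximation, and the sloppy result is simulated by nudging the bit pattern rather than produced by real hardware.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def ulp_f32(x):
    """Spacing between adjacent single-precision values near x (normal range only)."""
    _, e = math.frexp(abs(x))      # abs(x) = m * 2**e with 0.5 <= m < 1
    return math.ldexp(1.0, e - 24)  # 24 significant bits in single precision


def ulp_error(approx, exact):
    """Error of approx relative to exact, in single-precision ULPs."""
    return abs(approx - exact) / ulp_f32(exact)


def nudge_f32(x, steps):
    """Move a positive single-precision value by `steps` representable values."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits + steps))[0]


exact = 1.0 / 3.0                # reference computed in double precision
correct = to_f32(exact)          # what a correctly rounded fp32 divide returns
sloppy = nudge_f32(correct, 2)   # simulated result that is two values off

print(ulp_error(correct, exact))  # at most 0.5 for a correctly rounded result
print(ulp_error(sloppy, exact))   # between 1.5 and 2.5, still within the 2.5 ULP allowance
```

A correctly rounded operation is never more than 0.5 ULP from the true value, so the 2.5 ULP allowance for divides permits a result several representable values away from the IEEE 754 answer.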