AMD Cards



Georgi Vidinski
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 22 Nov 17
Posts: 159
Credit: 13,180,466
RAC: 11
Message 7938 - Posted: 14 Sep 2023, 20:54:36 UTC
Brief update:

Windows - AMD OpenCL app v102.17 seems to be the most stable build so far. Keep sending your feedback. It may help us polish the issues (or at least some of them) in the next build. Still, the app may not work on some GPUs. Keep in mind that OpenCL apps can be highly affected by the drivers in use, and there is no universal cure (version, vendor, etc.).

Linux - AMD OpenCL app v102.17 is ready. Waiting for Kyong to release it. Here are some details for the system on which it was built:
Ubuntu 20.04.1 LTS   Release:  20.04
ldd (Ubuntu GLIBC 2.31-0ubuntu9) 2.31

I hope this will do, as I can't go lower with versions.
And again, keep sending your feedback please.
By the way, it would be interesting to know if the app works on Mesa platforms.

As to the other questions

Have you deliberately set it to use about half the GPU RAM?

No. The amount of allocated memory will be different for every GPU model.

Do you use something like this to run CUDA code on AMD GPUs?

No. The code was written from scratch.

Cheers!
Georgi
“The good thing about science is that it's true whether or not you believe in it.” ― Neil deGrasse Tyson
Lamberto Vitali
Joined: 14 Jun 23
Posts: 76
Credit: 5,914
RAC: 0
Message 7939 - Posted: 14 Sep 2023, 23:30:31 UTC

Last modified: 14 Sep 2023, 23:33:00 UTC
V17 is running perfectly on R9 Nano (OpenCL 2.0). Every task works and takes about 20 minutes.

However it fails to run on R9 280X (OpenCL 1.2).

And on the Nano it's much slower than my CPU, whereas on other projects the GPU is way faster. A lot of speed increase will be required to make it worth using GPUs here. My Ryzen 9 3900X can do 24 of them in 50 minutes, which is almost 10 times more throughput.
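
For anyone who wants to check the comparison above, here is a minimal Python sketch of the arithmetic. It uses only the figures quoted in the post and assumes the GPU processes one task at a time, which the post does not state explicitly. The function name is just illustrative.

```python
def tasks_per_hour(tasks: int, minutes: float) -> float:
    """Convert 'N tasks in M minutes' into tasks per hour."""
    return tasks * 60.0 / minutes

# Figures quoted in the post; nothing here is measured independently.
# Assumption: the R9 Nano runs a single task at a time.
gpu_rate = tasks_per_hour(tasks=1, minutes=20)    # R9 Nano
cpu_rate = tasks_per_hour(tasks=24, minutes=50)   # Ryzen 9 3900X

ratio = cpu_rate / gpu_rate
print(f"GPU: {gpu_rate:.1f} tasks/h, CPU: {cpu_rate:.1f} tasks/h, ratio: {ratio:.1f}x")
# prints: GPU: 3.0 tasks/h, CPU: 28.8 tasks/h, ratio: 9.6x
```

So "almost 10 times" matches the quoted numbers, under the single-task assumption.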
Georgi Vidinski
Message 7940 - Posted: 15 Sep 2023, 2:58:44 UTC - in response to Message 7939.  
However it fails to run on R9 280X (OpenCL 1.2).


You may try again on this one, as the final app v102.17 is a different build than the previous buggy one (Version: 102.18.1.0).
You should be able to see
Version: 102.18.5.0
in Stderr output of the tasks.

Let me know how it went.
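
If you have saved a task's Stderr output to a text file, the check described above can be scripted. This is only a rough sketch: the helper name and the sample excerpt are made up for illustration, and it assumes the build number always appears on a line starting with "Version:", as in the outputs posted in this thread.

```python
import re
from typing import Optional

def stderr_version(stderr_text: str) -> Optional[str]:
    """Return the value of the first line starting with 'Version:', or None."""
    match = re.search(r"^Version:[ \t]*(\S+)", stderr_text, flags=re.MULTILINE)
    return match.group(1) if match else None

EXPECTED = "102.18.5.0"  # build expected for the final v102.17 app (see above)

# Hypothetical excerpt; only the "Version:" line matters here.
sample = "BOINC client version 7.18.1\nVersion: 102.18.1.0\n"

found = stderr_version(sample)
print(found, "OK" if found == EXPECTED else "wrong build")
# prints: 102.18.1.0 wrong build
```

The pattern is anchored to the start of a line so that "BOINC client version 7.18.1" is not mistaken for the app build.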
Lamberto Vitali
Message 7942 - Posted: 15 Sep 2023, 4:17:12 UTC - in response to Message 7940.  

Last modified: 15 Sep 2023, 4:25:04 UTC
No, I see "Version: 102.18.1.0" (on both the bad and the good cards), even though BOINC lists it as 102.17. For example:

https://asteroidsathome.net/boinc/result.php?resultid=401167874

It also says:
"The system cannot find the file specified.
(0x2) - exit code 2 (0x2)</message>"
Georgi Vidinski
Message 7943 - Posted: 15 Sep 2023, 5:36:30 UTC - in response to Message 7942.  
I see. The executable is definitely the wrong one. I'll talk to Kyong.

Will keep you posted.
Lamberto Vitali
Message 7944 - Posted: 15 Sep 2023, 8:03:35 UTC

Last modified: 15 Sep 2023, 8:10:19 UTC
From my stderr output (does that stand for standard error?!):

Multiprocessors: 64
Max Samplers: 16
Max work item dimensions: 3
Resident blocks per multiprocessor: 16
Grid dim: 2048 = 2 * 64 * 16
Block dim: 128
Looking up the specs for my card:

Shading Units 4096
TMUs 256
ROPs 64
Compute Units 64
I have no idea what a ROP or TMU is, but can I assume from "Multiprocessors: 64" in stderr you're using the 64 compute units? Folding@Home uses the shaders apparently. Not sure if that's possible, it may depend on what calculations you're doing, but I'm guessing 4096 shaders is faster than 64 compute units.
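
The "Grid dim" line in the stderr output is a plain product of the values printed just above it. A short sketch that reproduces it is below. The function and parameter names are illustrative and not taken from the app's source, and the thread does not explain what the leading factor 2 stands for.

```python
def grid_dim(multiprocessors: int, resident_blocks: int, factor: int = 2) -> int:
    """Reproduce the product printed on the 'Grid dim' line of the stderr log."""
    return factor * multiprocessors * resident_blocks

# R9 Nano log above: "Grid dim: 2048 = 2 * 64 * 16"
nano = grid_dim(multiprocessors=64, resident_blocks=16)
# Vega 11 log later in this thread: "Grid dim: 704 = 2 * 11 * 32"
vega11 = grid_dim(multiprocessors=11, resident_blocks=32)
print(nano, vega11)  # prints: 2048 704

# Assumption: if "Grid dim" counts work-groups and "Block dim" is the
# work-group size, one launch covers grid_dim * block_dim work-items.
block_dim = 128
print(nano * block_dim)  # prints: 262144
```

By the quoted card specs, 4096 shading units over 64 compute units works out to 64 per unit.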
Krümel
Joined: 12 May 13
Posts: 5
Credit: 943,335
RAC: 58
Message 7945 - Posted: 15 Sep 2023, 10:08:05 UTC - in response to Message 7944.  
My BOINC client is still using the old 102.15 app version.
How can I force the client to load the new one?
Lamberto Vitali
Message 7946 - Posted: 15 Sep 2023, 10:32:35 UTC - in response to Message 7945.  
My BOINC client is still using the old 102.15 app version.
How can I force the client to load the new one?
Abort the tasks you have and it will download the new one. The version is tied to the task.
Krümel
Message 7947 - Posted: 15 Sep 2023, 10:35:12 UTC - in response to Message 7946.  
Abort the tasks you have and it will download the new one. The version is tied to the task.


I had no old tasks when I started Asteroids today.
The BOINC manager pulled a fresh one, but with the old app version. :(
Lamberto Vitali
Message 7948 - Posted: 15 Sep 2023, 10:54:39 UTC - in response to Message 7947.  

Last modified: 15 Sep 2023, 10:56:36 UTC
Abort the tasks you have and it will download the new one. The version is tied to the task.


I had no old tasks when I started Asteroids today.
The BOINC manager pulled a fresh one, but with the old app version. :(
You're using Linux, perhaps it's on a different version to Windows? They're having different problems with each OS!
magic_sam
Joined: 16 Nov 22
Posts: 18
Credit: 6,962,168
RAC: 534
Message 7949 - Posted: 15 Sep 2023, 12:49:35 UTC
Hi all,

I'm still getting jobs for 102.15, and errors as a result.

I believe the version has not been properly updated for Linux:

https://asteroidsathome.net/boinc/apps.php

sched_reply_asteroidsathome.net_boinc.xml still refers to version 10215. I tried to manually wget version 102.17 from https://asteroidsathome.net/boinc/downloads but the file couldn't be found.

Best regards,

Samuel
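
The two notations in this post, 10215 and 102.15, appear to be the same version number: BOINC stores an app version as an integer (major * 100 + minor), and the file name quoted later in this thread (period_search_10217_...) follows the same pattern. A small sketch of the conversion, with helper names that are purely illustrative:

```python
import re
from typing import Optional

def format_version(version_num: int) -> str:
    """Turn an integer such as 10215 into the dotted form '102.15'."""
    major, minor = divmod(version_num, 100)
    return f"{major}.{minor:02d}"

def version_from_filename(name: str) -> Optional[str]:
    """Pull the version out of a 'period_search_<num>_...' file name."""
    match = re.search(r"period_search_(\d+)_", name)
    return format_version(int(match.group(1))) if match else None

print(format_version(10215))  # prints: 102.15
print(version_from_filename(
    "period_search_10217_x86_64-pc-linux-gnu__opencl_101_amd_linux"
))  # prints: 102.17
```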
Georgi Vidinski
Message 7950 - Posted: 15 Sep 2023, 14:24:53 UTC
Be patient, guys.
As you can see at https://asteroidsathome.net/boinc/apps.php, those apps have not been released yet.
Kyong will not be available until later today.
Georgi Vidinski
Message 7951 - Posted: 15 Sep 2023, 20:10:32 UTC
The latest versions have been released.
Krümel
Message 7952 - Posted: 15 Sep 2023, 21:45:46 UTC - in response to Message 7951.  
Latest version crashes with integrated Vega11

<core_client_version>7.18.1</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
BOINC client version 7.18.1
BOINC GPU type 'ATI', deviceId=0, slot=13
Application: ../../projects/asteroidsathome.net_boinc/period_search_10217_x86_64-pc-linux-gnu__opencl_101_amd_linux
Version: 102.17.0.0
Platform name: Clover
Platform vendor: Mesa
OpenCL device C version: OpenCL C 1.1 | OpenCL 1.1 Mesa 23.1.7 - kisak-mesa PPA
OpenCL device Id: 0
OpenCL device name: AMD Radeon Vega 11 Graphics (raven, LLVM 15.0.7, DRM 3.49, 6.2.0-32-generic) 5GB
Device driver version: 23.1.7 - kisak-mesa PPA
Multiprocessors: 11
Max Samplers: 32
Max work item dimensions: 3
Resident blocks per multiprocessor: 32
Grid dim: 704 = 2 * 11 * 32
Block dim: 128
Binary build log for AMD Radeon Vega 11 Graphics (raven, LLVM 15.0.7, DRM 3.49, 6.2.0-32-generic):
OK (0)
Program build log for AMD Radeon Vega 11 Graphics (raven, LLVM 15.0.7, DRM 3.49, 6.2.0-32-generic):
OK (0)
Prefered kernel work group size multiple: 64
Setting Grid Dim to 256
amdgpu: The CS has been rejected (-125), but the context isn't robust.
amdgpu: The process will be terminated.
Ian&Steve C.
Volunteer developer
Volunteer tester
Joined: 23 Apr 21
Posts: 70
Credit: 58,228,079
RAC: 495,095
Message 7953 - Posted: 15 Sep 2023, 23:49:46 UTC - in response to Message 7952.  

Last modified: 15 Sep 2023, 23:54:48 UTC
Latest version crashes with integrated Vega11


I think your OpenCL version is too old; 1.1 is ancient. Mesa is generally not well suited to real compute loads, and many people have issues using Mesa with BOINC.

Try installing the real AMD drivers. You might have to go back to an older OS/kernel, though. Or try the ROCm drivers, but I think ROCm 5.6(?) was the last to support Vega.

Georgi Vidinski
Message 7955 - Posted: 16 Sep 2023, 4:10:14 UTC

Last modified: 16 Sep 2023, 5:43:12 UTC
It's not exactly that.
The Linux app is OpenCL 1.1 compliant (unlike the Windows app, which is OpenCL 1.2). I've tested it on my Arch Linux system using clover-mesa drivers and it worked like a charm. Well, as you remember, I was using the latest libraries for everything; in particular, I had to build clover-mesa from the latest source in their GitLab branch. The application does not use any image or texture manipulation functions, which are the Achilles heel in most cases with Clover.

But as I mentioned previously, integrated graphics need a special approach and will need additional research:

It still may have issues with integrated AMD graphics, though. Those CPU-based graphics need different memory alignment. Unfortunately, there is no way for us to distinguish them from discrete GPUs at the project level.
So, those of you who have such systems may want to restrict their use for now using cc_config.xml and the <exclude_gpu> tag (see Client configuration).
At least until we handle their specs in the code.

Georgi
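
For those who have not used that option before, here is a hedged sketch that prints a cc_config.xml fragment with one exclusion entry. The element names follow the BOINC client configuration documentation as far as I know it. The project URL is an assumption based on the links in this thread, so copy the exact URL your client shows. The device number 0 and the type "ATI" come from the Vega 11 stderr output posted above ("BOINC GPU type 'ATI', deviceId=0"), and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

def build_cc_config(project_url: str,
                    device_num: Optional[int] = None,
                    gpu_type: Optional[str] = None) -> str:
    """Return cc_config.xml text containing a single <exclude_gpu> entry."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    exclude = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(exclude, "url").text = project_url
    if device_num is not None:
        ET.SubElement(exclude, "device_num").text = str(device_num)
    if gpu_type is not None:
        ET.SubElement(exclude, "type").text = gpu_type
    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")

# Assumed project URL; use the one listed in your own BOINC client.
PROJECT_URL = "https://asteroidsathome.net/boinc/"
print(build_cc_config(PROJECT_URL, device_num=0, gpu_type="ATI"))
```

The sketch prints instead of writing to disk on purpose: an existing cc_config.xml may already hold other options, so the <exclude_gpu> element would have to be merged into its <options> section by hand. The client only picks up the change after it re-reads its configuration; restarting it is the safe option.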


As your graphics are GCN5 you definitely have to move towards ROCm.
Clover will be retired in the next few months:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19385
Georgi Vidinski
Message 7956 - Posted: 16 Sep 2023, 4:36:50 UTC
Check this out. It may shed some light on the issue with

amdgpu: The CS has been rejected (-125), but the context isn't robust.
amdgpu: The process will be terminated.

Amdgpu timeout with OpenCl kernel (reproducible)

Random GPU crashes amdgpu across 2 different GPUs

Send your feedback on your progress with that.
Lamberto Vitali
Message 7962 - Posted: 16 Sep 2023, 7:20:35 UTC
The new version in Windows is now working on my older OpenCL 1.2 cards (R9 280X).

Just need more speed, nothing is getting to full temperature. How much faster do you expect to make them go eventually?
Lamberto Vitali
Message 7963 - Posted: 16 Sep 2023, 8:11:01 UTC

Last modified: 16 Sep 2023, 8:13:41 UTC
My R9 Nano and R9 280X cards are both taking the same 20 minutes to complete a task. This is odd considering the Nano is exactly twice as fast by specs and by other projects.

Are you perhaps limiting it to so many computing units, and the Nano is only usually faster because it has more of them?
Georgi Vidinski
Message 7964 - Posted: 16 Sep 2023, 9:58:24 UTC
Good to hear it's working.

At the moment there is an issue with utilizing too many cores and with memory allocation, so yes, there is a limitation.
I've ordered an RX 580. When it arrives, I'll try to find out what is wrong with kernel execution on a bigger grid. Can't say much more for now.