AMD OpenCL issues



Previous · 1 · 2 · 3
JStateson
Joined: 16 Jan 14
Posts: 17
Credit: 30,520,252
RAC: 4,013
Message 8059 - Posted: 21 Sep 2023, 18:32:40 UTC

Last modified: 21 Sep 2023, 18:33:34 UTC
Seems to be working fine for:
S9000 and S9050: 3 validated, 8 pending
Radeon VII: 3 validation pending
Ian&Steve C.
Volunteer developer
Volunteer tester
Joined: 23 Apr 21
Posts: 88
Credit: 119,520,561
RAC: 139,251
Message 8060 - Posted: 21 Sep 2023, 19:38:25 UTC - in response to Message 8059.  
Radeon VII: 3 validation pending


I see your Radeon VII system. What's the power draw when running Asteroids? And are you running 2x, or just one task at a time?

Lamberto Vitali

Joined: 14 Jun 23
Posts: 85
Credit: 5,914
RAC: 0
Message 8069 - Posted: 24 Sep 2023, 16:06:52 UTC
Any news on a new program version, Georgi? I'm still having to manually abort a lot of the tasks on the 280X cards (OpenCL 1.2) and a few on the Nano (OpenCL 2.0). Plus, faster processing would be nice if you can fix that memory problem.
Georgi Vidinski
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 22 Nov 17
Posts: 159
Credit: 13,180,518
RAC: 0
Message 8072 - Posted: 25 Sep 2023, 6:56:31 UTC - in response to Message 8069.  
Any news on a new program version Georgi?
Nope.

I'm still having to manually abort a lot of the tasks on the 280X cards (OpenCL 1.2) and a few on the Nano (OpenCL 2.0). Plus, faster processing would be nice if you can fix that memory problem.
Have you tried the AMD Pro drivers, as I suggested earlier?
“The good thing about science is that it's true whether or not you believe in it.” ― Neil deGrasse Tyson
Lamberto Vitali

Joined: 14 Jun 23
Posts: 85
Credit: 5,914
RAC: 0
Message 8075 - Posted: 25 Sep 2023, 23:17:10 UTC - in response to Message 8072.  
Any news on a new program version Georgi?
Nope.

I'm still having to manually abort a lot of the tasks on the 280X cards (OpenCL 1.2) and a few on the Nano (OpenCL 2.0). Plus, faster processing would be nice if you can fix that memory problem.
Have you tried the AMD Pro drivers, as I suggested earlier?
No, as they're a year older. I'm sure they caused problems with other projects before. Getting those old cards to run stably is difficult enough.

P.S. I don't get this line you draw between "AMD OpenCL Issues" and "New OpenCL application for AMD GPUs". To me they're identical subjects.
Lamberto Vitali

Joined: 14 Jun 23
Posts: 85
Credit: 5,914
RAC: 0
Message 8077 - Posted: 26 Sep 2023, 10:45:59 UTC
The Pro driver hasn't helped; I still get tasks stalling.
Lamberto Vitali

Joined: 14 Jun 23
Posts: 85
Credit: 5,914
RAC: 0
Message 8083 - Posted: 27 Sep 2023, 9:57:18 UTC

Last modified: 27 Sep 2023, 9:57:50 UTC
It seems that when the 280X (OpenCL 1.2) cards are failing a lot of tasks, restarting Windows cheers them up a little and reduces the number of failures.

That suggests something is getting stuck in the card or driver even after a task has been aborted.
Georgi Vidinski
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 22 Nov 17
Posts: 159
Credit: 13,180,518
RAC: 0
Message 8084 - Posted: 27 Sep 2023, 18:07:34 UTC - in response to Message 8083.  
That's interesting. Thanks for the feedback.
Ziran

Joined: 7 Jan 13
Posts: 1
Credit: 1,814,056
RAC: 472
Message 8085 - Posted: 28 Sep 2023, 22:47:43 UTC
Just aborted a task that made 0.1% progress in 13 hours:
https://asteroidsathome.net/boinc/result.php?resultid=402982121
Lamberto Vitali

Joined: 14 Jun 23
Posts: 85
Credit: 5,914
RAC: 0
Message 8086 - Posted: 28 Sep 2023, 23:12:23 UTC - in response to Message 8085.  
Just aborted a task that made 0.1% progress in 13 hours:
https://asteroidsathome.net/boinc/result.php?resultid=402982121
Lucky you, I'm aborting 20 a day. I'm off on holiday for a week, so I'll have to switch projects for the GPUs, since these tasks stay stuck unless someone intervenes. Hopefully I'll come back to a functional program.
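The manual aborts described in this thread come down to a simple check: a task whose progress has barely moved after many hours is very likely hung. A minimal Python sketch of that check follows. The function name `is_stalled` and both thresholds are hypothetical choices made for illustration; they are not part of the project application or the BOINC client.

```python
def is_stalled(fraction_done: float, elapsed_hours: float,
               min_hours: float = 2.0,
               min_rate_per_hour: float = 0.01) -> bool:
    """Return True when a task looks hung.

    fraction_done is in the range [0, 1]. Both thresholds are
    illustrative assumptions, not values used by the project.
    """
    if elapsed_hours < min_hours:
        return False  # too early to judge a freshly started task
    rate = fraction_done / elapsed_hours  # progress per hour
    return rate < min_rate_per_hour


# The task reported above: 0.1% progress after 13 hours.
print(is_stalled(0.001, 13.0))  # True
# A healthy task: 50% done after 3 hours.
print(is_stalled(0.5, 3.0))     # False
```

A rate threshold rather than a fixed time limit is used so that slow but progressing tasks on older cards are not flagged.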
Georgi Vidinski
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 22 Nov 17
Posts: 159
Credit: 13,180,518
RAC: 0
Message 8089 - Posted: 2 Oct 2023, 10:03:34 UTC
Hi guys,
this is about to change with the next release.

Cheers
nexiagsi16v
Joined: 26 Nov 14
Posts: 2
Credit: 1,356,141
RAC: 1,393
Message 8090 - Posted: 2 Oct 2023, 16:50:31 UTC
My system: R9 5900X, 3x RX 5700, Win10 64bit.
Only 3 of 15 tasks are valid, and one made no progress.

Here is a stderr output; I see no problems in it. Only the reported 7GB of RAM is not correct. AMD driver is 24.4.3.

Stderr output

<core_client_version>7.20.2</core_client_version>
<![CDATA[
<stderr_txt>
BOINC client version 7.20.2
BOINC GPU type 'ATI', deviceId=0, slot=0
Application: period_search_10219_windows_x86_64__opencl_102_amd_win.exe
Version: 102.19.0.0
Platform name: AMD Accelerated Parallel Processing
Platform vendor: Advanced Micro Devices, Inc.
OpenCL device C version: OpenCL C 2.0 | OpenCL 2.0 AMD-APP (3516.0)
OpenCL device Id: 0
OpenCL device name: AMD Radeon RX 5700 7GB
Device driver version: 3516.0 (PAL,LC)
Multiprocessors: 18
Max Samplers: 16
Max work item dimensions: 3
Resident blocks per multiprocessor: 16
Grid dim: 576 = 2 * 18 * 16
Block dim: 128
Binary build log for AMD Radeon RX 5700:
OK (0)
Program build log for AMD Radeon RX 5700:
OK (0)
Prefered kernel work group size multiple: 32
Setting Grid Dim to 256
21:48:00 (12140): called boinc_finish(0)

</stderr_txt>
]]>
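The `Grid dim: 576 = 2 * 18 * 16` line in the log above is a plain product of the figures printed just before it. Here is a small Python sketch of that arithmetic, assuming the leading factor of 2 is a fixed multiplier (the log does not say what it represents) and using a hypothetical helper name `grid_dim`:

```python
def grid_dim(multiprocessors: int, blocks_per_mp: int,
             multiplier: int = 2) -> int:
    """Reproduce the product shown in the 'Grid dim' log line.

    The default multiplier of 2 is taken from the log text; its
    meaning is an assumption, not documented behaviour.
    """
    return multiplier * multiprocessors * blocks_per_mp


# Values from the RX 5700 log: 18 multiprocessors, 16 resident blocks each.
print(grid_dim(18, 16))  # 576
```

The log then reports `Setting Grid Dim to 256`. The reason for that adjustment is not visible in this output, so it is not modelled here.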
Georgi Vidinski
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 22 Nov 17
Posts: 159
Credit: 13,180,518
RAC: 0
Message 8097 - Posted: 5 Oct 2023, 13:36:22 UTC
Hi everyone,

New OpenCL apps with an added bug fix for the hanging issue have been released.
I'd like to say thank you to ahorek's team for helping us to solve the problem!

Waiting for some feedback from you all.

Cheers,
Georgi
nexiagsi16v
Joined: 26 Nov 14
Posts: 2
Credit: 1,356,141
RAC: 1,393
Message 8104 - Posted: 9 Oct 2023, 18:08:23 UTC
It looks like you got it. I ran 20 WUs on my RX 5700; none of them got stuck, and little by little they are all being validated.
Bogdan2005

Joined: 12 Jul 23
Posts: 2
Credit: 1,074,550
RAC: 333
Message 8151 - Posted: 27 Nov 2023, 9:52:49 UTC
Hi Team,

Hope you're all fine.

First-time poster, so apologies if this has been discussed already - tried searching this thread but got no hits regarding Memory Leak.

Running on a Ryzen 7 6800H, Windows 11 fully patched (25997), AMD 23.11.1 - noticed that the OpenCL 102.20 processes eat up all the RAM they can find if they're suspended / resumed or if a Driver Timeout occurs.

The Nvidia CUDA 102.17 processes run without issues.

I can provide a 10 GB Process Dump, if the Admins / Devs can PM me a secure location for upload.


Thank you,

Bogdan
Georgi Vidinski
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 22 Nov 17
Posts: 159
Credit: 13,180,518
RAC: 0
Message 8153 - Posted: 27 Nov 2023, 23:53:05 UTC - in response to Message 8151.  
Hi Bogdan,

Thank you for pointing it out.
We were aware there could be issues with some AMD APUs and we are working on a solution.

We'll keep you posted with our progress on solving that issue.
Bogdan2005

Joined: 12 Jul 23
Posts: 2
Credit: 1,074,550
RAC: 333
Message 8154 - Posted: 28 Nov 2023, 19:23:26 UTC - in response to Message 8153.  
Hi Georgi,

Thank you for the quick reply and the details you've provided.

Fingers crossed for a quick and efficient solution for the APU issues ;)

Cheers,

Bogdan