7800xt hangs running OpenCL app


Message boards : Problems and bug reports : 7800xt hangs running OpenCL app

Kinjirra

Message 8095 - Posted: 4 Oct 2023, 21:33:24 UTC
Every now and then, I'm getting a WU that hangs and shows an expected run time of around 600 days.

Normal WUs complete in about 7 minutes.

Any ideas?
ahorek's team
Volunteer developer
Volunteer tester

Message 8098 - Posted: 5 Oct 2023, 13:49:15 UTC
A new OpenCL version of the application was released today that should address the hanging issue. Could you test it and give us feedback? Thanks!
magic_sam

Message 8103 - Posted: 8 Oct 2023, 15:37:04 UTC

Last modified: 8 Oct 2023, 15:47:49 UTC
Dear all,

Same issue here: GPU tasks remain stuck at 0.01% with an AMD Radeon RX 7900 XTX on Ubuntu 22.04.3 LTS.

The OpenCL drivers are from the ROCm 5.7 package.

I'm running the latest 102.18 GPU application.

What am I doing wrong?

Best regards,

Samuel

EDIT: Although I terminated most of the tasks while they were stuck at 0.01%, some of them failed on their own:

https://asteroidsathome.net/boinc/result.php?resultid=406507815

<core_client_version>7.20.5</core_client_version>
<![CDATA[
<message>
process exited with code 2 (0x2, -254)</message>
<stderr_txt>
BOINC client version 7.20.5
BOINC GPU type 'ATI', deviceId=0, slot=28
Application: ../../projects/asteroidsathome.net_boinc/period_search_10218_x86_64-pc-linux-gnu__opencl_101_amd_linux
Version: 102.18.0.0
Platform name: AMD Accelerated Parallel Processing
Platform vendor: Advanced Micro Devices, Inc.
No GPU device found for platform Advanced Micro Devices, Inc.(AMD Accelerated Parallel Processing)

</stderr_txt>
]]>
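For anyone comparing several failed results like the one above, here is a minimal sketch that pulls the key fields out of the stderr text. The helper name `summarize_stderr` and the dictionary keys are hypothetical and are not part of BOINC or the project application; the sketch only assumes the line layout shown in this particular log.

```python
import re


def summarize_stderr(stderr_text):
    """Extract a few fields from the stderr text of a failed task.

    Hypothetical helper: it assumes the line layout seen in the log above.
    """
    summary = {
        "gpu_type": None,
        "device_id": None,
        "app_version": None,
        "platform": None,
        "no_gpu_found": False,
    }
    for raw_line in stderr_text.splitlines():
        line = raw_line.strip()
        match = re.match(r"BOINC GPU type '([^']*)', deviceId=(\d+)", line)
        if match:
            summary["gpu_type"] = match.group(1)
            summary["device_id"] = int(match.group(2))
        elif line.startswith("Version:"):
            summary["app_version"] = line.split(":", 1)[1].strip()
        elif line.startswith("Platform name:"):
            summary["platform"] = line.split(":", 1)[1].strip()
        elif line.startswith("No GPU device found"):
            # The application could not find a GPU on the selected platform.
            summary["no_gpu_found"] = True
    return summary
```

Calling `summarize_stderr` on the text between the `<stderr_txt>` tags gives a small dictionary that can be compared across results, for example to check whether every failure carries the "No GPU device found" line.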
Georgi Vidinski
Volunteer moderator
Project administrator
Project developer
Project tester
Message 8109 - Posted: 12 Oct 2023, 4:02:40 UTC - in response to Message 8103.  
Could you please post the output of clinfo here?
magic_sam

Message 8110 - Posted: 12 Oct 2023, 11:27:35 UTC - in response to Message 8109.  
Hello,

Here's the output from clinfo:

root@kymera:~# clinfo 
Number of platforms:				 1
  Platform Profile:				 FULL_PROFILE
  Platform Version:				 OpenCL 2.1 AMD-APP (3590.0)
  Platform Name:				 AMD Accelerated Parallel Processing
  Platform Vendor:				 Advanced Micro Devices, Inc.
  Platform Extensions:				 cl_khr_icd cl_amd_event_callback 


  Platform Name:				 AMD Accelerated Parallel Processing
Number of devices:				 1
  Device Type:					 CL_DEVICE_TYPE_GPU
  Vendor ID:					 1002h
  Board name:					 Radeon RX 7900 XTX
  Device Topology:				 PCI[ B#3, D#0, F#0 ]
  Max compute units:				 48
  Max work items dimensions:			 3
    Max work items[0]:				 1024
    Max work items[1]:				 1024
    Max work items[2]:				 1024
  Max work group size:				 256
  Preferred vector width char:			 4
  Preferred vector width short:			 2
  Preferred vector width int:			 1
  Preferred vector width long:			 1
  Preferred vector width float:			 1
  Preferred vector width double:		 1
  Native vector width char:			 4
  Native vector width short:			 2
  Native vector width int:			 1
  Native vector width long:			 1
  Native vector width float:			 1
  Native vector width double:			 1
  Max clock frequency:				 2371Mhz
  Address bits:					 64
  Max memory allocation:			 21890072576
  Image support:				 Yes
  Max number of images read arguments:		 128
  Max number of images write arguments:		 8
  Max image 2D width:				 16384
  Max image 2D height:				 16384
  Max image 3D width:				 16384
  Max image 3D height:				 16384
  Max image 3D depth:				 8192
  Max samplers within kernel:			 16
  Max size of kernel argument:			 1024
  Alignment (bits) of base address:		 1024
  Minimum alignment (bytes) for any datatype:	 128
  Single precision floating point capability
    Denorms:					 Yes
    Quiet NaNs:					 Yes
    Round to nearest even:			 Yes
    Round to zero:				 Yes
    Round to +ve and infinity:			 Yes
    IEEE754-2008 fused multiply-add:		 Yes
  Cache type:					 Read/Write
  Cache line size:				 64
  Cache size:					 32768
  Global memory size:				 25753026560
  Constant buffer size:				 21890072576
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 65536
  Max pipe arguments:				 16
  Max pipe active reservations:			 16
  Max pipe packet size:				 415236096
  Max global variable size:			 21890072576
  Max global variable preferred total size:	 25753026560
  Max read/write image args:			 64
  Max on device events:				 1024
  Queue on device max size:			 8388608
  Max on device queues:				 1
  Queue on device preferred size:		 262144
  SVM capabilities:				 
    Coarse grain buffer:			 Yes
    Fine grain buffer:				 Yes
    Fine grain system:				 No
    Atomics:					 No
  Preferred platform atomic alignment:		 0
  Preferred global atomic alignment:		 0
  Preferred local atomic alignment:		 0
  Kernel Preferred work group size multiple:	 32
  Error correction support:			 0
  Unified memory for Host and Device:		 0
  Profiling timer resolution:			 1
  Device endianess:				 Little
  Available:					 Yes
  Compiler available:				 Yes
  Execution capabilities:				 
    Execute OpenCL kernels:			 Yes
    Execute native function:			 No
  Queue on Host properties:				 
    Out-of-Order:				 No
    Profiling :					 Yes
  Queue on Device properties:				 
    Out-of-Order:				 Yes
    Profiling :					 Yes
  Platform ID:					 0x7fcf391edf90
  Name:						 gfx1100
  Vendor:					 Advanced Micro Devices, Inc.
  Device OpenCL C version:			 OpenCL C 2.0 
  Driver version:				 3590.0 (HSA1.1,LC)
  Profile:					 FULL_PROFILE
  Version:					 OpenCL 2.0 
  Extensions:					 cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program 
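The listing above reports one platform with one device of type CL_DEVICE_TYPE_GPU, while the failed task's stderr reported that no GPU device was found on the same platform. To make that kind of comparison easier, here is a minimal sketch that reads saved clinfo text into a dictionary. The names `parse_clinfo` and `device_summary` are hypothetical, and the sketch assumes the "Key: value" layout shown above; other clinfo builds format their output differently, and with several devices later values overwrite earlier ones.

```python
def parse_clinfo(text):
    """Collect 'Key: value' pairs from clinfo text into a flat dict.

    Hypothetical helper: lines without a value (section headers such as
    'SVM capabilities:') are skipped, and repeated keys keep the last value.
    """
    fields = {}
    for raw_line in text.splitlines():
        key, sep, value = raw_line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def device_summary(fields):
    """Pick out the fields that show whether a GPU device is visible."""
    return {
        "platforms": int(fields.get("Number of platforms", 0)),
        "devices": int(fields.get("Number of devices", 0)),
        "device_type": fields.get("Device Type"),
        "board": fields.get("Board name"),
        "name": fields.get("Name"),
    }
```

One possible use is to save the clinfo output from the account that runs the BOINC client, pass it through `device_summary(parse_clinfo(text))`, and compare the result with the summary of the listing above.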


What am I doing wrong?

Best regards, Samuel
