Posts by ahorek's team

1) (Message 9032)
Posted 2 days ago by ahorek's team
Post:
> You need to use the unpublished application that is available from the project codebase and build it yourself and deploy it via the anonymous platform on your host.

I can't recommend that, because your wingman also needs the updated app; otherwise the task will fail and eventually error out or be marked invalid. Those WUs with LC Points >2000 should be cancelled by the server and reprocessed later, once the new app is officially released.
2) (Message 8997)
Posted 12 days ago by ahorek's team
Post:
> Has the AVX512 application been definitively discontinued?

No, it has not. https://asteroidsathome.net/boinc/forum_thread.php?id=1123#8960
3) (Message 8980)
Posted 16 May 2025 by ahorek's team
Post:
It's ultimately about the software. While you can compare the RTX 4060 and RX 9070 XT based on FLOPS in areas like FP32 for peak performance, the actual performance can vary depending on the specific workload.

You can check some compute benchmarks here:
https://www.phoronix.com/review/amd-radeon-rx9070-linux-compute/3

Software and driver optimizations can certainly help, and the Asteroids GPU app currently doesn’t scale well on high-end GPUs due to memory bottlenecks rather than compute power, so there’s definitely room for improvement on the software side for both AMD and NVIDIA cards.
4) (Message 8975)
Posted 10 May 2025 by ahorek's team
Post:
Zen 4 offers some enhancements over Zen 3: higher clock speeds, a larger L2 cache, and support for AVX-512 (limited to 2×256-bit execution). Power efficiency is also better thanks to the move from a 7 nm to a 5 nm process node.

You can see up to 30% improvement in some apps, but in most cases, the gains are only a few percent, primarily due to the increased clock speeds. If you expected more, you'll be disappointed... The days when each new CPU architecture brought a 2× performance leap are long gone.
5) (Message 8969)
Posted 30 Apr 2025 by ahorek's team
Post:
> Can we get an application that makes use of a Nvidia GPU under ARM64 too?
Tegras were popular, but the platform is now outdated. It lacks OpenCL support, and NVIDIA doesn't provide cross-compilation tools for CUDA 10. It's possible to do, but the app would be restricted to just a couple of Tegra device models and would be difficult to maintain, as the framework is already EOL.

Keith tested it and it takes about 4 hours per work unit on a GPU:
https://asteroidsathome.net/boinc/result.php?resultid=566363744

We could add support for NVIDIA Orin devices, but there likely won't be an official app for Tegras. Sorry.
6) (Message 8961)
Posted 24 Apr 2025 by ahorek's team
Post:
No, it hasn’t been dropped. We’ve transitioned to a single universal binary that detects the available instruction sets at runtime and selects the optimal code path for your CPU’s capabilities.

You can verify which version was used by checking the task details, for example:
https://asteroidsathome.net/boinc/result.php?resultid=572657726
Using AVX512 SIMD optimizations.
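
For anyone curious how that kind of runtime dispatch works, here is a minimal, self-contained sketch using GCC/Clang's __builtin_cpu_supports on x86. The function names are made up for illustration; the real selection logic lives in the dev branch.

#include <cstdio>

// Stand-ins for the real per-ISA kernels (illustration only).
static void compute_avx512() { /* AVX-512 code path */ }
static void compute_avx()    { /* AVX code path */ }
static void compute_sse3()   { /* baseline code path */ }

int main() {
    // GCC/Clang builtin that queries CPUID at runtime.
    if (__builtin_cpu_supports("avx512f")) {
        std::puts("Using AVX512 SIMD optimizations.");
        compute_avx512();
    } else if (__builtin_cpu_supports("avx")) {
        std::puts("Using AVX SIMD optimizations.");
        compute_avx();
    } else {
        std::puts("Using SSE3 SIMD optimizations.");
        compute_sse3();
    }
    return 0;
}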
7) (Message 8957)
Posted 18 Apr 2025 by ahorek's team
Post:
Adreno lacks FP64 support, so double-precision calculations need to be emulated, which involves modifying the application. This will likely take considerable effort, and the GPU version will almost certainly be MUCH SLOWER than the CPU app.
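
To give a rough idea of what "emulating FP64" means: the usual trick is double-single (float-float) arithmetic, where each value is stored as a pair of floats. Below is a generic textbook sketch of just the addition step, not the project's code; it has to be compiled without fast-math, it roughly doubles a float's precision (still short of true FP64), and it already costs several times more operations than a native add.

#include <cstdio>

// One emulated "double" = a pair of floats (high + low part).
struct dsfloat { float hi, lo; };

// Knuth's two-sum: the exact sum of two floats, split into hi + lo.
static dsfloat two_sum(float a, float b) {
    float s = a + b;
    float v = s - a;
    float e = (a - (s - v)) + (b - v);
    return {s, e};
}

// Add two double-single numbers, folding the error terms back in.
static dsfloat ds_add(dsfloat a, dsfloat b) {
    dsfloat s = two_sum(a.hi, b.hi);
    return two_sum(s.hi, s.lo + a.lo + b.lo);
}

int main() {
    dsfloat x{1.0f, 1e-9f};   // ~1.000000001, beyond plain float precision
    dsfloat y{2.0f, 3e-9f};
    dsfloat z = ds_add(x, y);
    std::printf("result ~= %.12f\n", (double)z.hi + (double)z.lo);
    return 0;
}

Multiplication, division, and transcendental functions need similar (and more expensive) tricks, which is why the emulated GPU version would almost certainly end up slower than the CPU app.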
8) (Message 8954)
Posted 16 Apr 2025 by ahorek's team
Post:
Only one; in other words, GPU and CPU tasks are the same.

Asteroids tasks run around 10 times faster on a reasonable GPU than on a single CPU core, but since modern CPUs have many cores, using CPUs is currently more efficient overall.
Some applications can be 1000+ times faster on a GPU. The raw processing power is significantly higher, but writing software that fully leverages that power is also much more challenging.

While you can attempt to run more tasks in parallel on a GPU, it usually won't lead to any performance improvement: two tasks running in parallel on the same GPU will each just run about twice as slowly.
9) (Message 8952)
Posted 16 Apr 2025 by ahorek's team
Post:
Is there a "rusticl.icd" file in your OpenCL vendor list?
ls /etc/OpenCL/vendors
ls /usr/share/OpenCL/vendors


1/ If the file is missing, your Mesa version may have been compiled without rusticl support (it depends on the distribution; I'm not familiar with GNOME Linux)
2/ If the GPU isn't supported, you should at least see rusticl listed as a platform with no available devices

Unfortunately, AMD has opted not to support OpenCL on Linux for integrated GPUs like yours, and without functional drivers, OpenCL applications just won’t run.
10) (Message 8950)
Posted 15 Apr 2025 by ahorek's team
Post:
Verify whether your GPU is supported by running clinfo (you might need to set environment variables to activate it). The output should look like this:

Number of platforms                               1
  Platform Name                                   rusticl
  Platform Vendor                                 Mesa/X.org
  Platform Version                                OpenCL 3.0
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd...
  Platform Extensions with Version                cl_khr_icd                                                       0x400000 (1.0.0)
...
  Platform Numeric Version                        0xc00000 (3.0.0)
  Platform Extensions function suffix             MESA
  Platform Host timer resolution                  1ns

  Platform Name                                   rusticl
Number of devices                                 1
  Device Name                                     AMD Radeon Graphics (radeonsi, raphael_mendocino, LLVM 20.1.0, DRM 3.61, 6.14.1-1561.native)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 3.0
  Device UUID                                     00000000-1300-0000-0000-000000000000
  Driver UUID                                     414d442d-4d45-5341-2d44-525600000000
  Valid Device LUID                               No
  Device LUID                                     0000-000000000000
  Device Node Mask                                0
  Device Numeric Version                          0xc00000 (3.0.0)
  Driver Version                                  25.1.0-devel
...


Primegrid and Einstein apps should work. Asteroids should be supported in theory, but there are some issues that need to be resolved first. Simply enabling it isn’t sufficient, as the app was either crashing or producing invalid results.
11) (Message 8948)
Posted 14 Apr 2025 by ahorek's team
Post:
btw, RustiCL may already be supported on your system; you can check:
export RUSTICL_ENABLE=radeonsi
export RUSTICL_FEATURES=fp64
clinfo


However, even if you enable it, the current Asteroids app won’t recognize it. The restriction will be removed in the next release, though there's still a low chance it will work correctly.
12) (Message 8947)
Posted 14 Apr 2025 by ahorek's team
Post:
That's because you're using Clover, which has already been deprecated. Those drivers were never usable for more than "detecting the GPU name"... https://www.phoronix.com/news/RadeonSI-Rusticl-Only

For OpenCL, you'll need a proper AMD driver (ROCm). Unfortunately, it seems that some iGPUs like the AMD Ryzen™ 5 5600G are only supported on Windows https://www.amd.com/en/support/downloads/drivers.html/processors/ryzen/ryzen-5000-series/amd-ryzen-5-5600g.html

So, if you want to use the iGPU, you’ve only got two options: switch to Windows or use RustiCL, but RustiCL still has some issues, and even if it works, it could be slower than proprietary drivers.
The last time I checked, the Asteroids app didn’t run properly on my iGPU, but you’ll still have a much better shot at getting OpenCL things working on Linux with RustiCL than with Clover.
13) (Message 8943)
Posted 11 Apr 2025 by ahorek's team
Post:
Pushing the GPU beyond its limits may lead to GUI unresponsiveness, and a malfunctioning application can freeze the system, leaving a hard reset as the only way to recover. Windows' GPU timeout protection (TDR) monitors your GPU and resets the driver if such issues occur, but resetting the GPU will cause the task to fail.
With very slow GPUs, you're more likely to encounter timeout crashes since the default timeout value is the same across all GPUs, regardless of their performance.

Ideally, the application should assess your GPU's capabilities and process smaller data chunks. This would improve GUI responsiveness and prevent timeouts, although it will also slow down the application. Each GPU model is unique, making it challenging to find the right balance that consistently works across all systems, and splitting work into smaller pieces isn't always as simple as it sounds.
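
As a rough illustration of the "smaller chunks" idea, here is a generic sketch using the plain OpenCL host API, not the project's actual code. It assumes the queue and kernel are already created, the kernel's arguments are set, and the kernel indexes its data with get_global_id(), which already includes the global work offset.

#include <CL/cl.h>
#include <algorithm>

// Launch `total` work-items in slices of `chunk` so that no single
// submission runs long enough to trip the driver's watchdog timeout.
cl_int run_in_chunks(cl_command_queue queue, cl_kernel kernel,
                     size_t total, size_t chunk) {
    for (size_t offset = 0; offset < total; offset += chunk) {
        size_t slice = std::min(chunk, total - offset);
        cl_int err = clEnqueueNDRangeKernel(queue, kernel, 1,
                                            &offset,  // global work offset
                                            &slice,   // slice size
                                            nullptr,  // let the driver pick the local size
                                            0, nullptr, nullptr);
        if (err != CL_SUCCESS) return err;
        clFinish(queue);  // wait here so the GUI gets a chance to breathe
    }
    return CL_SUCCESS;
}

Picking a good chunk size is the hard part: too small and the launch overhead dominates, too large and slow GPUs still hit the timeout.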

Even though your GPU technically has the capability to run those tasks, it still processes them more slowly than most 20-year-old CPUs. If you're still set on using it on Windows, despite the insane inefficiency, increasing the timeout could be a reasonable workaround.
14) (Message 8938)
Posted 9 Apr 2025 by ahorek's team
Post:
hi, the GPU is extremely slow. Old architecture, 3 cores, and shared memory. The problem is that it triggers Windows protection against GPU freezes.

You can probably fix it by increasing the timeout
https://manual.notch.one/0.9.23/en/docs/faq/extending-gpu-timeout-detection/

but I would rather recommend not using the GPU at all. Even the Bulldozer cores are more efficient, and utilizing the iGPU will just slow things down.
15) (Message 8936)
Posted 8 Apr 2025 by ahorek's team
Post:
> Can you use the IA tensor cores of nvidia to program ASIC of asteroids at home?

No. Tensor cores support low-precision data types like FP16 and are specifically built for AI workloads. The Asteroids app depends on FP64, and lower precisions aren't good enough. That’s also why none of the existing BOINC projects make use of tensor cores (or NPUs); they're not suitable for scientific computing. Like ASICs, tensor cores are highly specialized: great for AI, but pretty much useless for most scientific applications.
16) (Message 8931)
Posted 7 Apr 2025 by ahorek's team
Post:
Note that FP32 represents peak performance under ideal (unrealistic) conditions. Real-world applications are more complex than simply multiplying two numbers. As a result, newer architectures can outperform older GPUs in certain applications even when their FP32 performance is lower, because they can utilize the available resources more efficiently.
The numbers can give you a rough performance estimate, but it's always best to test with the specific application you plan to run.
17) (Message 8930)
Posted 7 Apr 2025 by ahorek's team
Post:
PCIe bandwidth is mostly important for SSD performance or when games exceed available GPU memory. In typical scenarios, it's primarily used during data transfers to the GPU, like when loading a game level. Compute tasks, on the other hand, spend minimal time on data loading. Most of their time is spent processing data that's already resident in GPU memory. Faster is always better in theory, but if data transfers only account for 1% of the total workload, speeding them up even 10 times won’t make a noticeable difference.

what doesn't matter:
* pcie bandwidth
* tensor / AI / wmma - no BOINC app utilizes them
* FP16, INT4... - despite great perf numbers, no BOINC app utilizes them
* TMUs (TMU * clock speed = Texture Rate) - only relevant for games
* ROPs (ROP * clock speed = Pixel Rate) - only relevant for games
* VRAM capacity - A larger size won’t improve performance, and most GPUs have more than enough capacity to handle BOINC projects.

what matters:
* GPU vendor / architecture in general
* cores - but meaningful comparisons can only be made within the same architecture and vendor
* FP32
* FP64
* INT32
* cache
* memory bandwidth
* clock speed

Those are attributes you should be looking for, but just like with games, performance can vary. Some projects see greater gains from high clock speeds, while others rely more on memory bandwidth. It depends on the type of computations and how well the application is optimized.
18) (Message 8929)
Posted 7 Apr 2025 by ahorek's team
Post:
You can find the latest code in the "dev" branch.
period_search_optimization_simd (cpu)
period_search_opencl_amd (opencl)
period_search_cuda (cuda)

If you're trying to understand what the code does, it's easier to begin with the simpler version that doesn't include any SIMD or GPU optimizations.

There's no guide or comprehensive documentation, but you can find some resources about the math behind it here:
https://dspace.cuni.cz/bitstream/handle/20.500.11956/124674/130299758.pdf?sequence=1&isAllowed=y (Czech only)
https://www.issfd.org/ISSFD_2014/ISSFD24_Paper_S10-3_Bradley.pdf

The app uses well-known algorithms like the ones below (a rough sketch of the linear-solve step follows the list):
https://en.wikipedia.org/wiki/Einstein_notation
https://en.wikipedia.org/wiki/Levenberg%E2%80%93Marquardt_algorithm
https://en.wikipedia.org/wiki/Gaussian_elimination
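
For orientation: each Levenberg-Marquardt iteration involves solving a small dense linear system, which is presumably where the Gaussian elimination comes in. Here is a textbook sketch with partial pivoting, a generic illustration rather than the project's optimized routine:

#include <cmath>
#include <cstdio>
#include <utility>
#include <vector>

// Solve A*x = b by Gaussian elimination with partial pivoting.
// A is n x n, stored row-major; both arguments are copied and the
// copies are modified during elimination.
std::vector<double> solve(std::vector<double> A, std::vector<double> b) {
    const size_t n = b.size();
    for (size_t col = 0; col < n; ++col) {
        // Pick the row with the largest pivot to keep the solve stable.
        size_t pivot = col;
        for (size_t r = col + 1; r < n; ++r)
            if (std::fabs(A[r * n + col]) > std::fabs(A[pivot * n + col]))
                pivot = r;
        for (size_t c = 0; c < n; ++c)
            std::swap(A[col * n + c], A[pivot * n + c]);
        std::swap(b[col], b[pivot]);

        // Eliminate everything below the pivot.
        for (size_t r = col + 1; r < n; ++r) {
            double f = A[r * n + col] / A[col * n + col];
            for (size_t c = col; c < n; ++c)
                A[r * n + c] -= f * A[col * n + c];
            b[r] -= f * b[col];
        }
    }
    // Back-substitution.
    std::vector<double> x(n);
    for (size_t i = n; i-- > 0;) {
        double s = b[i];
        for (size_t c = i + 1; c < n; ++c) s -= A[i * n + c] * x[c];
        x[i] = s / A[i * n + i];
    }
    return x;
}

int main() {
    // 2x2 example: x + 2y = 5, 3x + 4y = 11  ->  x = 1, y = 2
    std::vector<double> A{1, 2, 3, 4}, b{5, 11};
    std::vector<double> x = solve(A, b);
    std::printf("x = %g, y = %g\n", x[0], x[1]);
    return 0;
}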

> i think this IA can write the code c... in python?
AI can explain what each part does and write similar code in Python for educational purposes, but you still need some math and dev background to understand it.

> Can i see the code run with asembler or xdbg debugger
Sure, if you can compile it... However, the assembly output won't provide much insight into what the code does.
19) (Message 8928)
Posted 7 Apr 2025 by ahorek's team
Post:
Your GeForce RTX 5090 (Blackwell) is not supported yet, but it's planned for the next release.

NVIDIA RTX A4000 should work. CPU and GPU tasks are the same; you should be able to get tasks on both, as long as CPU/GPU tasks are enabled in the project preferences. Perhaps you already have enough GPU tasks from different projects, so the client isn't requesting new work. Check the event log.
20) (Message 8912)
Posted 4 Apr 2025 by ahorek's team
Post:
Ray tracing improvements aren’t relevant for crunching. The new generation is expected to be slightly more efficient, but I think the 9070 XT and 7900 XT should deliver similar performance.

Both cards should already be supported. Let us know once you have some results to share.

