Posts by (retired account)

1) (Message 3158)
Posted 6 Jun 2014 by (retired account)
Post:
Yep, exactly. You could also connect to Asteroids with one of the old ones, if that was still an option.
2) (Message 3153)
Posted 6 Jun 2014 by (retired account)
Post:
Did you calculate Asteroids on your old rig, too? In that case you should have used the same account again. But I guess you want to consolidate statistics with some other projects instead, right?

In the latter case you should use the same email address on all projects and also calculate some workunits from the other projects (with the old accounts, of course) on your new rig. That should sync the cpid again (could take some days, however).

Usually it's not necessary to add all projects to the new computer. For example, if you have two PCs and three projects A, B and C, it's usually sufficient to run A and B on the first and B and C on the second computer to keep the cpid in sync. This holds even if the first PC (or project A) no longer exists.
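
The overlap rule can be pictured as a simple connectivity check. The following is only a toy model, assuming that one shared project is enough to link two computers; it is not how BOINC actually computes the cpid, and the function name `all_linked` is made up for illustration.

```python
def all_linked(hosts):
    """Return True if every host is reachable from the first one via shared projects.

    hosts maps a computer name to the set of projects it runs.
    Toy model only: assumes one shared project is enough to link two hosts.
    """
    names = list(hosts)
    if not names:
        return True
    seen = {names[0]}
    stack = [names[0]]
    while stack:
        current = stack.pop()
        for other in names:
            # A non-empty intersection means the two hosts share a project.
            if other not in seen and hosts[current] & hosts[other]:
                seen.add(other)
                stack.append(other)
    return len(seen) == len(names)


# Example from the post: A and B on the first PC, B and C on the second.
print(all_linked({"pc1": {"A", "B"}, "pc2": {"B", "C"}}))  # True
# No common project, so nothing links the two machines.
print(all_linked({"pc1": {"A"}, "pc2": {"C"}}))            # False
```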
3) (Message 3138)
Posted 2 Jun 2014 by (retired account)
Post:
both jobs were completed and validated by two other users before they were sent to me...


Rather, they were validated after they were sent to you. One of the previous users did not report back before the deadline. This is why you got a third copy. Then the late user reported a valid result and your calculation on this work package was no longer necessary. When your computer contacted the server again, it checked whether calculation on the redundant tasks had already begun and, if not, canceled them.
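
The cancel decision described above boils down to a small filter. This is just a sketch of the rule as I understand it, not the actual BOINC scheduler code; the `Task` class and `tasks_to_cancel` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Task:
    workunit: str   # work unit this task is a copy of
    started: bool   # True once the client has begun calculating


def tasks_to_cancel(local_tasks, validated_workunits):
    """Pick the local tasks that are redundant and have not been started yet."""
    return [
        t for t in local_tasks
        if t.workunit in validated_workunits and not t.started
    ]


tasks = [
    Task("wu_1", started=False),
    Task("wu_2", started=True),
    Task("wu_3", started=False),
]
# Assume wu_1 and wu_2 were validated by the late wingman in the meantime.
print([t.workunit for t in tasks_to_cancel(tasks, {"wu_1", "wu_2"})])  # ['wu_1']
```

In this model wu_2 keeps running because it was already started, and wu_3 is untouched because it is still needed.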
4) (Message 3130)
Posted 31 May 2014 by (retired account)
Post:
"For GPUs, 64 bit does not mean faster or better in any way, only can access more memory"


Then it makes no sense to provide a true 64bit app, if you don't need the larger memory space, right? The project could still provide a renamed 32bit app for the 64bit Windows flavors.
5) (Message 3129)
Posted 31 May 2014 by (retired account)
Post:
WU : input_22147_73_short.wu
(...)
period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog :
Elapsed 220.928 secs, speedup: 22.77% ratio: 1.29x
CPU 217.356 secs, speedup: 23.20% ratio: 1.30x
(...)
CPU: AMD-A10-7700K, std clock 3.4GHz, DDR3-1866MHz, hostid=88984


I'm surprised how fast this Steamroller-Kaveri is compared to my 4.0 GHz Piledriver-Vishera. 221 vs. 216 secs elapsed time with FMA4. Are you sure it ran at 3.4 GHz and not with turbo frequency 3.8 GHz for example? Otherwise AMD must have done a hell of a job with Steamroller improvements here. ;)
6) (Message 3128)
Posted 31 May 2014 by (retired account)
Post:
During the last week I made another benchmark, this time under a more realistic setting:

- I used ten WUs from the current batch (well, last week's batch to be more specific: 150893_1, 150893_12, 150893_2, 150893_28, 150893_29, 150893_3, 150893_30, 150893_31, 150894_4 and 150894_5) and calculated the average from the elapsed time speedups again. The minimum and maximum speedups are also included below.
- the FX-8350 was again running at 4.0 GHz (no turbo, no throttling)
- within BOINC another three tasks with Crunch3r's FMA4 app were running concurrently (using an app_config.xml and the 'mode noBS' switch of the benchmark package)
- this time the reference app was period_search_10210_windows_x86_64__sse2.exe, the fastest stock app from the first run, so the baseline was 'higher' than in the first benchmark run

By using ten test WUs and running three BOINC WUs concurrently, I guess I got some more realistic figures here. One thing to note is that some workunits tend to run a bit faster with the SSE3 app while others are faster with the SSE2 app. I noticed the same during another benchmark with one of my Intels (Ivy Bridge i7). However, in both cases the differences are minimal, so it doesn't matter much if you run 64bit SSE2 or 64bit SSE3. YMMV.

Results:

32bit plain: -130.33% avg. (max. -124.79%; min. -135.50%)
64bit SSE3: +0.13% avg. (max. +1.82%; min. -1.56%)
64bit AVX: -32.97% avg. (max. -31.17%; min. -34.81%)
64bit FMA4: +10.91% avg. (max. +12.35%; min. +9.68%)

Again a significant speedup of approx. 10% with the FMA4 app and no big difference between 64bit SSE2 and 64bit SSE3. AVX is out of the game again.
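
For anyone who wants to repeat this, the avg./max./min. figures are plain statistics over the per-WU speedups. A minimal sketch, with a made-up helper name and hypothetical input values (the individual per-WU numbers are not listed in this post):

```python
def summarize(speedups):
    """Return (average, maximum, minimum) of a list of speedups in percent."""
    avg = sum(speedups) / len(speedups)
    return avg, max(speedups), min(speedups)


# Hypothetical per-WU speedups in percent, NOT taken from the benchmark above.
sample = [10.0, 12.0, 11.0]
avg, best, worst = summarize(sample)
print(f"{avg:+.2f}% avg. (max. {best:+.2f}%; min. {worst:+.2f}%)")
```

Note that for negative speedups "max." is the least slow run, which matches how the 32bit and AVX lines are listed.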
7) (Message 3119)
Posted 30 May 2014 by (retired account)
Post:
Hi Rapture,

"permanent HTTP error" is very common for me, since I mostly connect with a slow 2G mobile connection (+ EDGE at least :) ).

If you're also on a slow / mobile connection, try unchecking "skip image file verification" in the preferences of the BOINC manager. Might help, you never know.

But the best solution is to download the file with the browser, as BilBg suggested, and to copy it manually into the \ProgramData\BOINC\projects\asteroidsathome.net_boinc\ subdirectory. Start BOINC again and it will skip the download since the file is already on the local hard drive.
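
If you have to do this often, the copy step can be scripted. A minimal sketch, assuming BOINC is stopped while the file is copied; `place_download` is a made-up helper, and the drive letter and file name in the usage comment are placeholders.

```python
import shutil
from pathlib import Path


def place_download(downloaded_file, project_dir):
    """Copy a manually downloaded file into the project directory.

    Returns True if the file was copied, False if it was already there.
    """
    source = Path(downloaded_file)
    target = Path(project_dir) / source.name
    if target.exists():
        return False
    shutil.copy2(source, target)
    return True


# Usage (placeholder paths, adjust to your own system):
# place_download(r"C:\Users\me\Downloads\some_file",
#                r"C:\ProgramData\BOINC\projects\asteroidsathome.net_boinc")
```

The helper never overwrites an existing file, so a download that BOINC already completed itself is left alone.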

Btw, I'm still using BOINC 7.2.33 and currently have no problem downloading new work. YMMV.

Cheers
8) (Message 3099)
Posted 23 May 2014 by (retired account)
Post:
Some more figures from my two Intels. Both are notebooks and they will throttle under full load, so the figures below will not be achieved then.

i7-3720QM (Ivy Bridge), avg. clock during tests 3.3 GHz

WU : input_22147_73_short.wu 
period_search_10210_windows_intelx86__sse2.exe -verb -nog :
  Elapsed 189.790 secs
      CPU 187.357 secs
period_search_10210_windows_intelx86.exe -verb -nog :
  Elapsed 386.662 secs, speedup: -103.73%  ratio: 0.49x
      CPU 384.433 secs, speedup: -105.19%  ratio: 0.49x
period_search_10210_windows_intelx86__avx.exe -verb -nog :
  Elapsed 200.648 secs, speedup: -5.72%  ratio: 0.95x
      CPU 198.418 secs, speedup: -5.90%  ratio: 0.94x
period_search_10210_windows_intelx86__sse2.exe -verb -nog :
  Elapsed 189.634 secs, speedup: 0.08%  ratio: 1.00x
      CPU 187.373 secs, speedup: -0.01%  ratio: 1.00x
period_search_10210_windows_intelx86__sse3.exe -verb -nog :
  Elapsed 181.210 secs, speedup: 4.52%  ratio: 1.05x
      CPU 179.011 secs, speedup: 4.45%  ratio: 1.05x
period_search_10210_windows_x86_64__avx.exe -verb -nog :
  Elapsed 182.052 secs, speedup: 4.08%  ratio: 1.04x
      CPU 179.932 secs, speedup: 3.96%  ratio: 1.04x
period_search_10210_windows_x86_64__sse2.exe -verb -nog :
  Elapsed 168.028 secs, speedup: 11.47%  ratio: 1.13x
      CPU 165.735 secs, speedup: 11.54%  ratio: 1.13x
period_search_10210_windows_x86_64__sse3.exe -verb -nog :
  Elapsed 172.244 secs, speedup: 9.24%  ratio: 1.10x
      CPU 170.103 secs, speedup: 9.21%  ratio: 1.10x


i7-3632QM (Ivy Bridge), avg. clock during CPU tests 2.9 GHz, nVIDIA GeForce GT 650M

WU : input_22147_73_short.wu 
period_search_10210_windows_intelx86__sse2.exe -verb -nog :
  Elapsed 216.072 secs
      CPU 212.349 secs
period_search_10112_windows_intelx86__cuda55.exe -verb -nog :
  Elapsed 410.356 secs, speedup: -89.92%  ratio: 0.53x
      CPU 2.886 secs, speedup: 98.64%  ratio: 73.58x
period_search_10112_windows_x86_64__cuda55.exe -verb -nog :
  Elapsed 477.314 secs, speedup: -120.91%  ratio: 0.45x
      CPU 2.808 secs, speedup: 98.68%  ratio: 75.62x
period_search_10210_windows_intelx86.exe -verb -nog :
  Elapsed 433.743 secs, speedup: -100.74%  ratio: 0.50x
      CPU 429.892 secs, speedup: -102.45%  ratio: 0.49x
period_search_10210_windows_intelx86__avx.exe -verb -nog :
  Elapsed 227.043 secs, speedup: -5.08%  ratio: 0.95x
      CPU 223.066 secs, speedup: -5.05%  ratio: 0.95x
period_search_10210_windows_intelx86__sse2.exe -verb -nog :
  Elapsed 213.876 secs, speedup: 1.02%  ratio: 1.01x
      CPU 210.102 secs, speedup: 1.06%  ratio: 1.01x
period_search_10210_windows_intelx86__sse3.exe -verb -nog :
  Elapsed 201.677 secs, speedup: 6.66%  ratio: 1.07x
      CPU 198.542 secs, speedup: 6.50%  ratio: 1.07x
period_search_10210_windows_x86_64__avx.exe -verb -nog :
  Elapsed 205.078 secs, speedup: 5.09%  ratio: 1.05x
      CPU 201.475 secs, speedup: 5.12%  ratio: 1.05x
period_search_10210_windows_x86_64__sse2.exe -verb -nog :
  Elapsed 197.574 secs, speedup: 8.56%  ratio: 1.09x
      CPU 190.368 secs, speedup: 10.35%  ratio: 1.12x
period_search_10210_windows_x86_64__sse3.exe -verb -nog :
  Elapsed 195.328 secs, speedup: 9.60%  ratio: 1.11x
      CPU 191.273 secs, speedup: 9.93%  ratio: 1.11x


Surprisingly, the 64bit SSE2 app is fastest in the first run while the 64bit SSE3 app comes out on top in the second, but only by a small margin. The AVX apps, especially the 32bit flavor, are disappointing (are those maybe a bit overoptimized for Haswell? ;) ). And again, btw, the 64bit CUDA app is slower than the 32bit one, strange.
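
In case the speedup and ratio columns look odd: they appear to follow directly from the times of the reference run. The formulas below are inferred from the printed numbers, not taken from the bench script, and the function names are mine.

```python
def speedup_percent(ref_secs, app_secs):
    """Time saved relative to the reference run, in percent."""
    return (ref_secs - app_secs) / ref_secs * 100.0


def ratio(ref_secs, app_secs):
    """How many times faster than the reference the app ran."""
    return ref_secs / app_secs


# Elapsed times from the i7-3720QM run: 32bit SSE2 reference vs. 32bit plain app.
ref, plain = 189.790, 386.662
print(f"speedup: {speedup_percent(ref, plain):.2f}%  ratio: {ratio(ref, plain):.2f}x")
# prints "speedup: -103.73%  ratio: 0.49x"
```

This also explains why an app that takes about twice as long shows up as roughly -100% and not -50%.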
9) (Message 3098)
Posted 23 May 2014 by (retired account)
Post:
Great, works now as intended. Thanks a lot!

If you accidentally delete some file or any other bad change - just uncompress the .7z package again


Yes, sure. But both \Reference subdirectories "are intended to store what you plan to keep within reach" (quote from the readme :) ), so it makes sense to put all apps and test WUs in it IMHO.

This time I ran it concurrently with 3 Asteroids tasks within BOINC by using the 'mode noBS' switch in the provided BenchCfg.txt (and an app_config.xml for BOINC). Works great, too.
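
For reference, an app_config.xml that limits BOINC to three concurrent tasks can be generated like this. It is a sketch only: the app name "period_search" is my assumption (check the name used in your client_state.xml), and `build_app_config` is a made-up helper. The resulting file goes into the project directory.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, max_concurrent):
    """Build a minimal app_config.xml body that caps concurrent tasks of one app."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


# "period_search" is an assumed app name, see the note above.
print(build_app_config("period_search", 3))
```

After saving the file, BOINC has to re-read its config files (or be restarted) before the limit takes effect.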

MBbench210_Asteroids.cmd 
====================================== 
 1 testWU(s) found  
     (input_22147_73_short.wu) 
 1 reference science app(s) found 
     (period_search_10210_windows_intelx86__sse2.exe -verb -nog) 
 10 science app(s) found 
     (period_search_10112_windows_intelx86__cuda55.exe -verb -nog) 
     (period_search_10112_windows_x86_64__cuda55.exe -verb -nog) 
     (period_search_10210_windows_intelx86.exe -verb -nog) 
     (period_search_10210_windows_intelx86__avx.exe -verb -nog) 
     (period_search_10210_windows_intelx86__sse2.exe -verb -nog) 
     (period_search_10210_windows_intelx86__sse3.exe -verb -nog) 
     (period_search_10210_windows_x86_64__avx.exe -verb -nog) 
     (period_search_10210_windows_x86_64__sse2.exe -verb -nog) 
     (period_search_10210_windows_x86_64__sse3.exe -verb -nog) 
     (period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog) 
====================================== 
period_search_10210_windows_intelx86__sse2.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10210_windows_intelx86__sse2.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 08:45:07.825 
Ended at    : 08:50:06.829 
Result      : stored as ref for validations. 
    298.973 secs Elapsed
    296.542 secs CPU time
 
[ stderr ]
08:45:07 (4928): Can't open init data file - running in standalone mode
08:50:04 (4928): called boinc_finish
[ /stderr ]
------------ 
period_search_10112_windows_intelx86__cuda55.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10112_windows_intelx86__cuda55.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 08:50:09.949 
Ended at    : 08:52:14.313 
    124.317 secs Elapsed
      0.998 secs CPU time
Speedup     : 99.66%
Ratio       : 297.14x
 
[ stderr ] 
08:50:09 (3572): Can't open init data file - running in standalone mode
CUDA RC12!!!!!!!!!!
CUDA Device number: 0
CUDA Device: GeForce GTX TITAN
Compute capability: 3.5
Multiprocessors: 14
Grid dim: 224 = 14*16
Block dim: 128
08:52:12 (3572): called boinc_finish
[ /stderr ] 
 
------------ 


(...)

------------ 
Quick timetable 
 
WU : input_22147_73_short.wu 
period_search_10210_windows_intelx86__sse2.exe -verb -nog :
  Elapsed 298.973 secs
      CPU 296.542 secs
period_search_10112_windows_intelx86__cuda55.exe -verb -nog :
  Elapsed 124.317 secs, speedup: 58.42%  ratio: 2.40x
      CPU 0.998 secs, speedup: 99.66%  ratio: 297.14x
period_search_10112_windows_x86_64__cuda55.exe -verb -nog :
  Elapsed 156.671 secs, speedup: 47.60%  ratio: 1.91x
      CPU 1.435 secs, speedup: 99.52%  ratio: 206.65x
period_search_10210_windows_intelx86.exe -verb -nog :
  Elapsed 610.497 secs, speedup: -104.20%  ratio: 0.49x
      CPU 607.717 secs, speedup: -104.93%  ratio: 0.49x
period_search_10210_windows_intelx86__avx.exe -verb -nog :
  Elapsed 610.913 secs, speedup: -104.34%  ratio: 0.49x
      CPU 608.092 secs, speedup: -105.06%  ratio: 0.49x
period_search_10210_windows_intelx86__sse2.exe -verb -nog :
  Elapsed 308.726 secs, speedup: -3.26%  ratio: 0.97x
      CPU 306.277 secs, speedup: -3.28%  ratio: 0.97x
period_search_10210_windows_intelx86__sse3.exe -verb -nog :
  Elapsed 284.498 secs, speedup: 4.84%  ratio: 1.05x
      CPU 281.987 secs, speedup: 4.91%  ratio: 1.05x
period_search_10210_windows_x86_64__avx.exe -verb -nog :
  Elapsed 356.507 secs, speedup: -19.24%  ratio: 0.84x
      CPU 353.951 secs, speedup: -19.36%  ratio: 0.84x
period_search_10210_windows_x86_64__sse2.exe -verb -nog :
  Elapsed 262.440 secs, speedup: 12.22%  ratio: 1.14x
      CPU 259.960 secs, speedup: 12.34%  ratio: 1.14x
period_search_10210_windows_x86_64__sse3.exe -verb -nog :
  Elapsed 266.152 secs, speedup: 10.98%  ratio: 1.12x
      CPU 263.735 secs, speedup: 11.06%  ratio: 1.12x
period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog :
  Elapsed 238.649 secs, speedup: 20.18%  ratio: 1.25x
      CPU 236.014 secs, speedup: 20.41%  ratio: 1.26x
 
------------ 

10) (Message 3097)
Posted 22 May 2014 by (retired account)
Post:
A first run of benchmarks is finished.

- I used two of the not shortened WUs from BilBG's bench package (input_22147_73.wu and input_22152_83.wu) and calculated the average from both elapsed time speedups.
- the cpu is an AMD FX-8350 (Piledriver) running at 4.0 GHz (no turbo, no throttling)
- no other cpu-intense tasks were running
- the reference app (baseline) was period_search_10210_windows_intelx86__sse2.exe

Results:

32bit plain: -99.8%
32bit SSE2: +2.8% (same as reference, only for control)
32bit SSE3: +8.4%
32bit AVX: -105.0%
64bit SSE2: +16.6%
64bit SSE3: +16.0%
64bit AVX: -19.3%
64bit FMA4: +22.9%

This confirms again that the AVX app is not suited at all for the AMD FX and that the SSE3 app has no or little advantage over SSE2 for that processor. But it shows that Crunch3r's FMA4 app has a significant speedup over the fastest stock app (64bit SSE2).

Quite surprising to me is the result for the 32bit AVX app. It's as slow as the plain app and much slower than the 64bit variant. Can anybody confirm this?
11) (Message 3091)
Posted 22 May 2014 by (retired account)
Post:
(oops)
12) (Message 3090)
Posted 22 May 2014 by (retired account)
Post:
... btw, same is true for the period_search_10210_windows_intelx86__sse3.exe app, which should also be included in .\Science_apps\Reserve IMHO.
13) (Message 3089)
Posted 22 May 2014 by (retired account)
Post:
BilBg, it might be a good idea to include a copy of the default test wu "input_22147_73_short.wu" also in the subdirectory .\TestWUs\Reserve\Short of your bench package. Otherwise it could get lost accidentally if one changes the WUs to bench. ;)
14) (Message 3088)
Posted 22 May 2014 by (retired account)
Post:
Good mornin'


period_search_10210_windows_x86_64__sse3.exe -verb -nog :
Elapsed 2.018 secs, speedup: 99.25% ratio: 133.56x
CPU 0.016 secs, speedup: 99.99% ratio: 16648.25x


I see, you're having the same problem with the second run (not including the ref).
15) (Message 3087)
Posted 22 May 2014 by (retired account)
Post:
After receiving the link to the FMA4 app from Crunch3r at the beginning of last week by PM - thank you! - and after the end of the 2014 Pentathlon :) I have run a few dozen workunits on my FX-8350 and they all validated ok. Compared to wingmen there is some indication of speedup, but as BilBg already pointed out, it's hard to compare due to the differences between WUs and due to the fact that you don't know the exact settings of the wingmen's computers (such as clock speed, number of threads used per cpu, throttling, hyperthreading on/off, other programs running etc.). Hence I'll also try to get some results with BilBg's bench package posted yesterday.

One question concerning AVX and FMA4 on the Bulldozer: Do these instruction sets benefit from using both 128bit FPUs of one module exclusively? In that case there should be a difference between running one thread per core (i.e. 8 threads on a FX-8xxx) and one thread per module (i.e. 4 threads on a FX-8xxx), right?
16) (Message 3085)
Posted 22 May 2014 by (retired account)
Post:
MBbench210_Asteroids.cmd 
====================================== 
 1 testWU(s) found  
     (input_22147_73_short.wu) 
 1 reference science app(s) found 
     (period_search_10210_windows_intelx86__sse2.exe -verb -nog) 
 2 science app(s) found 
     (period_search_10112_windows_x86_64__cuda55.exe -verb -nog) 
     (period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog) 
====================================== 
period_search_10210_windows_intelx86__sse2.exe -verb -nog / input_22147_73_short.wu : 
Result cached, skipping execution 
    281.144 secs Elapsed
    278.337 secs CPU time
 
Stderr.txt  : not found 
------------ 
period_search_10112_windows_x86_64__cuda55.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10112_windows_x86_64__cuda55.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 03:16:20.165 
Ended at    : 03:18:57.023 
    156.780 secs Elapsed
      2.777 secs CPU time
Speedup     : 99.00%
Ratio       : 100.23x
 
[ stderr ] 
03:16:20 (5812): Can't open init data file - running in standalone mode
CUDA RC12!!!!!!!!!!
CUDA Device number: 0
CUDA Device: GeForce GTX TITAN
Compute capability: 3.5
Multiprocessors: 14
Grid dim: 224 = 14*16
Block dim: 128
03:18:54 (5812): called boinc_finish
[ /stderr ] 
 
------------ 
period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10210_windows_x86_64_bd_fma4_gcc.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 03:19:00.221 
Ended at    : 03:22:36.734 
    216.466 secs Elapsed
    213.721 secs CPU time
Speedup     : 23.22%
Ratio       : 1.30x
 
[ stderr ] 
03:19:00 (6564): Can't open init data file - running in standalone mode

Using: FMA4

03:22:34 (6564): called boinc_finish
[ /stderr ] 
 
------------ 
Quick timetable 
 
WU : input_22147_73_short.wu 
period_search_10210_windows_intelx86__sse2.exe -verb -nog :
  Elapsed 281.144 secs
      CPU 278.337 secs
period_search_10112_windows_x86_64__cuda55.exe -verb -nog :
  Elapsed 156.780 secs, speedup: 44.23%  ratio: 1.79x
      CPU 2.777 secs, speedup: 99.00%  ratio: 100.23x
period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog :
  Elapsed 216.466 secs, speedup: 23.01%  ratio: 1.30x
      CPU 213.721 secs, speedup: 23.22%  ratio: 1.30x
 
------------ 

17) (Message 3084)
Posted 22 May 2014 by (retired account)
Post:
MBbench210_Asteroids.cmd 
====================================== 
 1 testWU(s) found  
     (input_22147_73_short.wu) 
 1 reference science app(s) found 
     (period_search_10210_windows_intelx86__sse2.exe -verb -nog) 
 9 science app(s) found 
     (period_search_10112_windows_intelx86__cuda55.exe -verb -nog) 
     (period_search_10112_windows_x86_64__cuda55.exe -verb -nog) 
     (period_search_10210_windows_intelx86.exe -verb -nog) 
     (period_search_10210_windows_intelx86__avx.exe -verb -nog) 
     (period_search_10210_windows_intelx86__sse3.exe -verb -nog) 
     (period_search_10210_windows_x86_64__avx.exe -verb -nog) 
     (period_search_10210_windows_x86_64__sse2.exe -verb -nog) 
     (period_search_10210_windows_x86_64__sse3.exe -verb -nog) 
     (period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog) 
====================================== 
period_search_10210_windows_intelx86__sse2.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10210_windows_intelx86__sse2.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 02:30:59.385 
Ended at    : 02:35:40.576 
Result      : stored as ref for validations. 
    281.144 secs Elapsed
    278.337 secs CPU time
 
[ stderr ]
02:30:59 (6000): Can't open init data file - running in standalone mode
02:35:38 (6000): called boinc_finish
[ /stderr ]
------------ 
period_search_10112_windows_intelx86__cuda55.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10112_windows_intelx86__cuda55.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 02:35:43.711 
Ended at    : 02:37:50.050 
    126.292 secs Elapsed
      2.465 secs CPU time
Speedup     : 99.11%
Ratio       : 112.92x
 
R2: .\ref\ref-period_search_10210_windows_intelx86__sse2.exe-input_22147_73_short.wu.res 
Result      : Strongly similar,  Q= 1.010e+004%
 
[ stderr ] 
02:35:43 (4480): Can't open init data file - running in standalone mode
CUDA RC12!!!!!!!!!!
CUDA Device number: 0
CUDA Device: GeForce GTX TITAN
Compute capability: 3.5
Multiprocessors: 14
Grid dim: 224 = 14*16
Block dim: 128
02:37:47 (4480): called boinc_finish
[ /stderr ] 
 
------------ 
period_search_10112_windows_x86_64__cuda55.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10112_windows_x86_64__cuda55.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 02:37:53.295 
Ended at    : 02:37:55.619 
      2.278 secs Elapsed
      0.234 secs CPU time
Speedup     : 99.92%
Ratio       : 1189.47x
 
R2: .\ref\ref-period_search_10210_windows_intelx86__sse2.exe-input_22147_73_short.wu.res 
Result      : Strongly similar,  Q= 1.010e+004%
 
[ stderr ] 
02:37:53 (6032): Can't open init data file - running in standalone mode
02:37:53 (6032): called boinc_finish
[ /stderr ] 
 
------------ 
period_search_10210_windows_intelx86.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10210_windows_intelx86.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 02:37:58.848 
Ended at    : 02:47:37.573 
    578.662 secs Elapsed
    575.800 secs CPU time
Speedup     : -106.87%
Ratio       : 0.48x
 
R2: .\ref\ref-period_search_10210_windows_intelx86__sse2.exe-input_22147_73_short.wu.res 
Result      : Strongly similar,  Q= 1.010e+004%
 
[ stderr ] 
02:37:58 (1956): Can't open init data file - running in standalone mode
02:47:35 (1956): called boinc_finish
[ /stderr ] 
 
------------ 
period_search_10210_windows_intelx86__avx.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10210_windows_intelx86__avx.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 02:47:40.818 
Ended at    : 02:47:42.940 
      2.044 secs Elapsed
      0.031 secs CPU time
Speedup     : 99.99%
Ratio       : 8978.61x
 
R2: .\ref\ref-period_search_10210_windows_intelx86__sse2.exe-input_22147_73_short.wu.res 
Result      : Strongly similar,  Q= 1.010e+004%
 
[ stderr ] 
02:47:40 (4980): Can't open init data file - running in standalone mode
02:47:40 (4980): called boinc_finish
[ /stderr ] 
 
------------ 
period_search_10210_windows_intelx86__sse3.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10210_windows_intelx86__sse3.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 02:47:46.184 
Ended at    : 02:47:48.259 
      2.028 secs Elapsed
      0.016 secs CPU time
Speedup     : 99.99%
Ratio       : 17396.06x
 
R2: .\ref\ref-period_search_10210_windows_intelx86__sse2.exe-input_22147_73_short.wu.res 
Result      : Strongly similar,  Q= 1.010e+004%
 
[ stderr ] 
02:47:46 (3572): Can't open init data file - running in standalone mode
02:47:46 (3572): called boinc_finish
[ /stderr ] 
 
------------ 
period_search_10210_windows_x86_64__avx.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10210_windows_x86_64__avx.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 02:47:51.504 
Ended at    : 02:47:53.595 
      2.028 secs Elapsed
      0.016 secs CPU time
Speedup     : 99.99%
Ratio       : 17396.06x
 
R2: .\ref\ref-period_search_10210_windows_intelx86__sse2.exe-input_22147_73_short.wu.res 
Result      : Strongly similar,  Q= 1.010e+004%
 
[ stderr ] 
02:47:51 (5224): Can't open init data file - running in standalone mode
02:47:51 (5224): called boinc_finish
[ /stderr ] 
 
------------ 
period_search_10210_windows_x86_64__sse2.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10210_windows_x86_64__sse2.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 02:47:56.855 
Ended at    : 02:47:58.961 
      2.028 secs Elapsed
      0.016 secs CPU time
Speedup     : 99.99%
Ratio       : 17396.06x
 
R2: .\ref\ref-period_search_10210_windows_intelx86__sse2.exe-input_22147_73_short.wu.res 
Result      : Strongly similar,  Q= 1.010e+004%
 
[ stderr ] 
02:47:56 (5784): Can't open init data file - running in standalone mode
02:47:56 (5784): called boinc_finish
[ /stderr ] 
 
------------ 
period_search_10210_windows_x86_64__sse3.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10210_windows_x86_64__sse3.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 02:48:02.206 
Ended at    : 02:48:04.281 
      2.028 secs Elapsed
      0.016 secs CPU time
Speedup     : 99.99%
Ratio       : 17396.06x
 
R2: .\ref\ref-period_search_10210_windows_intelx86__sse2.exe-input_22147_73_short.wu.res 
Result      : Strongly similar,  Q= 1.010e+004%
 
[ stderr ] 
02:48:02 (6844): Can't open init data file - running in standalone mode
02:48:02 (6844): called boinc_finish
[ /stderr ] 
 
------------ 
period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog / input_22147_73_short.wu : 
AppName: period_search_10210_windows_x86_64_bd_fma4_gcc.exe 
AppArgs: -verb -nog 
TaskName: input_22147_73_short.wu 
Started at  : 02:48:07.525 
Ended at    : 02:48:09.647 
      2.044 secs Elapsed
      0.031 secs CPU time
Speedup     : 99.99%
Ratio       : 8978.61x
 
R2: .\ref\ref-period_search_10210_windows_intelx86__sse2.exe-input_22147_73_short.wu.res 
Result      : Strongly similar,  Q= 1.010e+004%
 
[ stderr ] 
02:48:07 (1964): Can't open init data file - running in standalone mode

Using: FMA4

02:48:07 (1964): called boinc_finish
[ /stderr ] 
 
------------ 
Quick timetable 
 
WU : input_22147_73_short.wu 
period_search_10210_windows_intelx86__sse2.exe -verb -nog :
  Elapsed 281.144 secs
      CPU 278.337 secs
period_search_10112_windows_intelx86__cuda55.exe -verb -nog :
  Elapsed 126.292 secs, speedup: 55.08%  ratio: 2.23x
      CPU 2.465 secs, speedup: 99.11%  ratio: 112.92x
period_search_10112_windows_x86_64__cuda55.exe -verb -nog :
  Elapsed 2.278 secs, speedup: 99.19%  ratio: 123.42x
      CPU 0.234 secs, speedup: 99.92%  ratio: 1189.47x
period_search_10210_windows_intelx86.exe -verb -nog :
  Elapsed 578.662 secs, speedup: -105.82%  ratio: 0.49x
      CPU 575.800 secs, speedup: -106.87%  ratio: 0.48x
period_search_10210_windows_intelx86__avx.exe -verb -nog :
  Elapsed 2.044 secs, speedup: 99.27%  ratio: 137.55x
      CPU 0.031 secs, speedup: 99.99%  ratio: 8978.61x
period_search_10210_windows_intelx86__sse3.exe -verb -nog :
  Elapsed 2.028 secs, speedup: 99.28%  ratio: 138.63x
      CPU 0.016 secs, speedup: 99.99%  ratio: 17396.06x
period_search_10210_windows_x86_64__avx.exe -verb -nog :
  Elapsed 2.028 secs, speedup: 99.28%  ratio: 138.63x
      CPU 0.016 secs, speedup: 99.99%  ratio: 17396.06x
period_search_10210_windows_x86_64__sse2.exe -verb -nog :
  Elapsed 2.028 secs, speedup: 99.28%  ratio: 138.63x
      CPU 0.016 secs, speedup: 99.99%  ratio: 17396.06x
period_search_10210_windows_x86_64__sse3.exe -verb -nog :
  Elapsed 2.028 secs, speedup: 99.28%  ratio: 138.63x
      CPU 0.016 secs, speedup: 99.99%  ratio: 17396.06x
period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog :
  Elapsed 2.044 secs, speedup: 99.27%  ratio: 137.55x
      CPU 0.031 secs, speedup: 99.99%  ratio: 8978.61x
 
------------ 

18) (Message 3083)
Posted 22 May 2014 by (retired account)
Post:
Hello BilBg,

thanks a lot for the bench package. Should help a lot to get some more reliable comparisons. I was just looking for something like that. :) The usage is fairly straightforward IMHO.

However, have you tried to run the bench with more than two science apps (reference and another) in one run? I tried to run all apps with your default short workunit on a Win7 64bit system with AMD FX-8350 & GTX Titan and I guess something went wrong with the second GPU bench and all CPU benches after the reference and the second app. I have a hunch that some leftover from the previous run interfered here, could that be? The script MBbench210_Asteroids.cmd is fairly complex to me, although I should be able to understand it, so it might take me some time to find the reason here. So help is appreciated. :)

Here's the quick timetable. The results with an elapsed time of 2.something secs appear to be bogus.

WU : input_22147_73_short.wu 
period_search_10210_windows_intelx86__sse2.exe -verb -nog :
  Elapsed 281.144 secs
      CPU 278.337 secs
period_search_10112_windows_intelx86__cuda55.exe -verb -nog :
  Elapsed 126.292 secs, speedup: 55.08%  ratio: 2.23x
      CPU 2.465 secs, speedup: 99.11%  ratio: 112.92x
period_search_10112_windows_x86_64__cuda55.exe -verb -nog :
  Elapsed 2.278 secs, speedup: 99.19%  ratio: 123.42x
      CPU 0.234 secs, speedup: 99.92%  ratio: 1189.47x
period_search_10210_windows_intelx86.exe -verb -nog :
  Elapsed 578.662 secs, speedup: -105.82%  ratio: 0.49x
      CPU 575.800 secs, speedup: -106.87%  ratio: 0.48x
period_search_10210_windows_intelx86__avx.exe -verb -nog :
  Elapsed 2.044 secs, speedup: 99.27%  ratio: 137.55x
      CPU 0.031 secs, speedup: 99.99%  ratio: 8978.61x
period_search_10210_windows_intelx86__sse3.exe -verb -nog :
  Elapsed 2.028 secs, speedup: 99.28%  ratio: 138.63x
      CPU 0.016 secs, speedup: 99.99%  ratio: 17396.06x
period_search_10210_windows_x86_64__avx.exe -verb -nog :
  Elapsed 2.028 secs, speedup: 99.28%  ratio: 138.63x
      CPU 0.016 secs, speedup: 99.99%  ratio: 17396.06x
period_search_10210_windows_x86_64__sse2.exe -verb -nog :
  Elapsed 2.028 secs, speedup: 99.28%  ratio: 138.63x
      CPU 0.016 secs, speedup: 99.99%  ratio: 17396.06x
period_search_10210_windows_x86_64__sse3.exe -verb -nog :
  Elapsed 2.028 secs, speedup: 99.28%  ratio: 138.63x
      CPU 0.016 secs, speedup: 99.99%  ratio: 17396.06x
period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog :
  Elapsed 2.044 secs, speedup: 99.27%  ratio: 137.55x
      CPU 0.031 secs, speedup: 99.99%  ratio: 8978.61x
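
A quick way to spot such bogus entries in a quick timetable is to flag every app whose elapsed time is implausibly short. This is a rough sketch; the 10 second threshold and the name `flag_bogus` are my own assumptions, and the parsing only relies on the line layout shown above.

```python
import re

ELAPSED = re.compile(r"^\s*Elapsed\s+([0-9.]+)\s+secs")


def flag_bogus(timetable, min_secs=10.0):
    """Return the apps whose elapsed time is below min_secs."""
    bogus = []
    current_app = None
    for line in timetable.splitlines():
        stripped = line.strip()
        # App header lines look like "<name>.exe -verb -nog :"
        if stripped.endswith(":") and ".exe" in stripped:
            current_app = stripped.split(".exe")[0] + ".exe"
            continue
        match = ELAPSED.match(line)
        if match and current_app and float(match.group(1)) < min_secs:
            bogus.append(current_app)
    return bogus


sample = """\
period_search_10112_windows_intelx86__cuda55.exe -verb -nog :
  Elapsed 126.292 secs, speedup: 55.08%  ratio: 2.23x
      CPU 2.465 secs, speedup: 99.11%  ratio: 112.92x
period_search_10112_windows_x86_64__cuda55.exe -verb -nog :
  Elapsed 2.278 secs, speedup: 99.19%  ratio: 123.42x
      CPU 0.234 secs, speedup: 99.92%  ratio: 1189.47x
"""
print(flag_bogus(sample))  # ['period_search_10112_windows_x86_64__cuda55.exe']
```

Only the elapsed time is checked, because the CUDA apps legitimately use very little CPU time.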


The result file from the first run is included below.

EDIT: Just finished a second run with only one GPU app and one CPU app and this time it worked. So there is definitely a problem with running two GPU apps or two CPU apps (not including the reference run) one after another, at least on my system. And it is not caused by the apps themselves, they seem to be ok.

Here's the quick timetable from the second run.

WU : input_22147_73_short.wu 
period_search_10210_windows_intelx86__sse2.exe -verb -nog :
  Elapsed 281.144 secs
      CPU 278.337 secs
period_search_10112_windows_x86_64__cuda55.exe -verb -nog :
  Elapsed 156.780 secs, speedup: 44.23%  ratio: 1.79x
      CPU 2.777 secs, speedup: 99.00%  ratio: 100.23x
period_search_10210_windows_x86_64_bd_fma4_gcc.exe -verb -nog :
  Elapsed 216.466 secs, speedup: 23.01%  ratio: 1.30x
      CPU 213.721 secs, speedup: 23.22%  ratio: 1.30x


I have no idea why the 64bit CUDA app is slower here than the 32bit app in the first run. In both bench runs, the CPU was running at 4.0 GHz (no turbo or throttling) during the cpu tests, the GPU at 980 / 1500 MHz (not overclocked, not in DP mode) during the CUDA tests. I should do some more testing here...


- don't mind the lines that say "Result : Strongly similar, Q= ..." - this info was intended for SETI@home and may/will be wrong for Asteroids@home


To avoid those, it seems sufficient to delete the line "if exist .\testDatas\ref\ref*!wunbr%%w!.res call .\tools\mb_validate.cmd" from the MBbench210_Asteroids.cmd file (I commented it out instead). In that case, I suppose, you could also delete mb_validate.cmd, rescmpv5.exe and the rescmp subdirectory from the .\Tools directory.
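
If you prefer to script the change, commenting out that line could look like this. It is a sketch only: `comment_out` is a made-up helper, it assumes the validation call appears on lines containing "call .\tools\mb_validate.cmd", and it should be tried on a copy of the script first.

```python
NEEDLE = r"call .\tools\mb_validate.cmd"


def comment_out(script_text, needle=NEEDLE):
    """Prefix every line containing needle with 'rem ' (batch file comment)."""
    out = []
    for line in script_text.splitlines(keepends=True):
        already_commented = line.lstrip().lower().startswith("rem ")
        if needle in line and not already_commented:
            line = "rem " + line
        out.append(line)
    return "".join(out)


sample = (
    "echo start\r\n"
    r"if exist .\testDatas\ref\ref*!wunbr%%w!.res call .\tools\mb_validate.cmd"
    "\r\n"
)
print(comment_out(sample))
```

Line endings are kept as they are, and running the helper twice does not add a second "rem ".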

The result file from the second run is also included below.
19) (Message 2331)
Posted 31 Dec 2013 by (retired account)
Post:
EDIT: Not quite. 46 min. @ 980 MHz GPU and ~ 74°C now.


In comparison a first workunit in DP mode on the Titan: 52 min. @ 823 MHz GPU and ~ 71°C. Keeping in mind that the shader clock is only 84%, this does indicate some speedup in double prec. mode on a clock per clock basis. No significant changes in GPU and memory load.
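
The clock per clock comparison is just arithmetic on the two runs. A small sketch of the calculation, assuming the runtime scales with the GPU clock and that the two workunits are comparable (which is not guaranteed):

```python
def gpu_cycles(minutes, clock_mhz):
    """Approximate GPU clock cycles spent on one workunit, in millions."""
    return minutes * 60 * clock_mhz


normal = gpu_cycles(46, 980)   # default mode: 46 min. @ 980 MHz
dp = gpu_cycles(52, 823)       # DP mode: 52 min. @ 823 MHz

print(f"clock ratio: {823 / 980:.0%}")                            # 84%
print(f"cycles in DP mode relative to default: {dp / normal:.1%}")  # 94.9%
```

So the DP run needed roughly 5% fewer clock cycles, even though it took longer on the wall clock.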
20) (Message 2330)
Posted 31 Dec 2013 by (retired account)
Post:
Will also give it a try on my GT 650M now that the last GPUGrid short run is finished, but I guess it will be slower than the CPU there...


Yes, indeed: GT 650M ~18700 s (@ 950 MHz, DDR3 @ 900 MHz)

