New optimized versions for SSE3 released


Jan Vaclavik
Joined: 26 Jan 13
Posts: 31
Credit: 1,501,198
RAC: 270
Message 1522 - Posted: 19 Aug 2013, 10:27:46 UTC - in response to Message 1499.  

Last modified: 19 Aug 2013, 10:28:36 UTC
Next will be SSE2 and AVX. I also have to think over the credit; I think we could send longer WUs, which would be better. But first we will try to finish the applications so that we know how much time they take.

Does the AVX version use SSSE3, SSE4.1 and SSE4.2 as well?
Not that I have a CPU with AVX instructions, just wondering.

HA-SOFT, s.r.o.
Project developer
Project tester
Joined: 21 Dec 12
Posts: 176
Credit: 134,882,488
RAC: 2,414
Message 1523 - Posted: 19 Aug 2013, 11:01:34 UTC - in response to Message 1522.  

Last modified: 19 Aug 2013, 11:02:32 UTC
No, mixing SSEx and AVX instructions is very inefficient, especially when an SSE2 instruction follows an AVX instruction.

We will not release an AVX version, simply because it's slower than SSE3. The final approach will be:

1. Standard app
2. Pure SSE2 app
3. SSE3 app (the fastest one)
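
For illustration, here is a minimal Python sketch of how a host could be matched to one of these three builds from its CPU feature flags. It assumes Linux-style flag names as found in /proc/cpuinfo (where SSE3 appears as `pni`); `parse_cpu_flags`, `pick_app_tier` and the tier labels are hypothetical and not taken from the project's code.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def pick_app_tier(flags):
    """Pick the fastest of the three builds that the CPU can run."""
    # Linux reports SSE3 as "pni" (Prescott New Instructions).
    if "pni" in flags or "sse3" in flags:
        return "sse3"
    if "sse2" in flags:
        return "sse2"
    return "standard"


# Made-up sample input, shaped like a /proc/cpuinfo excerpt.
sample = "model name : Example CPU\nflags : fpu sse sse2 pni ssse3\n"
print(pick_app_tier(parse_cpu_flags(sample)))
```

On a real Linux host the text would come from reading /proc/cpuinfo; other operating systems need a different detection method.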

Kyong is testing SSE2 now. I'm working on the standard app now (some backports from the SSE3 version) as a preparation step for nVidia CUDA development.

chip
Joined: 1 Jun 13
Posts: 7
Credit: 0
RAC: 0
Message 1524 - Posted: 19 Aug 2013, 11:12:29 UTC - in response to Message 1523.  

Last modified: 19 Aug 2013, 11:17:57 UTC
nVidia CUDA development

WOOOW!

SSE3 app (the fastest one)

On an i7-2600K @ 4500 MHz (HT off) the calculation time is 1200 sec, but I see a host where the time is ~600 sec! So is there a faster version?

HA-SOFT, s.r.o.
Project developer
Project tester
Joined: 21 Dec 12
Posts: 176
Credit: 134,882,488
RAC: 2,414
Message 1525 - Posted: 19 Aug 2013, 11:21:20 UTC - in response to Message 1524.  
On an i7-2600K @ 4500 MHz (HT off) the calculation time is 1200 sec, but I see a host where the time is ~600 sec! So is there a faster version?


This is an Intel Haswell. Maybe it has better support for AVX.

HA-SOFT, s.r.o.
Project developer
Project tester
Joined: 21 Dec 12
Posts: 176
Credit: 134,882,488
RAC: 2,414
Message 1527 - Posted: 19 Aug 2013, 12:50:08 UTC - in response to Message 1525.  
I have inspected this and it looks like an AVX2 app for Intel Haswell.

AVX2 brings integer instructions to the 256-bit AVX world, which are missing in AVX. We use integer instructions in the app, so our AVX app must use SSE2 instructions for integers.

We use Visual Studio 2010 for Windows builds, and it has no AVX2 support. Visual Studio 2012 does not support Windows Vista and older OSes.

I have ordered an i5-4670 for our company and I will test AVX2. If the tests are successful, we will create a download section and let users download a special app with app_info.xml included.
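
A special app distributed this way relies on BOINC's anonymous platform mechanism, where an app_info.xml file placed in the project directory tells the client which executable to run. The Python sketch below generates a minimal file of that kind. The app name, executable name and version number are placeholders chosen for illustration, not the project's real values.

```python
import xml.etree.ElementTree as ET


def build_app_info(app_name, exe_name, version_num):
    """Build a minimal anonymous-platform app_info.xml as a string."""
    root = ET.Element("app_info")

    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    # Describe the executable file shipped next to app_info.xml.
    file_info = ET.SubElement(root, "file_info")
    ET.SubElement(file_info, "name").text = exe_name
    ET.SubElement(file_info, "executable")

    # Tie the app to that file and mark it as the main program.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "version_num").text = str(version_num)
    file_ref = ET.SubElement(version, "file_ref")
    ET.SubElement(file_ref, "file_name").text = exe_name
    ET.SubElement(file_ref, "main_program")

    return ET.tostring(root, encoding="unicode")


# Hypothetical names and version number, for illustration only.
xml_text = build_app_info("example_app", "example_app_avx2.exe", 100)
print(xml_text)
```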


VictordeHollander
Joined: 15 Feb 13
Posts: 5
Credit: 2,128,794
RAC: 4
Message 1532 - Posted: 19 Aug 2013, 20:18:14 UTC - in response to Message 1513.  
Do you have some antivirus installed?


Yes, I do. AVG Antivirus 2013

HA-SOFT, s.r.o.
Project developer
Project tester
Joined: 21 Dec 12
Posts: 176
Credit: 134,882,488
RAC: 2,414
Message 1537 - Posted: 19 Aug 2013, 23:02:09 UTC - in response to Message 1532.  
Do you have some antivirus installed?


Yes, I do. AVG Antivirus 2013


This should not be a problem. The task was completed by another cruncher.

Jan Vaclavik
Joined: 26 Jan 13
Posts: 31
Credit: 1,501,198
RAC: 270
Message 1546 - Posted: 20 Aug 2013, 12:52:06 UTC - in response to Message 1527.  

Last modified: 20 Aug 2013, 12:52:29 UTC
I have inspected this and it looks like an AVX2 app for Intel Haswell.

AVX2 brings integer instructions to the 256-bit AVX world, which are missing in AVX. We use integer instructions in the app, so our AVX app must use SSE2 instructions for integers.

We use Visual Studio 2010 for Windows builds, and it has no AVX2 support. Visual Studio 2012 does not support Windows Vista and older OSes.

I have ordered an i5-4670 for our company and I will test AVX2. If the tests are successful, we will create a download section and let users download a special app with app_info.xml included.


Doesn't BOINC allow you to release an application that is available only to specific Windows versions?

HA-SOFT, s.r.o.
Project developer
Project tester
Joined: 21 Dec 12
Posts: 176
Credit: 134,882,488
RAC: 2,414
Message 1547 - Posted: 20 Aug 2013, 13:06:33 UTC - in response to Message 1546.  
Doesn't BOINC allow you to release an application that is available only to specific Windows versions?


I'm not sure; it's Kyong's magic on the server side. But the BOINC scheduler is not an easy thing to work with.

cyrusNGC_224@P3D
Joined: 1 Apr 13
Posts: 37
Credit: 153,496,537
RAC: 0
Message 1548 - Posted: 20 Aug 2013, 20:03:27 UTC - in response to Message 1523.  
Kyong is testing SSE2 now. I'm working on the standard app now (some backports from the SSE3 version) as a preparation step for nVidia CUDA development.
Only CUDA, no OpenCL?

Rick A. Sponholz
Joined: 1 Oct 12
Posts: 17
Credit: 24,549,679
RAC: 0
Message 1572 - Posted: 24 Aug 2013, 15:42:02 UTC
How do I get a copy of the optimized version for SSE3? How do I get a copy of the CUDA version? I'd love to accomplish more with my machines. Thanks in advance, Rick

HA-SOFT, s.r.o.
Project developer
Project tester
Joined: 21 Dec 12
Posts: 176
Credit: 134,882,488
RAC: 2,414
Message 1579 - Posted: 25 Aug 2013, 20:22:26 UTC - in response to Message 1572.  
How do I get a copy of the optimized version for SSE3? How do I get a copy of the CUDA version? I'd love to accomplish more with my machines. Thanks in advance, Rick


The SSE3 application is distributed automatically. As I see from your hosts, you already have the SSE3 app. You have an Intel Haswell, so you will profit from the AVX version of the app, which will be released soon.

HA-SOFT, s.r.o.
Project developer
Project tester
Joined: 21 Dec 12
Posts: 176
Credit: 134,882,488
RAC: 2,414
Message 1580 - Posted: 25 Aug 2013, 20:26:37 UTC - in response to Message 1548.  

Last modified: 25 Aug 2013, 20:27:38 UTC
Only CUDA, no OpenCL?


I will do the CUDA version for nVidia cards, so I can talk about the CUDA version only.

Kyong has some long-term plans for OpenCL, but I can't say more.

Rick A. Sponholz
Joined: 1 Oct 12
Posts: 17
Credit: 24,549,679
RAC: 0
Message 1581 - Posted: 25 Aug 2013, 22:17:12 UTC - in response to Message 1579.  
Thanks for the reply, HA-SOFT. I thought I had to do something manually, like I do for SETI. Can't wait for the CUDA version, as I have lots of capability I'm willing to contribute. Rick

Sunny129
Joined: 20 Jul 13
Posts: 15
Credit: 5,985,840
RAC: 0
Message 1582 - Posted: 26 Aug 2013, 1:33:02 UTC - in response to Message 1581.  
Thanks for the reply, HA-SOFT. I thought I had to do something manually, like I do for SETI. Can't wait for the CUDA version, as I have lots of capability I'm willing to contribute. Rick

oh my god you're not kidding...i just saw your arsenal, and your GPU power is insane! of the 5 multi-GPU machines you've got on A@H right now, might i ask what motherboards you're using in each of them?

thanks,
Eric

Rick A. Sponholz
Joined: 1 Oct 12
Posts: 17
Credit: 24,549,679
RAC: 0
Message 1621 - Posted: 30 Aug 2013, 15:23:27 UTC - in response to Message 1582.  
Don't know much about my motherboards. I know they're all Intel boards: 4 machines with two video card slots, and one machine with only one video card slot plus an adapter I installed to get the second video card to work. Sorry, but that's all I know. Rick

Jan Vaclavik
Joined: 26 Jan 13
Posts: 31
Credit: 1,501,198
RAC: 270
Message 1622 - Posted: 30 Aug 2013, 16:14:45 UTC
Is there some huge performance difference on certain CPUs between the SSE2 and SSE3 versions?
Both run pretty much the same on my old Core 2 Duo, which makes the SSE3 version seem somewhat pointless.

HA-SOFT, s.r.o.
Project developer
Project tester
Joined: 21 Dec 12
Posts: 176
Credit: 134,882,488
RAC: 2,414
Message 1639 - Posted: 2 Sep 2013, 8:55:52 UTC - in response to Message 1622.  
Is there some huge performance difference on certain CPUs between the SSE2 and SSE3 versions?
Both run pretty much the same on my old Core 2 Duo, which makes the SSE3 version seem somewhat pointless.


See:

http://asteroidsathome.net/boinc/forum_thread.php?id=171&postid=1551#1551

Ananas
Joined: 18 Mar 13
Posts: 32
Credit: 2,506,320
RAC: 0
Message 1701 - Posted: 7 Sep 2013, 16:04:12 UTC - in response to Message 1547.  

Last modified: 7 Sep 2013, 16:08:29 UTC
Doesn't BOINC allow you to release an application that is available only to specific Windows versions?


I'm not sure; it's Kyong's magic on the server side. But the BOINC scheduler is not an easy thing to work with.

If you want to support a specific CPU feature that BOINC doesn't report, you could use a method similar to the RNA World one:

They deliver only one ZIP (for each basic OS type) that has everything: common files, SSE4 version, SSE3 version ... basic version.

Plus they use an unzip "wrapper" that checks the CPU features, unzips the common files, then checks the ZIP file for the best version that fits the CPU. If that version is unavailable, it checks the next one, and so on.

Advantage: it works even with ancient BOINC versions that are much older than the CPU feature itself, without the need for an optimized app_info.xml.
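
The fallback described above can be sketched in Python as follows. The directory layout inside the ZIP (`common/`, `sse4/`, `sse3/`, `basic/`), the flag names and both function names are assumptions made for illustration; the actual RNA World wrapper may work differently.

```python
import io
import zipfile

# Preference order, fastest first: (directory in ZIP, required CPU flag).
# The directory names and flag names are hypothetical.
VARIANTS = [("sse4/", "sse4_1"), ("sse3/", "pni"), ("basic/", None)]


def choose_variant(zip_names, cpu_flags):
    """Return the best variant directory that is both supported and present."""
    for prefix, flag in VARIANTS:
        supported = flag is None or flag in cpu_flags
        present = any(name.startswith(prefix) for name in zip_names)
        if supported and present:
            return prefix
    raise RuntimeError("no usable variant in archive")


def files_to_extract(archive, cpu_flags):
    """List the common files plus the files of the chosen variant."""
    names = archive.namelist()
    chosen = choose_variant(names, cpu_flags)
    return [n for n in names if n.startswith(("common/", chosen))]


# Build a small in-memory archive to demonstrate the fallback.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name in ("common/data.txt", "sse3/app", "basic/app"):
        zf.writestr(name, "placeholder")

with zipfile.ZipFile(buf) as zf:
    # The CPU supports SSE4.1, but the archive has no sse4/ directory.
    print(files_to_extract(zf, {"sse2", "pni", "sse4_1"}))
    # The CPU supports only SSE2, so the basic build is chosen.
    print(files_to_extract(zf, {"sse2"}))
```

A real wrapper would then extract the returned members, for example with `ZipFile.extract`, and launch the chosen binary.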