AVX's 3 times as fast as e.g. SSE2/SSE3 ??


Message boards : Number crunching : AVX's 3 times as fast as e.g. SSE2/SSE3 ??

DanHansen@Denmark
Joined: 16 Nov 12
Posts: 24
Credit: 23,025,000
RAC: 0
Message 3601 - Posted: 12 Sep 2014, 16:36:52 UTC

Last modified: 12 Sep 2014, 17:20:34 UTC
Hi,

CPU: Intel i5 (4670K is Haswell; 3570K and 3470 are Ivy Bridge)
Kernel: Linux 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Numbers retrieved from BoincStats

Does this sound right to you guys? Do AVX tasks run 3 times faster than SSE2/SSE3 tasks? I'm comparing 3 machines, all of them using i5 CPUs (4670K/3570K/3470). One system is now running AVX tasks and gets a little hot; the other systems are running SSE2/SSE3 tasks, and their CPU temperature is only 65 degrees Celsius.
Never mind the heat; it's no more than 79-81 degrees Celsius on the "hot one". I only noticed this because the motherboard on this system controls the fan speed and keeps increasing and decreasing it.
So, do AVX tasks run 3 times as fast (finish the job in a third of the time) as SSE2/SSE3 tasks? Here are the numbers:

System1: Estimated Time: 1:19:08 <---- AVX
System2: Estimated Time: 2:15:50 <---- SSE2
System3: Estimated Time: 1:42:45 <---- SSE3

Computer:	beaufort
Project	Asteroids@home
	
Name	ps_140901_211285_4_0
	
Application	Period Search Application 102.10 (avx)
Workunit name	ps_140901_211285_4
State	Running High P.
Received	02-09-2014 19:26:01
Report deadline	13-09-2014 07:25:52
Estimated app speed     796,82 GFLOPs/sec  <----- 796,82 GFLOPs/sec
Estimated task size	1.380.023 GFLOPs  <------ SAME JOBSIZE !?
CPU time at last checkpoint	01:09:45
CPU time	01:10:26
Elapsed time	01:10:57
Estimated time remaining	00:08:11
Fraction done	86,672%
Virtual memory size	21,62 MB
Working set size	8,68 MB
Directory	slots/9
Process ID	1525
	

Computer:	wellington
Project	Asteroids@home
	
Name	ps_140901_251160_4_1
	
Application	Period Search Application 102.10 (sse2)
Workunit name	ps_140901_251160_4
State	Waiting to run
Received	08-09-2014 00:35:12
Report deadline	18-09-2014 12:35:10
Estimated app speed	258,84 GFLOPs/sec  <----- 258,84 GFLOPs/sec
Estimated task size	1.380.023 GFLOPs  <------ SAME JOBSIZE !?
CPU time at last checkpoint	01:00:05
CPU time	01:00:51
Elapsed time	01:01:00
Estimated time remaining	01:14:50
Fraction done	47,044%
Virtual memory size	21,55 MB
Working set size	5,90 MB
Directory	slots/3
Process ID	15245
	

Computer:	halifax
Project	Asteroids@home
	
Name	ps_140901_211683_7_1
	
Application	Period Search Application 102.10 (sse3)
Workunit name	ps_140901_211683_7
State	Waiting to run
Received	02-09-2014 20:24:04
Report deadline	13-09-2014 08:23:51
Estimated app speed	235,40 GFLOPs/sec  <----- 235,40 GFLOPs/sec
Estimated task size	1.380.023 GFLOPs  <------ SAME JOBSIZE !?
CPU time at last checkpoint	01:00:03
CPU time	01:00:03
Elapsed time	01:00:14
Estimated time remaining	00:42:31
Fraction done	63,982%
Virtual memory size	21,55 MB
Working set size	6,02 MB
Directory	slots/3
Process ID	0
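
As a rough check on where the "3 times" figure comes from, the sketch below divides the "Estimated app speed" values listed above. The dictionary keys and the function name are made up for illustration, and the numbers are BOINC's own estimates for three different machines and work units, not a controlled benchmark.

```python
# "Estimated app speed" values (GFLOPS) copied from the task properties above.
# These are BOINC client estimates, not measured throughput.
app_speed = {"avx": 796.82, "sse2": 258.84, "sse3": 235.40}


def speed_ratio(fast: str, slow: str) -> float:
    """Return how many times faster `fast` is than `slow` per the estimates."""
    return app_speed[fast] / app_speed[slow]


for other in ("sse2", "sse3"):
    print(f"avx vs {other}: {speed_ratio('avx', other):.2f}x")
```

Both ratios come out a little above 3, which matches the impression of "3 times as fast", but only as far as the estimated speeds themselves can be trusted.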



It's just that the estimated times of jobs on the same machine look different too. These are 4 jobs running on the same machine, and all 4 of them are AVX. But 2 jobs will be done in 28-30 min. and the other 2 in 67-71 min. Here's one of each. Why the big difference if the jobs are the same size? All 4 AVX tasks are "State: Running High P.":

System1: Estimated Time: 1:15:56 <---- AVX
System1: Estimated Time: 0:29:59 <---- AVX

Computer:	beaufort
Project	Asteroids@home
	
Name	ps_140901_211286_5_1
	
Application	Period Search Application 102.10 (avx)
Workunit name	ps_140901_211286_5
State	Running High P.
Received	02-09-2014 19:26:01
Report deadline	13-09-2014 07:25:56
Estimated app speed	796,82 GFLOPs/sec
Estimated task size	1.380.023 GFLOPs
CPU time at last checkpoint	00:11:17
CPU time	00:12:16
Elapsed time	00:12:22  <----------------- TIME ELAPSED
Estimated time remaining	00:17:37  <----- TIME REMAINING
Fraction done	13,197%
Virtual memory size	21,62 MB
Working set size	8,57 MB
Directory	slots/10
Process ID	1722
	

Computer:	beaufort
Project	Asteroids@home
	
Name	ps_140901_211285_3_0
	
Application	Period Search Application 102.10 (avx)
Workunit name	ps_140901_211285_3
State	Running High P.
Received	02-09-2014 19:26:01
Report deadline	13-09-2014 07:25:54
Estimated app speed	796,82 GFLOPs/sec
Estimated task size	1.380.023 GFLOPs
CPU time at last checkpoint	01:04:37
CPU time	01:05:26
Elapsed time	01:05:56  <----------------- TIME ELAPSED
Estimated time remaining	00:10:00  <----- TIME REMAINING
Fraction done	81,337%
Virtual memory size	21,62 MB
Working set size	8,66 MB
Directory	slots/8
Process ID	1691
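
One way to see how far the estimates drift is to compare BOINC's total (elapsed plus estimated remaining) with a straight-line extrapolation from "Fraction done". This is only a sketch: the helper names are invented here, and it assumes progress is roughly linear in elapsed time, which may not hold for this application.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def to_hms(seconds: float) -> str:
    """Convert seconds to 'H:MM:SS', rounded to the nearest second."""
    total = round(seconds)
    return f"{total // 3600}:{total % 3600 // 60:02d}:{total % 60:02d}"


def boinc_total(elapsed: str, remaining: str) -> str:
    """Total run time implied by BOINC: elapsed + estimated remaining."""
    return to_hms(to_seconds(elapsed) + to_seconds(remaining))


def extrapolated_total(elapsed: str, fraction_done: float) -> str:
    """Total run time if progress were linear: elapsed / fraction done."""
    return to_hms(to_seconds(elapsed) / fraction_done)


# (elapsed, estimated remaining, fraction done) for the two tasks listed above
tasks = {
    "ps_140901_211286_5_1": ("00:12:22", "00:17:37", 0.13197),
    "ps_140901_211285_3_0": ("01:05:56", "00:10:00", 0.81337),
}

for name, (elapsed, remaining, fraction) in tasks.items():
    print(name, boinc_total(elapsed, remaining), extrapolated_total(elapsed, fraction))
```

For the task at 13% the two totals differ by about an hour, while for the task at 81% they are within a few minutes of each other, so the short 0:29:59 estimate looks like an artifact of the task having only just started.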



I'm no guru and not sure about these things: job sizes, job types, how to read the data, etc. I don't know if all AVX tasks are "State: Running High P." and so on, so if my questions are a little "out there", I'm very sorry :o

Regarding the estimated times for the jobs, there must be an error in BoincStats or somewhere in the system, because those numbers are extremely weird. It looks like the estimates change for some jobs and not for others. Anyway, it's the speed of AVX vs. SSE2/SSE3 that is important here ;)

How much time is saved by running AVX? I've been reading this post http://asteroidsathome.net/boinc/forum_thread.php?id=200#1826 and there's a guy there who says only 3-4 minutes!?

.
Project Headless CLI Linux Multiple GPU Boinc Servers
Ubuntu Server 14.04.1 64bit
Kernel 3.13.0-32-generic
CPU's i5-4690K
GPU's GT640/GTX750TI
Nvidia v.340.29
BOINC v.7.2.42

cyrusNGC_224@P3D

Joined: 1 Apr 13
Posts: 37
Credit: 153,496,537
RAC: 0
Message 3604 - Posted: 12 Sep 2014, 20:08:36 UTC
Asteroids no longer has constant-size WUs (work units).
BilBg
Joined: 19 Jun 12
Posts: 221
Credit: 623,640
RAC: 0
Message 3605 - Posted: 13 Sep 2014, 2:48:57 UTC - in response to Message 3601.  
Well, I have made a benchmark package, but it is for Windows:
http://asteroidsathome.net/boinc/forum_thread.php?id=306
http://asteroidsathome.net/boinc/forum_thread.php?id=306&postid=3094#3094

But you can make the same edits to the script file.
I used MBbench 2.10 as a base (you can compare against it to see my changes); you could use KWSN Linux MB Bench v2.01.08 as a base.

('Estimated Time' doesn't mean anything)



- ALF - "Find out what you don't do well ..... then don't do it!" :)
Chris Skull

Joined: 30 Dec 12
Posts: 6
Credit: 55,514,280
RAC: 0
Message 4769 - Posted: 19 Jan 2016, 6:53:36 UTC - in response to Message 3605.  
On my machines I notice that SSE2 units run faster than SSE3 and AVX... It would be nice if users could select the apps they want in their preferences...
Only 3 examples:

i5-3470 Win7:
SSE2: 293.98 GFLOPS
SSE3: 231.04 GFLOPS

A10-7870K Linux:
SSE2: 164.79 GFLOPS
AVX: 156.70 GFLOPS
SSE3: 150.51 GFLOPS

AMD Phenom(tm) II X4 965 Win7:
SSE2: 128.30 GFLOPS
SSE3: 118.18 GFLOPS
Greetz
Chris
