Support AVX

Sebastian M. Bobrecki
Joined: 20 Feb 05
Posts: 63
Credit: 1529602660
RAC: 105

Maybe now, after almost two years, it's time to think again about supporting AVX and FMA (FMA3/FMA4)?

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 578346872
RAC: 197536

With AVX, Haswell can send a 256-bit vector into the pipeline each clock tick, whereas Ivy Bridge still needed two 128-bit vectors over two clocks. Intel slides say they want to widen this to 512 bits in two years or so. Sounds like something worth using, if it's not too much hassle.

And, with all due respect, for the next few years AVX support will likely gain you more throughput than Einstein@Android.

MrS

Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 578346872
RAC: 197536

Any updates after one more year?

MrS

Scanning for our furry friends since Jan 2002

MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 139002861
RAC: 0

Later BOINC clients (i.e. 7.3 and 7.4) also report the AVX feature if the CPU supports it.
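
For illustration only, here is a minimal sketch of how such a reported feature list could be checked. It assumes the features arrive as a single space-separated string (for example "fpu sse sse2 avx"); the helper name has_cpu_feature is made up for this example and is not part of BOINC.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole
 * space-separated token in `features`. Whole-token matching matters
 * because a plain strstr() for "avx" would also match "avx2". */
static bool has_cpu_feature(const char *features, const char *name)
{
    assert(features != NULL && name != NULL);
    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = features;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == features) || (p[-1] == ' ');
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return true;
        p++;    /* keep scanning after a partial match */
    }
    return false;
}
```

With this, a call such as has_cpu_feature(reported, "avx") stays false for a host that only lists "avx2" under a different token, which is the case a naive substring search gets wrong.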

ahj
Joined: 25 Jul 10
Posts: 17
Credit: 4331992
RAC: 0

Paging Bernd for any updates

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 728279580
RAC: 1177325

Fair question.

Currently the foundations are being laid so that the next-generation GW app will use AVX (actually, the best SIMD architecture available on a given host). I think we might see this on E@H later this year.
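
As a rough idea of what "the best SIMD architecture available on a given host" can look like in code, here is a minimal dispatch sketch. It is not the E@H implementation: the feature bits, the kernel table, and select_kernel are all hypothetical, and the same scalar routine stands in for the separately compiled generic, SSE2 and AVX builds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits; a real app would derive them from CPUID
 * and would also check that the OS saves the AVX register state. */
enum cpu_feature { FEAT_SSE2 = 1 << 0, FEAT_AVX = 1 << 1 };

enum simd_level { LEVEL_GENERIC, LEVEL_SSE2, LEVEL_AVX };

typedef float (*sum_fn)(const float *x, size_t n);

/* Placeholder kernel: stands in for the generic, SSE2 and AVX builds
 * of the same routine, which would normally live in separate files. */
static float sum_scalar(const float *x, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

struct kernel {
    enum simd_level level;
    unsigned required;      /* feature bits this build needs */
    sum_fn run;
};

/* Ordered from most to least capable; the last entry needs nothing. */
static const struct kernel kernels[] = {
    { LEVEL_AVX,     FEAT_SSE2 | FEAT_AVX, sum_scalar },
    { LEVEL_SSE2,    FEAT_SSE2,            sum_scalar },
    { LEVEL_GENERIC, 0,                    sum_scalar },
};

/* Returns the first (most capable) kernel whose requirements are all
 * present in the host's feature mask. */
static const struct kernel *select_kernel(unsigned host_features)
{
    size_t count = sizeof kernels / sizeof kernels[0];
    for (size_t i = 0; i < count; i++) {
        if ((host_features & kernels[i].required) == kernels[i].required)
            return &kernels[i];
    }
    assert(!"unreachable: the generic entry requires no features");
    return &kernels[count - 1];
}
```

The selection would run once at startup, after which the hot loop calls through the chosen function pointer, so the check costs nothing per iteration.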

The BRP app is mainly intended for GPU now and we won't touch the CPU code, I guess.

The FGRP (gamma ray pulsar search in FERMI/LAT data) app could benefit from an AVX enabled FFT.

Cheers
HB

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 578346872
RAC: 197536

Quote:
the best SIMD architecture available on a given host


That would be the ideal solution :)

I'm curious: what do you have to do to realize this?

MrS

Scanning for our furry friends since Jan 2002

Stranger7777
Joined: 17 Mar 05
Posts: 436
Credit: 429865636
RAC: 78586

The time has come. The next GW run will surely support AVX, won't it? ;-)

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 578346872
RAC: 197536

The new search on the advanced-generation LIGO detector data has an AVX app, although currently only for Linux.

MrS

Scanning for our furry friends since Jan 2002

Jesse Viviano
Joined: 8 Jun 05
Posts: 33
Credit: 133045917
RAC: 0

If you implement AVX, make sure you have a way to deny 256-bit wide AVX to AMD Bulldozer and Piledriver processors and instead serve either SSE3 or 128-bit wide AVX plus FMA4 to them, unless you can show that the 256-bit AVX code meets a special case. See http://www.agner.org/optimize/ for why this should be done in most cases.

The only advantage I can see to sending 256-bit AVX to those processors is when the entire working set fits in the 256-bit registers but not in the 128-bit registers. If it fits in neither, 128-bit AVX and SSE3 are faster than 256-bit AVX because the 256-bit registers perform horrendously when they have to be written out to memory, especially on Piledriver. If it fits in both, 128-bit AVX or SSE is better because a 256-bit instruction takes two of the four shared decoders to decode, while a 128-bit instruction uses just one. Bulldozer's set of four shared instruction decoders also struggles with 256-bit AVX instructions that must each be split into two 128-bit instructions, because the set can only split one such instruction per clock cycle, so a second 256-bit instruction could stall it.

Steamroller fixes these problems, so you should serve 256-bit wide AVX with optional FMA4 to this processor with no problem. I would expect the same for Excavator.
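
The rule above could be sketched roughly like this. Everything in it is hypothetical: the struct, the path names, and in particular the assumption that Bulldozer and Piledriver are AMD family 15h parts with model numbers below 30h while Steamroller starts at 30h, which should be checked against AMD's documentation before anyone relies on it. It also ignores the working-set special case and treats SSE3 as the baseline.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum code_path { PATH_SSE3, PATH_AVX128_FMA4, PATH_AVX256 };

/* Hypothetical host description; a real app would fill it from CPUID. */
struct host_cpu {
    const char *vendor;   /* CPUID vendor string, e.g. "AuthenticAMD" */
    unsigned family;      /* display family (base + extended) */
    unsigned model;       /* display model */
    bool has_avx;
    bool has_fma4;
};

/* Assumption: Bulldozer and Piledriver are AMD family 15h with model
 * numbers below 30h; Steamroller and later start at 30h. */
static bool is_bulldozer_or_piledriver(const struct host_cpu *cpu)
{
    return strcmp(cpu->vendor, "AuthenticAMD") == 0
        && cpu->family == 0x15 && cpu->model < 0x30;
}

/* Picks the code path to serve: no 256-bit AVX for the affected AMD
 * parts, 256-bit AVX for every other AVX-capable host. */
static enum code_path choose_code_path(const struct host_cpu *cpu)
{
    assert(cpu != NULL && cpu->vendor != NULL);
    if (!cpu->has_avx)
        return PATH_SSE3;
    if (is_bulldozer_or_piledriver(cpu))
        return cpu->has_fma4 ? PATH_AVX128_FMA4 : PATH_SSE3;
    return PATH_AVX256;
}
```

Keeping the family/model test in its own function means the exclusion list can be corrected in one place if the model ranges turn out to differ.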
