Optimized Boinc - SSE, SSE2...

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

Message 79590 in response to message 79589

Quote:
Quote:
Quote:

If you look at Crunch3r's optimised Seti apps pages you will see the SSE etc. versions available for Seti. Others that you might think should be there, like AMD SSE3, are not released because the AMD version of SSE3 is not the same as Intel's and causes crashes or errors.

In truth as seen by Seti, SSE3 optimised apps don't work any faster than SSE2.

Seti itself doesn't offer different versions because of the extra workload involved and the problems of testing each optimised application on all platforms. And the problem will only get worse when Intel produce SSEn, because there will be different generations of Macs as well.

Another "full disclosure" statement is that the reason that there are AMD vs. Intel SSE and SSE2 applications is not because of the implementation of SSEx, but because of the Intel Compiler generating code targeted for Intel-specific architecture vs. "generic" for AMD. I had always translated that to be more buffering tricks and other architecture-specific optimizations, not differences in SSEx. If that is not the case, someone please speak up and clarify... :-)


From the Wikipedia SSE3 article:
Quote:
In April 2005, AMD introduced a subset of SSE3 in revision E (Venice and San Diego) of their Athlon 64 CPUs.

What difference this makes I have no idea, and whether they have included full SSE3 in subsequent CPUs I have no idea either.

Keep in mind that what you quoted and what I'm quoting now is Wiki, which can be edited by anyone, but according to this entry about x86, the only instructions left out were instructions specific to HyperThreading. I suppose one could argue that this means that they're not equivalent, and that would be true, but the difference is minor. Even if they had included the HT-related instructions, one could still claim that AMD didn't "follow the SSE3 standard" because the functions did not work (since no HT, they never would work)...

ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86563414
RAC: 306

Message 79591 in response to message 79590

Quote:
... the only instructions left out were instructions specific to [Intel] HyperThreading. I suppose one could argue that this means that they're not equivalent, and that would be true, but the difference is minor. Even if they had included the HT-related instructions, one could still claim that AMD didn't "follow the SSE3 standard" because the functions did not work (since no HT, they never would work)...


It's a very old anti-competitive game to 'extend' or deliberately break 'standards'.

I still consider the Intel use of the CPUID checks to be incredibly brazen and very costly for those stung by the consequences. I guess we'll find out in a few years time as the court cases rumble along after the event...

Out of interest, what would an AMD SSE2 CPU do with one of the HyperThreading instructions? NOP or cause a trap event?

Happy crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

Message 79592 in response to message 79591

Quote:
Out of interest, what would an AMD SSE2 CPU do with one of the HyperThreading instructions? NOP or cause a trap event?


Which instruction on which AMD CPU?

Any "unknown" instruction generates an "illegal instruction" exception.

ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86563414
RAC: 306

Message 79593 in response to message 79592

Quote:
Quote:
Out of interest, what would an AMD SSE2 CPU do with one of the HyperThreading instructions? NOP or cause a trap event?

Which instruction on which AMD CPU?

Any "unknown" instruction generates an "illegal instruction" exception.


For any code compiled for an Intel SSE2 CPU target that is instead run on an AMD SSE2 capable CPU (and with any "AuthenticAMD" Intel check removed).

Are the SSE2 HT-specific codes effectively ignored or does everything come to a halt with the "illegal instruction" exception?

Regards,
Martin

ErichZann
Joined: 11 Feb 05
Posts: 120
Credit: 81582
RAC: 0

Message 79594 in response to message 79572

Quote:

3) Put different versions of the same science algorithm into a single program, which is distributed automatically to all users. The first thing the program will do is to detect the CPU's capabilities and then select the appropriate code-path.
There are some drawbacks as well, for example you have to generate a new app version which will be updated automatically on ALL clients even if only the code for one of the code-paths was changed.

Well... That option doesn't sound too bad to me. I really don't see the problem with an (automatic) download of a new version from time to time (the science app is only about 3 MB...).
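
To make option 3 concrete, here is a minimal sketch of run-time code-path selection, written in Python only for readability. Everything in it is an assumption for illustration: the feature names, the `kernel_*` functions and `select_code_path` are hypothetical and not taken from any project application. The three kernels are stand-ins that compute the same thing, since every path has to return identical science results.

```python
# Hypothetical sketch: one program, several code paths, chosen once at start-up.
# Names and kernels are illustrative only, not from any real science app.

def kernel_generic(samples):
    """Portable fallback path that runs on any CPU."""
    return sum(x * x for x in samples)

def kernel_sse2(samples):
    """Stand-in for an SSE2-tuned path; must return the same result."""
    return sum(x * x for x in samples)

def kernel_sse3(samples):
    """Stand-in for an SSE3-tuned path; must return the same result."""
    return sum(x * x for x in samples)

# Ordered from most to least specialised; the first match wins.
CODE_PATHS = [
    ("sse3", kernel_sse3),
    ("sse2", kernel_sse2),
]

def select_code_path(cpu_features):
    """Pick the best kernel for a set of detected feature names."""
    for feature, kernel in CODE_PATHS:
        if feature in cpu_features:
            return kernel
    return kernel_generic

# Usage: the feature set would come from CPU detection at start-up.
kernel = select_code_path({"sse", "sse2"})
result = kernel([1, 2, 3])
```

The drawback mentioned in the quote shows up here too: changing only `kernel_sse3` still means shipping a whole new program to every client.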

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 42

Message 79595 in response to message 79586

What happened was that I was reading up on Wikipedia about what SSE2 does. There I misunderstood what it does for the AMDs: on a 64-bit AMD CPU there are 8 additional XMM registers that only work in 64-bit mode. Hence my confusion and why I took Brian's advice and went to bed. The aspirins weren't working anyway. :-(

Oh well, you can't have a good night all nights and when you make one mistake, more are prone to crawl up on you in quick succession. ;-)

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

Message 79596 in response to message 79593

Quote:
Are the SSE2 HT-specific codes effectively ignored or does everything come to a halt with the "illegal instruction" exception?


As far as I know, SSE2 doesn't contain any HT-specific instructions.
So, the AMD processors know all of the SSE2 instructions except two:

CLFLUSH (flush cache line) and PAUSE (delay pre-decoding).

PAUSE is a NOP code on older and AMD processors.
But CLFLUSH produces the "illegal instruction" effect.

ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86563414
RAC: 306

Message 79597 in response to message 79596

Quote:
PAUSE is a NOP code on older and AMD processors.
But CLFLUSH produces the "illegal instruction" effect.


OK, so how often is the CLFLUSH instruction used, if ever?

Just as the "AuthenticAMD" string test generated by ICC can be overwritten so that the 'CPUID test' then always runs the (SSEn) optimised code, can the CLFLUSH instruction be overwritten by, for example, a NOP to avoid the "illegal instruction" trap?
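
For what it's worth, the mechanics of such a patch can be sketched as below. This is only an illustration under stated assumptions: `patch_with_nops` is a hypothetical helper, the byte string is a made-up code fragment rather than anything from a real application, and the patch site is assumed to be known already. Blindly scanning a binary for the CLFLUSH opcode bytes would not be safe, because the same bytes can occur in data or inside other instructions, and the instruction length changes with the addressing mode.

```python
NOP = 0x90  # one-byte x86 NOP

# 0F AE /7 is CLFLUSH; with ModRM byte 0x38 it encodes CLFLUSH [eax]
# in 32-bit code, three bytes in total.
CLFLUSH_EAX = bytes([0x0F, 0xAE, 0x38])

def patch_with_nops(code, offset, expected):
    """Return a copy of `code` where the bytes at `offset` are replaced by NOPs.

    Refuses to patch unless the bytes found there match `expected`,
    so a wrong offset does not silently corrupt the code.
    """
    end = offset + len(expected)
    if bytes(code[offset:end]) != bytes(expected):
        raise ValueError("unexpected bytes at patch site")
    patched = bytearray(code)
    patched[offset:end] = bytes([NOP]) * len(expected)
    return bytes(patched)

# Made-up fragment: NOP, CLFLUSH [eax], RET
fragment = bytes([0x90]) + CLFLUSH_EAX + bytes([0xC3])
patched = patch_with_nops(fragment, 1, CLFLUSH_EAX)
```

Whether the patched program still behaves correctly depends on why the flush was there in the first place.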

Regards,
Martin

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

Message 79598 in response to message 79597

Quote:
OK, so how often is the CLFLUSH instruction used, if ever?


It depends on the application.

But I was wrong.
CLFLUSH is not part of the SSE2 instruction set, but it was introduced together with it.
It has its own CPUID feature flag, so it can appear in non-SSE2 CPUs too.
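
A small sketch of what "its own feature flag" means in practice. The bit positions are the standard CPUID function 1 assignments (EDX bit 19 CLFLUSH, bit 25 SSE, bit 26 SSE2, bit 28 HTT; ECX bit 0 SSE3, bit 3 MONITOR/MWAIT). The function name `decode_leaf1` is hypothetical, and the register values are synthetic inputs: actually executing CPUID needs native code, which this sketch does not attempt.

```python
# Bit positions in the registers returned by CPUID function 1.
EDX_FLAGS = {"clflush": 19, "sse": 25, "sse2": 26, "htt": 28}
ECX_FLAGS = {"sse3": 0, "monitor": 3}

def decode_leaf1(ecx, edx):
    """Translate raw ECX/EDX values from CPUID function 1 into feature names."""
    features = set()
    for name, bit in EDX_FLAGS.items():
        if (edx >> bit) & 1:
            features.add(name)
    for name, bit in ECX_FLAGS.items():
        if (ecx >> bit) & 1:
            features.add(name)
    return features

# Synthetic example: a CPU reporting CLFLUSH and SSE but not SSE2.
no_sse2 = decode_leaf1(ecx=0, edx=(1 << 19) | (1 << 25))

# Synthetic example: SSE, SSE2 and SSE3 reported, MONITOR/MWAIT not.
sse3_no_monitor = decode_leaf1(ecx=1 << 0, edx=(1 << 25) | (1 << 26))
```

Because each capability has a separate bit, a program should test the flag for the exact instruction it wants to use instead of inferring it from an "SSE level".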

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 736185180
RAC: 1280665

I think MONITOR and MWAIT are part of SSE3, but not supported by AMD.

CU
Bikeman
