Information about the new S5 workunits

Akos Fekete

Joined: 13 Nov 05

Posts: 561

Credit: 4527270

RAC: 0

RE: Does anybody still have

22 May 2007 14:11:54 UTC

Message 37887 in response to message 37886

Quote:

Does anybody still have a copy of the S5R1 or S5RI science app for Windows??

Of course!

Quote:

Is the string "AuthenticAMD" also appearing in those apps?

Yes. I checked. ( S5RI 4.24 windows )

roadrunner_gs

Joined: 7 Mar 06

Posts: 94

Credit: 3369656

RAC: 0

Couldn't it be they now just

22 May 2007 14:12:19 UTC

Message 37888

Couldn't it be that they now just link against the math library from ICC, whereas before they linked against the standard Microsoft Visual C++ math library?

Annika

Joined: 8 Aug 06

Posts: 720

Credit: 494410

RAC: 0

Will something be done about

22 May 2007 14:21:09 UTC

Message 37889

Will something be done about the AMD penalty under Windows? What are the project devs planning? It can't be in their interest that a significant percentage of boxes are running at 70% of their potential or less. So, do you plan to change this part of the app in the next release of the Einstein science app?

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 2960995982

RAC: 696495

RE: RE: Does anybody

22 May 2007 14:26:22 UTC

Message 37890 in response to message 37887

Quote:

Quote:
Does anybody still have a copy of the S5R1 or S5RI science app for Windows??

Of course!
Quote:
Is the string "AuthenticAMD" also appearing in those apps?

Yes. I checked. ( S5RI 4.24 windows )

Do you have a test S5RI datapak so you could run an offline comparison of the patched 4.24 app - on an AMD SSE2, of course?

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 734233262

RAC: 1292589

RE: RE: Does anybody

22 May 2007 14:53:30 UTC

Message 37891 in response to message 37887

Quote:

Quote:
Does anybody still have a copy of the S5R1 or S5RI science app for Windows??

Of course!
Quote:
Is the string "AuthenticAMD" also appearing in those apps?

Yes. I checked. ( S5RI 4.24 windows )

Hmmm, so maybe the modf function wasn't used as heavily in the old app. Bernd mentioned that the old app used some alternative to modf, which was later found to be numerically suboptimal in the context of the new run.

Akos, you will know what I mean, something along the lines of
frac = x - (UINT4) x instead of frac = modf(x,&dummy)

This would explain why the old app didn't suffer.
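To make the two variants concrete, here is a minimal sketch in C. The function names are made up for illustration, and I am assuming UINT4 is a 32-bit unsigned integer typedef, so uint32_t stands in for it; this is not the actual Einstein@Home code.

```c
#include <math.h>
#include <stdint.h>

/* Fractional part via the C standard library call.
   modf() keeps the sign of x, so -2.75 yields -0.75. */
double frac_modf(double x)
{
    double int_part; /* receives the integral part; not needed here */
    return modf(x, &int_part);
}

/* Fractional part via an integer cast, in the style of
   "frac = x - (UINT4) x". ASSUMPTION: UINT4 is a 32-bit unsigned type.
   Only well defined when the truncated value fits in the unsigned
   type; for other inputs the conversion is undefined behaviour in C. */
double frac_cast(double x)
{
    return x - (double)(uint32_t)x;
}
```

For small non-negative inputs both functions return the same value, e.g. 0.75 for 2.75. The cast variant never calls into the math library, which would fit the observation that the old app was not affected.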

As to the compiler, I guess Microsoft may have licensed Intel's math library, or maybe they use the Intel compiler to build their math lib (just kidding, don't sue me, MS ...).

BRM

M. Schmitt

Joined: 27 Jun 05

Posts: 478

Credit: 15872262

RAC: 0

RE: As to the compiler, I

22 May 2007 15:04:20 UTC

Message 37892 in response to message 37891

Quote:

As to the compiler, I guess Microsoft may have licensed Intel's math library, or maybe they use the Intel compiler to build their math lib (just kidding, don't sue me, MS ...).

CU

BRM

Afaik you can download math libs from Intel and AMD for free. Don't know about the licence though.

cu,
Michael

Annika

Joined: 8 Aug 06

Posts: 720

Credit: 494410

RAC: 0

Update from the Opteron... a

22 May 2007 18:43:04 UTC

Message 37893

Update from the Opteron... a full 50 percent increase! The WU isn't finished yet, but it's more than half crunched, so the estimate should be quite okay. Looks like that kind of box normally uses SSE2 a lot, hence the huge penalty (running at roughly 70% of its potential) and now the big performance increase. I think this kind of box will benefit most if a patch is applied on a large scale... okay, dunno how many people combine a server CPU and Windows, but still, it's a significant difference, and losing that much on a fast machine can have quite an effect even if there are not that many boxes of this kind around.

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 734233262

RAC: 1292589

RE: Update from the

22 May 2007 20:10:15 UTC

Message 37894 in response to message 37893

Quote:

Update from the Opteron... a full 50 percent increase! The WU isn't finished yet, but it's more than half crunched, so the estimate should be quite okay. Looks like that kind of box normally uses SSE2 a lot, hence the huge penalty (running at roughly 70% of its potential) and now the big performance increase. I think this kind of box will benefit most if a patch is applied on a large scale... okay, dunno how many people combine a server CPU and Windows, but still, it's a significant difference, and losing that much on a fast machine can have quite an effect even if there are not that many boxes of this kind around.

Hi all!

So cool!

I looked at that suspect code in the math lib again, and I think it is not "evil", just incorrect. It could be that whoever wrote it didn't want to exclude all AMDs from SSE2, only a certain processor family. After the comparison with the string "AuthenticAMD", the code does some more arithmetic with the CPU model and extended CPU model info (my assembly language knowledge isn't that good anymore). It's possible that the intention was to exclude only the first generation of 130nm "Newcastle" K8s. I think what it actually does might be the opposite: enabling SSE2 on the Newcastles and disabling it for all others.

Was there something wrong with the Newcastle SSE2 implementation? I didn't find anything by googling. Maybe it was just plain slow??

BRM

roadrunner_gs

Joined: 7 Mar 06

Posts: 94

Credit: 3369656

RAC: 0

RE: (...) I looked at that

22 May 2007 20:20:49 UTC

Message 37895 in response to message 37894

Quote:

(...)
I looked at that suspect code in the math lib again, and I think it is not "evil", just incorrect. It could be that whoever wrote it didn't want to exclude all AMDs from SSE2, only a certain processor family. After the comparison with the string "AuthenticAMD", the code does some more arithmetic with the CPU model and extended CPU model info (my assembly language knowledge isn't that good anymore). It's possible that the intention was to exclude only the first generation of 130nm "Newcastle" K8s. I think what it actually does might be the opposite: enabling SSE2 on the Newcastles and disabling it for all others.

look here

Quote:

Was there something wrong with the Newcastle SSE2 implementation? I didn't find anything by googling. Maybe it was just plain slow??
(...)

Not as far as I know, but I'll go searching.

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 734233262

RAC: 1292589

RE: RE: (...) I looked at

22 May 2007 20:23:53 UTC

Message 37896 in response to message 37895

Quote:

Quote:
(...)
I looked at that suspect code in the math lib again, and I think it is not "evil", just incorrect. It could be that whoever wrote it didn't want to exclude all AMDs from SSE2, only a certain processor family. After the comparison with the string "AuthenticAMD", the code does some more arithmetic with the CPU model and extended CPU model info (my assembly language knowledge isn't that good anymore). It's possible that the intention was to exclude only the first generation of 130nm "Newcastle" K8s. I think what it actually does might be the opposite: enabling SSE2 on the Newcastles and disabling it for all others.

look here

Quote:
Was there something wrong with the Newcastle SSE2 implementation? I didn't find anything by googling. Maybe it was just plain slow??
(...)

Not as far as I know, but I'll go searching.

Clawhammer and Newcastle, that is. Everything that would report Family 15, extended family 0.
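For anyone who wants to see what "Family 15, extended family 0" means in terms of bits, here is a rough sketch in C of how the vendor string and the family fields are laid out in the CPUID results. The function names are hypothetical, the functions take raw register values as arguments so that no compiler-specific CPUID intrinsic is needed, and this does not reproduce the check in the math library; it only shows the field decoding.

```c
#include <stdint.h>

/* Assemble the 12-character vendor string from CPUID leaf 0.
   The registers are read in the order EBX, EDX, ECX, lowest byte first.
   out must have room for 13 bytes (12 characters plus terminator). */
void cpu_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    uint32_t regs[3];
    int r, b;
    regs[0] = ebx;
    regs[1] = edx;
    regs[2] = ecx;
    for (r = 0; r < 3; r++)
        for (b = 0; b < 4; b++)
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
    out[12] = '\0';
}

/* Fields of EAX from CPUID leaf 1. */
unsigned cpu_base_family(uint32_t eax) { return (eax >> 8) & 0xF; }
unsigned cpu_ext_family(uint32_t eax)  { return (eax >> 20) & 0xFF; }
unsigned cpu_base_model(uint32_t eax)  { return (eax >> 4) & 0xF; }
unsigned cpu_ext_model(uint32_t eax)   { return (eax >> 16) & 0xF; }

/* Effective family: the extended family is added only when the
   base family field is 15. */
unsigned cpu_display_family(uint32_t eax)
{
    unsigned base = cpu_base_family(eax);
    return base == 0xF ? base + cpu_ext_family(eax) : base;
}

/* Nonzero for anything reporting Family 15, extended family 0. */
int cpu_is_family15_ext0(uint32_t eax)
{
    return cpu_base_family(eax) == 0xF && cpu_ext_family(eax) == 0;
}
```

As an illustrative input, an EAX value of 0x00000F48 decodes to base family 15, extended family 0, base model 4, while 0x00100F42 has extended family 1 and would not match.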

BRM
