Credits for Arecibo vs. GW-search

kimmerin
Joined: 29 Sep 08
Posts: 16
Credit: 11090767
RAC: 0
Topic 194272

Hello,

After finishing some Arecibo WUs, I can see that the credit granted for them is significantly lower than the credit granted for the GW search:

On one of my computers, a GW search led to about 8.33... credits per 1000 CPU seconds.

The pulsar search led to only about 6 credits per 1000 CPU seconds.

Is there a specific reason for this, or will it be adjusted? I can imagine that many people will exclude Arecibo data from processing because of this.
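To put the two figures on the same footing, here is a minimal Python sketch of the comparison. The function names are made up for illustration, and 8.33 is used as a rounded stand-in for the "8.33..." value quoted above.

```python
def credits_per_kilosecond(credits: float, cpu_seconds: float) -> float:
    """Return granted credit per 1000 CPU seconds."""
    if cpu_seconds <= 0:
        raise ValueError("cpu_seconds must be positive")
    return credits / cpu_seconds * 1000.0


def relative_shortfall(rate: float, reference_rate: float) -> float:
    """Return how far `rate` falls below `reference_rate`, as a fraction."""
    return 1.0 - rate / reference_rate


gw_rate = 8.33       # GW search, credits per 1000 CPU seconds (rounded)
arecibo_rate = 6.0   # Arecibo pulsar search, credits per 1000 CPU seconds

shortfall = relative_shortfall(arecibo_rate, gw_rate)
print(f"Arecibo pays about {shortfall:.0%} less per CPU second")
```

With these two numbers the pulsar search comes out a bit more than a quarter below the GW search on this host.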

Regards, Lothar

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 833457326
RAC: 1153485

Thanks for the feedback. The devs are aware of this and will use the statistics of the results returned so far to recalibrate the credits for the Arecibo search. It's impossible to get it 100% equal with the S5R5 search across all platforms (SSE, SSE2...), but I'm sure the credits will be adjusted to be a bit fairer to the Arecibo search sooner rather than later. Stay tuned!

Thx again,
Bikeman

bahndamm_net-boincers
Joined: 11 Feb 05
Posts: 14
Credit: 12053362
RAC: 865

Message 92296 in response to message 92295

Quote:

It's impossible to get it 100% equal with the S5R5 search across all platforms (SSE, SSE2...), but I'm sure the credits will be adjusted to be a bit fairer to the Arecibo search sooner rather than later. Stay tuned!

Hi,

Could it be that the Arecibo application is based on a totally different computational model (e.g. integer arithmetic)?

I have two VIA C7 CPUs running, and both finish the Arecibo WUs much faster than the regular ones. This one, for example, finishes Einstein WUs in between 210,000 and 320,000 seconds for 150 to 200 credits. The Arecibo WUs take 160,000 seconds for 250 credits!

Every other brand of CPU takes about 10-25% more processing time for Arecibo data, but these little guys need 25-50% less!
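As a rough check of those numbers, the credit rates on this C7 can be worked out as below. This is only a sketch: the post does not say which credit value belongs to which runtime, so pairing 150 credits with 320,000 s and 200 credits with 210,000 s is an assumption that gives the widest possible range.

```python
def credits_per_kilosecond(credits: float, cpu_seconds: float) -> float:
    """Return granted credit per 1000 CPU seconds."""
    return credits / cpu_seconds * 1000.0


# Regular (S5R5) WUs on the C7; the credit/runtime pairing is assumed.
s5r5_low = credits_per_kilosecond(150, 320_000)   # least favourable pairing
s5r5_high = credits_per_kilosecond(200, 210_000)  # most favourable pairing

# Arecibo WUs on the same C7.
arecibo = credits_per_kilosecond(250, 160_000)

print(f"S5R5:    {s5r5_low:.2f} to {s5r5_high:.2f} credits per 1000 s")
print(f"Arecibo: {arecibo:.2f} credits per 1000 s")
print(f"Arecibo pays at least {arecibo / s5r5_high:.1f}x the S5R5 rate here")
```

Even under the pairing most favourable to S5R5, the Arecibo rate on this CPU is clearly higher, which is the opposite of what the first post reports for other hosts.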

Since the C7 handles integer arithmetic quite well but stalls on floating-point calculations, I would guess that the Arecibo application makes heavy use of integer operations. This would also explain why there is no SSE version of the Arecibo application.

Am I right with this?

Rudi

PS: Of course, the C7 is not "faster" with integers; it's just totally sluggish with floating point, so there's no real advantage for these CPUs here.

Bikeman (Heinz-Bernd Eggenstein)

This is indeed remarkable.

No, AFAIK the "Arecibo" app is no less dependent on floating-point calculations than the S5R5 app.

Given the clock rates of the C7, the runtime of the S5R5 app is incredibly long. It's worth noting that the C7 supports SSE2, so the SSE2-optimized app variant gets selected. But... I'm not sure whether the C7 implementation of SSE(2) is really meant to provide maximum performance or just compatibility.

You could test whether the generic, "unoptimized" x87 FPU code is actually faster than the SSE2 implementation on the C7. Creating a file named CPU_TYPE_0 in the BOINC folder (the one with the projects subfolder) forces the "switcher" to select the standard app variant even if it detects SSE(2) support.
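For anyone who prefers to script this, a minimal Python sketch follows. The function name is hypothetical, the location of the BOINC folder differs per installation and must be supplied by the caller, and the sketch assumes that an empty file is sufficient, since only the file name is mentioned above.

```python
from pathlib import Path


def create_cpu_type_override(boinc_dir: str) -> Path:
    """Create an empty CPU_TYPE_0 marker file in the given BOINC folder."""
    folder = Path(boinc_dir)
    # Sanity check taken from the hint above: the right folder
    # is the one that contains the "projects" subfolder.
    if not (folder / "projects").is_dir():
        raise FileNotFoundError(f"no 'projects' subfolder in {folder}")
    marker = folder / "CPU_TYPE_0"
    # Assumption: the file's presence is what matters, so it is left empty.
    marker.touch(exist_ok=True)
    return marker
```

Call it with the path of your own BOINC folder, and delete the file again to return to the automatically selected app variant.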

CU
Bikeman

bahndamm_net-boincers

Message 92298 in response to message 92297

Quote:

This is indeed remarkable.

No, AFAIK the "Arecibo" app is no less dependent on floating-point calculations than the S5R5 app.

Well, it is indeed. Two similar apps with a 50% difference in processing time. Go figure...

Quote:

I'm not sure whether the C7 implementation of SSE(2) is really meant to provide maximum performance or just compatibility.

Since it even supports SSE3, I'd rather play the "compatibility" card.

Quote:

You could test whether the generic, "unoptimized" x87 FPU code is actually faster than the SSE2 implementation on the C7. Creating a file named CPU_TYPE_0 in the BOINC folder (the one with the projects subfolder) forces the "switcher" to select the standard app variant even if it detects SSE(2) support.

I'll give that a try as soon as the current WUs are finished.

See you guys

Rudi

bahndamm_net-boincers

Message 92299 in response to message 92298

Quote:

You could test whether the generic, "unoptimized" x87 FPU code is actually faster than the SSE2 implementation on the C7. Creating a file named CPU_TYPE_0 in the BOINC folder (the one with the projects subfolder) forces the "switcher" to select the standard app variant even if it detects SSE(2) support.

Wow. This is next to ridiculous...

The first FPU-only processed result took nearly 80% longer than the one done with SSE2.
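Because the thread mixes "x% longer" and "y% less" figures, a small conversion helps when comparing them. The sketch below uses a made-up helper name and takes "nearly 80%" as exactly 0.80 for illustration.

```python
def time_saved_fraction(extra_time_fraction: float) -> float:
    """Convert 'A takes x longer than B' into 'B takes y less than A'."""
    return 1.0 - 1.0 / (1.0 + extra_time_fraction)


x87_extra = 0.80  # the x87 run took "nearly 80%" longer than the SSE2 run

print(f"SSE2 speedup factor: {1.0 + x87_extra:.2f}x")
print(f"SSE2 needs about {time_saved_fraction(x87_extra):.0%} less time than x87")
```

So an 80% longer x87 runtime is the same statement as the SSE2 build needing a little under half less time.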

So much for "only compatibility". That still doesn't explain, though, why the Arecibo app is so much faster on this type of CPU...

I guess I'll have to live with that. There are only a few VIA/Centaur CPUs around here, so there's no reason to burn valuable programming time on them. Maybe the next regular app update will do better, by some x86 miracle.

Thanks all

Rudi

Bikeman (Heinz-Bernd Eggenstein)

Message 92300 in response to message 92299

Hi!

Thanks for trying this; it rules out one possible explanation for what you observed.

The next possible explanation would be the size of the L2 cache (is it 128 KB?). The S5R5 app seems to benefit from a lot of cache because part of it is memory intensive (the "pattern matching" part of the computation, while the other part is more floating-point intensive). 128 KB isn't much; even the consumer AMD CPUs that have 512 KB per core seem to suffer a bit compared to Intel CPUs with 1 MB+ of L2 cache per core when it comes to S5R5 performance.

It happens that I'm currently looking into making the app less dependent on L2 cache (actually a by-product of my first steps in CUDA-land :-) ), and it would be interesting to test-drive a prototype on your system precisely because it is "cache-challenged". Do you run 64-bit or 32-bit Linux?

EDIT: Oops, the C7 is 32-bit only. OK, if you're interested in benchmarking a prototype app, send me a PM.

CU
Bikeman
