Processor affinity screwed up?

Ulrich Metzner
Joined: 22 Jan 05
Posts: 113
Credit: 963370
RAC: 0
Topic 193943

Hello everyone,

I noticed a problem with the new Einstein application 6.04:

Several times I have noticed that the system is not running at full load. When that happens, I find that an Einstein task is using two processors instead of one, which starves, for example, a SETI application running in parallel: the SETI application gets only about 30% of a processor and the total system load is only 70-80%. Once I manually set the processor affinity of the Einstein application so that it uses only the other processor, the system load returns to 100% on my Core 2 Duo. This happens only with the Einstein application, not with other projects. I searched the board for processor affinity but did not find a single mention of this problem. The BOINC client I use is version 6.1.0 by Crunch3r.
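
Setting affinity by hand, as described above, amounts to handing the OS a bitmask with one bit per logical processor. A minimal Python sketch, assuming a two-core machine: the helper names `affinity_mask` and `pin_current_process` are made up for illustration, and the pinning helper only does anything where Python exposes `os.sched_setaffinity` (Linux), not on Windows XP.

```python
import os


def affinity_mask(cpus):
    """Return a bitmask with one bit set per logical CPU index in `cpus`."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU indices must be non-negative")
        mask |= 1 << cpu
    return mask


def pin_current_process(cpu):
    """Restrict the calling process to a single CPU, where supported.

    Returns True if the affinity was changed, False if this platform
    does not expose os.sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {cpu})  # pid 0 means "the calling process"
    return True


# On a dual core: first CPU only, second CPU only, both CPUs.
print(bin(affinity_mask([0])), bin(affinity_mask([1])), bin(affinity_mask([0, 1])))
```

On a dual core the three masks are `0b1`, `0b10` and `0b11`; choosing "the other processor only" corresponds to switching between the first two.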

I forgot: I'm using Windows XP Professional, 32-bit version.

Any help appreciated.

Aloha, Uli

Stan Pope
Joined: 22 Dec 05
Posts: 80
Credit: 426811575
RAC: 0

Here are some data points all using BOINC 5.10.13:

Configuration that shows loss of processor affinity:
Pentium D 3.0GHz with Vista Home Premium Service Pack 1

Other configurations here that show proper affinity:
Pentium 4 HT 3.0GHz Vista Home Premium Service Pack 1
AMD Athlon 64 X2 Dual Core 4800+ Vista Home Premium Service Pack 1
Core 2 Quad Q6600 Win XP Service Pack 3
Core 2 Quad Q6600 Vista Home Premium Service Pack 1

Would any more in-depth details be useful, e.g. motherboard, BIOS, ???
Should we put this in the "bug reports" area instead?

Stan

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 3002821799
RAC: 712357

What two applications are you actually seeing this affinity "problem" for?

My guess is that BOINC is launching einstein_S5R4_6.04_windows_intelx86.exe (81 KB), and giving it a whole core all to itself. Since the only role of this app is to detect the presence/absence of SSE capabilities in the CPU, and then launch the appropriate worker app (einstein_S5R4_6.04_windows_intelx86_1.exe, in your case), that's rather a waste of a core.

I think it's highly unlikely that the Einstein project staff will waste time modifying their application to work with an unsupported BOINC feature in an elderly, unofficial BOINC version - I say 'elderly' because even the current (downloaded today) Version 5 of Crunch3r's v6.1.0 is dated 8 February 2008, and hence can't be a true BOINC v6 compliant application.

I think you have three options:

1) Use a stock (supported) BOINC version which doesn't try to enforce processor affinity - I've never seen anyone on any BOINC board post convincing figures to suggest there's any benefit from using affinity on modern CPUs anyway.

2) Write an app_info.xml file to launch einstein_S5R4_6.04_windows_intelx86_1.exe directly without benefit of the CPU switcher.

3) Go back to the source of the problem, and ask Crunch3r to write a version of BOINC which handles affinity for child processes too.

Ulrich Metzner
Joined: 22 Jan 05
Posts: 113
Credit: 963370
RAC: 0

Thanks for the answers!

I find the processor affinity is essential for maximum performance. This feature ensures that there is no cache and pipeline thrashing. Without processor affinity you can see the processes struggling for resources, and the load never reaches 100%. With affinity enabled, both processes run at exactly 50% load each. If none of the other BOINC clients support this feature, I think that is a blatant waste of CPU and memory-management power.

So the best solution will be to contact Crunch3r.
Thanks everyone!

Aloha, Uli

archae86
Joined: 6 Dec 05
Posts: 3163
Credit: 7346581687
RAC: 2202725

Message 85675 in response to message 85674

Quote:
I find the processor affinity is essential for maximum performance.

I'm one of the few people who has actually posted results attempting to evaluate this point experimentally.

For my particular system and applications, there actually appeared to be a small penalty for turning on processor affinity.

While I'll cheerfully acknowledge that this answer could vary some with detailed application characteristics and system architecture, I think the general conclusion that processor affinity is always helpful is quite clearly wrong.

In your particular case, the work you have already lost because your specialized BOINC client was allocating an entire core to a dispatching process probably exceeds any gain you might get from affinity for many months.

I'm not really saying this to discourage you in your quest, but to balance your comment for others who might be reading and might be influenced.

Possibly your mix of applications (I only run SETI and Einstein, ran the tests a while ago, and don't recall the mix of work I used) gives you an affinity reward not seen by others.

Stan Pope
Joined: 22 Dec 05
Posts: 80
Credit: 426811575
RAC: 0

Message 85676 in response to message 85673

Quote:
What two applications are you actually seeing this affinity "problem" for?

Mostly it is imbalance, as much as 80-20, between instances of einstein_S5R4_6.04_windows_intelx86_1.exe.

I don't know if the usual balanced behavior (50-50 or 25-25-25-25 as shown in Windows Task Manager) is the developer's intention or just a happy accident ... I haven't found specs that would tell me.

Since the imbalance on this one PC is only part-time, I'm not going to worry about it for myself. I "piped up" just to offer case info for someone who wants to pursue it.

Stan

archae86
Joined: 6 Dec 05
Posts: 3163
Credit: 7346581687
RAC: 2202725

Message 85677 in response to message 85676

Quote:
Mostly it is imbalance, as much as 80-20, between instances of einstein_S5R4_6.04_windows_intelx86_1.exe.


I can mention a couple of ways to get imbalance which I've observed on my Duo and my Quad.

1. As is well known, there is a startup phase each time a new Einstein WU starts up or one resumes from a checkpoint (not for a simple task resume from a memory image). During this phase Process Explorer reports that the Einstein task racks up a huge number of I/O read bytes. I've seen from about 750 megabytes up past 1.3 gigabytes: very reproducible within sequential work from the same frequency, but with big fixed effects (I think the higher frequencies do more). During this period, the Einstein task can't use up a full processor, partly because work gets attributed to the System task, which uses perhaps a third of the CPU expended on the Einstein task, but also because, at least when four tasks are launching from the same start time, there is appreciable idle time. How long this period lasts seems highly variable: it depends not only on CPU speed and on the other tasks running, but also, at the least, on one's choice of anti-virus and firewall software (both). I've seen as little as well under twenty seconds, and as much as five minutes or more.

So, anyway, if I look when one Einstein task is in that startup phase, and other(s) not, there is a big CPU time imbalance, with the starting up one getting well under 1/n, and the others getting just about 1/n, assuming nothing else is going on.

2. When "something else" is going on which uses appreciable resources, I've noticed that it takes them unequally from the currently running Einstein tasks.

I should note that these observations are all with a client that enforces no CPU affinity preferences (stock BOINC 5.10.45) on Windows XP Pro systems, with Conroe-generation CPUs. As affinity enforcement removes a degree of freedom of the OS to schedule available work to available resources, I suspect it would make these particular cases worse.
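
One way to picture the "degree of freedom" argument is a toy model. This is an idealized sketch, not a measurement: the function names and the 50% foreground load are made up for illustration, and it assumes two cores, two crunching tasks, and a foreground process that takes a fixed share of core 0.

```python
def shares_pinned(foreground_load):
    """One crunching task pinned to each of two cores.

    A foreground process takes `foreground_load` (0.0 to 1.0) of core 0,
    so only the task pinned there loses time.
    """
    return (1.0 - foreground_load, 1.0)


def shares_floating(foreground_load):
    """Same machine, no affinity: an idealized scheduler migrates the two
    tasks freely and splits the remaining CPU time evenly between them."""
    each = (2.0 - foreground_load) / 2.0
    return (each, each)


load = 0.5  # hypothetical: foreground work uses half of core 0
print("pinned:  ", shares_pinned(load))
print("floating:", shares_floating(load))
```

In this model the total CPU time available to the two tasks is the same either way; only the split differs (0.5 and 1.0 when pinned, 0.75 each when floating). It says nothing about cache effects or migration costs, which is where the real disagreement lies.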

Ulrich Metzner
Joined: 22 Jan 05
Posts: 113
Credit: 963370
RAC: 0

Message 85678 in response to message 85677

Quote:
(...)
I should note that these observations are all with a client that enforces no CPU affinity preferences (stock BOINC 5.10.45) on Windows XP Pro systems, with Conroe-generation CPUs. As affinity enforcement removes a degree of freedom of the OS to schedule available work to available resources, I suspect it would make these particular cases worse.


No degree of freedom at all is removed from the OS, since all the processes run at idle priority. So the OS has no restrictions on getting processing resources from any processor it wants. The statement that there is no benefit from using processor affinity contradicts all the experience I have from my own work in this context. There is a clear penalty of cache and pipeline thrashing when more than one process uses the same processor. The less context switching, the less thrashing - simple as that. Crunch3r shares my opinion on this and I will try to support him in finding a decent solution.

Aloha, Uli

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 3002821799
RAC: 712357

Along with archae86, I have attempted to measure the practical benefits of processor affinity. This chart was the result:


[Chart: standard BOINC vs. affinity BOINC; image not reproduced]

OK, that's an elderly chart - first posted 21 December 2006: you can tell from the Crunch3r affinity BOINC I tested (v5.7.5). Also, I was testing the old-style SETI Linefeed applications, not the modern MultiBeam ones.

But I still feel that the difference between standard BOINC (blue markers) and affinity BOINC (green markers) is unconvincing - even with two CPUs, four caches and eight core pipelines just waiting to be trashed! The solution, in my case, was to feed the dual Xeons with quad-channel memory (yellow markers).

I'm willing to be convinced, but I'd like to see some actual, practical, real-world WU timings (since this is Einstein, corrected for sequence periodicity and high-frequency wiggles). What sort of percentage speed increase, overall, do you reckon you get from affinity?
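
A figure like that could be computed from a batch of run times along these lines. This is a minimal sketch: the function name is made up, the timings are hypothetical placeholders rather than measurements from this thread, and it does not attempt the correction for sequence periodicity mentioned above.

```python
from statistics import mean


def percent_speed_increase(baseline_seconds, affinity_seconds):
    """Percentage speed increase of the affinity runs over the baseline,
    comparing mean run times (speed is taken as 1 / run time)."""
    if not baseline_seconds or not affinity_seconds:
        raise ValueError("need at least one timing in each list")
    return (mean(baseline_seconds) / mean(affinity_seconds) - 1.0) * 100.0


# Hypothetical run times in seconds, NOT measurements from this thread.
stock_boinc = [40000, 40000, 40000]
affinity_boinc = [32000, 32000, 32000]
print(percent_speed_increase(stock_boinc, affinity_boinc))
```

A positive result means the affinity runs were faster on average; a negative one means they were slower.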

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4345
Credit: 252815400
RAC: 41141

IMHO The easiest solution would be to write an app_info.xml for the 6.04 App that uses the file einstein_S5R4_6.04_windows_intelx86_1.exe as (only) App binary, thus avoiding the switcher process. You'd need to update your App manually then, but apparently you're used to tweaking your BOINC installation anyway.
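
A rough sketch of what such a file could contain, generated here with Python's standard library. The element layout follows BOINC's anonymous-platform app_info.xml format; the application name `einstein_S5R4` and the version number `604` are assumptions, not confirmed values, and any further files the worker needs are not covered.

```python
import xml.etree.ElementTree as ET

WORKER = "einstein_S5R4_6.04_windows_intelx86_1.exe"  # file name from this thread
APP_NAME = "einstein_S5R4"  # assumption: name under which the project registers the app
VERSION_NUM = "604"         # assumption: 6.04 written in BOINC's integer form


def build_app_info(app_name, exe_name, version_num):
    """Build a minimal <app_info> tree with one app and one executable."""
    root = ET.Element("app_info")

    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    file_info = ET.SubElement(root, "file_info")
    ET.SubElement(file_info, "name").text = exe_name
    ET.SubElement(file_info, "executable")

    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "version_num").text = version_num
    file_ref = ET.SubElement(version, "file_ref")
    ET.SubElement(file_ref, "file_name").text = exe_name
    ET.SubElement(file_ref, "main_program")  # run this file directly, no switcher
    return root


xml_text = ET.tostring(build_app_info(APP_NAME, WORKER, VERSION_NUM), encoding="unicode")
print(xml_text)
```

The printed XML would be saved as app_info.xml in the project's directory next to the executable; check the names against your own installation before relying on it.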

BM

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 800793154
RAC: 1221858

Hi!

Interesting stuff indeed. However, I really don't expect the answer to the question "Does enforcing CPU affinity matter for BOINC performance?" to be as simple as "yes" or "no".

Think about the huge scope of different configurations & use cases we are talking about:

* multi-core vs multi CPU
* NUMA style SMP vs shared memory
* multi cores that share (some) cache levels across cores vs dedicated caches per core or per core-pair on the same die or whatever....
* CPUs with an abundance of cache and those with rather small caches...
* Not to forget hyperthreading as an additional complication ..
* The load on the system from other processes than BOINC.

I would not even be surprised if different versions of Windows (e.g. Server and home editions) already had different scheduler strategies and/or built-in heuristics for CPU affinity.

CU
Bikeman
