Too few GPU tasks in reserve - programming error?

astro-marwil
Joined: 28 May 05
Posts: 534
Credit: 664736543
RAC: 565172
Topic 229500

Hello!

Sorry, I have to write a bit more so that you can understand my "problem":

I got a new PC with a Ryzen 7 7700X CPU and a Radeon RX 6600 GPU. The CPU also contains an iGPU, which has 12 GB of memory but far less crunching capability than the RX 6600, which has only 8 GB. As a result, BOINC selected the iGPU as the more capable GPU and ignored the RX 6600. To avoid this, I installed a configuration file; now the iGPU only drives my monitor. The run times for FGRPB1G shrank from about 8000 s to about 520 s. I want to keep a buffer of tasks for one day, but the program still calculates with the iGPU's run time of about 8000 s, so I get too few tasks in reserve. It ignores the new situation. For a while I thought it would adapt, but over the last few days I have seen no sign of that.

Do I have to add a further instruction to the config file? I didn't find a suitable one. Or is this a small problem in the server program?

I would appreciate your help.

Kind regards and happy crunching

Martin

 

Scrooge McDuck
Joined: 2 May 07
Posts: 1077
Credit: 18244286
RAC: 11685

Hello Martin,

Of course you can simply raise the number of days for which BOINC should keep tasks in reserve. But then you will most likely get far too many CPU tasks (O3MD1 etc.) that cannot be finished before the deadline. There is a "task duration correction factor", but I don't know whether you can effectively edit it by hand, that is, how to persuade BOINC to leave it unchanged rather than overwrite it. In any case this value applies per computer/host; there is no distinction between CPU, GPU0, GPU1 and so on. Hmm, I have no idea how to solve this problem well. I have the same problem with the opposite effect. The run time of BRP4G GPU tasks on the Intel iGPU (5-10 minutes) pushes the run time estimate for FGRPB1G GPU tasks far below the real run time. Once a dozen to twenty BRP4G tasks have completed on the iGPU, the estimate for FGRPB1G is ~1.5 h instead of ~8 h, and BOINC then requests far too many GPU tasks, more than can be processed within the deadline, which is one week for FGRPB1G. It would be desirable to be able to regulate the buffer for CPU and GPU (or GPU0, GPU1) separately. I don't know how.

Best regards and happy crunching

Scrooge

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118329179119
RAC: 25363469

astro-marwil wrote:
.... I want to keep a buffer of tasks for one day, but the program still calculates with the iGPU's run time of about 8000 s.

Martin,
I very much doubt that your problem has anything to do with the iGPU or its previous poor performance.  It very much looks like it is being caused by trying to run too many CPU tasks at the same time.

If you look at your list of GW tasks on the website you can see a huge difference between CPU time and run time, e.g. something like less than 30,000 s of CPU time against more than 40,000 s of total elapsed time. This is a classic sign that your CPU is overloaded. You should immediately change your preferences to allow BOINC to use much less than 100% of your CPU cores. Try 50% for starters. The elapsed time will drop much closer to the value for CPU time and BOINC will stop fetching as much CPU work. I notice you have so much that you had to abort some.
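As a rough illustration of that check, the ratio of CPU time to run time can be computed per task. This is only a sketch: the function name and the 0.9 threshold are assumptions picked for illustration, and the sample values are the approximate figures mentioned above, not exact task data.

```python
def cpu_efficiency(cpu_time_s: float, run_time_s: float) -> float:
    """Return the fraction of elapsed (run) time a task actually spent on the CPU."""
    if run_time_s <= 0:
        raise ValueError("run time must be positive")
    return cpu_time_s / run_time_s


# Hypothetical threshold: below this the CPU is probably oversubscribed.
OVERLOAD_THRESHOLD = 0.9

# Approximate figures from the post above: ~30,000 s CPU time, ~40,000 s run time.
efficiency = cpu_efficiency(30_000, 40_000)
verdict = "looks overloaded" if efficiency < OVERLOAD_THRESHOLD else "looks fine"
print(f"CPU efficiency {efficiency:.2f}: {verdict}")
```

A value close to 1.0 would suggest that tasks are not waiting for a free core; the further it falls below that, the more the elapsed time is inflated.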

The real cause of the low number of GPU tasks is that the duration correction factor (DCF) is being controlled by all the slow running CPU tasks. Every time a CPU task finishes, the long elapsed time will drive the DCF higher and thereby drive the estimate for GPU tasks to a much higher value than what it will actually take. Because of that, BOINC will not be fetching enough GPU tasks to satisfy your "buffer of tasks for a day" requirement.
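The effect on the buffer can be sketched with simple arithmetic. This is a deliberately simplified model, not BOINC's actual work-fetch logic: it assumes one GPU task running at a time and that the client simply fills the requested buffer with tasks at their estimated run time. The 8000 s and 520 s values are the figures from the first post; the function name is made up for this example.

```python
SECONDS_PER_DAY = 86_400


def tasks_for_buffer(buffer_days: float, estimated_runtime_s: float,
                     concurrent_tasks: int = 1) -> int:
    """Number of tasks whose *estimated* run time fills the requested buffer."""
    buffer_seconds = buffer_days * SECONDS_PER_DAY * concurrent_tasks
    return int(buffer_seconds // estimated_runtime_s)


# Estimate still inflated to about 8000 s per GPU task.
inflated = tasks_for_buffer(1.0, 8000)
# Estimate matching the real run time of about 520 s per GPU task.
realistic = tasks_for_buffer(1.0, 520)

print(f"tasks fetched with inflated estimate:  {inflated}")
print(f"tasks fetched with realistic estimate: {realistic}")
```

Under these assumptions the inflated estimate fills a "one day" buffer with only a small fraction of the tasks the GPU could really process in a day.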

Because of the 'single DCF applying to all searches' problem that Einstein has, you can never achieve a situation where all searches you run will have exactly the correct number of tasks in reserve that you specify.  However you can get a lot closer by allowing the CPU to run more efficiently on fewer tasks as described above.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118329179119
RAC: 25363469

Scrooge McDuck wrote:

.... I have the same problem with the opposite effect. The run time of BRP4G GPU tasks (5-10 minutes) changes the run time estimate of other GPU tasks, e.g. FGRPB1G. If a dozen to twenty BRP4G tasks have been finished in a row, then the run time estimates of FGRPB1G tasks drop far below the real run time, to ~1.5 hours instead of ~8 hours

8 hours for FGRPB1G???  Even for an internal GPU that seems crazy long.  If you look at your CPU task times, once again you have a huge difference between CPU time and run time.  Maybe you would get much better iGPU performance for FGRPB1G if you solved the huge difference you have for CPU tasks - I saw ~60K CPU time and >100K run time.  That's way too large a difference.  What % of CPU cores are you allowing BOINC to use?

Undoubtedly, using an iGPU is going to affect CPU task times anyway so there will always be some difference in the times.  I don't run any iGPUs so I don't have any experience.

Also, you could edit the DCF but it's pointless: the next task to finish will immediately drive it back, possibly in a single step if the correction is to a higher value (or over several steps if it is to a lower value).  The best thing is to make sure your CPU is not overloaded - I would think that CPU time and run time should be a lot closer than the current values.
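That "one step up, several steps down" behaviour can be pictured with a toy model. To be clear, this is not the BOINC client's code: the update rule and the `down_weight` smoothing constant are assumptions chosen only to mimic the asymmetry described above, and all task times are invented.

```python
def update_dcf(dcf: float, actual_s: float, raw_estimate_s: float,
               down_weight: float = 0.1) -> float:
    """Toy model of an asymmetric duration correction factor update.

    raw_estimate_s is the uncorrected estimate; the corrected estimate
    would be dcf * raw_estimate_s. down_weight is an assumed constant.
    """
    ratio = actual_s / raw_estimate_s
    if ratio > dcf:
        # Task ran longer than predicted: jump straight up to the new ratio.
        return ratio
    # Task ran as fast or faster than predicted: move down only part of the way.
    return (1 - down_weight) * dcf + down_weight * ratio


dcf = 1.0  # value a user might have edited by hand (hypothetical)

# One slow CPU task (3x its raw estimate) undoes the manual edit in one step.
dcf = update_dcf(dcf, actual_s=3000.0, raw_estimate_s=1000.0)
print(f"after one slow task: {dcf:.2f}")

# Several well-estimated tasks only pull the value back down gradually.
for n in range(1, 6):
    dcf = update_dcf(dcf, actual_s=1000.0, raw_estimate_s=1000.0)
    print(f"after {n} fast task(s): {dcf:.2f}")
```

In this model a hand-edited value survives only until the next task completes, which is why editing it is not worth the effort.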

Cheers,
Gary.

astro-marwil
Joined: 28 May 05
Posts: 534
Credit: 664736543
RAC: 565172

Hello!

Many thanks for your replies. They are helping; Gary's hint was especially valuable. I'm playing with the adjustments. It is as Gary said: reducing the CPU load reduces the run times of all applications. But now I get by far too many tasks for O3MD1.

I think it would help to have an instruction in <app_config> for varying the DCF. There is already an instruction for varying the CPU priority of an application, so why not an additional one for the DCF? To me this seems more important for smoothly running applications.

Kind regards and happy crunching

Martin

 

Harri Liljeroos
Joined: 10 Dec 05
Posts: 4458
Credit: 3261163041
RAC: 1857299

astro-marwil wrote:

Hello!

Many thanks for your replies. They are helping; Gary's hint was especially valuable. I'm playing with the adjustments. It is as Gary said: reducing the CPU load reduces the run times of all applications. But now I get by far too many tasks for O3MD1.

I think it would help to have an instruction in <app_config> for varying the DCF. There is already an instruction for varying the CPU priority of an application, so why not an additional one for the DCF? To me this seems more important for smoothly running applications.

Kind regards and happy crunching

Martin

BOINC has already solved this issue: it is called 'Credit New'. This system keeps a separate 'Application Processing Rate' (APR) value for each application of the project. Unfortunately the credit calculation with this system is so flawed (=random) that participants hate it deeply. There are tons of heated discussions about it in the Seti@home forums, where it was first introduced.

So your suggestion probably has little chance to be implemented.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118329179119
RAC: 25363469

astro-marwil wrote:
... I'm playing with the adjustments.

Make sure you hasten slowly with further playing:-).

The full effect of a change will take quite a while to occur so you should resist the temptation to keep making more adjustments until you see the full effect of what you have already done.

astro-marwil wrote:
... But now I get by far too many tasks for O3MD1.

No, you're not getting more tasks now.

You already had a lot of O3MD1 tasks before you made the change.  You got just one new O3MD1 task more than 20 hours ago and none since.  You currently have 170 in progress - which is a lot less than you had a day ago.  You can do a quick calculation based on the current run time and how many are running in parallel to see if BOINC can complete them all within the deadline.  Please do that calculation and report what you get.  Assuming you set BOINC to use 50% of the processors, a rough calculation says that the remaining 170 tasks will be completed by the deadline.
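The quick calculation asked for here might look like the following sketch. Only the 170 tasks in progress comes from this thread; the run time per task, the number of tasks running in parallel and the days left until the deadline are placeholders that should be replaced with the values shown in BOINC Manager and on the task list.

```python
import math

SECONDS_PER_DAY = 86_400


def days_to_finish(tasks_in_progress: int, run_time_s: float,
                   parallel_tasks: int) -> float:
    """Days needed to work through the queue, running tasks in equal batches."""
    batches = math.ceil(tasks_in_progress / parallel_tasks)
    return batches * run_time_s / SECONDS_PER_DAY


tasks = 170                  # from the post above
run_time_s = 30_000          # placeholder: current run time per CPU task
parallel = 8                 # placeholder: e.g. 50% of a 16-thread CPU
days_until_deadline = 14.0   # placeholder: check the real deadline

needed = days_to_finish(tasks, run_time_s, parallel)
status = "fits" if needed <= days_until_deadline else "does NOT fit"
print(f"about {needed:.1f} days needed, {status} within {days_until_deadline:.0f} days")
```

The estimate ignores new work arriving in the meantime and any time the machine is switched off, so some margin is advisable.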

You now have some more FGRPB1G tasks in progress and this should continue to increase as those tasks should be running consistently more quickly than previously and the DCF (and task estimates) will not be driven as high by long running CPU tasks.

astro-marwil wrote:
I think it would help to have an instruction in <app_config> for varying the DCF. There is already an instruction for varying the CPU priority of an application, so why not an additional one for the DCF? To me this seems more important for smoothly running applications.

The DCF is a dynamic parameter that is entirely under the control of the BOINC client.  If a user were to tweak the value, BOINC would adjust it to something else immediately after the completion of every single task of any type.  It makes no sense at all to think that a user could tweak it to give smoother running.

The problem exists because Einstein uses a customised version of the BOINC server code for which there is only a single DCF value.  DCF is a mechanism to 'correct' task estimates based on the actual run times that a particular host experiences.  The problem could be largely mitigated if each different search had a unique DCF parameter that was immune to interference from what was happening with other searches that used their own unique DCF parameter while running on a given host.
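The difference between one shared DCF and one DCF per search can be illustrated with a few invented numbers. Everything in this sketch is hypothetical: the raw estimates, the observed ratios and the use of `max()` as a stand-in for "the shared value gets driven up by the slowest search" are assumptions for illustration, not measurements or actual server behaviour.

```python
# Hypothetical uncorrected run time estimates in seconds, per search.
raw_estimate_s = {"O3MD1": 30_000, "FGRPB1G": 520}

# Hypothetical ratios of actual run time to raw estimate, per search.
observed_ratio = {"O3MD1": 1.5, "FGRPB1G": 1.0}

# One DCF for the whole host: simplified here as the largest observed ratio,
# because long-running tasks push the shared value up quickly.
shared_dcf = max(observed_ratio.values())
shared = {s: raw * shared_dcf for s, raw in raw_estimate_s.items()}

# One DCF per search: each estimate is corrected only by its own history.
per_search = {s: raw * observed_ratio[s] for s, raw in raw_estimate_s.items()}

for search in raw_estimate_s:
    print(f"{search}: shared DCF -> {shared[search]:.0f} s, "
          f"per-search DCF -> {per_search[search]:.0f} s")
```

With a shared factor the GPU search inherits the correction earned by the CPU search; with separate factors its estimate would stay at its own realistic value.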

It would probably be a monumental task for E@H staff to rewrite the customised server code to handle multiple DCFs (one per search) - or any other mechanism for handling the correction of run time estimates - so it's unlikely to happen.  It's quite easy for the user to work around the DCF issue - as you should be able to see after your changes take full effect.

Cheers,
Gary.

Scrooge McDuck
Joined: 2 May 07
Posts: 1077
Credit: 18244286
RAC: 11685

Gary, thanks for the detailed explanation of how the DCF works, and for the reminder that Einstein uses server code modified from BOINC's for its own reasons, which in turn has introduced some bumps in the road over time. Looking at client_state.xml, with its definitions of the E@H apps, their app versions and the corresponding executables and files, one wrongly assumes that different E@H searches running different apps/executables are completely independent of one another as far as their run time estimates are concerned. But they aren't, as I have now learned. A lot of things that I have observed and wondered about are now becoming very clear. Thanks for that.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118329179119
RAC: 25363469

Scrooge McDuck wrote:
Gary, thanks for the detailed explanation of how DCF works ...

You're most welcome!

I'm glad you found it useful.

Cheers,
Gary.

Scrooge McDuck
Joined: 2 May 07
Posts: 1077
Credit: 18244286
RAC: 11685

Gary Roberts wrote:

8 hours for FGRPB1G???  Even for an internal GPU that seems crazy long.  If you look at your CPU task times, once again you have a huge difference between CPU time and run time.  Maybe you would get much better iGPU performance for FGRPB1G if you solved the huge difference you have for CPU tasks - I saw ~60K CPU time and >100K run time.  That's way too large a difference.  What % of CPU cores are you allowing BOINC to use?

Gary, your observations are correct and your advice would help to improve my RAC. The difference between CPU times and run times is caused by an external tool (TThrottle) which throttles BOINC's CPU tasks to limit the CPU temperature. It's a small form factor device whose heat spreader is dimensioned for peak loads lasting minutes, not for 24/7 full performance. The BIOS will speed up the CPU fan to an uncomfortable noise level if a high CPU load lasts longer than ~10 minutes. So I run it throttled (to ~40..60% of maximum performance) all the time.

I use 50% of the CPU cores (quad core, 8 virtual cores) but at most 3 concurrently on O3MD1 CPU tasks. And yes, CPU tasks greatly slow down iGPU tasks (presumably a memory bottleneck). Especially O3MD1 CPU tasks, which are more demanding (heat, CPU temperature) than FGRP5 or BRP4X64, slow things down. I would achieve maximum RAC by running only FGRPB1G GPU tasks (~3...4 hours) on the iGPU and no CPU tasks at all.

Normally my box is unsuitable for crunching.... an old daily driver, a (low-noise) small desktop computer. But anyway, BOINC's idea is to use 'spare CPU cycles' for scientific computing. That's what I'm doing. I'm fine with my sub-par RAC. ;-)

 
