GPU significantly slowed down by other WUs

Michael Hoffmann
Joined: 31 Oct 10
Posts: 32
Credit: 31031260
RAC: 0
Topic 197082

According to BOINC, the GPU tasks use 0.5 CPUs and one GPU. However, normal CPU tasks still use the full number of CPUs available; in my case there are six CPU tasks running in parallel with the GPU task.
I noticed that this makes the GPU task very slow. The speed increases if you manually limit the system to five CPU tasks.
But I am not willing to manage tasks by hand just because of this GPU issue.
Is there anything that can be done about that?

Om mani padme hum.

Filipe
Joined: 10 Mar 05
Posts: 186
Credit: 415545483
RAC: 197411

Yes! Go to your BOINC preferences and set the percentage of CPUs to use to 80% (instead of 100%).

This way you will have one core free to feed the GPU, and the remaining five will run CPU tasks.
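
If you prefer a local setting instead of the website preference, I believe the same limit can go into a global_prefs_override.xml file in the BOINC data directory (tag names as I remember them from the BOINC documentation; the client needs to re-read its preferences or be restarted to pick the file up):

<global_preferences>
   <!-- use at most 80% of the processors -->
   <max_ncpus_pct>80.0</max_ncpus_pct>
</global_preferences>

The value is the same "% of processors" preference, just stored on the machine, so it takes priority over whatever the website sends.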

Hope this helps.

Michael Hoffmann
Joined: 31 Oct 10
Posts: 32
Credit: 31031260
RAC: 0

Thank you for that hint, Filipe. It helped and sped the process up a little. I think I'll have to experiment with the CPU-usage setting, as 80% does not seem to be the optimum.

It would be great if the application did that job for me in the future. I'll check whether that issue is already filed in the BOINC bug tracker.

Om mani padme hum.

Filipe
Joined: 10 Mar 05
Posts: 186
Credit: 415545483
RAC: 197411

Try freeing another core by setting it to 75%.

You will have two cores for feeding the GPU and running other applications (Windows, Firefox, antivirus), and four cores running CPU jobs.

I believe this could improve things a lot.

Michael Hoffmann
Joined: 31 Oct 10
Posts: 32
Credit: 31031260
RAC: 0

Found out that I had made a mistake in the settings. Now that it's corrected, it works fine at 80%.
Thanks a lot again.

Still, I created a ticket in the bug tracker. Maybe one day the system will make these adjustments for me.

Om mani padme hum.

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 1

The thing here is that it's the way the Einstein GPU application works that makes it take so much CPU. In other projects, the GPU application does most, if not all, of the calculations on the GPU. At Einstein, only some of the calculations are done on the GPU; most are still done on the CPU, which is why it takes so much CPU and CPU time.

Having BOINC automatically stop using one or more cores would be catastrophic for some setups. What if someone has four dual GPUs in his system on a 4-core CPU? If each GPU takes half a CPU, this person's BOINC would automatically refuse to run any CPU work at all.

But it's worse than that.
Project developers also overestimate how much CPU their GPU application actually takes, so it can be that an application really only needs 0.05 CPU at peak. Yet, since BOINC can't deactivate a twentieth of a CPU, it would have to deactivate a whole core for that application. And what happens when there are two of these GPUs that each take 0.05 CPU? As a user, you can decide to free just one CPU core; an automatic BOINC might insist on freeing two.

Of course there are ways to automate everything, but would we want to? Wouldn't we, the users of BOINC, want some control over the application and decide how much it can take? With automation come more bugs. With automation comes the request for an option to disable it.
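
For those who do want that kind of manual control, there is already a way (assuming a reasonably recent client): the ncpus option in cc_config.xml in the BOINC data directory tells the client to act as if the machine had fewer processors. A rough sketch for a six-core machine where one core should stay free:

<cc_config>
  <options>
    <!-- pretend the machine has 5 CPUs instead of 6 -->
    <ncpus>5</ncpus>
  </options>
</cc_config>

The client reads it at startup, or when you tell the Manager to re-read the config files, so there is no per-task juggling involved.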

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118632700629
RAC: 18345503

Quote:
According to BOINC, the GPU tasks use 0.5 CPUs and one GPU.


That applies only to tasks running on AMD GPUs using OpenCL. For nvidia GPUs using CUDA, tasks need 0.2 CPUs and 1 GPU.

This may change over time as the apps are optimised to have more of the work done on the GPU and less on the CPU. The biggest factor may well be the ongoing developments in both OpenCL and CUDA. As an example, nine months ago I bought an nvidia GTX650 and an AMD HD7770 (almost equal in price). At that time, the GTX650 was 20% faster than the HD7770 (running two simultaneous GPU tasks on each). Three months later, AMD released optimised drivers which brought the 7770 almost level with the 650 (around 3% behind).

There have been further improvements in OpenCL since then. With the release of the BRP5 (PAS) app, the 7770 is now at least 5% in front. I'm guessing that the BRP5 app may have been built against newer OpenCL libraries which have been further optimised. I don't think (but I might be wrong) that the app itself has changed all that much.

AMD's OpenCL implementation is well known to have been 'underdeveloped' when compared to nvidia's CUDA. It would appear that they are catching up progressively so it may well be that future apps built against future versions of the OpenCL libraries will require less of a CPU to function efficiently. AMD's hardware has always been well regarded so software improvements should be able to extract more performance in the future.

Quote:
However, normal CPU tasks still use the full number of CPUs available; in my case there are six CPU tasks running in parallel with the GPU task.


Which is precisely why the GPU task will be adversely affected.

Quote:
I noticed that this makes the GPU task very slow. The speed increases if you manually limit the system to five CPU tasks.


I don't know why you seem to regard this limiting to five as undesirable and 'manual'. You are making a setting change which 'automatically' gives you better performance. You could make it even more 'automatic' with another setting change that allows your GPU to crunch two BRP5 tasks simultaneously. If you did that, you could leave your CPU setting on 100% and the two GPU tasks would 'automatically' exclude one CPU core from being used. The real benefit is that you would most likely get two BRP5 tasks crunched in somewhat less than double the 'single task' time, meaning that your GPU is being used more efficiently. It's well worth trying, anyway.

Quote:
But I am not willing to manage tasks by hand just because of this GPU issue.
Is there anything that can be done about that?


Look in your E@H project prefs for "Utilisation factor of BRP apps" and set it to 0.5 to run two tasks in parallel. Set the CPU usage to 100%. Enjoy.
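
If you'd rather keep it local instead of using the website preference, recent BOINC clients (7.0.40 or later, from memory) also accept an app_config.xml in the Einstein project directory that does the same job. Something along these lines should work; note that the app name below is only an example and has to match the actual BRP app name shown on your host:

<app_config>
  <app>
    <name>einsteinbinary_BRP5</name>
    <gpu_versions>
      <!-- 0.5 GPUs per task, i.e. two tasks share the GPU -->
      <gpu_usage>0.5</gpu_usage>
      <!-- 0.5 CPUs reserved per task, so the two tasks together free one core -->
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

The effect is the same as the utilisation factor of 0.5, but the setting lives on the machine rather than in the web prefs.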

EDIT: If that actually seems to give worse performance, try excluding another core as well. Maybe your particular GPU might need more CPU support than the newer 7xxx series cards. Don't be frightened to experiment to find optimum conditions.

Cheers,
Gary.

Michael Hoffmann
Joined: 31 Oct 10
Posts: 32
Credit: 31031260
RAC: 0

Quote:

Look in your E@H project prefs for "Utilisation factor of BRP apps" and set it to 0.5 to run two tasks in parallel. Set the CPU usage to 100%. Enjoy.

EDIT: If that actually seems to give worse performance, try excluding another core as well. Maybe your particular GPU might need more CPU support than the newer 7xxx series cards. Don't be frightened to experiment to find optimum conditions.

Well, it sounds nice in theory, but it did not work and I am tired of trying different settings. The 80% method was close enough, so I'll go with that one.

Om mani padme hum.

5pot
Joined: 8 Apr 12
Posts: 107
Credit: 7577619
RAC: 0

Gary, while they are catching up, don't forget that your CUDA app is still built against CUDA 3.2 because of the bug in your app that prevents newer CUDA versions from being used. I did read that the bug appears to be gone in the CUDA 5.5 version, so here's hoping we will see a HUGE improvement.

While they are entirely different apps, GPUgrid saw over a 40% speed up in their apps when switching to CUDA 4. Food for thought :)
