GPU problem

xemanu
Joined: 27 Nov 16
Posts: 3
Credit: 85110650
RAC: 0
Topic 205554

Hello all,

I have three computers in this project. According to my configuration on the website, all three were running two GPU tasks simultaneously (1 CPU + 0.5 NVIDIA GPUs per task).

But since a few days ago, one of the computers runs only one GPU task at a time, while the other two continue to run two correctly. I haven't made any changes, so I can't understand what is happening.

The graphics card (NVIDIA GeForce GTX 1080) is new, with three months of use. I have run several benchmarks, and it is in perfect condition, with very high performance in all tests.

The performance drop of this computer is very pronounced in the statistics graph. The GPU load reported by GPU-Z has dropped from a constant 98-99% to 40-70%.

What may be the cause of this change? Is there a solution?

Thank you.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

That host also seems to be crunching Multi-Directed Continuous Gravitational Wave search CV v1.00 (AVX) tasks. Here's my theory:

The host has downloaded too many of those CPU tasks. Now it thinks it won't be able to finish all of them before the deadline. The host is in panic, trying to finish the CPU tasks in time. That's why it has cut off one 'side' of the GPU work, to be able to run one more CPU task in parallel.

How many CPU tasks is that host running? What is your work cache setting (how many days)?

xemanu
Joined: 27 Nov 16
Posts: 3
Credit: 85110650
RAC: 0

The host is running 14 Multi-Directed Continuous Gravitational Wave search CV v1.00 (AVX) CPU tasks, and the work cache setting is 7 days. There are 407 tasks waiting to start, with a deadline of 02/24. Too many tasks, perhaps.

The CPU is an Intel Core i7-6950X (10 cores, 20 threads).

If I understand your answer correctly, would reducing my cache setting to, e.g., 3 days improve the performance of the GPU?

Thank you very much for your answer.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Those CPU tasks have taken about 6 to 12 hours each. Let's optimistically say they all finish in 7 hours on average. 407 tasks divided among 14 CPU threads is about 29 tasks per thread. 29 x 7 hours is 203 hours, which is over 8 days. There's no chance that pile of CPU work will finish before the deadline.
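
The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the 7-hour average is the optimistic assumption stated above, not a measured value. Without rounding to 29 tasks per thread, the result comes out at 203.5 hours.

```python
def hours_to_clear_queue(pending_tasks, cpu_threads, hours_per_task):
    """Rough wall-clock hours to finish a queue when each thread runs one task at a time."""
    tasks_per_thread = pending_tasks / cpu_threads
    return tasks_per_thread * hours_per_task

# Figures from this thread: 407 pending tasks, 14 CPU threads, ~7 hours per task (assumed average).
hours = hours_to_clear_queue(pending_tasks=407, cpu_threads=14, hours_per_task=7)
days = hours / 24
print(f"{hours:.1f} hours, about {days:.1f} days")  # 203.5 hours, about 8.5 days
```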

Lowering the work cache at this point won't instantly help in getting the GPU to start running 2x.

I suggest you disable CPU work for that host in the project preference settings. Then make sure new work is not allowed for Einstein and click Update for it in BOINC Manager. Then abort about 300 of those CPU tasks on that host. Then allow new work again (with the work cache still at 7 days). The host should receive only GPU tasks and should start running the GPU 2X.
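
To get a feeling for how many tasks are worth aborting, one can estimate how many could still finish before the deadline. The sketch below is purely illustrative: the function name is hypothetical, and the `days_left` value of 2 is an assumption (the thread does not say how much time remained). The 7 hours per task is the optimistic average used earlier.

```python
import math

def tasks_to_abort(pending_tasks, cpu_threads, hours_per_task, days_left):
    """Estimate how many queued tasks cannot finish before the deadline."""
    # Whole tasks a single thread can complete in the time remaining.
    per_thread = math.floor(days_left * 24 / hours_per_task)
    can_finish = min(pending_tasks, per_thread * cpu_threads)
    return pending_tasks - can_finish

# days_left=2 is a made-up value for illustration only.
excess = tasks_to_abort(pending_tasks=407, cpu_threads=14, hours_per_task=7, days_left=2)
print(excess)  # 323
```

With a different amount of time remaining the number changes, but it stays in the same ballpark as the roughly 300 tasks suggested above.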

The server estimates the completion times for those CV v1.00 tasks very inaccurately. That's why the problem arises in the first place.

The easiest way to avoid that problem is to keep CPU tasks disabled most of the time. Then occasionally set the work cache very low (start with 1 day), allow CPU work in the project settings, save, and click Update; the host will receive a batch of them. If the amount was OK, disable CPU work again. The rest of the time you can keep the work cache bigger, because then it applies only to GPU work.

archae86
Joined: 6 Dec 05
Posts: 3163
Credit: 7329401687
RAC: 2308944

Richie_9 wrote:
Easiest way to avoid that problem ...

The even easier way is to set a much lower cache setting and just leave it alone. I suggest 0.25 + 0.1 days.

On the rare occasions when the Einstein servers go down or some other work stoppage causes trouble, this may leave you out of work for a little while, but it will let you cruise along unaffected by a host of issues the rest of the time, with no need for micromanagement.
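
For a rough sense of scale, the sketch below compares how much CPU work each cache setting corresponds to on the host discussed in this thread. It assumes, as a simplification, that the client buffers about (minimum + additional) days of work for every running thread and that runtime estimates are accurate. The function name is hypothetical, the "additional" value of 0 for the 7-day case is an assumption, and the real BOINC work-fetch logic is more involved.

```python
def approx_buffered_tasks(min_days, extra_days, cpu_threads, hours_per_task):
    """Very rough count of tasks needed to fill the work cache on all threads."""
    buffer_hours = (min_days + extra_days) * 24
    return cpu_threads * buffer_hours / hours_per_task

# 14 threads and ~7 hours per task, as discussed earlier in the thread.
large = approx_buffered_tasks(7, 0, cpu_threads=14, hours_per_task=7)       # 7-day cache
small = approx_buffered_tasks(0.25, 0.1, cpu_threads=14, hours_per_task=7)  # 0.25 + 0.1 days
print(round(large), round(small))  # 336 17
```

Under these assumptions the small cache holds a few dozen hours of work at most, so a bad runtime estimate has far less room to pile up tasks that cannot meet their deadline.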

xemanu
Joined: 27 Nov 16
Posts: 3
Credit: 85110650
RAC: 0

My delay in replying is due to the time difference.

I aborted most of the tasks with the 02/24 deadline and reduced my cache setting to 0.50 + 0.1. Then I updated the project in BOINC Manager. That alone got the GPU running 2X again. Since it is already working correctly, I haven't disabled CPU tasks.

I think it's already fixed.

Thank you very much for your help.

Heinz
Joined: 28 Nov 11
Posts: 1
Credit: 4575107
RAC: 0

I actually have no chance to get on the four scripts of einstein@home with your Boinc Manager. Would be glad to get your support on it.

Thank you in advance

Heinz

 

solling2
Joined: 20 Nov 14
Posts: 219
Credit: 1578414609
RAC: 14523

Heinz_4 wrote:
I actually have no chance to get on the four scripts of einstein@home with your Boinc Manager.

Obviously your computer doesn't have a GPU. Crunching with the CPU seems to run without problems, just fine.
