While I'm mentioning the 80 limit, I think I saw another awkward consequence. As BRP work was recently much more routinely available than GW work for my CPU slots, and my host was pretty hungry after going to zero, there was a tendency to fill nearly all of the 80-task quota with BRP, leaving me with very little pending CPU work when the quota gate closed, and sometimes leaving my CPUs idle later. I'd have guessed there were separate maximum daily download quotas for CPU and GPU work, but in this instance it appeared not.
It would be interesting to try that same quota experiment at Albert, as they are testing newer server software there to support OpenCL. In theory, the newer software should support separate 'per application' quotas better - though that would mainly come into play when the quota for a single application needs to be reduced by error results.
I don't think we've ever seen that quota system working properly at SETI, two years after the new server software was deployed.
There is something there also... Now that BOINC knows it can run only one BRP task per card, it has forced SETI into panic mode... I guess the scheduler (at least in 6.10.60) is still not fully tuned to deal with wrong settings for multiple GPU tasks...
What 'count' value are you using for your SETI tasks? I think you may cause your client to have scheduling problems if you use different values (especially 1.0 and 0.5 for two projects sharing the same card).
Consider what happens if two SETI tasks start together. One finishes first: then there's half a GPU spare. Einstein isn't allowed to fit into that, so another SETI task gets started to fill the space. And another, and another. Einstein never gets a look in, because no SETI task ever runs long enough to be eligible for pre-emption under the 'Task Switch Interval' rules.
If your card has enough memory to allow two SETI tasks, or one Einstein and one SETI task, to run together, but not two Einstein tasks, you can get round that with fractional count values like 0.59 for Einstein and 0.40 for SETI - they don't have to add up to 1.
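To make the arithmetic behind those fractional values concrete, here is a minimal sketch, assuming (as a simplification, not a description of the actual client code) that a GPU task only starts when the 'count' values of everything on the card would still sum to at most 1.0:

# Hypothetical illustration of fractional 'count' values - not actual BOINC code.
# Assumption: a task starts only if the per-card sum of counts stays <= 1.0.
EINSTEIN_COUNT = 0.59
SETI_COUNT = 0.40

def fits(running_counts, candidate_count, capacity=1.0):
    """True if a task with candidate_count can join the tasks already on the card."""
    return sum(running_counts) + candidate_count <= capacity

print(fits([EINSTEIN_COUNT], EINSTEIN_COUNT))  # False: two Einstein tasks (1.18) are blocked
print(fits([EINSTEIN_COUNT], SETI_COUNT))      # True:  Einstein + SETI (0.99) can run together
print(fits([SETI_COUNT], SETI_COUNT))          # True:  two SETI tasks (0.80) can run together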
Hummm... are you sure? Not that I have comprehensive data, but what I have seen repeatedly on a host running triplex Einstein BRP GPU tasks alongside singleton SETI GPU tasks was that (if I may anthropomorphize a little) when the scheduler thought it was time to run a SETI task, the currently running triplet of Einstein tasks would finish one at a time; when the last finished, the single SETI task would run, ending with the start of a new Einstein triplet. In other words, well behaved.
By contrast, on the same host, something about the applications or my setup meant that SETI and Einstein coexisted badly. So when I allowed multiple SETI and multiple Einstein GPU instances, I'd get them running together and, for some reason, get terribly low productivity. The low-productivity state was probably a special case, not to be generalized, but I'd imagine, at least for the BOINC version I was running (6.12.34), that the behavior of stemming the flow of Einstein triplets to let a singleton SETI task in would be general.
@Damaraland
Have you tried to bench with only the 560 Ti? From my experience it is more efficient to use just one card, due to bandwidth limitations. I don't know which mainboard you have, but on mine (an Asus EVO) the bandwidth drops from PCIe x16 to PCIe x8 when using 2 GPUs. This increases the runtime of the work units by 30%. So I had a Win7 machine with a GTX 470 and a GTX 460 running 2 WUs, with a RAC of 47,000. Now, with just the 470 and 3 WUs, I have a RAC >40,000 but 160 W less power consumption.
Running one WU takes around 2200 s, and running 3 at the same time gives the equivalent of about 1300 s per WU on average.
This is consistent with the GPU utilization you can see in tools like MSI Afterburner: one WU sits around 55%, two around 80%, and three above 90%.
I also had a GTX 260 working, which was a total waste of energy compared to the GTX 470.
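As a rough back-of-the-envelope check on the runtime figures quoted above (taking the 2200 s and 1300 s values as given, and assuming each of the three concurrent WUs takes about three times the equivalent average in wall-clock time):

# Back-of-envelope throughput estimate from the quoted figures; the 3x wall-clock
# runtime for concurrent WUs is an assumption, not a measurement.
single_wu_runtime = 2200.0   # seconds for one WU running alone
equivalent_per_wu = 1300.0   # effective seconds per WU with three running at once

wall_clock_for_three = 3 * equivalent_per_wu              # roughly 3900 s, assumed
throughput_gain = single_wu_runtime / equivalent_per_wu   # more WUs per hour

print(f"Estimated throughput gain, 1 -> 3 concurrent WUs: ~{throughput_gain:.2f}x")  # ~1.69x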
In that box I run SETI and Einstein, and both are running 1 WU per GPU. The GPUs are 9500 GTs with 512 MB, so there is no gain in trying to run more than 1 WU. I did the fractional trick on other boxes, and it worked perfectly with BOINC 6.12, but it doesn't work exactly as expected with 6.10 because of the strict queue order in that version. (But I don't want to go back to 6.12 due to the impossibly long backlog times; besides any server/pipe issues, my Internet is rather bad, and everything combined leaves me out of work unless I do constant babysitting...)
What I think is that the scheduler was not aware that BRP was effectively running just 1 task per card; because of the 0.5 count it thought it had enough time to crunch everything, even though it was not able to run the second task.
In fact, after processing some SETI tasks, it was Einstein that went into panic mode...
I'll let BOINC deal with this by itself; usually it gets sorted out faster that way than when I try to help...
But it's true, when BOINC is in panic mode the tasks are started and suspended at various degrees of completion, which is not nice for my OCD :D
Can you remember whether that well-behaved pattern (the Einstein triplet draining one at a time before the single SETI task ran) happened while there were still Einstein tasks cached, ready to run? I'd expect that to happen when LTD (long term debt) inhibited work fetch until the project was dry, but there would be those who say that the intervening period (when one-third, then two-thirds, of the GPU could have been, but wasn't, running an Einstein task) represented underutilisation.
But I haven't tested the exact scenario I posited for a long time. I'm currently testing {Einstein, SETI, GPUGrid} with {0.48, 0.48, 0.51} - allowing any combination of two tasks except GPUGrid+GPUGrid. It runs, though not very efficiently - quite possibly the PCIe bus contention that Hotze mentions.
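For reference, a quick pairwise check (using the values quoted just above, and the same 'sum of counts must stay at or below 1.0' simplification as the earlier sketch):

from itertools import combinations_with_replacement

# 'count' values from the post above; a pairing fits on one card if the sum is <= 1.0.
counts = {"Einstein": 0.48, "SETI": 0.48, "GPUGrid": 0.51}

for a, b in combinations_with_replacement(sorted(counts), 2):
    total = counts[a] + counts[b]
    print(f"{a} + {b}: {total:.2f} -> {'fits' if total <= 1.0 else 'blocked'}")

# Only GPUGrid + GPUGrid (1.02) is blocked, matching the combinations described above.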
Can you remember whether this happened when there were still Einstein tasks cached, ready to run? I'd expect that to happen when LTD (long term debt) inhibited work fetch until the project was dry, but there would be those who say that the intervening period (when one-third, then two-thirds, of the GPU could be, but wasn't, running an Einstein task) represented underutilisation.
Definitely there was available Einstein work in the queue. This, as I recall it anyway, was a consistently repeating behavior. I was actually trying to take advantage of it, as I thought I'd seen an Einstein GPU task efficiency advantage for the first triplet run after a SETI GPU task. I had SETI set to a 4% work share, so this was happening something like four to twelve times a day, depending on the work content of the current SETI tasks.
Yes, some might call it underutilization, but as you pointed out, unless that is allowed there is not much of a way to honor the work-share request in this case.
If this is of interest, I can try putting the host back into that state and report any observations that might be useful. It is not happening now, as I've set SETI on that host to a 0 (i.e. fallback) work share, and I just occasionally suspend the Einstein work to grind down the queue and avoid deadline misses on the SETI side.