The benefit of lower gpu utilization is lower power/heat.
That's a very inefficient way to throttle power usage, because you're still running the GPU in "full throttle" mode (it doesn't know you want it throttled, and assumes you want the task finished as quickly as possible). And the calculation breaks down when the pauses where the single WU has to wait for something are too short for the GPU to power down.
If you load your GPU higher with multiple WUs, but compensate by reducing the power target (this should also be possible for AMDs), it will reduce clock speed and voltage. The latter gives you better energy efficiency.
This is independent of "single vs. triple GPU". And I wrote it because I think your config with just 1 WU per GPU exaggerates the PCIe differences (due to the worse load balancing / micro pauses already mentioned).
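For anyone who hasn't set this up: running multiple WUs per GPU is done with an app_config.xml in the project directory. A minimal sketch is below — the app name is my assumption (the Perseus Arm search is the BRP5 app, I believe); check the `<app>` entries in client_state.xml for the exact name on your host.

```xml
<!-- app_config.xml, placed in the project directory, e.g. projects/einstein.phys.uwm.edu/ -->
<!-- app name is an assumption: verify it against client_state.xml -->
<app_config>
  <app>
    <name>einsteinbinary_BRP5</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>  <!-- 0.5 = two WUs share one GPU; use 0.33 for three -->
      <cpu_usage>0.5</cpu_usage>  <!-- CPU fraction reserved per WU -->
    </gpu_versions>
  </app>
</app_config>
```

The client picks the file up on a "Read config files" command or a restart.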
BTW: the top hosts with 2 Tahitis (smaller than your Hawaiis) achieve about 120k RAC per GPU, using i7 4770K as hosts, i.e. with 8x PCIe 3 or with a PLX.
MrS
How does one achieve those outputs for the AMD cards? I just bought a Gigabyte 7970 OC; the GPU clock is 1000 MHz. The host is a 2500K running at 4.3 GHz. The slot is 16x PCIe 2. I am running 2 Perseus Arm WUs and each finishes in 90 minutes. Is there anything I can do to bring up the output from 87k (calculated based on elapsed time and credit)? I ran Arecibo tasks before and was getting even lower RAC (78k). Is PCIe 2 limiting me, or running Windows? I rolled back the latest driver to an earlier version.
http://einsteinathome.org/host/11685226
If the goal was efficiency, I don't think I would have got the most power hungry gpus around. Heat is somewhat important as too much causes failures. But let's look at some numbers:
with one wu my system is using 540w, the 295x2 runs at 52c and the 290x runs at 68c
with two wu the system uses 585w, the 295x2 runs at 53c and the 290x runs at 70c. the power increase isn't very large, and some of that would be due to the cpu working about 20% harder to feed double the gpu wus.
so i'm not one to tell anyone how to run their stuff, but this is my first time running a video card cooled by a fish pump and reading horror stories about pump failures, leaks, and automatic throttling down due to overheating has led me to be just a little bit conservative considering the coin i just dropped. none of the top computers run more than two gpus on a 16 lane platform so let's just say i was trying to see how much blood i could squeeze from a rock. my 295x2 only has 85% gpu utilization with two wus so it could be pushed harder as 53c is not a lot of heat.
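For what it's worth, the wattages above make the efficiency point from earlier concrete. A quick energy-per-WU comparison — the 1.9x throughput factor is my assumption (running 2 WUs won't quite double output, since the GPUs weren't fully loaded either way):

```python
# system wall power (W) from the post above; throughput in WUs completed per hour
def joules_per_wu(watts, wus_per_hour):
    """Energy cost of one WU at a given completion rate."""
    return watts * 3600 / wus_per_hour

# hypothetical rates: 1.0 WU/h at 1-up, ~1.9 WU/h at 2-up (assumed, not measured)
one_up = joules_per_wu(540, 1.0)
two_up = joules_per_wu(585, 1.9)
print(f"energy per WU drops by about {1 - two_up / one_up:.0%}")  # → about 43%
```

So even with conservative assumptions, the 45 W increase buys a large drop in energy spent per result.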
my lazy math says you should be getting more than 100k on that. pcie2 is slower than pcie3 but you could try going to 3 or 4 wu
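The lazy math can be written out. Using the numbers from the question above (2 concurrent WUs, 90 minutes each), the per-WU credit is back-calculated from the reported 87k/day — treat it as illustrative, not the official credit value:

```python
def daily_credit(concurrent_wus, minutes_per_wu, credit_per_wu):
    """Estimated credit/day: a batch of concurrent_wus finishes every minutes_per_wu."""
    batches_per_day = 24 * 60 / minutes_per_wu
    return batches_per_day * concurrent_wus * credit_per_wu

# per-WU credit implied by the reported 87k/day at 2x90min (assumption)
credit = 87_000 / daily_credit(2, 90, 1)
print(round(credit))  # → 2719

# going 2->3 WUs only pays off if runtime grows by less than 50%;
# 3 WUs at 135 min each is exactly break-even with 2 at 90 min
print(daily_credit(3, 135, credit))
```

So whether 3 or 4 at a time helps depends entirely on how much the per-WU runtime stretches.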
I will try that now. My version of the card is a terrible overclocker. After raising the mem clock to 1025 and the gpu clock to 1050, it spit out nothing but 0-second workunits that failed with a compute error. My Nvidia cards behave differently: they crunch the whole WU and then come up with errors.
If it were me, I wouldn't worry so much about trying to overclock. Some will even downclock to stock if it reduces errors. I would just use GPU-Z to check average temperature and GPU usage.
so on my 290x it was 81% gpu usage on one wu and 97% on two wu so going to three wu might not help much more
but on my 295x2 which is bottlenecked more on the pcie it's 75% on one wu and 85% on two wu so going to three or four wu might help, but at the same time i only have four cpu cores so i have to keep an eye on that too
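A rough way to guess whether another concurrent WU is worth it, sketched from the utilization numbers above — this assumes throughput scales roughly with GPU usage, which ignores PCIe and CPU feeding effects, so treat the results as upper bounds:

```python
def est_gain(util_now, util_ceiling=0.97):
    """Upper-bound throughput gain if an extra WU pushed GPU usage to util_ceiling."""
    return util_ceiling / util_now - 1

print(f"290x at 97% with 2 WUs: at most ~{est_gain(0.97):.0%} from a 3rd")   # → ~0%
print(f"295x2 at 85% with 2 WUs: at most ~{est_gain(0.85):.0%} from a 3rd")  # → ~14%
```

The 0.97 ceiling is an assumption (cards rarely report a sustained 100%); the point is just that the 295x2 has headroom and the 290x basically doesn't.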
I am going to go with 3 Perseus Arm WUs at a time. Because of the performance penalty of PCIe 2, the GPU was only loaded to 84% with 2 WUs. This increased to 93% with 3, and gave me 9% more calculated RAC. That's not a lot, but it puts me right at 100k RAC. I have two CPU processes running; reducing to 1 did not seem to have any effect on the times.
I'm going to run just one wu at a time, because for some reason I have a lot of invalids and I never had that problem before. Or maybe the driver is the problem.
Have you tried turning the clock rate down? (core or memory or both?)
I noticed that on both your hosts the majority of the Perseus jobs listed in the task list with "validate error" show on the task page outcome: Validate error (58:00111010)
That specific outcome first showed up on my GTX 970 during core clock overclocking experiments today, never having shown up at all in months of operation of five cards on three hosts. Of course the similarity could be a coincidence, but turning down the clock(s) would be quickly diagnostic.
Even if you think yourself not overclocked this might be worth a try. I currently have two of my five cards (both are GTX 660s, as it happens) slightly underclocked.
Using Nvidia cards in the past, I've always been able to use Precision X to change clocks, but Catalyst isn't allowing my changes to stick, so I will stay with one WU per GPU to see if the invalids go away.