Nvidia GTX 970

Manuel Palacios
Manuel Palacios
Joined: 18 Jan 05
Posts: 40
Credit: 224259334
RAC: 0
Topic 198018

I am currently running two EVGA SC Nvidia GTX 970s with an Intel Core i5-4690K and 2x4GB 2133MHz RAM. I am wondering how many concurrent tasks other GPU users are running on this card. I'm currently running 0.2 CPU + 0.5 GPU and am seeing ~87% GPU usage and ~80% memory controller (MCU) usage. Is 0.33 advisable for this setup?
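For anyone wanting to try the same shares, these per-task fractions are normally set in an app_config.xml in the Einstein@Home project directory. A minimal sketch; the app name below is an assumption (check your client_state.xml or the BOINC event log for the actual name on your host):

```xml
<!-- Sketch of an app_config.xml for 2 tasks per GPU (gpu_usage 0.5).
     "einsteinbinary_BRP6" is an assumed app name for Parkes PMPS;
     verify it against your own client before using this. -->
<app_config>
  <app>
    <name>einsteinbinary_BRP6</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>  <!-- 0.33 would run 3 tasks per GPU -->
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

After editing the file, re-read the config from the BOINC Manager (Options menu) or restart the client for it to take effect.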

Also, I know about the issue where these cards compute in the P2 power state, which limits the card's memory speed to 3004MHz, whereas in P0 the memory runs at its rated speed of 3505MHz.

Can anyone enlighten me on how to get that to run stably? When I use Nvidia Inspector to set the GDDR5 speed to 3505MHz under P2, my machine shuts down and reboots.

Thanks!

mikey
mikey
Joined: 22 Jan 05
Posts: 12680
Credit: 1839083349
RAC: 3924

Nvidia GTX 970

Quote:

I am currently running two EVGA SC Nvidia GTX 970s with an Intel Core i5-4690K and 2x4GB 2133MHz RAM. I am wondering how many concurrent tasks other GPU users are running on this card. I'm currently running 0.2 CPU + 0.5 GPU and am seeing ~87% GPU usage and ~80% memory controller (MCU) usage. Is 0.33 advisable for this setup?

Thanks!

Yes, I would at least try it. I am running a GTX 760 with 2 units at once just fine, and I would think that at only 87% usage you have enough headroom left. One thing to keep an eye on, though: leave a CPU core free for each GPU if you do, at least in the beginning, to ensure each card is well fed and running at top speed.

I can't help you with the other question though, sorry.

archae86
archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220524931
RAC: 973070

I've been running a GTX 970

I've been running a GTX 970 here for a while at 3X. I don't recall my testing details, and I am not sure whether I tried 4X, but I am sure 3X was more productive than 2X.

But all my testing was on Perseus with the v1.39 code. The current work of comparable interest is Parkes PMPS. More importantly, there is a beta application out (currently version 1.52) which has dramatically improved the Einstein productivity of my GTX 970, but I have not redone any multiplicity-benefit testing. Your choice of 2X is probably not greatly different in output from the optimum.

If productivity is your main interest, then run, do not walk, to your Einstein account configuration settings for the location (aka venue) that host is in, and enable beta test applications. You have chosen to hide your computers, so I can't just click on your task list to see whether you have already done so. I suggest you unhide them when asking for assistance here.

Manuel Palacios
Manuel Palacios
Joined: 18 Jan 05
Posts: 40
Credit: 224259334
RAC: 0

Mikey: thank you for your

Mikey: thank you for your answer. At 87% GPU usage I am not at all sure that a third task would be any more productive, but there is nevertheless some headroom left in the cards.

Archae: I have made the computers visible, so you should be able to see them now, and thus also how long they take to complete tasks. I am indeed running the beta Parkes 1.52 app. I run my Core i5-4690K at 3.9GHz; it crunches PrimeGrid Seventeen or Bust tasks on 3 cores, and 1 core is left free to feed the GPUs.

Finally, I'm not sure what the correlation is between GPU usage percentage and how efficiently tasks process; in other words, whether the step from 0.5 to 0.33 would positively or negatively affect output under the beta 1.52 app.

Thanks!

Manuel Palacios
Manuel Palacios
Joined: 18 Jan 05
Posts: 40
Credit: 224259334
RAC: 0

RE: I've been running a GTX

Quote:

I've been running a GTX 970 here for a while at 3X. I don't recall my testing details, and I am not sure whether I tried 4X, but I am sure 3X was more productive than 2X.

But all my testing was on Perseus with the v1.39 code. The current work of comparable interest is Parkes PMPS. More importantly, there is a beta application out (currently version 1.52) which has dramatically improved the Einstein productivity of my GTX 970, but I have not redone any multiplicity-benefit testing. Your choice of 2X is probably not greatly different in output from the optimum.

If productivity is your main interest, then run, do not walk, to your Einstein account configuration settings for the location (aka venue) that host is in, and enable beta test applications. You have chosen to hide your computers, so I can't just click on your task list to see whether you have already done so. I suggest you unhide them when asking for assistance here.

I was also able to dig up a thread concerning the P-state issue with the 970. I saw that you were able to get it to run stably at 3505MHz+ in P2, and even to clock the memory higher than that, after some trial and error with Nvidia Inspector. Is there any chance you can help me with that process as well? Each time I try to configure the cards to run at 3505MHz (the stock speed rating), my PC will crunch for a few hours before ultimately shutting down due to an error.

Thanks in advance!

mikey
mikey
Joined: 22 Jan 05
Posts: 12680
Credit: 1839083349
RAC: 3924

RE: Mikey: thank you for

Quote:

Mikey: thank you for your answer. At 87% GPU usage I am not at all sure that a third task would be any more productive, but there is nevertheless some headroom left in the cards.

Archae: I have made the computers visible, so you should be able to see them now, and thus also how long they take to complete tasks. I am indeed running the beta Parkes 1.52 app. I run my Core i5-4690K at 3.9GHz; it crunches PrimeGrid Seventeen or Bust tasks on 3 cores, and 1 core is left free to feed the GPUs.

Finally, I'm not sure what the correlation is between GPU usage percentage and how efficiently tasks process; in other words, whether the step from 0.5 to 0.33 would positively or negatively affect output under the beta 1.52 app.

Thanks!

The only real way to know is to try it; what works perfectly on my machine may fail spectacularly on yours, or vice versa. Your CPU, RAM, GPU, the other projects you run, and so on are all different, so until you actually try it you will always be guessing. My suggestion would be to write down how long a unit takes on average over, say, 20 units now, then switch to one more unit at a time on the GPU and let it run for a while. You should be able to tell within the first 10 units whether they are faster, significantly slower, or about the same per unit. Faster or about the same is good; significantly slower is bad. Somewhat slower per unit is expected and fine, as long as the extra time isn't close to the time a unit takes on its own the old way. You are looking for an RAC jump, not a drop!
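The comparison described above is about throughput rather than per-unit time: at a higher multiplicity each unit takes longer, but more units finish at once. A small sketch of the arithmetic, with made-up elapsed times purely for illustration:

```python
def tasks_per_hour(multiplicity, avg_elapsed_seconds):
    """Throughput of a GPU running `multiplicity` tasks at once,
    each finishing in avg_elapsed_seconds on average."""
    return multiplicity * 3600.0 / avg_elapsed_seconds

# Hypothetical averages over ~20 units at each setting:
two_x = tasks_per_hour(2, 3300)    # 2 at a time, 55 min each -> ~2.18 tasks/hour
three_x = tasks_per_hour(3, 4500)  # 3 at a time, 75 min each -> 2.40 tasks/hour

# Here 3X wins despite each individual unit running slower.
```

The same logic applies in reverse: if per-unit time at the higher multiplicity grows by more than the multiplicity ratio, throughput (and eventually RAC) drops.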

archae86
archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220524931
RAC: 973070

RE: I saw that you were

Quote:
I saw that you were able to get it to run stably at 3505MHz+ in P2, and even to clock the memory higher than that, after some trial and error with Nvidia Inspector. Is there any chance you can help me with that process as well? Each time I try to configure the cards to run at 3505MHz (the stock speed rating), my PC will crunch for a few hours before ultimately shutting down due to an error.


My current GPU clock rate is a little lower than the one I used for a couple of months earlier this year when I was running Perseus on v1.39. I got five validate errors on this host on Parkes work running the 1.50 first-beta and 1.52 second-beta code, and tentatively concluded that my particular GTX 970 would not run error-free quite as fast on the new code as on the old.

I just now checked (and it has been a half dozen days since my most recent validate error) and, as logged by GPU-Z, my current GTX 970 stats are:
[pre]
multiplicity: 3X
Core clock 1427
GPU Mem clock 1949
Power 78%
GPU load 92%
Mem con load 84%
VDDC 1.20[/pre]
I show here short-term (roughly two-minute) averages, as computed by GPU-Z, for the parameters that displayed obvious fluctuation (power and loads). They may differ somewhat from a proper average over hours, though not by much, I suspect.

I should mention that different monitoring applications will report different memory clock rates for the same operating condition. For example, Nvidia Inspector describes my current operating condition as a GPU memory clock rate of 3899.

I don't have any special advice on ways and means to run faster. I suspect the GPU core clock's maximum error-free speed may be increased by higher voltage, if your control application gives you that option and the board actually heeds the command. Of course, that will raise power consumption. Other than that, the choice of code and operating conditions may matter. You may find that running a different application allows a higher maximum GPU clock rate than running 1.52 Parkes, though I'd personally not want to do that. I've made no observations, nor heard reports, regarding application impact on the maximum GPU memory clock rate.

Others have given you good advice, I think, on best multiplicity--try it and see. While versions 1.47/1.50 had such a range of elapsed times that a small trial would tell you little, the elapsed times are much more consistent in 1.52. In particular, a large fraction of the WUs fall into a base population, and when all of the "parallel" tasks on the GPU belong to that base population, the elapsed time and CPU time fall in a very tight distribution. So you can make a good comparison of the improvement or harm caused by a different multiplicity by monitoring the relative elapsed times of each simultaneous batch of work: when all are about the same, you probably have base-population units, and you can compare performance against the base population at a different multiplicity with simple arithmetic.

Of course, a really large sample will do a better job of predicting long-term performance, but I don't think we know how representative the work distributed in any given hour, day, or week is of the long-term Parkes work distribution, with regard to whatever causes the data-dependent variation in execution performance. Failing that, no one can tell you how well dispersed in distribution time your work-unit sample needs to be, nor how large it needs to be.

I'd say setting the maximum core clock and memory clock rates is somewhat similar--the best advice is "try and see." There is no good reason to expect that someone else's sample of a GTX 970 will top out at the same maximum clock rates as your sample, nor that the failure syndrome when you go faster than it can handle will be the same.

Good luck.

Manuel Palacios
Manuel Palacios
Joined: 18 Jan 05
Posts: 40
Credit: 224259334
RAC: 0

RE: I just now checked

Quote:


I just now checked (and it has been a half dozen days since my most recent validate error) and, as logged by GPU-Z, my current GTX 970 stats are:
[pre]
multiplicity: 3X
Core clock 1427
GPU Mem clock 1949
Power 78%
GPU load 92%
Mem con load 84%
VDDC 1.20[/pre]
I show here short-term (roughly two-minute) averages, as computed by GPU-Z, for the parameters that displayed obvious fluctuation (power and loads). They may differ somewhat from a proper average over hours, though not by much, I suspect.

I should mention that different monitoring applications will report different memory clock rates for the same operating condition. For example, Nvidia Inspector describes my current operating condition as a GPU memory clock rate of 3899.

Good luck.

http://img.techpowerup.org/150320/nvidia_20150320_132702.png - P0 MEM Timings
http://img.techpowerup.org/150320/nvidia_20150320_132726.png - P2 Mem Timings
http://imgur.com/8Hr7Tdy - GPU-Z sensors screenshot

OK, please refer to the above images, which help me elucidate exactly what problem I am facing and why I took an interest in your particular case, archae. As you can see, and can validate for yourself, the GTX 970 has four different power states: P8, P5, P2, and P0, right?

Good. So then, every time I open BOINC, the card sets itself to the P2 power state, and because of this my memory clock will go no higher than 3005MHz.

Now, under gaming circumstances etc., the card(s) operate in P0, and thus the memory clock goes up to the stock rate of 3505MHz.

I am not interested in overclocking the memory of my cards past their stock rating. I just want the cards to run at 3505MHz as they normally do in P0; but for some reason, under BOINC the card runs in P2, which drops the memory clock by 500MHz.

This is a substantial difference, and fixing it would improve crunch times by a good percentage.

Hopefully that helps put the problem into perspective and helps you help me. I can't be the only one with a 970, or even a Maxwell card, running into this issue on E@H, and I think it's worthwhile to investigate.

Thank you!

archae86
archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220524931
RAC: 973070

RE: Hopefully that helps

Quote:
Hopefully that helps put the problem into perspective


Well, that is not news. The desire to raise memory clock in the P2 state was the whole reason I downloaded Nvidia Inspector pursuant to suggestions on this forum.

How are you handling fan speed, and what temperature is your GPU reporting shortly before you have system failure?

For reasons particular to a non-performance aspect of my operations, I tend to run my GPU fans at fixed rates. For my GTX970, I currently intend to run at 55%, though I am not sure that is properly in place.

While it is highly likely the GPU will run correctly at a somewhat higher core clock rate if kept a bit cooler, I don't know what influence the GPU fan speed and reported temperature have on survivable GPU mem clock rate, but doubt it is zero.

Manuel Palacios
Manuel Palacios
Joined: 18 Jan 05
Posts: 40
Credit: 224259334
RAC: 0

RE: RE: Hopefully that

Quote:
Quote:
Hopefully that helps put the problem into perspective

Well, that is not news. The desire to raise memory clock in the P2 state was the whole reason I downloaded Nvidia Inspector pursuant to suggestions on this forum.

How are you handling fan speed, and what temperature is your GPU reporting shortly before you have system failure?

For reasons particular to a non-performance aspect of my operations, I tend to run my GPU fans at fixed rates. For my GTX970, I currently intend to run at 55%, though I am not sure that is properly in place.

While it is highly likely the GPU will run correctly at a somewhat higher core clock rate if kept a bit cooler, I don't know what influence the GPU fan speed and reported temperature have on survivable GPU mem clock rate, but doubt it is zero.

I just edited my post above, and you can see the sensors tab in GPU-Z. I run both my 970s at 35% fan speed; right now they're crunching some long Parkes WUs that keep them at only ~70% usage, with regular fluctuations anywhere from ~50% to ~75%. I have a very well-ventilated case, and the GPUs have never run higher than 54C, with the normal operating range being the mid-to-high 40s. Right now they are sitting at 47C and 43C.

The times that I have manually clocked the memory to 3505MHz in P2 using Nvidia Inspector, E@H has run for hours overnight, and ultimately my computer reboots sometime while I'm asleep. Mind you, these two cards have not produced any invalids or errors, even after the reboots. I just log back in and the WUs continue crunching normally.

Keith Myers
Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18712027340
RAC: 6356850

I use NvidiaInspector to set

I use Nvidia Inspector to set the P2 memory clock +100MHz, so that the memory runs at 3605MHz on my 970s. I use SIV to hold the maximum GPU temperature to 65C by having SIV modulate the fan speeds. I run 6 CPU tasks and 3 tasks per GPU, which keeps the system at around 92% CPU utilization and around 99% GPU utilization, with no errors, downclocks, or reboots. I recommend SIV for controlling the GPUs and also for system monitoring.
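For reference, an offset like this can be applied from Nvidia Inspector's command-line interface, e.g. from a startup batch file, so it survives reboots. A sketch only; the flag syntax and the P-state index below are assumptions from memory, so verify them against your Inspector version's help output before relying on this:

```bat
rem Sketch: apply a +100 MHz memory clock offset to the P2 state on GPU 0.
rem The -setMemoryClockOffset:[gpu],[pstate],[offsetMHz] syntax is an
rem assumption; check your Nvidia Inspector version before using it.
nvidiaInspector.exe -setMemoryClockOffset:0,2,100
```

Start with a small offset and watch for validate errors before pushing further, since each card's stable maximum differs.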

Cheers, Keith

 
