I'm almost positive it has to do with the way the drivers handle CUDA tasks. My other computer was also having instability issues, and it wasn't until I changed the global setting in the Nvidia Control Panel to Prefer Maximum Performance instead of Adaptive that they went away.
I have now been running my memory offset at +300 MHz under P2, meaning I'm at roughly a 3800 MHz memory clock, and have had no issues with the system rebooting and no invalids or errors as far as reported work goes.
The cards can run at full speeds and beyond for GPU compute tasks, but it takes a lot of tinkering to get them there. It's a bit of a shame. Ever since the Fermi cards offered such excellent all-round performance and Nvidia noticed they were undercutting themselves, they have gone out of their way to handicap consumers' ability to use the full capabilities of their GPUs.
Well, there are some differences: I run AMD CPUs, you run Intel, so the motherboards and chipsets will be miles apart.
I haven't had any difficulty using NVIDIA Inspector to reset the P2 state on my GPUs,
and I used NVI to make 2 cmd files, so all I need to do is run them prior to running BOINC and the P2 memory clocks are reset to 3505 & 3600 MHz respectively.
I don't O/C the cards in any other way, nor do I O/C my CPU; my rig has far too many other add-ons gobbling power and making heat, so I just run stock clocks apart from the P2 resets, and only 1 WU per GPU.
OTOH Keith also runs AMD, and does O/C both CPU & GPUs and runs multiple WUs, without any apparent problems.
But I concur, NVidia has once again shot itself in the foot by crippling its high-end cards.
Cheers,
Cliff,
Been there, Done that, Still no damm T Shirt.
More than one of us has tried to do that, and stumbled a bit.
You would do us a public service to post the exact contents of your successful cmd files.
Yes, I do understand that the NVI commands must run between the initial boot and the launch of boincmgr, and the delayed launching of boincmgr and the means to control the relative timing of boot-time command files are within my grasp. It is the actual NVI commands that have eluded me.
Thanks.
In the expanded panel of NVI there is a button, "Create Clocks Shortcuts". It creates simple and self-explanatory command-line parameters. I've lumped the settings for P0 and P2 into a single .bat, but I haven't rebooted since and haven't actually tried it, so I hesitate to post it here.
MrS
Scanning for our furry friends since Jan 2002
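For reference, the contents of such a shortcut or cmd file end up being a single call to nvidiaInspector.exe with a handful of parameters. The lines below are only a sketch of the general shape: the flag name is the one NVI's memory-clock-offset option uses, but the GPU index, P-state argument and offset values are placeholders, so the authoritative version is whatever the "Create Clocks Shortcuts" button writes into the shortcut on your own machine.

  @echo off
  rem Sketch only - copy the real parameters from the shortcut NVIDIA Inspector generates.
  rem Assumed install location of nvidiaInspector.exe:
  set NVI="C:\Tools\NVIDIA Inspector\nvidiaInspector.exe"
  rem Raise the memory clock offset on GPU 0 (arguments: gpuIndex,pStateLimit,offsetMHz -
  rem check the generated shortcut for the correct middle value).
  %NVI% -setMemoryClockOffset:0,0,300
  rem Same idea for the second card, with its own offset.
  %NVI% -setMemoryClockOffset:1,0,400

One cmd file per card (or both lines in one file) reproduces what Cliff describes: run it, then start BOINC.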
Hi MrS, That's what I
)
Hi MrS,
That's what I did :-) But I didn't make a batch file. Since BOINC auto-starts at logon under Windows, I log in, exit BOINC, then simply click on the shortcuts and use OHM [Open Hardware Monitor] to check that the P2 states have actually been changed.
Then I restart BOINC and carry on crunching.
BTW, has anyone any definite idea why running S@H shorties on one GPU would speed up E@H WUs on another GPU?
I know it happens, I've watched it, but I don't know WHY :-) And it's niggling at the back of my mind, like an itch I just can't quite reach :-/
Cheers
Cliff,
Been there, Done that, Still no damm T Shirt.
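For anyone who would rather not do that clicking by hand, the routine collapses into one cmd file along these lines. It is a sketch only: the BOINC paths are the default install locations, and the two "set P2" calls stand in for whatever the NVI shortcuts actually contain.

  @echo off
  rem Sketch of the manual routine as one script; paths and NVI calls are assumptions.
  rem 1. Ask the running BOINC client to shut down cleanly.
  "C:\Program Files\BOINC\boinccmd.exe" --quit
  rem 2. Give it a few seconds to release the GPUs.
  timeout /t 10 /nobreak
  rem 3. Apply the P2 memory clock settings (placeholders - use the parameters from
  rem    the shortcuts NVIDIA Inspector generated).
  call "C:\Scripts\set_p2_gpu0.cmd"
  call "C:\Scripts\set_p2_gpu1.cmd"
  rem 4. Restart BOINC Manager, which relaunches the client and resumes crunching.
  start "" "C:\Program Files\BOINC\boincmgr.exe"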
Humm, y'know, I've not even looked at what's in those shortcuts.
I just used NVI to create them, then used them. See my reply to MrS as to what I actually do; since I don't use any built-in timings, it's all run hands-on.
Cheers,
Cliff,
Been there, Done that, Still no damm T Shirt.
Cliff, take a look at this thread over in Seti:
http://setiathome.berkeley.edu/forum_thread.php?id=76997
Jason, one of the developers, is looking into the P2 memory speed reductions and has made some interesting observations. Basically, systems with fast CPUs don't load the video driver or the GPU apps heavily enough, so the cards decide it is beneficial to downclock to the P2 memory state. It's even more apparent with heavy utilization. He might be on to something, in my opinion.
Cheers, Keith
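A quick way to see this on one's own machine is to log the P-state, memory clock and GPU load while crunching, which nvidia-smi (installed alongside the driver, under C:\Program Files\NVIDIA Corporation\NVSMI by default) can do from a cmd file. A minimal sketch:

  @echo off
  rem Log P-state, memory clock and GPU utilisation every 5 seconds, so you can see
  rem whether the memory clock drops to the P2 value once a compute app loads the card.
  cd /d "C:\Program Files\NVIDIA Corporation\NVSMI"
  nvidia-smi --query-gpu=index,pstate,clocks.mem,utilization.gpu --format=csv -l 5 > "%USERPROFILE%\gpu_clock_log.csv"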
Hi Keith,
Well, if the GPUs are being under-used, I have just enabled 2 WUs per GPU in both E@H & S@H, and am currently running 1 WU per project on each GPU,
so GPU 0 has 1 S@H & 1 E@H task running together.
The result is that BOTH take longer to run:
E@H went from 1 h 05 min 30 s to 2 h 17 min, and S@H from approx 5 min to 15+ min.
This is running the Parkes beta app from E@H and the Lunatics app for S@H.
With the old BRP Perseus app and tasks, the E@H WUs run two-up took less than 2x the single time, i.e. overall they were faster than running them singly, one per GPU.
Guess it's just an interesting situation :-) But I still have to keep an eye on it :-/ I'm getting my kip during the day :-) and running 00:30 to 07:30
on the Economy 7 tariff to cut costs a bit.
Regards,
Cliff,
Been there, Done that, Still no damm T Shirt.
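For reference, running 2 tasks per GPU can be set either through the project's web preferences (the GPU utilization factor) or with an app_config.xml dropped into the project's data directory. The batch sketch below writes such a file; the application <name> is a placeholder to be checked against client_state.xml, and the path assumes a default BOINC data directory, so treat it as illustrative only.

  @echo off
  rem Illustrative sketch: gpu_usage 0.5 means two tasks share one GPU.
  rem Replace the <name> value with the real app name from client_state.xml.
  (
  echo ^<app_config^>
  echo   ^<app^>
  echo     ^<name^>einsteinbinary_BRP6^</name^>
  echo     ^<gpu_versions^>
  echo       ^<gpu_usage^>0.5^</gpu_usage^>
  echo       ^<cpu_usage^>0.2^</cpu_usage^>
  echo     ^</gpu_versions^>
  echo   ^</app^>
  echo ^</app_config^>
  ) > "C:\ProgramData\BOINC\projects\einstein.phys.uwm.edu\app_config.xml"
  rem Then have BOINC re-read its config files (or restart the client) to pick it up.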
Hi,
Sorry, my terminology wasn't the best: they weren't cmd files but shortcuts, and they only contained a call to NVI to set the memory to 3505 etc.
No timings were used; I run the shortcuts manually.
Regards,
Cliff
Cliff,
Been there, Done that, Still no damm T Shirt.
Cliff, well that is to be expected when running multiple WUs on a card compared to just one; completion times will be greater. I look at it as a benefit in higher RAC per day, not in minimal task completion times. I used eFMer's SetiPerformance utility to determine the highest efficiency for each card, and in my case that is 3X WUs per card. Now granted, that utility just uses Seti WUs for its metric, and MilkyWay and Einstein WUs most definitely don't behave the same as Seti.

For example, when a MilkyWay WU occupies a card along with two Seti WUs, the Seti task completion times just about double, because the MW WU takes damn near all the card's resources. They really suck up the power of the card. The only saving grace is that they last only about 1-2 minutes, and if I have met my project resource obligation they don't run all that often. The E@H WUs seem to be fairly neutral toward Seti WU completion, and their saving grace, at least for the new Parkes data, is their high credit count.

I just rebalanced project resources in the last week to give E@H more time compared to MW@H, simply because it pays better. I seem to be running the Parkes WUs at 1 hour 57 minutes on average now. I do run a 100 MHz higher memory clock than you, however, so that is likely why I complete them faster than you even though I run 3X per card compared to your 2X. The Parkes data really seems to like a high memory clock; I've seen far more benefit for those tasks than for Seti or MilkyWay tasks. My $0.02
Cheers, Keith
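To put a number on the RAC-versus-completion-time trade-off, here is a made-up illustration (the figures are hypothetical, not Keith's): if a lone task finishes in 60 minutes, the card produces 1.0 task per hour; if three tasks run together and each takes 150 minutes, the card produces 3 tasks per 2.5 hours, i.e. 1.2 tasks per hour, so throughput is up 20% even though every individual task looks much slower. That ratio, N divided by the per-task time at N, is presumably the quantity a utility like SetiPerformance is maximising when it picks the best multiplicity.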
Hi Keith,
My observation wasn't really about longer completion times per se when running doubles;
I knew those times would increase. What was of interest was an 'apparent' doubling of those times. Then I realised that BM [BOINC Manager] was the culprit, since E@H uses DCF for estimates :-( and BM borks DCF estimates; in fact, just about all estimates under BM are inaccurate :-/
The 'actual' run time of an E@H Parkes WU on the 980 running dual [1 x S@H & 1 x E@H] is 1 h 21 m 34 s. That's 2 projects per GPU :-)
And for my 970 running the same way, the time is 1 h 31 m 44 s, also running dual projects and WUs.
I manually edited the DCF values in client_state.xml down by around 20% and re-ran BOINC; the estimates then came out at about 20 minutes over the actual times, and they have been decreasing as BM sorts out its DCF estimates to approach reality :-)
Those timings are somewhat dependent on the length/duration of the S@H WUs and on individual E@H WU differences,
but they are the median so far.
A bonus is that, so far at least, temps are staying pretty low as well:
about 14.5-17.3 C on the CPU cores, 60-62 C for the 970 and 40-42 C for the 980.
As you know, I run my rig at stock clocks bar the correction of the P2 state :-)
I haven't seen the need as yet to o/c my GPUs; they seem to be doing OK :-)
Regards,
Cliff,
Been there, Done that, Still no damm T Shirt.
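For anyone wanting to repeat the DCF trick, a minimal sketch of what it involves, assuming a standard BOINC 7.x installation: each <project> block in client_state.xml carries a <duration_correction_factor> element, and the client scales its runtime estimates by that value, so cutting it by about 20% (say from 1.50 to 1.20) shortens the estimates in the same proportion. Edit it only with BOINC fully shut down, or the client will overwrite the change; it will then keep adjusting the DCF on its own as real results come in. To check the current values first:

  @echo off
  rem Print the current DCF lines (path assumes a default BOINC data directory).
  findstr /i "duration_correction_factor" "C:\ProgramData\BOINC\client_state.xml"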