I read you guys say the newest CUDA 5.5 may solve the bug that has been interfering with your app.
Depends on what you mean by "the bug". There has been a bug that prevented us from using anything newer than CUDA 3.2. NVIDIA was able to reproduce it but couldn't fix it for quite some time. It now seems to be fixed in CUDA 5.5, and we could start testing it on Albert as we have already built all the necessary binaries. It's just a matter of time.
However, there's also the performance regression we see with the latest drivers (>= 319.xx). That's unrelated to CUDA 5.5, as we have received reports that our CUDA 3.2 app is affected as well, which clearly points at the driver. It's one of the reasons why we don't yet roll out CUDA 5.5 apps: those would require the affected driver versions, whereas the CUDA 3.2 app can still be used with older, unaffected drivers.
Best,
Oliver
I was interested by this, because I've recently upgraded an old 9800GT host from the 310.70 to the 326.14 beta driver. That should have crossed the 'slowdown' threshold at 319, but I did not see a slowdown with a third-party SETI app I'm testing.
I'm not aware that support for that got dropped in a certain driver version, but it may of course very well be the case (for desktop cards). All our Linux boxes run Tesla-series cards and those still show these values with driver 319.37 (Tesla cards only the fan speed, Fermi cards also the GPU utilization). But again, installing the CUDA toolkit shouldn't make any difference.
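For illustration only, a rough Python sketch of reading those same values from a script, assuming an nvidia-smi new enough to support the --query-gpu interface (not guaranteed for every driver version discussed in this thread); on desktop cards it may just print [N/A]:

# Rough sketch: ask nvidia-smi for the sensor values discussed above.
# Assumes the installed nvidia-smi supports --query-gpu; desktop cards may
# report [N/A] for utilization, as noted elsewhere in this thread.
import subprocess

def gpu_sensors():
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=name,fan.speed,utilization.gpu",
        "--format=csv,noheader",
    ]).decode()
    for line in out.strip().splitlines():
        name, fan, util = [field.strip() for field in line.split(",")]
        print("%s  fan: %s  util: %s" % (name, fan, util))

if __name__ == "__main__":
    gpu_sensors()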
Sorry,
Oliver
@robl I read this post
@robl
I read this post over on the Seti@home forum about a gui program that shows GPU load in Linux, maybe it could work for you!?
I don't run Linux and have no experience with it so just guessing.
It's called Open Hardware Monitor.
RE: RE: Hi, RE: Read
My slowdowns of the Seti apps on my 9800GTX+ happened after upgrading past 301.xx drivers, a lot earlier than the drivers you were trying.
Claggy
RE: @robl I read this post
Holmis
I wrote to arkayn on seti and he said he could never get it to work.
RE: I'm not aware that
Thanks for taking the time. When I say that support was discontinued with a driver release I am merely parroting what I had read in other threads elsewhere. I don't believe it should be this hard with Linux but ....
Any news? There are 326.80
Any news? There are 326.80 drivers available. Are you seeing any performance improvement running Kepler on CUDA 5.5?
I'd be interested in this as
I'd be interested in this as well...
Oliver
Einstein@Home Project
Some quick comments: -
Some quick comments:
- GPUs have been reporting utilization and various other sensor information for years now, it's just that no one bothers to build tools like GPU-Z under Linux. Or if they try, too much variability among the distros gets in the way. The small market presence of Linux doesn't help, either.
- As far as I know, the severe performance drop with 319.x and subsequent drivers only affects Linux.
- robl, just average your current GPU runtimes over at least 10 WUs, then set your GPU up to run 2 Einstein tasks in parallel via your user profile and run some WUs. Discard the first two (where the switch happened), then average again over 10+ WUs. Divide this average by 2, and that's how much you can squeeze out of your GPU compared to the reference time. The higher the throughput gain, the lower your GPU utilization was. It's a bit indirect, but it should give you all the information you need at the end of the day (see the sketch below).
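A tiny Python sketch of that arithmetic, with placeholder runtimes (substitute your own 10+ WU averages), just to make the procedure concrete:

# Throughput estimate for running 2 Einstein tasks per GPU, as described above.
# The runtimes are placeholders; plug in your own averaged values (in seconds).
avg_1x = 3600.0                      # average runtime with 1 task per GPU
avg_2x = 5400.0                      # average runtime with 2 tasks in parallel
effective_2x = avg_2x / 2.0          # effective time per WU at 2x
gain = avg_1x / effective_2x - 1.0   # fractional throughput gain
print("Effective time per WU at 2x: %.0f s" % effective_2x)
print("Throughput gain: %.0f%%" % (gain * 100))
# A large gain suggests the GPU was far from fully utilized with a single task.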
MrS
Scanning for our furry friends since Jan 2002
RE: - GPUs have been
I did look into this, if only to code up a command line tool to monitor my fleet of Linux boxes. However, I couldn't find an API that reads out the GPU usage, so it's not clear to me how GPU-Z does it.
NVIDIA have recently released a perfkit for performance monitoring, which I might have a play with (it looks seriously detailed). Or maybe I'll have a play with GPU-Z on a Windows box to see if I can figure out how it determines the GPU load.
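One candidate API might be NVML, via the pyNVML bindings that come up later in this thread. Whether it actually returns a utilization figure depends on the card and driver, so the hedged sketch below simply prints the NVML error where it's unsupported:

# Sketch: poll GPU utilization through NVML using the pyNVML bindings.
# Whether a value is reported depends on the card/driver; errors are caught
# so unsupported boards just print the NVML error instead of a percentage.
from pynvml import (NVMLError, nvmlDeviceGetCount, nvmlDeviceGetHandleByIndex,
                    nvmlDeviceGetName, nvmlDeviceGetUtilizationRates,
                    nvmlInit, nvmlShutdown)

nvmlInit()
try:
    for i in range(nvmlDeviceGetCount()):
        handle = nvmlDeviceGetHandleByIndex(i)
        name = nvmlDeviceGetName(handle)
        try:
            util = nvmlDeviceGetUtilizationRates(handle)
            print("%s: GPU %d%%, memory %d%%" % (name, util.gpu, util.memory))
        except NVMLError as err:
            print("%s: utilization not available (%s)" % (name, err))
finally:
    nvmlShutdown()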
The author of another very
The author of another very nice and powerful monitoring tool under Windows, HWiNFO64, is active on another message board. He's very helpful, so it might be a good idea to ask him. PM me if you want me to forward him your contact details :)
MrS
Scanning for our furry friends since Jan 2002
RE: RE: - GPUs have been
I spent a little time looking at NVIDIA's pyNVML, only to find nothing useful other than confirming the "N/A" reported by nvidia-smi.
This https://devtalk.nvidia.com/default/topic/524665/no-gpu-usage-for-gt640-in-nvidia-smi/ seems to suggest GPU usage reporting broken since 270.26.
Maybe the perfkit has some way of turning it back on.
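For what it's worth, and purely as an untested guess: on desktop cards where nvidia-smi and NVML report N/A, the X driver is supposed to expose a GPUUtilization attribute through nvidia-settings (which needs a running X session). A rough Python sketch, with the attribute name and the "graphics=N, ..." output format taken on trust:

# Hedged sketch: query the X driver's GPUUtilization attribute via nvidia-settings.
# Requires a running X server with DISPLAY set; the attribute name and the
# "graphics=N, memory=N, ..." output format are assumptions, so parse defensively.
import re
import subprocess

def x_driver_utilization(gpu_index=0):
    out = subprocess.check_output([
        "nvidia-settings", "-t",
        "-q", "[gpu:%d]/GPUUtilization" % gpu_index,
    ]).decode().strip()
    print("raw output:", out)
    match = re.search(r"graphics\s*=\s*(\d+)", out)
    if match:
        print("graphics utilization: %s%%" % match.group(1))

if __name__ == "__main__":
    x_driver_utilization()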