I have been noticing very low credit lately for Einstein@home, and started to look into why.
When I check my results, I see a LOT of invalid results (which might be because I'm very OC'd), but more importantly, I don't see ANY results for the "CPU" tasks.
I've looked through the invalid, valid, error, pending, and in-progress results, but I just don't see these results anywhere. All the results are for CUDA tasks.
On both of my PCs, I see the work units completing and being uploaded, supposedly successfully. Missing CPU results would help explain why my credit per day is so low.
I have no idea how to "debug" this. Do I need to just disconnect from the project, wipe out the data dirs and start over?
I noticed it's sort of hard to correlate the workunit NAMES to IDs in various places, such as the slot/# dirs, the results on the web, etc. etc.
I have 2 computers that are running Windows 7, each has at least 1 GPU, and I have a custom app_info.xml file in each machine so my GPUs can perform multiple work units simultaneously.
There are 4 CPU tasks at a time on 1 machine (with a NVIDIA GTX 460) and 8 CPU tasks on the other machine, with a GTX 570, and 2 GTX 560 TI 448s.
Can anyone give me an idea what to look for or see what's going on? I have a hard time understanding the internals of the log files and xml files...
FYI - The machine with 8 CPU jobs and 3 GPUs (3 jobs each) is "new", meaning I just rebuilt it with Windows 7 x64, and it's running the latest NVIDIA driver (which I am also starting to suspect).
It's got an i7-2600K and uses internal graphics for the display, so the GPUs are just crunching.
Copyright © 2024 Einstein@Home. All rights reserved.
No work for my CPUs being seen in results?
Could we have an actual number for that driver, please?
If you mean 295.73 WHQL, or the 290.53 or 295.51 Beta drivers which preceded it, there are multiple reports on BOINC project message boards that they cause problems. Reversion to 285.62 is recommended.
Love to give the rev number. I can't seem to figure out in Windows 7 how to get the driver version. Not coming up the way I expect.
Gives me 8.17.12.9573, dated 2/9/2012. I think this is the 295.73 WHQL version.
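For what it's worth, the Windows-style version string usually seems to map to the NVIDIA release number by taking its last five digits and putting a decimal point after the third. Here is a minimal Python sketch of that conversion, assuming the convention holds for this driver (the function name is made up for illustration):

```python
def nvidia_release_from_windows_version(win_version: str) -> str:
    """Convert a Windows driver version string to an NVIDIA release number.

    Assumes the common convention that the last five digits of the
    Windows version encode the release, e.g. 8.17.12.9573 -> 295.73.
    """
    # Drop the dots, then keep only the trailing five digits.
    digits = win_version.replace(".", "")
    last_five = digits[-5:]
    if len(last_five) != 5 or not last_five.isdigit():
        raise ValueError(f"unexpected version string: {win_version!r}")
    # Insert the decimal point after the third digit.
    return f"{last_five[:3]}.{last_five[3:]}"


print(nvidia_release_from_windows_version("8.17.12.9573"))
```

If that assumption is right, 8.17.12.9573 does correspond to 295.73.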
But would the graphics driver mess up the reporting of main science apps?
If I remove it, and install another one, do I have to do anything else to try to recover from this?
I really am wondering where all my credit is going for the main science apps...
The easy way to get the driver version is to right-click on the desktop and choose NVIDIA Control Panel; once it has opened, click Help, then System Information.
With the 295.xx series of drivers, once the monitor goes to sleep, the CUDA device becomes unavailable. Either disable the monitor going to sleep, or install 290.53 or earlier drivers.
Claggy
Oh and I forgot, the reason I loaded the 295.73 drivers was because I have 2 GTX 560 Ti 448s that are NOT recognized in the 285.62 driver. I'm trying to get those and a GTX 570 used in the same computer.
I uninstalled the 295.73 and loaded the 285.62 for the time being.
Maybe going to try the CUDA development drivers... 286.19, maybe that will work.
Kinda annoying, since I spent a lot of money on these cards, and I'm having nothing but trouble with the 560 TI 448s.
It's strange that you can't revert to an earlier driver. According to the NVidia driver website (archive/beta pages), the GTX 560 Ti was supported under Win7/x64 right back to v266.66 - although that looks like a special 'first release' driver, which might not support the GTX 570. The two compatibility lists seem to converge from v270.61 onwards.
One other point - the GTX 560 Ti cards (especially the early ones) have been reported as producing errors under heavy CUDA processing because the on-board BIOS was set to supply too low voltage to the GPU. Turning up the voltage using overclocking tools was found to increase reliability, though I'm not qualified to advise on the details.
A similar problem with failing tasks has also been noted if the computer's PSU isn't fully capable of supplying enough power for multiple GPUs, especially on the 12 V rails.
I have the GTX 560 Ti 448 versions, that's why. They are "special", a limited run of cards that will be gone soon. Almost a GTX 570, and they have more memory.
AH! I can set show_names=1 in the url and then I can find the non-GPU tasks, there are some...
Well - OK, so I'm not really losing CPU credits; now I just have to get my graphics cards working correctly.
I do have a special Antec 1200W PSU made specifically to handle this CPU and up to 4 graphics cards. I don't think that's it, but I can try to increase the voltage a little.
I think it's much more likely the way I'm using the cards in the machine: either the MOBO can't handle it, or the NVIDIA drivers can't because the cards are too similar or something.