Hi,
I have 2 PCs, each with one Nvidia GeForce 460.
http://einsteinathome.org/host/3947978
http://einsteinathome.org/host/3947696
I have the following problems:
1) When only 1 CUDA task is running on 3947696, BOINC estimates the time for computing one WU to be an hour. That's close to the real execution time of 0:58. However, when I run 2 CUDA tasks on the PC, BOINC estimates the execution time to be 21 hours, while the real value is only 1:25.
That means that even if I set the buffer to a value of 10 days, BOINC will only download about 22 CUDA WUs, and those WUs get computed in about 7.5 hours. Since this PC is disconnected from the internet for about 8 hours every night, it often happens that the GeForce 460 does not have enough work to do.
It also means that hundreds of CPU WUs get downloaded too, and those WUs need many days to compute, so my wingmen have to wait a long time before they get their credits.
Is there any way to get more CUDA WUs?
2) Although both PCs have the same graphics card, the one with more memory and a better processor (3947696) is a lot faster when crunching CUDA WUs (about 35%). Letting the CUDA task on 3947978 have more processor time doesn't change anything. Any idea how I can improve the slow PC (3947978)?
3) This morning, when 3947696 reported the completed WUs back to the server, a lot of WUs were erroneous.
The log says "WU download error: couldn't get input files" and "MD5 check failed". Some of those WUs have 0 CPU time, which means there was no computing at all, while others have very high CPU time values. How can the CPU time be 5000 seconds if the log says that it couldn't get input files? It's the first time this has happened, so I am asking myself what went wrong.
Thanks in advance for helping me.
CUDA probs and other things
Do you switch users on the PCs? If so, you must stop crunching before you do, or errors will occur. This is a Windows thing and is not fixable by the current version of BOINC. I have read that the newer 6.12.xx versions address this, but I am not totally sure it is fixed. That means the newer 6.12.xx versions, not the older ones; 6.12.15 is the newest beta version. On the home page, click where it says Download BOINC, then click the All Versions link on the next page, and you will see the beta version download link.
RE: Maybe...but it would
That's what I am doing right now. I thought there might be an explanation for why BOINC is estimating the time so badly.
OK, I guess I have to live with it.
No, I didn't switch users on that PC. There must be another reason...
RE: Did you play games,
No, not at all. The first thing I did in the morning was simply click "update" in the BOINC client.
Still asking myself why BOINC makes a wrong estimation of the execution time when more than 1 GPU WU is running. On my slower PC the estimated time is 14 hours and on the fast PC the estimated time is 21 hours. Really strange...
RE: Still asking myself why
Because the GPU scheduler in BOINC isn't built with running multiple tasks on the same piece of hardware in mind. Doing this is your own choice; it's not something the software anticipates or supports. And if I read the developers correctly, it isn't something that's going to be supported (any time soon) either.
RE: Still asking myself
It's probably because you're using an app_info file to run 2 WUs that's missing the <flops> tags (as are the ones posted on the 3wu thread).
Grab the flops value out of your client_state.xml and pop it into the app_info. You will then need to tweak the value a little until the times are roughly correct.
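For anyone who wants a starting point instead of pure trial and error, here is a rough Python sketch of the two steps above: pull the flops values out of client_state.xml, then scale one of them by the ratio of estimated to real run time. It assumes that each `<app_version>` block in client_state.xml carries a `<flops>` element and that BOINC's estimate is roughly inversely proportional to that value; both are my assumptions, so check them against your own files. The helper names and the sample XML (including the app name) are made up for illustration.

```python
import xml.etree.ElementTree as ET


def read_app_version_flops(client_state_xml):
    """Return (app_name, plan_class, flops) for every <app_version>
    that has a <flops> child. The element names are assumptions about
    the layout of client_state.xml."""
    root = ET.fromstring(client_state_xml)
    found = []
    for app_version in root.iter("app_version"):
        flops_text = app_version.findtext("flops")
        if flops_text is None:
            continue  # this app_version has no flops entry
        found.append((
            app_version.findtext("app_name", default=""),
            app_version.findtext("plan_class", default=""),
            float(flops_text),
        ))
    return found


def scaled_flops(current_flops, estimated_hours, actual_hours):
    """Suggest a new flops value, assuming the estimated run time is
    inversely proportional to flops (an assumption, not confirmed)."""
    return current_flops * estimated_hours / actual_hours


# Hypothetical sample; in practice, read the text of your real client_state.xml.
SAMPLE = """
<client_state>
  <app_version>
    <app_name>example_cuda_app</app_name>
    <plan_class>cuda</plan_class>
    <flops>1.0e10</flops>
  </app_version>
</client_state>
"""

for name, plan_class, flops in read_app_version_flops(SAMPLE):
    # 21 h estimated vs. 1:25 real, the figures from the first post
    suggestion = scaled_flops(flops, 21.0, 85 / 60)
    print(name, plan_class, flops, "->", suggestion)
```

With the numbers from the first post (21 hours estimated, 1:25 real), the factor comes out at a bit under 15. Treat the result as a first guess only; you will still need to tweak it until the estimates look about right.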
RE: RE: Still asking
Yeah, that did the trick! Thank you very much, John.
Now I am able to reduce the CPU WUs in the cache.