Computation Error
Can you report it so we can see the stderr.txt?
By "report it" Jord means "hit the update button in BOINC Manager because at the moment the result has not been reported back to the server so we can't see any details yet about the error message that will have been sent when the result was uploaded."
Cheers,
Gary.
Thanks for the explanation... unfortunately, I'm a Boinc noob and didn't know that clicking update sent the damaged unit back to you, and now it's gone. :( Perhaps Boinc sent it back automatically?
Well... live and learn, I'll remember next time!
BOINC reports automatically upon the next contact with the server.
Now the problem here is that the task has already been purged from the database, so we can't check it in your list. So... let's wait for the next one.
When it happens again, please post the messages from your Messages tab in BOINC Manager, as we can learn something from those as well.
No it hasn't. It's still there, showing exit code -226 and the message "Too many harmless exits". Further down, near the bottom of stderr.out, you can see where the program was repeatedly starting and stopping, unable to acquire a lock file.
Maybe the OP did some sort of cleanup in the BOINC folder and removed files that shouldn't be removed or perhaps something like an antivirus program did something that upset BOINC.
@Jaranth - returning results is a two-stage process. First, results are uploaded as soon as crunching of a task finishes. At a later stage (usually the next time BOINC needs to request work) the fact that the upload has occurred is "reported" to the server. This happens automatically, but you can intervene manually (via the update button) when you need to report urgently, as in this case, since we needed to see the error message.
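For anyone who would rather trigger the "report" step from a script than from the update button, here is a minimal sketch. It assumes the `boinccmd` command-line tool that ships with BOINC is on your PATH and that you pass the exact project URL your client is attached to. The helper names `build_update_command` and `report_now` are made up for illustration.

```python
import subprocess


def build_update_command(project_url: str) -> list[str]:
    """Return the boinccmd argument list that asks the client to contact a project."""
    # Equivalent to selecting the project and pressing "Update" in BOINC Manager.
    return ["boinccmd", "--project", project_url, "update"]


def report_now(project_url: str) -> int:
    """Run the update request and return boinccmd's exit status."""
    # check=False: a non-zero status is returned to the caller instead of raising.
    return subprocess.run(build_update_command(project_url), check=False).returncode
```

Calling `report_now(...)` only asks the client to contact the server; whether anything is actually reported still depends on there being an uploaded result waiting.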
Cheers,
Gary.
Looks like the same kind of messages I had just the other day when my WU ended with a client error: http://einsteinathome.org/task/104127185
It hasn't repeated itself yet.
And here's the one Jaranth reported.
http://einsteinathome.org/task/103605812
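Quote:
No it hasn't.
OK, I swear that at the time I looked he had 2 results there that were still busy. Neither showed as an error in any way. (Which is correct, as I posted at 21:10:03 UTC and it was reported at 21:13:28 UTC.)
Quote:
Maybe the OP did some sort of cleanup in the BOINC folder and removed files that shouldn't be removed or perhaps something like an antivirus program did something that upset BOINC.
No, then you run into a different error altogether.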
This is the infamous "exited with zero status" problem, where the checkpoint cannot be written. After this has happened 100 times, BOINC will now kill the application. Go count the number of times it happens in there, I bet it's 100. ;-)
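If you don't feel like counting by hand, a few lines of Python will do it. This is only a sketch: paste the stderr text from the task page into a string and pick a phrase that shows up once per restart. The marker phrase and the sample text below are invented for illustration; use whatever actually repeats in your own output.

```python
def count_marker_lines(stderr_text: str, marker: str) -> int:
    """Count the lines of stderr_text that contain marker (case-insensitive)."""
    needle = marker.lower()
    return sum(1 for line in stderr_text.splitlines() if needle in line.lower())


# Hypothetical excerpt: three start/stop cycles, one marker line per cycle.
sample = "\n".join(["app starting", "cannot acquire lockfile - exiting"] * 3)
restarts = count_marker_lines(sample, "cannot acquire lockfile")
print(restarts)
```

Counting lines rather than raw substring hits avoids counting a line twice if the phrase happens to appear more than once in it.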
This problem can happen when you have multiple CPUs/cores or use Hyperthreading and you allow BOINC to throttle the CPUs. It can be overcome by not using BOINC's throttle (leave it set at 100%), or by telling BOINC to use one CPU only. No, it's not fixed in BOINC 6, as it's a bug with the throttling, which is unchanged in 6.
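Both workarounds can be set in the normal computing preferences. If you prefer a local override file, the sketch below builds one. Treat the details as assumptions to check against the documentation for your client version: the element names `cpu_usage_limit` and `max_ncpus_pct` inside a `global_preferences` root are how I recall BOINC's `global_prefs_override.xml`, and older clients may use a different element for the CPU count. The helpers `one_cpu_pct` and `build_override` are hypothetical names.

```python
import xml.etree.ElementTree as ET


def one_cpu_pct(ncores: int) -> float:
    """Percentage of the processors that corresponds to a single core."""
    if ncores < 1:
        raise ValueError("ncores must be at least 1")
    return 100.0 / ncores


def build_override(max_ncpus_pct: float = 100.0, cpu_usage_limit: float = 100.0) -> str:
    """Return override XML text; cpu_usage_limit=100 means no throttling."""
    root = ET.Element("global_preferences")  # assumed root element name
    ET.SubElement(root, "cpu_usage_limit").text = f"{cpu_usage_limit:.1f}"
    ET.SubElement(root, "max_ncpus_pct").text = f"{max_ncpus_pct:.1f}"
    return ET.tostring(root, encoding="unicode")


# Example: a dual-core machine, throttle off, one core in use.
print(build_override(max_ncpus_pct=one_cpu_pct(2)))
```

You only need one of the two workarounds; the example sets both just to show where each value goes.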
Jord,
Is there any chance that selecting "Leave applications in memory while suspended?" would make a difference with this problem?
I only ask because I only have used the CPU throttling on one machine, but I never had this problem (that I know of) and the only non-standard setting that I can think of was the "leave in memory" thing.
Of course I have no idea if the client actually exits the application at any point with throttling (though I tend to doubt it), so this may be completely unrelated and I was simply lucky in the past.
When BOINC uses the CPU throttling, it won't exit the applications in use, just pause them, run them, pause them, run them, etc. I don't think it can read the applications back into memory within the two seconds of pausing/running; it's not that fast.
Would the setting make a difference? I don't know, a group of people using the throttling option should test that.
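To picture the pause/run pattern, here is a toy model in Python. It is not BOINC's actual code, just an assumed duty-cycle scheme: each tick adds the throttle fraction to a running credit, and the application runs during a tick only once a full unit of credit has built up. The function name `throttle_schedule` and the tick length are illustrative assumptions.

```python
def throttle_schedule(limit_pct: float, ticks: int) -> list[bool]:
    """Simplified duty-cycle model: True means run during that tick, False means paused."""
    schedule = []
    credit = 0.0
    for _ in range(ticks):
        credit += limit_pct / 100.0  # earn a fraction of a tick's worth of run time
        if credit >= 1.0:
            credit -= 1.0            # spend one tick of credit and run
            schedule.append(True)
        else:
            schedule.append(False)   # not enough credit yet, stay paused
    return schedule


# At 50% the model alternates pause and run; at 100% it never pauses.
print(throttle_schedule(50, 6))
print(throttle_schedule(100, 6))
```

In this model the application is only ever paused and resumed, never exited, which matches the behaviour described above. It says nothing about whether a paused application stays in memory.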