I don't know if someone here is still opposed to my idea of providing better feedback for invalid results, but one of the volunteer developers over at SETI saw some merit to what I was trying to get across, although they didn't think it was wise to add the burden on the system (specifically SETI's system) while it was still having so many other issues...
Brian
The problem will exist whether HR is turned on or not, so we might as well turn HR on and get the credit due.
RE: I don't know if
I'm sure absolutely nobody is opposed to this suggestion of yours :). The problem is how to arrive at this better feedback. The validator would have to be a far more intelligent beast than it currently is. At the moment it is smart enough to recognise that there is a difference between some result pairings but I'm sure it doesn't have a clue as to what is causing those differences. Therefore it can't really make some sort of useful addition to the "checked but no consensus yet" outcome that it currently reports. Even if the validator started saying that parameter X was Y% different in the two results, would that really constitute better feedback? You would probably only get better feedback if the validator could make some authoritative statement about the possible reasons for the difference and that is where the extra intelligence would be needed.
In the distant past when Bruce was able to post more regularly, he referred to "tweaking" or "relaxing" the parameters that the validator was using to make the judgement. He also said that it wasn't easy to arrive at a good compromise and so he would err on the side of allowing a small fraction of "good" results to be marked invalid rather than missing "bad" results.
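Just to make that concrete, here is a rough sketch of the kind of per-parameter report I'm imagining. This is not the real validator code; the output parameters and the tolerance below are completely made up for illustration:

```cpp
// Hypothetical sketch only -- not the Einstein@Home validator.
// The result fields and tolerance are invented for illustration.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <string>
#include <vector>

struct ResultParams {
    // Invented output parameters of a completed result.
    double peak_power;
    double mean_power;
    double sig_freq;
};

// Compare two results and report which parameters exceed the relative
// tolerance, instead of giving a bare "no consensus" verdict.
std::vector<std::string> explain_mismatch(const ResultParams& a,
                                          const ResultParams& b,
                                          double rel_tol) {
    struct Field { const char* name; double x; double y; };
    const Field fields[] = {
        {"peak_power", a.peak_power, b.peak_power},
        {"mean_power", a.mean_power, b.mean_power},
        {"sig_freq",   a.sig_freq,   b.sig_freq},
    };
    std::vector<std::string> reasons;
    for (const Field& f : fields) {
        double denom = std::max(std::fabs(f.x), std::fabs(f.y));
        double rel = denom > 0.0 ? std::fabs(f.x - f.y) / denom : 0.0;
        if (rel > rel_tol) {
            char buf[128];
            std::snprintf(buf, sizeof buf, "%s differs by %.3f%% (tolerance %.3f%%)",
                          f.name, 100.0 * rel, 100.0 * rel_tol);
            reasons.push_back(buf);
        }
    }
    return reasons;  // empty => the two results agree within tolerance
}

int main() {
    ResultParams r1{10.00, 3.141, 1420.40};
    ResultParams r2{10.02, 3.141, 1421.90};
    std::vector<std::string> reasons = explain_mismatch(r1, r2, 0.001);
    if (reasons.empty())
        std::printf("valid: all parameters agree within tolerance\n");
    for (const std::string& r : reasons)
        std::printf("invalid: %s\n", r.c_str());
}
```

Even a report like that only tells you which number disagreed and by how much, not why, and the "why" is where the extra intelligence would be needed.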
Cheers,
Gary.
RE: The problem will exist
I agree fully with this sentiment. Having said that, the reason why HR is not being used is probably just that it is too difficult to achieve it with the current way that work is distributed by the project.
Difficult or not, I still feel like having a good whinge about the current waste of what are probably quite correct results that are being rejected simply because they were done on a (more accurate) Linux box rather than Windows :).
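For anyone not familiar with the term, the idea behind HR is roughly this: only hosts in the same "numerical class" are ever given copies of the same workunit, so Windows/Linux (or Intel/AMD) floating-point differences never reach the validator in the first place. A much-simplified sketch, with an invented class definition rather than BOINC's real scheduler logic:

```cpp
// Simplified illustration of homogeneous redundancy (HR). The class
// definition below is invented; the real BOINC scheduler is far more involved.
#include <cstdio>
#include <string>

struct Host {
    std::string os;   // e.g. "Windows", "Linux", "Darwin"
    std::string cpu;  // e.g. "Intel", "AMD", "PowerPC"
};

// Map a host to an HR class; two hosts land in the same class only if
// their results are expected to be directly comparable.
std::string hr_class(const Host& h) {
    return h.os + "/" + h.cpu;
}

struct Workunit {
    std::string assigned_class;  // empty until the first copy is sent out
};

// Decide whether a copy of this workunit may be sent to this host.
bool can_send(Workunit& wu, const Host& h) {
    if (wu.assigned_class.empty()) {
        wu.assigned_class = hr_class(h);  // the first host fixes the class
        return true;
    }
    return wu.assigned_class == hr_class(h);
}

int main() {
    Workunit wu;
    Host winIntel{"Windows", "Intel"};
    Host linAmd{"Linux", "AMD"};
    std::printf("send to Windows/Intel: %s\n", can_send(wu, winIntel) ? "yes" : "no");
    std::printf("send to Linux/AMD:     %s\n", can_send(wu, linAmd) ? "yes" : "no");
}
```

The difficulty, of course, is that restricting each workunit to one class makes work distribution much harder for the project.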
Cheers,
Gary.
RE: RE: The problem will
There really aren't very many of these! I would estimate that it is less than 5% of invalidations.
Also, in this particular case, you are quite right to point out that the 3 results in question were done under the same OS - WinXP - 2 on Intel and 1 on AMD. If you look at the full results list of the AMD box you will see that it is a laptop with three "client error (compute error)" results in its current list along with that single invalidation. Can we perhaps imagine "overheating laptop" or at least some hardware issue as the reason behind this particular case?
Cheers,
Gary.
RE: RE: The problem will
Even if it fixed only a portion of the problem, it should be turned on. Something is better than nothing.
Reno, NV Team: SETI.USA
RE: If you look at the
Nope. Error 10, which is what that host is getting for the "client error" results, is a problem with the science application checkpointing. Checkpointing improvements are in the 4.23 beta release, so it would be interesting if this particular person picked up the 4.23 release to see if it helped.
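Without knowing the details of the 4.23 change, the usual way to make checkpoint writing crash-safe looks something like this (a generic sketch, not the actual Einstein@Home code): write the new checkpoint to a temporary file, flush it, and only then move it over the old one, so an interrupted write can never leave a truncated checkpoint behind.

```cpp
// Generic sketch of crash-safe checkpointing -- not the actual 4.23 fix,
// whose details aren't given in this thread.
#include <cstdio>
#include <string>

bool write_checkpoint(const std::string& path, const std::string& state) {
    const std::string tmp = path + ".tmp";

    std::FILE* f = std::fopen(tmp.c_str(), "wb");
    if (!f) return false;
    bool ok = std::fwrite(state.data(), 1, state.size(), f) == state.size();
    ok = ok && std::fflush(f) == 0;   // push the data out before renaming
    ok = (std::fclose(f) == 0) && ok;
    if (!ok) { std::remove(tmp.c_str()); return false; }

    // On POSIX, rename() replaces the target atomically; on Windows the old
    // file must be removed first, a small window that a real implementation
    // would close with ReplaceFile()/MoveFileEx().
    std::remove(path.c_str());
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}

int main() {
    bool ok = write_checkpoint("checkpoint.dat", "template=42 freq=1420.4\n");
    std::printf("checkpoint written: %s\n", ok ? "yes" : "no");
}
```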
Brian
RE: Nope. I admire your
I admire your bravery :).
Sure, you can read in the stderr.out output that two of the three client errors were due to the app being unable to make sense of a saved checkpoint when BOINC was restarted. If this were being caused by a software bug in the checkpointing code, wouldn't many more people be seeing exactly the same thing? Isn't it more likely (on balance) that some sort of hardware flakiness is occasionally causing a bad checkpoint to be written or reread?
I fully admit that I have no idea of the precise cause of the client errors but I'd tend to think that the major reason is likely to be hardware related. It's also interesting to see the bit of garbage "@his program cannot be run in DOS mode.
$" which is inserted at the end of the output of the third client error. How do you explain that bit as being a software checkpointing bug?
Cheers,
Gary.
RE: RE: Nope. I admire
Read this thread over in Problems and Bug Reports
Oh, and the junk about DOS mode could simply be something the application spat out while in the midst of crashing, considering it is the final entry in the output. (That string is the text of the DOS stub embedded in every Windows executable, so its appearance suggests raw file or memory contents got dumped into stderr.)
Yes, there could be hardware problems...but checkpointing is an acknowledged problem at this point, which means all of us are vulnerable to it. Why it happens to some and not others, who knows...