validate error, ideas?

Pete Burgess
Joined: 7 Dec 05
Posts: 21
Credit: 318570870
RAC: 0

Hi Bernd, Try this one

Message 99806 in response to message 99805
archae86
Joined: 6 Dec 05
Posts: 3163
Credit: 7329961687
RAC: 2315972

RE: That shouldn't happen.

Message 99807 in response to message 99805

Quote:

That shouldn't happen. Can you name an example?

BM


one

In this case three fast responders got credit. The fourth, who was issued the task on 1 October, responded in under 5 elapsed days but is deemed late (possibly this relates to the early date of the original issue, 13 Sep).

But that is the only one I currently see in this class.

Another possibly suspicious class is like this

I have two or three of these, for which the following seems to be true:

1. ABP2 10x WU
2. two results initially issued during the troubled period (e.g. 30 Sep)
3. one quick response
4. one slow response coming in for example 7 October.
5. currently the quick responder status is shown as "validation inconclusive"
6. currently the slow responder is shown as "validate error"

There are just enough of these last to make me a little suspicious, since I normally have a very low rate of invalid results. By the same token, if there are very few, and they are associated with this event, they are not worth worrying about. I only mention them because you expressed interest in an example, and when I started typing Pete Burgess had not yet responded.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5879
Credit: 118880519153
RAC: 23242589

RE: That shouldn't happen.

Message 99808 in response to message 99805

Quote:
That shouldn't happen. Can you name an example?


I'm seeing these too.

Here is one that contains two results that were "completed too late to validate". These two would have been the 'reissue tasks' as a result of both initial tasks being completed quickly and giving validate errors before anyone was actually aware of the problem. The full date range for the entire quorum of 4 tasks was from 30 Sep to 05 Oct so no deadline was even remotely at risk.

EDIT: Here is yet another quorum containing 4 results, two of which were "completed too late to validate". This is a bit different again since one of the initial two that were returned very quickly (in fact the very first one to be returned) is one of those supposedly 'too late'. I haven't actually looked very far yet but I'm seeing quite a few of these 'too late - zero credits' results. I hope it's something simple to rectify.

Cheers,
Gary.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4335
Credit: 252477342
RAC: 35469

The "too late" tasks are

The "too late" tasks are apparently a side effect of me messing around with transition and validation in order to fix the problem with the early ABP2x10 workunits. I now know how to avoid this for future re-validations.

In total there are 1807 tasks affected. I won't change the state of these tasks anymore, but will manually grant credit for them either today or early next week.

Edit: Done.

BM


Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6591
Credit: 329307560
RAC: 284221

RE: The "too late" tasks

Message 99810 in response to message 99809

Quote:
The "too late" tasks are apparently a side effect of me messing around with transition and validation in order to fix the problem with the early ABP2x10 workunits. I now know how to avoid this for future re-validations.


Dominoes. Well, we live and learn. Let's hope for no recurrence to provoke the need.

Quote:
In total there are 1807 tasks affected. I won't change the state of these tasks anymore, but will manually grant credit for them either today or early next week.


Sounds good to me. Outstanding effort Bernd. Well done! :-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Bernd Machenschalk

RE: RE: I now know how to

Message 99811 in response to message 99810

Quote:
Quote:
I now know how to avoid this for future re-validations.

Dominoes. Well, we live and learn. Let's hope for no recurrence to provoke the need.


As more and more results drop in on the wrong server and will be transferred to the correct one, some more rounds of re-validation will be necessary.

BM


Mike Hewson

RE: As more and more

Message 99812 in response to message 99811

Quote:
As more and more results drop in on the wrong server and will be transferred to the correct one, some more rounds of re-validation will be necessary.


Oh. OK. So the 'coherence' of result files between locations is still an issue? URLs, DNS or some such?

Cheers, Mike.

( edit ) Forgive me. I haven't seen my emails for 14 hours and won't for another two. So I could have missed the mailing list updates ....

( edit ) Sorry. Silly me ... I've recalled the answer to that. Don't worry ... sigh, Friday is my 'long' day .... and I'm at the long end. :-)


archae86

RE: At the moment I am

Message 99813 in response to message 99803

Quote:
At the moment I am posting, the server status page lists 45064 ABP2 workunits awaiting validation as of an update at 7 Oct 2010 19:15:03 UTC. Watching that number decline is probably a useful index of progress at the project level on resolving this issue.

As of 8 Oct 2010 14:45:03 UTC the number of ABP2 workunits awaiting validation has dropped to zero. But that is not yet the end.

As Bernd mentioned below

Quote:
As more and more results drop in on the wrong server and will be transferred to the correct one, some more rounds of re-validation will be necessary.

and progress in that process is not likely captured in simple observation of the status page "awaiting validation" number.

On my hosts, the current pending work where at least two hosts have returned results is utterly dominated by WUs set to the (21,-1) state, which Bernd has mentioned

Quote:
For workunits like this one (minimum quorum = 21) files are still missing on the right server.


The other major category of pending work on my hosts is utterly normal: I've returned a result sooner than my current quorum partner. So despite a very small number of oddities, the general population seems to be behaving well.

But there will be a long tail of observable oddity. As Gary Roberts pointed out, some WUs generated new (excess) results sent out only a couple of days ago, and those won't go past deadline until sometime around October 20.
