Thanks, Gary. Here are two more which appeared this morning.
273794486
274216816
With the previous three I've reported, that's 1 day 14 hours 42 minutes and 34 seconds of CPU time down the toilet in the past two weeks. I've now turned off FGRP1 tasks for a while in my preferences. I hope the "file has entries that aren't numbers" error means something to someone.
270352809
272967613
273026007
274259792 (not the only validate error for this workunit, and I note that the Linux and Mac results so far have been invalid, while the Windows result is pending)
One of my two computers, to my knowledge, has never had an invalid task on anything other than gamma ray work.
Quote:
273794486
Validate error [6] (00001010)
- result file has entries that aren't numbers
- a number is out of valid range for this result
Quote:
274216816
Validate error [6] (00000010)
- result file has entries that aren't numbers
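As a side note, the eight-digit code in brackets looks like a set of bit flags: the code with two bits set comes with two messages, and the code with one bit set comes with one. Here is a minimal Python sketch of that reading. The bit-to-message mapping is inferred only from the two codes quoted above, the function name is made up for illustration, and none of this is taken from the project's validator. The meaning of the "[6]" is not covered.

```python
# Hypothetical mapping, inferred only from the two codes quoted in this thread.
# It is NOT taken from the project's validator source.
FLAG_MESSAGES = {
    0b00000010: "result file has entries that aren't numbers",
    0b00001000: "a number is out of valid range for this result",
}


def decode_validate_flags(bits):
    """Return the messages for each known bit set in a string like '00001010'."""
    value = int(bits, 2)
    messages = []
    for flag, text in sorted(FLAG_MESSAGES.items()):
        if value & flag:
            messages.append(text)
            value &= ~flag  # clear the bit once it has been explained
    if value:
        # Bits that this thread gives no message for.
        messages.append("unknown flag bits: {:08b}".format(value))
    return messages


for code in ("00001010", "00000010"):
    print(code, decode_validate_flags(code))
```

If that reading is right, 00001010 is simply the "aren't numbers" flag plus the "out of valid range" flag, and 00000010 is the "aren't numbers" flag on its own.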
Quote:
With the previous three I've reported, that's 1 day 14 hours 42 minutes and 34 seconds of CPU time down the toilet in the past two weeks.
As pointed out in the opening post of this thread, there is a disproportionately high rate of validate errors for FGRP1 tasks on Linux and Mac OS X. Everybody using these systems is suffering and I'm very sorry it's taking so long to find the cause. The Devs were informed at the time and are trying to work out what is doing this. They do have many things competing for their time and this particular problem is obviously not simple to diagnose.
Quote:
I've now turned off FGRP1 tasks for a while in my preferences.
That's the only thing you can do if the 'loss rate' is unacceptable. In your case, you seem to be getting really hammered, so I can fully understand your concerns.
Quote:
I hope the "file has entries that aren't numbers" error means something to someone.
The actual message may bear little relationship to what is actually doing the damage. Since it's not showing up in Windows, I would guess that it's probably some obscure problem somewhere specific to the unix world that is being triggered occasionally by whatever ... Unfortunately, nothing has yet been found and the Devs have other priorities they must also attend to.
Quote:
270352809
Validate error [6] (00000010)
- result file has entries that aren't numbers
Quote:
272967613
Validate error [6] (00000010)
- result file has entries that aren't numbers
Quote:
273026007
Validate error [6] (00000010)
- result file has entries that aren't numbers
Quote:
274259792
Validate error [6] (00000010)
- result file has entries that aren't numbers
Quote:
(not the only validate error for this workunit, and I note that the Linux and Mac results so far have been invalid, while the Windows result is pending)
There are now three validate errors (all Mac OS X/Linux) for the WU quorum you mention. It's possible that this could be bad data. The 'in progress' task is on a Windows machine so the answer will be revealed shortly. If it fails (validate error) it's very likely bad data and I'll report it to the Devs. If it succeeds, the data is OK and it's just a random triple coincidence of the validate error problem. In that case, I'll report it as well, just in case a triple occurrence like this might help with the diagnosis. The extra info associated with the other two failed tasks is exactly the same as yours.
Quote:
One of my two computers, to my knowledge, has never had an invalid task on anything other than gamma ray work.
That's not really surprising since validate errors are quite rare for CPU tasks other than FGRP1 tasks.
It won't be something simple or silly. The problem is being looked at and it is elusive.
I have brought this triple validate error quorum to the attention of the Devs in the (probably forlorn) hope that it might provide additional insights. Anyway, fingers crossed ...
Here is the extra info for these two task IDs.
Cheers,
Gary.
271583014
Thanks, Gary. Here are two more which appeared this morning.
273794486
274216816
With the previous three I've reported, that's 1 day 14 hours 42 minutes and 34 seconds of CPU time down the toilet in the past two weeks. I've now turned off FGRP1 tasks for a while in my preferences. I hope the "file has entries that aren't numbers" error means something to someone.
NG
270352809
272967613
273026007
274259792 (not the only validate error for this workunit, and I note that the Linux and Mac results so far have been invalid, while the Windows result is pending)
One of my two computers, to my knowledge, has never had an invalid task on anything other than gamma ray work.
27158301
Extra info for this task
Cheers,
Gary.
As pointed out in the opening post of this thread, there is a disproportionately high rate of validate errors for FGRP1 tasks on Linux and Mac OS X. Everybody using these systems is suffering and I'm very sorry it's taking so long to find the cause. The Devs were informed at the time and are trying to work out what is doing this. They do have many things competing for their time and this particular problem is obviously not simple to diagnose.
That's the only thing you can do if the 'loss rate' is unacceptable. In your case, you seem to be getting really hammered, so I can fully understand your concerns.
The actual message may bear little relationship to what is actually doing the damage. Since it's not showing up in Windows, I would guess that it's probably some obscure problem somewhere specific to the unix world that is being triggered occasionally by whatever ... Unfortunately, nothing has yet been found and the Devs have other priorities they must also attend to.
Cheers,
Gary.
There are now three validate errors (all Mac OS X/Linux) for the WU quorum you mention. It's possible that this could be bad data. The 'in progress' task is on a Windows machine so the answer will be revealed shortly. If it fails (validate error) it's very likely bad data and I'll report it to the Devs. If it succeeds, the data is OK and it's just a random triple coincidence of the validate error problem. In that case, I'll report it as well, just in case a triple occurrence like this might help with the diagnosis. The extra info associated with the other two failed tasks is exactly the same as yours.
That's not really surprising since validate errors are quite rare for CPU tasks other than FGRP1 tasks.
Cheers,
Gary.
That workunit has now validated (Windows-Windows).
I hope it's not something silly like line ends or "5.0E-3" being detected as non-numerical characters...
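To make that worry concrete, here is a minimal Python sketch of how an over-strict number check could reject both exponent notation and a stray carriage return from Windows-style line ends. The pattern and both function names are made up for illustration. Nothing here is taken from the project's validator, and nothing in this thread says the real cause looks like this.

```python
import re

# Hypothetical over-strict pattern: optional sign, digits, optional decimal part.
# It has no exponent part and tolerates no surrounding whitespace.
NAIVE_NUMBER = re.compile(r"-?\d+(\.\d+)?")


def naive_is_number(token):
    """Accept only plain decimals such as '0.005'."""
    return NAIVE_NUMBER.fullmatch(token) is not None


def tolerant_is_number(token):
    """Accept anything float() can parse once whitespace is stripped.

    Note that float() also accepts 'nan' and 'inf'.
    """
    try:
        float(token.strip())
    except ValueError:
        return False
    return True


for token in ("0.005", "5.0E-3", "5.0\r", "abc"):
    print(repr(token), naive_is_number(token), tolerant_is_number(token))
```

The naive check rejects "5.0E-3" and "5.0\r" even though both are perfectly good numbers, while the tolerant check accepts them and still rejects "abc".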
It won't be something simple or silly. The problem is being looked at and it is elusive.
I have brought this triple validate error quorum to the attention of the Devs in the (probably forlorn) hope that it might provide additional insights. Anyway, fingers crossed ...
Cheers,
Gary.