D41.11/12 results:
valid: 21
invalid: 0
Could somebody give me a similar report?
It would be useful to decide the direction of further development.
I can increase the accuracy in some way, so I will do that if it's needed.
32 completed without errors
9 valid
0 invalid
Second result with D41.12 on same machine is also bad. That's 2 for 2.
http://einsteinathome.org/task/27335956
All results done with S41.12 on same machine were good (hundreds of them).
Back to S41.12 on that machine (an Athlon64 3400+) until this gets resolved.
I have 2 other machines that have each produced 1 good result so far with D41.12.
17 completed without errors
4 Valid
1 invalid
12 pending
EDIT:
Which is more accurate: Intel or my AMD Athlon XP?
As of now, 15 complete without errors.
2 Valid
13 Pending
It seems like (almost?) every invalid WU with D41.12 also had a PowerPC or Linux client in the cluster. Has anyone validated with one of these clients in their cluster, or had an invalid result with only other optimized or standard Windows clients?
Lots of errors on my K6-2 400 MHz:
http://einsteinathome.org/host/599639/tasks
5.3.12.tx36
- exit code -1073741819 (0xc0000005)
2006-05-01 01:13:08.4269 [normal]: Optimised by akosf D41.12 --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-05-01 01:13:08.4369 [normal]: Started search at lalDebugLevel = 0
2006-05-01 01:13:11.9920 [normal]: Checkpoint-file 'Fstat.out.ckp' not found.
2006-05-01 01:13:11.9920 [normal]: No usable checkpoint found, starting from beginning.
***UNHANDLED EXCEPTION****
Reason: Access Violation (0xc0000005) at address 0x0040AD0A read attempt to address 0x02C30000
1: 05/01/06 01:13:12
The only unit that had no error was one that had been started with D40.
http://einsteinathome.org/task/27206675
I switched back to D40.
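For anyone comparing logs: the two numbers in the exit code line above are the same value. -1073741819 is the signed 32-bit reading of 0xc0000005, the Windows access violation status that the exception report also shows. A minimal sketch of the conversion (the function name `to_ntstatus` is only illustrative, not part of BOINC or the science app):

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex status code."""
    # Masking with 0xFFFFFFFF maps negative values onto their
    # two's-complement unsigned 32-bit equivalent.
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


# Exit code reported for the K6-2 results in the post above.
print(to_ntstatus(-1073741819))
```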
I have at least 1 invalid result that is not directly attributable to my messing around (possibly another). Both look like sync errors to me. Here is the latest.
5.2.13 BoincStudio 0.4b
2006-04-30 20:11:25.4101 [normal]: Optimised by akosf D41.12 --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-04-30 20:11:25.4101 [normal]: Started search at lalDebugLevel = 0
2006-04-30 20:11:26.4413 [normal]: Checkpoint-file 'Fstat.out.ckp' not found.
2006-04-30 20:11:26.4413 [normal]: No usable checkpoint found, starting from beginning.
2006-04-30 20:12:29.2694 [normal]: Fstat file reached MaxFileSizeKB ==> compactifying ... done.
2006-04-30 20:54:59.1757 [normal]: Search finished successfully.
PS: Changing your 'Write to disk at most every' setting may help to prevent this error. It should not be a near-integer multiple of the completion time.
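A rough way to check a 'Write to disk at most every' value against a result's run time is sketched below. It assumes the concern is a disk write landing close to the moment the result finishes, so it compares the longer of the two durations against whole multiples of the shorter one. The function name, the tolerance, and the example intervals are all hypothetical; only the run time comes from the log above.

```python
def near_integer_multiple(a_seconds, b_seconds, tolerance=0.02):
    """Return True if the longer duration is close to a whole multiple of the shorter.

    `tolerance` is a hypothetical threshold on the distance between the ratio
    and the nearest integer; it is not a BOINC setting.
    """
    longer, shorter = max(a_seconds, b_seconds), min(a_seconds, b_seconds)
    if shorter <= 0:
        raise ValueError("durations must be positive")
    ratio = longer / shorter
    return abs(ratio - round(ratio)) <= tolerance


# Run time taken from the log above: 20:11:25 to 20:54:59 is about 2614 s.
completion_s = 2614

# Hypothetical 'Write to disk at most every' values, in seconds.
for interval_s in (60, 1307, 3000):
    print(interval_s, near_integer_multiple(completion_s, interval_s))
```

This only flags suspicious combinations; it does not show that the write interval actually causes the invalid results.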
If you are referring to WU 7602326, again, one of the computers in that cluster was running a Linux client. I don't know yet if this is a trend, but it might be.
There is no Linux here.
PS: I have increased 'Write to disk at most every' beyond the completion time. Will advise.
7602326
oooops
Sorry, Brian, I misunderstood your post initially. I think it is a timing issue between the science app and the manager.