Hi all,
OK - so A36 is now "old hat" - but such is the pace of development change that some (like me) are still using it on remote PC's so cannot easily upgrade them quickly...!
Anyways - can someone take a look at this - I left it running as I thought it was "strange"
This is on a "clunker" of a PC - P3 @ 550MHz (it's a small file server, so it's on 24/7 anyways, so why not do something useful before it gets upgraded !! But it works OK for other projects with lesser requirements)!!
Result ID 21947155
Name r1_1495.5__1707_S4R2a_0
Workunit 5907773
Created 19 Mar 2006 10:46:11 UTC
Sent 19 Mar 2006 10:46:12 UTC
Received 24 Mar 2006 13:58:51 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 427166
Report deadline 2 Apr 2006 10:46:12 UTC
CPU time 334745.806
stderr out 5.2.13
2006-03-19 10:59:27.3899 [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'.
2006-03-19 10:59:27.5000 [normal]: Started search at lalDebugLevel = 0
2006-03-19 10:59:32.4899 [normal]: Checkpoint-file 'Fstat.out.ckp' not found.
2006-03-19 10:59:32.4899 [normal]: No usable checkpoint found, starting from beginning.
2006-03-19 11:11:49.5399 [normal]: Optimized by akosf (A-36) --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-03-19 11:11:49.5900 [normal]: Started search at lalDebugLevel = 0
2006-03-19 14:55:36.4099 [normal]: Optimized by akosf (A-36) --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-03-19 14:55:36.4099 [normal]: Started search at lalDebugLevel = 0
2006-03-19 14:55:40.3099 [normal]: Found checkpoint-file 'Fstat.out.ckp'
2006-03-19 14:55:40.3099 [normal]: Trying to read Fstat-file into toplist ...
2006-03-19 14:56:21.2299 [normal]: Checksum Ok. Successfully read_toplist_from_fp()
2006-03-19 14:56:21.2299 [normal]: Resuming computation at (434/215995118/4334945).
2006-03-20 00:04:16.6599 [normal]: Optimized by akosf (A-36) --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-03-20 00:04:16.7199 [normal]: Started search at lalDebugLevel = 0
2006-03-20 00:04:21.3899 [normal]: Found checkpoint-file 'Fstat.out.ckp'
2006-03-20 00:04:21.3899 [normal]: Trying to read Fstat-file into toplist ...
2006-03-20 00:05:07.0299 [normal]: Checksum Ok. Successfully read_toplist_from_fp()
2006-03-20 00:05:07.0799 [normal]: Resuming computation at (4466/215995118/4334945).
2006-03-20 12:11:14.3699 [normal]: Optimized by akosf (A-36) --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-03-20 12:11:14.4200 [normal]: Started search at lalDebugLevel = 0
2006-03-20 12:11:18.3199 [normal]: Found checkpoint-file 'Fstat.out.ckp'
2006-03-20 12:11:18.3199 [normal]: Trying to read Fstat-file into toplist ...
2006-03-20 12:12:00.1199 [normal]: Checksum Ok. Successfully read_toplist_from_fp()
2006-03-20 12:12:00.1199 [normal]: Resuming computation at (8510/215995118/4334945).
2006-03-21 03:38:51.9699 [normal]: Optimized by akosf (A-36) --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-03-21 03:38:52.1899 [normal]: Started search at lalDebugLevel = 0
2006-03-21 03:38:59.2699 [normal]: Found checkpoint-file 'Fstat.out.ckp'
2006-03-21 03:38:59.3299 [normal]: Trying to read Fstat-file into toplist ...
2006-03-21 03:39:45.3499 [normal]: Checksum Ok. Successfully read_toplist_from_fp()
2006-03-21 03:39:45.3499 [normal]: Resuming computation at (12488/215995118/4334945).
2006-03-21 13:56:21.4599 [normal]: Optimized by akosf (A-36) --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-03-21 13:56:21.6800 [normal]: Started search at lalDebugLevel = 0
2006-03-21 13:56:30.1999 [normal]: Found checkpoint-file 'Fstat.out.ckp'
2006-03-21 13:56:30.1999 [normal]: Trying to read Fstat-file into toplist ...
2006-03-21 13:57:19.1899 [normal]: Checksum Ok. Successfully read_toplist_from_fp()
2006-03-21 13:57:19.1899 [normal]: Resuming computation at (15947/215995118/4334945).
2006-03-22 10:45:53.0100 [normal]: Optimized by akosf (A-36) --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-03-22 10:45:53.0599 [normal]: Started search at lalDebugLevel = 0
2006-03-22 10:45:57.2399 [normal]: Found checkpoint-file 'Fstat.out.ckp'
2006-03-22 10:45:57.2399 [normal]: Trying to read Fstat-file into toplist ...
2006-03-22 10:46:50.1299 [normal]: Checksum Ok. Successfully read_toplist_from_fp()
2006-03-22 10:46:50.1299 [normal]: Resuming computation at (28751/215995118/4334945).
2006-03-23 12:26:08.5000 [normal]: Optimized by akosf (A-36) --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-03-23 12:26:08.7199 [normal]: Started search at lalDebugLevel = 0
2006-03-23 12:26:24.9200 [normal]: Found checkpoint-file 'Fstat.out.ckp'
2006-03-23 12:26:24.9200 [normal]: Trying to read Fstat-file into toplist ...
2006-03-23 12:27:16.9399 [normal]: Checksum Ok. Successfully read_toplist_from_fp()
2006-03-23 12:27:16.9399 [normal]: Resuming computation at (43509/215995118/4334945).
2006-03-24 13:50:55.6499 [normal]: Search finished successfully.
Validate state Invalid
Claimed credit 384.277984848295
Granted credit 0
application version 4.37
The real issue here is: after 94 hours, I got ZERO credit...Thanks to the project - you really know how to hurt a guy !
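For anyone wanting to check the "94 hours" figure against the result page above, here is a quick sketch. It only converts the reported CPU time (assumed to be in seconds) to hours and compares it with the wall-clock span between the first and last stderr timestamps; the variable names are just for illustration.

```python
from datetime import datetime

# Values copied from the result page above; CPU time is assumed to be in seconds.
CPU_SECONDS = 334745.806
FMT = "%Y-%m-%d %H:%M:%S.%f"
start = datetime.strptime("2006-03-19 10:59:27.3899", FMT)  # first stderr line
end = datetime.strptime("2006-03-24 13:50:55.6499", FMT)    # "Search finished"

cpu_hours = CPU_SECONDS / 3600.0
wall_hours = (end - start).total_seconds() / 3600.0

print(f"CPU time:   {cpu_hours:.1f} h")
print(f"Wall clock: {wall_hours:.1f} h")
print(f"CPU / wall: {cpu_hours / wall_hours:.0%}")
```

So the reported CPU time is roughly 93 hours out of about 123 hours of wall-clock time.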
As for the other two WU's left on this machine - they will be aborted, because they are already showing another 92+ hours before completion - so there's no point carrying on with them!!
So, this machine will go back to being an optimised-SETI only cruncher...!
regards,
Tim
UK BOINC Team Founder
Join the UK BOINC Team: http://www.ukboincteam.org.uk/newforum
A36 issue? 94 hours to crunch ONE WU !
Hi Timbo!
Did you find the cause of this problem?
Or could you give me the link to this result?
It might be interesting!
Oh, and I feel for you!
But every interesting thing has at least one fault.
The result can be found here
94 hours would sound high even with the standard app. My old PIII 450 did an old Einstein unit in 40 hours. The longest Albert results should take only 30-35 hours on your machine using the standard application. With A36 the result should be done in 10-20 hours.
Edit. You are running multiple projects and Windows 98 on this host, right? The 94 hours is because of the time bug with BOINC and Windows 98. What is your "switch project every" set to?
When you're really interested in a subject, there is no way to avoid it. You have to read the Manual.
I think it's this one:
http://einsteinathome.org/task/21947155
But it looks like a BOINC-related problem to me, rather than A-36.
Thanks for the links!
I have never seen a fault under Einstein@Home with the same setup.
The validator marked it invalid, so an imprecise result is possible.
The 94 hours comes from Win98, which isn't careful in measuring task time.
I have a couple of old Windows 98 machines, a P II 350MHz and a Celeron 400MHz, both running Seti and Einstein.
I've sometimes had these very long run times, especially on the P II (though never so bad that it failed to validate....)
They seem to run much smoother and quicker if you set the flag to keep the app in memory when suspended.
Could yours be switched out of memory? Would that account for the multiple restarts in the result file?
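To see how often the app restarted and how fast it moved between restarts, the stderr output can be scanned for the "Resuming computation" lines. This is only a rough sketch: it assumes the first number in the bracket is a progress counter (the log doesn't say what it is), and the function names are made up for this example.

```python
import re
from datetime import datetime

# Matches lines like:
# 2006-03-19 14:56:21.2299 [normal]: Resuming computation at (434/215995118/4334945).
RESUME_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ \[normal\]: "
    r"Resuming computation at \((\d+)/\d+/\d+\)\."
)

def resume_points(stderr_text):
    """Return (timestamp, counter) for every 'Resuming computation' line."""
    points = []
    for line in stderr_text.splitlines():
        m = RESUME_RE.match(line.strip())
        if m:
            when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            points.append((when, int(m.group(2))))
    return points

def progress_rates(points):
    """Counter units gained per wall-clock hour between consecutive resumes."""
    rates = []
    for (t0, n0), (t1, n1) in zip(points, points[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        rates.append((n1 - n0) / hours)
    return rates

# Two lines taken from the result posted at the top of this thread.
SAMPLE = """\
2006-03-19 14:56:21.2299 [normal]: Resuming computation at (434/215995118/4334945).
2006-03-20 00:05:07.0799 [normal]: Resuming computation at (4466/215995118/4334945).
"""

points = resume_points(SAMPLE)
rates = progress_rates(points)
print(len(points), "resume points")
print([round(r) for r in rates], "counter units per hour")
```

Pasting the whole stderr text into SAMPLE would show whether the rate changed from one restart to the next, which might hint at whether the app was being swapped out.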
Well, I wasn't sure about it at all.
So, to start with, as the progress moved so slowly I thought - maybe it's like some of the SETI Classic WU's that took a long time - the ones with low angle range..!
But it went on and on and on, while other PC's (in my small collection) were returning valid work much more quickly. So I thought there must be something wrong with the PC - but I didn't want to stop, just in case it was like the current Rosetta problem.
So, the machine is on 24/7 only working on Einstein - I don't know why it switched - except that it had a cache of 3 Einstein WU's - but it didn't start the other two at all....they were always at zero progress and zero time.
After it finished, I thought something is strange, so had to abort the 2 cache WU's and instead I downloaded some SETI WU's - they seemed to progress OK.
So, I was still "stuck".
Everything was normal on the PC - all seemed OK
In the end, I rebooted the PC and started fresh with a single new Einstein to check.
It finished in 7 hours - like normal - but I did change the client over to S39L "just in case" - see this result here:
http://einsteinathome.org/task/22575228
So, the PC must have had some "process" going on - but I'd already checked the Task Manager and nothing was running except "normal services" and BOINC....!
VERY strange....!
A lesson to everyone with a farm - don't assume the PC is always working OK, especially if on 24/7.
Will try and figure out answers to other comments tomorrow.
Too late right now (12:30 am UK)
regards,
Tim
(edit) added new result link
Of my four machines running both Einstein and SETI, two are Pentium IIIs running Win98SE.
Both have had instances of extremely slow or zero rate of progress, leading to eventual reporting of completed results with extremely long execution times (up to 10x normal, I think).
Someone suggested that I turn the general preference setting:
Leave applications in memory while preempted?
to No for these two machines. I've done so, and not seen that particular misbehavior since. But I can't affirm this was cause and effect, as the issue was irregular.
However, one of the two machines still occasionally gets into what I might call "double count time" mode.
In this state, boincmgr ticks along updating CPU time every 5 seconds, as usual, but the CPU time increments by 10 seconds!! If I don't notice until completion, I believe it returns the result with this falsified long time. If I do notice, and either reboot the PC, or, I think, just exit and restart BOINCmgr, on restart it displays half the previously displayed accumulated CPU for that result and increments by 5 seconds as it should.
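A simple way to spot this state, if you note down the wall-clock time and the displayed CPU time a few times: a single task on a single CPU cannot gain more CPU time than wall-clock time has passed. The sketch below is only an illustration of that check; the function name, the sample readings and the 1.5 tolerance are my own assumptions, not anything taken from BOINC.

```python
def double_counted_intervals(samples, tolerance=1.5):
    """Return indices of intervals where CPU time grew faster than wall time.

    samples: list of (wall_seconds, cpu_seconds) readings, oldest first.
    tolerance: how many times faster than wall time CPU time may grow
               before the interval is flagged (assumed value).
    """
    flagged = []
    for i, ((w0, c0), (w1, c1)) in enumerate(zip(samples, samples[1:])):
        wall_delta = w1 - w0
        cpu_delta = c1 - c0
        if wall_delta > 0 and cpu_delta > tolerance * wall_delta:
            flagged.append(i)
    return flagged

# Made-up readings: normal ticking versus the "10 seconds every 5" behaviour.
normal = [(0, 0), (5, 5), (10, 10)]
doubled = [(0, 0), (5, 10), (10, 20)]

print(double_counted_intervals(normal))
print(double_counted_intervals(doubled))
```

If every interval is flagged, the reported CPU time for the result is probably about twice the real figure.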
Among the several applications and the OS involved, I certainly can't apportion blame--but I'd not assume akosf's science application was specifically at fault without more evidence.
I need to get the Win98SE machines up to XP, the question is how to do the conversion, how much hardware to change out (lots), or whether to just get new machines and abandon the installations I can't transfer from the old successfully.
I guess with PC's so cheap these days, it's probably a question of upgrading to a new machine - a new copy of Win XP Pro (in UK) costs nearly as much as a better-spec PC (with a faster CPU/bigger HDD/faster CDROM/more memory) including at least Win XP Home....
So, it might be time to say "bye, bye" to the old (but still faithful) workhorses we both have..!
regards,
Tim
Either that, or convert to Linux. I have an AMD K6/2 350 that is too slow even for Windows 2000 (never mind XP) but runs Linux quite well. It takes a few days to complete a WU, though...
I also have a small problem with A-36. One of my results is invalid:
http://einsteinathome.org/task/21183490
Can you tell me why?