Hi there,
Work units for Einstein@Home are listed as 'Running' in my task list, but neither the 'Progress' nor the 'To Completion' figure is changing over time.
I aborted one task and downloaded another but this did not fix the problem.
Any ideas anyone?
Regards,
Mark
Work Units Not Being Processed
The first thing to try is to stop and then start BOINC again.
This is not a solution if it happens often, but sometimes a kick in the butt works for computers. ;)
Anders n
RE: RE: Hi there, Work
Yeah, I tried this a few times without any success. Thanks anyway.
Regards,
Mark
RE: Hi there, Work units
Over what sort of timescale are you seeing this 'not changing'?
Einstein progress proceeds in a series of jumps: judging by the completed task still showing for your computer, I would expect the jumps to be about 80 seconds apart on your machine.
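One rough way to tell a genuine stall from the normal gap between jumps is to note the progress figure at intervals and only call it frozen once it has sat unchanged for several jump intervals. A minimal Python sketch, assuming the samples are noted by hand from BOINC Manager; the function name and the factor of 5 are illustrative choices, and the 80-second default is just the estimate above:

```python
def looks_stalled(samples, jump_interval=80.0, factor=5.0):
    """Decide whether progress looks frozen rather than between jumps.

    samples: list of (seconds, progress_percent) tuples in time order.
    Returns True only if the latest progress value has been unchanged
    for longer than factor * jump_interval seconds.
    """
    if len(samples) < 2:
        return False  # not enough data to judge

    last_time, last_progress = samples[-1]

    # Walk backwards to find the earliest sample with the same progress.
    unchanged_since = last_time
    for t, p in reversed(samples[:-1]):
        if p != last_progress:
            break
        unchanged_since = t

    return (last_time - unchanged_since) > factor * jump_interval
```

With the defaults, a reading that has not moved for a minute is not flagged, while one that has not moved for ten minutes is.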
Hi Mark - Welcome to the
Hi Mark - Welcome to the message boards. Not Mark "Tubby" Taylor by any chance?
(Sorry - I just had to ask :-).)
I sometimes see this type of behaviour on some of my machines. The OS is still running but all signs of progress for the science app have stopped. Even though the app is listed as "running" in BOINC Manager, Windows Task Manager actually shows the machine as idle. Under normal conditions the CPU time would be progressing every second but in this catatonic state nothing seems to be moving. I can always get progress moving again by simply stopping and restarting BOINC so I'm not sure why you can't do the same.
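The same check can be made without Task Manager by comparing two snapshots of the task list taken a minute or so apart. The sketch below is only an illustration: it assumes the snapshots are the text printed by `boinccmd --get_tasks`, and that each task entry contains `name`, `active_task_state` and `current CPU time` fields, which may differ between BOINC versions. The function names are made up for this example.

```python
def parse_tasks(text):
    """Split 'key: value' task listings into one dict per task."""
    tasks, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue  # separators and headings carry no field
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "name":  # assumed to be the first field of each task
            current = {}
            tasks.append(current)
        if current is not None:
            current[key] = value
    return tasks


def frozen_tasks(before, after):
    """Names of tasks marked EXECUTING whose CPU time did not advance."""
    earlier = {t["name"]: t for t in parse_tasks(before)}
    frozen = []
    for task in parse_tasks(after):
        old = earlier.get(task["name"])
        if old is None or task.get("active_task_state") != "EXECUTING":
            continue
        cpu_now = float(task.get("current CPU time", 0))
        cpu_then = float(old.get("current CPU time", 0))
        if cpu_now <= cpu_then:
            frozen.append(task["name"])
    return frozen
```

A task that is listed as executing in both snapshots but shows no increase in CPU time matches the frozen state described above.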
In my case, the freezing of progress is due to the fact that the machines are overclocked and sometimes the ambient temperature simply gets too hot. In other words the freezing of the app doesn't happen if the aircon is coping OK. It usually happens at a weekend when the aircon is off and it's an overly hot day.
Your machine appears to be a laptop, so overclocking shouldn't be an issue, but temperature may well be. Does the laptop case feel hot to the touch, and have you checked/cleaned the air passageways recently? Elevating the laptop from the table also works wonders for cooler running.
Aborting the task isn't really going to solve things as the next task will attempt to run while the machine is still hot. To see if heat is the problem, when it happens next shut down the machine until it is cool (>30mins) and then restart it. If the machine crunches for a little bit and then freezes, heat is most likely your problem.
Let us know how you get on.
Cheers,
Gary.
If they're definitely shown
If they're definitely shown as 'running' in the BOINC client, but the Windows Task Manager shows no activity for the Einstein application, then I'd say it's time to try 'resetting' the project in the BOINC client.
I'm having a difficult time imagining what's gone wrong, but there's surely something amiss with the application itself or the workunit files. I think a reset should replace both.
Good luck,
Thunder
RE: ... I'd say it's time
I'd disagree :-).
Problems with the software are more likely to result in client errors. The previous task completed successfully and the task in which the freeze occurred had accumulated a significant amount of CPU time before the freeze occurred. I would think it's much more likely that a hardware issue would give those symptoms.
Cheers,
Gary.
RE: I'd disagree
Bah... well, all are entitled to their opinions. ;) [duck]Me[/duck]
Okay, I'll admit that suggestion was a stab in the dark. In 3+ years of running BOINC on 20+ computers, I've never seen precisely what Mark is describing. I've seen something like it happen before, but never had to do anything more complicated than rebooting the computer to fix it.
Come to think of it... was that suggested? The only time I've ever seen the boinc client continue to show 'running' but no activity from the application is when the app has 'frozen'. In this case, restarting the BOINC client did nothing (for me) because the application process never terminated so was still running (frozen) both pre/post restart of the BOINC client.
I *could* tell Mark to find/kill the application process after exiting BOINC, but since I have no idea if he's that savvy, I'd say just reboot and see if that does it.
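For anyone who does want to look for a leftover process on Windows, a rough sketch follows. It assumes the input is the text printed by `tasklist /FO CSV /NH`, and that the science app's image name contains "einstein"; that fragment and the function name are guesses made for illustration, not something confirmed in this thread.

```python
import csv
import io


def leftover_pids(tasklist_csv, name_fragment="einstein"):
    """Return (pid, image_name) pairs whose image name contains name_fragment.

    tasklist_csv is the text printed by `tasklist /FO CSV /NH`, where the
    first column is the image name and the second is the PID.
    """
    matches = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        image, pid = row[0], row[1]
        if name_fragment.lower() in image.lower():
            matches.append((int(pid), image))
    return matches
```

Anything it finds after BOINC has been exited could then be ended with `taskkill /PID <pid> /F`. A reboot achieves the same thing with less to go wrong.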
I once again aborted a unit
I once again aborted a unit and downloaded another, and now the program appears to be working fine.
Thanks to all for their advice.
Regards,
Mark
RE: I once again aborted a
G'Day Mark,
I have had a number of work units stop doing anything while BOINC Manager says they are still running (no CPU load).
Updating via BM does not work.
Stopping BM and then restarting it has got my work units running again and then they finish ok.
No need to reset the project, or to abort the work unit.
RE: I once again aborted a
Hi Mark,
I'm pleased that things seem to be back to normal.
I think you must be mistaken about aborting a second E@H task. Your results list shows only one aborted task and this goes back to before your opening post in this thread.
I notice you are supporting a number of projects so you would have quite a few different tasks on your tasks list at any given time. The normal switching between tasks at regular intervals (ie, suspend a task for one project and resume a task for another) may give the impression that a partly crunched task has suddenly become frozen. Is it possible that something like this may have been the cause of the behaviour you reported in your opening message?
Cheers,
Gary.