My BOINC client is running a gamma-ray pulsar search task (version 1.04). It now seems to be stuck in an endless loop. Every time
the
remaining time is 0:30
elapsed time is 11:39
it jumps back to
remaining time 0:31
elapsed time 11:37
This behaviour repeats over and over again. Another pulsar search task that was running in parallel finished more than two hours ago.
Thomas
Copyright © 2024 Einstein@Home. All rights reserved.
Stuck in endless loop
Do you have 'Leave tasks in memory while suspended?' and 'Suspend work while computer is in use?' set to 'Yes' or 'No' in your computing preferences?
If you've got 'Leave tasks in memory while suspended?' set to 'No', then every time the computer is interrupted the app exits fully and starts from its last checkpoint the next time it runs.
Einstein apps often checkpoint infrequently, so setting 'Leave tasks in memory while suspended?' to 'Yes' is essential.
Claggy
Hello Claggy,
I have already set the options exactly the way you suggested. In my opinion it is not a problem with a suspended task, because you can watch the task running in the BOINC client while the times reset as described above.
Thomas
I'm having a very similar problem that could be the same thing. All my tasks get to 98.654% done, the time remaining counter drops to zero and is replaced by "---", but the task just keeps running forever and nothing else happens.
I checked my settings as per Claggy's message, no change.
gamma ray pulsar (Version 1.04)
BOINC manager v 7.4.27
... and no sooner do I post than the problem appears to be fixed. All back to normal.
How long is "forever"?
Please read this thread and then try letting the task run for at least 1 hour after it reaches this stage.
I have two work units stuck at 95.666% complete, with more than 9 hours elapsed and 25:41 remaining. What should I do?
Are they in the "running" state or are they "waiting to run"?
95.xxx% is the last progress percentage update before the FGRP (gamma-ray) tasks reach the final, variable-length stage of processing at 98-99% complete.
On my i7 3770K these tasks normally run for about 12 hours, so you probably just need some patience.
In some sense my thread has been taken over by other problems with the gamma-ray pulsar search. I don't mind. But my original problem is different from a task being stuck at xxx%, and it is unresolved.
My problem is that every time the remaining time reaches 30 minutes, it jumps back to about 31 minutes. At the same moment the elapsed time jumps back from 11:39 to 11:37.
I understand that it is difficult to calculate the time a process needs to finish. That is the reason for the odd behaviour of a lot of progress bars. But the elapsed time should only increase and never decrease.
Could the units be 'suspending' because the PC is doing other things? Are you using all of the BOINC defaults from when you first installed BOINC? If the answer to the second question is yes, then what you are seeing could be normal. The units 'checkpoint', that is, save their state, at certain points as they crunch. If a unit gets suspended because the PC is doing something else, then when BOINC comes back to it, it picks up at the last checkpoint and continues from there.
I interpret that as the task exiting and then restarting from a checkpoint. Now we need to find out why.
Just to recap, how is the setting for "Leave tasks in memory while suspended" set? Remember that if you've set the prefs through Boinc manager they always override the web based settings.
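One way to see what the local override actually says is to read the client's `global_prefs_override.xml` file directly. The following is only a sketch: it assumes the override file uses the tags `leave_apps_in_memory` and `run_if_user_active` with 0/1 values, and the sample XML below is made up for illustration. Check the tag names against the file in your own BOINC data directory before relying on it.

```python
import xml.etree.ElementTree as ET


def read_local_prefs(xml_text):
    """Return the two relevant flags from override-file text (None if absent)."""
    root = ET.fromstring(xml_text)

    def flag(name):
        node = root.find(name)
        if node is None or node.text is None:
            return None  # not overridden locally, so the web setting applies
        return float(node.text.strip()) != 0

    return {
        # assumed tag for "Leave tasks in memory while suspended"
        "leave_apps_in_memory": flag("leave_apps_in_memory"),
        # assumed tag; True means work continues while the computer is in use
        "run_if_user_active": flag("run_if_user_active"),
    }


# Hypothetical file contents, not taken from a real client.
sample = """<global_preferences>
   <run_if_user_active>1</run_if_user_active>
   <leave_apps_in_memory>0</leave_apps_in_memory>
</global_preferences>"""

prefs = read_local_prefs(sample)
print(prefs)
```

With this sample, `leave_apps_in_memory` comes back as `False`, which is the situation where a suspended task would exit and later restart from its last checkpoint.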
It might be helpful to check BOINC's event log and post some of the messages here. To open the event log, switch BOINC Manager to the advanced view and look in the Advanced menu.
If there are any messages about tasks starting and restarting, then please post some of them here. Alternatively, fully restart BOINC and, after some minutes, when the task has reset a few times, post all of the messages here.
The only time the elapsed time would decrease is when a task is restarted from an earlier checkpoint where the elapsed time was smaller. All time from the checkpoint up until the reset would be lost and should not be counted.
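To make the checkpoint explanation concrete, here is a toy model of what the display would show if a task kept exiting shortly after its last checkpoint. It is not how the BOINC client is implemented; the checkpoint at 11:37 and the exit just after 11:39 are assumptions chosen to match the times reported in this thread, with one tick standing for one minute.

```python
def fmt(minutes):
    """Format a minute count as h:mm."""
    return f"{minutes // 60}:{minutes % 60:02d}"


def displayed_elapsed(last_checkpoint, exit_at, ticks):
    """Return the elapsed time (in minutes) shown after each one-minute tick."""
    elapsed = last_checkpoint
    shown = []
    for _ in range(ticks):
        elapsed += 1
        if elapsed > exit_at:
            # The app exits before writing a new checkpoint, so on restart
            # the elapsed time falls back to the value stored in the last one.
            elapsed = last_checkpoint
        shown.append(elapsed)
    return shown


# Assumed values: checkpoint at 11:37 (697 min), exit just after 11:39 (699 min).
readings = displayed_elapsed(last_checkpoint=697, exit_at=699, ticks=6)
print([fmt(m) for m in readings])
# → ['11:38', '11:39', '11:37', '11:38', '11:39', '11:37']
```

The repeating 11:39 to 11:37 jump is exactly the pattern Thomas describes, which is why the event log is the next thing to look at: it should show whether the task is really being restarted.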