When I restart either my laptop or just the BOINC app, all progress made on the active WU is lost and that workunit restarts. I have aborted the WU, but the next one does the same. I am also running 2 CPDN WUs, which run fine and keep their progress across sessions.
The progress percentage does not appear to be incrementing either; it is stuck on 0%.
Copyright © 2024 Einstein@Home. All rights reserved.
Workunits resetting when restarting Boinc App.
First, it sounds like checkpointing is not happening before your shutdowns. The checkpoint interval can be adjusted, but a shorter interval increases disk activity and will shorten the life of your hard drive. If it continues to be a problem, you may wish to find a project with shorter units or find a way to not 'shut down' the laptop. My laptop runs 24/7/365 but only uses one core to crunch with.
Second, Einstein units are long and your laptop could be slow, especially if it is sharing time with CPDN, so the task may not be progressing as fast as you might like. The progress bar is an estimate, so don't take it too literally. What does the Messages tab of the BOINC Manager say? Could it be suspending crunching because of high CPU usage by something else? If so, this too can be adjusted.
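For anyone who wants to change the checkpoint interval mentioned above without clicking through the Manager, here is a minimal sketch that generates a local preferences override. It assumes the standard BOINC client behaviour of reading a `global_prefs_override.xml` file from the BOINC data directory, with a `disk_interval` element giving the minimum number of seconds between checkpoints. The function name `build_prefs_override` and the 120-second value are only illustrative, and the location of the data directory depends on your installation.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(disk_interval_s: int) -> str:
    """Return the text of a global_prefs_override.xml that sets only disk_interval.

    Hypothetical helper for illustration; it does not touch any BOINC files itself.
    """
    root = ET.Element("global_preferences")
    # disk_interval: request that tasks checkpoint at most once every N seconds.
    ET.SubElement(root, "disk_interval").text = str(disk_interval_s)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML so it can be reviewed before being saved by hand.
    print(build_prefs_override(120))
```

Save the printed text as `global_prefs_override.xml` in the BOINC data directory, then have the client re-read it, for example with `boinccmd --read_global_prefs_override`. Note that this only sets how often a task is allowed to checkpoint; the science application still decides when it actually writes one.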
RE: First it sounds like
That was my first thought too, but on further reading it sounds more like the tasks are not running at all, even though they are shown as running.
But since the OP's computers are hidden, no further research is possible.
Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
RE: Progress percentage
Be warned that the Global Correlations application takes a very long time to start displaying progress, especially if your memory subsystem is in any way challenged.
I've just started 8 of them at once on my dual-Xeon (host 831490), and watched as each one took a full 11 minutes before the progress started incrementing (and I got several 250MB 'excessive memory usage' performance alerts from Norton along the way). Even if you're not doing as extreme a test as that (for an unrelated aspect of BOINC), allow plenty of time for the startup code to run before you write off the task.
RE: I've just started 8 of
This was probably mainly because they were standing on each other's feet for disk access. Unfortunately, the code that grabs the necessary segments from the data files is not very efficient, which causes more and more trouble on modern multi-core systems where n processes share one disk interface.
Fixing or improving this code has been on my todo list for more than half a year now, but so far I haven't found the time.
BM
Thanks for the advice, I
Thanks for the advice. I aborted 2 WUs, and the next one that ran seemed to behave normally.
The symptom was that the elapsed time and time to completion would reset; this would happen even after many hours of crunching (it's a work laptop, so it has to be moved about and cannot be left on permanently). I think the percentage did increment after a while; the 0% was just what I noticed while re-testing as I wrote the initial post.
I will just write those 2 WUs off as duds, either corrupt in some way or in conflict with other apps running at the time.
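For anyone who finds this thread with the same symptom: a task whose elapsed time resets on restart is behaving like a job that never wrote a usable checkpoint. The toy simulation below is only a sketch of that idea and not how the Einstein@Home application or the BOINC client is implemented; `run_task`, the JSON checkpoint file, and the step counts are all made up for illustration.

```python
import json
import os
import tempfile


def run_task(total_steps, stop_after, checkpoint_every, path):
    """Simulate one session of a task and return the step reached at shutdown.

    The session resumes from the checkpoint file at `path` if one exists,
    runs at most `stop_after` steps, and saves progress every
    `checkpoint_every` steps (0 means the task never checkpoints).
    """
    step = 0
    if os.path.exists(path):
        with open(path) as f:
            step = json.load(f)["step"]

    done_this_session = 0
    while step < total_steps and done_this_session < stop_after:
        step += 1
        done_this_session += 1
        if checkpoint_every and step % checkpoint_every == 0:
            with open(path, "w") as f:
                json.dump({"step": step}, f)
    return step


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    with_cp = os.path.join(workdir, "with_checkpoint.json")
    without_cp = os.path.join(workdir, "no_checkpoint.json")

    # Two sessions of 25 steps each, checkpointing every 10 steps.
    print(run_task(100, 25, 10, with_cp), run_task(100, 25, 10, with_cp))
    # The same two sessions when no checkpoint is ever written.
    print(run_task(100, 25, 0, without_cp), run_task(100, 25, 0, without_cp))
```

With checkpoints, the second session picks up from the last saved step, so only the work done since that checkpoint is repeated. Without them, every session starts again from step 0, which matches the resetting elapsed time described above.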