Thanks for the responses guys, I will give the next client a go. It might also stop the annoying habit of every now and then not uploading any results. Doing an update has no effect; only restarting the manager allows result reporting.
Happens about once a week to once a fortnight.
As all three Linux machines have the same client and are all doing the same thing, I will update them to see if this improves.
I can't recall if this happens on my Windows machine, but it might.
I have a remote computer that seems very on/off in its reporting of results, so I might need to update that one as well.
Just an aside on this improved application: WU 91935877 ran for 37,913.32 seconds. I expected this from this WU, as it starts to scale up toward the peak from out of the trough.
But WU 91937267, which is the next WU in the frequency run and therefore should take virtually the same amount of time, has in fact taken 62,138.42 seconds.
Can someone explain this please?
There are now three close sequence numbers, the third having a runtime of 39,038 which is in line with the original 37,913. The 62,138 should really be around 38,500 to fit the sequence properly.
If you examine the stderr.out for all three tasks you can notice several things:
* None of the three were stopped and restarted at any stage so all three must have remained in memory when the CPU was switched to other projects.
* All three have wall clock elapsed times considerably in excess of recorded CPU times (101,880, 93,600, 72,000 respectively) indicating that other projects were sharing the CPUs from time to time.
* All three have quite similar numbers of skypoints per checkpoint - look at each line in stderr.out and see how many numbers there are before a 'c'.
* On this basis it is hard to say that one could have had almost double the CPU time. You should see only about half the number of skypoints per checkpoint if that were true.
* One conclusion might be a possible hiccup or bug in the measuring of CPU time for the anomalous task. The wall clock time for that task (93,600) is not out of line with the average for the other two "normal" results.
I'd consider upgrading BOINC to the latest recommended version.
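The comparison in the points above can be checked with some quick arithmetic. The following is only a sketch: it assumes the three wall clock times (101,880, 93,600, 72,000) pair with the three CPU times in the order they were mentioned, and the label "third task" is a placeholder because its WU number is not given in the thread.

```python
# CPU and wall clock seconds as quoted in the thread.
# ASSUMPTION: the wall clock times are listed in the same order as the tasks.
tasks = [
    ("WU 91935877", 37913.32, 101880.0),
    ("WU 91937267", 62138.42, 93600.0),   # the anomalous result
    ("third task", 39038.0, 72000.0),     # WU number not given in the thread
]

def cpu_share(cpu_seconds, wall_seconds):
    """Fraction of elapsed wall clock time that was recorded as CPU time."""
    return cpu_seconds / wall_seconds

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Treat the first and third tasks as the "normal" neighbours.
normal = [tasks[0], tasks[2]]
anomalous = tasks[1]

# CPU time the anomalous task "should" have had to fit the sequence.
expected_cpu = mean(cpu for _, cpu, _ in normal)
cpu_excess_ratio = anomalous[1] / expected_cpu

# Wall clock time of the anomalous task relative to the normal average.
wall_ratio = anomalous[2] / mean(wall for _, _, wall in normal)

if __name__ == "__main__":
    for name, cpu, wall in tasks:
        print(f"{name}: CPU share of wall clock = {cpu_share(cpu, wall):.2f}")
    print(f"expected CPU time from neighbours: {expected_cpu:.2f} s")
    print(f"recorded CPU / expected CPU: {cpu_excess_ratio:.2f}")
    print(f"wall clock / normal average wall clock: {wall_ratio:.2f}")
```

Under those assumptions the recorded CPU time is well above what the neighbouring tasks suggest, while the wall clock time is close to the average of the other two, which is consistent with a CPU time measurement problem rather than genuinely longer computation.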
Well Gary, it looks like I will not be updating my BOINC client.
I downloaded the latest and only download option for Linux (ver 5.10.28).
I then installed it and tried to restart BOINC, but nothing happened.
It was then I noticed that this only Linux version is called "boinc_ubuntu_5.10.28_i686-pc-linux-gnu.sh".
Now I may be a bit lacking in knowledge when it comes to Linux, but as I am running Fedora, which even I know is not based on Ubuntu, I fail to see how an Ubuntu-named file is going to run on Fedora.
My Fedora system will have nothing to do with this file.
Can anyone enlighten me, please?
"boinc_ubuntu_5.10.28_i686-pc-linux-gnu.sh" is a release using newer versions of some libraries. Specifically, it uses library versions as can be found on a decently-recent Ubuntu machine. It has worked for about 90% of testers, on lots of different distros, Fedora included. "Made for Ubuntu, but actually works on most non-ancient systems". It definitely doesn't use anything "specific to Ubuntu".
In any case, you could try getting the portable release (should work on ANY Linux, but doesn't have the BOINC manager).
It also works on my SuSE 10.3, starting from a command line as "run_client" and "run_manager" in a KDE environment.
Tullio
This version works much better with 4.19 than the default version.
Fraction done and estimated time to finish are broken with EAH 4.20 + CC 4.19 on Linux (Windows does not have that problem). With EAH 4.27 those features returned: http://einsteinathome.org/host/389866/tasks
I switched to 4.27 while the result that claimed 65.03 credits was running, so it reported only the time between that restart and the end. All EAH 4.20 results reported 0 seconds.
p.s.: it's a SUSE 2.4.21 without X-Windows on a P3
FWIW: This is not a feature of the Core Client but of the Linux kernel version (actually, of the pthread library that comes with older kernels, which violates the POSIX standard). We fixed this in BOINC to preserve the "old" (non-standard) CPU time query on Linux (calling getrusage(2) in a signal handler...) while using the standard way on MacOS, which avoids the deadlocking we were observing there.
BM
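For readers unfamiliar with the call mentioned above, here is a minimal sketch of querying process CPU time through getrusage, using Python's standard `resource` module on a Unix system. It is not BOINC's code and does not reproduce the signal handler arrangement described in the post; the helper names `cpu_seconds`, `measure` and `busy` are made up for illustration.

```python
import resource
import time

def cpu_seconds():
    """User + system CPU time consumed so far by this process, via getrusage."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime

def measure(work):
    """Run work() and return (cpu_seconds_used, wall_seconds_elapsed)."""
    wall_start = time.monotonic()
    cpu_start = cpu_seconds()
    work()
    return cpu_seconds() - cpu_start, time.monotonic() - wall_start

def busy():
    """A small CPU-bound loop, standing in for real computation."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total

if __name__ == "__main__":
    cpu_used, wall_used = measure(busy)
    print(f"busy loop: cpu={cpu_used:.3f}s wall={wall_used:.3f}s")
    cpu_used, wall_used = measure(lambda: time.sleep(0.2))
    print(f"sleeping:  cpu={cpu_used:.3f}s wall={wall_used:.3f}s")
```

A process that is sleeping or descheduled accrues wall clock time but almost no CPU time, which is why a broken CPU time query shows up as results reporting 0 seconds.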
I like it as it helps the cache management. Without time and percentage, the CC fetched new work only when it ran completely dry; now it respects my cache settings. I don't cache much but it's still better than running dry :-)
The old CC version I'm using has been necessary because Squid didn't let me authenticate with 4.x versions > 4.19. I haven't tried 5.x yet though.
I reported this a few days ago at Ralph@Home (the Rosetta alpha project) in this thread.
Because it involves E@H (and only E@H), I'm reposting it here.
The app hasn't exhibited this behavior running in tandem with anything else (LHC, Brats, Hydrogen, PrimeGrid and probably at least one other I'm forgetting).
Quote:
I had noticed a few days ago that an Einstein and a Rosetta Mini task were running together. Both were listed as running, but only the Rosetta task was accruing CPU time. Top confirmed that the E@H task wasn't getting any CPU time. Stopping/starting the daemon got things running properly again and I didn't think anything else of it.
I saw the same thing again this evening. I took some screen shots before and after restarting the daemon.
It's this host which runs Fedora 7. Latest updates were installed Friday evening. BOINC 5.10.21 is installed via rpm and it runs as a system daemon. It's this result.
Kathryn :o)
Einstein@Home Moderator
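The symptom described above (a task listed as running while top shows it getting no CPU time) can also be checked by sampling a process's CPU time twice and comparing. The sketch below assumes a Linux system and the `/proc/<pid>/stat` layout documented in proc(5), where utime and stime are fields 14 and 15 in clock ticks. The function names are hypothetical and this is not part of BOINC.

```python
import os

def cpu_seconds_from_stat(stat_text, ticks_per_second):
    """Extract user + system CPU seconds from the text of /proc/<pid>/stat."""
    # Field 2 (comm) is wrapped in parentheses and may contain spaces,
    # so split on the LAST ')' before tokenising the remaining fields.
    rest = stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    utime_ticks = int(rest[11])
    stime_ticks = int(rest[12])
    return (utime_ticks + stime_ticks) / ticks_per_second

def read_cpu_seconds(pid):
    """Read the current CPU seconds of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return cpu_seconds_from_stat(stat_file.read(), os.sysconf("SC_CLK_TCK"))

def is_accruing_cpu(before, after, min_delta=0.01):
    """True if the process gained at least min_delta CPU seconds between samples."""
    return (after - before) >= min_delta

if __name__ == "__main__":
    import time
    pid = os.getpid()  # replace with the PID of the science app to inspect
    first = read_cpu_seconds(pid)
    time.sleep(1.0)
    second = read_cpu_seconds(pid)
    print("accruing CPU time:", is_accruing_cpu(first, second))
```

Taking two samples a few seconds apart for each running science app would show which one has stalled, without needing screen shots of top.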
It is not clear to me - is this version better than the current 4.31 official app?
Kathryn, I do know this kind of behaviour, but I'm not convinced it has much to do with Rosetta. On my box it has happened with Einstein/PrimeGrid as well as with two Einstein tasks running in parallel. Maybe some projects are not affected or are more likely to develop this problem than others, but it is not, afaik, a Rosetta problem.
For info: What kind of box and OS are you talking about? Since this is the Linux app thread, you obviously have a Linux box ;-) but which distro and kernel version? And what kind of CPU is it? Your description makes me think of a dual core but there are quite a few different ones out there...
The box I experienced this problem with is running Kubuntu 7.10, 2.6.22.14 kernel and an ancient BOINC core client running from command line with normal user privileges (no daemon involved).
CPU is a Core Duo Mobile "Yonah".