The other projects I have running are not Einstein@home projects (I didn't know there were more).
There aren't. I was speaking of projects like SETI@home, ClimatePrediction or Milkyway@home. It's strange that they don't show up on your account. Click on my name (or Gary's) in the Author column to see what I mean.
Quote:
I'm not familiar with the infamous "too many exits" problems.
I am not throttling my CPU. I allow 100% of the processors and 100% of CPU time.
I was only guessing there, since your tasks are frequently restarted within a few minutes. If that happened more than 100 times in succession, the task would be aborted with that error message. CPU throttling often causes that behaviour, which is why I asked.
Did you try to suspend/resume the task or BOINC as a whole to "unstick" the task? Does a reboot help?
By the way, I'm running version 5.10.45 too (on Windows XP), without any problems. (And 5.8.16 on NT4, but not with Einstein :-)
Regards,
Gundolf
I did try suspend/resume on the task and on BOINC. I haven't watched for a stuck task before and after reboot. Being on Linux, I don't reboot often.
Yes, the Einstein@Home task is actually listed as 'running' when it appears to be making no progress....
Is your machine overclocked? I run a large number of machines, most of which are overclocked, and on older ones, where the cooling may be less efficient than it used to be, I have seen similar behaviour, particularly if the ambient temperature is rather warm. The symptoms are pretty much as you describe. The machine doesn't lock up or crash but an E@H task (supposedly running) does cease to make any progress. I could always kick-start the task (at least temporarily) by simply stopping and restarting BOINC. Have you specifically tried that? You really don't need to abort them, as changing to a different task is unlikely to change the behaviour. The problem is most likely to result from some hardware issue on your machine.
E@H tasks do seem to be harder on the processor than those of many other projects. Whenever I found this type of problem, I could always remove it by backing off on the overclock and by lowering the ambient temperature or cleaning the heat sink and checking the fan to make sure CPU cooling was as good as possible.
Quote:
I have also had this happen with some World Community Grid projects, but never with SETI@home....
I checked over at Seti for the name 'LFenske' and drew a blank. I guess you must be using different account names at different projects and this would be why your other projects are not listed at the bottom of your user account information here, as Gundolf was talking about. There is no particular problem with using different account details at different projects - other than not being able to see all your stats in the one place.
I am not overclocking. This is a stock HP xw4600 and it seems to run pretty cool; the fans are normally rather quiet. I'll look around to see if I can determine the readings from temperature sensors.
I have tried stopping and restarting BOINC and it did not unstick any stuck task. Aborting a stuck task does allow it to go on to a task from a different project, and eventually get back to Einstein@home and often succeed.
That should be no problem. Just go to your account at each project (but one :-) and change the email address. That'll create new CPIDs at first, but they will align over time.
Regards,
Gundolf
[edit]Okay, late again :-)[/edit]
Changing my e-mail addresses wouldn't make me lose history, would it?
Regards,
- Larry
No. They'll just merge into the one lump of data about your past activities.
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
Looking at the task details for the two aborts I note the following :
Name h1_0984.65_S5R4__855_S5R5a_2
Sent 25 Oct 2009 21:12:42 UTC
Received 27 Oct 2009 0:22:14 UTC
CPU time 6442.999
Name h1_0984.65_S5R4__817_S5R5a_2
Sent 27 Oct 2009 0:22:16 UTC
Received 28 Oct 2009 20:20:08 UTC
CPU time 5738.346
Which gives for time_spent_calculating / time_task_held :
6442.999 / 97772 = 0.0659 = 6.59 %
5738.346 / 158272 = 0.0363 = 3.63 %
Which is around about the 5% that you've allotted to E@H. So my guess is there's not a problem with E@H per se ......
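For anyone who wants to repeat this arithmetic on their own tasks, here is a minimal sketch in Python. It assumes the timestamps are copied from the task details page in the "25 Oct 2009 21:12:42 UTC" form shown above; the function name `duty_cycle` is made up for illustration and is not part of BOINC.

```python
from datetime import datetime

# Timestamp layout as shown on the task details page (with the " UTC" suffix removed).
FMT = "%d %b %Y %H:%M:%S"


def duty_cycle(sent: str, received: str, cpu_seconds: float) -> float:
    """Return CPU time divided by the wall-clock time the task was held.

    `duty_cycle` is a hypothetical helper name, not a BOINC function.
    """
    t_sent = datetime.strptime(sent.replace(" UTC", ""), FMT)
    t_received = datetime.strptime(received.replace(" UTC", ""), FMT)
    held_seconds = (t_received - t_sent).total_seconds()
    return cpu_seconds / held_seconds


# The two aborted tasks quoted above.
first = duty_cycle("25 Oct 2009 21:12:42 UTC", "27 Oct 2009 0:22:14 UTC", 6442.999)
second = duty_cycle("27 Oct 2009 0:22:16 UTC", "28 Oct 2009 20:20:08 UTC", 5738.346)

print(f"{first:.2%}")   # 6.59%
print(f"{second:.2%}")  # 3.63%
```

Note that this only compares CPU time with the time between "Sent" and "Received"; it says nothing about whether the task was listed as 'running' or 'waiting to run' during that time.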
Cheers, Mike.
( edit ) And the unit now being worked upon is a PALFA ( not a GW search ) so it'll behave differently to those other two.
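Quote:
...Which is around about the 5% that you've allotted to E@H. So my guess is there's not a problem with E@H per se ......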
But the status should be "waiting to run" and not "running" when the task waits for its resource share. See Message 100243:
Yes, the Einstein@Home task is actually listed as 'running' when it appears to be making no progress, along with one other task, since this is dual core. Einstein@Home is allocated a 5% resource share. In the past, I have let it go for a week, and seen no progress. Also, when a task is in this state, xload shows a load average of only around one (the other running task). When it does work, the load average is around two...
Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
Quote:
I am not overclocking. This is a stock HP xw4600 ....
OK, that's good. For heat to be the cause of the problem you'd probably be able to sense it by putting your hand on the case without even resorting to reading the sensor outputs. However, to be sure, check CPU temp and CPU fan speed for any irregularities.
Quote:
I have tried stopping and restarting BOINC and it did not unstick any stuck task.
Are you really sure of that? Do the BOINC startup messages indicate that the same stuck task is launched again after the restart and does it then still sit there 'running' but making no progress in CPU time or % completed? I've seen many stuck tasks over the years and in 100% of cases, stopping BOINC completely (check with task manager) and then restarting has always allowed some further progress, even if the task gets stuck again at a later stage.
The next move I would make would be to remove the RAM module(s) and vacuum or blow out the RAM slots. I would try reseating each module a couple of times, i.e. using friction to ensure good clean contact of all pins. Make sure you guard against static discharge. If possible, I would try running the machine with a different and known good RAM stick and see if that can alter the characteristics of the problem in any way.
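Since the machine in question runs Linux, the "is it really making no progress in CPU time?" check above can also be done from outside BOINC. The following is only a sketch under assumptions: it reads the standard Linux `/proc/<pid>/stat` file, you have to find the PID of the science application yourself (for example with `ps`), and the helper names and the process name in the sample line are hypothetical.

```python
import os
import time


def cpu_seconds_from_stat(stat_line: str, ticks_per_second: int) -> float:
    """Parse one /proc/<pid>/stat line and return user + system CPU time in seconds."""
    # The process name (field 2) is wrapped in parentheses and may contain spaces,
    # so split after the last ')' before counting fields.
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()
    # 'rest' starts at field 3 (state); utime is field 14 and stime is field 15.
    utime, stime = int(fields[11]), int(fields[12])
    return (utime + stime) / ticks_per_second


def read_cpu_seconds(pid: int) -> float:
    """Read the accumulated CPU time of a running process (Linux only)."""
    with open(f"/proc/{pid}/stat") as handle:
        line = handle.read()
    return cpu_seconds_from_stat(line, os.sysconf("SC_CLK_TCK"))


def is_making_progress(pid: int, wait_seconds: float = 60.0) -> bool:
    """Return True if the process accumulated any CPU time during the wait."""
    before = read_cpu_seconds(pid)
    time.sleep(wait_seconds)
    return read_cpu_seconds(pid) > before


# Made-up sample line: 250 ticks of user time and 50 ticks of system time.
sample = "1234 (einstein S5R5) R 1 1234 1234 0 -1 4194304 100 0 0 0 250 50 0 0 20 0 1 0"
print(cpu_seconds_from_stat(sample, 100))  # 3.0
```

If `is_making_progress(pid)` keeps returning False while the BOINC manager still shows the task as 'running', that would match the stuck state described in this thread.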
Cheers,
Gary.
Hi!
I know this sounds kinda lame (almost like 'Did you turn your PC off and on again' :-) ), but could you consider upgrading BOINC itself? The version you run is rather old.
The "hardy-backported" repository has a 6.x version of the BOINC client software; see https://help.ubuntu.com/community/Repositories/Ubuntu for how to enable additional repositories.
Good Luck
Bikeman
Thanks. I have upgraded (I didn't know about "hardy-backported") and have not had problems since. I'm not ready to claim that upgrading fixed it nor that the problem is solved, but it seems likely at this point. Time will tell.
Time does tell. I still have not had any problems. I declare the problem to be gone. Thank you all for your kind responses.