XO Update:
It's been a long while since I've used my XO laptop, so I decided to try it out again, especially since there is now a new app on Einstein@Home. Unfortunately I still have some bad news to report.
After downloading the new app the XO ran for 20 hours, which I thought was encouraging, but then the task ended with a computation error. The good news is that the new app has a better traceback of the problem. Here are the task details. This was an "input domain error" due to non-finite Dphi_alpha. Maybe this is not the same as the previous "signal 8" error?
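To make the difference between the two failure modes a bit more concrete, here is a minimal sketch of the kind of guard that would report an "input domain error" instead of letting a bad value turn into a floating point exception later on. This is only an illustration: the actual app source is not shown in this thread, `check_finite` is a hypothetical helper, and `Dphi_alpha` is just the variable name taken from the error message.

```python
import math


def check_finite(name, value):
    """Hypothetical guard: reject NaN or infinite inputs before using them.

    This is NOT the Einstein@Home code, only a sketch of the idea behind
    an "input domain error" for a non-finite value.
    """
    if not math.isfinite(value):
        raise ValueError(f"input domain error: non-finite {name}")
    return value


# A normal value passes straight through.
check_finite("Dphi_alpha", 0.25)

# A corrupted value (NaN here) is caught and reported by name.
try:
    check_finite("Dphi_alpha", float("nan"))
except ValueError as err:
    print(err)
```

If a check like this sits in front of the computation, garbage in a floating point value would show up as a named domain error rather than as a bare signal 8, which may be why the new app's report looks different.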
This was with XO build 650, with the 2.6.22 kernel (preemptable), as before. I'm going to try the XO update procedure, to get to build 711, and see what happens. (Actually, I was going to do the update first, but decided to try just the new app without it. Smaller changes are better for figuring out what is happening.)
- Eric Myers
Hi!
As long as the kernel isn't updated, it's not surprising to get FPU-related errors (either signal 8 or input domain errors). Basically, the Linux kernel bug can leave the FPU registers containing random stuff after a context switch :-(.
CU
Bikeman
I was hoping that the new app might make a difference, since the XO was able to run the SETI@Home app without this problem. But after several WUs failed with signal 8 I see that's not the case, and I agree with you that it's not surprising. Still, it was worth a check. The next test is to update the XO to build 711 and see if the newer kernel still has this problem.
- Eric Myers
Here is an update about Einstein@Home on my XO laptop. Over the holidays I had some time to update the software on the XO to the latest production build (build 767). The bad news is that the kernel version is 2.6.25 with PREEMPT set, and from what I've read that means the FPU bug is still present. When I started BOINC I got a WU from LHC@Home, and since those are rare I just let it run. It ended abnormally after 11 hours.
But then I got a WU from Einstein@Home, and the good news is that it seems to be working. It's slow, as you'd expect, but the WU has now run for 213 hours, and claims to be 63% done. So cross your fingers for another 100 hours and we'll see if it works.
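For anyone who wants to check the arithmetic, here is a rough sketch of the estimate, assuming progress is linear in run time (which is only an approximation). The function name `estimate_remaining_hours` is made up for this example.

```python
def estimate_remaining_hours(elapsed_hours, fraction_done):
    """Linear extrapolation of the remaining run time.

    Assumes the reported fraction done grows in proportion to elapsed time.
    """
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total_hours = elapsed_hours / fraction_done
    return total_hours - elapsed_hours


remaining = estimate_remaining_hours(213, 0.63)
print(f"{remaining:.0f} hours remaining, {213 + remaining:.0f} hours total")
# prints "125 hours remaining, 338 hours total"
```

So "another 100 hours" is on the optimistic side; a straight-line extrapolation puts it closer to 125.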
- Eric Myers
And here is the final outcome. The WU completed successfully after 340:52:32 (a little over 14 days of continuous operation).
The other machine assigned to the WU completed it in under 10 hours. When mine didn't make it back in time (I don't know after how long), it was flagged as "no reply" and another machine took the WU and did it in 8 hours. So when the XO finally finished, the result was not run through the validator.
Still, it's interesting to see that even though the kernel may still have the preemption bug, it seems the S5R4 app code does not trigger it. Clearly the XO laptop is not an ideal machine for Einstein@Home, but I'm glad to see that with this newer kernel it can finish a WU.
- Eric Myers