Einstein@Home on OLPC

Eric Myers
Joined: 8 Nov 04
Posts: 45
Credit: 1349848
RAC: 0

XO Update:

It's been a long while since I've used my XO laptop, so I decided to try it out again, especially since there is now a new app on Einstein@Home. Unfortunately I still have some bad news to report.

After downloading the new app the XO ran for 20 hours, which I thought was encouraging, but then the task ended with a computation error. The good news is that the new app has a better traceback of the problem. Here are the task details. This was an "input domain error" due to non-finite Dphi_alpha. Maybe this is not the same as the previous "signal 8" error?

This was with XO build 650, with the 2.6.22 kernel (preemptible), as before. I'm going to try the XO update procedure, to get to build 711, and see what happens. (Actually, I was going to try the update, and decided to try just the new app first without the update. Smaller changes are better for figuring out what is happening.)

- Eric Myers

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 761968411
RAC: 1102971

Hi!

As long as the kernel isn't updated, it's not surprising to get FPU-related errors (either signal 8 or input domain errors). Basically, the Linux kernel bug can leave random stuff in the FPU registers after a context switch :-(.

CU

Bikeman

Eric Myers

Message 76698 in response to message 76697

Quote:
As long as the kernel isn't updated, it's not surprising to get FPU related errors (either signal 8 or input domain errors). What happens is that basically the Linux kernel bug can cause the FPU registers to contain random stuff after a context switch :-(


I was hoping that the new app might make a difference, since the XO was able to run the SETI@Home app without this problem. But after several WUs failed with signal 8 I see that's not the case, and I agree with you that it's not surprising. Still, it was worth a check. The next test is to update the XO to build 711 and see if the newer kernel still has this problem.

- Eric Myers

Eric Myers

Message 76699 in response to message 76697

Bikeman wrote:

As long as the kernel isn't updated, it's not surprising to get FPU related errors (either signal 8 or input domain errors). What happens is that basically the Linux kernel bug can cause the FPU registers to contain random stuff after a context switch :-(.


Here is an update about Einstein@Home on my XO laptop. Over the holidays I had some time to update the software on the XO to the latest production build (build 767). The bad news is that the kernel version is 2.6.25 with PREEMPT set, and from what I've read that means the FPU bug is still present. When I started BOINC I got a WU from LHC@Home, and since those are rare I just let it run. It ended abnormally after 11 hours.

But then I got a WU from Einstein@Home, and the good news is that it seems to be working. It's slow, as you'd expect, but the WU has now run for 213 hours, and claims to be 63% done. So cross your fingers for another 100 hours and we'll see if it works.
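As a rough check on that "another 100 hours", here is a minimal Python sketch of a linear extrapolation from the two numbers above. It assumes the progress percentage advances at a constant rate, which BOINC does not guarantee; the function name is made up for illustration.

```python
def estimate_total_hours(elapsed_hours, fraction_done):
    """Linear extrapolation: assumes progress advances at a constant rate."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done

elapsed = 213.0   # hours run so far (from the post)
done = 0.63       # fraction complete reported by the client (from the post)

total = estimate_total_hours(elapsed, done)
remaining = total - elapsed
print(f"estimated total: {total:.0f} h, remaining: {remaining:.0f} h")
# prints "estimated total: 338 h, remaining: 125 h"
```

Under that assumption the remaining time comes out a bit above the 100 hours guessed here, so the estimate is only a ballpark.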

- Eric Myers

Eric Myers

Message 76700 in response to message 76699

Eric Myers wrote:
Here is an update about Einstein@Home on my XO laptop...


And here is the final outcome. The WU completed successfully after 340:52:32 (a little over 14 days of continuous operation).
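The conversion from the reported run time to days can be checked with a short Python sketch. The helper name `parse_hms` is hypothetical, and it assumes the time string is in hours:minutes:seconds form as shown above.

```python
from datetime import timedelta

def parse_hms(text):
    """Parse an 'H:MM:SS' string (hours may exceed 24) into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

run_time = parse_hms("340:52:32")           # run time reported in the post
days = run_time.total_seconds() / 86400     # 86400 seconds per day
print(f"{run_time.days} days + {timedelta(seconds=run_time.seconds)} = {days:.1f} days")
# prints "14 days + 4:52:32 = 14.2 days"
```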

The other machine assigned to the WU completed it in under 10 hours. When my result didn't make it back in time (I don't know after how long), it was flagged as "no reply", and another machine took the WU and finished it in 8 hours. So when the XO finished, the result was not run through the validator.

Still, it's interesting to see that even though the kernel may still have the preemption bug, it seems the S5R4 app code does not trigger it. Clearly the XO laptop is not an ideal machine for Einstein@Home, but I'm glad to see that with this newer kernel it can finish a WU.

- Eric Myers
