Hi everybody,
Since July 29th, I have received 41 CasA WUs, and 20 of them ended with a computation error, like this one today:
dim. 04 août 2013 13:50:13 CEST | Einstein@Home | Computation for task h1_0230.80_S6Directed__S6CasAf40a_231Hz_46_1 finished
dim. 04 août 2013 13:50:13 CEST | Einstein@Home | Output file h1_0230.80_S6Directed__S6CasAf40a_231Hz_46_1_0 for task h1_0230.80_S6Directed__S6CasAf40a_231Hz_46_1 absent
dim. 04 août 2013 13:50:13 CEST | Einstein@Home | Output file h1_0230.80_S6Directed__S6CasAf40a_231Hz_46_1_1 for task h1_0230.80_S6Directed__S6CasAf40a_231Hz_46_1 absent
I did not have this many errors with previous Einstein WUs, so what is the problem?
50% of erroneous CasA WU
I forgot to give my configuration:
Mageia Linux 3
Linux 3.8.13.4-desktop-1.mga3 on x86_64
Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz, 8 cores
8 GB memory
56 GB available on disk for Boinc
Boinc 7.0.36-2.mga3.x86_64
The key from stderr of the failed tasks is
Required frequency-bins [-7, 8]
(there are no negative bin indices). This means that your CPU is doing wrong floating-point math. Most of these errors come from the CPU getting too hot (overclocked?); more rarely, this happens with 'preemptive' Linux kernels that are 'lazy' about restoring the floating-point registers after a context switch. I don't know your distribution, but I know that SUSE had such kernels installed as the default. If possible, monitor your CPU temperature and/or try a different kernel.
BM
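For anyone who wants to check their own failed tasks for the same symptom, here is a minimal sketch in Python. It assumes you have saved the stderr text of a task to a string, and that the line looks like the one quoted above; the function name `find_negative_bins` and the exact regular expression are illustrative, not part of BOINC or the Einstein@Home application.

```python
import re

# Matches lines like "Required frequency-bins [-7, 8]".
# The exact wording of the surrounding stderr output is an assumption.
BIN_RE = re.compile(r"Required frequency-bins \[\s*(-?\d+)\s*,\s*(-?\d+)\s*\]")


def find_negative_bins(stderr_text):
    """Return the (low, high) bin ranges that contain a negative index."""
    hits = []
    for match in BIN_RE.finditer(stderr_text):
        low, high = int(match.group(1)), int(match.group(2))
        # Bin indices should never be negative, so either bound < 0 is suspicious.
        if low < 0 or high < 0:
            hits.append((low, high))
    return hits


if __name__ == "__main__":
    sample = "Required frequency-bins [-7, 8]"
    print(find_negative_bins(sample))
```

An empty result only means this particular symptom was not found in the text you passed in; it does not rule out other causes of a computation error.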
Hi,
It's not a temperature problem: I monitor the temperature continuously, and it stays between 60°C and 70°C (I replaced the original fan with a bigger one).
I installed the cpufreqd package and configured it to keep the four cores at their maximum frequency, but the computation errors persisted.
In the end, I found no other solution than to disable the CasA computation in my Einstein preferences. Sad.
Joseph