Gravitational Wave Computation Errors

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160342159
RAC: 0

well i got a slew of errors last night while i was sleeping. by the time i saw the errors, their slots had already been emptied. i have some other S6GC tasks that are only a few minutes away from completion right now, which means some new S6GC tasks should start in a few minutes too. hopefully one of them will throw an error within the first 20 seconds or so of running, and hopefully i'll be able to get a "snapshot" of the init_data.xml file...

*EDIT* - well guys, i don't know...2 more S6GC tasks have finished without error, and the two tasks that started immediately afterward did not throw errors shortly after beginning. i don't know that i'll ever be in the right place at the right time to catch the init_data.xml file of a failed S6GC task.

that being said, i suppose i'll go ahead and post the full Stderr output of one of the tasks that failed last night:

Quote:


Stderr output

6.12.34

- exit code -1073741819 (0xc0000005)

2012-01-10 09:09:03.7187 (2624) [normal]: This program is published under the GNU General Public License, version 2
2012-01-10 09:09:03.7187 (2624) [normal]: For details see http://einstein.phys.uwm.edu/license.php
2012-01-10 09:09:03.7187 (2624) [normal]: This Einstein@home App was built at: May 5 2011 14:54:09

2012-01-10 09:09:03.7187 (2624) [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe'.
Activated exception handling...
command line: projects/einstein.phys.uwm.edu/einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe --Freq=402.613682292 --FreqBand=0.05 --dFreq=1.61431023092e-06 --f1dot=-2.64248266531e-09 --f1dotBand=2.90673093185e-09 --df1dot=5.78907096294e-11 --skyGridFile=../../projects/einstein.phys.uwm.edu/skygrid_GC_Dc0.5_m0.3_0410Hz_S6Bucket.dat --numSkyPartitions=2352 --partitionIndex=759 --gammaRefine=230 --ephemE=../../projects/einstein.phys.uwm.edu/earth_09_11 --ephemS=../../projects/einstein.phys.uwm.edu/sun_09_11 --nCand1=3000 -o ../../projects/einstein.phys.uwm.edu/h1_0402.50_S6GC1__759_S6BucketA_1_0 --gridType=3 --printCand1 --semiCohToplist --segmentList=../../projects/einstein.phys.uwm.edu/S6GC1_T60h_v1_Segments.seg --outputFX -d1 --Dterms=8 --DataFiles1=..\..\projects\einstein.phys.uwm.edu\h1_0402.50_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0402.50_S6GC1;..\..\projects\einstein.phys.uwm.edu\h1_0402.55_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0402.55_S6GC1;..\..\projects\einstein.phys.uwm.edu\h1_0402.60_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0402.60_S6GC1;..\..\projects\einstein.phys.uwm.edu\h1_0402.65_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0402.65_S6GC1;..\..\projects\einstein.phys.uwm.edu\h1_0402.70_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0402.70_S6GC1;..\..\projects\einstein.phys.uwm.edu\h1_0402.75_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0402.75_S6GC1
2012-01-10 09:09:03.7187 (2624) [debug]: Flags: LAL_NDEBUG, HS_OPTIMIZATION, i386, SSE, SSE2, GNUC
2012-01-10 09:09:03.7187 (2624) [debug]: Set up communication with graphics process.
Code-version: %% LAL: 6.5.0.2 (CLEAN af1da736ca74032ae60fad53eef75bb34ba5d3e5)
%% LALApps: 6.5.0.2 (CLEAN af1da736ca74032ae60fad53eef75bb34ba5d3e5)

2012-01-10 09:09:03.9843 (2624) [normal]: Reading input data ... ERROR: data gap or overlap in SFT#60 (GPS 949967770.000000) between bin 724589 read from file '..\..\projects\einstein.phys.uwm.edu\l1_0402.50_S6GC1' and bin 724680 read from file '..\..\projects\einstein.phys.uwm.edu\l1_0402.60_S6GC1'
XLAL Error - XLALLoadSFTs (/home/bema/EinsteinAtHome/EinsteinAtHome/source/lalsuite/lalpulsar/src/SFTfileIO.c:939): I/O error
-------------------

Error occured on Tuesday, January 10, 2012 at 09:09:22.

D:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu\einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe caused an Access Violation at location 004bfffc in module D:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu\einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe Reading from location 058cffb4.

Registers:

eax=058cff68 ebx=058bd938 ecx=05832e40 edx=05832e40 esi=05832e20 edi=058bd950

eip=004bfffc esp=0022f1e0 ebp=0022f2e8 iopl=0 nv up ei pl nz na pe nc

cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202

Call stack:

004BFFFC D:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu\einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe:004BFFFC XLALLoadSFTs /home/bema/EinsteinAtHome/EinsteinAtHome/source/lalsuite/lalpulsar/src/SFTfileIO.c:773

004C1B30 D:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu\einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe:004C1B30 LALLoadMultiSFTs /home/bema/EinsteinAtHome/EinsteinAtHome/source/lalsuite/lalpulsar/src/SFTfileIO.c:1646

00406024 D:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu\einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe:00406024 SetUpSFTs /home/bema/EinsteinAtHome/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT.c:1946

0040843E D:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu\einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe:0040843E MAIN /home/bema/EinsteinAtHome/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT.c:665

00417C44 D:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu\einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe:00417C44 main /home/bema/EinsteinAtHome/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/hough/src2/EinsteinAtHome/hs_boinc_extras.c:1046

004010A7 D:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu\einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe:004010A7 __mingw_CRTStartup /home/ron/devel/debian/mingw32-runtime/mingw32-runtime-3.13/build_dir/src/mingw-runtime-3.13-20070825-1/crt1.c:237

00401143 D:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu\einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe:00401143


Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 798388467
RAC: 1197659

Hi!

Hmm.... looks like file corruption or bad memory to me, more likely the first. It's all I/O related so far as I could see and the wingmen are fine, ruling out corrupt files coming out of the work generator.
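To make the I/O reading of that log concrete, here is a small arithmetic sketch. It assumes an SFT bin index equals frequency times 1800 s; that baseline is not stated anywhere in the log, it is only inferred from bin 724680 divided by 402.60 Hz being exactly 1800. Under that assumption, the bins skipped between the two files named in the error are exactly the 90 bins that the 402.55 Hz file should have supplied.

```python
# Assumption: bin index = frequency (Hz) * T_SFT, with T_SFT = 1800 s.
# T_SFT is inferred from the numbers in the log, not documented there.
T_SFT = 1800


def first_bin(freq_hz):
    """Bin index at which a data file starting at freq_hz would begin."""
    return round(freq_hz * T_SFT)


last_bin_read = 724589  # reported for l1_0402.50_S6GC1
next_bin_read = 724680  # reported for l1_0402.60_S6GC1

# Bins that were expected between the two reads but never arrived.
missing = range(last_bin_read + 1, next_bin_read)

print("missing bins:", len(missing))
print("gap starts where the 402.55 Hz file begins:", missing[0] == first_bin(402.55))
print("gap ends where the 402.60 Hz file begins:", next_bin_read == first_bin(402.60))
```

If the assumption holds, that would point at one specific input file on disk rather than at the application in general.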

Did you try a good disk test already? Anyway, since input files can be reused by many workunits, a reset of the project to force fresh downloads might be advisable once you have made sure the disk isn't dying.
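Before resetting, one cheap check is to hash the shared input files and compare them with known-good values. The sketch below is only an illustration: `md5_of` and `mismatches` are hypothetical helper names, and the `expected` digests are assumed to come from somewhere you trust (for example the same files on a second machine). Only Python's standard `hashlib` and `pathlib` are used.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatches(directory, expected):
    """Return the file names whose digest differs from the expected one.

    `expected` maps file name -> known-good MD5 hex digest (an assumption:
    you have to obtain these from a trusted copy yourself). Files that are
    missing from `directory` are reported as mismatches too.
    """
    bad = []
    for name, good_digest in sorted(expected.items()):
        path = Path(directory) / name
        if not path.is_file() or md5_of(path) != good_digest:
            bad.append(name)
    return bad
```

Usage would look like `mismatches("projects/einstein.phys.uwm.edu", {"l1_0402.55_S6GC1": "<digest from a good copy>"})`. A clean result does not prove the disk is healthy, since a repeated read may be served from the OS file cache, so it complements a proper disk test rather than replacing it.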

HB

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160342159
RAC: 0

Quote:

Hi!

Hmm.... looks like file corruption or bad memory to me, more likely the first. It's all I/O related so far as I could see and the wingmen are fine, ruling out corrupt files coming out of the work generator.

Did you try a good disk test already? Anyway, since input files can be reused by many workunits, a reset of the project to force fresh downloads might be advisable once you have made sure the disk isn't dying.

HB


i ran memtest on this rig just recently, so i'm fairly confident that it's not the memory. i haven't run a disk test - can you recommend a good one? if it ends up revealing that nothing is wrong w/ the HDD, i'll run memtest again to double-check the memory.

TIA,
Eric

Jelle
Joined: 16 Aug 11
Posts: 11
Credit: 110202671
RAC: 0

Just a suggestion out of left field, prompted by an embarrassing situation I was in myself a while ago when jobs suddenly started failing. After almost tearing my hair out looking for solutions and running hardware tests, I opened up my computer and found it was caked in dust. Running continuously may attract a lot of dust too. I'm not in a dusty environment, so it had never occurred to me that this could be the problem. It may be too far-fetched for your situation, but you might want to have a peek anyway.

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160342159
RAC: 0

Quote:
Just a suggestion out of left field, prompted by an embarrassing situation I was in myself a while ago when jobs suddenly started failing. After almost tearing my hair out looking for solutions and running hardware tests, I opened up my computer and found it was caked in dust. Running continuously may attract a lot of dust too. I'm not in a dusty environment, so it had never occurred to me that this could be the problem. It may be too far-fetched for your situation, but you might want to have a peek anyway.


thanks for the suggestion, but i've already got that angle covered. i clean the dust out of my machines on a regular basis.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 798388467
RAC: 1197659

Quote:
i ran memtest on this rig just recently, so i'm fairly confident that it's not the memory. i haven't run a disk test - can you recommend a good one? if it ends up revealing that nothing is wrong w/ the HDD, i'll run memtest again to double-check the memory.

I haven't tried it myself but maybe http://www.seagate.com/www/en-us/support/downloads/seatools could help.

There are also some tools that will check the self-diagnostic record of the hard disk (google for SMART and disk).

HB

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160342159
RAC: 0

i'll check it out...i don't have a Seagate (i have a Samsung Spinpoint), but i would imagine that the Seagate software will work on just about any HDD...

also, to update the thread, i got another wave of S6GC errors (9 to be exact, all of which again ran for only 20-something seconds). this happened back on the 13th, which happened to be a Friday as luck would have it...so maybe that explains it ;-). joking aside, i then finished the remaining tasks in the queue, most of which were gamma ray tasks anyway. when they were completed and reported, i detached from and reattached to the Einstein@Home project to force the download of new files (just in case any corrupt files shared by multiple tasks in my previous queue needed replacing). unfortunately my host only downloaded 1 S6GC task and a whole bunch of gamma ray tasks, so it'll probably be a while before my host fetches more S6GC tasks to see if i get any more errors...

NomexMaximus
Joined: 8 Nov 10
Posts: 2
Credit: 110859
RAC: 0

Hello Bernd,

I have frequent (all?) computation errors on the einstein gravitational wave tasks. I am running Debian 6.0.3 on an AMD64 machine.

Task ID | Work unit ID | Computer | Sent | Time reported or deadline | Status | Run time (sec) | CPU time (sec) | Claimed credit | Granted credit | Application
267668384 | 114219209 | 4490667 | 14 Jan 2012 13:45:33 UTC | 28 Jan 2012 13:45:33 UTC | In progress | --- | --- | --- | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
267663179 | 114216943 | 4490667 | 14 Jan 2012 13:44:27 UTC | 28 Jan 2012 13:44:27 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267626589 | 114200302 | 4490667 | 14 Jan 2012 8:18:02 UTC | 28 Jan 2012 8:18:02 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267601937 | 114188722 | 4490667 | 14 Jan 2012 5:25:30 UTC | 28 Jan 2012 5:25:30 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267547966 | 114164680 | 4490667 | 13 Jan 2012 21:20:46 UTC | 27 Jan 2012 21:20:46 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267547827 | 114164623 | 4490667 | 13 Jan 2012 21:20:46 UTC | 27 Jan 2012 21:20:46 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267547752 | 114164602 | 4490667 | 13 Jan 2012 21:20:46 UTC | 27 Jan 2012 21:20:46 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267547751 | 114164601 | 4490667 | 13 Jan 2012 21:20:46 UTC | 27 Jan 2012 21:20:46 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267448518 | 114119424 | 4490667 | 13 Jan 2012 7:01:36 UTC | 27 Jan 2012 7:01:36 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267448441 | 114119386 | 4490667 | 13 Jan 2012 7:01:36 UTC | 27 Jan 2012 7:01:36 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267448360 | 114119348 | 4490667 | 13 Jan 2012 7:01:36 UTC | 27 Jan 2012 7:01:36 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267447901 | 114119127 | 4490667 | 13 Jan 2012 7:01:36 UTC | 27 Jan 2012 7:01:36 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267423621 | 114108029 | 4490667 | 13 Jan 2012 7:01:36 UTC | 27 Jan 2012 7:01:36 UTC | In progress | --- | --- | --- | --- | Binary Radio Pulsar Search (Arecibo) v1.00 (BRP3SSE)
267318116 | 114060235 | 4490667 | 12 Jan 2012 11:28:49 UTC | 26 Jan 2012 11:28:49 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267275482 | 114040551 | 4490667 | 12 Jan 2012 8:42:02 UTC | 26 Jan 2012 8:42:02 UTC | In progress | --- | --- | --- | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
267270395 | 114038208 | 4490667 | 12 Jan 2012 4:10:23 UTC | 26 Jan 2012 4:10:23 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267158272 | 113988171 | 4490667 | 11 Jan 2012 10:56:19 UTC | 25 Jan 2012 10:56:19 UTC | In progress | --- | --- | --- | --- | Gamma-ray pulsar search #1 v0.23
267137908 | 113978829 | 4490667 | 11 Jan 2012 8:33:28 UTC | 18 Jan 2012 13:17:34 UTC | Completed and validated | 29,130.13 | 28,100.38 | 307.92 | 337.00 | Gamma-ray pulsar search #1 v0.23
267107920 | 113965237 | 4490667 | 11 Jan 2012 1:43:21 UTC | 18 Jan 2012 3:15:20 UTC | Error while computing | 319.16 | 311.07 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
267106273 | 113964474 | 4490667 | 11 Jan 2012 1:43:15 UTC | 18 Jan 2012 3:15:20 UTC | Error while computing | 325.56 | 312.80 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)

NomexMaximus
Joined: 8 Nov 10
Posts: 2
Credit: 110859
RAC: 0

267107920 | 113965237 | 4490667 | 11 Jan 2012 1:43:21 UTC | 18 Jan 2012 3:15:20 UTC | Error while computing | 319.16 | 311.07 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
267106273 | 113964474 | 4490667 | 11 Jan 2012 1:43:15 UTC | 18 Jan 2012 3:15:20 UTC | Error while computing | 325.56 | 312.80 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
267106270 | 113964473 | 4490667 | 11 Jan 2012 1:43:15 UTC | 18 Jan 2012 3:15:20 UTC | Error while computing | 378.11 | 368.44 | 1.34 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
267078045 | 113951853 | 4490667 | 11 Jan 2012 1:43:17 UTC | 18 Jan 2012 3:15:20 UTC | Error while computing | 653.33 | 640.62 | 2.37 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
267074301 | 113950111 | 4490667 | 11 Jan 2012 1:43:23 UTC | 18 Jan 2012 3:15:20 UTC | Error while computing | 316.25 | 309.39 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
267074240 | 113950080 | 4490667 | 11 Jan 2012 1:43:15 UTC | 18 Jan 2012 3:15:20 UTC | Error while computing | 314.94 | 311.14 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
266970897 | 113903331 | 4490667 | 10 Jan 2012 10:57:28 UTC | 18 Jan 2012 3:15:20 UTC | Error while computing | 317.27 | 310.68 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
266949944 | 113893607 | 4490667 | 10 Jan 2012 7:01:48 UTC | 18 Jan 2012 3:15:20 UTC | Error while computing | 320.46 | 311.52 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
266559524 | 113712547 | 4490667 | 8 Jan 2012 7:01:41 UTC | 15 Jan 2012 9:14:24 UTC | Error while computing | 1,066.98 | 1,061.56 | 4.02 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
266559518 | 113712544 | 4490667 | 8 Jan 2012 7:01:41 UTC | 15 Jan 2012 9:14:24 UTC | Error while computing | 9,093.36 | 9,064.27 | 34.61 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
266554870 | 112659787 | 4490667 | 8 Jan 2012 7:01:37 UTC | 15 Jan 2012 9:14:24 UTC | Error while computing | 17,139.11 | 17,085.92 | 64.48 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
265568942 | 112224279 | 4490667 | 2 Jan 2012 11:39:53 UTC | 8 Jan 2012 7:01:36 UTC | Error while computing | 902.39 | 898.97 | 3.30 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
265558803 | 112219352 | 4490667 | 2 Jan 2012 11:28:13 UTC | 12 Jan 2012 1:21:04 UTC | Error while computing | 1,878.44 | 1,860.64 | 7.00 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
265499297 | 113244346 | 4490667 | 2 Jan 2012 1:55:49 UTC | 13 Jan 2012 21:20:46 UTC | Error while computing | 1,572.15 | 1,554.23 | 5.87 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
265497646 | 113243579 | 4490667 | 2 Jan 2012 1:55:47 UTC | 13 Jan 2012 21:20:46 UTC | Error while computing | 312.92 | 306.77 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
265205826 | 113120468 | 4490667 | 31 Dec 2011 7:07:32 UTC | 2 Jan 2012 10:41:56 UTC | Error while computing | 323.13 | 308.01 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
265205820 | 113120465 | 4490667 | 31 Dec 2011 7:04:15 UTC | 2 Jan 2012 1:52:29 UTC | Error while computing | 314.20 | 306.18 | 1.13 | --- | Gravitational Wave S6 GC search v1.01 (SSE2)
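
The run times in the list above are worth a quick look. This is a rough sketch only: the run times are copied from the "Error while computing" rows, and the 330 s cutoff is an arbitrary assumption chosen to separate the short failures from the rest.

```python
# Run times (sec) of the failed S6GC tasks, copied from the list above.
run_times = [
    319.16, 325.56, 378.11, 653.33, 316.25, 314.94, 317.27, 320.46,
    1066.98, 9093.36, 17139.11, 902.39, 1878.44, 1572.15, 312.92,
    323.13, 314.20,
]

CUTOFF = 330  # seconds; arbitrary threshold, not a project-defined value

# Tasks that failed at nearly the same early point in the run.
early = [t for t in run_times if t < CUTOFF]

print(f"{len(early)} of {len(run_times)} failures ended before {CUTOFF} s")
print(f"early failures span {min(early):.2f} s to {max(early):.2f} s")
```

A tight cluster like that would hint at the tasks failing at a fixed stage rather than at random, although with a sample this small it is only a hint.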

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 3001445270
RAC: 700335

I've only bothered to check one task, but it seems plausible that they all have this error message?

Quote:

...
2012-01-17 18:25:11.9850 (5786) [normal]: 1/9
2012-01-17 18:25:40.1701 (5786) [normal]: 1/10
2012-01-17 18:26:08.0349 (5786) [normal]: 1/11
c
2012-01-17 18:26:36.2141 (5786) [normal]: 1/12
libgcc_s.so.1 must be installed for pthread_cancel to work

-- signal handler called
