Immediate Computation error with Gravitational Wave search O1 all-sky tuning v1.00

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7301395023
RAC: 2241877
Topic 198422

On the server status page I spotted the availability of the new gravitational wave tuning run, and promptly increased the requested queue size of my hosts in hopes of getting this new type of work.

Three of my four machines promptly received work, and because the deadline was short, they started the new tasks immediately in so-called high-priority mode to avoid deadline trouble.

All four tasks of this type which my machines have started on three different hosts have gone to computation error immediately. They are listed with status "Error While Computing" on the web page task list, show status in the history page of BOINCTasks as "Reported: Computation error (309,)", and have the same stderr appearance.

All show the exit status as -1073741515 (0xffffffffc0000135) Unknown error number
error task 1 on Stoll6
error task 2 on Stoll6
error task on Stoll7
error task on Stoll8

Logforme
Joined: 13 Aug 10
Posts: 332
Credit: 1714373961
RAC: 0

Promoted the one WU I got to run immediately and got the same error as you.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2988683213
RAC: 699949

Quote:
All show the exit status as -1073741515 (0xffffffffc0000135) Unknown error number


May I remind you of the exchange we had about six months ago? Message 141791

"Error code 0xc0000135 (as it's usually written) means "The application failed to initialize properly", and that's usually because of a missing DLL."

Dependency Walker is an excellent tool for finding out what the application expects to have available, so that you can compare that with what the project have supplied.
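
As a side note on reading these numbers: the negative exit status and the hexadecimal form are the same 32-bit value, viewed as signed and unsigned respectively. Below is a minimal Python sketch of that conversion. The function names and the small lookup table are illustrative assumptions, not anything BOINC itself provides; only the numeric values come from this thread.

```python
# Hypothetical helpers for decoding a Windows exit status as BOINC reports it.
# The table is deliberately tiny; it is an illustration, not a complete list.
KNOWN_NTSTATUS = {
    0xC0000135: "STATUS_DLL_NOT_FOUND (a required DLL could not be located)",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def describe_exit(exit_code: int) -> str:
    """Return the hex form of the code plus a name if it is in the table."""
    code = to_unsigned32(exit_code)
    name = KNOWN_NTSTATUS.get(code, "not in this small table")
    return f"0x{code:08X}: {name}"


if __name__ == "__main__":
    print(describe_exit(-1073741515))
```

The low bits of that value are 0x135, which is 309 in decimal, the same number BOINCTasks shows in "Computation error (309,)". Whether BOINCTasks derives its number that way is an assumption.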

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7301395023
RAC: 2241877

I've pretty much completely forgotten that exchange, sadly.

Is that one where we eventually decided that Kaspersky was silently blocking something?

On the small chance that the app_version portion of client_state.xml is of interest this time, here it is from Stoll6:

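    <app_version>
        <app_name>einstein_O1AS20-100T</app_name>
        <version_num>100</version_num>
        <platform>windows_intelx86</platform>
        <avg_ncpus>1.000000</avg_ncpus>
        <max_ncpus>1.000000</max_ncpus>
        <flops>4743121778.798435</flops>
        <plan_class>SSE2</plan_class>
        <api_version>7.1.0</api_version>
        <file_ref>
            <file_name>einstein_O1AS20-100T_1.00_windows_intelx86__SSE2.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>einstein_S5R6_3.01_graphics_windows_intelx86.exe</file_name>
            <open_name>graphics_app</open_name>
        </file_ref>
    </app_version>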

I downloaded Dependency Walker and pointed it at the indicated executable.

The file not found complaint list seems to be this:

LIBSTDC++-6.DLL
API-MS-WIN-APPMODEL-RUNTIME-L1-1-0.DLL
API-MS-WIN-CORE-WINRT-ERROR-L1-1-0.DLL
API-MS-WIN-CORE-WINRT-L1-1-0.DLL
API-MS-WIN-CORE-WINRT-ROBUFFER-L1-1-0.DLL
API-MS-WIN-CORE-WINRT-STRING-L1-1-0.DLL
API-MS-WIN-SHCORE-SCALING-L1-1-1.DLL
DCOMP.DLL
IESHIMS.DLL

Logforme
Joined: 13 Aug 10
Posts: 332
Credit: 1714373961
RAC: 0

The most likely thing that happened is that the developer forgot to include the LIBSTDC++-6.DLL in the package. Or that the executable should have been linked statically instead of dynamically.
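
For anyone repeating the exercise, here is a rough Python sketch of how the Dependency Walker list above could be triaged. It assumes, and this is only an assumption based on common experience with the tool, that `API-MS-WIN-*` entries plus `DCOMP.DLL` and `IESHIMS.DLL` are usually harmless noise on newer Windows versions. The names `triage`, `LIKELY_NOISE_PREFIXES` and `LIKELY_NOISE_NAMES` are made up for the illustration.

```python
# The "file not found" list as posted earlier in this thread.
MISSING = [
    "LIBSTDC++-6.DLL",
    "API-MS-WIN-APPMODEL-RUNTIME-L1-1-0.DLL",
    "API-MS-WIN-CORE-WINRT-ERROR-L1-1-0.DLL",
    "API-MS-WIN-CORE-WINRT-L1-1-0.DLL",
    "API-MS-WIN-CORE-WINRT-ROBUFFER-L1-1-0.DLL",
    "API-MS-WIN-CORE-WINRT-STRING-L1-1-0.DLL",
    "API-MS-WIN-SHCORE-SCALING-L1-1-1.DLL",
    "DCOMP.DLL",
    "IESHIMS.DLL",
]

# Assumption: entries matching these are typically noise in Dependency Walker
# output and do not by themselves stop an application from starting.
LIKELY_NOISE_PREFIXES = ("API-MS-WIN-",)
LIKELY_NOISE_NAMES = {"DCOMP.DLL", "IESHIMS.DLL"}


def triage(names):
    """Split reported-missing DLL names into (suspects, likely_noise)."""
    suspects, noise = [], []
    for name in names:
        upper = name.upper()
        if upper.startswith(LIKELY_NOISE_PREFIXES) or upper in LIKELY_NOISE_NAMES:
            noise.append(name)
        else:
            suspects.append(name)
    return suspects, noise


if __name__ == "__main__":
    suspects, noise = triage(MISSING)
    print("Worth checking first:", suspects)
    print("Probably noise:", len(noise), "entries")
```

Under those assumptions the only remaining suspect is LIBSTDC++-6.DLL, which fits the guess that it was left out of the package.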

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7301395023
RAC: 2241877

The server status indicates that of 207 tasks distributed up to 11 Feb 2016, 15:55:01 UTC, already 41 have reported back failed.

I doubt so very many of us are running Kaspersky--but of course there may be other problems mixed in with the one Logforme and I are reporting here.

Christian Beer
Joined: 9 Feb 05
Posts: 595
Credit: 196961446
RAC: 202093

We are aware of the problem and will deploy a new application soonish. I halted the distribution of work for now. There are some other problems that surfaced and need taking care of but nothing critical so far. Remember that this is still a Beta test.

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7301395023
RAC: 2241877

Wonderful, Christian.

I was only trying to be helpful, and will be quiet now.

Gavin
Joined: 21 Sep 10
Posts: 191
Credit: 40644337738
RAC: 0

Got two tasks on this host that both failed with error message: 114 (0x72) Unknown error number. stderr output here.

Christian Beer
Joined: 9 Feb 05
Posts: 595
Credit: 196961446
RAC: 202093

Update:

  • We released a new Windows application that so far seems to run; we are still investigating how we can fix the previous error with the missing libraries.
  • The error code 114 occurs because the work generator didn't include enough data files. I fixed that calculation but can't change the already created tasks. It also does not affect all tasks, so I cancel them as they come in. The next batch of tasks will not have this problem.
  • The problem with missing result files was also addressed and should not happen for tasks that are sent out today.

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7301395023
RAC: 2241877

On seeing the good news that some fixes were in and some work available, I fished for and got new "tuning" work for three of my four hosts.

On two of them the new work so far is running at the 25 to 30 minute point without obvious complaint.

However, the third host, Stoll8, has so far downloaded three of the units, started each immediately (short deadline plus appreciable work queue), and failed after about 4 elapsed seconds with a new (to me) error syndrome.

Here are the links which can show the task stderr files:
Stoll8 error task 1
Stoll8 error task 2
Stoll8 error task 3

For all three:

1. The exit status on the task page shows as 114 (0x72) Unknown error number
2. a message at the beginning of stderr asserts

The target internal file identifier is incorrect.
 (0x72) - exit code 114 (0x72)


3. a notation interior to the stderr says something like

2016-02-12 06:41:29.8351 (8144) [normal]: Reading input data ... ERROR: data gap or overlap at first bin of SFT#0 (GPS 1128211934.000000) expected bin 93869, bin 93870 read from file '..\..\projects\einstein.phys.uwm.edu\h1_0052.15_O1C01Cl1In1'
XLAL Error - XLALLoadSFTs (/home/jenkins/workspace/workspace/EAH-GW-Master/SLAVE/MINGW32/TARGET/windows-x32/EinsteinAtHome/source/lalsuite/lalpulsar/src/SFTfileIO.c:882): I/O error

with some more lines after that.
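
In case it helps anyone compare stderr files across tasks, here is a small Python sketch that pulls the expected and actual bin numbers out of a line like the one above. The regular expression is modelled only on the single line quoted here, so treating that as the general format of the message is an assumption; `GAP_RE` and `parse_gap` are made-up names.

```python
import re

# Sample line copied from the stderr excerpt above (raw string keeps the backslashes).
SAMPLE = (
    r"2016-02-12 06:41:29.8351 (8144) [normal]: Reading input data ... "
    r"ERROR: data gap or overlap at first bin of SFT#0 (GPS 1128211934.000000) "
    r"expected bin 93869, bin 93870 read from file "
    r"'..\..\projects\einstein.phys.uwm.edu\h1_0052.15_O1C01Cl1In1'"
)

# Assumption: other occurrences of this error follow the same wording.
GAP_RE = re.compile(
    r"data gap or overlap at first bin of SFT#(?P<sft>\d+) "
    r"\(GPS (?P<gps>[\d.]+)\) "
    r"expected bin (?P<expected>\d+), bin (?P<read>\d+) "
    r"read from file '(?P<path>[^']+)'"
)


def parse_gap(line):
    """Return the fields of a 'data gap or overlap' line, or None if no match."""
    match = GAP_RE.search(line)
    if match is None:
        return None
    expected = int(match.group("expected"))
    read = int(match.group("read"))
    return {
        "sft_index": int(match.group("sft")),
        "gps": float(match.group("gps")),
        "expected_bin": expected,
        "read_bin": read,
        "offset": read - expected,  # positive: file starts later than expected
        "file": match.group("path"),
    }


if __name__ == "__main__":
    print(parse_gap(SAMPLE))
```

For the line quoted above, the file starts one bin later than the application expected.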

I've disabled new CPU work requests on Stoll8, since it might repeat that syndrome, which seems unhelpful, and of course have allowed the other two to continue.
