(Copy of item posted originally to Cafe Einstein)
I just joined E@H yesterday but have been running SETI and Rosetta under BOINC. E@H has downloaded about 8 work units, but all of them ran for anywhere from 2 seconds to 1 hour and then crashed with the error
"There are no Child Processes to Wait for (0x80)
exit code 128(0x80)"
The error does not seem to occur during screensaver mode.
Any suggestions?
Tom
Computation Error - Child Processes
(I moved this from the cafe einstein thread where I replied to the earlier post)
What were you doing when the error occurred? BOINC reports that message when it tries to start the Einstein application and, for some reason, the application doesn't start. It could be security settings (installed under one account, run under another without the proper authorizations) or other software interfering, such as a firewall, anti-virus, or file backup utilities.
Check the messages in Boinc Manager to see what was happening up to the error. If you see something like:
Suspending computation - user is active
Pausing result
Resuming computation activity
Restarting result xxxx with einstein version 479
(and you get the exit code 128 error message)
It means that you did something on your computer, so it suspended the workunit until you finished, and the error occurred when BOINC restarted it. What did you do?
If you see something similar, but instead of "user is active" it's switching to another application (Rosetta), it's the same problem: the "restart" failed.
Either way, the actual messages show what BOINC was doing. Seeing those would help, at a minimum the ones for the previous 3 hours.
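To illustrate the pattern described above, here is a minimal Python sketch that looks for a restart message followed shortly by an exit-code error. It assumes the message wording quoted in this thread; the function name `failed_restarts`, the `window` parameter, and the final sample line are hypothetical, and real BOINC logs may be worded differently between versions.

```python
import re

# Message texts assumed from the lines quoted in this thread; actual BOINC
# wording may differ between versions.
RESTART = re.compile(r"(Restarting|Resuming) result", re.IGNORECASE)
EXIT_CODE = re.compile(r"exit code (-?\d+)")

def failed_restarts(lines, window=5):
    """Return (restart_line, error_line) pairs where an exit-code error
    appears within `window` lines after a restart/resume message."""
    hits = []
    for i, line in enumerate(lines):
        if not RESTART.search(line):
            continue
        for later in lines[i + 1 : i + 1 + window]:
            if EXIT_CODE.search(later):
                hits.append((line.strip(), later.strip()))
                break
    return hits

# Illustrative input only: the first four lines are quoted in the thread,
# the last one is a hypothetical error line in a similar style.
sample = [
    "Suspending computation - user is active",
    "Pausing result",
    "Resuming computation activity",
    "Restarting result xxxx with einstein version 479",
    "Unrecoverable error for result xxxx ( - exit code 128 (0x80))",
]

for restart, error in failed_restarts(sample):
    print(restart, "->", error)
```

To use it on a real log, read the saved messages file into a list of lines and pass that list instead of `sample`.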
One thing to try: run FileMon from Sysinternals to see what's going on with the BOINC files. Open the Filemon Filter dialog (the funnel-shaped icon), set the "include" filter to "einstein*;slots*", leave the other two filters blank, and select all the boxes on the bottom. Set these options:
-Advanced Output
-Clock Time
-Show Milliseconds
-set history depth to 30000
In the Volumes menu, select only the drive BOINC is running on.
Bring up Boinc Manager and switch to the Messages tab so you can see what it's doing. If it suspended the result, let the system sit until it restarts the result again. If you get the "exit code 128" error, stop the FileMon trace and check it.
You have to watch it and stop the trace when you get the error, otherwise the trace buffer will wrap and you'll lose the error information.
Look through the trace; you might see something else accessing the files BOINC or Einstein use, or an error when the application starts. (The "magnifying glass" icon starts and stops the Filemon trace.)
Or save the trace to a file ("File", "Save", give it a name like "exit 128 trace"), and send it to me at wgdebug(at)yahoo.com. Also include the messages, which are in the files stdoutdae.txt and stderrdae.txt. Zip them, otherwise the mail programs munge the text and it's unusable. If you do email them, put a note here so I know to check the account.
After you do that, you could try changing the preference "Leave applications in memory while preempted?" to "yes". That way it won't keep reloading the application every time you switch applications or use the computer.
Walt
Thanks Walt for the info.
I will work through some of those issues tomorrow. The error appears to occur during the actual running of the program, not during a restart. I also have my settings to leave the program in memory while preempted.
The program is installed and running on my desktop computer with only one user.
I upgraded BOINC to the .13 version and have lost my old message file. I am running E@H at the moment to see if I can recreate the problem.
I will keep you informed and will send you any system files which are created by following your suggestions.
Thanks
Tom
Hi Walt,
I have sent off some files to you at Yahoo documenting the most recent computation errors.
I hope this helps.
Tom
Hi Tom,
I got them. Yes it helps, I see the error.
Are you sure about not running the screensaver? The error occurred right after it loaded the Intel OpenGL graphics driver. That's loaded the first time graphics are displayed, either by the screensaver or by the "show graphics" button.
From the trace:
9:45:04.194 AM einstein_4.79_w:1696 IRP_MJ_CLOSE C:\WINNT\system32\ialmgicd.dll SUCCESS
9:45:11.132 AM einstein_4.79_w:1696 IRP_MJ_WRITE C:\Program Files\BOINC\slots\1\stderr.txt SUCCESS Offset: 0 Length: 128
The first line shows when the DLL finished loading (that's part of opening the graphics device); the second line shows Einstein writing out an error message.
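For a sense of timing, the gap between those two trace lines can be worked out from the clock-time stamps. This is a small Python sketch, assuming the "Clock Time" plus "Show Milliseconds" format shown in the trace; the helper names `parse_clock` and `gap_seconds` are hypothetical.

```python
from datetime import datetime

def parse_clock(stamp):
    """Parse a clock-time stamp with milliseconds, e.g. '9:45:04.194 AM'."""
    return datetime.strptime(stamp, "%I:%M:%S.%f %p")

def gap_seconds(first, second):
    """Seconds elapsed between two stamps taken on the same day."""
    return (parse_clock(second) - parse_clock(first)).total_seconds()

# DLL close versus the write to stderr.txt, from the two trace lines above.
print(gap_seconds("9:45:04.194 AM", "9:45:11.132 AM"))
```

The result is a little under 7 seconds between the driver DLL being closed and the error text being written.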
From stdoutdae.txt:
2005-12-15 09:45:11 [Einstein@Home] Unrecoverable error for result w1_1339.0__1339.3_0.1_T04_S4hD_0 ( - exit code -164 (0xffffff5c))
The "exit code -164" means it got an error while it was already handling an error; the first one was 0xC0000005. That's from the result messages. Most likely this is one of the "graphics bugs", in this case one that occurs when initializing graphics.
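The decimal and hexadecimal forms of these codes are the same 32-bit values written two ways. The Python sketch below shows the conversion; the helper names `as_unsigned32` and `as_signed32` are hypothetical. For reference, 0xC0000005 is the Windows status code for an access violation.

```python
def as_unsigned32(code):
    """View a (possibly negative) exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF

def as_signed32(value):
    """View a 32-bit value as a signed (two's complement) integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

print(hex(as_unsigned32(-164)))   # 0xffffff5c, as shown in stdoutdae.txt
print(hex(128))                   # 0x80, the "no child processes" exit code
print(as_signed32(0xC0000005))    # -1073741819, the signed form of the first error
```

So "-164" and "0xffffff5c" in the log line are one value, not two separate errors.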
This looks very much like a graphics problem.
You should set your screensaver to "none" or "blank" and not use the "show graphics" button. Run that way for a couple of days and see if the errors go away.
If so, and you want the graphics, try the beta test application. It's described here. And it's probably a good idea to update your graphics drivers; Intel is another vendor that had buggy OpenGL drivers.
Walt
tom,
This thread has links to download sites for graphics drivers. :-)
microcraft
"The arc of history is long, but it bends toward justice" - MLK
Thanks for your comments. I got my first two units processed by turning off the screensaver. This has prevented both the exit code 128 and 164 errors from occurring. I have downloaded the latest Intel graphics driver and will now experiment with using the screensaver.
Tom
Most likely if you try to use the graphics again, you'll get bitten by the bug. If you want to use the graphics, you'll have to use the Beta App.
There are many threads on this topic. Just search for "Graphics Bug".
Kathryn
Kathryn :o)
Einstein@Home Moderator
Right you are.
I started using the screensaver, and both running work units ended up with the exit code -164 (0xffffff5c).
Tom
Moved to Beta .18 and everything seems to be running well.
Thanks for all the assistance
Tom