Can't report result - network peer error?

Scott Brown
Joined: 9 Feb 05
Posts: 38
Credit: 215235
RAC: 0
Topic 192351


I get the following when trying to connect to E@H to report 2 completed workunits:

1/22/2007 8:57:11 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
1/22/2007 8:57:11 AM|Einstein@Home|Reason: Requested by user
1/22/2007 8:57:11 AM|Einstein@Home|Requesting 17280 seconds of new work, and reporting 2 results
1/22/2007 8:59:11 AM||Network error: failed sending data to the peer
1/22/2007 8:59:11 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi failed: http error
1/22/2007 8:59:11 AM|Einstein@Home|No schedulers responded
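For anyone reading a log like the one above, here is a minimal Python sketch that splits each line into its three pipe-separated fields and measures the gap between the scheduler request and the network error. The field layout and timestamp format are assumptions taken only from the lines quoted in this post, and `parse_line` is a hypothetical helper, not part of BOINC.

```python
from datetime import datetime

def parse_line(line):
    """Split one 'timestamp|project|message' line into its three fields.

    The layout is assumed from the quoted log; the project field may be empty.
    """
    stamp, project, message = line.split("|", 2)
    when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    return when, project, message

# Two lines copied from the log quoted above.
log = [
    "1/22/2007 8:57:11 AM|Einstein@Home|Requesting 17280 seconds of new work, and reporting 2 results",
    "1/22/2007 8:59:11 AM||Network error: failed sending data to the peer",
]

sent, _, _ = parse_line(log[0])
failed, project, message = parse_line(log[1])

# Seconds between the request and the reported network error.
elapsed = (failed - sent).total_seconds()
print(elapsed, repr(project), message)
```

In this log the error arrives exactly 120 seconds after the request. That could point to a timeout rather than an immediate refusal, but the log alone does not confirm it.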

I had this once before when working on Primegrid (same machine) and was never able to get the machine to connect to that project again. The machine is not proxied (static IP on 24/7) and does not have trouble connecting to other projects. Windows firewall is on, but BOINC and Einstein apps are in the exceptions list. Detaching/reattaching didn't work for the Primegrid problem, so I am doubtful it will here. A complete reinstall would also not be preferred as I have a CPDN model more than a third completed (guess I could do a backup....).

Any ideas?

Steve Cressman
Joined: 9 Feb 05
Posts: 104
Credit: 139654
RAC: 0

Have you tried a DNS flush, followed by a complete restart of BOINC?

98SE XP2500+ @ 2.1 GHz Boinc v5.8.8

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118636903936
RAC: 18397220

Quote:

I get the following when trying to connect to E@H to report 2 completed workunits:
....

What happens if you ping the server from a DOS box on that machine?

Can your other two machines still contact the server?
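As a complement to ping, a rough Python sketch of a TCP reachability check is shown here. Ping uses ICMP, whereas the scheduler URL in the log is plain http, so this tries port 80 instead. The function name `can_connect`, the port, and the timeout are illustrative assumptions, not anything BOINC itself provides.

```python
import socket

def can_connect(host, port=80, timeout=10.0):
    """Return True if a TCP connection to host:port opens within timeout.

    Name resolution failures, refusals and timeouts all surface as OSError
    subclasses, so any of them is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example call, using the host name from the scheduler URL in the log:
# can_connect("einstein.phys.uwm.edu")
```

If this returns True on the problem machine, name resolution and basic routing are probably fine, and the failure is more likely happening while the request data is being sent.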

Quote:


A complete reinstall would also not be preferred as I have a CPDN model more than a third completed (guess I could do a backup....).

Any ideas?

I've never run CPDN, but I assume you don't lose everything if you stop and restart BOINC? Uninstalling and reinstalling should be no worse than that, although I wouldn't expect it to help much, since this looks like a networking issue - i.e. more likely OS-related - just a guess.

The only other suggestion is to ask yourself whether you really need to use 5.3.12.tx36. I'd be tempted to install the current standard client and see if anything changes.

Cheers,
Gary.

KSMarksPsych
Moderator
Joined: 15 Oct 05
Posts: 2702
Credit: 4090227
RAC: 0

Even if you uninstall through add/remove programs you'll keep the .xml files and slots directories and such.

I've done it multiple times alpha testing.

But to be on the safe side (especially with CPDN models that can crash if you look at them the wrong way) back up your BOINC folder if you do anything.

Now if the .xml files are screwed up somehow, a reinstall probably won't fix anything.

But definitely try what Steve and Gary suggested along with a reboot (especially with Windows......).

Kathryn :o)

Einstein@Home Moderator

Scott Brown
Joined: 9 Feb 05
Posts: 38
Credit: 215235
RAC: 0

Thanks for the help everyone. I had two machines affected by this in university offices (one is remote and not yet fixed). Both got the error when they ran out of S5R1 units and weren't able to report the S5RIs. I had to detach, manually delete any E@H files remaining in the main BOINC directory, and reattach. The one machine fixed this way has successfully reported a couple of results so far. For others who might see this error, please note that a reboot wasn't necessary (Win XP Pro).

@Gary
I still use the 5.3.12.tx36 client because it has always run very stably across all my platforms (and I have played with the CPU affinity stuff from time-to-time). Mainly I haven't upgraded because I am waiting on some major version change (especially given the remote machine).

Thanks again!

Scott

Saenger
Joined: 15 Feb 05
Posts: 403
Credit: 33009522
RAC: 0

Quote:
Mi 24 Jan 2007 20:31:51 CET|Einstein@Home|Started upload of file l1_0635.5_S5R1__2963_S5RIa_1_0
Mi 24 Jan 2007 20:31:52 CET|Einstein@Home|Error on file upload: Maintenance underway: file uploads are temporarily disabled.
Mi 24 Jan 2007 20:31:52 CET|Einstein@Home|Temporarily failed upload of l1_0635.5_S5R1__2963_S5RIa_1_0: transient upload error
Mi 24 Jan 2007 20:31:52 CET|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0635.5_S5R1__2963_S5RIa_1_0
Mi 24 Jan 2007 20:53:50 CET|Einstein@Home|Started upload of file l1_0635.5_S5R1__2963_S5RIa_1_0
Mi 24 Jan 2007 20:53:52 CET|Einstein@Home|Error on file upload: Maintenance underway: file uploads are temporarily disabled.
Mi 24 Jan 2007 20:53:52 CET|Einstein@Home|Temporarily failed upload of l1_0635.5_S5R1__2963_S5RIa_1_0: transient upload error
Mi 24 Jan 2007 20:53:52 CET|Einstein@Home|Backing off 46 minutes and 54 seconds on upload of file l1_0635.5_S5R1__2963_S5RIa_1_0
Mi 24 Jan 2007 21:11:01 CET|Einstein@Home|Started upload of file l1_0635.5_S5R1__2963_S5RIa_1_0
Mi 24 Jan 2007 21:11:02 CET|Einstein@Home|Error on file upload: Maintenance underway: file uploads are temporarily disabled.
Mi 24 Jan 2007 21:11:02 CET|Einstein@Home|Temporarily failed upload of l1_0635.5_S5R1__2963_S5RIa_1_0: transient upload error
Mi 24 Jan 2007 21:11:02 CET|Einstein@Home|Backing off 41 minutes and 7 seconds on upload of file l1_0635.5_S5R1__2963_S5RIa_1_0
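The back-off intervals in the quoted log can be pulled out with a short Python sketch like the one below. The message wording is assumed from these quoted lines only, and `backoff_seconds` is a hypothetical helper, not a BOINC function.

```python
import re

# Pattern assumed from the "Backing off N minutes and M seconds" lines above.
BACKOFF = re.compile(r"Backing off (\d+) minutes and (\d+) seconds")

def backoff_seconds(message):
    """Return the back-off delay in seconds, or None if the line has none."""
    match = BACKOFF.search(message)
    if match is None:
        return None
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds

# Message parts copied from the quoted log.
messages = [
    "Backing off 1 minutes and 0 seconds on upload of file l1_0635.5_S5R1__2963_S5RIa_1_0",
    "Backing off 46 minutes and 54 seconds on upload of file l1_0635.5_S5R1__2963_S5RIa_1_0",
    "Backing off 41 minutes and 7 seconds on upload of file l1_0635.5_S5R1__2963_S5RIa_1_0",
]

delays = [backoff_seconds(m) for m in messages]
print(delays)  # [60, 2814, 2467]
```

The delays are not strictly increasing, so any script watching for retries should read each logged value rather than assume a fixed schedule.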

Will it work again some time in the near future?

The Server_status.php only gives this:

Quote:
The server status page has temporarily been taken offline due to database problems


which isn't really helpful ;)

Greetings from Sänger

BarryAZ
Joined: 8 May 05
Posts: 190
Credit: 325858200
RAC: 15554

Well there have been a few issues over the past 6 weeks or so. According to the note on the home page, today's specific issue:

The Einstein@Home project was offline today due to a server crash. There was a kernel panic on fileserver that is unrelated to previous problems we have had with the fileservers.

That being said, from that note one might think that specific problem is resolved -- what I've seen is that new units reporting for the first time seem to upload. However, work units which failed and show up as transfers awaiting retry are not getting uploaded -- they encounter the error message you show: 'Error on file upload: Maintenance underway' and so on.

I am not sure whether that problem is related to the kernel panic crash mentioned today, related to the problems encountered in the past, or a separate and as yet undiagnosed problem.

Regarding the server status reports -- it seems that providing that report requires precious resources on the servers, and rather than drain those apparently very limited resources to serve out the reports, that feature has been shut off.

To me, the impression is that this project is in a bit of 'fix it with duct tape' mode at the moment. Not sure exactly what (if anything) can or will be done to bring the project back to the upper echelon of sturdy, stable BOINC projects -- a place the Einstein project had occupied for a long time....

Scott Brown
Joined: 9 Feb 05
Posts: 38
Credit: 215235
RAC: 0

Oh well...reported a couple of results on the 24th, then the same error message is back. With the connectivity hiccup yesterday, I am guessing that something is screwing up with the various server problems so I will go "no new work" on this box until things settle down a bit.
